├── upath ├── py.typed ├── tests │ ├── __init__.py │ ├── pathlib │ │ ├── __init__.py │ │ └── conftest.py │ ├── third_party │ │ ├── __init__.py │ │ └── test_pydantic.py │ ├── implementations │ │ ├── __init__.py │ │ ├── test_ftp.py │ │ ├── test_hdfs.py │ │ ├── test_cached.py │ │ ├── test_memory.py │ │ ├── test_smb.py │ │ ├── test_gcs.py │ │ ├── test_azure.py │ │ ├── test_sftp.py │ │ ├── test_local.py │ │ ├── test_webdav.py │ │ ├── test_hf.py │ │ ├── test_tar.py │ │ ├── test_github.py │ │ ├── test_s3.py │ │ └── test_zip.py │ ├── utils.py │ ├── test_drive_root_anchor_parts.py │ ├── test_pydantic.py │ ├── test_stat.py │ ├── test_chain.py │ ├── test_registry.py │ └── test_extensions.py ├── implementations │ ├── __init__.py │ ├── _experimental.py │ ├── hdfs.py │ ├── sftp.py │ ├── memory.py │ ├── github.py │ ├── cached.py │ ├── tar.py │ ├── ftp.py │ ├── data.py │ ├── zip.py │ ├── smb.py │ ├── webdav.py │ ├── http.py │ └── cloud.py ├── types │ ├── _abc.py │ ├── __init__.py │ └── _abc.pyi ├── _info.py ├── __init__.py └── _protocol.py ├── .gitattributes ├── MANIFEST.in ├── docs ├── assets │ ├── favicon.png │ └── logo-128x128-white.svg ├── css │ └── extra.css ├── _plugins │ └── copy_changelog.py ├── api │ ├── registry.md │ ├── extensions.md │ ├── index.md │ ├── types.md │ └── implementations.md ├── concepts │ ├── index.md │ ├── pathlib.md │ ├── fsspec.md │ └── upath.md ├── install.md ├── index.md └── why.md ├── environment.yml ├── SECURITY.md ├── dev └── requirements.txt ├── .readthedocs.yaml ├── .flake8 ├── .github ├── workflows │ ├── release.yml │ ├── post-dependabot-update.yml │ └── tests.yml └── dependabot.yml ├── LICENSE ├── .pre-commit-config.yaml ├── .gitignore ├── CONTRIBUTING.rst ├── mkdocs.yml ├── CODE_OF_CONDUCT.rst ├── pyproject.toml └── noxfile.py /upath/py.typed: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /upath/tests/__init__.py: 
-------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /upath/implementations/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /upath/tests/pathlib/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | * text=auto eol=lf 2 | -------------------------------------------------------------------------------- /upath/tests/third_party/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /upath/tests/implementations/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | exclude .git* 2 | recursive-exclude .git * 3 | recursive-exclude .github * 4 | -------------------------------------------------------------------------------- /docs/assets/favicon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fsspec/universal_pathlib/HEAD/docs/assets/favicon.png -------------------------------------------------------------------------------- /docs/css/extra.css: -------------------------------------------------------------------------------- 1 | :root { 2 | --md-primary-fg-color: #4361EE; 3 | scrollbar-gutter: stable; 4 | overflow-y: scroll; 5 | } 6 | 7 | .md-typeset table:not([class]) th:not(:first-child) { 8 | 
min-width: 1em; 9 | padding-left: 0.8em; 10 | padding-right: 0.8em; 11 | } 12 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: upath 2 | channels: 3 | - defaults 4 | - conda-forge 5 | dependencies: 6 | - python==3.10 7 | - fsspec 8 | # optional 9 | - requests 10 | - s3fs 11 | - jupyter 12 | - ipython 13 | - pytest 14 | - pylint 15 | - flake8 16 | - pyarrow 17 | - moto 18 | - pip 19 | - pip: 20 | - hadoop-test-cluster 21 | - gcsfs 22 | - nox 23 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Security Policy - Vulnerability Reporting 2 | 3 | If you believe you have discovered a security issue in universal-pathlib, do not open a public issue. 4 | 5 | Instead, report it via the repository’s **`Security`** tab using the **`Report a vulnerability`** button. 6 | Include clear details and verify whether the vulnerability is in `universal-pathlib` or one of its dependencies. 7 | 8 | Providing a minimal reproducible example will help resolve the issue more efficiently. 9 | -------------------------------------------------------------------------------- /dev/requirements.txt: -------------------------------------------------------------------------------- 1 | fsspec[git,hdfs,dask,http,sftp,smb]==2025.10.0 2 | 3 | # these dependencies define their own filesystems 4 | adlfs==2025.8.0 5 | boxfs==0.3.0 6 | dropboxdrivefs==1.4.1 7 | gcsfs==2025.10.0 8 | s3fs==2025.10.0 9 | ocifs==1.3.4 10 | webdav4[fsspec]==0.10.0 11 | # gdrivefs @ git+https://github.com/fsspec/gdrivefs@master broken ... 
12 | morefs[asynclocalfs]==0.2.2 13 | dvc==3.64.2 14 | huggingface_hub==1.2.1 15 | lakefs-spec==0.12.0 16 | ossfs==2025.5.0 17 | fsspec-xrootd==0.5.1 18 | wandbfs==0.0.2 19 | -------------------------------------------------------------------------------- /upath/types/_abc.py: -------------------------------------------------------------------------------- 1 | """pathlib_abc exports for compatibility with pathlib.""" 2 | 3 | from pathlib_abc import JoinablePath 4 | from pathlib_abc import PathInfo 5 | from pathlib_abc import PathParser 6 | from pathlib_abc import ReadablePath 7 | from pathlib_abc import WritablePath 8 | from pathlib_abc import vfsopen 9 | from pathlib_abc import vfspath 10 | 11 | __all__ = [ 12 | "JoinablePath", 13 | "ReadablePath", 14 | "WritablePath", 15 | "PathInfo", 16 | "PathParser", 17 | "vfsopen", 18 | "vfspath", 19 | ] 20 | -------------------------------------------------------------------------------- /upath/tests/third_party/test_pydantic.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | try: 4 | from pydantic import BaseConfig 5 | from pydantic_settings import BaseSettings 6 | except ImportError: 7 | BaseConfig = BaseSettings = None 8 | pytestmark = pytest.mark.skip(reason="requires pydantic") 9 | 10 | from upath.core import UPath 11 | 12 | 13 | def test_pydantic_settings_local_upath(): 14 | class MySettings(BaseSettings): 15 | example_path: UPath = UPath(__file__) 16 | 17 | assert isinstance(MySettings().example_path, UPath) 18 | -------------------------------------------------------------------------------- /docs/_plugins/copy_changelog.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from pathlib import Path 4 | 5 | THIS_DIR = Path(__file__).parent 6 | DOCS_DIR = THIS_DIR.parent 7 | PROJECT_ROOT = DOCS_DIR.parent 8 | 9 | 10 | def on_pre_build(**_) -> None: 11 | """Add changelog to 
docs/changelog.md""" 12 | cl_now = PROJECT_ROOT.joinpath("CHANGELOG.md").read_text(encoding="utf-8") 13 | 14 | f_doc = DOCS_DIR.joinpath("changelog.md") 15 | if not f_doc.is_file() or f_doc.read_text(encoding="utf-8") != cl_now: 16 | f_doc.write_text(cl_now, encoding="utf-8") 17 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_ftp.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from upath import UPath 4 | from upath.tests.cases import BaseTests 5 | from upath.tests.utils import skip_on_windows 6 | 7 | 8 | @skip_on_windows 9 | class TestUPathFTP(BaseTests): 10 | 11 | @pytest.fixture(autouse=True) 12 | def path(self, ftp_server): 13 | self.path = UPath("", protocol="ftp", **ftp_server) 14 | self.prepare_file_system() 15 | 16 | 17 | def test_ftp_path_mtime(ftp_server): 18 | path = UPath("file1.txt", protocol="ftp", **ftp_server) 19 | path.touch() 20 | mtime = path.stat().st_mtime 21 | assert isinstance(mtime, float) 22 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_hdfs.py: -------------------------------------------------------------------------------- 1 | """see upath/tests/conftest.py for fixtures""" 2 | 3 | import pytest # noqa: F401 4 | 5 | from upath import UPath 6 | from upath.implementations.hdfs import HDFSPath 7 | 8 | from ..cases import BaseTests 9 | 10 | 11 | @pytest.mark.hdfs 12 | class TestUPathHDFS(BaseTests): 13 | @pytest.fixture(autouse=True) 14 | def path(self, local_testdir, hdfs): 15 | host, user, port = hdfs 16 | path = f"hdfs:{local_testdir}" 17 | self.path = UPath(path, host=host, user=user, port=port) 18 | 19 | def test_is_HDFSPath(self): 20 | assert isinstance(self.path, HDFSPath) 21 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_cached.py: 
-------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from upath import UPath 4 | from upath.implementations.cached import SimpleCachePath 5 | 6 | from ..cases import BaseTests 7 | 8 | 9 | class TestSimpleCachePath(BaseTests): 10 | @pytest.fixture(autouse=True) 11 | def path(self, local_testdir): 12 | if not local_testdir.startswith("/"): 13 | local_testdir = "/" + local_testdir 14 | path = f"simplecache::memory:{local_testdir}" 15 | self.path = UPath(path) 16 | self.prepare_file_system() 17 | 18 | def test_is_SimpleCachePath(self): 19 | assert isinstance(self.path, SimpleCachePath) 20 | -------------------------------------------------------------------------------- /.readthedocs.yaml: -------------------------------------------------------------------------------- 1 | # Read the Docs configuration file 2 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details 3 | 4 | # Required 5 | version: 2 6 | 7 | # Set the OS, Python version, and other tools you might need 8 | build: 9 | os: ubuntu-24.04 10 | tools: 11 | python: "3.13" 12 | jobs: 13 | pre_create_environment: 14 | - asdf plugin add uv 15 | - asdf install uv latest 16 | - asdf global uv latest 17 | create_environment: 18 | - uv venv "${READTHEDOCS_VIRTUALENV_PATH}" 19 | install: 20 | - UV_PROJECT_ENVIRONMENT="${READTHEDOCS_VIRTUALENV_PATH}" uv sync --group docs 21 | 22 | # Build documentation with Mkdocs 23 | mkdocs: 24 | configuration: mkdocs.yml 25 | -------------------------------------------------------------------------------- /.flake8: -------------------------------------------------------------------------------- 1 | [flake8] 2 | ignore= 3 | # Whitespace before ':' 4 | E203 5 | # Too many leading '#' for block comment 6 | E266 7 | # Line break occurred before a binary operator 8 | W503 9 | # unindexed parameters in the str.format, see: 10 | # https://pypi.org/project/flake8-string-format/ 11 | P1 12 | # def statements on the same 
line with overload 13 | E704 14 | max_line_length = 88 15 | max-complexity = 15 16 | select = B,C,E,F,W,T4,B902,T,P 17 | show_source = true 18 | count = true 19 | exclude = 20 | .noxfile, 21 | .nox, 22 | __pycache__, 23 | .git, 24 | .github, 25 | .gitignore, 26 | .pytest_cache, 27 | upath/tests/pathlib/_test_support.py, 28 | upath/tests/pathlib/test_pathlib_3*.py, 29 | -------------------------------------------------------------------------------- /.github/workflows/release.yml: -------------------------------------------------------------------------------- 1 | name: Release 2 | 3 | on: 4 | release: 5 | types: [published] 6 | workflow_dispatch: 7 | 8 | env: 9 | FORCE_COLOR: "1" 10 | 11 | jobs: 12 | release: 13 | runs-on: ubuntu-latest 14 | environment: pypi 15 | permissions: 16 | id-token: write 17 | steps: 18 | - name: Check out the repository 19 | uses: actions/checkout@v4 20 | with: 21 | fetch-depth: 0 22 | 23 | - uses: hynek/setup-cached-uv@v2 24 | 25 | - name: Build package 26 | run: uvx nox -s build 27 | 28 | - name: Upload package 29 | if: github.event_name == 'release' 30 | uses: pypa/gh-action-pypi-publish@release/v1 31 | with: 32 | verbose: true 33 | skip-existing: true 34 | -------------------------------------------------------------------------------- /upath/_info.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from typing import TYPE_CHECKING 4 | 5 | from upath.types import PathInfo 6 | 7 | if TYPE_CHECKING: 8 | from upath import UPath 9 | 10 | 11 | __all__ = [ 12 | "UPathInfo", 13 | ] 14 | 15 | 16 | class UPathInfo(PathInfo): 17 | """Path info for UPath objects.""" 18 | 19 | def __init__(self, path: UPath) -> None: 20 | self._path = path.path 21 | self._fs = path.fs 22 | 23 | def exists(self, *, follow_symlinks=True) -> bool: 24 | return self._fs.exists(self._path) 25 | 26 | def is_dir(self, *, follow_symlinks=True) -> bool: 27 | return self._fs.isdir(self._path) 28 
| 29 | def is_file(self, *, follow_symlinks=True) -> bool: 30 | return self._fs.isfile(self._path) 31 | 32 | def is_symlink(self) -> bool: 33 | return False 34 | -------------------------------------------------------------------------------- /upath/implementations/_experimental.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from typing import TYPE_CHECKING 4 | 5 | from upath.registry import get_upath_class 6 | 7 | if TYPE_CHECKING: 8 | from upath import UPath 9 | 10 | 11 | def __getattr__(name: str) -> type[UPath]: 12 | if name.startswith("_") and name.endswith("Path"): 13 | from upath import UPath 14 | 15 | protocol = name[1:-4].lower() 16 | cls = get_upath_class(protocol) 17 | if cls is None: 18 | raise RuntimeError( 19 | f"Could not find fsspec implementation for protocol {protocol!r}" 20 | ) 21 | elif not issubclass(cls, UPath): 22 | raise RuntimeError( 23 | f"UPath implementation not a subclass of upath.UPath, {cls!r}" 24 | ) 25 | return cls 26 | raise AttributeError(f"module {__name__!r} has no attribute {name!r}") 27 | -------------------------------------------------------------------------------- /upath/__init__.py: -------------------------------------------------------------------------------- 1 | """Pathlib API extended to use fsspec backends.""" 2 | 3 | from __future__ import annotations 4 | 5 | from typing import TYPE_CHECKING 6 | 7 | try: 8 | from upath._version import __version__ 9 | except ImportError: 10 | __version__ = "not-installed" 11 | 12 | if TYPE_CHECKING: 13 | from upath.core import UnsupportedOperation 14 | from upath.core import UPath 15 | 16 | __all__ = ["UPath", "UnsupportedOperation"] 17 | 18 | 19 | def __getattr__(name): 20 | if name == "UPath": 21 | from upath.core import UPath 22 | 23 | globals()["UPath"] = UPath 24 | return UPath 25 | elif name == "UnsupportedOperation": 26 | from upath.core import UnsupportedOperation 27 | 28 | 
globals()["UnsupportedOperation"] = UnsupportedOperation 29 | return UnsupportedOperation 30 | else: 31 | raise AttributeError(f"module {__name__} has no attribute {name}") 32 | -------------------------------------------------------------------------------- /.github/workflows/post-dependabot-update.yml: -------------------------------------------------------------------------------- 1 | name: Post Dependabot Update 2 | on: 3 | pull_request: 4 | branches: [main] 5 | paths: 6 | - dev/** 7 | 8 | jobs: 9 | auto-update: 10 | if: github.actor == 'dependabot[bot]' 11 | runs-on: ubuntu-latest 12 | permissions: 13 | contents: write 14 | pull-requests: write 15 | steps: 16 | - uses: actions/checkout@v4 17 | with: 18 | ref: ${{ github.head_ref }} 19 | 20 | - uses: hynek/setup-cached-uv@v2 21 | 22 | - name: Run tests 23 | run: uvx nox --sessions flavours-codegen 24 | 25 | - name: Commit changes 26 | run: | 27 | git config user.name "github-actions[bot]" 28 | git config user.email "github-actions[bot]@users.noreply.github.com" 29 | git add upath/_flavour_sources.py 30 | git commit -m "Auto-update generated flavours" || echo "No changes" 31 | git push 32 | -------------------------------------------------------------------------------- /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | version: 2 2 | 3 | updates: 4 | - directory: "/" 5 | package-ecosystem: "pip" 6 | schedule: 7 | interval: "weekly" 8 | labels: 9 | - "maintenance :construction:" 10 | groups: 11 | # Group all pip dependencies into one PR 12 | pip-dependencies: 13 | patterns: 14 | - "*" 15 | # Update via cruft 16 | ignore: 17 | - dependency-name: "mkdocs*" 18 | - dependency-name: "pytest*" 19 | - dependency-name: "pylint" 20 | - dependency-name: "mypy" 21 | 22 | - directory: "/" 23 | package-ecosystem: "github-actions" 24 | schedule: 25 | interval: "weekly" 26 | labels: 27 | - "maintenance :construction:" 28 | # Update via cruft 29 | ignore: 30 | - 
dependency-name: "actions/checkout" 31 | - dependency-name: "actions/setup-python" 32 | - dependency-name: "pypa/gh-action-pypi-publish" 33 | - dependency-name: "codecov/codecov-action" 34 | -------------------------------------------------------------------------------- /docs/api/registry.md: -------------------------------------------------------------------------------- 1 | # Registry :file_cabinet: 2 | 3 | The UPath registry system manages filesystem-specific path implementations. It allows you to 4 | register custom UPath subclasses for different protocols and retrieve the appropriate 5 | implementation for a given protocol. 6 | 7 | ## Functions 8 | 9 | ::: upath.registry.get_upath_class 10 | options: 11 | heading_level: 3 12 | show_root_heading: true 13 | show_root_full_path: false 14 | 15 | ::: upath.registry.register_implementation 16 | options: 17 | heading_level: 3 18 | show_root_heading: true 19 | show_root_full_path: false 20 | 21 | ::: upath.registry.available_implementations 22 | options: 23 | heading_level: 3 24 | show_root_heading: true 25 | show_root_full_path: false 26 | 27 | --- 28 | 29 | ## See Also :link: 30 | 31 | - [UPath](index.md) - Main UPath class documentation 32 | - [Implementations](implementations.md) - Built-in UPath subclasses 33 | - [Extensions](extensions.md) - Extending UPath functionality 34 | -------------------------------------------------------------------------------- /docs/concepts/index.md: -------------------------------------------------------------------------------- 1 | # Overview :map: 2 | 3 | Universal Pathlib brings together fsspec and pathlib to provide a unified, pythonic interface for working with files across different storage systems. Understanding how these components work together will help you make the most of universal-pathlib. 
4 | 5 | - **[Filesystem Spec](fsspec.md)** provides the foundation—a specification and collection of filesystem implementations that offer consistent access to local storage, cloud services, and remote systems. 6 | - **[Pathlib](pathlib.md)** defines the familiar object-oriented API from Python's standard library for working with filesystem paths. 7 | - **[Universal Pathlib](upath.md)** ties them together, implementing the [pathlib-abc](https://github.com/barneygale/pathlib-abc) interface on top of fsspec filesystems to give you a Path-like experience everywhere. 8 | 9 | Start with [fsspec filesystems](fsspec.md) to understand the available storage backends, then explore [stdlib pathlib](pathlib.md) to learn about the path interface, and finally see [upath](upath.md) to discover how universal-pathlib combines them into a powerful, unified API. 10 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022, Andrew Fulton 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_memory.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from upath import UPath 4 | from upath.implementations.memory import MemoryPath 5 | 6 | from ..cases import BaseTests 7 | 8 | 9 | class TestMemoryPath(BaseTests): 10 | @pytest.fixture(autouse=True) 11 | def path(self, local_testdir): 12 | if not local_testdir.startswith("/"): 13 | local_testdir = "/" + local_testdir 14 | path = f"memory:{local_testdir}" 15 | self.path = UPath(path) 16 | self.prepare_file_system() 17 | 18 | def test_is_MemoryPath(self): 19 | assert isinstance(self.path, MemoryPath) 20 | 21 | 22 | @pytest.mark.parametrize( 23 | "path, expected", 24 | [ 25 | ("memory:/", "memory://"), 26 | ("memory:/a", "memory://a"), 27 | ("memory:/a/b", "memory://a/b"), 28 | ("memory://", "memory://"), 29 | ("memory://a", "memory://a"), 30 | ("memory://a/b", "memory://a/b"), 31 | ("memory:///", "memory://"), 32 | ("memory:///a", "memory://a"), 33 | ("memory:///a/b", "memory://a/b"), 34 | ], 35 | ) 36 | def test_string_representation(path, expected): 37 | path = UPath(path) 38 | assert str(path) == expected 39 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_smb.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | from fsspec import __version__ as fsspec_version 3 | from packaging.version import Version 4 | 5 | from upath import UPath 6 | from upath.tests.cases import BaseTests 7 | from upath.tests.utils import skip_on_windows 8 | 9 | 10 | 
@skip_on_windows 11 | class TestUPathSMB(BaseTests): 12 | 13 | @pytest.fixture(autouse=True) 14 | def path(self, smb_fixture): 15 | self.path = UPath(smb_fixture) 16 | 17 | @pytest.mark.parametrize( 18 | "pattern", 19 | ( 20 | "*.txt", 21 | pytest.param( 22 | "*", 23 | marks=pytest.mark.xfail( 24 | reason="SMBFileSystem.info appends '/' to dirs" 25 | ), 26 | ), 27 | pytest.param( 28 | "**/*.txt", 29 | marks=( 30 | pytest.mark.xfail(reason="requires fsspec>=2023.9.0") 31 | if Version(fsspec_version) < Version("2023.9.0") 32 | else () 33 | ), 34 | ), 35 | ), 36 | ) 37 | def test_glob(self, pathlib_base, pattern): 38 | super().test_glob(pathlib_base, pattern) 39 | -------------------------------------------------------------------------------- /upath/implementations/hdfs.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from typing import TYPE_CHECKING 5 | 6 | from upath.core import UPath 7 | from upath.types import JoinablePathLike 8 | 9 | if TYPE_CHECKING: 10 | from typing import Literal 11 | 12 | if sys.version_info >= (3, 11): 13 | from typing import Unpack 14 | else: 15 | from typing_extensions import Unpack 16 | 17 | from upath._chain import FSSpecChainParser 18 | from upath.types.storage_options import HDFSStorageOptions 19 | 20 | __all__ = ["HDFSPath"] 21 | 22 | 23 | class HDFSPath(UPath): 24 | __slots__ = () 25 | 26 | if TYPE_CHECKING: 27 | 28 | def __init__( 29 | self, 30 | *args: JoinablePathLike, 31 | protocol: Literal["hdfs"] | None = ..., 32 | chain_parser: FSSpecChainParser = ..., 33 | **storage_options: Unpack[HDFSStorageOptions], 34 | ) -> None: ... 
35 | 36 | def mkdir( 37 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False 38 | ) -> None: 39 | if not exist_ok and self.exists(): 40 | raise FileExistsError(str(self)) 41 | super().mkdir(mode=mode, parents=parents, exist_ok=exist_ok) 42 | -------------------------------------------------------------------------------- /upath/implementations/sftp.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from typing import TYPE_CHECKING 5 | 6 | from upath.core import UPath 7 | from upath.types import JoinablePathLike 8 | 9 | if TYPE_CHECKING: 10 | from typing import Literal 11 | 12 | if sys.version_info >= (3, 11): 13 | from typing import Unpack 14 | else: 15 | from typing_extensions import Unpack 16 | 17 | from upath._chain import FSSpecChainParser 18 | from upath.types.storage_options import SFTPStorageOptions 19 | 20 | __all__ = ["SFTPPath"] 21 | 22 | 23 | class SFTPPath(UPath): 24 | __slots__ = () 25 | 26 | if TYPE_CHECKING: 27 | 28 | def __init__( 29 | self, 30 | *args: JoinablePathLike, 31 | protocol: Literal["sftp"] | None = ..., 32 | chain_parser: FSSpecChainParser = ..., 33 | **storage_options: Unpack[SFTPStorageOptions], 34 | ) -> None: ... 
35 | 36 | @property 37 | def path(self) -> str: 38 | path = super().path 39 | if len(path) > 1: 40 | return path.removesuffix("/") 41 | return path 42 | 43 | def __str__(self) -> str: 44 | path_str = super().__str__() 45 | if path_str.startswith(("ssh:///", "sftp:///")): 46 | return path_str.removesuffix("/") 47 | return path_str 48 | -------------------------------------------------------------------------------- /upath/tests/pathlib/conftest.py: -------------------------------------------------------------------------------- 1 | import sys 2 | 3 | import pytest 4 | 5 | BASE_URL = "https://raw.githubusercontent.com/python/cpython/{}/Lib/test/test_pathlib.py" # noqa 6 | 7 | # current origin of pathlib tests: 8 | TEST_FILES = { 9 | "test_pathlib_38.py": "7475aa2c590e33a47f5e79e4079bca0645e93f2f", 10 | "test_pathlib_39.py": "d718764f389acd1bf4a5a65661bb58862f14fb98", 11 | "test_pathlib_310.py": "b382bf50c53e6eab09f3e3bf0802ab052cb0289d", 12 | "test_pathlib_311.py": "846a23d0b8f08e62a90682c51ce01301eb923f2e", 13 | "test_pathlib_312.py": "97a6a418167f1c8bbb014fab813e440b88cf2221", # 3.12.0b4 14 | } 15 | 16 | 17 | def pytest_ignore_collect(collection_path): 18 | """prevent pathlib tests for Python versions other than the current one from being collected 19 | 20 | (otherwise we see a lot of skipped tests in the pytest output) 21 | """ 22 | v2 = sys.version_info[:2] 23 | return { 24 | "test_pathlib_38.py": v2 != (3, 8), 25 | "test_pathlib_39.py": v2 != (3, 9), 26 | "test_pathlib_310.py": v2 != (3, 10), 27 | "test_pathlib_311.py": v2 != (3, 11), 28 | "test_pathlib_312.py": v2 != (3, 12), 29 | }.get(collection_path.name, False) 30 | 31 | 32 | def pytest_collection_modifyitems(config, items): 33 | """mark all tests in this folder as pathlib tests""" 34 | for item in items: 35 | item.add_marker(pytest.mark.pathlib) 36 | -------------------------------------------------------------------------------- /upath/implementations/memory.py: 
-------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from typing import TYPE_CHECKING 5 | 6 | from upath.core import UPath 7 | from upath.types import JoinablePathLike 8 | 9 | if TYPE_CHECKING: 10 | from typing import Literal 11 | 12 | if sys.version_info >= (3, 11): 13 | from typing import Unpack 14 | else: 15 | from typing_extensions import Unpack 16 | 17 | from upath._chain import FSSpecChainParser 18 | from upath.types.storage_options import MemoryStorageOptions 19 | 20 | __all__ = ["MemoryPath"] 21 | 22 | 23 | class MemoryPath(UPath): 24 | __slots__ = () 25 | 26 | if TYPE_CHECKING: 27 | 28 | def __init__( 29 | self, 30 | *args: JoinablePathLike, 31 | protocol: Literal["memory"] | None = ..., 32 | chain_parser: FSSpecChainParser = ..., 33 | **storage_options: Unpack[MemoryStorageOptions], 34 | ) -> None: ... 35 | 36 | @property 37 | def path(self) -> str: 38 | path = super().path 39 | return "/" if path in {"", "."} else path 40 | 41 | def is_absolute(self) -> bool: 42 | if self._relative_base is None and self.__vfspath__() == "/": 43 | return True 44 | return super().is_absolute() 45 | 46 | def __str__(self) -> str: 47 | s = super().__str__() 48 | if s.startswith("memory:///"): 49 | s = s.replace("memory:///", "memory://", 1) 50 | return s 51 | -------------------------------------------------------------------------------- /upath/implementations/github.py: -------------------------------------------------------------------------------- 1 | """ 2 | GitHub file system implementation 3 | """ 4 | 5 | from __future__ import annotations 6 | 7 | import sys 8 | from collections.abc import Sequence 9 | from typing import TYPE_CHECKING 10 | 11 | from upath.core import UPath 12 | from upath.types import JoinablePathLike 13 | 14 | if TYPE_CHECKING: 15 | from typing import Literal 16 | 17 | if sys.version_info >= (3, 11): 18 | from typing import Unpack 19 | else: 20 | from 
typing_extensions import Unpack 21 | 22 | from upath._chain import FSSpecChainParser 23 | from upath.types.storage_options import GitHubStorageOptions 24 | 25 | __all__ = ["GitHubPath"] 26 | 27 | 28 | class GitHubPath(UPath): 29 | """ 30 | GitHubPath supporting the fsspec.GitHubFileSystem 31 | """ 32 | 33 | __slots__ = () 34 | 35 | if TYPE_CHECKING: 36 | 37 | def __init__( 38 | self, 39 | *args: JoinablePathLike, 40 | protocol: Literal["github"] | None = ..., 41 | chain_parser: FSSpecChainParser = ..., 42 | **storage_options: Unpack[GitHubStorageOptions], 43 | ) -> None: ... 44 | 45 | @property 46 | def path(self) -> str: 47 | pth = super().path 48 | if pth == ".": 49 | return "" 50 | return pth 51 | 52 | @property 53 | def parts(self) -> Sequence[str]: 54 | parts = super().parts 55 | if parts and parts[0] == "/": 56 | return parts[1:] 57 | else: 58 | return parts 59 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_gcs.py: -------------------------------------------------------------------------------- 1 | import fsspec 2 | import pytest 3 | 4 | from upath import UPath 5 | from upath.implementations.cloud import GCSPath 6 | 7 | from ..cases import BaseTests 8 | from ..utils import skip_on_windows 9 | 10 | 11 | @skip_on_windows 12 | @pytest.mark.usefixtures("path") 13 | class TestGCSPath(BaseTests): 14 | SUPPORTS_EMPTY_DIRS = False 15 | 16 | @pytest.fixture(autouse=True, scope="function") 17 | def path(self, gcs_fixture): 18 | path, endpoint_url = gcs_fixture 19 | self.path = UPath(path, endpoint_url=endpoint_url, token="anon") 20 | 21 | def test_is_GCSPath(self): 22 | assert isinstance(self.path, GCSPath) 23 | 24 | def test_rmdir(self): 25 | dirname = "rmdir_test" 26 | mock_dir = self.path.joinpath(dirname) 27 | mock_dir.joinpath("test.txt").write_text("hello") 28 | mock_dir.fs.invalidate_cache() 29 | mock_dir.rmdir() 30 | assert not mock_dir.exists() 31 | with pytest.raises(NotADirectoryError): 32 
| self.path.joinpath("file1.txt").rmdir() 33 | 34 | @pytest.mark.skip 35 | def test_makedirs_exist_ok_false(self): 36 | pass 37 | 38 | 39 | @skip_on_windows 40 | def test_mkdir_in_empty_bucket(docker_gcs): 41 | fs = fsspec.filesystem("gcs", endpoint_url=docker_gcs, token="anon") 42 | fs.mkdir("my-fresh-bucket") 43 | assert "my-fresh-bucket/" in fs.buckets 44 | fs.invalidate_cache() 45 | del fs 46 | 47 | UPath( 48 | "gs://my-fresh-bucket/some-dir/another-dir/file", 49 | endpoint_url=docker_gcs, 50 | token="anon", 51 | ).parent.mkdir(parents=True, exist_ok=True) 52 | -------------------------------------------------------------------------------- /upath/implementations/cached.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from types import MappingProxyType 5 | from typing import TYPE_CHECKING 6 | 7 | from upath.core import UPath 8 | from upath.types import JoinablePathLike 9 | 10 | if TYPE_CHECKING: 11 | from collections.abc import Mapping 12 | from typing import Any 13 | from typing import Literal 14 | 15 | if sys.version_info >= (3, 11): 16 | from typing import Unpack 17 | else: 18 | from typing_extensions import Unpack 19 | 20 | from fsspec import AbstractFileSystem 21 | 22 | from upath._chain import FSSpecChainParser 23 | from upath.types.storage_options import SimpleCacheStorageOptions 24 | 25 | 26 | __all__ = ["SimpleCachePath"] 27 | 28 | 29 | class SimpleCachePath(UPath): 30 | __slots__ = () 31 | 32 | if TYPE_CHECKING: 33 | 34 | def __init__( 35 | self, 36 | *args: JoinablePathLike, 37 | protocol: Literal["simplecache"] | None = ..., 38 | chain_parser: FSSpecChainParser = ..., 39 | **storage_options: Unpack[SimpleCacheStorageOptions], 40 | ) -> None: ... 
41 | 42 | @classmethod 43 | def _fs_factory( 44 | cls, 45 | urlpath: str, 46 | protocol: str, 47 | storage_options: Mapping[str, Any], 48 | ) -> AbstractFileSystem: 49 | so = dict(storage_options) 50 | so.pop("fo", None) 51 | return super()._fs_factory( 52 | urlpath, 53 | protocol, 54 | so, 55 | ) 56 | 57 | @property 58 | def storage_options(self) -> Mapping[str, Any]: 59 | so = self._storage_options.copy() 60 | so.pop("fo", None) 61 | return MappingProxyType(so) 62 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_azure.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from upath import UPath 4 | from upath.implementations.cloud import AzurePath 5 | 6 | from ..cases import BaseTests 7 | from ..utils import skip_on_windows 8 | 9 | 10 | @skip_on_windows 11 | @pytest.mark.usefixtures("path") 12 | class TestAzurePath(BaseTests): 13 | SUPPORTS_EMPTY_DIRS = False 14 | 15 | @pytest.fixture(autouse=True, scope="function") 16 | def path(self, azurite_credentials, azure_fixture): 17 | account_name, connection_string = azurite_credentials 18 | 19 | self.storage_options = { 20 | "account_name": account_name, 21 | "connection_string": connection_string, 22 | } 23 | self.path = UPath(azure_fixture, **self.storage_options) 24 | self.prepare_file_system() 25 | 26 | def test_is_AzurePath(self): 27 | assert isinstance(self.path, AzurePath) 28 | 29 | def test_rmdir(self): 30 | new_dir = self.path / "new_dir_rmdir" 31 | new_dir.mkdir() 32 | path = new_dir / "test.txt" 33 | path.write_text("hello") 34 | assert path.exists() 35 | new_dir.rmdir() 36 | assert not new_dir.exists() 37 | 38 | with pytest.raises(NotADirectoryError): 39 | (self.path / "a" / "file.txt").rmdir() 40 | 41 | def test_protocol(self): 42 | # test all valid protocols for azure... 
43 | protocol = self.path.protocol 44 | assert protocol in ["abfs", "abfss", "adl", "az"] 45 | 46 | def test_broken_mkdir(self): 47 | path = UPath( 48 | "az://new-container/", 49 | **self.storage_options, 50 | ) 51 | if path.exists(): 52 | path.rmdir() 53 | path.mkdir(parents=True, exist_ok=False) 54 | 55 | (path / "file").write_text("foo") 56 | assert path.exists() 57 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | default_language_version: 2 | python: python3 3 | exclude: ^upath/tests/pathlib/test_pathlib.*\.py|^upath/tests/pathlib/_test_support\.py|^upath/_flavour_sources\.py 4 | repos: 5 | - repo: https://github.com/psf/black 6 | rev: 25.1.0 7 | hooks: 8 | - id: black 9 | - repo: https://github.com/pre-commit/pre-commit-hooks 10 | rev: v4.6.0 11 | hooks: 12 | - id: check-added-large-files 13 | - id: check-case-conflict 14 | - id: check-docstring-first 15 | - id: check-executables-have-shebangs 16 | - id: check-json 17 | - id: check-merge-conflict 18 | args: ['--assume-in-merge'] 19 | - id: check-toml 20 | - id: check-yaml 21 | exclude: ^mkdocs\.yml$ 22 | - id: debug-statements 23 | - id: end-of-file-fixer 24 | - id: mixed-line-ending 25 | args: ['--fix=lf'] 26 | - id: sort-simple-yaml 27 | - id: trailing-whitespace 28 | - repo: https://github.com/codespell-project/codespell 29 | rev: v2.4.1 30 | hooks: 31 | - id: codespell 32 | args: ['-L', 'fo'] 33 | additional_dependencies: ["tomli"] 34 | - repo: https://github.com/asottile/pyupgrade 35 | rev: v3.19.1 36 | hooks: 37 | - id: pyupgrade 38 | args: [--py39-plus] 39 | - repo: https://github.com/PyCQA/isort 40 | rev: 5.13.2 41 | hooks: 42 | - id: isort 43 | - repo: https://github.com/pycqa/flake8 44 | rev: 7.2.0 45 | hooks: 46 | - id: flake8 47 | additional_dependencies: 48 | - flake8-bugbear==24.1.17 49 | - flake8-comprehensions==3.14.0 50 | - flake8-debugger==4.1.2 51 | 
- flake8-string-format==0.3.0 52 | - repo: https://github.com/pycqa/bandit 53 | rev: 1.8.3 54 | hooks: 55 | - id: bandit 56 | args: [-c, pyproject.toml] 57 | additional_dependencies: ["tomli>=1.1.0"] 58 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_sftp.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from upath import UPath 4 | from upath.tests.cases import BaseTests 5 | from upath.tests.utils import skip_on_windows 6 | from upath.tests.utils import xfail_if_version 7 | 8 | _xfail_old_fsspec = xfail_if_version( 9 | "fsspec", 10 | lt="2022.7.0", 11 | reason="fsspec<2022.7.0 sftp does not support create_parents", 12 | ) 13 | 14 | 15 | @skip_on_windows 16 | class TestUPathSFTP(BaseTests): 17 | 18 | @pytest.fixture(autouse=True) 19 | def path(self, ssh_fixture): 20 | self.path = UPath(ssh_fixture) 21 | 22 | @_xfail_old_fsspec 23 | def test_mkdir(self): 24 | super().test_mkdir() 25 | 26 | @_xfail_old_fsspec 27 | def test_mkdir_exists_ok_true(self): 28 | super().test_mkdir_exists_ok_true() 29 | 30 | @_xfail_old_fsspec 31 | def test_mkdir_exists_ok_false(self): 32 | super().test_mkdir_exists_ok_false() 33 | 34 | @_xfail_old_fsspec 35 | def test_mkdir_parents_true_exists_ok_false(self): 36 | super().test_mkdir_parents_true_exists_ok_false() 37 | 38 | @_xfail_old_fsspec 39 | def test_mkdir_parents_true_exists_ok_true(self): 40 | super().test_mkdir_parents_true_exists_ok_true() 41 | 42 | 43 | @pytest.mark.parametrize( 44 | "args,parts", 45 | [ 46 | (("sftp://user@host",), ("/",)), 47 | (("sftp://user@host/",), ("/",)), 48 | (("sftp://user@host", ""), ("/",)), 49 | (("sftp://user@host/", ""), ("/",)), 50 | (("sftp://user@host", "/"), ("/",)), 51 | (("sftp://user@host/", "/"), ("/",)), 52 | (("sftp://user@host/abc",), ("/", "abc")), 53 | (("sftp://user@host", "abc"), ("/", "abc")), 54 | (("sftp://user@host", "/abc"), ("/", "abc")), 55 | 
(("sftp://user@host/", "/abc"), ("/", "abc")), 56 | ], 57 | ) 58 | def test_join_produces_correct_parts(args, parts): 59 | pth = UPath(*args) 60 | assert pth.storage_options == {"host": "host", "username": "user"} 61 | assert pth.parts == parts 62 | -------------------------------------------------------------------------------- /docs/api/extensions.md: -------------------------------------------------------------------------------- 1 | # Extensions :puzzle_piece: 2 | 3 | The extensions module provides a base class for extending UPath functionality while maintaining 4 | compatibility with all filesystem implementations. 5 | 6 | ## ProxyUPath 7 | 8 | ::: upath.extensions.ProxyUPath 9 | options: 10 | heading_level: 3 11 | show_root_heading: true 12 | show_root_full_path: false 13 | members: false 14 | show_bases: true 15 | 16 | --- 17 | 18 | ## Usage Example 19 | 20 | `ProxyUPath` allows you to extend the UPath interface with additional methods while 21 | preserving compatibility with all supported filesystem implementations. It acts as a 22 | wrapper around any UPath instance. 
23 | 24 | ### Creating a Custom Extension 25 | 26 | ```python 27 | from upath import UPath 28 | from upath.extensions import ProxyUPath 29 | 30 | class MyCustomPath(ProxyUPath): 31 | """Custom path with additional functionality""" 32 | 33 | def custom_method(self) -> str: 34 | """Add your custom functionality here""" 35 | return f"Custom processing for: {self.path}" 36 | 37 | def enhanced_read(self) -> str: 38 | """Enhanced read with preprocessing""" 39 | content = self.read_text() 40 | # Add custom processing 41 | return content.upper() 42 | 43 | # Use with any filesystem 44 | s3_path = MyCustomPath("s3://bucket/file.txt") 45 | local_path = MyCustomPath("/tmp/file.txt") 46 | gcs_path = MyCustomPath("gs://bucket/file.txt") 47 | 48 | # All standard UPath methods work 49 | print(s3_path.exists()) 50 | print(local_path.parent) 51 | 52 | # Always a subclass of your class 53 | assert isinstance(s3_path, MyCustomPath) 54 | assert isinstance(local_path, MyCustomPath) 55 | 56 | # Plus your custom methods 57 | print(s3_path.custom_method()) 58 | content = local_path.enhanced_read() 59 | ``` 60 | 61 | --- 62 | 63 | ## See Also :link: 64 | 65 | - [UPath](index.md) - Main UPath class documentation 66 | - [Implementations](implementations.md) - Built-in UPath subclasses 67 | - [Registry](registry.md) - Implementation registry 68 | -------------------------------------------------------------------------------- /upath/implementations/tar.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import stat 4 | import sys 5 | import warnings 6 | from typing import TYPE_CHECKING 7 | 8 | from upath._stat import UPathStatResult 9 | from upath.core import UPath 10 | from upath.types import JoinablePathLike 11 | from upath.types import StatResultType 12 | 13 | if TYPE_CHECKING: 14 | from collections.abc import Iterator 15 | from typing import Literal 16 | 17 | if sys.version_info >= (3, 11): 18 | from typing 
import Self 19 | from typing import Unpack 20 | else: 21 | from typing_extensions import Self 22 | from typing_extensions import Unpack 23 | 24 | from upath._chain import FSSpecChainParser 25 | from upath.types.storage_options import TarStorageOptions 26 | 27 | 28 | __all__ = ["TarPath"] 29 | 30 | 31 | class TarPath(UPath): 32 | __slots__ = () 33 | 34 | if TYPE_CHECKING: 35 | 36 | def __init__( 37 | self, 38 | *args: JoinablePathLike, 39 | protocol: Literal["tar"] | None = ..., 40 | chain_parser: FSSpecChainParser = ..., 41 | **storage_options: Unpack[TarStorageOptions], 42 | ) -> None: ... 43 | 44 | def stat( 45 | self, 46 | *, 47 | follow_symlinks: bool = True, 48 | ) -> StatResultType: 49 | if not follow_symlinks: 50 | warnings.warn( 51 | f"{type(self).__name__}.stat(follow_symlinks=False):" 52 | " is currently ignored.", 53 | UserWarning, 54 | stacklevel=2, 55 | ) 56 | info = self.fs.info(self.path).copy() 57 | # convert mode 58 | if info["type"] == "directory": 59 | info["mode"] = stat.S_IFDIR 60 | elif info["type"] == "file": 61 | info["mode"] = stat.S_IFREG 62 | return UPathStatResult.from_info(info) 63 | 64 | def iterdir(self) -> Iterator[Self]: 65 | it = iter(super().iterdir()) 66 | p0 = next(it) 67 | if p0.name != "": 68 | yield p0 69 | yield from it 70 | -------------------------------------------------------------------------------- /upath/tests/utils.py: -------------------------------------------------------------------------------- 1 | import operator 2 | import sys 3 | from contextlib import contextmanager 4 | 5 | import pytest 6 | from fsspec.utils import get_package_version_without_import 7 | from packaging.version import Version 8 | 9 | 10 | def skip_on_windows(func): 11 | return pytest.mark.skipif( 12 | sys.platform.startswith("win"), reason="Don't run on Windows" 13 | )(func) 14 | 15 | 16 | def only_on_windows(func): 17 | return pytest.mark.skipif( 18 | not sys.platform.startswith("win"), reason="Only run on Windows" 19 | )(func) 20 | 21 | 22 | 
def posixify(path): 23 | return str(path).replace("\\", "/") 24 | 25 | 26 | def xfail_if_version(module, *, reason, **conditions): 27 | ver_str = get_package_version_without_import(module) 28 | if ver_str is None: 29 | return pytest.mark.skip(reason=f"NOT INSTALLED ({reason})") 30 | ver = Version(ver_str) 31 | if not set(conditions).issubset({"lt", "le", "ne", "eq", "ge", "gt"}): 32 | raise ValueError("unknown condition") 33 | cond = True 34 | for op, val in conditions.items(): 35 | cond &= getattr(operator, op)(ver, Version(val)) 36 | return pytest.mark.xfail(cond, reason=reason) 37 | 38 | 39 | def xfail_if_no_ssl_connection(func): 40 | try: 41 | import requests 42 | except ImportError: 43 | return pytest.mark.skip(reason="requests not installed")(func) 44 | try: 45 | requests.get("https://example.com") 46 | except (requests.exceptions.ConnectionError, requests.exceptions.SSLError): 47 | return pytest.mark.xfail(reason="No SSL connection")(func) 48 | else: 49 | return func 50 | 51 | 52 | @contextmanager 53 | def temporary_register(protocol, cls): 54 | """helper to temporarily register a protocol for testing purposes""" 55 | from upath.registry import _registry 56 | from upath.registry import get_upath_class 57 | 58 | m = _registry._m.maps[0] 59 | try: 60 | m[protocol] = cls 61 | get_upath_class.cache_clear() 62 | yield 63 | finally: 64 | m.clear() 65 | get_upath_class.cache_clear() 66 | -------------------------------------------------------------------------------- /upath/implementations/ftp.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from ftplib import error_perm as FTPPermanentError # nosec B402 5 | from typing import TYPE_CHECKING 6 | 7 | from upath.core import UPath 8 | from upath.types import UNSET_DEFAULT 9 | from upath.types import JoinablePathLike 10 | 11 | if TYPE_CHECKING: 12 | from typing import Any 13 | from typing import Literal 14 | 15 | if 
sys.version_info >= (3, 11): 16 | from typing import Self 17 | from typing import Unpack 18 | else: 19 | from typing_extensions import Self 20 | from typing_extensions import Unpack 21 | 22 | from upath._chain import FSSpecChainParser 23 | from upath.types import WritablePathLike 24 | from upath.types.storage_options import FTPStorageOptions 25 | 26 | __all__ = ["FTPPath"] 27 | 28 | 29 | class FTPPath(UPath): 30 | __slots__ = () 31 | 32 | if TYPE_CHECKING: 33 | 34 | def __init__( 35 | self, 36 | *args: JoinablePathLike, 37 | protocol: Literal["ftp"] | None = ..., 38 | chain_parser: FSSpecChainParser = ..., 39 | **storage_options: Unpack[FTPStorageOptions], 40 | ) -> None: ... 41 | 42 | def mkdir( 43 | self, 44 | mode: int = 0o777, 45 | parents: bool = False, 46 | exist_ok: bool = False, 47 | ) -> None: 48 | try: 49 | return super().mkdir(mode, parents, exist_ok) 50 | except FTPPermanentError as e: 51 | if e.args[0].startswith("550") and exist_ok: 52 | return 53 | raise FileExistsError(str(self)) from e 54 | 55 | def rename( 56 | self, 57 | target: WritablePathLike, 58 | *, # note: non-standard compared to pathlib 59 | recursive: bool = UNSET_DEFAULT, 60 | maxdepth: int | None = UNSET_DEFAULT, 61 | **kwargs: Any, 62 | ) -> Self: 63 | t = super().rename(target, recursive=recursive, maxdepth=maxdepth, **kwargs) 64 | self_dir = self.parent.path 65 | t.fs.invalidate_cache(self_dir) 66 | self.fs.invalidate_cache(self_dir) 67 | return t 68 | -------------------------------------------------------------------------------- /upath/implementations/data.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from collections.abc import Sequence 5 | from typing import TYPE_CHECKING 6 | 7 | from upath.core import UnsupportedOperation 8 | from upath.core import UPath 9 | from upath.types import JoinablePathLike 10 | 11 | if TYPE_CHECKING: 12 | from typing import Literal 13 | 14 | if 
sys.version_info >= (3, 11): 15 | from typing import Self 16 | from typing import Unpack 17 | else: 18 | from typing_extensions import Self 19 | from typing_extensions import Unpack 20 | 21 | from upath._chain import FSSpecChainParser 22 | from upath.types.storage_options import DataStorageOptions 23 | 24 | __all__ = ["DataPath"] 25 | 26 | 27 | class DataPath(UPath): 28 | __slots__ = () 29 | 30 | if TYPE_CHECKING: 31 | 32 | def __init__( 33 | self, 34 | *args: JoinablePathLike, 35 | protocol: Literal["data"] | None = ..., 36 | chain_parser: FSSpecChainParser = ..., 37 | **storage_options: Unpack[DataStorageOptions], 38 | ) -> None: ... 39 | 40 | @property 41 | def parts(self) -> Sequence[str]: 42 | return (self.path,) 43 | 44 | def __str__(self) -> str: 45 | return self.parser.join(*self._raw_urlpaths) 46 | 47 | def with_segments(self, *pathsegments: JoinablePathLike) -> Self: 48 | raise UnsupportedOperation("path operation not supported by DataPath") 49 | 50 | def with_suffix(self, suffix: str) -> Self: 51 | raise UnsupportedOperation("path operation not supported by DataPath") 52 | 53 | def mkdir( 54 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False 55 | ) -> None: 56 | raise FileExistsError(str(self)) 57 | 58 | def write_bytes(self, data: bytes) -> int: 59 | raise UnsupportedOperation("DataPath does not support writing") 60 | 61 | def write_text( 62 | self, 63 | data: str, 64 | encoding: str | None = None, 65 | errors: str | None = None, 66 | newline: str | None = None, 67 | ) -> int: 68 | raise UnsupportedOperation("DataPath does not support writing") 69 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_local.py: -------------------------------------------------------------------------------- 1 | import os 2 | from pathlib import Path 3 | 4 | import pytest 5 | 6 | from upath import UPath 7 | from upath.implementations.local import LocalPath 8 | from upath.tests.cases 
import BaseTests 9 | from upath.tests.utils import xfail_if_version 10 | 11 | 12 | class TestFSSpecLocal(BaseTests): 13 | @pytest.fixture(autouse=True) 14 | def path(self, local_testdir): 15 | path = f"file://{local_testdir}" 16 | self.path = UPath(path) 17 | 18 | def test_is_LocalPath(self): 19 | assert isinstance(self.path, LocalPath) 20 | 21 | def test_cwd(self): 22 | cwd = type(self.path).cwd() 23 | assert isinstance(cwd, LocalPath) 24 | assert cwd.path == Path.cwd().as_posix() 25 | 26 | def test_home(self): 27 | cwd = type(self.path).home() 28 | assert isinstance(cwd, LocalPath) 29 | assert cwd.path == Path.home().as_posix() 30 | 31 | def test_chmod(self): 32 | self.path.joinpath("file1.txt").chmod(777) 33 | 34 | 35 | @xfail_if_version("fsspec", lt="2023.10.0", reason="requires fsspec>=2023.10.0") 36 | class TestRayIOFSSpecLocal(BaseTests): 37 | @pytest.fixture(autouse=True) 38 | def path(self, local_testdir): 39 | path = f"local://{local_testdir}" 40 | self.path = UPath(path) 41 | 42 | def test_is_LocalPath(self): 43 | assert isinstance(self.path, LocalPath) 44 | 45 | def test_cwd(self): 46 | cwd = type(self.path).cwd() 47 | assert isinstance(cwd, LocalPath) 48 | assert cwd.path == Path.cwd().as_posix() 49 | 50 | def test_home(self): 51 | cwd = type(self.path).home() 52 | assert isinstance(cwd, LocalPath) 53 | assert cwd.path == Path.home().as_posix() 54 | 55 | def test_chmod(self): 56 | self.path.joinpath("file1.txt").chmod(777) 57 | 58 | 59 | @pytest.mark.parametrize( 60 | "protocol,path", 61 | [ 62 | (None, "/tmp/somefile.txt"), 63 | ("file", "file:///tmp/somefile.txt"), 64 | ("local", "local:///tmp/somefile.txt"), 65 | ], 66 | ) 67 | def test_local_paths_are_pathlike(protocol, path): 68 | assert isinstance(UPath(path, protocol=protocol), os.PathLike) 69 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_webdav.py: 
-------------------------------------------------------------------------------- 1 | from pathlib import Path 2 | 3 | import pytest 4 | 5 | from upath import UPath 6 | 7 | from ..cases import BaseTests 8 | 9 | 10 | class TestUPathWebdav(BaseTests): 11 | @pytest.fixture(autouse=True, scope="function") 12 | def path(self, webdav_fixture): 13 | self.path = UPath(webdav_fixture, auth=("USER", "PASSWORD")) 14 | 15 | def test_fsspec_compat(self): 16 | pass 17 | 18 | def test_storage_options(self): 19 | # we need to add base_url to storage options for webdav filesystems, 20 | # to be able to serialize the http protocol to string... 21 | storage_options = self.path.storage_options 22 | base_url = storage_options["base_url"] 23 | assert storage_options == self.path.fs.storage_options 24 | assert base_url == self.path.fs.client.base_url 25 | 26 | def test_read_with_fsspec(self): 27 | # this test used to fail with fsspec<2022.5.0 because webdav was not 28 | # registered in fsspec. But when UPath(webdav_fixture) is called, to 29 | # run the BaseTests, the upath.implementations.webdav module is 30 | # imported, which registers the webdav implementation in fsspec. 
31 | super().test_read_with_fsspec() 32 | 33 | @pytest.mark.parametrize( 34 | "target_factory", 35 | [ 36 | lambda obj, name: str(obj.joinpath(name).absolute()), 37 | pytest.param( 38 | lambda obj, name: UPath(obj.absolute().joinpath(name).path), 39 | marks=pytest.mark.xfail(reason="webdav has no root..."), 40 | ), 41 | pytest.param( 42 | lambda obj, name: Path(obj.absolute().joinpath(name).path), 43 | marks=pytest.mark.xfail(reason="webdav has no root..."), 44 | ), 45 | lambda obj, name: obj.absolute().joinpath(name), 46 | ], 47 | ids=[ 48 | "str_absolute", 49 | "plain_upath_absolute", 50 | "plain_path_absolute", 51 | "self_upath_absolute", 52 | ], 53 | ) 54 | def test_rename_with_target_absolute(self, target_factory): 55 | super().test_rename_with_target_absolute(target_factory) 56 | -------------------------------------------------------------------------------- /.github/workflows/tests.yml: -------------------------------------------------------------------------------- 1 | name: Tests 2 | 3 | on: 4 | push: 5 | branches: [main] 6 | pull_request: 7 | workflow_dispatch: 8 | 9 | permissions: 10 | contents: read 11 | 12 | env: 13 | FORCE_COLOR: "1" 14 | 15 | concurrency: 16 | group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }} 17 | cancel-in-progress: true 18 | 19 | jobs: 20 | tests: 21 | timeout-minutes: 10 22 | runs-on: ${{ matrix.os }} 23 | strategy: 24 | fail-fast: false 25 | matrix: 26 | os: [ubuntu-latest, windows-latest, macos-latest] 27 | pyv: ['3.9', '3.10', '3.11', '3.12', '3.13', '3.14'] 28 | session: ['tests'] 29 | 30 | include: 31 | - os: ubuntu-latest 32 | pyv: '3.9' 33 | session: 'tests-minversion' 34 | 35 | steps: 36 | - name: Check out the repository 37 | uses: actions/checkout@v4 38 | with: 39 | fetch-depth: 0 40 | 41 | - uses: hynek/setup-cached-uv@v2 42 | 43 | - name: Run tests 44 | run: uvx nox --sessions ${{ matrix.session }} --python ${{ matrix.pyv }} -- --cov-report=xml 45 | 46 | typesafety: 47 | runs-on: ubuntu-latest 48 | 
strategy: 49 | fail-fast: false 50 | matrix: 51 | pyv: ['3.9', '3.10', '3.11', '3.12', '3.13', '3.14'] 52 | 53 | steps: 54 | - name: Check out the repository 55 | uses: actions/checkout@v4 56 | with: 57 | fetch-depth: 0 58 | 59 | - uses: hynek/setup-cached-uv@v2 60 | 61 | - name: Run tests 62 | run: uvx nox --sessions type-safety --python ${{ matrix.pyv }} 63 | 64 | lint: 65 | runs-on: ubuntu-latest 66 | 67 | steps: 68 | - name: Check out the repository 69 | uses: actions/checkout@v4 70 | with: 71 | fetch-depth: 0 72 | 73 | - uses: hynek/setup-cached-uv@v2 74 | 75 | - name: Lint code and check dependencies 76 | run: uvx nox -s lint 77 | 78 | build: 79 | needs: [tests, lint] 80 | runs-on: ubuntu-latest 81 | steps: 82 | - name: Check out the repository 83 | uses: actions/checkout@v4 84 | with: 85 | fetch-depth: 0 86 | 87 | - uses: hynek/setup-cached-uv@v2 88 | 89 | - name: Build package 90 | run: uvx nox -s build 91 | -------------------------------------------------------------------------------- /upath/implementations/zip.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from typing import TYPE_CHECKING 5 | from zipfile import ZipInfo 6 | 7 | from upath.core import UPath 8 | from upath.types import JoinablePathLike 9 | 10 | if TYPE_CHECKING: 11 | from typing import Literal 12 | 13 | if sys.version_info >= (3, 11): 14 | from typing import Unpack 15 | else: 16 | from typing_extensions import Unpack 17 | 18 | from upath._chain import FSSpecChainParser 19 | from upath.types.storage_options import ZipStorageOptions 20 | 21 | 22 | __all__ = ["ZipPath"] 23 | 24 | 25 | class ZipPath(UPath): 26 | __slots__ = () 27 | 28 | if TYPE_CHECKING: 29 | 30 | def __init__( 31 | self, 32 | *args: JoinablePathLike, 33 | protocol: Literal["zip"] | None = ..., 34 | chain_parser: FSSpecChainParser = ..., 35 | **storage_options: Unpack[ZipStorageOptions], 36 | ) -> None: ... 
37 | 38 | if sys.version_info >= (3, 11): 39 | 40 | def mkdir( 41 | self, 42 | mode: int = 0o777, 43 | parents: bool = False, 44 | exist_ok: bool = False, 45 | ) -> None: 46 | is_dir = self.is_dir() 47 | if is_dir and not exist_ok: 48 | raise FileExistsError(f"File exists: {self.path!r}") 49 | elif not is_dir: 50 | zipfile = self.fs.zip 51 | zipfile.mkdir(self.path, mode) 52 | 53 | else: 54 | 55 | def mkdir( 56 | self, 57 | mode: int = 0o777, 58 | parents: bool = False, 59 | exist_ok: bool = False, 60 | ) -> None: 61 | is_dir = self.is_dir() 62 | if is_dir and not exist_ok: 63 | raise FileExistsError(f"File exists: {self.path!r}") 64 | elif not is_dir: 65 | dirname = self.path 66 | if dirname and not dirname.endswith("/"): 67 | dirname += "/" 68 | zipfile = self.fs.zip 69 | zinfo = ZipInfo(dirname) 70 | zinfo.compress_size = 0 71 | zinfo.CRC = 0 72 | zinfo.external_attr = ((0o40000 | mode) & 0xFFFF) << 16 73 | zinfo.file_size = 0 74 | zinfo.external_attr |= 0x10 75 | zipfile.writestr(zinfo, b"") 76 | -------------------------------------------------------------------------------- /docs/api/index.md: -------------------------------------------------------------------------------- 1 | 6 | 7 | # UPath ![upath](../assets/logo-128x128.svg){: #upath-logo } 8 | 9 | The `UPath` class is your default entry point for interacting with fsspec filesystems. 10 | When instantiating UPath, a specific `UPath` subclass will be returned, dependent on the 11 | detected or provided `protocol`. Here we document all methods and properties available on 12 | UPath instances. 13 | 14 | !!! info "Compatibility" 15 | All methods documented here work consistently across all supported Python versions, 16 | even if they were introduced in later Python versions. We consider it a bug if they 17 | don't :bug: so please report an issue if you run into inconsistencies. 
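
The protocol-based subclass selection described in the introduction can be pictured with a toy model. This is a minimal, stdlib-only sketch and not how `upath` is actually implemented: `ToyPath`, `ToyS3Path`, the `_registry` dict, and the `protocols=` class keyword are all hypothetical names invented for illustration.

```python
from urllib.parse import urlsplit


class ToyPath:
    """Stand-in for a generic path class (hypothetical, NOT upath's UPath)."""

    # Hypothetical registry mapping protocol names to subclasses.
    _registry = {}

    def __init_subclass__(cls, protocols=(), **kwargs):
        super().__init_subclass__(**kwargs)
        # Each subclass announces the protocols it handles.
        for name in protocols:
            ToyPath._registry[name] = cls

    def __new__(cls, url, protocol=None):
        # An explicitly provided protocol wins; otherwise detect it
        # from the URL scheme.
        detected = protocol or urlsplit(url).scheme
        if cls is ToyPath:
            # Only the base class dispatches; fall back to itself
            # when no subclass is registered for the protocol.
            target = ToyPath._registry.get(detected, ToyPath)
        else:
            target = cls
        return super().__new__(target)

    def __init__(self, url, protocol=None):
        self.url = url
        self.protocol = protocol or urlsplit(url).scheme


class ToyS3Path(ToyPath, protocols=("s3", "s3a")):
    """Hypothetical subclass selected for s3-style URLs."""


detected = ToyPath("s3://bucket/key.txt")        # scheme detected from the URL
provided = ToyPath("bucket/key", protocol="s3")  # protocol passed explicitly
fallback = ToyPath("/tmp/file.txt")              # no scheme, stays a ToyPath
```

The design point the sketch tries to capture is that callers only ever name the base class, while the concrete type follows from the URL or from the explicit `protocol` argument.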
18 | 19 | 20 | ```python 21 | from upath import UPath 22 | ``` 23 | 24 | ::: upath.core.UPath 25 | options: 26 | heading_level: 2 27 | merge_init_into_class: false 28 | inherited_members: true 29 | members: 30 | - __init__ 31 | - protocol 32 | - storage_options 33 | - fs 34 | - path 35 | - parts 36 | - name 37 | - stem 38 | - drive 39 | - root 40 | - anchor 41 | - suffix 42 | - suffixes 43 | - parent 44 | - parents 45 | - joinpath 46 | - joinuri 47 | - with_name 48 | - with_stem 49 | - with_suffix 50 | - with_segments 51 | - relative_to 52 | - is_relative_to 53 | - match 54 | - full_match 55 | - as_posix 56 | - as_uri 57 | - open 58 | - read_text 59 | - read_bytes 60 | - write_text 61 | - write_bytes 62 | - iterdir 63 | - glob 64 | - rglob 65 | - walk 66 | - mkdir 67 | - rmdir 68 | - touch 69 | - unlink 70 | - rename 71 | - replace 72 | - copy 73 | - move 74 | - copy_into 75 | - move_into 76 | - exists 77 | - is_file 78 | - is_dir 79 | - is_symlink 80 | - is_absolute 81 | - stat 82 | - info 83 | - absolute 84 | - resolve 85 | - expanduser 86 | - cwd 87 | - home 88 | 89 | --- 90 | 91 | ## See Also :link: 92 | 93 | - [Registry](registry.md) - The upath implementation registry 94 | - [Implementations](implementations.md) - UPath subclasses 95 | - [Extensions](extensions.md) - Extending UPath functionality 96 | - [Types](types.md) - Type hints and protocols 97 | -------------------------------------------------------------------------------- /docs/install.md: -------------------------------------------------------------------------------- 1 | 2 | # Installation :package: 3 | 4 | Getting started with `universal-pathlib` is easy! Choose your preferred package manager below and you'll be working with cloud storage in minutes. 
5 | 6 | ## Quick Install 7 | 8 | === "uv" 9 | 10 | ```bash 11 | uv add universal-pathlib 12 | ``` 13 | 14 | === "pip" 15 | 16 | ```bash 17 | python -m pip install universal-pathlib 18 | ``` 19 | 20 | === "conda" 21 | 22 | ```bash 23 | conda install -c conda-forge universal-pathlib 24 | ``` 25 | 26 | That's it! You now have `universal-pathlib` installed. :tada: 27 | 28 | ## Filesystem-Specific Dependencies 29 | 30 | While `universal-pathlib` comes with `fsspec` out of the box, **some filesystems require additional packages**. Don't worry—installing them is straightforward! 31 | 32 | For example, to work with **AWS S3**, you'll need to install `s3fs`: 33 | 34 | ```bash 35 | pip install s3fs 36 | # or better yet, use fsspec extras: 37 | pip install "fsspec[s3]" 38 | ``` 39 | 40 | Here are some common filesystem extras you might need: 41 | 42 | | Filesystem | Install Command | 43 | |------------|----------------| 44 | | **AWS S3** | `pip install "fsspec[s3]"` | 45 | | **Google Cloud Storage** | `pip install "fsspec[gcs]"` | 46 | | **Azure Blob Storage** | `pip install "fsspec[azure]"` | 47 | | **HTTP/HTTPS** | `pip install "fsspec[http]"` | 48 | | **GitHub** | `pip install "fsspec[github]"` | 49 | | **SSH/SFTP** | `pip install "fsspec[ssh]"` | 50 | 51 | ## Adding to Your Project 52 | 53 | When adding `universal-pathlib` to your project, specify the filesystem extras you need. Here's a `pyproject.toml` example for a project using **S3** and **HTTP**: 54 | 55 | ```toml 56 | [project] 57 | name = "myproject" 58 | requires-python = ">=3.9" 59 | dependencies = [ 60 | "universal_pathlib>=0.3.7", 61 | "fsspec[s3,http]", # Add the filesystems you need 62 | ] 63 | ``` 64 | 65 | !!! tip "Complete List of Filesystem Extras" 66 | 67 | For a complete overview of all available filesystem extras and their dependencies, check out the [fsspec pyproject.toml file][fsspec-pyproject-toml]. It includes extras for: 68 | 69 | - Cloud storage (S3, GCS, Azure, etc.) 
70 | - Remote protocols (HTTP, FTP, SSH, etc.) 71 | - Specialized systems (HDFS, WebDAV, SMB, etc.) 72 | 73 | [fsspec-pyproject-toml]: https://github.com/fsspec/filesystem_spec/blob/master/pyproject.toml#L26 74 | 75 | --- 76 | 77 |
78 | 79 | **Ready to get started?** Learn about [Universal Pathlib Concepts](concepts/index.md) :rocket: 80 | 81 |
82 | -------------------------------------------------------------------------------- /upath/tests/test_drive_root_anchor_parts.py: -------------------------------------------------------------------------------- 1 | from pathlib import Path 2 | 3 | import pytest 4 | 5 | from upath import UPath 6 | 7 | DRIVE_ROOT_ANCHOR_TESTS = [ 8 | # cloud 9 | ("s3://bucket", "bucket", "/", "bucket/", ("bucket/",)), 10 | ("s3://bucket/a", "bucket", "/", "bucket/", ("bucket/", "a")), 11 | ("gs://bucket", "bucket", "/", "bucket/", ("bucket/",)), 12 | ("gs://bucket/a", "bucket", "/", "bucket/", ("bucket/", "a")), 13 | ("az://bucket", "bucket", "/", "bucket/", ("bucket/",)), 14 | ("az://bucket/a", "bucket", "/", "bucket/", ("bucket/", "a")), 15 | # data 16 | ( 17 | "data:text/plain,A%20brief%20note", 18 | "", 19 | "", 20 | "", 21 | ("data:text/plain,A%20brief%20note",), 22 | ), 23 | # github 24 | ("github://user:token@repo/abc", "", "", "", ("abc",)), 25 | # hdfs 26 | ("hdfs://a/b/c", "", "/", "/", ("/", "b", "c")), 27 | ("hdfs:///a/b/c", "", "/", "/", ("/", "a", "b", "c")), 28 | # http 29 | ("http://a/", "http://a", "/", "http://a/", ("http://a/", "")), 30 | ("http://a/b/c", "http://a", "/", "http://a/", ("http://a/", "b", "c")), 31 | ("https://a/b/c", "https://a", "/", "https://a/", ("https://a/", "b", "c")), 32 | # memory 33 | ("memory://a/b/c", "", "/", "/", ("/", "a", "b", "c")), 34 | ("memory:///a/b/c", "", "/", "/", ("/", "a", "b", "c")), 35 | # sftp 36 | ("sftp://a/b/c", "", "/", "/", ("/", "b", "c")), 37 | ("sftp:///a/b/c", "", "/", "/", ("/", "a", "b", "c")), 38 | # smb 39 | ("smb://a/b/c", "", "/", "/", ("/", "b", "c")), 40 | ("smb:///a/b/c", "", "/", "/", ("/", "a", "b", "c")), 41 | # webdav 42 | ("webdav+http://host.com/a/b/c", "", "", "", ("a", "b", "c")), 43 | ("webdav+http://host.com/a/b/c", "", "", "", ("a", "b", "c")), 44 | # local 45 | ( 46 | "file:///a/b/c", 47 | Path("/a/b/c").absolute().drive, 48 | Path("/").absolute().root.replace("\\", "/"), 49 | 
Path("/").absolute().anchor.replace("\\", "/"), 50 | tuple(x.replace("\\", "/") for x in Path("/a/b/c").absolute().parts), 51 | ), 52 | ] 53 | 54 | 55 | @pytest.mark.parametrize( 56 | "path,drive,root,anchor", 57 | [x[0:4] for x in DRIVE_ROOT_ANCHOR_TESTS], 58 | ) 59 | def test_drive_root_anchor(path, drive, root, anchor): 60 | p = UPath(path) 61 | assert (p.drive, p.root, p.anchor) == (drive, root, anchor) 62 | 63 | 64 | @pytest.mark.parametrize( 65 | "path,parts", 66 | [(x[0], x[4]) for x in DRIVE_ROOT_ANCHOR_TESTS], 67 | ) 68 | def test_parts(path, parts): 69 | p = UPath(path) 70 | assert p.parts == parts 71 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | docs/changelog.md 74 | 75 | # PyBuilder 76 | .pybuilder/ 77 | target/ 78 | 79 | # Jupyter Notebook 80 | .ipynb_checkpoints 81 | 82 | # IPython 83 | profile_default/ 84 | ipython_config.py 85 | 86 | # pyenv 87 | # For a library or package, you might want to ignore these files since the code is 88 | # intended to run in multiple environments; otherwise, check them in: 89 | # .python-version 90 | 91 | # pipenv 92 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 93 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 94 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 95 | # install all needed dependencies. 96 | #Pipfile.lock 97 | 98 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 99 | __pypackages__/ 100 | 101 | # Celery stuff 102 | celerybeat-schedule 103 | celerybeat.pid 104 | 105 | # SageMath parsed files 106 | *.sage.py 107 | 108 | # Environments 109 | .env 110 | .venv 111 | env/ 112 | venv/ 113 | venv*/ 114 | ENV/ 115 | env.bak/ 116 | venv.bak/ 117 | 118 | # Spyder project settings 119 | .spyderproject 120 | .spyproject 121 | 122 | # Rope project settings 123 | .ropeproject 124 | 125 | # mkdocs documentation 126 | /site 127 | 128 | # mypy 129 | .mypy_cache/ 130 | .dmypy.json 131 | dmypy.json 132 | 133 | # Pyre type checker 134 | .pyre/ 135 | 136 | # pytype static type analyzer 137 | .pytype/ 138 | 139 | # Cython debug symbols 140 | cython_debug/ 141 | 142 | # setuptools_scm 143 | upath/_version.py 144 | 145 | # vscode workspace settings 146 | .vscode/ 147 | 148 | # mac 149 | **/.DS_Store 150 | -------------------------------------------------------------------------------- /upath/implementations/smb.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | import warnings 5 | from typing import TYPE_CHECKING 6 | from typing import Any 7 | 8 | from upath.core import UPath 9 | from upath.types import UNSET_DEFAULT 10 | from upath.types import JoinablePathLike 11 | from upath.types import WritablePathLike 12 | 13 | if TYPE_CHECKING: 14 | from typing import Literal 15 | 16 | if sys.version_info >= (3, 11): 17 | from typing import Self 18 | from typing import Unpack 19 | else: 20 | from typing_extensions import Self 21 | from typing_extensions import Unpack 22 | 23 | from upath._chain import FSSpecChainParser 24 | from upath.types.storage_options import SMBStorageOptions 25 | 26 | 27 | class SMBPath(UPath): 28 | __slots__ = () 29 | 30 | if TYPE_CHECKING: 31 | 32 | def __init__( 33 | self, 34 | *args: JoinablePathLike, 35 | protocol: Literal["smb"] | None = ..., 36 | chain_parser: FSSpecChainParser = ..., 37 | 
**storage_options: Unpack[SMBStorageOptions], 38 | ) -> None: ... 39 | 40 | @property 41 | def path(self) -> str: 42 | path = super().path 43 | if len(path) > 1: 44 | return path.removesuffix("/") 45 | return path 46 | 47 | def __str__(self) -> str: 48 | path_str = super().__str__() 49 | if path_str.startswith("smb:///"): 50 | return path_str.removesuffix("/") 51 | return path_str 52 | 53 | def mkdir( 54 | self, 55 | mode: int = 0o777, 56 | parents: bool = False, 57 | exist_ok: bool = False, 58 | ) -> None: 59 | # smbclient does not support setting mode externally 60 | from smbprotocol.exceptions import SMBOSError 61 | 62 | if parents and not exist_ok and self.exists(): 63 | raise FileExistsError(str(self)) 64 | try: 65 | self.fs.mkdir( 66 | self.path, 67 | create_parents=parents, 68 | ) 69 | except SMBOSError: 70 | if not exist_ok: 71 | raise FileExistsError(str(self)) 72 | if not self.is_dir(): 73 | raise FileExistsError(str(self)) 74 | 75 | def rename( 76 | self, 77 | target: WritablePathLike, 78 | *, 79 | recursive: bool = UNSET_DEFAULT, 80 | maxdepth: int | None = UNSET_DEFAULT, 81 | **kwargs: Any, 82 | ) -> Self: 83 | if recursive is not UNSET_DEFAULT: 84 | warnings.warn( 85 | "SMBPath.rename(): recursive is currently ignored.", 86 | UserWarning, 87 | stacklevel=2, 88 | ) 89 | if maxdepth is not UNSET_DEFAULT: 90 | warnings.warn( 91 | "SMBPath.rename(): maxdepth is currently ignored.", 92 | UserWarning, 93 | stacklevel=2, 94 | ) 95 | return super().rename(target, **kwargs) 96 | -------------------------------------------------------------------------------- /CONTRIBUTING.rst: -------------------------------------------------------------------------------- 1 | Contributor Guide 2 | ================= 3 | 4 | Thank you for your interest in improving this project. 5 | This project is open-source under the `MIT license`_ and 6 | welcomes contributions in the form of bug reports, feature requests, and pull requests. 
7 | 8 | Here is a list of important resources for contributors: 9 | 10 | - `Source Code`_ 11 | - `Issue Tracker`_ 12 | - `Code of Conduct`_ 13 | 14 | .. _MIT license: https://opensource.org/licenses/MIT 15 | .. _Source Code: https://github.com/fsspec/universal_pathlib 16 | .. _Issue Tracker: https://github.com/fsspec/universal_pathlib/issues 17 | 18 | How to report a bug 19 | ------------------- 20 | 21 | Report bugs on the `Issue Tracker`_. 22 | 23 | When filing an issue, make sure to answer these questions: 24 | 25 | - Which operating system and Python version are you using? 26 | - Which version of this project are you using? 27 | - What did you do? 28 | - What did you expect to see? 29 | - What did you see instead? 30 | 31 | The best way to get your bug fixed is to provide a test case, 32 | and/or steps to reproduce the issue. 33 | 34 | 35 | How to request a feature 36 | ------------------------ 37 | 38 | Request features on the `Issue Tracker`_. 39 | 40 | 41 | How to set up your development environment 42 | ------------------------------------------ 43 | 44 | You need Python 3.8+ and the following tools: 45 | 46 | - Nox_ 47 | 48 | Install the package with development requirements: 49 | 50 | .. code:: console 51 | 52 | $ pip install nox 53 | 54 | .. _Nox: https://nox.thea.codes/ 55 | 56 | 57 | How to test the project 58 | ----------------------- 59 | 60 | Run the full test suite: 61 | 62 | .. code:: console 63 | 64 | $ nox 65 | 66 | List the available Nox sessions: 67 | 68 | .. code:: console 69 | 70 | $ nox --list-sessions 71 | 72 | You can also run a specific Nox session. 73 | For example, invoke the unit test suite like this: 74 | 75 | .. code:: console 76 | 77 | $ nox --session=tests 78 | 79 | Unit tests are located in the ``tests`` directory, 80 | and are written using the pytest_ testing framework. 81 | 82 | .. 
_pytest: https://pytest.readthedocs.io/ 83 | 84 | 85 | How to submit changes 86 | --------------------- 87 | 88 | Open a `pull request`_ to submit changes to this project. 89 | 90 | Your pull request needs to meet the following guidelines for acceptance: 91 | 92 | - The Nox test suite must pass without errors and warnings. 93 | - Include unit tests. This project maintains 100% code coverage. 94 | - If your changes add functionality, update the documentation accordingly. 95 | 96 | Feel free to submit early, though—we can always iterate on this. 97 | 98 | To run linting and code formatting checks, you can invoke a `lint` session in nox: 99 | 100 | .. code:: console 101 | 102 | $ nox -s lint 103 | 104 | It is recommended to open an issue before starting work on anything. 105 | This will allow a chance to talk it over with the owners and validate your approach. 106 | 107 | .. _pull request: https://github.com/fsspec/universal_pathlib/pulls 108 | .. github-only 109 | .. _Code of Conduct: CODE_OF_CONDUCT.rst 110 | -------------------------------------------------------------------------------- /upath/implementations/webdav.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from collections.abc import Mapping 5 | from collections.abc import Sequence 6 | from typing import TYPE_CHECKING 7 | from typing import Any 8 | from urllib.parse import urlsplit 9 | 10 | from fsspec.registry import known_implementations 11 | from fsspec.registry import register_implementation 12 | 13 | from upath.core import UPath 14 | from upath.types import JoinablePathLike 15 | 16 | if TYPE_CHECKING: 17 | from typing import Literal 18 | 19 | if sys.version_info >= (3, 11): 20 | from typing import Unpack 21 | else: 22 | from typing_extensions import Unpack 23 | 24 | from upath._chain import FSSpecChainParser 25 | from upath.types.storage_options import WebdavStorageOptions 26 | 27 | __all__ = 
["WebdavPath"] 28 | 29 | # webdav was only registered in fsspec>=2022.5.0 30 | if "webdav" not in known_implementations: 31 | import webdav4.fsspec 32 | 33 | register_implementation("webdav", webdav4.fsspec.WebdavFileSystem) 34 | 35 | 36 | class WebdavPath(UPath): 37 | __slots__ = () 38 | 39 | if TYPE_CHECKING: 40 | 41 | def __init__( 42 | self, 43 | *args: JoinablePathLike, 44 | protocol: Literal["webdav+http", "webdav+https"] | None = ..., 45 | chain_parser: FSSpecChainParser = ..., 46 | **storage_options: Unpack[WebdavStorageOptions], 47 | ) -> None: ... 48 | 49 | @classmethod 50 | def _transform_init_args( 51 | cls, 52 | args: tuple[JoinablePathLike, ...], 53 | protocol: str, 54 | storage_options: dict[str, Any], 55 | ) -> tuple[tuple[JoinablePathLike, ...], str, dict[str, Any]]: 56 | if not args: 57 | args = ("/",) 58 | elif args and protocol in {"webdav+http", "webdav+https"}: 59 | args0, *argsN = args 60 | url = urlsplit(str(args0)) 61 | base = url._replace(scheme=protocol.split("+")[1], path="").geturl() 62 | args0 = url._replace(scheme="", netloc="").geturl() or "/" 63 | storage_options["base_url"] = base 64 | args = (args0, *argsN) 65 | if "base_url" not in storage_options: 66 | raise ValueError( 67 | f"must provide `base_url` storage option for args: {args!r}" 68 | ) 69 | return super()._transform_init_args(args, "webdav", storage_options) 70 | 71 | @classmethod 72 | def _parse_storage_options( 73 | cls, 74 | urlpath: str, 75 | protocol: str, 76 | storage_options: Mapping[str, Any], 77 | ) -> dict[str, Any]: 78 | so = dict(storage_options) 79 | if urlpath.startswith(("webdav+http:", "webdav+https:")): 80 | url = urlsplit(str(urlpath)) 81 | base = url._replace(scheme=url.scheme.split("+")[1], path="").geturl() 82 | urlpath = url._replace(scheme="", netloc="").geturl() or "/" 83 | so.setdefault("base_url", base) 84 | return super()._parse_storage_options(urlpath, "webdav", so) 85 | 86 | @property 87 | def parts(self) -> Sequence[str]: 88 | parts = 
super().parts 89 | if parts and parts[0] == "/": 90 | return parts[1:] 91 | else: 92 | return parts 93 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_hf.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | from fsspec import get_filesystem_class 3 | 4 | from upath import UPath 5 | from upath.implementations.cloud import HfPath 6 | 7 | from ..cases import BaseTests 8 | 9 | try: 10 | get_filesystem_class("hf") 11 | except ImportError: 12 | pytestmark = pytest.mark.skip 13 | 14 | 15 | def test_hfpath(): 16 | path = UPath("hf://HuggingFaceTB/SmolLM2-135M") 17 | assert isinstance(path, HfPath) 18 | try: 19 | assert path.exists() 20 | except AssertionError: 21 | from httpx import ConnectError 22 | from huggingface_hub import HfApi 23 | 24 | try: 25 | HfApi().repo_info("HuggingFaceTB/SmolLM2-135M") 26 | except ConnectError: 27 | pytest.xfail("No internet connection") 28 | except Exception as err: 29 | if "Service Unavailable" in str(err): 30 | pytest.xfail("HuggingFace API not reachable") 31 | raise 32 | 33 | 34 | class TestUPathHf(BaseTests): 35 | @pytest.fixture(autouse=True, scope="function") 36 | def path(self, hf_fixture_with_readonly_mocked_hf_api): 37 | self.path = UPath(hf_fixture_with_readonly_mocked_hf_api) 38 | 39 | @pytest.mark.skip 40 | def test_mkdir(self): 41 | pass 42 | 43 | @pytest.mark.skip 44 | def test_mkdir_exists_ok_false(self): 45 | pass 46 | 47 | @pytest.mark.skip 48 | def test_mkdir_exists_ok_true(self): 49 | pass 50 | 51 | @pytest.mark.skip 52 | def test_mkdir_parents_true_exists_ok_true(self): 53 | pass 54 | 55 | @pytest.mark.skip 56 | def test_mkdir_parents_true_exists_ok_false(self): 57 | pass 58 | 59 | @pytest.mark.skip 60 | def test_makedirs_exist_ok_true(self): 61 | pass 62 | 63 | @pytest.mark.skip 64 | def test_makedirs_exist_ok_false(self): 65 | pass 66 | 67 | @pytest.mark.skip 68 | def test_touch(self): 69 | pass 70 
| 71 | @pytest.mark.skip 72 | def test_touch_unlink(self): 73 | pass 74 | 75 | @pytest.mark.skip 76 | def test_write_bytes(self, pathlib_base): 77 | pass 78 | 79 | @pytest.mark.skip 80 | def test_write_text(self, pathlib_base): 81 | pass 82 | 83 | def test_fsspec_compat(self): 84 | pass 85 | 86 | def test_rename(self): 87 | pass 88 | 89 | def test_rename2(self): 90 | pass 91 | 92 | def test_move_local(self, tmp_path): 93 | pass 94 | 95 | def test_move_into_local(self, tmp_path): 96 | pass 97 | 98 | def test_move_memory(self, clear_fsspec_memory_cache): 99 | pass 100 | 101 | def test_move_into_memory(self, clear_fsspec_memory_cache): 102 | pass 103 | 104 | @pytest.mark.skip(reason="HfPath does not support listing repositories") 105 | def test_iterdir(self, local_testdir): 106 | pass 107 | 108 | @pytest.mark.skip(reason="HfPath does not support listing repositories") 109 | def test_iterdir2(self, local_testdir): 110 | pass 111 | 112 | @pytest.mark.skip(reason="HfPath does not currently test write") 113 | def test_rename_with_target_absolute(self, target_factory): 114 | return super().test_rename_with_target_absolute(target_factory) 115 | -------------------------------------------------------------------------------- /upath/tests/test_pydantic.py: -------------------------------------------------------------------------------- 1 | import json 2 | from os.path import abspath 3 | 4 | import pydantic 5 | import pydantic_core 6 | import pytest 7 | from fsspec.implementations.http import get_client 8 | 9 | from upath import UPath 10 | 11 | 12 | @pytest.mark.parametrize( 13 | "path", 14 | [ 15 | "/abc", 16 | "file:///abc", 17 | "memory://abc", 18 | "s3://bucket/key", 19 | "https://www.example.com", 20 | ], 21 | ) 22 | @pytest.mark.parametrize("source", ["json", "python"]) 23 | def test_validate_from_str(path, source): 24 | expected = UPath(path) 25 | 26 | ta = pydantic.TypeAdapter(UPath) 27 | if source == "json": 28 | actual = ta.validate_json(json.dumps(path)) 29 | else: 
# source == "python" 30 | actual = ta.validate_python(path) 31 | 32 | assert abspath(actual.path) == abspath(expected.path) 33 | assert actual.protocol == expected.protocol 34 | 35 | 36 | @pytest.mark.parametrize( 37 | "dct", 38 | [ 39 | { 40 | "path": "/my/path", 41 | "protocol": "file", 42 | "storage_options": {"foo": "bar", "baz": 3}, 43 | } 44 | ], 45 | ) 46 | @pytest.mark.parametrize("source", ["json", "python"]) 47 | def test_validate_from_dict(dct, source): 48 | ta = pydantic.TypeAdapter(UPath) 49 | if source == "json": 50 | output = ta.validate_json(json.dumps(dct)) 51 | else: # source == "python" 52 | output = ta.validate_python(dct) 53 | 54 | assert abspath(output.path) == abspath(dct["path"]) 55 | assert output.protocol == dct["protocol"] 56 | assert output.storage_options == dct["storage_options"] 57 | 58 | 59 | @pytest.mark.parametrize( 60 | "path", 61 | [ 62 | "/abc", 63 | "file:///abc", 64 | "memory://abc", 65 | "s3://bucket/key", 66 | "https://www.example.com", 67 | ], 68 | ) 69 | def test_validate_from_instance(path): 70 | input = UPath(path) 71 | 72 | output = pydantic.TypeAdapter(UPath).validate_python(input) 73 | 74 | assert output is input 75 | 76 | 77 | @pytest.mark.parametrize( 78 | ("args", "kwargs"), 79 | [ 80 | ( 81 | ("/my/path",), 82 | { 83 | "protocol": "file", 84 | "foo": "bar", 85 | "baz": 3, 86 | }, 87 | ) 88 | ], 89 | ) 90 | @pytest.mark.parametrize("mode", ["json", "python"]) 91 | def test_dump(args, kwargs, mode): 92 | u = UPath(*args, **kwargs) 93 | 94 | output = pydantic.TypeAdapter(UPath).dump_python(u, mode=mode) 95 | 96 | assert output["path"] == u.path 97 | assert output["protocol"] == u.protocol 98 | assert output["storage_options"] == u.storage_options 99 | 100 | 101 | def test_dump_non_serializable_python(): 102 | output = pydantic.TypeAdapter(UPath).dump_python( 103 | UPath("https://www.example.com", get_client=get_client), mode="python" 104 | ) 105 | 106 | assert output["storage_options"]["get_client"] is get_client 107 
| 108 | 109 | def test_dump_non_serializable_json(): 110 | with pytest.raises(pydantic_core.PydanticSerializationError, match="unknown type"): 111 | pydantic.TypeAdapter(UPath).dump_python( 112 | UPath("https://www.example.com", get_client=get_client), mode="json" 113 | ) 114 | 115 | 116 | def test_json_schema(): 117 | ta = pydantic.TypeAdapter(UPath) 118 | ta.json_schema() 119 | -------------------------------------------------------------------------------- /upath/tests/test_stat.py: -------------------------------------------------------------------------------- 1 | import os 2 | from datetime import datetime 3 | from datetime import timezone 4 | 5 | import pytest 6 | 7 | import upath 8 | 9 | 10 | @pytest.fixture 11 | def pth_file(tmp_path): 12 | f = tmp_path.joinpath("abc.txt") 13 | f.write_bytes(b"a") 14 | p = upath.UPath(f"file://{f.absolute().as_posix()}") 15 | yield p 16 | 17 | 18 | def test_stat_repr(pth_file): 19 | assert repr(pth_file.stat()).startswith("UPathStatResult") 20 | 21 | 22 | def test_stat_as_info(pth_file): 23 | dct = pth_file.stat().as_info() 24 | assert dct["size"] == pth_file.stat().st_size 25 | 26 | 27 | def test_stat_atime(pth_file): 28 | atime = pth_file.stat().st_atime 29 | assert isinstance(atime, (float, int)) 30 | 31 | 32 | @pytest.mark.xfail(reason="fsspec does not return 'atime'") 33 | def test_stat_atime_value(pth_file): 34 | atime = pth_file.stat().st_atime 35 | assert atime > 0 36 | 37 | 38 | def test_stat_mtime(pth_file): 39 | mtime = pth_file.stat().st_mtime 40 | assert isinstance(mtime, (float, int)) 41 | 42 | 43 | def test_stat_mtime_value(pth_file): 44 | mtime = pth_file.stat().st_mtime 45 | assert mtime > 0 46 | 47 | 48 | def test_stat_ctime(pth_file): 49 | ctime = pth_file.stat().st_ctime 50 | assert isinstance(ctime, (float, int)) 51 | 52 | 53 | @pytest.mark.xfail(reason="fsspec returns 'created' but not 'ctime'") 54 | def test_stat_ctime_value(pth_file): 55 | ctime = pth_file.stat().st_ctime 56 | assert ctime > 0 57 | 
58 | 59 | def test_stat_birthtime(pth_file): 60 | birthtime = pth_file.stat().st_birthtime 61 | assert isinstance(birthtime, (float, int)) 62 | 63 | 64 | def test_stat_birthtime_value(pth_file): 65 | birthtime = pth_file.stat().st_birthtime 66 | assert birthtime > 0 67 | 68 | 69 | def test_stat_seq_interface(pth_file): 70 | assert len(tuple(pth_file.stat())) == os.stat_result.n_sequence_fields 71 | assert isinstance(pth_file.stat().index(0), int) 72 | assert isinstance(pth_file.stat().count(0), int) 73 | assert isinstance(pth_file.stat()[0], int) 74 | 75 | 76 | def test_stat_warn_if_dict_interface(pth_file): 77 | with pytest.warns(DeprecationWarning): 78 | pth_file.stat().keys() 79 | 80 | with pytest.warns(DeprecationWarning): 81 | pth_file.stat().items() 82 | 83 | with pytest.warns(DeprecationWarning): 84 | pth_file.stat().values() 85 | 86 | with pytest.warns(DeprecationWarning): 87 | pth_file.stat().get("size") 88 | 89 | with pytest.warns(DeprecationWarning): 90 | pth_file.stat().copy() 91 | 92 | with pytest.warns(DeprecationWarning): 93 | _ = pth_file.stat()["size"] 94 | 95 | 96 | @pytest.mark.parametrize( 97 | "timestamp", 98 | [ 99 | 10, 100 | datetime(1970, 1, 1, 0, 0, 10, tzinfo=timezone.utc), 101 | "1970-01-01T00:00:10Z", 102 | "1970-01-01T00:00:10+00:00", 103 | ], 104 | ) 105 | def test_timestamps(timestamp): 106 | from upath._stat import UPathStatResult 107 | 108 | s = UPathStatResult( 109 | [0] * 10, 110 | { 111 | "ctime": timestamp, 112 | "atime": timestamp, 113 | "mtime": timestamp, 114 | "created": timestamp, 115 | }, 116 | ) 117 | assert s.st_atime == 10.0 118 | assert s.st_ctime == 10.0 119 | assert s.st_mtime == 10.0 120 | 121 | 122 | def test_bad_timestamp(): 123 | from upath._stat import UPathStatResult 124 | 125 | with ( 126 | pytest.raises(TypeError), 127 | pytest.warns(RuntimeWarning, "universal_pathlib/issues"), 128 | ): 129 | s = UPathStatResult([0] * 10, {"ctime": "bad"}) 130 | _ = s.st_ctime 131 | 
-------------------------------------------------------------------------------- /upath/tests/implementations/test_tar.py: -------------------------------------------------------------------------------- 1 | import tarfile 2 | 3 | import pytest 4 | 5 | from upath import UPath 6 | from upath.implementations.tar import TarPath 7 | 8 | from ..cases import BaseTests 9 | 10 | 11 | @pytest.fixture(scope="function") 12 | def tarred_testdir_file(local_testdir, tmp_path_factory): 13 | base = tmp_path_factory.mktemp("tarpath") 14 | tar_path = base / "test.tar" 15 | with tarfile.TarFile(tar_path, "w") as tf: 16 | tf.add(local_testdir, arcname="", recursive=True) 17 | return str(tar_path) 18 | 19 | 20 | class TestTarPath(BaseTests): 21 | 22 | @pytest.fixture(autouse=True) 23 | def path(self, tarred_testdir_file): 24 | self.path = UPath("tar://", fo=tarred_testdir_file) 25 | # self.prepare_file_system() done outside of UPath 26 | 27 | def test_is_TarPath(self): 28 | assert isinstance(self.path, TarPath) 29 | 30 | @pytest.mark.skip(reason="Tar filesystem is read-only") 31 | def test_mkdir(self): 32 | pass 33 | 34 | @pytest.mark.skip(reason="Tar filesystem is read-only") 35 | def test_mkdir_exists_ok_false(self): 36 | pass 37 | 38 | @pytest.mark.skip(reason="Tar filesystem is read-only") 39 | def test_mkdir_parents_true_exists_ok_false(self): 40 | pass 41 | 42 | @pytest.mark.skip(reason="Tar filesystem is read-only") 43 | def test_rename(self): 44 | pass 45 | 46 | @pytest.mark.skip(reason="Tar filesystem is read-only") 47 | def test_rename2(self): 48 | pass 49 | 50 | @pytest.mark.skip(reason="Tar filesystem is read-only") 51 | def test_touch(self): 52 | pass 53 | 54 | @pytest.mark.skip(reason="Tar filesystem is read-only") 55 | def test_touch_unlink(self): 56 | pass 57 | 58 | @pytest.mark.skip(reason="Tar filesystem is read-only") 59 | def test_write_bytes(self): 60 | pass 61 | 62 | @pytest.mark.skip(reason="Tar filesystem is read-only") 63 | def test_write_text(self): 64 | pass 
65 | 66 | @pytest.mark.skip(reason="Tar filesystem is read-only") 67 | def test_fsspec_compat(self): 68 | pass 69 | 70 | @pytest.mark.skip(reason="Only testing read on TarPath") 71 | def test_move_local(self, tmp_path): 72 | pass 73 | 74 | @pytest.mark.skip(reason="Only testing read on TarPath") 75 | def test_move_into_local(self, tmp_path): 76 | pass 77 | 78 | @pytest.mark.skip(reason="Only testing read on TarPath") 79 | def test_move_memory(self, clear_fsspec_memory_cache): 80 | pass 81 | 82 | @pytest.mark.skip(reason="Only testing read on TarPath") 83 | def test_move_into_memory(self, clear_fsspec_memory_cache): 84 | pass 85 | 86 | @pytest.mark.skip(reason="Only testing read on TarPath") 87 | def test_rename_with_target_absolute(self, target_factory): 88 | return super().test_rename_with_target_absolute(target_factory) 89 | 90 | @pytest.mark.skip(reason="Only testing read on TarPath") 91 | def test_write_text_encoding(self): 92 | return super().test_write_text_encoding() 93 | 94 | @pytest.mark.skip(reason="Only testing read on TarPath") 95 | def test_write_text_errors(self): 96 | return super().test_write_text_errors() 97 | 98 | 99 | @pytest.fixture(scope="function") 100 | def tarred_testdir_file_in_memory(tarred_testdir_file, clear_fsspec_memory_cache): 101 | p = UPath(tarred_testdir_file, protocol="file") 102 | t = p.move(UPath("memory:///mytarfile.tar")) 103 | assert t.protocol == "memory" 104 | assert t.exists() 105 | yield t.as_uri() 106 | 107 | 108 | class TestChainedTarPath(TestTarPath): 109 | 110 | @pytest.fixture(autouse=True) 111 | def path(self, tarred_testdir_file_in_memory): 112 | self.path = UPath("tar://::memory:///mytarfile.tar") 113 | -------------------------------------------------------------------------------- /upath/tests/test_chain.py: -------------------------------------------------------------------------------- 1 | import os 2 | from pathlib import Path 3 | 4 | import pytest 5 | from fsspec.implementations.memory import 
MemoryFileSystem 6 | 7 | from upath import UPath 8 | from upath._chain import FSSpecChainParser 9 | 10 | 11 | @pytest.mark.parametrize( 12 | "urlpath,expected", 13 | [ 14 | ("simplecache::file://tmp", "simplecache"), 15 | ("zip://file.txt::file://tmp.zip", "zip"), 16 | ], 17 | ) 18 | def test_chaining_upath_protocol(urlpath, expected): 19 | pth = UPath(urlpath) 20 | assert pth.protocol == expected 21 | 22 | 23 | def add_current_drive_on_windows(pth: str) -> str: 24 | drive = os.path.splitdrive(Path.cwd().as_posix())[0] 25 | return f"{drive}{pth}" 26 | 27 | 28 | @pytest.mark.parametrize( 29 | "urlpath,expected", 30 | [ 31 | pytest.param( 32 | "simplecache::file:///tmp", 33 | add_current_drive_on_windows("/tmp"), 34 | ), 35 | pytest.param( 36 | "zip://file.txt::file:///tmp.zip", 37 | "file.txt", 38 | ), 39 | pytest.param( 40 | "zip://a/b/c.txt::simplecache::memory://zipfile.zip", 41 | "a/b/c.txt", 42 | ), 43 | ], 44 | ) 45 | def test_chaining_upath_path(urlpath, expected): 46 | pth = UPath(urlpath) 47 | assert pth.path == expected 48 | 49 | 50 | @pytest.mark.parametrize( 51 | "urlpath,expected", 52 | [ 53 | ( 54 | "simplecache::file:///tmp", 55 | { 56 | "target_protocol": "file", 57 | "target_options": {}, 58 | }, 59 | ), 60 | ], 61 | ) 62 | def test_chaining_upath_storage_options(urlpath, expected): 63 | pth = UPath(urlpath) 64 | assert dict(pth.storage_options) == expected 65 | 66 | 67 | @pytest.mark.parametrize( 68 | "urlpath,expected", 69 | [ 70 | ("simplecache::memory://tmp", ("/", "tmp")), 71 | ], 72 | ) 73 | def test_chaining_upath_parts(urlpath, expected): 74 | pth = UPath(urlpath) 75 | assert pth.parts == expected 76 | 77 | 78 | @pytest.mark.parametrize( 79 | "urlpath,expected", 80 | [ 81 | ("simplecache::memory:///tmp", "simplecache::memory:///tmp"), 82 | ], 83 | ) 84 | def test_chaining_upath_str(urlpath, expected): 85 | pth = UPath(urlpath) 86 | assert str(pth) == expected 87 | 88 | 89 | @pytest.fixture 90 | def clear_memory_fs(): 91 | fs = 
MemoryFileSystem() 92 | store = fs.store 93 | pseudo_dirs = fs.pseudo_dirs 94 | try: 95 | yield fs 96 | finally: 97 | fs.store.clear() 98 | fs.store.update(store) 99 | fs.pseudo_dirs[:] = pseudo_dirs 100 | 101 | 102 | @pytest.fixture 103 | def memory_file_urlpath(clear_memory_fs): 104 | fs = clear_memory_fs 105 | fs.pipe_file("/abc/file.txt", b"hello world") 106 | yield fs.unstrip_protocol("/abc/file.txt") 107 | 108 | 109 | def test_read_file(memory_file_urlpath): 110 | pth = UPath(f"simplecache::{memory_file_urlpath}") 111 | assert pth.read_bytes() == b"hello world" 112 | 113 | 114 | def test_write_file(clear_memory_fs): 115 | pth = UPath("simplecache::memory://abc.txt") 116 | pth.write_bytes(b"hello world") 117 | assert clear_memory_fs.cat_file("abc.txt") == b"hello world" 118 | 119 | 120 | @pytest.mark.parametrize( 121 | "urlpath", 122 | [ 123 | "memory:///file.txt", 124 | "simplecache::memory:///tmp", 125 | "zip://file.txt::memory:///tmp.zip", 126 | "zip://a/b/c.txt::simplecache::memory:///zipfile.zip", 127 | "simplecache::zip://a/b/c.txt::tar://blah.zip::memory:///file.tar", 128 | ], 129 | ) 130 | def test_chain_parser_roundtrip(urlpath: str): 131 | parser = FSSpecChainParser() 132 | segments = parser.unchain(urlpath, protocol=None, storage_options={}) 133 | rechained, kw = parser.chain(segments) 134 | assert rechained == urlpath 135 | assert kw == {} 136 | -------------------------------------------------------------------------------- /upath/types/__init__.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import enum 4 | import sys 5 | from os import PathLike 6 | from typing import TYPE_CHECKING 7 | from typing import Any 8 | from typing import Protocol 9 | from typing import Union 10 | from typing import runtime_checkable 11 | 12 | from upath.types._abc import JoinablePath 13 | from upath.types._abc import PathInfo 14 | from upath.types._abc import PathParser 15 | from 
upath.types._abc import ReadablePath 16 | from upath.types._abc import WritablePath 17 | 18 | if TYPE_CHECKING: 19 | 20 | if sys.version_info >= (3, 12): 21 | from typing import TypeAlias 22 | else: 23 | from typing_extensions import TypeAlias 24 | 25 | __all__ = [ 26 | "JoinablePath", 27 | "ReadablePath", 28 | "WritablePath", 29 | "JoinablePathLike", 30 | "ReadablePathLike", 31 | "WritablePathLike", 32 | "SupportsPathLike", 33 | "PathInfo", 34 | "StatResultType", 35 | "PathParser", 36 | "UPathParser", 37 | "UNSET_DEFAULT", 38 | ] 39 | 40 | 41 | class VFSPathLike(Protocol): 42 | def __vfspath__(self) -> str: ... 43 | 44 | 45 | SupportsPathLike: TypeAlias = Union[VFSPathLike, PathLike[str]] 46 | JoinablePathLike: TypeAlias = Union[JoinablePath, SupportsPathLike, str] 47 | ReadablePathLike: TypeAlias = Union[ReadablePath, SupportsPathLike, str] 48 | WritablePathLike: TypeAlias = Union[WritablePath, SupportsPathLike, str] 49 | 50 | 51 | class _DefaultValue(enum.Enum): 52 | UNSET = enum.auto() 53 | 54 | 55 | UNSET_DEFAULT: Any = _DefaultValue.UNSET 56 | 57 | # We can't assume this, because pathlib_abc==0.5.1 is ahead of stdlib 3.14 58 | # if sys.version_info >= (3, 14): 59 | # JoinablePath.register(pathlib.PurePath) 60 | # ReadablePath.register(pathlib.Path) 61 | # WritablePath.register(pathlib.Path) 62 | 63 | 64 | @runtime_checkable 65 | class StatResultType(Protocol): 66 | """duck-type for os.stat_result""" 67 | 68 | @property 69 | def st_mode(self) -> int: ... 70 | @property 71 | def st_ino(self) -> int: ... 72 | @property 73 | def st_dev(self) -> int: ... 74 | @property 75 | def st_nlink(self) -> int: ... 76 | @property 77 | def st_uid(self) -> int: ... 78 | @property 79 | def st_gid(self) -> int: ... 80 | @property 81 | def st_size(self) -> int: ... 82 | @property 83 | def st_atime(self) -> float: ... 84 | @property 85 | def st_mtime(self) -> float: ... 86 | @property 87 | def st_ctime(self) -> float: ... 88 | @property 89 | def st_atime_ns(self) -> int: ... 
90 | @property 91 | def st_mtime_ns(self) -> int: ... 92 | @property 93 | def st_ctime_ns(self) -> int: ... 94 | 95 | # st_birthtime is available on Windows (3.12+), FreeBSD, and macOS 96 | # On Linux it's currently unavailable 97 | # see: https://discuss.python.org/t/st-birthtime-not-available/104350/2 98 | if (sys.platform == "win32" and sys.version_info >= (3, 12)) or ( 99 | sys.platform == "darwin" or sys.platform.startswith("freebsd") 100 | ): 101 | 102 | @property 103 | def st_birthtime(self) -> float: ... 104 | 105 | 106 | @runtime_checkable 107 | class UPathParser(PathParser, Protocol): 108 | """duck-type for upath.core.UPathParser""" 109 | 110 | def split(self, path: JoinablePathLike) -> tuple[str, str]: ... 111 | def splitext(self, path: JoinablePathLike) -> tuple[str, str]: ... 112 | def normcase(self, path: JoinablePathLike) -> str: ... 113 | 114 | def strip_protocol(self, path: JoinablePathLike) -> str: ... 115 | 116 | def join( 117 | self, 118 | path: JoinablePathLike, 119 | *paths: JoinablePathLike, 120 | ) -> str: ... 121 | 122 | def isabs(self, path: JoinablePathLike) -> bool: ... 123 | 124 | def splitdrive(self, path: JoinablePathLike) -> tuple[str, str]: ... 125 | 126 | def splitroot(self, path: JoinablePathLike) -> tuple[str, str, str]: ... 
127 | -------------------------------------------------------------------------------- /mkdocs.yml: -------------------------------------------------------------------------------- 1 | site_name: Universal Pathlib 2 | site_description: A universal pathlib implementation for Python 3 | strict: true 4 | # site_url: !ENV READTHEDOCS_CANONICAL_URL 5 | 6 | theme: 7 | name: 'material' 8 | logo: assets/logo-128x128-white.svg 9 | favicon: 'assets/favicon.png' 10 | palette: 11 | - media: "(prefers-color-scheme: light)" 12 | toggle: 13 | icon: material/lightbulb-outline 14 | name: "Switch to dark mode" 15 | - media: "(prefers-color-scheme: dark)" 16 | scheme: slate 17 | toggle: 18 | icon: material/lightbulb 19 | name: "Switch to light mode" 20 | features: 21 | # - content.tabs.link 22 | - content.code.annotate 23 | - content.code.copy 24 | - announce.dismiss 25 | - navigation.tabs 26 | 27 | extra_css: 28 | - css/extra.css 29 | 30 | repo_name: fsspec/universal_pathlib 31 | repo_url: https://github.com/fsspec/universal_pathlib 32 | edit_uri: edit/main/docs/ 33 | 34 | validation: 35 | omitted_files: warn 36 | absolute_links: warn 37 | unrecognized_links: warn 38 | 39 | nav: 40 | - Home: 41 | - Introduction: index.md 42 | - Why use Universal Pathlib: why.md 43 | - Installation: install.md 44 | - Contributing: contributing.md 45 | - Changelog: changelog.md 46 | - Concepts: 47 | - Overview: concepts/index.md 48 | - Filesystem Spec: concepts/fsspec.md 49 | - Standard Library Pathlib: concepts/pathlib.md 50 | - Universal Pathlib: concepts/upath.md 51 | - Usage: 52 | - Basic Usage: usage.md 53 | - API Reference: 54 | - Core: api/index.md 55 | - Registry: api/registry.md 56 | - Implementations: api/implementations.md 57 | - Extensions: api/extensions.md 58 | - Types: api/types.md 59 | - Migration Guide: migration.md 60 | 61 | markdown_extensions: 62 | - tables 63 | - attr_list 64 | - toc: 65 | permalink: true 66 | title: Page contents 67 | - admonition 68 | - pymdownx.details 69 | - 
pymdownx.highlight: 70 | pygments_lang_class: true 71 | - pymdownx.extra 72 | - pymdownx.emoji: 73 | emoji_index: !!python/name:material.extensions.emoji.twemoji 74 | emoji_generator: !!python/name:material.extensions.emoji.to_svg 75 | - pymdownx.tasklist: 76 | custom_checkbox: true 77 | - pymdownx.tabbed: 78 | alternate_style: true 79 | - pymdownx.superfences: 80 | custom_fences: 81 | - name: mermaid 82 | class: mermaid 83 | format: !!python/name:pymdownx.superfences.fence_code_format 84 | 85 | watch: 86 | - docs/ 87 | - upath/ 88 | 89 | plugins: 90 | - search 91 | - mkdocstrings: 92 | handlers: 93 | python: 94 | inventories: 95 | - https://docs.python.org/3/objects.inv 96 | paths: [.] 97 | options: 98 | preload_modules: 99 | - __future__ 100 | - typing 101 | - abc 102 | - asyncio 103 | - pathlib 104 | - pathlib_abc 105 | - fsspec 106 | - upath.types._abc 107 | - upath.types 108 | - upath.registry 109 | - upath.core 110 | docstring_style: numpy 111 | docstring_section_style: list 112 | group_by_category: false 113 | members_order: source 114 | docstring_options: 115 | ignore_init_summary: true 116 | docstring_section_style: spacy 117 | merge_init_into_class: true 118 | show_source: true 119 | show_root_heading: true 120 | show_root_toc_entry: true 121 | allow_inspection: true 122 | separate_signature: true 123 | show_signature: true 124 | show_signature_annotations: true 125 | show_signature_type_parameters: true 126 | signature_crossrefs: true 127 | show_symbol_type_heading: true 128 | - exclude: 129 | glob: 130 | - _plugins/* 131 | - __pycache__/* 132 | - tests/* 133 | - test_*.py 134 | hooks: 135 | - docs/_plugins/copy_changelog.py 136 | -------------------------------------------------------------------------------- /upath/_protocol.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import re 4 | from collections import ChainMap 5 | from pathlib import PurePath 6 | from typing 
import TYPE_CHECKING 7 | from typing import Any 8 | 9 | from fsspec.registry import known_implementations as _known_implementations 10 | from fsspec.registry import registry as _registry 11 | 12 | if TYPE_CHECKING: 13 | from upath.types import JoinablePathLike 14 | 15 | __all__ = [ 16 | "get_upath_protocol", 17 | "normalize_empty_netloc", 18 | "compatible_protocol", 19 | ] 20 | 21 | # Regular expression to match fsspec style protocols. 22 | # Matches single slash usage too for compatibility. 23 | _PROTOCOL_RE = re.compile( 24 | r"^(?P<protocol>[A-Za-z][A-Za-z0-9+]+):(?:(?P<slashes>//?)|:)(?P<path>.*)" 25 | ) 26 | 27 | # Matches data URIs 28 | _DATA_URI_RE = re.compile(r"^data:[^,]*,") 29 | 30 | 31 | def _match_protocol(pth: str) -> str: 32 | if m := _PROTOCOL_RE.match(pth): 33 | return m.group("protocol") 34 | elif _DATA_URI_RE.match(pth): 35 | return "data" 36 | return "" 37 | 38 | 39 | _fsspec_registry_map = ChainMap(_registry, _known_implementations) 40 | 41 | 42 | def _fsspec_protocol_equals(p0: str, p1: str) -> bool: 43 | """check if two fsspec protocols are equivalent""" 44 | p0 = p0 or "file" 45 | p1 = p1 or "file" 46 | if p0 == p1: 47 | return True 48 | 49 | try: 50 | o0 = _fsspec_registry_map[p0] 51 | except KeyError: 52 | raise ValueError(f"Protocol not known: {p0!r}") 53 | try: 54 | o1 = _fsspec_registry_map[p1] 55 | except KeyError: 56 | raise ValueError(f"Protocol not known: {p1!r}") 57 | 58 | return o0 == o1 59 | 60 | 61 | def get_upath_protocol( 62 | pth: JoinablePathLike, 63 | *, 64 | protocol: str | None = None, 65 | storage_options: dict[str, Any] | None = None, 66 | ) -> str: 67 | """return the filesystem spec protocol""" 68 | from upath.core import UPath 69 | 70 | if isinstance(pth, str): 71 | pth_protocol = _match_protocol(pth) 72 | elif isinstance(pth, UPath): 73 | pth_protocol = pth.protocol 74 | elif isinstance(pth, PurePath): 75 | pth_protocol = getattr(pth, "protocol", "") 76 | elif hasattr(pth, "__vfspath__"): 77 | pth_protocol = 
_match_protocol(pth.__vfspath__()) 78 | elif hasattr(pth, "__fspath__"): 79 | pth_protocol = _match_protocol(pth.__fspath__()) 80 | else: 81 | pth_protocol = _match_protocol(str(pth)) 82 | # if storage_options and not protocol and not pth_protocol: 83 | # protocol = "file" 84 | if protocol is None: 85 | return pth_protocol or "" 86 | elif ( 87 | protocol 88 | and pth_protocol 89 | and not _fsspec_protocol_equals(pth_protocol, protocol) 90 | ): 91 | raise ValueError( 92 | f"requested protocol {protocol!r} incompatible with {pth_protocol!r}" 93 | ) 94 | elif protocol == "" and pth_protocol: 95 | # explicitly requested empty protocol, but path has non-empty protocol 96 | raise ValueError( 97 | f"explicitly requested empty protocol {protocol!r}" 98 | f" incompatible with {pth_protocol!r}" 99 | ) 100 | return protocol or pth_protocol or "" 101 | 102 | 103 | def normalize_empty_netloc(pth: str) -> str: 104 | if m := _PROTOCOL_RE.match(pth): 105 | if m.group("slashes") == "/": 106 | protocol = m.group("protocol") 107 | path = m.group("path") 108 | pth = f"{protocol}:///{path}" 109 | return pth 110 | 111 | 112 | def compatible_protocol( 113 | protocol: str, 114 | *args: JoinablePathLike, 115 | ) -> bool: 116 | """check if UPath protocols are compatible""" 117 | from upath.core import UPath 118 | 119 | for arg in args: 120 | if isinstance(arg, UPath) and not arg.is_absolute(): 121 | # relative UPath are always compatible 122 | continue 123 | other_protocol = get_upath_protocol(arg) 124 | # consider protocols equivalent if they match up to the first "+" 125 | other_protocol = other_protocol.partition("+")[0] 126 | # protocols: only identical (or empty "") protocols can combine 127 | if other_protocol and not _fsspec_protocol_equals(other_protocol, protocol): 128 | return False 129 | return True 130 | -------------------------------------------------------------------------------- /docs/api/types.md: 
-------------------------------------------------------------------------------- 1 | # Types :label: 2 | 3 | The types module provides type hints, protocols, and type aliases for working with UPath 4 | and filesystem operations. This includes abstract base classes, type aliases for path-like 5 | objects, and typed dictionaries for filesystem-specific storage options. 6 | 7 | ## pathlib-abc base classes 8 | 9 | These abstract base classes and protocols are re-exported from [pathlib-abc](https://github.com/barneygale/pathlib-abc). 10 | They define the core path interfaces that stdlib pathlib and UPath implementations conform to. 11 | 12 | ::: upath.types.JoinablePath 13 | options: 14 | heading_level: 3 15 | show_root_heading: true 16 | show_root_full_path: false 17 | members: false 18 | show_bases: true 19 | 20 | ::: upath.types.ReadablePath 21 | options: 22 | heading_level: 3 23 | show_root_heading: true 24 | show_root_full_path: false 25 | members: false 26 | show_bases: true 27 | 28 | ::: upath.types.WritablePath 29 | options: 30 | heading_level: 3 31 | show_root_heading: true 32 | show_root_full_path: false 33 | members: false 34 | show_bases: true 35 | 36 | ::: upath.types.PathInfo 37 | options: 38 | heading_level: 3 39 | show_root_heading: true 40 | show_root_full_path: false 41 | members: false 42 | show_bases: true 43 | 44 | ::: upath.types.PathParser 45 | options: 46 | heading_level: 3 47 | show_root_heading: true 48 | show_root_full_path: false 49 | members: false 50 | show_bases: true 51 | 52 | --- 53 | 54 | ## UPath specific protocols 55 | 56 | ::: upath.types.UPathParser 57 | options: 58 | heading_level: 3 59 | show_root_heading: true 60 | show_root_full_path: false 61 | members: false 62 | show_bases: true 63 | 64 | --- 65 | 66 | ## Type Aliases 67 | 68 | Convenient type aliases for path-like objects used throughout UPath. 
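
As a rough illustration of how such aliases are meant to be used in annotations, the sketch below re-declares a simplified stand-in for `JoinablePathLike` and normalizes any accepted value to a string. The names `VFSPathLike`, `PathLikeArg`, `to_str`, and `Remote` are local to this example and are assumptions, not part of the UPath API.

```python
import os
from pathlib import PurePosixPath
from typing import Protocol, Union, runtime_checkable


@runtime_checkable
class VFSPathLike(Protocol):
    """Local stand-in for objects exposing ``__vfspath__``."""

    def __vfspath__(self) -> str: ...


# Simplified stand-in for the union behind JoinablePathLike (assumption).
PathLikeArg = Union[VFSPathLike, os.PathLike[str], str]


def to_str(path: PathLikeArg) -> str:
    """Return the string form of any accepted path-like value."""
    if isinstance(path, str):
        return path
    if isinstance(path, VFSPathLike):
        # Prefer the virtual-filesystem protocol when it is available.
        return path.__vfspath__()
    return os.fspath(path)


class Remote:
    """Hypothetical object that only implements ``__vfspath__``."""

    def __vfspath__(self) -> str:
        return "memory:///abc/file.txt"


print(to_str("s3://bucket/key"))
print(to_str(Remote()))
print(to_str(PurePosixPath("a/b")))
```

A function annotated with the union accepts plain strings, `os.PathLike` objects, and objects implementing `__vfspath__` without the caller converting them first.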
69 | 70 | ::: upath.types.JoinablePathLike 71 | options: 72 | heading_level: 3 73 | show_root_heading: true 74 | show_root_full_path: false 75 | 76 | Union of types that can be joined as path segments. 77 | 78 | ::: upath.types.ReadablePathLike 79 | options: 80 | heading_level: 3 81 | show_root_heading: true 82 | show_root_full_path: false 83 | 84 | Union of types that can be read from. 85 | 86 | ::: upath.types.WritablePathLike 87 | options: 88 | heading_level: 3 89 | show_root_heading: true 90 | show_root_full_path: false 91 | 92 | Union of types that can be written to. 93 | 94 | ::: upath.types.SupportsPathLike 95 | options: 96 | heading_level: 3 97 | show_root_heading: true 98 | show_root_full_path: false 99 | 100 | Union of objects that support `__fspath__()` or `__vfspath__()` protocols. 101 | 102 | ::: upath.types.StatResultType 103 | options: 104 | heading_level: 3 105 | show_root_heading: true 106 | show_root_full_path: false 107 | members: false 108 | 109 | Protocol for `os.stat_result`-like objects. 110 | 111 | --- 112 | 113 | ## Storage Options 114 | 115 | Typed dictionaries providing type hints for filesystem-specific configuration options. 116 | These help ensure correct parameter names and types when configuring different filesystems. 
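
To illustrate how a typed dictionary can help catch misspelled option names, here is a minimal sketch built around a locally defined `ExampleStorageOptions`. Its keys (`anon`, `endpoint_url`) and the helper `check_options` are assumptions made for this example and are not taken from the actual `upath.types.storage_options` definitions.

```python
from typing import Any, TypedDict


class ExampleStorageOptions(TypedDict, total=False):
    """Hypothetical options; the real classes declare their own keys."""

    anon: bool
    endpoint_url: str


def check_options(options: dict[str, Any]) -> list[str]:
    """Return the option names that the typed dictionary does not declare."""
    known = ExampleStorageOptions.__annotations__
    return sorted(name for name in options if name not in known)


# A static type checker accepts this literal because every key is declared.
opts: ExampleStorageOptions = {"anon": True, "endpoint_url": "http://localhost:9000"}

# The misspelled key is only detected at runtime by the explicit check.
unknown = check_options({"anon": True, "endpont_url": "http://localhost:9000"})
print(unknown)
```

Typed dictionaries are not validated at runtime, so the benefit comes mainly from static type checkers and editor completion; the explicit check above is only there to make the idea observable.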
117 | 118 | ::: upath.types.storage_options 119 | options: 120 | heading_level: 3 121 | show_root_heading: true 122 | show_root_full_path: false 123 | show_bases: false 124 | members: 125 | - SimpleCacheStorageOptions 126 | - GCSStorageOptions 127 | - S3StorageOptions 128 | - AzureStorageOptions 129 | - DataStorageOptions 130 | - FTPStorageOptions 131 | - GitHubStorageOptions 132 | - HDFSStorageOptions 133 | - HTTPStorageOptions 134 | - FileStorageOptions 135 | - MemoryStorageOptions 136 | - SFTPStorageOptions 137 | - SMBStorageOptions 138 | - WebdavStorageOptions 139 | - ZipStorageOptions 140 | - TarStorageOptions 141 | 142 | --- 143 | 144 | ## See Also :link: 145 | 146 | - [UPath](index.md) - Main UPath class documentation 147 | - [Implementations](implementations.md) - Built-in UPath subclasses 148 | - [Extensions](extensions.md) - Extending UPath functionality 149 | - [Registry](registry.md) - Implementation registry 150 | -------------------------------------------------------------------------------- /upath/implementations/http.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | import warnings 5 | from collections.abc import Iterator 6 | from itertools import chain 7 | from typing import TYPE_CHECKING 8 | from typing import Any 9 | from urllib.parse import urlsplit 10 | 11 | from fsspec.asyn import sync 12 | 13 | from upath._stat import UPathStatResult 14 | from upath.core import UPath 15 | from upath.types import JoinablePathLike 16 | from upath.types import StatResultType 17 | 18 | if TYPE_CHECKING: 19 | from typing import Literal 20 | 21 | if sys.version_info >= (3, 11): 22 | from typing import Self 23 | from typing import Unpack 24 | else: 25 | from typing_extensions import Self 26 | from typing_extensions import Unpack 27 | 28 | from upath._chain import FSSpecChainParser 29 | from upath.types.storage_options import HTTPStorageOptions 30 | 31 | __all__ = 
["HTTPPath"] 32 | 33 | 34 | class HTTPPath(UPath): 35 | __slots__ = () 36 | 37 | if TYPE_CHECKING: 38 | 39 | def __init__( 40 | self, 41 | *args: JoinablePathLike, 42 | protocol: Literal["http", "https"] | None = ..., 43 | chain_parser: FSSpecChainParser = ..., 44 | **storage_options: Unpack[HTTPStorageOptions], 45 | ) -> None: ... 46 | 47 | @classmethod 48 | def _transform_init_args( 49 | cls, 50 | args: tuple[JoinablePathLike, ...], 51 | protocol: str, 52 | storage_options: dict[str, Any], 53 | ) -> tuple[tuple[JoinablePathLike, ...], str, dict[str, Any]]: 54 | # allow initialization via a path argument and protocol keyword 55 | if args and not str(args[0]).startswith(protocol): 56 | args = (f"{protocol}://{str(args[0]).lstrip('/')}", *args[1:]) 57 | return args, protocol, storage_options 58 | 59 | def __str__(self) -> str: 60 | sr = urlsplit(super().__str__()) 61 | return sr._replace(path=sr.path or "/").geturl() 62 | 63 | @property 64 | def path(self) -> str: 65 | sr = urlsplit(super().path) 66 | return sr._replace(path=sr.path or "/").geturl() 67 | 68 | def stat(self, follow_symlinks: bool = True) -> StatResultType: 69 | if not follow_symlinks: 70 | warnings.warn( 71 | f"{type(self).__name__}.stat(follow_symlinks=False):" 72 | " is currently ignored.", 73 | UserWarning, 74 | stacklevel=2, 75 | ) 76 | info = self.fs.info(self.path) 77 | if "url" in info: 78 | info["type"] = "directory" if info["url"].endswith("/") else "file" 79 | return UPathStatResult.from_info(info) 80 | 81 | def iterdir(self) -> Iterator[Self]: 82 | it = iter(super().iterdir()) 83 | try: 84 | item0 = next(it) 85 | except (StopIteration, NotADirectoryError): 86 | raise NotADirectoryError(str(self)) 87 | except FileNotFoundError: 88 | raise FileNotFoundError(str(self)) 89 | else: 90 | yield from chain([item0], it) 91 | 92 | def resolve( 93 | self, 94 | strict: bool = False, 95 | follow_redirects: bool = True, 96 | ) -> Self: 97 | """Normalize the path and resolve redirects.""" 98 | # special 
handling of trailing slash behaviour 99 | parts = list(self.parts) 100 | if parts[-1:] == ["."]: 101 | parts[-1:] = [""] 102 | if parts[-2:] == ["", ".."]: 103 | parts[-2:] = [""] 104 | pth = self.with_segments(*parts) 105 | resolved_path = super(HTTPPath, pth).resolve(strict=strict) 106 | 107 | if follow_redirects: 108 | cls = type(self) 109 | # Get the fsspec fs 110 | fs = self.fs 111 | url = str(self) 112 | # Ensure we have a session 113 | session = sync(fs.loop, fs.set_session) 114 | # Use HEAD requests if the server allows it, falling back to GETs 115 | for method in (session.head, session.get): 116 | r = sync(fs.loop, method, url, allow_redirects=True) 117 | try: 118 | r.raise_for_status() 119 | except Exception as exc: 120 | if method == session.get: 121 | raise FileNotFoundError(self) from exc 122 | else: 123 | resolved_path = cls(str(r.url)) 124 | break 125 | 126 | return resolved_path 127 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_github.py: -------------------------------------------------------------------------------- 1 | import functools 2 | import os 3 | import platform 4 | import sys 5 | 6 | import pytest 7 | 8 | from upath import UPath 9 | from upath.implementations.github import GitHubPath 10 | from upath.tests.cases import BaseTests 11 | 12 | pytestmark = pytest.mark.skipif( 13 | os.environ.get("CI") 14 | and not ( 15 | platform.system() == "Linux" and sys.version_info[:2] in {(3, 9), (3, 13)} 16 | ), 17 | reason="Skipping GitHubPath tests to prevent rate limiting on GitHub API.", 18 | ) 19 | 20 | 21 | def xfail_on_github_connection_error(func): 22 | """Method decorator to xfail tests on GitHub rate limit or connection errors.""" 23 | 24 | @functools.wraps(func) 25 | def wrapper(self, *args, **kwargs): 26 | try: 27 | return func(self, *args, **kwargs) 28 | except Exception as e: 29 | str_e = str(e) 30 | if "rate limit exceeded" in str_e or "too many requests for url" in 
str_e: 31 | pytest.xfail("GitHub API rate limit exceeded") 32 | elif ( 33 | "nodename nor servname provided, or not known" in str_e 34 | or "Network is unreachable" in str_e 35 | ): 36 | pytest.xfail("No internet connection") 37 | else: 38 | raise 39 | 40 | return wrapper 41 | 42 | 43 | def wrap_all_tests(decorator): 44 | """Class decorator factory to wrap all test methods with a given decorator.""" 45 | 46 | def class_decorator(cls): 47 | for attr_name in dir(cls): 48 | if attr_name.startswith("test_"): 49 | orig_method = getattr(cls, attr_name) 50 | setattr(cls, attr_name, decorator(orig_method)) 51 | return cls 52 | 53 | return class_decorator 54 | 55 | 56 | @wrap_all_tests(xfail_on_github_connection_error) 57 | class TestUPathGitHubPath(BaseTests): 58 | """ 59 | Unit-tests for the GitHubPath implementation of UPath. 60 | """ 61 | 62 | @pytest.fixture(autouse=True) 63 | def path(self): 64 | """ 65 | Fixture for the UPath instance to be tested. 66 | """ 67 | path = "github://ap--:universal_pathlib@test_data/data" 68 | self.path = UPath(path) 69 | 70 | def test_is_GitHubPath(self): 71 | """ 72 | Test that the path is a GitHubPath instance. 
73 | """ 74 | assert isinstance(self.path, GitHubPath) 75 | 76 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 77 | def test_mkdir(self): 78 | pass 79 | 80 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 81 | def test_mkdir_exists_ok_false(self): 82 | pass 83 | 84 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 85 | def test_mkdir_parents_true_exists_ok_false(self): 86 | pass 87 | 88 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 89 | def test_rename(self): 90 | pass 91 | 92 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 93 | def test_rename2(self): 94 | pass 95 | 96 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 97 | def test_touch(self): 98 | pass 99 | 100 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 101 | def test_touch_unlink(self): 102 | pass 103 | 104 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 105 | def test_write_bytes(self): 106 | pass 107 | 108 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 109 | def test_write_text(self): 110 | pass 111 | 112 | @pytest.mark.skip(reason="GitHub filesystem is read-only") 113 | def test_fsspec_compat(self): 114 | pass 115 | 116 | @pytest.mark.skip(reason="Only testing read on GithubPath") 117 | def test_move_local(self, tmp_path): 118 | pass 119 | 120 | @pytest.mark.skip(reason="Only testing read on GithubPath") 121 | def test_move_into_local(self, tmp_path): 122 | pass 123 | 124 | @pytest.mark.skip(reason="Only testing read on GithubPath") 125 | def test_move_memory(self, clear_fsspec_memory_cache): 126 | pass 127 | 128 | @pytest.mark.skip(reason="Only testing read on GithubPath") 129 | def test_move_into_memory(self, clear_fsspec_memory_cache): 130 | pass 131 | 132 | @pytest.mark.skip(reason="Only testing read on GithubPath") 133 | def test_rename_with_target_absolute(self, target_factory): 134 | return super().test_rename_with_target_str_absolute(target_factory) 135 | 136 | 
@pytest.mark.skip(reason="Only testing read on GithubPath") 137 | def test_write_text_encoding(self): 138 | return super().test_write_text_encoding() 139 | 140 | @pytest.mark.skip(reason="Only testing read on GithubPath") 141 | def test_write_text_errors(self): 142 | return super().test_write_text_errors() 143 | -------------------------------------------------------------------------------- /upath/types/_abc.pyi: -------------------------------------------------------------------------------- 1 | """pathlib_abc exports for compatibility with pathlib.""" 2 | 3 | import sys 4 | from abc import ABC 5 | from abc import abstractmethod 6 | from typing import Any 7 | from typing import BinaryIO 8 | from typing import Callable 9 | from typing import Iterator 10 | from typing import Literal 11 | from typing import Protocol 12 | from typing import Sequence 13 | from typing import TextIO 14 | from typing import TypeVar 15 | from typing import runtime_checkable 16 | 17 | if sys.version_info > (3, 11): 18 | from typing import Self 19 | else: 20 | from typing_extensions import Self 21 | 22 | class JoinablePath(ABC): 23 | __slots__ = () 24 | 25 | @property 26 | @abstractmethod 27 | def parser(self) -> PathParser: ... 28 | @abstractmethod 29 | def with_segments(self, *pathsegments: str) -> Self: ... 30 | @abstractmethod 31 | def __vfspath__(self) -> str: ... 32 | @property 33 | def anchor(self) -> str: ... 34 | @property 35 | def name(self) -> str: ... 36 | @property 37 | def suffix(self) -> str: ... 38 | @property 39 | def suffixes(self) -> list[str]: ... 40 | @property 41 | def stem(self) -> str: ... 42 | def with_name(self, name: str) -> Self: ... 43 | def with_stem(self, stem: str) -> Self: ... 44 | def with_suffix(self, suffix: str) -> Self: ... 45 | @property 46 | def parts(self) -> Sequence[str]: ... 47 | def joinpath(self, *pathsegments: str) -> Self: ... 48 | def __truediv__(self, key: str) -> Self: ... 49 | def __rtruediv__(self, key: str) -> Self: ... 
50 | @property 51 | def parent(self) -> Self: ... 52 | @property 53 | def parents(self) -> Sequence[Self]: ... 54 | def full_match(self, pattern: str) -> bool: ... 55 | 56 | OnErrorCallable = Callable[[Exception], Any] 57 | T = TypeVar("T", bound="WritablePath") 58 | 59 | class ReadablePath(JoinablePath): 60 | __slots__ = () 61 | 62 | @property 63 | @abstractmethod 64 | def info(self) -> PathInfo: ... 65 | @abstractmethod 66 | def __open_reader__(self) -> BinaryIO: ... 67 | def read_bytes(self) -> bytes: ... 68 | def read_text( 69 | self, 70 | encoding: str | None = ..., 71 | errors: str | None = ..., 72 | newline: str | None = ..., 73 | ) -> str: ... 74 | @abstractmethod 75 | def iterdir(self) -> Iterator[Self]: ... 76 | def glob(self, pattern: str, *, recurse_symlinks: bool = ...) -> Iterator[Self]: ... 77 | def walk( 78 | self, 79 | top_down: bool = ..., 80 | on_error: OnErrorCallable | None = ..., 81 | follow_symlinks: bool = ..., 82 | ) -> Iterator[tuple[Self, list[str], list[str]]]: ... 83 | @abstractmethod 84 | def readlink(self) -> Self: ... 85 | def copy(self, target: T, **kwargs: Any) -> T: ... 86 | def copy_into(self, target_dir: T, **kwargs: Any) -> T: ... 87 | 88 | class WritablePath(JoinablePath): 89 | __slots__ = () 90 | 91 | @abstractmethod 92 | def symlink_to( 93 | self, target: ReadablePath, target_is_directory: bool = ... 94 | ) -> None: ... 95 | @abstractmethod 96 | def mkdir(self) -> None: ... 97 | @abstractmethod 98 | def __open_writer__(self, mode: Literal["a", "w", "x"]) -> BinaryIO: ... 99 | def write_bytes(self, data: bytes) -> int: ... 100 | def write_text( 101 | self, 102 | data: str, 103 | encoding: str | None = ..., 104 | errors: str | None = ..., 105 | newline: str | None = ..., 106 | ) -> int: ... 107 | def _copy_from(self, source: ReadablePath, follow_symlinks: bool = ...) -> None: ... 
108 | 109 | @runtime_checkable 110 | class PathParser(Protocol): 111 | sep: str 112 | altsep: str | None 113 | 114 | def split(self, path: str) -> tuple[str, str]: ... 115 | def splitext(self, path: str) -> tuple[str, str]: ... 116 | def normcase(self, path: str) -> str: ... 117 | 118 | @runtime_checkable 119 | class PathInfo(Protocol): 120 | def exists(self, *, follow_symlinks: bool = True) -> bool: ... 121 | def is_dir(self, *, follow_symlinks: bool = True) -> bool: ... 122 | def is_file(self, *, follow_symlinks: bool = True) -> bool: ... 123 | def is_symlink(self) -> bool: ... 124 | 125 | class SupportsOpenReader(Protocol): 126 | def __open_reader__(self) -> BinaryIO: ... 127 | 128 | class SupportsOpenWriter(Protocol): 129 | def __open_writer__(self, mode: Literal["a", "w", "x"]) -> BinaryIO: ... 130 | 131 | class SupportsOpenUpdater(Protocol): 132 | def __open_updater__(self, mode: Literal["r+", "w+", "+r", "+w"]) -> BinaryIO: ... 133 | 134 | def vfsopen( 135 | obj: SupportsOpenReader | SupportsOpenWriter | SupportsOpenUpdater, 136 | mode="r", 137 | buffering: int = -1, 138 | encoding: str | None = None, 139 | errors: str | None = None, 140 | newline: str | None = None, 141 | ) -> BinaryIO | TextIO: ... 142 | 143 | class SupportsVFSPath(Protocol): 144 | def __vfspath__(self) -> str: ... 145 | 146 | def vfspath(obj: SupportsVFSPath) -> str: ... 
147 | -------------------------------------------------------------------------------- /docs/assets/logo-128x128-white.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 16 | 18 | 20 | 24 | 28 | 29 | 30 | 51 | 56 | 61 | 66 | 71 | 76 | 81 | 86 | 91 | 92 | 97 | 100 | 105 | 113 | 114 | 115 | 116 | -------------------------------------------------------------------------------- /upath/tests/test_registry.py: -------------------------------------------------------------------------------- 1 | import random 2 | import string 3 | 4 | import pytest 5 | from fsspec.implementations.local import LocalFileSystem 6 | from fsspec.registry import _registry as fsspec_registry_private 7 | from fsspec.registry import known_implementations as fsspec_known_implementations 8 | from fsspec.registry import register_implementation as fsspec_register_implementation 9 | from fsspec.registry import registry as fsspec_registry 10 | 11 | from upath import UPath 12 | from upath.registry import available_implementations 13 | from upath.registry import get_upath_class 14 | from upath.registry import register_implementation 15 | 16 | IMPLEMENTATIONS = { 17 | "abfs", 18 | "abfss", 19 | "adl", 20 | "az", 21 | "data", 22 | "file", 23 | "ftp", 24 | "gcs", 25 | "gs", 26 | "hdfs", 27 | "hf", 28 | "http", 29 | "https", 30 | "local", 31 | "memory", 32 | "s3", 33 | "s3a", 34 | "simplecache", 35 | "sftp", 36 | "smb", 37 | "ssh", 38 | "webdav", 39 | "webdav+http", 40 | "webdav+https", 41 | "github", 42 | "zip", 43 | "tar", 44 | } 45 | 46 | 47 | @pytest.fixture(autouse=True) 48 | def reset_registry(): 49 | from upath.registry import _registry 50 | 51 | try: 52 | yield 53 | finally: 54 | _registry._m.maps[0].clear() # type: ignore 55 | 56 | 57 | @pytest.fixture() 58 | def fake_entrypoint(): 59 | from importlib.metadata import EntryPoint 60 | 61 | from upath.registry import _registry 62 | 63 | ep = EntryPoint( 64 | name="myeps", 65 | value="upath.core:UPath", 
66 | group="universal_pathlib.implementations", 67 | ) 68 | old_registry = _registry._entries.copy() 69 | 70 | try: 71 | _registry._entries["myeps"] = ep 72 | yield 73 | finally: 74 | _registry._entries.clear() 75 | _registry._entries.update(old_registry) 76 | 77 | 78 | def test_available_implementations(): 79 | impl = available_implementations() 80 | assert len(impl) == len(set(impl)) 81 | assert set(impl) == IMPLEMENTATIONS 82 | 83 | 84 | @pytest.fixture 85 | def fake_registered_proto(): 86 | fake_proto = "".join(random.choices(string.ascii_lowercase, k=8)) 87 | 88 | class FakeRandomFS(LocalFileSystem): 89 | protocol = fake_proto 90 | 91 | fsspec_register_implementation(fake_proto, FakeRandomFS) 92 | try: 93 | yield fake_proto 94 | finally: 95 | fsspec_registry_private.pop(fake_proto, None) 96 | 97 | 98 | def test_available_implementations_with_fallback(fake_registered_proto): 99 | impl = available_implementations(fallback=True) 100 | assert fake_registered_proto in impl 101 | assert set(impl) == IMPLEMENTATIONS.union( 102 | { 103 | *fsspec_known_implementations, 104 | *fsspec_registry, 105 | } 106 | ) 107 | 108 | 109 | def test_available_implementations_with_entrypoint(fake_entrypoint): 110 | impl = available_implementations() 111 | assert set(impl) == IMPLEMENTATIONS.union({"myeps"}) 112 | 113 | 114 | def test_register_implementation(): 115 | class MyProtoPath(UPath): 116 | pass 117 | 118 | register_implementation("myproto", MyProtoPath) 119 | 120 | assert get_upath_class("myproto") is MyProtoPath 121 | 122 | 123 | def test_register_implementation_wrong_input(): 124 | with pytest.raises(TypeError): 125 | register_implementation(None, UPath) # type: ignore 126 | with pytest.raises(ValueError): 127 | register_implementation("incorrect**protocol", UPath) 128 | with pytest.raises(ValueError): 129 | register_implementation("myproto", object, clobber=True) # type: ignore 130 | with pytest.raises(ValueError): 131 | register_implementation("file", UPath, clobber=False) 
132 | assert set(available_implementations()) == IMPLEMENTATIONS 133 | 134 | 135 | @pytest.mark.parametrize("protocol", IMPLEMENTATIONS) 136 | def test_get_upath_class(protocol): 137 | upath_cls = get_upath_class(protocol) 138 | assert issubclass(upath_cls, UPath) 139 | 140 | 141 | def test_get_upath_class_without_implementation(clear_registry): 142 | with pytest.warns( 143 | UserWarning, match="UPath 'mock' filesystem not explicitly implemented." 144 | ): 145 | upath_cls = get_upath_class("mock") 146 | assert issubclass(upath_cls, UPath) 147 | 148 | 149 | def test_get_upath_class_without_implementation_no_fallback(clear_registry): 150 | assert get_upath_class("mock", fallback=False) is None 151 | 152 | 153 | def test_get_upath_class_unknown_protocol(clear_registry): 154 | assert get_upath_class("doesnotexist") is None 155 | 156 | 157 | def test_get_upath_class_from_entrypoint(fake_entrypoint): 158 | assert issubclass(get_upath_class("myeps"), UPath) 159 | 160 | 161 | @pytest.mark.parametrize( 162 | "protocol", [pytest.param("", id="empty-str"), pytest.param(None, id="none")] 163 | ) 164 | def test_get_upath_class_falsey_protocol(protocol): 165 | assert issubclass(get_upath_class(protocol), UPath) 166 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.rst: -------------------------------------------------------------------------------- 1 | Contributor Covenant Code of Conduct 2 | ==================================== 3 | 4 | Our Pledge 5 | ---------- 6 | 7 | We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socioeconomic status, nationality, personal appearance, race, religion, or sexual identity and orientation. 
8 | 9 | We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. 10 | 11 | 12 | Our Standards 13 | ------------- 14 | 15 | Examples of behavior that contributes to a positive environment for our community include: 16 | 17 | - Demonstrating empathy and kindness toward other people 18 | - Being respectful of differing opinions, viewpoints, and experiences 19 | - Giving and gracefully accepting constructive feedback 20 | - Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience 21 | - Focusing on what is best not just for us as individuals, but for the overall community 22 | 23 | Examples of unacceptable behavior include: 24 | 25 | - The use of sexualized language or imagery, and sexual attention or 26 | advances of any kind 27 | - Trolling, insulting or derogatory comments, and personal or political attacks 28 | - Public or private harassment 29 | - Publishing others' private information, such as a physical or email 30 | address, without their explicit permission 31 | - Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | Enforcement Responsibilities 35 | ---------------------------- 36 | 37 | Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful. 38 | 39 | Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate. 40 | 41 | 42 | Scope 43 | ----- 44 | 45 | This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. 
Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. 46 | 47 | 48 | Enforcement 49 | ----------- 50 | 51 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at andrewfulton9@gmail.com. All complaints will be reviewed and investigated promptly and fairly. 52 | 53 | All community leaders are obligated to respect the privacy and security of the reporter of any incident. 54 | 55 | 56 | Enforcement Guidelines 57 | ---------------------- 58 | 59 | Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct: 60 | 61 | 62 | 1. Correction 63 | ~~~~~~~~~~~~~ 64 | 65 | **Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. 66 | 67 | **Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. 68 | 69 | 70 | 2. Warning 71 | ~~~~~~~~~~ 72 | 73 | **Community Impact**: A violation through a single incident or series of actions. 74 | 75 | **Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. 76 | 77 | 78 | 3. Temporary Ban 79 | ~~~~~~~~~~~~~~~~ 80 | 81 | **Community Impact**: A serious violation of community standards, including sustained inappropriate behavior. 
82 | 83 | **Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. 84 | 85 | 86 | 4. Permanent Ban 87 | ~~~~~~~~~~~~~~~~ 88 | 89 | **Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. 90 | 91 | **Consequence**: A permanent ban from any sort of public interaction within the community. 92 | 93 | 94 | Attribution 95 | ----------- 96 | 97 | This Code of Conduct is adapted from the `Contributor Covenant <homepage_>`__, version 2.0, 98 | available at https://www.contributor-covenant.org/version/2/0/code_of_conduct/. 99 | 100 | Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder. 101 | 102 | .. _homepage: https://www.contributor-covenant.org 103 | 104 | For answers to common questions about this code of conduct, see the FAQ at 105 | https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations. 
106 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_s3.py: -------------------------------------------------------------------------------- 1 | """see upath/tests/conftest.py for fixtures""" 2 | 3 | import sys 4 | 5 | import fsspec 6 | import pytest # noqa: F401 7 | 8 | from upath import UPath 9 | from upath.implementations.cloud import S3Path 10 | 11 | from ..cases import BaseTests 12 | 13 | 14 | def silence_botocore_datetime_deprecation(cls): 15 | # botocore uses datetime.datetime.utcnow in 3.12 which is deprecated 16 | # see: https://github.com/boto/boto3/issues/3889#issuecomment-1751296363 17 | if sys.version_info >= (3, 12): 18 | return pytest.mark.filterwarnings( 19 | "ignore" 20 | r":datetime.datetime.utcnow\(\) is deprecated" 21 | ":DeprecationWarning" 22 | )(cls) 23 | else: 24 | return cls 25 | 26 | 27 | @silence_botocore_datetime_deprecation 28 | class TestUPathS3(BaseTests): 29 | SUPPORTS_EMPTY_DIRS = False 30 | 31 | @pytest.fixture(autouse=True) 32 | def path(self, s3_fixture): 33 | path, anon, s3so = s3_fixture 34 | self.path = UPath(path, anon=anon, **s3so) 35 | self.anon = anon 36 | self.s3so = s3so 37 | 38 | def test_is_S3Path(self): 39 | assert isinstance(self.path, S3Path) 40 | 41 | def test_chmod(self): 42 | # todo 43 | pass 44 | 45 | def test_rmdir(self): 46 | dirname = "rmdir_test" 47 | mock_dir = self.path.joinpath(dirname) 48 | mock_dir.joinpath("test.txt").touch() 49 | mock_dir.rmdir() 50 | assert not mock_dir.exists() 51 | with pytest.raises(NotADirectoryError): 52 | self.path.joinpath("file1.txt").rmdir() 53 | 54 | def test_relative_to(self): 55 | assert "file.txt" == str( 56 | UPath("s3://test_bucket/file.txt").relative_to(UPath("s3://test_bucket")) 57 | ) 58 | 59 | def test_iterdir_root(self): 60 | client_kwargs = self.path.storage_options["client_kwargs"] 61 | bucket_path = UPath("s3://other_test_bucket", client_kwargs=client_kwargs) 62 | bucket_path.mkdir() 63 | 64 | 
(bucket_path / "test1.txt").touch() 65 | (bucket_path / "test2.txt").touch() 66 | 67 | for x in bucket_path.iterdir(): 68 | assert x.name != "" 69 | assert x.exists() 70 | 71 | @pytest.mark.parametrize( 72 | "joiner", [["bucket", "path", "file"], ["bucket/path/file"]] 73 | ) 74 | def test_no_bucket_joinpath(self, joiner): 75 | path = UPath("s3://", anon=self.anon, **self.s3so) 76 | path = path.joinpath(*joiner) 77 | assert str(path) == "s3://bucket/path/file" 78 | 79 | def test_creating_s3path_with_bucket(self): 80 | path = UPath("s3://", bucket="bucket", anon=self.anon, **self.s3so) 81 | assert str(path) == "s3://bucket/" 82 | 83 | def test_iterdir_with_plus_in_name(self, s3_with_plus_chr_name): 84 | bucket, anon, s3so = s3_with_plus_chr_name 85 | p = UPath( 86 | f"s3://{bucket}/manual__2022-02-19T14:31:25.891270+00:00", 87 | anon=True, 88 | **s3so, 89 | ) 90 | 91 | files = list(p.iterdir()) 92 | assert len(files) == 1 93 | (file,) = files 94 | assert file == p.joinpath("file.txt") 95 | 96 | @pytest.mark.xfail(reason="fsspec/universal_pathlib#144") 97 | def test_rglob_with_double_fwd_slash(self, s3_with_double_fwd_slash_files): 98 | import boto3 99 | import botocore.exceptions 100 | 101 | bucket, anon, s3so = s3_with_double_fwd_slash_files 102 | 103 | conn = boto3.resource("s3", **s3so["client_kwargs"]) 104 | # ensure there's no s3://bucket/key.txt object 105 | with pytest.raises(botocore.exceptions.ClientError, match=".*Not Found.*"): 106 | conn.Object(bucket, "key.txt").load() 107 | # ensure there's a s3://bucket//key.txt object 108 | assert conn.Object(bucket, "/key.txt").get()["Body"].read() == b"hello world" 109 | 110 | p0 = UPath(f"s3://{bucket}//key.txt", **s3so) 111 | assert p0.read_bytes() == b"hello world" 112 | p1 = UPath(f"s3://{bucket}", **s3so) 113 | assert list(p1.rglob("*.txt")) == [p0] 114 | 115 | 116 | @pytest.fixture 117 | def s3_with_plus_chr_name(s3_server): 118 | anon, s3so = s3_server 119 | s3 = fsspec.filesystem("s3", anon=False, **s3so) 
120 | bucket = "plus_chr_bucket" 121 | path = f"{bucket}/manual__2022-02-19T14:31:25.891270+00:00" 122 | s3.mkdir(path) 123 | s3.touch(f"{path}/file.txt") 124 | s3.invalidate_cache() 125 | try: 126 | yield bucket, anon, s3so 127 | finally: 128 | if s3.exists(bucket): 129 | for dir, _, keys in s3.walk(bucket): 130 | for key in keys: 131 | if key.rstrip("/"): 132 | s3.rm(f"{dir}/{key}") 133 | 134 | 135 | @pytest.fixture 136 | def s3_with_double_fwd_slash_files(s3_server): 137 | anon, s3so = s3_server 138 | s3 = fsspec.filesystem("s3", anon=False, **s3so) 139 | bucket = "double_fwd_slash_bucket" 140 | s3.mkdir(bucket + "/") 141 | s3.pipe_file(f"{bucket}//key.txt", b"hello world") 142 | try: 143 | yield bucket, anon, s3so 144 | finally: 145 | if s3.exists(bucket): 146 | for dir, _, keys in s3.walk(bucket): 147 | for key in keys: 148 | if key.rstrip("/"): 149 | s3.rm(f"{dir}/{key}") 150 | 151 | 152 | def test_path_with_hash_and_space(): 153 | assert "with#hash and space" in UPath("s3://bucket/with#hash and space/abc").parts 154 | 155 | 156 | def test_pathlib_consistent_join(): 157 | b0 = UPath("s3://mybucket/withkey/").joinpath("subfolder/myfile.txt") 158 | b1 = UPath("s3://mybucket/withkey").joinpath("subfolder/myfile.txt") 159 | assert b0 == b1 160 | assert "s3://mybucket/withkey/subfolder/myfile.txt" == str(b0) == str(b1) 161 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools>=64", "setuptools_scm>=8"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "universal_pathlib" 7 | license = "MIT" 8 | authors = [ 9 | {name = "Andrew Fulton", email = "andrewfulton9@gmail.com"}, 10 | ] 11 | description = "pathlib api extended to use fsspec backends" 12 | maintainers = [ 13 | {name = "Andreas Poehlmann"}, 14 | {name = "Andreas Poehlmann", email = "andreas@poehlmann.io"}, 15 | 
{name = "Norman Rzepka"}, 16 | ] 17 | requires-python = ">=3.9" 18 | dependencies = [ 19 | "fsspec >=2024.5.0", 20 | "pathlib-abc >=0.5.1,<0.6.0", 21 | ] 22 | classifiers = [ 23 | "Programming Language :: Python :: 3", 24 | "Programming Language :: Python :: 3.9", 25 | "Programming Language :: Python :: 3.10", 26 | "Programming Language :: Python :: 3.11", 27 | "Programming Language :: Python :: 3.12", 28 | "Programming Language :: Python :: 3.13", 29 | "Programming Language :: Python :: 3.14", 30 | "Development Status :: 4 - Beta", 31 | ] 32 | keywords = ["filesystem-spec", "pathlib"] 33 | dynamic = ["version", "readme"] 34 | 35 | [tool.setuptools.dynamic] 36 | readme = {file = ["README.md"], content-type = "text/markdown"} 37 | 38 | [project.optional-dependencies] 39 | tests = [ 40 | "pytest >=8", 41 | "pytest-sugar >=0.9.7", 42 | "pytest-cov >=4.1.0", 43 | "pytest-mock >=3.12.0", 44 | "pylint >=2.17.4", 45 | "mypy >=1.10.0", 46 | "pydantic >=2", 47 | "pytest-mypy-plugins >=3.1.2", 48 | "packaging", 49 | ] 50 | typechecking = [ 51 | "mypy >=1.10.0", 52 | "pytest-mypy-plugins >=3.1.2", 53 | ] 54 | dev = [ 55 | "fsspec[adl,http,github,gcs,s3,ssh,smb] >=2024.5.0", 56 | "s3fs >=2024.5.0", 57 | "gcsfs >=2024.5.0", 58 | "adlfs >=2024", 59 | "huggingface_hub", 60 | "webdav4[fsspec]", 61 | # testing 62 | "moto[s3,server]", 63 | "wsgidav", 64 | "cheroot", 65 | # "hadoop-test-cluster", 66 | # "pyarrow", 67 | "pyftpdlib", 68 | "typing_extensions; python_version<'3.11'", 69 | ] 70 | dev-third-party = [ 71 | "pydantic", 72 | "pydantic-settings", 73 | ] 74 | 75 | [project.urls] 76 | Homepage = "https://github.com/fsspec/universal_pathlib" 77 | Changelog = "https://github.com/fsspec/universal_pathlib/blob/main/CHANGELOG.md" 78 | 79 | [tool.setuptools] 80 | include-package-data = false 81 | 82 | [tool.setuptools.package-data] 83 | upath = ["py.typed"] 84 | 85 | [tool.setuptools.packages.find] 86 | exclude = [ 87 | "upath.tests", 88 | "upath.tests.*", 89 | ] 90 | namespaces = 
false 91 | 92 | [tool.setuptools_scm] 93 | write_to = "upath/_version.py" 94 | version_scheme = "post-release" 95 | 96 | [tool.black] 97 | line-length = 88 98 | include = '\.pyi?$' 99 | exclude = ''' 100 | /( 101 | \.eggs 102 | | \.git 103 | | \.hg 104 | | \.mypy_cache 105 | | \.tox 106 | | \.venv 107 | | _build 108 | | buck-out 109 | | build 110 | | dist 111 | )/ 112 | ''' 113 | force-exclude = ''' 114 | ( 115 | ^/upath/tests/pathlib/_test_support\.py 116 | |^/upath/tests/pathlib/test_pathlib_.*\.py 117 | ) 118 | ''' 119 | 120 | [tool.isort] 121 | profile = "black" 122 | known_first_party = ["upath"] 123 | force_single_line = true 124 | line_length = 88 125 | 126 | [tool.pytest.ini_options] 127 | addopts = "-ra -m 'not hdfs' -p no:pytest-mypy-plugins" 128 | markers = [ 129 | "hdfs: mark test as hdfs", 130 | "pathlib: mark cpython pathlib tests", 131 | ] 132 | 133 | [tool.coverage.run] 134 | branch = true 135 | source = ["upath"] 136 | 137 | [tool.coverage.report] 138 | show_missing = true 139 | exclude_lines = [ 140 | "pragma: no cover", 141 | "if __name__ == .__main__.:", 142 | "if typing.TYPE_CHECKING:", 143 | "if TYPE_CHECKING:", 144 | "raise NotImplementedError", 145 | "raise AssertionError", 146 | "@overload", 147 | "except ImportError", 148 | ] 149 | 150 | [tool.mypy] 151 | # Error output 152 | show_column_numbers = false 153 | show_error_codes = true 154 | show_error_context = true 155 | show_traceback = true 156 | pretty = true 157 | check_untyped_defs = false 158 | # Warnings 159 | warn_no_return = true 160 | warn_redundant_casts = true 161 | warn_unreachable = true 162 | files = ["upath"] 163 | exclude = "^notebooks|^venv.*|tests.*|^noxfile.py" 164 | 165 | [[tool.mypy.overrides]] 166 | module = "fsspec.*" 167 | ignore_missing_imports = true 168 | 169 | [[tool.mypy.overrides]] 170 | module = "webdav4.*" 171 | ignore_missing_imports = true 172 | 173 | [[tool.mypy.overrides]] 174 | module = "pathlib_abc.*" 175 | ignore_missing_imports = true 176 | 177 | 
[[tool.mypy.overrides]] 178 | module = "smbprotocol.*" 179 | ignore_missing_imports = true 180 | 181 | [[tool.mypy.overrides]] 182 | module = "pydantic.*" 183 | ignore_missing_imports = true 184 | ignore_errors = true 185 | 186 | [[tool.mypy.overrides]] 187 | module = "pydantic_core.*" 188 | ignore_missing_imports = true 189 | ignore_errors = true 190 | 191 | [[tool.mypy.overrides]] 192 | module = "typing_inspection.*" 193 | ignore_errors = true 194 | 195 | [[tool.mypy.overrides]] 196 | module = "annotated_types.*" 197 | ignore_errors = true 198 | 199 | [tool.pylint.format] 200 | max-line-length = 88 201 | 202 | [tool.pylint.message_control] 203 | enable = ["c-extension-no-member", "no-else-return"] 204 | 205 | [tool.pylint.variables] 206 | dummy-variables-rgx = "_+$|(_[a-zA-Z0-9_]*[a-zA-Z0-9]+?$)|dummy|^ignored_|^unused_" 207 | ignored-argument-names = "_.*|^ignored_|^unused_|args|kwargs" 208 | 209 | [tool.codespell] 210 | ignore-words-list = " " 211 | 212 | [tool.bandit] 213 | exclude_dirs = ["tests"] 214 | skips = ["B101"] 215 | 216 | [dependency-groups] 217 | docs = [ 218 | "mkdocs>=1.6.1", 219 | "click!=8.2.2,!=8.3.0", # https://github.com/mkdocs/mkdocs/issues/4032 220 | "mkdocs-material>=9.6.22", 221 | "mkdocstrings[python]>=0.30.1", 222 | "mkdocs-exclude>=1.0.2", 223 | "pymdown-extensions>=10.7.0", 224 | "ruff>=0.14.1", 225 | ] 226 | -------------------------------------------------------------------------------- /noxfile.py: -------------------------------------------------------------------------------- 1 | """Automation using nox.""" 2 | 3 | import glob 4 | import os 5 | import sys 6 | 7 | import nox 8 | 9 | nox.options.reuse_existing_virtualenvs = True 10 | nox.options.error_on_external_run = True 11 | 12 | nox.needs_version = ">=2024.04.15" 13 | nox.options.default_venv_backend = "uv" 14 | 15 | nox.options.sessions = "lint", "tests", "type-checking", "type-safety" 16 | locations = ("upath",) 17 | running_in_ci = os.environ.get("CI", "") != "" 18 | 19 | 
SUPPORTED_PYTHONS = ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"] 20 | BASE_PYTHON = SUPPORTED_PYTHONS[-3] 21 | MIN_PYTHON = SUPPORTED_PYTHONS[0] 22 | 23 | 24 | @(lambda f: f()) 25 | def FSSPEC_MIN_VERSION() -> str: 26 | """Get the minimum fsspec version boundary from pyproject.toml.""" 27 | try: 28 | from packaging.requirements import Requirement 29 | 30 | if sys.version_info >= (3, 11): 31 | from tomllib import load as toml_load 32 | else: 33 | from tomli import load as toml_load 34 | except ImportError: 35 | raise RuntimeError( 36 | "We rely on nox>=2024.04.15 depending on `packaging` and `tomli/tomllib`." 37 | " Please report if you see this error." 38 | ) 39 | 40 | with open("pyproject.toml", "rb") as f: 41 | pyproject_data = toml_load(f) 42 | 43 | for requirement in pyproject_data["project"]["dependencies"]: 44 | req = Requirement(requirement) 45 | if req.name == "fsspec": 46 | for specifier in req.specifier: 47 | if specifier.operator == ">=": 48 | return str(specifier.version) 49 | raise RuntimeError("Could not find fsspec minimum version in pyproject.toml") 50 | 51 | 52 | @nox.session(python=SUPPORTED_PYTHONS) 53 | def tests(session: nox.Session) -> None: 54 | """Run the test suite.""" 55 | # workaround in case no aiohttp binary wheels are available 56 | if session.python == "3.14": 57 | session.env["AIOHTTP_NO_EXTENSIONS"] = "1" 58 | session.install(".[tests,dev]", "pydantic>=2.12.0a1") 59 | else: 60 | session.install(".[tests,dev,dev-third-party]") 61 | session.run("uv", "pip", "freeze", silent=not running_in_ci) 62 | session.run( 63 | "pytest", 64 | "-m", 65 | "not hdfs", 66 | "--cov", 67 | "--cov-config=pyproject.toml", 68 | *session.posargs, 69 | env={"COVERAGE_FILE": f".coverage.{session.python}"}, 70 | ) 71 | 72 | 73 | @nox.session(python=MIN_PYTHON, name="tests-minversion") 74 | def tests_minversion(session: nox.Session) -> None: 75 | session.install(f"fsspec=={FSSPEC_MIN_VERSION}", ".[tests,dev]") 76 | session.run("uv", "pip", "freeze", 
silent=not running_in_ci) 77 | session.run( 78 | "pytest", 79 | "-m", 80 | "not hdfs", 81 | "--cov", 82 | "--cov-config=pyproject.toml", 83 | *session.posargs, 84 | env={"COVERAGE_FILE": f".coverage.{session.python}"}, 85 | ) 86 | 87 | 88 | tests_minversion.__doc__ = f"Run the test suite with fsspec=={FSSPEC_MIN_VERSION}." 89 | 90 | 91 | @nox.session 92 | def lint(session: nox.Session) -> None: 93 | """Run pre-commit hooks.""" 94 | session.install("pre-commit") 95 | session.install("-e", ".[tests]") 96 | 97 | args = *(session.posargs or ("--show-diff-on-failure",)), "--all-files" 98 | session.run("pre-commit", "run", *args) 99 | 100 | 101 | @nox.session 102 | def safety(session: nox.Session) -> None: 103 | """Scan dependencies for insecure packages.""" 104 | session.install(".") 105 | session.install("safety") 106 | session.run("safety", "check", "--full-report") 107 | 108 | 109 | @nox.session 110 | def build(session: nox.Session) -> None: 111 | """Build sdists and wheels.""" 112 | session.install("build", "setuptools", "twine") 113 | session.run("python", "-m", "build") 114 | dists = glob.glob("dist/*") 115 | session.run("twine", "check", *dists, silent=True) 116 | 117 | 118 | @nox.session 119 | def develop(session: nox.Session) -> None: 120 | """Sets up a python development environment for the project.""" 121 | session.run("uv", "venv", external=True) 122 | 123 | 124 | @nox.session(name="type-checking", python=BASE_PYTHON) 125 | def type_checking(session): 126 | """Run mypy checks.""" 127 | session.install("-e", ".[typechecking]") 128 | session.run("python", "-m", "mypy") 129 | 130 | 131 | @nox.session(name="type-safety", python=SUPPORTED_PYTHONS) 132 | def type_safety(session): 133 | """Run typesafety tests.""" 134 | session.install("-e", ".[typechecking]") 135 | session.run( 136 | "python", 137 | "-m", 138 | "pytest", 139 | "-v", 140 | "-p", 141 | "pytest-mypy-plugins", 142 | "--mypy-pyproject-toml-file", 143 | "pyproject.toml", 144 | "typesafety", 145 | 
*session.posargs, 146 | ) 147 | 148 | 149 | @nox.session(name="flavours-upgrade-deps", python=BASE_PYTHON) 150 | def upgrade_flavours(session): 151 | session.run("uvx", "pur", "-r", "dev/requirements.txt") 152 | 153 | 154 | @nox.session(name="flavours-codegen", python=BASE_PYTHON) 155 | def generate_flavours(session): 156 | session.install("-r", "dev/requirements.txt") 157 | with open("upath/_flavour_sources.py", "w") as target: 158 | session.run( 159 | "python", 160 | "dev/fsspec_inspector/generate_flavours.py", 161 | stdout=target, 162 | stderr=None, 163 | ) 164 | 165 | 166 | @nox.session(name="docs-build", python=BASE_PYTHON) 167 | def docs_build(session): 168 | """Build the documentation in strict mode.""" 169 | session.install("--group=docs", "-e", ".") 170 | session.run("mkdocs", "build") 171 | 172 | 173 | @nox.session(name="docs-serve", python=BASE_PYTHON) 174 | def docs_serve(session): 175 | """Serve the documentation with live reloading.""" 176 | session.install("--group=docs", "-e", ".") 177 | session.run("mkdocs", "serve", "--no-strict") 178 | -------------------------------------------------------------------------------- /upath/implementations/cloud.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | import sys 4 | from typing import TYPE_CHECKING 5 | from typing import Any 6 | 7 | from upath._chain import DEFAULT_CHAIN_PARSER 8 | from upath._flavour import upath_strip_protocol 9 | from upath.core import UPath 10 | from upath.types import JoinablePathLike 11 | 12 | if TYPE_CHECKING: 13 | from typing import Literal 14 | 15 | if sys.version_info >= (3, 11): 16 | from typing import Unpack 17 | else: 18 | from typing_extensions import Unpack 19 | 20 | from upath._chain import FSSpecChainParser 21 | from upath.types.storage_options import AzureStorageOptions 22 | from upath.types.storage_options import GCSStorageOptions 23 | from upath.types.storage_options import 
HfStorageOptions 24 | from upath.types.storage_options import S3StorageOptions 25 | 26 | __all__ = [ 27 | "CloudPath", 28 | "GCSPath", 29 | "S3Path", 30 | "AzurePath", 31 | "HfPath", 32 | ] 33 | 34 | 35 | class CloudPath(UPath): 36 | __slots__ = () 37 | 38 | @classmethod 39 | def _transform_init_args( 40 | cls, 41 | args: tuple[JoinablePathLike, ...], 42 | protocol: str, 43 | storage_options: dict[str, Any], 44 | ) -> tuple[tuple[JoinablePathLike, ...], str, dict[str, Any]]: 45 | for key in ["bucket", "netloc"]: 46 | bucket = storage_options.pop(key, None) 47 | if bucket: 48 | if str(args[0]).startswith("/"): 49 | args = (f"{protocol}://{bucket}{args[0]}", *args[1:]) 50 | else: 51 | args0 = upath_strip_protocol(args[0]) 52 | args = (f"{protocol}://{bucket}/", args0, *args[1:]) 53 | break 54 | return super()._transform_init_args(args, protocol, storage_options) 55 | 56 | @property 57 | def root(self) -> str: 58 | if self._relative_base is not None: 59 | return "" 60 | return self.parser.sep 61 | 62 | def __str__(self) -> str: 63 | path = super().__str__() 64 | if self._relative_base is None: 65 | drive = self.parser.splitdrive(path)[0] 66 | if drive and path == f"{self.protocol}://{drive}": 67 | return f"{path}{self.root}" 68 | return path 69 | 70 | @property 71 | def path(self) -> str: 72 | self_path = super().path.rstrip(self.parser.sep) 73 | if ( 74 | self._relative_base is None 75 | and self_path 76 | and self.parser.sep not in self_path 77 | ): 78 | return self_path + self.root 79 | return self_path 80 | 81 | def mkdir( 82 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False 83 | ) -> None: 84 | if not parents and not exist_ok and self.exists(): 85 | raise FileExistsError(self.path) 86 | super().mkdir(mode=mode, parents=parents, exist_ok=exist_ok) 87 | 88 | 89 | class GCSPath(CloudPath): 90 | __slots__ = () 91 | 92 | def __init__( 93 | self, 94 | *args: JoinablePathLike, 95 | protocol: Literal["gcs", "gs"] | None = None, 96 | chain_parser: 
FSSpecChainParser = DEFAULT_CHAIN_PARSER, 97 | **storage_options: Unpack[GCSStorageOptions], 98 | ) -> None: 99 | super().__init__( 100 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options 101 | ) 102 | if not self.drive and len(self.parts) > 1: 103 | raise ValueError("non key-like path provided (bucket/container missing)") 104 | 105 | def mkdir( 106 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False 107 | ) -> None: 108 | try: 109 | super().mkdir(mode=mode, parents=parents, exist_ok=exist_ok) 110 | except TypeError as err: 111 | if "unexpected keyword argument 'create_parents'" in str(err): 112 | self.fs.mkdir(self.path) 113 | 114 | def exists(self, *, follow_symlinks: bool = True) -> bool: 115 | # required for gcsfs<2025.5.0, see: https://github.com/fsspec/gcsfs/pull/676 116 | path = self.path 117 | if len(path) > 1: 118 | path = path.removesuffix(self.root) 119 | return self.fs.exists(path) 120 | 121 | 122 | class S3Path(CloudPath): 123 | __slots__ = () 124 | 125 | def __init__( 126 | self, 127 | *args: JoinablePathLike, 128 | protocol: Literal["s3", "s3a"] | None = None, 129 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER, 130 | **storage_options: Unpack[S3StorageOptions], 131 | ) -> None: 132 | super().__init__( 133 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options 134 | ) 135 | if not self.drive and len(self.parts) > 1: 136 | raise ValueError("non key-like path provided (bucket/container missing)") 137 | 138 | 139 | class AzurePath(CloudPath): 140 | __slots__ = () 141 | 142 | def __init__( 143 | self, 144 | *args: JoinablePathLike, 145 | protocol: Literal["abfs", "abfss", "adl", "az"] | None = None, 146 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER, 147 | **storage_options: Unpack[AzureStorageOptions], 148 | ) -> None: 149 | super().__init__( 150 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options 151 | ) 152 | if not self.drive and len(self.parts) > 1: 
153 | raise ValueError("non key-like path provided (bucket/container missing)") 154 | 155 | 156 | class HfPath(CloudPath): 157 | __slots__ = () 158 | 159 | def __init__( 160 | self, 161 | *args: JoinablePathLike, 162 | protocol: Literal["hf"] | None = None, 163 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER, 164 | **storage_options: Unpack[HfStorageOptions], 165 | ) -> None: 166 | super().__init__( 167 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options 168 | ) 169 | -------------------------------------------------------------------------------- /upath/tests/test_extensions.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | from contextlib import nullcontext 4 | 5 | import pytest 6 | 7 | from upath import UnsupportedOperation 8 | from upath import UPath 9 | from upath.extensions import ProxyUPath 10 | from upath.implementations.local import FilePath 11 | from upath.implementations.local import PosixUPath 12 | from upath.implementations.local import WindowsUPath 13 | from upath.implementations.memory import MemoryPath 14 | from upath.tests.cases import BaseTests 15 | 16 | 17 | class TestProxyMemoryPath(BaseTests): 18 | @pytest.fixture(autouse=True) 19 | def path(self, local_testdir): 20 | if not local_testdir.startswith("/"): 21 | local_testdir = "/" + local_testdir 22 | self.path = ProxyUPath(f"memory:{local_testdir}") 23 | self.prepare_file_system() 24 | 25 | def test_is_ProxyUPath(self): 26 | assert isinstance(self.path, ProxyUPath) 27 | 28 | def test_is_not_MemoryPath(self): 29 | assert not isinstance(self.path, MemoryPath) 30 | 31 | 32 | class TestProxyFilePath(BaseTests): 33 | @pytest.fixture(autouse=True) 34 | def path(self, local_testdir): 35 | self.path = ProxyUPath(f"file://{local_testdir}") 36 | self.prepare_file_system() 37 | 38 | def test_is_ProxyUPath(self): 39 | assert isinstance(self.path, ProxyUPath) 40 | 41 | def test_is_not_FilePath(self): 42 | 
assert not isinstance(self.path, FilePath) 43 | 44 | def test_chmod(self): 45 | self.path.joinpath("file1.txt").chmod(777) 46 | 47 | def test_cwd(self): 48 | self.path.cwd() 49 | with pytest.raises(UnsupportedOperation): 50 | type(self.path).cwd() 51 | 52 | 53 | class TestProxyPathlibPath(BaseTests): 54 | @pytest.fixture(autouse=True) 55 | def path(self, local_testdir): 56 | self.path = ProxyUPath(f"{local_testdir}") 57 | self.prepare_file_system() 58 | 59 | def test_is_ProxyUPath(self): 60 | assert isinstance(self.path, ProxyUPath) 61 | 62 | def test_is_not_PosixUPath_WindowsUPath(self): 63 | assert not isinstance(self.path, (PosixUPath, WindowsUPath)) 64 | 65 | def test_chmod(self): 66 | self.path.joinpath("file1.txt").chmod(777) 67 | 68 | @pytest.mark.skipif( 69 | sys.version_info < (3, 12), reason="storage options only handled in 3.12+" 70 | ) 71 | def test_eq(self): 72 | super().test_eq() 73 | 74 | if sys.version_info < (3, 12): 75 | 76 | def test_storage_options_dont_affect_hash(self): 77 | # On Python < 3.12, storage_options trigger warnings for LocalPath 78 | with pytest.warns( 79 | UserWarning, 80 | match=r".*on python <= \(3, 11\) ignores protocol and storage_options", 81 | ): 82 | super().test_storage_options_dont_affect_hash() 83 | 84 | def test_group(self): 85 | pytest.importorskip("grp") 86 | self.path.group() 87 | 88 | def test_owner(self): 89 | pytest.importorskip("pwd") 90 | self.path.owner() 91 | 92 | def test_readlink(self): 93 | try: 94 | os.readlink 95 | except AttributeError: 96 | pytest.skip("os.readlink not available on this platform") 97 | with pytest.raises((OSError, UnsupportedOperation)): 98 | self.path.readlink() 99 | 100 | def test_protocol(self): 101 | assert self.path.protocol == "" 102 | 103 | def test_as_uri(self): 104 | assert self.path.as_uri().startswith("file://") 105 | 106 | if sys.version_info < (3, 10): 107 | 108 | def test_lstat(self): 109 | # On Python < 3.10, stat(follow_symlinks=False) triggers warnings 110 | with 
pytest.warns( 111 | UserWarning, 112 | match=r".*stat\(\) follow_symlinks=False is currently ignored", 113 | ): 114 | st = self.path.lstat() 115 | assert st is not None 116 | 117 | else: 118 | 119 | def test_lstat(self): 120 | st = self.path.lstat() 121 | assert st is not None 122 | 123 | def test_relative_to(self): 124 | base = self.path 125 | child = self.path / "folder1" / "file1.txt" 126 | relative = child.relative_to(base) 127 | assert str(relative) == f"folder1{os.sep}file1.txt" 128 | 129 | def test_cwd(self): 130 | self.path.cwd() 131 | with pytest.raises(UnsupportedOperation): 132 | type(self.path).cwd() 133 | 134 | def test_lchmod(self): 135 | # setup 136 | a = self.path.joinpath("a") 137 | b = self.path.joinpath("b") 138 | a.touch() 139 | b.symlink_to(a) 140 | 141 | # see: https://github.com/python/cpython/issues/108660#issuecomment-1854645898 142 | if hasattr(os, "lchmod") or os.chmod in os.supports_follow_symlinks: 143 | cm = nullcontext() 144 | else: 145 | cm = pytest.raises((UnsupportedOperation, NotImplementedError)) 146 | with cm: 147 | b.lchmod(mode=0o777) 148 | 149 | def test_symlink_to(self): 150 | self.path.joinpath("link").symlink_to(self.path) 151 | 152 | def test_hardlink_to(self): 153 | try: 154 | self.path.joinpath("link").hardlink_to(self.path) 155 | except PermissionError: 156 | pass # hardlink may require elevated permissions 157 | 158 | 159 | def test_custom_subclass(): 160 | 161 | class ReversePath(ProxyUPath): 162 | def read_bytes_reversed(self): 163 | return self.read_bytes()[::-1] 164 | 165 | def write_bytes_reversed(self, value): 166 | self.write_bytes(value[::-1]) 167 | 168 | b = MemoryPath("memory://base") 169 | 170 | p = b.joinpath("file1") 171 | p.write_bytes(b"dlrow olleh") 172 | 173 | r = ReversePath("memory://base/file1") 174 | assert r.read_bytes_reversed() == b"hello world" 175 | 176 | r.parent.joinpath("file2").write_bytes_reversed(b"dlrow olleh") 177 | assert b.joinpath("file2").read_bytes() == b"hello world" 178 | 179 | 
180 | def test_protocol_dispatch_deprecation_warning(): 181 | 182 | class MyPath(UPath): 183 | _protocol_dispatch = False 184 | 185 | with pytest.warns(DeprecationWarning, match="_protocol_dispatch = False"): 186 | a = MyPath(".", protocol="memory") 187 | 188 | assert isinstance(a, MyPath) 189 | -------------------------------------------------------------------------------- /docs/concepts/pathlib.md: -------------------------------------------------------------------------------- 1 | # Pathlib :snake: 2 | 3 | [pathlib](https://docs.python.org/3/library/pathlib.html) is a Python standard library module that provides an object-oriented interface for working with filesystem paths. It's the modern, pythonic way to handle file paths and filesystem operations, replacing the older string-based `os.path` approach. 4 | 5 | ## What is pathlib? 6 | 7 | Introduced in Python 3.4, pathlib represents filesystem paths as objects rather than strings. 8 | 9 | ### Path Objects 10 | 11 | In pathlib, paths are instances of `Path` (or platform-specific subclasses) that represent local filesystem paths: 12 | 13 | ```python 14 | from pathlib import Path 15 | 16 | # Create path objects 17 | p = Path("/home/user/documents") 18 | p = Path("relative/path/to/file.txt") 19 | p = Path.home() # User's home directory 20 | p = Path.cwd() # Current working directory 21 | ``` 22 | 23 | ### Pure vs. 
Concrete Paths 24 | 25 | pathlib distinguishes between two types of paths: 26 | 27 | **Pure Paths** (`PurePath`, `PurePosixPath`, `PureWindowsPath`): 28 | - Only manipulate path strings 29 | - Don't access the filesystem 30 | - Work on any platform regardless of OS 31 | - Useful for path manipulation without I/O 32 | 33 | ```python 34 | from pathlib import PurePath, PurePosixPath, PureWindowsPath 35 | 36 | # Pure path - string manipulation only 37 | pure = PurePath("/home/user/file.txt") 38 | parent = pure.parent # Works 39 | name = pure.name # Works 40 | # exists = pure.exists() # AttributeError - no filesystem access 41 | 42 | # Platform-specific pure paths 43 | posix = PurePosixPath("/home/user/file.txt") # Always uses / 44 | windows = PureWindowsPath("C:\\Users\\file.txt") # Always uses \ 45 | ``` 46 | 47 | **Concrete Paths** (`Path`, `PosixPath`, `WindowsPath`): 48 | - Inherit from pure paths 49 | - Actually access the filesystem 50 | - Support operations like `.exists()`, `.stat()`, `.read_text()` 51 | - Platform-specific: `PosixPath` on Unix, `WindowsPath` on Windows 52 | 53 | ```python 54 | from pathlib import Path 55 | 56 | # Concrete path - filesystem operations 57 | p = Path("/home/user/file.txt") 58 | exists = p.exists() # Checks filesystem 59 | content = p.read_text() # Reads file 60 | size = p.stat().st_size # Gets file size 61 | ``` 62 | 63 | ## When to use pathlib 64 | 65 | Use pathlib when you: 66 | 67 | - Work with local filesystem paths in Python 68 | - Need cross-platform path handling 69 | - Want object-oriented path manipulation 70 | 71 | ## What is pathlib-abc? 72 | 73 | [pathlib-abc](https://github.com/barneygale/pathlib-abc) is a Python library that defines abstract base classes (ABCs) for path-like objects. It provides a formal specification for the pathlib interface that can be implemented by different path types, not just local filesystem paths. 
74 | 75 | ### Abstract Base Classes for Paths 76 | 77 | pathlib-abc extracts the core concepts from Python's pathlib module into abstract base classes. This allows library authors and framework developers to: 78 | 79 | 1. **Define path-like interfaces** that work across different storage backends 80 | 2. **Type hint** functions that accept any path-like object 81 | 3. **Implement custom path classes** that follow pathlib conventions 82 | 4. **Ensure compatibility** between different path implementations 83 | 84 | !!! info "Relationship to Python's pathlib" 85 | Currently (as of Python 3.14), the standard library `pathlib.Path` does **not** inherit from public pathlib-abc classes. However, there is ongoing work to incorporate these ABCs into future Python releases. 86 | 87 | The library defines three main abstract base classes that represent different levels of path functionality: 88 | 89 | ### JoinablePath 90 | 91 | `JoinablePath` is the most basic path abstraction. It represents paths that can be constructed, manipulated, and joined together, but cannot necessarily access any actual filesystem. 92 | 93 | **Key capabilities:** 94 | 95 | - Path construction and manipulation 96 | - String operations on paths 97 | - Path component access (name, stem, suffix, parent, etc.) 98 | - Path joining with the `/` operator 99 | - Pattern matching 100 | 101 | Think of `JoinablePath` as equivalent to pathlib's `PurePath` - it only manipulates path strings. 102 | 103 | ### ReadablePath 104 | 105 | `ReadablePath` extends `JoinablePath` to add read-only filesystem operations. It represents paths where you can read data but not modify the filesystem. 
106 | 107 | **Adds capabilities for:** 108 | 109 | - Reading file contents (`.read_text()`, `.read_bytes()`) 110 | - Opening files for reading 111 | - Checking file existence and type (`.exists()`, `.is_file()`, `.is_dir()`) 112 | - Listing directory contents (`.iterdir()`) 113 | - Globbing and pattern matching (`.glob()`, `.rglob()`) 114 | - Walking directory trees (`.walk()`) 115 | - Reading symlinks (`.readlink()`) 116 | - Accessing file metadata (`.info` property) 117 | 118 | ### WritablePath 119 | 120 | `WritablePath` extends `JoinablePath` (not `ReadablePath`) to add write operations. It represents paths where you can create, modify, and delete filesystem objects. 121 | 122 | **Adds capabilities for:** 123 | 124 | - Writing file contents (`.write_text()`, `.write_bytes()`) 125 | - Opening files for writing 126 | - Creating directories (`.mkdir()`) 127 | - Creating symlinks (`.symlink_to()`) 128 | 129 | !!! note "WritablePath Does Not Inherit from ReadablePath" 130 | `WritablePath` does NOT inherit from `ReadablePath`. A path that is writable is not automatically readable. In practice, most filesystem paths are both readable and writable (like `UPath` which inherits from both), but the separation allows for specialized use cases like write-only destinations or read-only sources. 
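The layering described above can be sketched with the standard library alone. This is a toy illustration, not pathlib-abc itself: the class names `Joinable`, `Readable`, `Writable`, `MemoryReadWritePath`, and `WriteOnlySink`, as well as the module-level `_STORE` dict, are hypothetical stand-ins, and the real ABCs define a much larger method set.

```python
from __future__ import annotations

from abc import ABC, abstractmethod

# NOTE: hypothetical stand-ins for illustration only. These are NOT the
# pathlib-abc classes and do not reproduce their real method sets.
_STORE: dict[str, bytes] = {}  # toy in-memory "filesystem"


class Joinable(ABC):
    """Pure path manipulation, no I/O (plays the role of JoinablePath)."""

    def __init__(self, *segments: str) -> None:
        self._path = "/".join(s.strip("/") for s in segments if s)

    def __str__(self) -> str:
        return self._path

    def __truediv__(self, other: str):
        # Joining returns a new instance of the same concrete class.
        return type(self)(self._path, other)

    @property
    def name(self) -> str:
        return self._path.rsplit("/", 1)[-1]


class Readable(Joinable):
    """Adds read operations (plays the role of ReadablePath)."""

    @abstractmethod
    def read_bytes(self) -> bytes: ...


class Writable(Joinable):
    """Adds write operations; deliberately not a subclass of Readable."""

    @abstractmethod
    def write_bytes(self, data: bytes) -> int: ...


class MemoryReadWritePath(Readable, Writable):
    """Readable and writable by inheriting from both branches."""

    def read_bytes(self) -> bytes:
        return _STORE[str(self)]

    def write_bytes(self, data: bytes) -> int:
        _STORE[str(self)] = bytes(data)
        return len(data)


class WriteOnlySink(Writable):
    """A write-only destination: it has no read methods at all."""

    def write_bytes(self, data: bytes) -> int:
        _STORE[str(self)] = bytes(data)
        return len(data)


p = MemoryReadWritePath("base") / "file.txt"
p.write_bytes(b"hello")
assert p.read_bytes() == b"hello"

sink = WriteOnlySink("logs") / "out.bin"
assert isinstance(sink, Writable) and not isinstance(sink, Readable)
```

Keeping the read and write branches as siblings is what lets a function accept "anything writable" without also demanding read support.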
131 | 132 | ## Learn More 133 | 134 | For comprehensive information about pathlib: 135 | 136 | - **Official documentation**: [Python pathlib documentation](https://docs.python.org/3/library/pathlib.html) 137 | - **PEP 428**: [The pathlib module – object-oriented filesystem paths](https://www.python.org/dev/peps/pep-0428/) 138 | - **Comparison with os.path**: [Correspondence to tools in the os module](https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module) 139 | 140 | For comprehensive information about pathlib-abc: 141 | 142 | - **GitHub repository**: [barneygale/pathlib-abc](https://github.com/barneygale/pathlib-abc) 143 | 144 | For using pathlib-style paths with remote and cloud filesystems, see [upath.md](upath.md). 145 | -------------------------------------------------------------------------------- /upath/tests/implementations/test_zip.py: -------------------------------------------------------------------------------- 1 | import os 2 | import zipfile 3 | 4 | import pytest 5 | 6 | from upath import UPath 7 | from upath.implementations.zip import ZipPath 8 | 9 | from ..cases import BaseTests 10 | 11 | 12 | @pytest.fixture(scope="function") 13 | def zipped_testdir_file(local_testdir, tmp_path_factory): 14 | base = tmp_path_factory.mktemp("zippath") 15 | zip_path = base / "test.zip" 16 | with zipfile.ZipFile(zip_path, "w") as zf: 17 | for root, _, files in os.walk(local_testdir): 18 | for file in files: 19 | full_path = os.path.join(root, file) 20 | arcname = os.path.relpath(full_path, start=local_testdir) 21 | zf.write(full_path, arcname=arcname) 22 | return str(zip_path) 23 | 24 | 25 | @pytest.fixture(scope="function") 26 | def empty_zipped_testdir_file(tmp_path): 27 | tmp_path = tmp_path.joinpath("zippath") 28 | tmp_path.mkdir() 29 | zip_path = tmp_path / "test.zip" 30 | 31 | with zipfile.ZipFile(zip_path, "w"): 32 | pass 33 | return str(zip_path) 34 | 35 | 36 | class TestZipPath(BaseTests): 37 | 38 | 
@pytest.fixture(autouse=True) 39 | def path(self, zipped_testdir_file, request): 40 | try: 41 | (mode,) = request.param 42 | except (ValueError, TypeError, AttributeError): 43 | mode = "r" 44 | self.path = UPath("zip://", fo=zipped_testdir_file, mode=mode) 45 | try: 46 | yield 47 | finally: 48 | self.path.fs.clear_instance_cache() 49 | 50 | def test_is_ZipPath(self): 51 | assert isinstance(self.path, ZipPath) 52 | 53 | @pytest.mark.parametrize( 54 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 55 | ) 56 | def test_mkdir(self): 57 | super().test_mkdir() 58 | 59 | @pytest.mark.parametrize( 60 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 61 | ) 62 | def test_mkdir_exists_ok_true(self): 63 | super().test_mkdir_exists_ok_true() 64 | 65 | @pytest.mark.parametrize( 66 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 67 | ) 68 | def test_mkdir_exists_ok_false(self): 69 | super().test_mkdir_exists_ok_false() 70 | 71 | @pytest.mark.parametrize( 72 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 73 | ) 74 | def test_mkdir_parents_true_exists_ok_true(self): 75 | super().test_mkdir_parents_true_exists_ok_true() 76 | 77 | @pytest.mark.parametrize( 78 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 79 | ) 80 | def test_mkdir_parents_true_exists_ok_false(self): 81 | super().test_mkdir_parents_true_exists_ok_false() 82 | 83 | def test_rename(self): 84 | with pytest.raises(NotImplementedError): 85 | super().test_rename() # delete is not implemented in fsspec 86 | 87 | def test_move_local(self, tmp_path): 88 | with pytest.raises(NotImplementedError): 89 | super().test_move_local(tmp_path) # delete is not implemented in fsspec 90 | 91 | def test_move_into_local(self, tmp_path): 92 | with pytest.raises(NotImplementedError): 93 | super().test_move_into_local( 94 | tmp_path 95 | ) # delete is not implemented in fsspec 96 | 97 | def test_move_memory(self, clear_fsspec_memory_cache): 98 | with 
pytest.raises(NotImplementedError): 99 | super().test_move_memory(clear_fsspec_memory_cache) 100 | 101 | def test_move_into_memory(self, clear_fsspec_memory_cache): 102 | with pytest.raises(NotImplementedError): 103 | super().test_move_into_memory(clear_fsspec_memory_cache) 104 | 105 | @pytest.mark.parametrize( 106 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 107 | ) 108 | def test_touch(self): 109 | super().test_touch() 110 | 111 | @pytest.mark.parametrize( 112 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 113 | ) 114 | def test_touch_unlink(self): 115 | with pytest.raises(NotImplementedError): 116 | super().test_touch_unlink() # delete is not implemented in fsspec 117 | 118 | @pytest.mark.parametrize( 119 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 120 | ) 121 | def test_write_bytes(self): 122 | fn = "test_write_bytes.txt" 123 | s = b"hello_world" 124 | path = self.path.joinpath(fn) 125 | path.write_bytes(s) 126 | so = {**path.storage_options, "mode": "r"} 127 | urlpath = str(path) 128 | path.fs.close() 129 | assert UPath(urlpath, **so).read_bytes() == s 130 | 131 | @pytest.mark.parametrize( 132 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True 133 | ) 134 | def test_write_text(self): 135 | fn = "test_write_text.txt" 136 | s = "hello_world" 137 | path = self.path.joinpath(fn) 138 | path.write_text(s) 139 | so = {**path.storage_options, "mode": "r"} 140 | urlpath = str(path) 141 | path.fs.close() 142 | assert UPath(urlpath, **so).read_text() == s 143 | 144 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode") 145 | def test_fsspec_compat(self): 146 | pass 147 | 148 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode") 149 | def test_rename_with_target_absolute(self, target_factory): 150 | return super().test_rename_with_target_absolute(target_factory) 151 | 152 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write 
mode") 153 | def test_write_text_encoding(self): 154 | return super().test_write_text_encoding() 155 | 156 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode") 157 | def test_write_text_errors(self): 158 | return super().test_write_text_errors() 159 | 160 | 161 | @pytest.fixture(scope="function") 162 | def zipped_testdir_file_in_memory(zipped_testdir_file, clear_fsspec_memory_cache): 163 | p = UPath(zipped_testdir_file, protocol="file") 164 | t = p.move(UPath("memory:///myzipfile.zip")) 165 | assert t.protocol == "memory" 166 | assert t.exists() 167 | yield t.as_uri() 168 | 169 | 170 | class TestChainedZipPath(TestZipPath): 171 | 172 | @pytest.fixture(autouse=True) 173 | def path(self, zipped_testdir_file_in_memory, request): 174 | try: 175 | (mode,) = request.param 176 | except (ValueError, TypeError, AttributeError): 177 | mode = "r" 178 | self.path = UPath( 179 | "zip://", fo="/myzipfile.zip", mode=mode, target_protocol="memory" 180 | ) 181 | -------------------------------------------------------------------------------- /docs/concepts/fsspec.md: -------------------------------------------------------------------------------- 1 | # Filesystem Spec :file_folder: 2 | 3 | [fsspec](https://filesystem-spec.readthedocs.io/) is a Python library that provides a unified, pythonic interface for working with different storage backends. It abstracts away the differences between various storage systems, allowing you to interact with local files, cloud storage, remote systems, and specialty filesystems using a consistent API. 4 | 5 | ## What is fsspec? 6 | 7 | fsspec is both a **specification** and a **collection of implementations** for pythonic filesystems. The project defines a standard interface that filesystem implementations should follow, then provides concrete implementations for dozens of different storage backends. 
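To make the "specification plus implementations" idea concrete, here is a minimal standard-library sketch. It does not use fsspec: `MiniFileSystem`, `MiniMemoryFS`, `MiniLocalFS`, and `roundtrip` are hypothetical names invented for this illustration, and only the method names `pipe`, `cat`, and `exists` are borrowed from fsspec's filesystem interface.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class MiniFileSystem(ABC):
    """The "specification": operations every backend must provide."""

    @abstractmethod
    def pipe(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def cat(self, path: str) -> bytes: ...

    @abstractmethod
    def exists(self, path: str) -> bool: ...


class MiniMemoryFS(MiniFileSystem):
    """Backend 1: keeps file contents in a dict."""

    def __init__(self) -> None:
        self._store = {}

    def pipe(self, path: str, data: bytes) -> None:
        self._store[path] = bytes(data)

    def cat(self, path: str) -> bytes:
        return self._store[path]

    def exists(self, path: str) -> bool:
        return path in self._store


class MiniLocalFS(MiniFileSystem):
    """Backend 2: stores files below a root directory on local disk."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def _resolve(self, path: str) -> Path:
        return self._root / path.lstrip("/")

    def pipe(self, path: str, data: bytes) -> None:
        target = self._resolve(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def cat(self, path: str) -> bytes:
        return self._resolve(path).read_bytes()

    def exists(self, path: str) -> bool:
        return self._resolve(path).exists()


def roundtrip(fs: MiniFileSystem, path: str, data: bytes) -> bytes:
    """Backend-agnostic code: only the shared interface is used."""
    fs.pipe(path, data)
    assert fs.exists(path)
    return fs.cat(path)


with tempfile.TemporaryDirectory() as tmp:
    backends = [MiniMemoryFS(), MiniLocalFS(tmp)]
    results = [roundtrip(fs, "data/hello.txt", b"hello") for fs in backends]
```

The `roundtrip` function never checks which backend it received, which is the property the fsspec specification is designed to provide at scale.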
8 | 9 | The core idea is simple: whether you're working with files on your local disk, objects in an S3 bucket, blobs in Azure storage, or data over HTTP, the API to interact with them should be the same. 10 | 11 | ### Core Functionality 12 | 13 | fsspec provides filesystem objects with methods for common operations. All filesystem implementations inherit from `fsspec.spec.AbstractFileSystem`, which defines the standard interface that all filesystems must implement: 14 | 15 | ```python 16 | import fsspec 17 | 18 | # Create a filesystem instance 19 | # Returns an AbstractFileSystem subclass for the specified protocol 20 | fs = fsspec.filesystem('s3', anon=True) 21 | 22 | # List files 23 | files = fs.ls('my-bucket/data/') 24 | 25 | # Check if file exists 26 | exists = fs.exists('my-bucket/data/file.txt') 27 | 28 | # Get file info 29 | info = fs.info('my-bucket/data/file.txt') 30 | 31 | # Read file 32 | with fs.open('my-bucket/data/file.txt', 'r') as f: 33 | content = f.read() 34 | 35 | # Write file 36 | with fs.open('my-bucket/output.txt', 'w') as f: 37 | f.write('Hello, World!') 38 | 39 | # Copy files 40 | fs.cp('my-bucket/source.txt', 'my-bucket/dest.txt') 41 | 42 | # Delete files 43 | fs.rm('my-bucket/file.txt') 44 | ``` 45 | 46 | ### Protocols 47 | 48 | fsspec identifies filesystem types via **protocols**. 
Each protocol corresponds to a specific filesystem implementation: 49 | 50 | - `file://` - Local filesystem 51 | - `memory://` - In-memory filesystem (temporary, non-persistent) 52 | - `s3://` or `s3a://` - Amazon S3 53 | - `gs://` or `gcs://` - Google Cloud Storage 54 | - `az://` or `abfs://` - Azure Blob Storage 55 | - `adl://` - Azure Data Lake Gen1 56 | - `abfss://` - Azure Data Lake Gen2 (secure) 57 | - `http://` or `https://` - HTTP(S) access 58 | - `ftp://` - FTP 59 | - `sftp://` or `ssh://` - SFTP over SSH 60 | - `smb://` - Samba/Windows file shares 61 | - `webdav://` or `webdav+http://` - WebDAV 62 | - `hdfs://` - Hadoop Distributed File System 63 | - `hf://` - Hugging Face Hub 64 | - `github://` - GitHub repositories 65 | - `zip://` - ZIP archives 66 | - `tar://` - TAR archives 67 | - `gzip://` - GZIP compressed files 68 | - `cached://` - Caching layer over other filesystems 69 | 70 | And many more. See the [fsspec documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations) for the complete list. 71 | 72 | ### Storage Options 73 | 74 | Each filesystem implementation accepts different configuration parameters called **storage options**. These control authentication, connection settings, caching behavior, and more. 75 | They are usually provided as keyword parameters to the 76 | specific filesystem class on instantiation. 
77 | 78 | Common storage option patterns: 79 | 80 | ```python 81 | import fsspec 82 | 83 | # Authentication credentials 84 | fs = fsspec.filesystem('s3', key='...', secret='...') 85 | 86 | # Anonymous/public access 87 | fs = fsspec.filesystem('s3', anon=True) 88 | 89 | # Tokens and service accounts 90 | fs = fsspec.filesystem('gs', token='path/to/creds.json') 91 | 92 | # Connection settings 93 | fs = fsspec.filesystem('sftp', host='...', port=22, username='...') 94 | 95 | # Behavioral options 96 | fs = fsspec.filesystem('s3', use_ssl=True, default_block_size=5*2**20) 97 | ``` 98 | 99 | Refer to the [fsspec documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations) for details on what each filesystem supports. 100 | 101 | ### URI-Based Access: urlpaths 102 | 103 | fsspec supports opening files directly using URIs. Usually a 104 | resource is clearly defined by its 'protocol', 'storage options', and 'path'. The protocol and path can usually be 105 | combined to a urlpath string: 106 | 107 | ```python 108 | import fsspec 109 | 110 | # resource 111 | protocol = "s3" 112 | storage_options = {"anon": True} 113 | path = "bucket/file.txt" 114 | 115 | # Create filesystem and open path 116 | fs = fsspec.filesystem("s3", anon=True) 117 | with fs.open("bucket/file.txt", "r") as f: 118 | content = f.read() 119 | 120 | # Or open a file via its urlpath with storage_options 121 | with fsspec.open('s3://bucket/file.txt', 'r', anon=True) as f: 122 | content = f.read() 123 | ``` 124 | 125 | ### Chained Filesystems 126 | 127 | fsspec supports composing filesystems together using the `::` separator. 
This allows one filesystem to be used as the target 128 | filesystem for another: 129 | 130 | ```python 131 | import fsspec 132 | 133 | # Access a file inside a ZIP archive on S3 134 | with fsspec.open('zip://data.csv::s3://bucket/archive.zip', 'r', s3={'anon': True}) as f: 135 | content = f.read() 136 | 137 | # Access a file inside a TAR archive on S3 138 | with fsspec.open('tar://file.txt::s3://bucket/archive.tar', 'r', s3={'anon': True}) as f: 139 | content = f.read() 140 | ``` 141 | 142 | ### Caching 143 | 144 | fsspec includes powerful caching capabilities to improve performance when accessing remote files: 145 | 146 | ```python 147 | import fsspec 148 | 149 | # Simple caching 150 | fs = fsspec.filesystem( 151 | 's3', 152 | anon=True, 153 | use_listings_cache=True, 154 | listings_expiry_time=600 # Cache for 10 minutes 155 | ) 156 | 157 | # File-level caching 158 | cached_fs = fsspec.filesystem( 159 | 'filecache', 160 | target_protocol='s3', 161 | target_options={'anon': True}, 162 | cache_storage='/tmp/fsspec-cache' 163 | ) 164 | ``` 165 | 166 | ## When to use fsspec directly 167 | 168 | You typically use fsspec directly when you: 169 | 170 | - Need filesystem-level operations (`ls`, `cp`, `rm`, `find`) 171 | - Want to work with file-like objects without path abstractions 172 | - Need low-level control over filesystem behavior 173 | - Are integrating with data libraries that accept fsspec URLs 174 | - Want to implement custom filesystem wrappers 175 | - Want to avoid the overhead of UPath instance creation 176 | 177 | ## Learn More 178 | 179 | For comprehensive information about fsspec: 180 | 181 | - **Official documentation**: [fsspec.readthedocs.io](https://filesystem-spec.readthedocs.io/) 182 | - **API reference**: [Built-in filesystem implementations](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations) 183 | - **GitHub repository**: [fsspec/filesystem_spec](https://github.com/fsspec/filesystem_spec) 184 | - **Usage guides**: [Examples and 
tutorials](https://filesystem-spec.readthedocs.io/en/latest/usage.html) 185 | 186 | For using fsspec with a pathlib-style interface, see [upath.md](upath.md). 187 | -------------------------------------------------------------------------------- /docs/api/implementations.md: -------------------------------------------------------------------------------- 1 | # Implementations :file_folder: 2 | 3 | Universal Pathlib provides specialized UPath subclasses for different filesystem protocols. 4 | Each implementation is optimized for its respective filesystem and may provide additional 5 | protocol-specific functionality. 6 | 7 | ## upath.implementations.cloud 8 | 9 | ::: upath.implementations.cloud.S3Path 10 | options: 11 | heading_level: 3 12 | show_root_heading: true 13 | show_root_full_path: false 14 | members: [] 15 | show_bases: true 16 | 17 | **Protocols:** `s3://`, `s3a://` 18 | 19 | Amazon S3 compatible object storage implementation. 20 | 21 | ::: upath.implementations.cloud.GCSPath 22 | options: 23 | heading_level: 3 24 | show_root_heading: true 25 | show_root_full_path: false 26 | members: [] 27 | show_bases: true 28 | 29 | **Protocols:** `gs://`, `gcs://` 30 | 31 | Google Cloud Storage implementation. 32 | 33 | ::: upath.implementations.cloud.AzurePath 34 | options: 35 | heading_level: 3 36 | show_root_heading: true 37 | show_root_full_path: false 38 | members: [] 39 | show_bases: true 40 | 41 | **Protocols:** `abfs://`, `abfss://`, `adl://`, `az://` 42 | 43 | Azure Blob Storage and Azure Data Lake implementation. 44 | 45 | ::: upath.implementations.cloud.HfPath 46 | options: 47 | heading_level: 3 48 | show_root_heading: true 49 | show_root_full_path: false 50 | members: [] 51 | show_bases: true 52 | 53 | **Protocols:** `hf://` 54 | 55 | Hugging Face Hub implementation for accessing models, datasets, and spaces. 
56 | 57 | --- 58 | 59 | ## upath.implementations.local 60 | 61 | ::: upath.implementations.local.PosixUPath 62 | options: 63 | heading_level: 3 64 | show_root_heading: true 65 | show_root_full_path: false 66 | members: [] 67 | show_bases: true 68 | 69 | POSIX-style local filesystem paths (Linux, macOS, Unix). 70 | 71 | ::: upath.implementations.local.WindowsUPath 72 | options: 73 | heading_level: 3 74 | show_root_heading: true 75 | show_root_full_path: false 76 | members: [] 77 | show_bases: true 78 | 79 | Windows-style local filesystem paths. 80 | 81 | ::: upath.implementations.local.FilePath 82 | options: 83 | heading_level: 3 84 | show_root_heading: true 85 | show_root_full_path: false 86 | members: [] 87 | show_bases: true 88 | 89 | **Protocols:** `file://`, `local://` 90 | 91 | File URI implementation for local filesystem. 92 | 93 | --- 94 | 95 | ## upath.implementations.http 96 | 97 | ::: upath.implementations.http.HTTPPath 98 | options: 99 | heading_level: 3 100 | show_root_heading: true 101 | show_root_full_path: false 102 | members: [] 103 | show_bases: true 104 | 105 | **Protocols:** `http://`, `https://` 106 | 107 | HTTP/HTTPS read-only filesystem implementation. 108 | 109 | --- 110 | 111 | ## upath.implementations.sftp 112 | 113 | ::: upath.implementations.sftp.SFTPPath 114 | options: 115 | heading_level: 3 116 | show_root_heading: true 117 | show_root_full_path: false 118 | members: [] 119 | show_bases: true 120 | 121 | **Protocols:** `sftp://`, `ssh://` 122 | 123 | SFTP (SSH File Transfer Protocol) implementation. 124 | 125 | --- 126 | 127 | ## upath.implementations.smb 128 | 129 | ::: upath.implementations.smb.SMBPath 130 | options: 131 | heading_level: 3 132 | show_root_heading: true 133 | show_root_full_path: false 134 | members: [] 135 | show_bases: true 136 | 137 | **Protocol:** `smb://` 138 | 139 | SMB/CIFS network filesystem implementation. 
140 | 141 | --- 142 | 143 | ## upath.implementations.webdav 144 | 145 | ::: upath.implementations.webdav.WebdavPath 146 | options: 147 | heading_level: 3 148 | show_root_heading: true 149 | show_root_full_path: false 150 | members: [] 151 | show_bases: true 152 | 153 | **Protocols:** `webdav://`, `webdav+http://`, `webdav+https://` 154 | 155 | WebDAV protocol implementation. 156 | 157 | --- 158 | 159 | ## upath.implementations.hdfs 160 | 161 | ::: upath.implementations.hdfs.HDFSPath 162 | options: 163 | heading_level: 3 164 | show_root_heading: true 165 | show_root_full_path: false 166 | members: [] 167 | show_bases: true 168 | 169 | **Protocol:** `hdfs://` 170 | 171 | Hadoop Distributed File System implementation. 172 | 173 | --- 174 | 175 | ## upath.implementations.github 176 | 177 | ::: upath.implementations.github.GitHubPath 178 | options: 179 | heading_level: 3 180 | show_root_heading: true 181 | show_root_full_path: false 182 | members: [] 183 | show_bases: true 184 | 185 | **Protocol:** `github://` 186 | 187 | GitHub repository file access implementation. 188 | 189 | --- 190 | 191 | ## upath.implementations.zip 192 | 193 | ::: upath.implementations.zip.ZipPath 194 | options: 195 | heading_level: 3 196 | show_root_heading: true 197 | show_root_full_path: false 198 | members: [] 199 | show_bases: true 200 | 201 | **Protocol:** `zip://` 202 | 203 | ZIP archive filesystem implementation. 204 | 205 | --- 206 | 207 | ## upath.implementations.tar 208 | 209 | ::: upath.implementations.tar.TarPath 210 | options: 211 | heading_level: 3 212 | show_root_heading: true 213 | show_root_full_path: false 214 | members: [] 215 | show_bases: true 216 | 217 | **Protocol:** `tar://` 218 | 219 | TAR archive filesystem implementation. 
220 | 221 | --- 222 | 223 | ## upath.implementations.memory 224 | 225 | ::: upath.implementations.memory.MemoryPath 226 | options: 227 | heading_level: 3 228 | show_root_heading: true 229 | show_root_full_path: false 230 | members: [] 231 | show_bases: true 232 | 233 | **Protocol:** `memory://` 234 | 235 | In-memory filesystem implementation for testing and temporary storage. 236 | 237 | --- 238 | 239 | ## upath.implementations.data 240 | 241 | ::: upath.implementations.data.DataPath 242 | options: 243 | heading_level: 3 244 | show_root_heading: true 245 | show_root_full_path: false 246 | members: [] 247 | show_bases: true 248 | 249 | **Protocol:** `data://` 250 | 251 | Data URL scheme implementation for embedded data. 252 | 253 | --- 254 | 255 | ## upath.implementations.ftp 256 | 257 | ::: upath.implementations.ftp.FTPPath 258 | options: 259 | heading_level: 3 260 | show_root_heading: true 261 | show_root_full_path: false 262 | members: [] 263 | show_bases: true 264 | 265 | **Protocol:** `ftp://` 266 | 267 | FTP (File Transfer Protocol) implementation. 268 | 269 | --- 270 | 271 | ## upath.implementations.cached 272 | 273 | ::: upath.implementations.cached.SimpleCachePath 274 | options: 275 | heading_level: 3 276 | show_root_heading: true 277 | show_root_full_path: false 278 | members: [] 279 | show_bases: true 280 | 281 | **Protocol:** `simplecache://` 282 | 283 | Local caching wrapper for remote filesystems. 
284 | 285 | --- 286 | 287 | ## See Also :link: 288 | 289 | - [UPath](index.md) - Main UPath class documentation 290 | - [Registry](registry.md) - Implementation registry 291 | - [Extensions](extensions.md) - Extending UPath functionality 292 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | 9 | 10 | ![universal pathlib logo](assets/logo-text.svg){: #upath-logo } 11 | 12 | [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/universal_pathlib)](https://pypi.org/project/universal_pathlib/) 13 | [![PyPI - License](https://img.shields.io/pypi/l/universal_pathlib)](https://github.com/fsspec/universal_pathlib/blob/main/LICENSE) 14 | [![PyPI](https://img.shields.io/pypi/v/universal_pathlib.svg)](https://pypi.org/project/universal_pathlib/) 15 | [![Conda (channel only)](https://img.shields.io/conda/vn/conda-forge/universal_pathlib?label=conda)](https://anaconda.org/conda-forge/universal_pathlib) 16 | 17 | [![Docs](https://readthedocs.org/projects/universal-pathlib/badge/?version=latest)](https://universal-pathlib.readthedocs.io/en/latest/?badge=latest) 18 | [![Tests](https://github.com/fsspec/universal_pathlib/actions/workflows/tests.yml/badge.svg)](https://github.com/fsspec/universal_pathlib/actions/workflows/tests.yml) 19 | [![GitHub issues](https://img.shields.io/github/issues/fsspec/universal_pathlib)](https://github.com/fsspec/universal_pathlib/issues) 20 | [![Codestyle black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) 21 | [![Changelog](https://img.shields.io/badge/changelog-Keep%20a%20Changelog-%23E05735)](./changelog.md) 22 | 23 | --- 24 | 25 | **Universal Pathlib** is a Python library that extends the [`pathlib_abc.JoinablePath`][pathlib_abc], [`pathlib_abc.ReadablePath`][pathlib_abc], and [`pathlib_abc.WritablePath`][pathlib_abc] API to give you a unified, Pythonic interface for 
working with files, whether they're on your local machine, in S3, on GitHub, or anywhere else. Built on top of [`filesystem_spec`][fsspec], it brings the convenience of a [`pathlib.Path`][pathlib]-like interface to cloud storage, remote filesystems, and more! :sparkles: 26 | 27 | [pathlib_abc]: https://github.com/barneygale/pathlib-abc 28 | [pathlib]: https://docs.python.org/3/library/pathlib.html 29 | [fsspec]: https://filesystem-spec.readthedocs.io/en/latest/intro.html 30 | 31 | --- 32 | 33 | If you enjoy working with Python's [pathlib][pathlib] objects to operate on local file system paths, 34 | universal pathlib provides the same interface for many supported [ filesystem_spec ][fsspec] 35 | implementations, from cloud-native object storage like `Amazon's S3 Storage`, `Google Cloud Storage`, 36 | `Azure Blob Storage`, to `http`, `sftp`, `memory` stores, and many more... 37 | 38 | If you're familiar with [ filesystem_spec ][fsspec], then universal pathlib provides a convenient 39 | way to handle the path, protocol and storage options of an object stored on an fsspec filesystem in a 40 | single container (`upath.UPath`). It also provides a pathlib interface for path operations on the 41 | fsspec urlpath. 42 | 43 | The great part is, if you're familiar with the [pathlib.Path][pathlib] API, you can immediately 44 | switch from working with local paths to working on remote and virtual filesystems by simply using 45 | the `UPath` class: 46 | 47 | === "The Problem" 48 | 49 | ```python 50 | # Local files: use pathlib 51 | from pathlib import Path 52 | local_file = Path("data/file.txt") 53 | content = local_file.read_text() 54 | 55 | # S3 files: use boto3/s3fs 56 | import boto3 57 | s3 = boto3.client('s3') 58 | obj = s3.get_object(Bucket='bucket', Key='data/file.txt') 59 | content = obj['Body'].read().decode('utf-8') 60 | 61 | # Different APIs, different patterns 😫 62 | ``` 63 | 64 | === "The Solution" 65 | 66 | ```python 67 | # All files: use UPath! 
✨ 68 | from upath import UPath 69 | 70 | local_file = UPath("data/file.txt") 71 | s3_file = UPath("s3://bucket/data/file.txt") 72 | 73 | # Same API everywhere! 🎉 74 | content = local_file.read_text() 75 | content = s3_file.read_text() 76 | ``` 77 | 78 | [Learn more about why you should use Universal Pathlib →](why.md){ .md-button } 79 | 80 | --- 81 | 82 | ## Quick Start :rocket: 83 | 84 | ### Installation 85 | 86 | ```bash 87 | pip install universal-pathlib 88 | ``` 89 | 90 | !!! tip "Installing for specific filesystems" 91 | To use cloud storage or other remote filesystems, install the necessary fsspec extras: 92 | 93 | ```bash 94 | pip install "universal-pathlib" "fsspec[s3,gcs,azure]" 95 | ``` 96 | 97 | See the [Installation Guide](install.md) for more details. 98 | 99 | ### TL;DR Examples 100 | 101 | ```python 102 | from upath import UPath 103 | 104 | # Works with local paths 105 | local_path = UPath("documents/notes.txt") 106 | local_path.write_text("Hello, World!") 107 | print(local_path.read_text()) # "Hello, World!" 108 | 109 | # Works with S3 110 | s3_path = UPath("s3://my-bucket/data/processed/results.csv") 111 | if s3_path.exists(): 112 | data = s3_path.read_text() 113 | 114 | # Works with HTTP 115 | http_path = UPath("https://example.com/data/file.json") 116 | if http_path.exists(): 117 | content = http_path.read_bytes() 118 | 119 | # Works with many more! 
🌟 120 | ``` 121 | 122 | --- 123 | 124 | ## Currently supported filesystems 125 | 126 | - :fontawesome-solid-folder: `file:` and `local:` Local filesystem 127 | - :fontawesome-solid-memory: `memory:` Ephemeral filesystem in RAM 128 | - :fontawesome-brands-microsoft: `az:`, `adl:`, `abfs:` and `abfss:` Azure Storage _(requires `adlfs`)_ 129 | - :fontawesome-solid-database: `data:` RFC 2397 style data URLs _(requires `fsspec>=2023.12.2`)_ 130 | - :fontawesome-solid-network-wired: `ftp:` FTP filesystem 131 | - :fontawesome-brands-github: `github:` GitHub repository filesystem 132 | - :fontawesome-solid-globe: `http:` and `https:` HTTP(S)-based filesystem 133 | - :fontawesome-solid-server: `hdfs:` Hadoop distributed filesystem 134 | - :fontawesome-brands-google: `gs:` and `gcs:` Google Cloud Storage _(requires `gcsfs`)_ 135 | - :simple-huggingface: `hf:` Hugging Face Hub _(requires `huggingface_hub`)_ 136 | - :fontawesome-brands-aws: `s3:` and `s3a:` AWS S3 _(requires `s3fs`)_ 137 | - :fontawesome-solid-network-wired: `sftp:` and `ssh:` SFTP and SSH filesystems _(requires `paramiko`)_ 138 | - :fontawesome-solid-share-nodes: `smb:` SMB filesystems _(requires `smbprotocol`)_ 139 | - :fontawesome-solid-cloud: `webdav:`, `webdav+http:` and `webdav+https:` WebDAV _(requires `webdav4[fsspec]`)_ 140 | 141 | !!! info "Untested Filesystems" 142 | Other fsspec-compatible filesystems likely work through the default implementation. If you encounter issues, please [report them on our issue tracker](https://github.com/fsspec/universal_pathlib/issues)! We're happy to add official support! 143 | 144 | --- 145 | 146 | ## Getting Help :question: 147 | 148 | Need help? We're here for you! 149 | 150 | - :fontawesome-brands-github: [GitHub Issues](https://github.com/fsspec/universal_pathlib/issues) - Report bugs or request features 151 | - :material-book-open-variant: [Documentation](https://universal-pathlib.readthedocs.io/) - You're reading it! 152 | 153 | !!! 
tip "Before Opening an Issue" 154 | Please check if your question has already been answered in the documentation or existing issues. 155 | 156 | --- 157 | 158 | ## License :page_with_curl: 159 | 160 | Universal Pathlib is distributed under the [MIT license](https://github.com/fsspec/universal_pathlib/blob/main/LICENSE), making it free and open source software. Use it freely in your projects! 161 | 162 | --- 163 | 164 |
165 | 166 | **Ready to get started?** 167 | 168 | [Install Now](install.md){ .md-button .md-button--primary } 169 | 170 |
171 | -------------------------------------------------------------------------------- /docs/why.md: -------------------------------------------------------------------------------- 1 | # Why Use Universal Pathlib? :sparkles: 2 | 3 | If you've ever worked with cloud storage or remote filesystems in Python, you've probably experienced the frustration of juggling different APIs. Universal Pathlib solves this problem elegantly by bringing the beloved `pathlib.Path` interface to *any* filesystem spec filesystem. 4 | 5 | --- 6 | 7 | ## The Problem: Filesystem dependent APIs :broken_heart: 8 | 9 | Let's face it: working with files across different storage backends is messy. 10 | 11 | ### Example: The Old Way 12 | 13 | ```python 14 | # Local files 15 | from pathlib import Path 16 | local_file = Path("data/results.csv") 17 | with local_file.open('r') as f: 18 | data = f.read() 19 | 20 | # S3 files 21 | import boto3 22 | s3 = boto3.resource('s3') 23 | obj = s3.Object('my-bucket', 'data/results.csv') 24 | data = obj.get()['Body'].read().decode('utf-8') 25 | 26 | # Azure Blob Storage 27 | from azure.storage.blob import BlobServiceClient 28 | blob_client = BlobServiceClient.from_connection_string(conn_str) 29 | container_client = blob_client.get_container_client('my-container') 30 | blob_client = container_client.get_blob_client('data/results.csv') 31 | data = blob_client.download_blob().readall().decode('utf-8') 32 | 33 | # Three different APIs, three different patterns 😫 34 | ``` 35 | 36 | Each storage backend has its own: 37 | 38 | - :material-api: **Different API** - Learn a new interface for each service 39 | - :material-puzzle: **Different patterns** - Different ways to read, write, and list files 40 | - :material-code-braces: **Different imports** - Manage multiple dependencies 41 | - :material-hammer-wrench: **Different configurations** - Each with unique setup requirements 42 | 43 | !!! danger "The Maintenance Nightmare" 44 | Want to switch from S3 to GCS? 
Rewrite your code. Need to support multiple backends? Write wrapper functions. Testing? Mock each service differently. This doesn't scale! 45 | 46 | --- 47 | 48 | ## The Solution: One API to Rule Them All :crown: 49 | 50 | Universal Pathlib provides a single, unified interface that works everywhere: 51 | 52 | ```python 53 | from upath import UPath 54 | 55 | # Local files 56 | local_file = UPath("data/results.csv") 57 | 58 | # S3 files 59 | s3_file = UPath("s3://my-bucket/data/results.csv") 60 | 61 | # Azure Blob Storage 62 | azure_file = UPath("az://my-container/data/results.csv") 63 | 64 | # Same API everywhere! ✨ 65 | for path in [local_file, s3_file, azure_file]: 66 | with path.open('r') as f: 67 | data = f.read() 68 | ``` 69 | 70 | !!! success "One API, Infinite Possibilities" 71 | Write your code once, run it anywhere. Switch backends by changing a URL. Test locally, deploy to the cloud. It just works! :sparkles: 72 | 73 | --- 74 | 75 | ## Key Benefits :trophy: 76 | 77 | ### 1. Familiar and Pythonic :snake: 78 | 79 | If you know Python's `pathlib`, you already know Universal Pathlib! 80 | 81 | ```python 82 | from upath import UPath 83 | 84 | # All the familiar pathlib operations 85 | path = UPath("s3://bucket/data/file.txt") 86 | 87 | print(path.name) # "file.txt" 88 | print(path.stem) # "file" 89 | print(path.suffix) # ".txt" 90 | print(path.parent) # UPath("s3://bucket/data") 91 | 92 | # Path joining 93 | output = path.parent / "processed" / "output.csv" 94 | 95 | # File operations 96 | path.write_text("Hello!") 97 | content = path.read_text() 98 | 99 | # Directory operations 100 | for item in path.parent.iterdir(): 101 | print(item) 102 | ``` 103 | 104 | !!! tip "Zero Learning Curve" 105 | Your existing pathlib knowledge transfers directly. No new concepts to learn, no cognitive overhead! 106 | 107 | ### 2. 
Write Once, Run Anywhere :earth_americas: 108 | 109 | Change storage backends without changing code: 110 | 111 | === "Development (Local)" 112 | 113 | ```python 114 | from upath import UPath 115 | 116 | def process_data(input_path: str, output_path: str): 117 | data_file = UPath(input_path) 118 | result_file = UPath(output_path) 119 | 120 | # Read, process, write 121 | data = data_file.read_text() 122 | processed = data.upper() 123 | result_file.write_text(processed) 124 | 125 | # Local development 126 | process_data("data/input.txt", "data/output.txt") 127 | ``` 128 | 129 | === "Production (S3)" 130 | 131 | ```python 132 | from upath import UPath 133 | 134 | def process_data(input_path: str, output_path: str): 135 | data_file = UPath(input_path) 136 | result_file = UPath(output_path) 137 | 138 | # Same code! Just different paths 139 | data = data_file.read_text() 140 | processed = data.upper() 141 | result_file.write_text(processed) 142 | 143 | # Production on S3 144 | process_data( 145 | "s3://my-bucket/data/input.txt", 146 | "s3://my-bucket/data/output.txt" 147 | ) 148 | ``` 149 | 150 | === "Testing (Memory)" 151 | 152 | ```python 153 | from upath import UPath 154 | 155 | def process_data(input_path: str, output_path: str): 156 | data_file = UPath(input_path) 157 | result_file = UPath(output_path) 158 | 159 | # Same code! No mocking needed 160 | data = data_file.read_text() 161 | processed = data.upper() 162 | result_file.write_text(processed) 163 | 164 | # Fast tests with in-memory filesystem 165 | process_data( 166 | "memory://input.txt", 167 | "memory://output.txt" 168 | ) 169 | ``` 170 | 171 | !!! success "Truly Portable Code" 172 | Your business logic stays clean and your application does not have to 173 | care about where the files live anymore. 174 | 175 | ### 3. 
Type-Safe and IDE-Friendly :computer: 176 | 177 | Universal Pathlib includes type hints for excellent IDE support: 178 | 179 | ```python 180 | from upath import UPath 181 | from pathlib import Path 182 | 183 | def process_file(path: UPath | Path) -> str: 184 | # Your IDE knows about all methods! 185 | if path.exists(): # ✓ Autocomplete 186 | content = path.read_text() # ✓ Type checked 187 | return content.upper() 188 | return "" 189 | 190 | # Works with both! 191 | local_result = process_file(UPath("file.txt")) 192 | s3_result = process_file(UPath("s3://bucket/file.txt")) 193 | ``` 194 | 195 | !!! info "Editor Support" 196 | Get autocomplete, type checking, and inline documentation in VS Code, PyCharm, and other modern Python IDEs. 197 | 198 | ### 4. Extensively Tested :test_tube: 199 | 200 | Universal Pathlib runs a large subset of CPython's pathlib test suite: 201 | 202 | - :white_check_mark: **Compatibility tested** against standard library pathlib 203 | - :white_check_mark: **Cross-version tested** on Python 3.9-3.14 204 | - :white_check_mark: **Integration tested** with real filesystems 205 | - :white_check_mark: **Regression tested** for each release 206 | 207 | !!! quote "Extensively Tested" 208 | When we say "pathlib-compatible," we mean it. 209 | 210 | ### 5. Extensible and Future-Proof :rocket: 211 | 212 | Built on `fsspec`, the standard for Python filesystem abstractions: 213 | 214 | ```python 215 | # Works with many fsspec filesystems! 216 | UPath("s3://...", anon=True) 217 | UPath("gs://...", token='anon') 218 | UPath("az://...") 219 | UPath("https://...") 220 | ``` 221 | 222 | Need a custom filesystem? Implement it once with fsspec, and UPath works automatically! 223 | 224 | !!! tip "Ecosystem Benefits" 225 | Leverage the entire fsspec ecosystem: caching, compression, callback hooks, and more! 226 | 227 | --- 228 | 229 | ## Next Steps :footprints: 230 | 231 | Ready to give Universal Pathlib a try? 232 | 233 | 1. 
**[Install Universal Pathlib](install.md)** - Get set up in minutes 234 | 2. **[Understand the concepts](concepts/index.md)** - Learn how pathlib, fsspec, and UPath fit together 235 | 3. **[Read the API docs](api/index.md)** - Learn about all the features 236 | 237 |
238 | 239 | [Install Now →](install.md){ .md-button .md-button--primary } 240 | 241 |
242 | -------------------------------------------------------------------------------- /docs/concepts/upath.md: -------------------------------------------------------------------------------- 1 | 6 | 7 | # Universal Pathlib ![upath](../assets/logo-128x128.svg){: #upath-logo } 8 | 9 | **universal-pathlib** (imported as `upath`) bridges Python's [pathlib](https://docs.python.org/3/library/pathlib.html) API with [fsspec](https://filesystem-spec.readthedocs.io/)'s filesystem implementations. It provides a familiar, pathlib-style interface for working with files across local storage, cloud services, and remote systems. 10 | 11 | ## The Best of Both Worlds 12 | 13 | universal-pathlib combines: 14 | 15 | - **fsspec's filesystem support**: Access to S3, GCS, Azure, HDFS, HTTP, SFTP, and dozens more backends 16 | - **pathlib's elegant API**: Object-oriented paths, `/` operator, `.exists()`, `.read_text()`, etc. 17 | 18 | This means you can write code using the pathlib syntax you already know, and it works seamlessly across any storage system that fsspec supports. 19 | 20 | ## How UPath and Path Relate via pathlib-abc 21 | 22 | `UPath` and `pathlib.Path` are related through the abstract base classes defined in [pathlib-abc](https://github.com/barneygale/pathlib-abc). While they share a common API design, they serve different purposes and have distinct inheritance hierarchies. 
23 | 24 | ### The Class Hierarchy 25 | 26 | The following diagram shows how `UPath` implementations relate to `pathlib` classes through the `pathlib_abc` abstract base classes: 27 | 28 | ```mermaid 29 | flowchart TB 30 | 31 | subgraph p0[pathlib_abc] 32 | X ----> Y 33 | X ----> Z 34 | end 35 | 36 | subgraph s0[pathlib] 37 | X -.-> A 38 | 39 | A----> B 40 | A--> AP 41 | A--> AW 42 | 43 | Y -.-> B 44 | Z -.-> B 45 | 46 | B--> BP 47 | AP----> BP 48 | B--> BW 49 | AW----> BW 50 | end 51 | subgraph s1[upath] 52 | Y ---> U 53 | Z ---> U 54 | 55 | U --> UP 56 | U --> UW 57 | BP ---> UP 58 | BW ---> UW 59 | U --> UL 60 | U --> US3 61 | U --> UH 62 | U -.-> UO 63 | end 64 | 65 | X(JoinablePath) 66 | Y(WritablePath) 67 | Z(ReadablePath) 68 | 69 | A(PurePath) 70 | AP(PurePosixPath) 71 | AW(PureWindowsPath) 72 | B(Path) 73 | BP(PosixPath) 74 | BW(WindowsPath) 75 | 76 | U(UPath) 77 | UP(PosixUPath) 78 | UW(WindowsUPath) 79 | UL(FilePath) 80 | US3(S3Path) 81 | UH(HttpPath) 82 | UO(...Path) 83 | 84 | classDef na fill:#f7f7f7,stroke:#02a822,stroke-width:2px,color:#333 85 | classDef np fill:#f7f7f7,stroke:#2166ac,stroke-width:2px,color:#333 86 | classDef nu fill:#f7f7f7,stroke:#b2182b,stroke-width:2px,color:#333 87 | 88 | class X,Y,Z na 89 | class A,AP,AW,B,BP,BW,UP,UW np 90 | class U,UL,US3,UH,UO nu 91 | 92 | style UO stroke-dasharray: 3 3 93 | 94 | style p0 fill:none,stroke:#0a2,stroke-width:3px,stroke-dasharray:3,color:#0a2 95 | style s0 fill:none,stroke:#07b,stroke-width:3px,stroke-dasharray:3,color:#07b 96 | style s1 fill:none,stroke:#d02,stroke-width:3px,stroke-dasharray:3,color:#d02 97 | ``` 98 | 99 | **Legend:** 100 | 101 | - **Green (pathlib_abc)**: Abstract base classes defining the path interface 102 | - **Blue (pathlib)**: Standard library path classes for local filesystems 103 | - **Red (upath)**: Universal pathlib classes for all filesystems 104 | - Solid lines: Direct inheritance 105 | - Dotted lines: Conceptual relationship (not actual inheritance yet) 106 | 107 | 
### Understanding the Relationships 108 | 109 | **pathlib-abc Layer (Green):** 110 | 111 | - `JoinablePath` - Basic path manipulation without filesystem access 112 | - `ReadablePath` - Adds read-only filesystem operations 113 | - `WritablePath` - Adds write filesystem operations 114 | 115 | **pathlib Layer (Blue):** 116 | 117 | - `PurePath` - Pure path manipulation (similar to `JoinablePath` conceptually) 118 | - `Path` - Concrete local filesystem paths (conceptually similar to `ReadablePath` + `WritablePath`) 119 | - Platform-specific: `PosixPath`, `WindowsPath`, etc. 120 | 121 | **universal-pathlib Layer (Red):** 122 | 123 | - `UPath` - Universal path for any filesystem backend 124 | - Local implementations: `PosixUPath`, `WindowsUPath`, `FilePath` 125 | - Remote implementations: `S3Path`, `HttpPath`, and others 126 | 127 | ### Key Differences 128 | 129 | **Current State (Python 3.9-3.13):** 130 | 131 | ```python 132 | from pathlib import Path 133 | from upath import UPath 134 | from upath.types import JoinablePath, ReadablePath, WritablePath 135 | 136 | # UPath explicitly implements pathlib-abc 137 | path = UPath("s3://bucket/file.txt") 138 | assert isinstance(path, JoinablePath) # True 139 | assert isinstance(path, ReadablePath) # True 140 | assert isinstance(path, WritablePath) # True 141 | 142 | # pathlib.Path does NOT (yet) inherit from pathlib-abc 143 | local = Path("/home/user/file.txt") 144 | assert isinstance(local, JoinablePath) # False 145 | assert isinstance(local, ReadablePath) # False 146 | assert isinstance(local, WritablePath) # False 147 | ``` 148 | 149 | **Important Note:** The dotted lines in the diagram represent a conceptual relationship. While `pathlib.Path` doesn't currently inherit from `pathlib_abc` classes, it implements a compatible API. Future Python versions may formalize this relationship. 
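The idea of a "compatible API" without shared inheritance can be sketched with structural typing from the standard library alone. This is only an illustration: `SupportsReadText`, `InMemoryText`, and `shout` are hypothetical names invented for this example and are not part of pathlib, pathlib-abc, or upath.

```python
from pathlib import Path, PurePosixPath
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsReadText(Protocol):
    """Hypothetical structural type: anything that offers read_text()."""

    def read_text(self) -> str: ...


class InMemoryText:
    """Toy stand-in for a non-local path (illustration only)."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read_text(self) -> str:
        return self._text


def shout(path: SupportsReadText) -> str:
    # Relies only on the method being present, not on inheritance.
    return path.read_text().upper()


# pathlib.Path matches structurally although it does not subclass the protocol.
print(isinstance(Path("notes.txt"), SupportsReadText))

# A pure path has no filesystem access, so it does not match.
print(isinstance(PurePosixPath("notes.txt"), SupportsReadText))

print(shout(InMemoryText("hello")))
```

The runtime check only tests that a `read_text` attribute exists, so it is far weaker than the abstract base classes described above; it merely shows why code written against a shared method set can accept objects from unrelated class hierarchies.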
150 | 151 | ### Local Path Compatibility 152 | 153 | For local filesystem paths, `UPath` provides implementations that are 100% compatible with stdlib `pathlib`: 154 | 155 | ```python 156 | from pathlib import Path, PosixPath, WindowsPath 157 | from upath import UPath 158 | 159 | # Without protocol -> returns platform-specific UPath 160 | local = UPath("/home/user/file.txt") 161 | assert isinstance(local, UPath) # True 162 | assert isinstance(local, PosixPath) # True (on Unix systems) 163 | assert isinstance(local, Path) # True 164 | 165 | # With file:// protocol -> returns FilePath (fsspec-based) 166 | file_path = UPath("file:///home/user/file.txt") 167 | assert isinstance(file_path, UPath) # True 168 | assert not isinstance(file_path, Path) # Not a pathlib.Path (uses fsspec instead) 169 | ``` 170 | 171 | **PosixUPath and WindowsUPath:** 172 | - Subclass both `UPath` and `pathlib.Path` 173 | - 100% compatible with stdlib pathlib for local paths 174 | - Tested against CPython's pathlib test suite 175 | - Implement `os.PathLike` protocol 176 | 177 | **FilePath:** 178 | - Subclass of `UPath` only 179 | - Uses fsspec's `LocalFileSystem` for file access 180 | - Useful for consistent fsspec-based access across all backends 181 | - Implements `os.PathLike` protocol 182 | 183 | ### Remote and Cloud Paths 184 | 185 | For remote filesystems, `UPath` implementations provide the pathlib API backed by fsspec: 186 | 187 | ```python 188 | from pathlib import Path 189 | from upath import UPath 190 | # S3Path 191 | s3 = UPath("s3://bucket/file.txt") 192 | assert isinstance(s3, UPath) 193 | assert not isinstance(s3, Path) # Not a local path 194 | 195 | # HttpPath 196 | http = UPath("https://example.com/data.json") 197 | assert isinstance(http, UPath) 198 | assert not isinstance(http, Path) # Not a local path 199 | ``` 200 | 201 | ### Why This Design? 202 | 203 | This architecture provides several benefits: 204 | 205 | 1. **Unified API**: Same pathlib interface works across all backends 206 | 2. 
**Type Safety**: pathlib-abc provides formal type hints for path operations 207 | 3. **Local Compatibility**: `PosixUPath`/`WindowsUPath` maintain full stdlib compatibility 208 | 4. **Flexibility**: Easy to add new filesystem implementations 209 | 5. **Future-Proof**: Ready for potential stdlib integration of pathlib-abc 210 | 211 | ### Writing Filesystem-Agnostic Code 212 | 213 | Use pathlib-abc types to write code that works with both `Path` and `UPath`: 214 | 215 | ```python 216 | from upath.types import ReadablePath, WritablePath 217 | 218 | def process_file(input_path: ReadablePath, output_path: WritablePath) -> None: 219 | """Works with Path, UPath, or any ReadablePath/WritablePath implementation.""" 220 | data = input_path.read_text() 221 | processed = data.upper() 222 | output_path.write_text(processed) 223 | 224 | # Works with stdlib Path 225 | from pathlib import Path 226 | process_file(Path("input.txt"), Path("output.txt")) 227 | 228 | # Works with UPath for cloud storage 229 | from upath import UPath 230 | process_file( 231 | UPath("s3://input-bucket/data.txt", anon=True), 232 | UPath("s3://output-bucket/result.txt") 233 | ) 234 | 235 | # Mix local and remote 236 | process_file( 237 | UPath("https://example.com/data.txt"), 238 | Path("/tmp/result.txt") 239 | ) 240 | ``` 241 | 242 | ## Learn More 243 | 244 | - **pathlib concepts**: See [pathlib.md](pathlib.md) for details on the pathlib API 245 | - **fsspec backends**: See [fsspec.md](fsspec.md) for information about available filesystems 246 | - **API reference**: Check the [API documentation](../api/index.md) for complete method details 247 | - **fsspec details**: Visit [fsspec documentation](https://filesystem-spec.readthedocs.io/) for filesystem-specific options 248 | --------------------------------------------------------------------------------