├── upath
│ ├── py.typed
│ ├── tests
│ │ ├── __init__.py
│ │ ├── pathlib
│ │ │ ├── __init__.py
│ │ │ └── conftest.py
│ │ ├── third_party
│ │ │ ├── __init__.py
│ │ │ └── test_pydantic.py
│ │ ├── implementations
│ │ │ ├── __init__.py
│ │ │ ├── test_ftp.py
│ │ │ ├── test_hdfs.py
│ │ │ ├── test_cached.py
│ │ │ ├── test_memory.py
│ │ │ ├── test_smb.py
│ │ │ ├── test_gcs.py
│ │ │ ├── test_azure.py
│ │ │ ├── test_sftp.py
│ │ │ ├── test_local.py
│ │ │ ├── test_webdav.py
│ │ │ ├── test_hf.py
│ │ │ ├── test_tar.py
│ │ │ ├── test_github.py
│ │ │ ├── test_s3.py
│ │ │ └── test_zip.py
│ │ ├── utils.py
│ │ ├── test_drive_root_anchor_parts.py
│ │ ├── test_pydantic.py
│ │ ├── test_stat.py
│ │ ├── test_chain.py
│ │ ├── test_registry.py
│ │ └── test_extensions.py
│ ├── implementations
│ │ ├── __init__.py
│ │ ├── _experimental.py
│ │ ├── hdfs.py
│ │ ├── sftp.py
│ │ ├── memory.py
│ │ ├── github.py
│ │ ├── cached.py
│ │ ├── tar.py
│ │ ├── ftp.py
│ │ ├── data.py
│ │ ├── zip.py
│ │ ├── smb.py
│ │ ├── webdav.py
│ │ ├── http.py
│ │ └── cloud.py
│ ├── types
│ │ ├── _abc.py
│ │ ├── __init__.py
│ │ └── _abc.pyi
│ ├── _info.py
│ ├── __init__.py
│ └── _protocol.py
├── .gitattributes
├── MANIFEST.in
├── docs
│ ├── assets
│ │ ├── favicon.png
│ │ └── logo-128x128-white.svg
│ ├── css
│ │ └── extra.css
│ ├── _plugins
│ │ └── copy_changelog.py
│ ├── api
│ │ ├── registry.md
│ │ ├── extensions.md
│ │ ├── index.md
│ │ ├── types.md
│ │ └── implementations.md
│ ├── concepts
│ │ ├── index.md
│ │ ├── pathlib.md
│ │ ├── fsspec.md
│ │ └── upath.md
│ ├── install.md
│ ├── index.md
│ └── why.md
├── environment.yml
├── SECURITY.md
├── dev
│ └── requirements.txt
├── .readthedocs.yaml
├── .flake8
├── .github
│ ├── workflows
│ │ ├── release.yml
│ │ ├── post-dependabot-update.yml
│ │ └── tests.yml
│ └── dependabot.yml
├── LICENSE
├── .pre-commit-config.yaml
├── .gitignore
├── CONTRIBUTING.rst
├── mkdocs.yml
├── CODE_OF_CONDUCT.rst
├── pyproject.toml
└── noxfile.py
/upath/py.typed:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/upath/tests/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/upath/implementations/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/upath/tests/pathlib/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | * text=auto eol=lf
2 |
--------------------------------------------------------------------------------
/upath/tests/third_party/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/upath/tests/implementations/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | exclude .git*
2 | recursive-exclude .git *
3 | recursive-exclude .github *
4 |
--------------------------------------------------------------------------------
/docs/assets/favicon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fsspec/universal_pathlib/HEAD/docs/assets/favicon.png
--------------------------------------------------------------------------------
/docs/css/extra.css:
--------------------------------------------------------------------------------
1 | :root {
2 | --md-primary-fg-color: #4361EE;
3 | scrollbar-gutter: stable;
4 | overflow-y: scroll;
5 | }
6 |
7 | .md-typeset table:not([class]) th:not(:first-child) {
8 | min-width: 1em;
9 | padding-left: 0.8em;
10 | padding-right: 0.8em;
11 | }
12 |
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: upath
2 | channels:
3 | - defaults
4 | - conda-forge
5 | dependencies:
6 | - python==3.10
7 | - fsspec
8 | # optional
9 | - requests
10 | - s3fs
11 | - jupyter
12 | - ipython
13 | - pytest
14 | - pylint
15 | - flake8
16 | - pyarrow
17 | - moto
18 | - pip
19 | - pip:
20 | - hadoop-test-cluster
21 | - gcsfs
22 | - nox
23 |
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
1 | # Security Policy - Vulnerability Reporting
2 |
3 | If you believe you have discovered a security issue in universal-pathlib, do not open a public issue.
4 |
5 | Instead, report it via the repository’s **`Security`** tab using the **`Report a vulnerability`** button.
6 | Include clear details and verify whether the vulnerability is in `universal-pathlib` or one of its dependencies.
7 |
8 | Providing a minimal reproducible example will help resolve the issue more efficiently.
9 |
--------------------------------------------------------------------------------
/dev/requirements.txt:
--------------------------------------------------------------------------------
1 | fsspec[git,hdfs,dask,http,sftp,smb]==2025.10.0
2 |
3 | # these dependencies define their own filesystems
4 | adlfs==2025.8.0
5 | boxfs==0.3.0
6 | dropboxdrivefs==1.4.1
7 | gcsfs==2025.10.0
8 | s3fs==2025.10.0
9 | ocifs==1.3.4
10 | webdav4[fsspec]==0.10.0
11 | # gdrivefs @ git+https://github.com/fsspec/gdrivefs@master broken ...
12 | morefs[asynclocalfs]==0.2.2
13 | dvc==3.64.2
14 | huggingface_hub==1.2.1
15 | lakefs-spec==0.12.0
16 | ossfs==2025.5.0
17 | fsspec-xrootd==0.5.1
18 | wandbfs==0.0.2
19 |
--------------------------------------------------------------------------------
/upath/types/_abc.py:
--------------------------------------------------------------------------------
1 | """pathlib_abc exports for compatibility with pathlib."""
2 |
3 | from pathlib_abc import JoinablePath
4 | from pathlib_abc import PathInfo
5 | from pathlib_abc import PathParser
6 | from pathlib_abc import ReadablePath
7 | from pathlib_abc import WritablePath
8 | from pathlib_abc import vfsopen
9 | from pathlib_abc import vfspath
10 |
11 | __all__ = [
12 | "JoinablePath",
13 | "ReadablePath",
14 | "WritablePath",
15 | "PathInfo",
16 | "PathParser",
17 | "vfsopen",
18 | "vfspath",
19 | ]
20 |
--------------------------------------------------------------------------------
/upath/tests/third_party/test_pydantic.py:
--------------------------------------------------------------------------------
1 | import pytest
2 |
3 | try:
4 | from pydantic import BaseConfig
5 | from pydantic_settings import BaseSettings
6 | except ImportError:
7 | BaseConfig = BaseSettings = None
8 | pytestmark = pytest.mark.skip(reason="requires pydantic")
9 |
10 | from upath.core import UPath
11 |
12 |
13 | def test_pydantic_settings_local_upath():
14 | class MySettings(BaseSettings):
15 | example_path: UPath = UPath(__file__)
16 |
17 | assert isinstance(MySettings().example_path, UPath)
18 |
--------------------------------------------------------------------------------
/docs/_plugins/copy_changelog.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | from pathlib import Path
4 |
5 | THIS_DIR = Path(__file__).parent
6 | DOCS_DIR = THIS_DIR.parent
7 | PROJECT_ROOT = DOCS_DIR.parent
8 |
9 |
10 | def on_pre_build(**_) -> None:
11 | """Add changelog to docs/changelog.md"""
12 | cl_now = PROJECT_ROOT.joinpath("CHANGELOG.md").read_text(encoding="utf-8")
13 |
14 | f_doc = DOCS_DIR.joinpath("changelog.md")
15 | if not f_doc.is_file() or f_doc.read_text(encoding="utf-8") != cl_now:
16 | f_doc.write_text(cl_now, encoding="utf-8")
17 |
--------------------------------------------------------------------------------
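The `on_pre_build` hook in `docs/_plugins/copy_changelog.py` above writes `docs/changelog.md` only when the file is missing or its content differs from `CHANGELOG.md`, so an unchanged changelog causes no write. A minimal stdlib sketch of that write-if-changed pattern follows. The helper name `sync_text_file` and the temporary files are illustrative assumptions and are not part of the repository.

```python
from pathlib import Path
import tempfile


def sync_text_file(src: Path, dst: Path) -> bool:
    """Copy src to dst only if dst is missing or differs; return True if written."""
    content = src.read_text(encoding="utf-8")
    if dst.is_file() and dst.read_text(encoding="utf-8") == content:
        return False
    dst.write_text(content, encoding="utf-8")
    return True


# Hypothetical demo files inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "CHANGELOG.md")
    dst = Path(tmp, "changelog.md")
    src.write_text("# Changelog\n", encoding="utf-8")
    first_run = sync_text_file(src, dst)   # destination missing, so it is written
    second_run = sync_text_file(src, dst)  # content identical, so nothing is written
```

Returning a boolean is a choice made for this sketch; the hook in the repository returns `None`.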
/upath/tests/implementations/test_ftp.py:
--------------------------------------------------------------------------------
1 | import pytest
2 |
3 | from upath import UPath
4 | from upath.tests.cases import BaseTests
5 | from upath.tests.utils import skip_on_windows
6 |
7 |
8 | @skip_on_windows
9 | class TestUPathFTP(BaseTests):
10 |
11 | @pytest.fixture(autouse=True)
12 | def path(self, ftp_server):
13 | self.path = UPath("", protocol="ftp", **ftp_server)
14 | self.prepare_file_system()
15 |
16 |
17 | def test_ftp_path_mtime(ftp_server):
18 | path = UPath("file1.txt", protocol="ftp", **ftp_server)
19 | path.touch()
20 | mtime = path.stat().st_mtime
21 | assert isinstance(mtime, float)
22 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_hdfs.py:
--------------------------------------------------------------------------------
1 | """see upath/tests/conftest.py for fixtures"""
2 |
3 | import pytest
4 |
5 | from upath import UPath
6 | from upath.implementations.hdfs import HDFSPath
7 |
8 | from ..cases import BaseTests
9 |
10 |
11 | @pytest.mark.hdfs
12 | class TestUPathHDFS(BaseTests):
13 | @pytest.fixture(autouse=True)
14 | def path(self, local_testdir, hdfs):
15 | host, user, port = hdfs
16 | path = f"hdfs:{local_testdir}"
17 | self.path = UPath(path, host=host, user=user, port=port)
18 |
19 | def test_is_HDFSPath(self):
20 | assert isinstance(self.path, HDFSPath)
21 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_cached.py:
--------------------------------------------------------------------------------
1 | import pytest
2 |
3 | from upath import UPath
4 | from upath.implementations.cached import SimpleCachePath
5 |
6 | from ..cases import BaseTests
7 |
8 |
9 | class TestSimpleCachePath(BaseTests):
10 | @pytest.fixture(autouse=True)
11 | def path(self, local_testdir):
12 | if not local_testdir.startswith("/"):
13 | local_testdir = "/" + local_testdir
14 | path = f"simplecache::memory:{local_testdir}"
15 | self.path = UPath(path)
16 | self.prepare_file_system()
17 |
18 | def test_is_SimpleCachePath(self):
19 | assert isinstance(self.path, SimpleCachePath)
20 |
--------------------------------------------------------------------------------
/.readthedocs.yaml:
--------------------------------------------------------------------------------
1 | # Read the Docs configuration file
2 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
3 |
4 | # Required
5 | version: 2
6 |
7 | # Set the OS, Python version, and other tools you might need
8 | build:
9 | os: ubuntu-24.04
10 | tools:
11 | python: "3.13"
12 | jobs:
13 | pre_create_environment:
14 | - asdf plugin add uv
15 | - asdf install uv latest
16 | - asdf global uv latest
17 | create_environment:
18 | - uv venv "${READTHEDOCS_VIRTUALENV_PATH}"
19 | install:
20 | - UV_PROJECT_ENVIRONMENT="${READTHEDOCS_VIRTUALENV_PATH}" uv sync --group docs
21 |
22 | # Build documentation with Mkdocs
23 | mkdocs:
24 | configuration: mkdocs.yml
25 |
--------------------------------------------------------------------------------
/.flake8:
--------------------------------------------------------------------------------
1 | [flake8]
2 | ignore=
3 | # Whitespace before ':'
4 | E203
5 | # Too many leading '#' for block comment
6 | E266
7 | # Line break occurred before a binary operator
8 | W503
9 | # unindexed parameters in the str.format, see:
10 | # https://pypi.org/project/flake8-string-format/
11 | P1
12 | # def statements on the same line with overload
13 | E704
14 | max_line_length = 88
15 | max-complexity = 15
16 | select = B,C,E,F,W,T4,B902,T,P
17 | show_source = true
18 | count = true
19 | exclude =
20 | .noxfile,
21 | .nox,
22 | __pycache__,
23 | .git,
24 | .github,
25 | .gitignore,
26 | .pytest_cache,
27 | upath/tests/pathlib/_test_support.py,
28 | upath/tests/pathlib/test_pathlib_3*.py,
29 |
--------------------------------------------------------------------------------
/.github/workflows/release.yml:
--------------------------------------------------------------------------------
1 | name: Release
2 |
3 | on:
4 | release:
5 | types: [published]
6 | workflow_dispatch:
7 |
8 | env:
9 | FORCE_COLOR: "1"
10 |
11 | jobs:
12 | release:
13 | runs-on: ubuntu-latest
14 | environment: pypi
15 | permissions:
16 | id-token: write
17 | steps:
18 | - name: Check out the repository
19 | uses: actions/checkout@v4
20 | with:
21 | fetch-depth: 0
22 |
23 | - uses: hynek/setup-cached-uv@v2
24 |
25 | - name: Build package
26 | run: uvx nox -s build
27 |
28 | - name: Upload package
29 | if: github.event_name == 'release'
30 | uses: pypa/gh-action-pypi-publish@release/v1
31 | with:
32 | verbose: true
33 | skip-existing: true
34 |
--------------------------------------------------------------------------------
/upath/_info.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | from typing import TYPE_CHECKING
4 |
5 | from upath.types import PathInfo
6 |
7 | if TYPE_CHECKING:
8 | from upath import UPath
9 |
10 |
11 | __all__ = [
12 | "UPathInfo",
13 | ]
14 |
15 |
16 | class UPathInfo(PathInfo):
17 | """Path info for UPath objects."""
18 |
19 | def __init__(self, path: UPath) -> None:
20 | self._path = path.path
21 | self._fs = path.fs
22 |
23 | def exists(self, *, follow_symlinks=True) -> bool:
24 | return self._fs.exists(self._path)
25 |
26 | def is_dir(self, *, follow_symlinks=True) -> bool:
27 | return self._fs.isdir(self._path)
28 |
29 | def is_file(self, *, follow_symlinks=True) -> bool:
30 | return self._fs.isfile(self._path)
31 |
32 | def is_symlink(self) -> bool:
33 | return False
34 |
--------------------------------------------------------------------------------
/upath/implementations/_experimental.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | from typing import TYPE_CHECKING
4 |
5 | from upath.registry import get_upath_class
6 |
7 | if TYPE_CHECKING:
8 | from upath import UPath
9 |
10 |
11 | def __getattr__(name: str) -> type[UPath]:
12 | if name.startswith("_") and name.endswith("Path"):
13 | from upath import UPath
14 |
15 | protocol = name[1:-4].lower()
16 | cls = get_upath_class(protocol)
17 | if cls is None:
18 | raise RuntimeError(
19 | f"Could not find fsspec implementation for protocol {protocol!r}"
20 | )
21 | elif not issubclass(cls, UPath):
22 | raise RuntimeError(
23 | f"UPath implementation not a subclass of upath.UPath, {cls!r}"
24 | )
25 | return cls
26 | raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
27 |
--------------------------------------------------------------------------------
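The module-level `__getattr__` in `upath/implementations/_experimental.py` above derives a protocol from the requested attribute name: it drops the leading underscore and the trailing `Path`, then lowercases the rest. The sketch below isolates just that string handling. The function name `protocol_from_attribute` and the sample names are hypothetical, and the registry lookup done by `get_upath_class` is deliberately left out.

```python
from typing import Optional


def protocol_from_attribute(name: str) -> Optional[str]:
    """Return the protocol encoded in a name like '_S3Path', else None."""
    if name.startswith("_") and name.endswith("Path"):
        # Drop the leading "_" and the trailing "Path", then lowercase.
        return name[1:-4].lower()
    return None


# Illustrative attribute names (assumptions, not taken from the repository).
s3_protocol = protocol_from_attribute("_S3Path")
github_protocol = protocol_from_attribute("_GitHubPath")
not_experimental = protocol_from_attribute("S3Path")
```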
/upath/__init__.py:
--------------------------------------------------------------------------------
1 | """Pathlib API extended to use fsspec backends."""
2 |
3 | from __future__ import annotations
4 |
5 | from typing import TYPE_CHECKING
6 |
7 | try:
8 | from upath._version import __version__
9 | except ImportError:
10 | __version__ = "not-installed"
11 |
12 | if TYPE_CHECKING:
13 | from upath.core import UnsupportedOperation
14 | from upath.core import UPath
15 |
16 | __all__ = ["UPath", "UnsupportedOperation"]
17 |
18 |
19 | def __getattr__(name):
20 | if name == "UPath":
21 | from upath.core import UPath
22 |
23 | globals()["UPath"] = UPath
24 | return UPath
25 | elif name == "UnsupportedOperation":
26 | from upath.core import UnsupportedOperation
27 |
28 | globals()["UnsupportedOperation"] = UnsupportedOperation
29 | return UnsupportedOperation
30 | else:
31 | raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
32 |
--------------------------------------------------------------------------------
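`upath/__init__.py` above uses a module-level `__getattr__` (PEP 562) to import `UPath` and `UnsupportedOperation` on first access and then caches them in `globals()`. The following stdlib-only sketch reproduces the pattern on a synthetic module. The names `make_lazy_module`, `lazy_demo`, and `load_fraction` are assumptions made for illustration.

```python
import types


def make_lazy_module(name, loaders):
    """Build a module whose listed attributes are resolved on first access."""
    mod = types.ModuleType(name)

    def __getattr__(attr):
        try:
            loader = loaders[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = loader()
        # Cache in the module namespace so later lookups bypass __getattr__.
        mod.__dict__[attr] = value
        return value

    # Module attribute lookup falls back to a "__getattr__" found in the
    # module's own namespace (PEP 562).
    mod.__getattr__ = __getattr__
    return mod


calls = []


def load_fraction():
    calls.append("Fraction")  # record each time the deferred import runs
    from fractions import Fraction

    return Fraction


demo = make_lazy_module("lazy_demo", {"Fraction": load_fraction})
first = demo.Fraction
second = demo.Fraction
```

Because the resolved object is stored in the module namespace, the loader runs once; the second access is an ordinary attribute lookup.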
/.github/workflows/post-dependabot-update.yml:
--------------------------------------------------------------------------------
1 | name: Post Dependabot Update
2 | on:
3 | pull_request:
4 | branches: [main]
5 | paths:
6 | - dev/**
7 |
8 | jobs:
9 | auto-update:
10 | if: github.actor == 'dependabot[bot]'
11 | runs-on: ubuntu-latest
12 | permissions:
13 | contents: write
14 | pull-requests: write
15 | steps:
16 | - uses: actions/checkout@v4
17 | with:
18 | ref: ${{ github.head_ref }}
19 |
20 | - uses: hynek/setup-cached-uv@v2
21 |
22 | - name: Regenerate flavours
23 | run: uvx nox --sessions flavours-codegen
24 |
25 | - name: Commit changes
26 | run: |
27 | git config user.name "github-actions[bot]"
28 | git config user.email "github-actions[bot]@users.noreply.github.com"
29 | git add upath/_flavour_sources.py
30 | git commit -m "Auto-update generated flavours" || echo "No changes"
31 | git push
32 |
--------------------------------------------------------------------------------
/.github/dependabot.yml:
--------------------------------------------------------------------------------
1 | version: 2
2 |
3 | updates:
4 | - directory: "/"
5 | package-ecosystem: "pip"
6 | schedule:
7 | interval: "weekly"
8 | labels:
9 | - "maintenance :construction:"
10 | groups:
11 | # Group all pip dependencies into one PR
12 | pip-dependencies:
13 | patterns:
14 | - "*"
15 | # Update via cruft
16 | ignore:
17 | - dependency-name: "mkdocs*"
18 | - dependency-name: "pytest*"
19 | - dependency-name: "pylint"
20 | - dependency-name: "mypy"
21 |
22 | - directory: "/"
23 | package-ecosystem: "github-actions"
24 | schedule:
25 | interval: "weekly"
26 | labels:
27 | - "maintenance :construction:"
28 | # Update via cruft
29 | ignore:
30 | - dependency-name: "actions/checkout"
31 | - dependency-name: "actions/setup-python"
32 | - dependency-name: "pypa/gh-action-pypi-publish"
33 | - dependency-name: "codecov/codecov-action"
34 |
--------------------------------------------------------------------------------
/docs/api/registry.md:
--------------------------------------------------------------------------------
1 | # Registry :file_cabinet:
2 |
3 | The UPath registry system manages filesystem-specific path implementations. It allows you to
4 | register custom UPath subclasses for different protocols and retrieve the appropriate
5 | implementation for a given protocol.
6 |
7 | ## Functions
8 |
9 | ::: upath.registry.get_upath_class
10 | options:
11 | heading_level: 3
12 | show_root_heading: true
13 | show_root_full_path: false
14 |
15 | ::: upath.registry.register_implementation
16 | options:
17 | heading_level: 3
18 | show_root_heading: true
19 | show_root_full_path: false
20 |
21 | ::: upath.registry.available_implementations
22 | options:
23 | heading_level: 3
24 | show_root_heading: true
25 | show_root_full_path: false
26 |
27 | ---
28 |
29 | ## See Also :link:
30 |
31 | - [UPath](index.md) - Main UPath class documentation
32 | - [Implementations](implementations.md) - Built-in UPath subclasses
33 | - [Extensions](extensions.md) - Extending UPath functionality
34 |
--------------------------------------------------------------------------------
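`docs/api/registry.md` above describes a registry that maps protocols to path classes, with functions to register, retrieve, and list implementations. The toy class below sketches that idea using only the standard library. It is not upath's implementation, and every name in it (`ToyPathRegistry`, `BasePath`, `MemoryLikePath`, the `clobber` flag) is an assumption made for this example.

```python
class ToyPathRegistry:
    """Toy protocol-to-class registry; a stand-in, not upath's registry."""

    def __init__(self):
        self._classes = {}

    def register(self, protocol, cls, *, clobber=False):
        # Refuse to silently replace an existing entry unless asked to.
        if protocol in self._classes and not clobber:
            raise ValueError(f"{protocol!r} is already registered")
        self._classes[protocol] = cls

    def get(self, protocol):
        # Unknown protocols yield None rather than raising.
        return self._classes.get(protocol)

    def available(self):
        return sorted(self._classes)


class BasePath:
    """Hypothetical base class standing in for a generic path type."""


class MemoryLikePath(BasePath):
    """Hypothetical protocol-specific subclass."""


registry = ToyPathRegistry()
registry.register("memory", MemoryLikePath)

try:
    registry.register("memory", BasePath)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```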
/docs/concepts/index.md:
--------------------------------------------------------------------------------
1 | # Overview :map:
2 |
3 | Universal Pathlib brings together fsspec and pathlib to provide a unified, pythonic interface for working with files across different storage systems. Understanding how these components work together will help you make the most of universal-pathlib.
4 |
5 | - **[Filesystem Spec](fsspec.md)** provides the foundation—a specification and collection of filesystem implementations that offer consistent access to local storage, cloud services, and remote systems.
6 | - **[Pathlib](pathlib.md)** defines the familiar object-oriented API from Python's standard library for working with filesystem paths.
7 | - **[Universal Pathlib](upath.md)** ties them together, implementing the [pathlib-abc](https://github.com/barneygale/pathlib-abc) interface on top of fsspec filesystems to give you a Path-like experience everywhere.
8 |
9 | Start with [fsspec filesystems](fsspec.md) to understand the available storage backends, then explore [stdlib pathlib](pathlib.md) to learn about the path interface, and finally see [upath](upath.md) to discover how universal-pathlib combines them into a powerful, unified API.
10 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2022, Andrew Fulton
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_memory.py:
--------------------------------------------------------------------------------
1 | import pytest
2 |
3 | from upath import UPath
4 | from upath.implementations.memory import MemoryPath
5 |
6 | from ..cases import BaseTests
7 |
8 |
9 | class TestMemoryPath(BaseTests):
10 | @pytest.fixture(autouse=True)
11 | def path(self, local_testdir):
12 | if not local_testdir.startswith("/"):
13 | local_testdir = "/" + local_testdir
14 | path = f"memory:{local_testdir}"
15 | self.path = UPath(path)
16 | self.prepare_file_system()
17 |
18 | def test_is_MemoryPath(self):
19 | assert isinstance(self.path, MemoryPath)
20 |
21 |
22 | @pytest.mark.parametrize(
23 | "path, expected",
24 | [
25 | ("memory:/", "memory://"),
26 | ("memory:/a", "memory://a"),
27 | ("memory:/a/b", "memory://a/b"),
28 | ("memory://", "memory://"),
29 | ("memory://a", "memory://a"),
30 | ("memory://a/b", "memory://a/b"),
31 | ("memory:///", "memory://"),
32 | ("memory:///a", "memory://a"),
33 | ("memory:///a/b", "memory://a/b"),
34 | ],
35 | )
36 | def test_string_representation(path, expected):
37 | path = UPath(path)
38 | assert str(path) == expected
39 |
--------------------------------------------------------------------------------
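The parametrized cases in `test_string_representation` above expect every spelling of a memory URL (`memory:/a`, `memory://a`, `memory:///a`) to render as `memory://a`. The helper below is a standalone sketch that reproduces that input-to-output table with plain string handling. It is a hypothetical function named `normalize_memory_url`, not how `MemoryPath` computes its string form.

```python
def normalize_memory_url(url: str) -> str:
    """Collapse any run of slashes after 'memory:' into the 'memory://' form."""
    prefix = "memory:"
    if not url.startswith(prefix):
        raise ValueError(f"not a memory URL: {url!r}")
    # Strip every leading slash from the remainder, then re-add exactly two.
    rest = url[len(prefix):].lstrip("/")
    return f"memory://{rest}"


# The same cases as the parametrized test above.
cases = {
    "memory:/": "memory://",
    "memory:/a": "memory://a",
    "memory:/a/b": "memory://a/b",
    "memory://": "memory://",
    "memory://a": "memory://a",
    "memory://a/b": "memory://a/b",
    "memory:///": "memory://",
    "memory:///a": "memory://a",
    "memory:///a/b": "memory://a/b",
}
results = {url: normalize_memory_url(url) for url in cases}
```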
/upath/tests/implementations/test_smb.py:
--------------------------------------------------------------------------------
1 | import pytest
2 | from fsspec import __version__ as fsspec_version
3 | from packaging.version import Version
4 |
5 | from upath import UPath
6 | from upath.tests.cases import BaseTests
7 | from upath.tests.utils import skip_on_windows
8 |
9 |
10 | @skip_on_windows
11 | class TestUPathSMB(BaseTests):
12 |
13 | @pytest.fixture(autouse=True)
14 | def path(self, smb_fixture):
15 | self.path = UPath(smb_fixture)
16 |
17 | @pytest.mark.parametrize(
18 | "pattern",
19 | (
20 | "*.txt",
21 | pytest.param(
22 | "*",
23 | marks=pytest.mark.xfail(
24 | reason="SMBFileSystem.info appends '/' to dirs"
25 | ),
26 | ),
27 | pytest.param(
28 | "**/*.txt",
29 | marks=(
30 | pytest.mark.xfail(reason="requires fsspec>=2023.9.0")
31 | if Version(fsspec_version) < Version("2023.9.0")
32 | else ()
33 | ),
34 | ),
35 | ),
36 | )
37 | def test_glob(self, pathlib_base, pattern):
38 | super().test_glob(pathlib_base, pattern)
39 |
--------------------------------------------------------------------------------
/upath/implementations/hdfs.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from typing import TYPE_CHECKING
5 |
6 | from upath.core import UPath
7 | from upath.types import JoinablePathLike
8 |
9 | if TYPE_CHECKING:
10 | from typing import Literal
11 |
12 | if sys.version_info >= (3, 11):
13 | from typing import Unpack
14 | else:
15 | from typing_extensions import Unpack
16 |
17 | from upath._chain import FSSpecChainParser
18 | from upath.types.storage_options import HDFSStorageOptions
19 |
20 | __all__ = ["HDFSPath"]
21 |
22 |
23 | class HDFSPath(UPath):
24 | __slots__ = ()
25 |
26 | if TYPE_CHECKING:
27 |
28 | def __init__(
29 | self,
30 | *args: JoinablePathLike,
31 | protocol: Literal["hdfs"] | None = ...,
32 | chain_parser: FSSpecChainParser = ...,
33 | **storage_options: Unpack[HDFSStorageOptions],
34 | ) -> None: ...
35 |
36 | def mkdir(
37 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False
38 | ) -> None:
39 | if not exist_ok and self.exists():
40 | raise FileExistsError(str(self))
41 | super().mkdir(mode=mode, parents=parents, exist_ok=exist_ok)
42 |
--------------------------------------------------------------------------------
/upath/implementations/sftp.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from typing import TYPE_CHECKING
5 |
6 | from upath.core import UPath
7 | from upath.types import JoinablePathLike
8 |
9 | if TYPE_CHECKING:
10 | from typing import Literal
11 |
12 | if sys.version_info >= (3, 11):
13 | from typing import Unpack
14 | else:
15 | from typing_extensions import Unpack
16 |
17 | from upath._chain import FSSpecChainParser
18 | from upath.types.storage_options import SFTPStorageOptions
19 |
20 | __all__ = ["SFTPPath"]
21 |
22 |
23 | class SFTPPath(UPath):
24 | __slots__ = ()
25 |
26 | if TYPE_CHECKING:
27 |
28 | def __init__(
29 | self,
30 | *args: JoinablePathLike,
31 | protocol: Literal["sftp"] | None = ...,
32 | chain_parser: FSSpecChainParser = ...,
33 | **storage_options: Unpack[SFTPStorageOptions],
34 | ) -> None: ...
35 |
36 | @property
37 | def path(self) -> str:
38 | path = super().path
39 | if len(path) > 1:
40 | return path.removesuffix("/")
41 | return path
42 |
43 | def __str__(self) -> str:
44 | path_str = super().__str__()
45 | if path_str.startswith(("ssh:///", "sftp:///")):
46 | return path_str.removesuffix("/")
47 | return path_str
48 |
--------------------------------------------------------------------------------
/upath/tests/pathlib/conftest.py:
--------------------------------------------------------------------------------
1 | import sys
2 |
3 | import pytest
4 |
5 | BASE_URL = "https://raw.githubusercontent.com/python/cpython/{}/Lib/test/test_pathlib.py" # noqa
6 |
7 | # current origin of pathlib tests:
8 | TEST_FILES = {
9 | "test_pathlib_38.py": "7475aa2c590e33a47f5e79e4079bca0645e93f2f",
10 | "test_pathlib_39.py": "d718764f389acd1bf4a5a65661bb58862f14fb98",
11 | "test_pathlib_310.py": "b382bf50c53e6eab09f3e3bf0802ab052cb0289d",
12 | "test_pathlib_311.py": "846a23d0b8f08e62a90682c51ce01301eb923f2e",
13 | "test_pathlib_312.py": "97a6a418167f1c8bbb014fab813e440b88cf2221", # 3.12.0b4
14 | }
15 |
16 |
17 | def pytest_ignore_collect(collection_path):
18 | """Skip collecting pathlib tests written for other Python versions.
19 |
20 | (otherwise we see a lot of skipped tests in the pytest output)
21 | """
22 | v2 = sys.version_info[:2]
23 | return {
24 | "test_pathlib_38.py": v2 != (3, 8),
25 | "test_pathlib_39.py": v2 != (3, 9),
26 | "test_pathlib_310.py": v2 != (3, 10),
27 | "test_pathlib_311.py": v2 != (3, 11),
28 | "test_pathlib_312.py": v2 != (3, 12),
29 | }.get(collection_path.name, False)
30 |
31 |
32 | def pytest_collection_modifyitems(config, items):
33 | """mark all tests in this folder as pathlib tests"""
34 | for item in items:
35 | item.add_marker(pytest.mark.pathlib)
36 |
--------------------------------------------------------------------------------
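`pytest_ignore_collect` in `upath/tests/pathlib/conftest.py` above uses a hand-written dict to skip every `test_pathlib_3X.py` file whose version differs from the running interpreter. As an alternative sketch (not the repository's approach), the version can be parsed from the file name. The function name `ignored_for` and the regular expression are assumptions made for this example.

```python
import re


def ignored_for(filename: str, version: tuple) -> bool:
    """Return True if filename targets a Python version other than `version`."""
    match = re.fullmatch(r"test_pathlib_(\d)(\d+)\.py", filename)
    if match is None:
        # Files that do not follow the naming scheme are never ignored.
        return False
    target = (int(match.group(1)), int(match.group(2)))
    return target != tuple(version[:2])


same_version = ignored_for("test_pathlib_310.py", (3, 10))
other_version = ignored_for("test_pathlib_310.py", (3, 12))
unrelated_file = ignored_for("conftest.py", (3, 12))
```

In a real `conftest.py` the second argument would come from `sys.version_info`; it is passed explicitly here so the behaviour does not depend on the interpreter running the example.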
/upath/implementations/memory.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from typing import TYPE_CHECKING
5 |
6 | from upath.core import UPath
7 | from upath.types import JoinablePathLike
8 |
9 | if TYPE_CHECKING:
10 | from typing import Literal
11 |
12 | if sys.version_info >= (3, 11):
13 | from typing import Unpack
14 | else:
15 | from typing_extensions import Unpack
16 |
17 | from upath._chain import FSSpecChainParser
18 | from upath.types.storage_options import MemoryStorageOptions
19 |
20 | __all__ = ["MemoryPath"]
21 |
22 |
23 | class MemoryPath(UPath):
24 | __slots__ = ()
25 |
26 | if TYPE_CHECKING:
27 |
28 | def __init__(
29 | self,
30 | *args: JoinablePathLike,
31 | protocol: Literal["memory"] | None = ...,
32 | chain_parser: FSSpecChainParser = ...,
33 | **storage_options: Unpack[MemoryStorageOptions],
34 | ) -> None: ...
35 |
36 | @property
37 | def path(self) -> str:
38 | path = super().path
39 | return "/" if path in {"", "."} else path
40 |
41 | def is_absolute(self) -> bool:
42 | if self._relative_base is None and self.__vfspath__() == "/":
43 | return True
44 | return super().is_absolute()
45 |
46 | def __str__(self) -> str:
47 | s = super().__str__()
48 | if s.startswith("memory:///"):
49 | s = s.replace("memory:///", "memory://", 1)
50 | return s
51 |
--------------------------------------------------------------------------------
/upath/implementations/github.py:
--------------------------------------------------------------------------------
1 | """
2 | GitHub file system implementation
3 | """
4 |
5 | from __future__ import annotations
6 |
7 | import sys
8 | from collections.abc import Sequence
9 | from typing import TYPE_CHECKING
10 |
11 | from upath.core import UPath
12 | from upath.types import JoinablePathLike
13 |
14 | if TYPE_CHECKING:
15 | from typing import Literal
16 |
17 | if sys.version_info >= (3, 11):
18 | from typing import Unpack
19 | else:
20 | from typing_extensions import Unpack
21 |
22 | from upath._chain import FSSpecChainParser
23 | from upath.types.storage_options import GitHubStorageOptions
24 |
25 | __all__ = ["GitHubPath"]
26 |
27 |
28 | class GitHubPath(UPath):
29 | """
30 | GitHubPath supporting the fsspec.GitHubFileSystem
31 | """
32 |
33 | __slots__ = ()
34 |
35 | if TYPE_CHECKING:
36 |
37 | def __init__(
38 | self,
39 | *args: JoinablePathLike,
40 | protocol: Literal["github"] | None = ...,
41 | chain_parser: FSSpecChainParser = ...,
42 | **storage_options: Unpack[GitHubStorageOptions],
43 | ) -> None: ...
44 |
45 | @property
46 | def path(self) -> str:
47 | pth = super().path
48 | if pth == ".":
49 | return ""
50 | return pth
51 |
52 | @property
53 | def parts(self) -> Sequence[str]:
54 | parts = super().parts
55 | if parts and parts[0] == "/":
56 | return parts[1:]
57 | else:
58 | return parts
59 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_gcs.py:
--------------------------------------------------------------------------------
1 | import fsspec
2 | import pytest
3 |
4 | from upath import UPath
5 | from upath.implementations.cloud import GCSPath
6 |
7 | from ..cases import BaseTests
8 | from ..utils import skip_on_windows
9 |
10 |
11 | @skip_on_windows
12 | @pytest.mark.usefixtures("path")
13 | class TestGCSPath(BaseTests):
14 | SUPPORTS_EMPTY_DIRS = False
15 |
16 | @pytest.fixture(autouse=True, scope="function")
17 | def path(self, gcs_fixture):
18 | path, endpoint_url = gcs_fixture
19 | self.path = UPath(path, endpoint_url=endpoint_url, token="anon")
20 |
21 | def test_is_GCSPath(self):
22 | assert isinstance(self.path, GCSPath)
23 |
24 | def test_rmdir(self):
25 | dirname = "rmdir_test"
26 | mock_dir = self.path.joinpath(dirname)
27 | mock_dir.joinpath("test.txt").write_text("hello")
28 | mock_dir.fs.invalidate_cache()
29 | mock_dir.rmdir()
30 | assert not mock_dir.exists()
31 | with pytest.raises(NotADirectoryError):
32 | self.path.joinpath("file1.txt").rmdir()
33 |
34 | @pytest.mark.skip
35 | def test_makedirs_exist_ok_false(self):
36 | pass
37 |
38 |
39 | @skip_on_windows
40 | def test_mkdir_in_empty_bucket(docker_gcs):
41 | fs = fsspec.filesystem("gcs", endpoint_url=docker_gcs, token="anon")
42 | fs.mkdir("my-fresh-bucket")
43 | assert "my-fresh-bucket/" in fs.buckets
44 | fs.invalidate_cache()
45 | del fs
46 |
47 | UPath(
48 | "gs://my-fresh-bucket/some-dir/another-dir/file",
49 | endpoint_url=docker_gcs,
50 | token="anon",
51 | ).parent.mkdir(parents=True, exist_ok=True)
52 |
--------------------------------------------------------------------------------
/upath/implementations/cached.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from types import MappingProxyType
5 | from typing import TYPE_CHECKING
6 |
7 | from upath.core import UPath
8 | from upath.types import JoinablePathLike
9 |
10 | if TYPE_CHECKING:
11 | from collections.abc import Mapping
12 | from typing import Any
13 | from typing import Literal
14 |
15 | if sys.version_info >= (3, 11):
16 | from typing import Unpack
17 | else:
18 | from typing_extensions import Unpack
19 |
20 | from fsspec import AbstractFileSystem
21 |
22 | from upath._chain import FSSpecChainParser
23 | from upath.types.storage_options import SimpleCacheStorageOptions
24 |
25 |
26 | __all__ = ["SimpleCachePath"]
27 |
28 |
29 | class SimpleCachePath(UPath):
30 | __slots__ = ()
31 |
32 | if TYPE_CHECKING:
33 |
34 | def __init__(
35 | self,
36 | *args: JoinablePathLike,
37 | protocol: Literal["simplecache"] | None = ...,
38 | chain_parser: FSSpecChainParser = ...,
39 | **storage_options: Unpack[SimpleCacheStorageOptions],
40 | ) -> None: ...
41 |
42 | @classmethod
43 | def _fs_factory(
44 | cls,
45 | urlpath: str,
46 | protocol: str,
47 | storage_options: Mapping[str, Any],
48 | ) -> AbstractFileSystem:
49 | so = dict(storage_options)
50 | so.pop("fo", None)
51 | return super()._fs_factory(
52 | urlpath,
53 | protocol,
54 | so,
55 | )
56 |
57 | @property
58 | def storage_options(self) -> Mapping[str, Any]:
59 | so = self._storage_options.copy()
60 | so.pop("fo", None)
61 | return MappingProxyType(so)
62 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_azure.py:
--------------------------------------------------------------------------------
1 | import pytest
2 |
3 | from upath import UPath
4 | from upath.implementations.cloud import AzurePath
5 |
6 | from ..cases import BaseTests
7 | from ..utils import skip_on_windows
8 |
9 |
10 | @skip_on_windows
11 | @pytest.mark.usefixtures("path")
12 | class TestAzurePath(BaseTests):
13 | SUPPORTS_EMPTY_DIRS = False
14 |
15 | @pytest.fixture(autouse=True, scope="function")
16 | def path(self, azurite_credentials, azure_fixture):
17 | account_name, connection_string = azurite_credentials
18 |
19 | self.storage_options = {
20 | "account_name": account_name,
21 | "connection_string": connection_string,
22 | }
23 | self.path = UPath(azure_fixture, **self.storage_options)
24 | self.prepare_file_system()
25 |
26 | def test_is_AzurePath(self):
27 | assert isinstance(self.path, AzurePath)
28 |
29 | def test_rmdir(self):
30 | new_dir = self.path / "new_dir_rmdir"
31 | new_dir.mkdir()
32 | path = new_dir / "test.txt"
33 | path.write_text("hello")
34 | assert path.exists()
35 | new_dir.rmdir()
36 | assert not new_dir.exists()
37 |
38 | with pytest.raises(NotADirectoryError):
39 | (self.path / "a" / "file.txt").rmdir()
40 |
41 | def test_protocol(self):
42 | # test all valid protocols for azure...
43 | protocol = self.path.protocol
44 | assert protocol in ["abfs", "abfss", "adl", "az"]
45 |
46 | def test_broken_mkdir(self):
47 | path = UPath(
48 | "az://new-container/",
49 | **self.storage_options,
50 | )
51 | if path.exists():
52 | path.rmdir()
53 | path.mkdir(parents=True, exist_ok=False)
54 |
55 | (path / "file").write_text("foo")
56 | assert path.exists()
57 |
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | default_language_version:
2 | python: python3
3 | exclude: ^upath/tests/pathlib/test_pathlib.*\.py|^upath/tests/pathlib/_test_support\.py|^upath/_flavour_sources\.py
4 | repos:
5 | - repo: https://github.com/psf/black
6 | rev: 25.1.0
7 | hooks:
8 | - id: black
9 | - repo: https://github.com/pre-commit/pre-commit-hooks
10 | rev: v4.6.0
11 | hooks:
12 | - id: check-added-large-files
13 | - id: check-case-conflict
14 | - id: check-docstring-first
15 | - id: check-executables-have-shebangs
16 | - id: check-json
17 | - id: check-merge-conflict
18 | args: ['--assume-in-merge']
19 | - id: check-toml
20 | - id: check-yaml
21 | exclude: ^mkdocs\.yml$
22 | - id: debug-statements
23 | - id: end-of-file-fixer
24 | - id: mixed-line-ending
25 | args: ['--fix=lf']
26 | - id: sort-simple-yaml
27 | - id: trailing-whitespace
28 | - repo: https://github.com/codespell-project/codespell
29 | rev: v2.4.1
30 | hooks:
31 | - id: codespell
32 | args: ['-L', 'fo']
33 | additional_dependencies: ["tomli"]
34 | - repo: https://github.com/asottile/pyupgrade
35 | rev: v3.19.1
36 | hooks:
37 | - id: pyupgrade
38 | args: [--py39-plus]
39 | - repo: https://github.com/PyCQA/isort
40 | rev: 5.13.2
41 | hooks:
42 | - id: isort
43 | - repo: https://github.com/pycqa/flake8
44 | rev: 7.2.0
45 | hooks:
46 | - id: flake8
47 | additional_dependencies:
48 | - flake8-bugbear==24.1.17
49 | - flake8-comprehensions==3.14.0
50 | - flake8-debugger==4.1.2
51 | - flake8-string-format==0.3.0
52 | - repo: https://github.com/pycqa/bandit
53 | rev: 1.8.3
54 | hooks:
55 | - id: bandit
56 | args: [-c, pyproject.toml]
57 | additional_dependencies: ["tomli>=1.1.0"]
58 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_sftp.py:
--------------------------------------------------------------------------------
1 | import pytest
2 |
3 | from upath import UPath
4 | from upath.tests.cases import BaseTests
5 | from upath.tests.utils import skip_on_windows
6 | from upath.tests.utils import xfail_if_version
7 |
8 | _xfail_old_fsspec = xfail_if_version(
9 | "fsspec",
10 | lt="2022.7.0",
11 | reason="fsspec<2022.7.0 sftp does not support create_parents",
12 | )
13 |
14 |
15 | @skip_on_windows
16 | class TestUPathSFTP(BaseTests):
17 |
18 | @pytest.fixture(autouse=True)
19 | def path(self, ssh_fixture):
20 | self.path = UPath(ssh_fixture)
21 |
22 | @_xfail_old_fsspec
23 | def test_mkdir(self):
24 | super().test_mkdir()
25 |
26 | @_xfail_old_fsspec
27 | def test_mkdir_exists_ok_true(self):
28 | super().test_mkdir_exists_ok_true()
29 |
30 | @_xfail_old_fsspec
31 | def test_mkdir_exists_ok_false(self):
32 | super().test_mkdir_exists_ok_false()
33 |
34 | @_xfail_old_fsspec
35 | def test_mkdir_parents_true_exists_ok_false(self):
36 | super().test_mkdir_parents_true_exists_ok_false()
37 |
38 | @_xfail_old_fsspec
39 | def test_mkdir_parents_true_exists_ok_true(self):
40 | super().test_mkdir_parents_true_exists_ok_true()
41 |
42 |
43 | @pytest.mark.parametrize(
44 | "args,parts",
45 | [
46 | (("sftp://user@host",), ("/",)),
47 | (("sftp://user@host/",), ("/",)),
48 | (("sftp://user@host", ""), ("/",)),
49 | (("sftp://user@host/", ""), ("/",)),
50 | (("sftp://user@host", "/"), ("/",)),
51 | (("sftp://user@host/", "/"), ("/",)),
52 | (("sftp://user@host/abc",), ("/", "abc")),
53 | (("sftp://user@host", "abc"), ("/", "abc")),
54 | (("sftp://user@host", "/abc"), ("/", "abc")),
55 | (("sftp://user@host/", "/abc"), ("/", "abc")),
56 | ],
57 | )
58 | def test_join_produces_correct_parts(args, parts):
59 | pth = UPath(*args)
60 | assert pth.storage_options == {"host": "host", "username": "user"}
61 | assert pth.parts == parts
62 |
--------------------------------------------------------------------------------
/docs/api/extensions.md:
--------------------------------------------------------------------------------
1 | # Extensions :puzzle_piece:
2 |
3 | The extensions module provides a base class for extending UPath functionality while maintaining
4 | compatibility with all filesystem implementations.
5 |
6 | ## ProxyUPath
7 |
8 | ::: upath.extensions.ProxyUPath
9 | options:
10 | heading_level: 3
11 | show_root_heading: true
12 | show_root_full_path: false
13 | members: false
14 | show_bases: true
15 |
16 | ---
17 |
18 | ## Usage Example
19 |
20 | `ProxyUPath` allows you to extend the UPath interface with additional methods while
21 | preserving compatibility with all supported filesystem implementations. It acts as a
22 | wrapper around any UPath instance.
23 |
24 | ### Creating a Custom Extension
25 |
26 | ```python
27 | from upath import UPath
28 | from upath.extensions import ProxyUPath
29 |
30 | class MyCustomPath(ProxyUPath):
31 | """Custom path with additional functionality"""
32 |
33 | def custom_method(self) -> str:
34 | """Add your custom functionality here"""
35 | return f"Custom processing for: {self.path}"
36 |
37 | def enhanced_read(self) -> str:
38 | """Enhanced read with preprocessing"""
39 | content = self.read_text()
40 | # Add custom processing
41 | return content.upper()
42 |
43 | # Use with any filesystem
44 | s3_path = MyCustomPath("s3://bucket/file.txt")
45 | local_path = MyCustomPath("/tmp/file.txt")
46 | gcs_path = MyCustomPath("gs://bucket/file.txt")
47 |
48 | # All standard UPath methods work
49 | print(s3_path.exists())
50 | print(local_path.parent)
51 |
52 | # Instances are always of your custom class
53 | assert isinstance(s3_path, MyCustomPath)
54 | assert isinstance(local_path, MyCustomPath)
55 |
56 | # Plus your custom methods
57 | print(s3_path.custom_method())
58 | content = local_path.enhanced_read()
59 | ```
60 |
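The wrapper idea described above can be sketched with the standard library alone. This is only an illustration of the delegation pattern, not `ProxyUPath`'s actual implementation: `ProxyPath`, `ShoutingPath`, and `_wrapped` are hypothetical names, and `pathlib.PurePosixPath` stands in for the wrapped path object.

```python
from pathlib import PurePosixPath


class ProxyPath:
    """Hypothetical wrapper that forwards unknown attributes to a wrapped path."""

    def __init__(self, *args):
        # PurePosixPath is a stand-in for the wrapped path object (assumption).
        self._wrapped = PurePosixPath(*args)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so subclass methods take priority.
        if name == "_wrapped":
            raise AttributeError(name)
        return getattr(self._wrapped, name)

    @property
    def parent(self):
        # Re-wrap the result so it stays an instance of the subclass.
        return type(self)(self._wrapped.parent)


class ShoutingPath(ProxyPath):
    """Hypothetical subclass adding one custom method."""

    def shout(self) -> str:
        return self.name.upper()


p = ShoutingPath("/tmp/file.txt")
print(p.shout(), type(p.parent).__name__)
```

Re-wrapping derived paths, as `parent` does here, is what keeps results inside the custom class rather than falling back to the wrapped type.
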
61 | ---
62 |
63 | ## See Also :link:
64 |
65 | - [UPath](index.md) - Main UPath class documentation
66 | - [Implementations](implementations.md) - Built-in UPath subclasses
67 | - [Registry](registry.md) - Implementation registry
68 |
--------------------------------------------------------------------------------
/upath/implementations/tar.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import stat
4 | import sys
5 | import warnings
6 | from typing import TYPE_CHECKING
7 |
8 | from upath._stat import UPathStatResult
9 | from upath.core import UPath
10 | from upath.types import JoinablePathLike
11 | from upath.types import StatResultType
12 |
13 | if TYPE_CHECKING:
14 | from collections.abc import Iterator
15 | from typing import Literal
16 |
17 | if sys.version_info >= (3, 11):
18 | from typing import Self
19 | from typing import Unpack
20 | else:
21 | from typing_extensions import Self
22 | from typing_extensions import Unpack
23 |
24 | from upath._chain import FSSpecChainParser
25 | from upath.types.storage_options import TarStorageOptions
26 |
27 |
28 | __all__ = ["TarPath"]
29 |
30 |
31 | class TarPath(UPath):
32 | __slots__ = ()
33 |
34 | if TYPE_CHECKING:
35 |
36 | def __init__(
37 | self,
38 | *args: JoinablePathLike,
39 | protocol: Literal["tar"] | None = ...,
40 | chain_parser: FSSpecChainParser = ...,
41 | **storage_options: Unpack[TarStorageOptions],
42 | ) -> None: ...
43 |
44 | def stat(
45 | self,
46 | *,
47 | follow_symlinks: bool = True,
48 | ) -> StatResultType:
49 | if not follow_symlinks:
50 | warnings.warn(
51 | f"{type(self).__name__}.stat(follow_symlinks=False)"
52 | " is currently ignored.",
53 | UserWarning,
54 | stacklevel=2,
55 | )
56 | info = self.fs.info(self.path).copy()
57 | # convert mode
58 | if info["type"] == "directory":
59 | info["mode"] = stat.S_IFDIR
60 | elif info["type"] == "file":
61 | info["mode"] = stat.S_IFREG
62 | return UPathStatResult.from_info(info)
63 |
64 | def iterdir(self) -> Iterator[Self]:
65 | it = iter(super().iterdir())
66 | p0 = next(it, None)  # avoid StopIteration escaping the generator
67 | if p0 is not None and p0.name != "":
68 | yield p0
69 | yield from it
70 |
--------------------------------------------------------------------------------
/upath/tests/utils.py:
--------------------------------------------------------------------------------
1 | import operator
2 | import sys
3 | from contextlib import contextmanager
4 |
5 | import pytest
6 | from fsspec.utils import get_package_version_without_import
7 | from packaging.version import Version
8 |
9 |
10 | def skip_on_windows(func):
11 | return pytest.mark.skipif(
12 | sys.platform.startswith("win"), reason="Don't run on Windows"
13 | )(func)
14 |
15 |
16 | def only_on_windows(func):
17 | return pytest.mark.skipif(
18 | not sys.platform.startswith("win"), reason="Only run on Windows"
19 | )(func)
20 |
21 |
22 | def posixify(path):
23 | return str(path).replace("\\", "/")
24 |
25 |
26 | def xfail_if_version(module, *, reason, **conditions):
27 | ver_str = get_package_version_without_import(module)
28 | if ver_str is None:
29 | return pytest.mark.skip(reason=f"NOT INSTALLED ({reason})")
30 | ver = Version(ver_str)
31 | if not set(conditions).issubset({"lt", "le", "ne", "eq", "ge", "gt"}):
32 | raise ValueError("unknown condition")
33 | cond = True
34 | for op, val in conditions.items():
35 | cond &= getattr(operator, op)(ver, Version(val))
36 | return pytest.mark.xfail(cond, reason=reason)
37 |
38 |
39 | def xfail_if_no_ssl_connection(func):
40 | try:
41 | import requests
42 | except ImportError:
43 | return pytest.mark.skip(reason="requests not installed")(func)
44 | try:
45 | requests.get("https://example.com")
46 | except (requests.exceptions.ConnectionError, requests.exceptions.SSLError):
47 | return pytest.mark.xfail(reason="No SSL connection")(func)
48 | else:
49 | return func
50 |
51 |
52 | @contextmanager
53 | def temporary_register(protocol, cls):
54 | """helper to temporarily register a protocol for testing purposes"""
55 | from upath.registry import _registry
56 | from upath.registry import get_upath_class
57 |
58 | m = _registry._m.maps[0]
59 | try:
60 | m[protocol] = cls
61 | get_upath_class.cache_clear()
62 | yield
63 | finally:
64 | m.clear()
65 | get_upath_class.cache_clear()
66 |
--------------------------------------------------------------------------------
/upath/implementations/ftp.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from ftplib import error_perm as FTPPermanentError # nosec B402
5 | from typing import TYPE_CHECKING
6 |
7 | from upath.core import UPath
8 | from upath.types import UNSET_DEFAULT
9 | from upath.types import JoinablePathLike
10 |
11 | if TYPE_CHECKING:
12 | from typing import Any
13 | from typing import Literal
14 |
15 | if sys.version_info >= (3, 11):
16 | from typing import Self
17 | from typing import Unpack
18 | else:
19 | from typing_extensions import Self
20 | from typing_extensions import Unpack
21 |
22 | from upath._chain import FSSpecChainParser
23 | from upath.types import WritablePathLike
24 | from upath.types.storage_options import FTPStorageOptions
25 |
26 | __all__ = ["FTPPath"]
27 |
28 |
29 | class FTPPath(UPath):
30 | __slots__ = ()
31 |
32 | if TYPE_CHECKING:
33 |
34 | def __init__(
35 | self,
36 | *args: JoinablePathLike,
37 | protocol: Literal["ftp"] | None = ...,
38 | chain_parser: FSSpecChainParser = ...,
39 | **storage_options: Unpack[FTPStorageOptions],
40 | ) -> None: ...
41 |
42 | def mkdir(
43 | self,
44 | mode: int = 0o777,
45 | parents: bool = False,
46 | exist_ok: bool = False,
47 | ) -> None:
48 | try:
49 | return super().mkdir(mode, parents, exist_ok)
50 | except FTPPermanentError as e:
51 | if e.args[0].startswith("550") and exist_ok:
52 | return
53 | raise FileExistsError(str(self)) from e
54 |
55 | def rename(
56 | self,
57 | target: WritablePathLike,
58 | *, # note: non-standard compared to pathlib
59 | recursive: bool = UNSET_DEFAULT,
60 | maxdepth: int | None = UNSET_DEFAULT,
61 | **kwargs: Any,
62 | ) -> Self:
63 | t = super().rename(target, recursive=recursive, maxdepth=maxdepth, **kwargs)
64 | self_dir = self.parent.path
65 | t.fs.invalidate_cache(self_dir)
66 | self.fs.invalidate_cache(self_dir)
67 | return t
68 |
--------------------------------------------------------------------------------
/upath/implementations/data.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from collections.abc import Sequence
5 | from typing import TYPE_CHECKING
6 |
7 | from upath.core import UnsupportedOperation
8 | from upath.core import UPath
9 | from upath.types import JoinablePathLike
10 |
11 | if TYPE_CHECKING:
12 | from typing import Literal
13 |
14 | if sys.version_info >= (3, 11):
15 | from typing import Self
16 | from typing import Unpack
17 | else:
18 | from typing_extensions import Self
19 | from typing_extensions import Unpack
20 |
21 | from upath._chain import FSSpecChainParser
22 | from upath.types.storage_options import DataStorageOptions
23 |
24 | __all__ = ["DataPath"]
25 |
26 |
27 | class DataPath(UPath):
28 | __slots__ = ()
29 |
30 | if TYPE_CHECKING:
31 |
32 | def __init__(
33 | self,
34 | *args: JoinablePathLike,
35 | protocol: Literal["data"] | None = ...,
36 | chain_parser: FSSpecChainParser = ...,
37 | **storage_options: Unpack[DataStorageOptions],
38 | ) -> None: ...
39 |
40 | @property
41 | def parts(self) -> Sequence[str]:
42 | return (self.path,)
43 |
44 | def __str__(self) -> str:
45 | return self.parser.join(*self._raw_urlpaths)
46 |
47 | def with_segments(self, *pathsegments: JoinablePathLike) -> Self:
48 | raise UnsupportedOperation("path operation not supported by DataPath")
49 |
50 | def with_suffix(self, suffix: str) -> Self:
51 | raise UnsupportedOperation("path operation not supported by DataPath")
52 |
53 | def mkdir(
54 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False
55 | ) -> None:
56 | raise FileExistsError(str(self))
57 |
58 | def write_bytes(self, data: bytes) -> int:
59 | raise UnsupportedOperation("DataPath does not support writing")
60 |
61 | def write_text(
62 | self,
63 | data: str,
64 | encoding: str | None = None,
65 | errors: str | None = None,
66 | newline: str | None = None,
67 | ) -> int:
68 | raise UnsupportedOperation("DataPath does not support writing")
69 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_local.py:
--------------------------------------------------------------------------------
1 | import os
2 | from pathlib import Path
3 |
4 | import pytest
5 |
6 | from upath import UPath
7 | from upath.implementations.local import LocalPath
8 | from upath.tests.cases import BaseTests
9 | from upath.tests.utils import xfail_if_version
10 |
11 |
12 | class TestFSSpecLocal(BaseTests):
13 | @pytest.fixture(autouse=True)
14 | def path(self, local_testdir):
15 | path = f"file://{local_testdir}"
16 | self.path = UPath(path)
17 |
18 | def test_is_LocalPath(self):
19 | assert isinstance(self.path, LocalPath)
20 |
21 | def test_cwd(self):
22 | cwd = type(self.path).cwd()
23 | assert isinstance(cwd, LocalPath)
24 | assert cwd.path == Path.cwd().as_posix()
25 |
26 | def test_home(self):
27 | home = type(self.path).home()
28 | assert isinstance(home, LocalPath)
29 | assert home.path == Path.home().as_posix()
30 |
31 | def test_chmod(self):
32 | self.path.joinpath("file1.txt").chmod(0o777)
33 |
34 |
35 | @xfail_if_version("fsspec", lt="2023.10.0", reason="requires fsspec>=2023.10.0")
36 | class TestRayIOFSSpecLocal(BaseTests):
37 | @pytest.fixture(autouse=True)
38 | def path(self, local_testdir):
39 | path = f"local://{local_testdir}"
40 | self.path = UPath(path)
41 |
42 | def test_is_LocalPath(self):
43 | assert isinstance(self.path, LocalPath)
44 |
45 | def test_cwd(self):
46 | cwd = type(self.path).cwd()
47 | assert isinstance(cwd, LocalPath)
48 | assert cwd.path == Path.cwd().as_posix()
49 |
50 | def test_home(self):
51 | home = type(self.path).home()
52 | assert isinstance(home, LocalPath)
53 | assert home.path == Path.home().as_posix()
54 |
55 | def test_chmod(self):
56 | self.path.joinpath("file1.txt").chmod(0o777)
57 |
58 |
59 | @pytest.mark.parametrize(
60 | "protocol,path",
61 | [
62 | (None, "/tmp/somefile.txt"),
63 | ("file", "file:///tmp/somefile.txt"),
64 | ("local", "local:///tmp/somefile.txt"),
65 | ],
66 | )
67 | def test_local_paths_are_pathlike(protocol, path):
68 | assert isinstance(UPath(path, protocol=protocol), os.PathLike)
69 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_webdav.py:
--------------------------------------------------------------------------------
1 | from pathlib import Path
2 |
3 | import pytest
4 |
5 | from upath import UPath
6 |
7 | from ..cases import BaseTests
8 |
9 |
10 | class TestUPathWebdav(BaseTests):
11 | @pytest.fixture(autouse=True, scope="function")
12 | def path(self, webdav_fixture):
13 | self.path = UPath(webdav_fixture, auth=("USER", "PASSWORD"))
14 |
15 | def test_fsspec_compat(self):
16 | pass
17 |
18 | def test_storage_options(self):
19 | # we need to add base_url to storage options for webdav filesystems,
20 | # to be able to serialize the http protocol to string...
21 | storage_options = self.path.storage_options
22 | base_url = storage_options["base_url"]
23 | assert storage_options == self.path.fs.storage_options
24 | assert base_url == self.path.fs.client.base_url
25 |
26 | def test_read_with_fsspec(self):
27 | # this test used to fail with fsspec<2022.5.0 because webdav was not
28 | # registered in fsspec. But when UPath(webdav_fixture) is called, to
29 | # run the BaseTests, the upath.implementations.webdav module is
30 | # imported, which registers the webdav implementation in fsspec.
31 | super().test_read_with_fsspec()
32 |
33 | @pytest.mark.parametrize(
34 | "target_factory",
35 | [
36 | lambda obj, name: str(obj.joinpath(name).absolute()),
37 | pytest.param(
38 | lambda obj, name: UPath(obj.absolute().joinpath(name).path),
39 | marks=pytest.mark.xfail(reason="webdav has no root..."),
40 | ),
41 | pytest.param(
42 | lambda obj, name: Path(obj.absolute().joinpath(name).path),
43 | marks=pytest.mark.xfail(reason="webdav has no root..."),
44 | ),
45 | lambda obj, name: obj.absolute().joinpath(name),
46 | ],
47 | ids=[
48 | "str_absolute",
49 | "plain_upath_absolute",
50 | "plain_path_absolute",
51 | "self_upath_absolute",
52 | ],
53 | )
54 | def test_rename_with_target_absolute(self, target_factory):
55 | super().test_rename_with_target_absolute(target_factory)
56 |
--------------------------------------------------------------------------------
/.github/workflows/tests.yml:
--------------------------------------------------------------------------------
1 | name: Tests
2 |
3 | on:
4 | push:
5 | branches: [main]
6 | pull_request:
7 | workflow_dispatch:
8 |
9 | permissions:
10 | contents: read
11 |
12 | env:
13 | FORCE_COLOR: "1"
14 |
15 | concurrency:
16 | group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
17 | cancel-in-progress: true
18 |
19 | jobs:
20 | tests:
21 | timeout-minutes: 10
22 | runs-on: ${{ matrix.os }}
23 | strategy:
24 | fail-fast: false
25 | matrix:
26 | os: [ubuntu-latest, windows-latest, macos-latest]
27 | pyv: ['3.9', '3.10', '3.11', '3.12', '3.13', '3.14']
28 | session: ['tests']
29 |
30 | include:
31 | - os: ubuntu-latest
32 | pyv: '3.9'
33 | session: 'tests-minversion'
34 |
35 | steps:
36 | - name: Check out the repository
37 | uses: actions/checkout@v4
38 | with:
39 | fetch-depth: 0
40 |
41 | - uses: hynek/setup-cached-uv@v2
42 |
43 | - name: Run tests
44 | run: uvx nox --sessions ${{ matrix.session }} --python ${{ matrix.pyv }} -- --cov-report=xml
45 |
46 | typesafety:
47 | runs-on: ubuntu-latest
48 | strategy:
49 | fail-fast: false
50 | matrix:
51 | pyv: ['3.9', '3.10', '3.11', '3.12', '3.13', '3.14']
52 |
53 | steps:
54 | - name: Check out the repository
55 | uses: actions/checkout@v4
56 | with:
57 | fetch-depth: 0
58 |
59 | - uses: hynek/setup-cached-uv@v2
60 |
61 | - name: Run type checks
62 | run: uvx nox --sessions type-safety --python ${{ matrix.pyv }}
63 |
64 | lint:
65 | runs-on: ubuntu-latest
66 |
67 | steps:
68 | - name: Check out the repository
69 | uses: actions/checkout@v4
70 | with:
71 | fetch-depth: 0
72 |
73 | - uses: hynek/setup-cached-uv@v2
74 |
75 | - name: Lint code and check dependencies
76 | run: uvx nox -s lint
77 |
78 | build:
79 | needs: [tests, lint]
80 | runs-on: ubuntu-latest
81 | steps:
82 | - name: Check out the repository
83 | uses: actions/checkout@v4
84 | with:
85 | fetch-depth: 0
86 |
87 | - uses: hynek/setup-cached-uv@v2
88 |
89 | - name: Build package
90 | run: uvx nox -s build
91 |
--------------------------------------------------------------------------------
/upath/implementations/zip.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from typing import TYPE_CHECKING
5 | from zipfile import ZipInfo
6 |
7 | from upath.core import UPath
8 | from upath.types import JoinablePathLike
9 |
10 | if TYPE_CHECKING:
11 | from typing import Literal
12 |
13 | if sys.version_info >= (3, 11):
14 | from typing import Unpack
15 | else:
16 | from typing_extensions import Unpack
17 |
18 | from upath._chain import FSSpecChainParser
19 | from upath.types.storage_options import ZipStorageOptions
20 |
21 |
22 | __all__ = ["ZipPath"]
23 |
24 |
25 | class ZipPath(UPath):
26 | __slots__ = ()
27 |
28 | if TYPE_CHECKING:
29 |
30 | def __init__(
31 | self,
32 | *args: JoinablePathLike,
33 | protocol: Literal["zip"] | None = ...,
34 | chain_parser: FSSpecChainParser = ...,
35 | **storage_options: Unpack[ZipStorageOptions],
36 | ) -> None: ...
37 |
38 | if sys.version_info >= (3, 11):
39 |
40 | def mkdir(
41 | self,
42 | mode: int = 0o777,
43 | parents: bool = False,
44 | exist_ok: bool = False,
45 | ) -> None:
46 | is_dir = self.is_dir()
47 | if is_dir and not exist_ok:
48 | raise FileExistsError(f"File exists: {self.path!r}")
49 | elif not is_dir:
50 | zipfile = self.fs.zip
51 | zipfile.mkdir(self.path, mode)
52 |
53 | else:
54 |
55 | def mkdir(
56 | self,
57 | mode: int = 0o777,
58 | parents: bool = False,
59 | exist_ok: bool = False,
60 | ) -> None:
61 | is_dir = self.is_dir()
62 | if is_dir and not exist_ok:
63 | raise FileExistsError(f"File exists: {self.path!r}")
64 | elif not is_dir:
65 | dirname = self.path
66 | if dirname and not dirname.endswith("/"):
67 | dirname += "/"
68 | zipfile = self.fs.zip
69 | zinfo = ZipInfo(dirname)
70 | zinfo.compress_size = 0
71 | zinfo.CRC = 0
72 | zinfo.external_attr = ((0o40000 | mode) & 0xFFFF) << 16
73 | zinfo.file_size = 0
74 | zinfo.external_attr |= 0x10
75 | zipfile.writestr(zinfo, b"")
76 |
--------------------------------------------------------------------------------
/docs/api/index.md:
--------------------------------------------------------------------------------
1 |
6 |
7 | # UPath {: #upath-logo }
8 |
9 | The `UPath` class is your default entry point for interacting with fsspec filesystems.
10 | When instantiating UPath, a specific `UPath` subclass will be returned, dependent on the
11 | detected or provided `protocol`. Here we document all methods and properties available on
12 | UPath instances.
13 |
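As a rough illustration of protocol-based dispatch, here is a minimal standalone sketch using only the standard library. It does not reflect how `UPath` is actually implemented: `_REGISTRY`, `pick_class`, and the three classes are hypothetical names chosen for this example.

```python
from __future__ import annotations

from urllib.parse import urlsplit


class BasePath:
    """Hypothetical fallback class for unknown or missing protocols."""


class S3LikePath(BasePath):
    """Hypothetical class for 's3' URLs."""


class MemoryLikePath(BasePath):
    """Hypothetical class for 'memory' URLs."""


# Hypothetical registry mapping protocol names to classes.
_REGISTRY = {"s3": S3LikePath, "memory": MemoryLikePath}


def pick_class(urlpath: str, protocol: str | None = None) -> type[BasePath]:
    # An explicitly provided protocol wins; otherwise detect it from the URL scheme.
    proto = protocol if protocol is not None else urlsplit(urlpath).scheme
    return _REGISTRY.get(proto, BasePath)


print(pick_class("s3://bucket/key").__name__)
print(pick_class("/tmp/file.txt").__name__)
```

The explicit `protocol` argument takes precedence over the detected scheme, mirroring the "detected or provided" wording above.
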
14 | !!! info "Compatibility"
15 | All methods documented here work consistently across all supported Python versions,
16 | even if they were introduced in later Python versions. We consider it a bug if they
17 | don't :bug: so please report an issue if you run into inconsistencies.
18 |
19 |
20 | ```python
21 | from upath import UPath
22 | ```
23 |
24 | ::: upath.core.UPath
25 | options:
26 | heading_level: 2
27 | merge_init_into_class: false
28 | inherited_members: true
29 | members:
30 | - __init__
31 | - protocol
32 | - storage_options
33 | - fs
34 | - path
35 | - parts
36 | - name
37 | - stem
38 | - drive
39 | - root
40 | - anchor
41 | - suffix
42 | - suffixes
43 | - parent
44 | - parents
45 | - joinpath
46 | - joinuri
47 | - with_name
48 | - with_stem
49 | - with_suffix
50 | - with_segments
51 | - relative_to
52 | - is_relative_to
53 | - match
54 | - full_match
55 | - as_posix
56 | - as_uri
57 | - open
58 | - read_text
59 | - read_bytes
60 | - write_text
61 | - write_bytes
62 | - iterdir
63 | - glob
64 | - rglob
65 | - walk
66 | - mkdir
67 | - rmdir
68 | - touch
69 | - unlink
70 | - rename
71 | - replace
72 | - copy
73 | - move
74 | - copy_into
75 | - move_into
76 | - exists
77 | - is_file
78 | - is_dir
79 | - is_symlink
80 | - is_absolute
81 | - stat
82 | - info
83 | - absolute
84 | - resolve
85 | - expanduser
86 | - cwd
87 | - home
88 |
89 | ---
90 |
91 | ## See Also :link:
92 |
93 | - [Registry](registry.md) - The upath implementation registry
94 | - [Implementations](implementations.md) - UPath subclasses
95 | - [Extensions](extensions.md) - Extending UPath functionality
96 | - [Types](types.md) - Type hints and protocols
97 |
--------------------------------------------------------------------------------
/docs/install.md:
--------------------------------------------------------------------------------
1 |
2 | # Installation :package:
3 |
4 | Getting started with `universal-pathlib` is easy! Choose your preferred package manager below and you'll be working with cloud storage in minutes.
5 |
6 | ## Quick Install
7 |
8 | === "uv"
9 |
10 | ```bash
11 | uv add universal-pathlib
12 | ```
13 |
14 | === "pip"
15 |
16 | ```bash
17 | python -m pip install universal-pathlib
18 | ```
19 |
20 | === "conda"
21 |
22 | ```bash
23 | conda install -c conda-forge universal-pathlib
24 | ```
25 |
26 | That's it! You now have `universal-pathlib` installed. :tada:
27 |
28 | ## Filesystem-Specific Dependencies
29 |
30 | While `universal-pathlib` comes with `fsspec` out of the box, **some filesystems require additional packages**. Don't worry—installing them is straightforward!
31 |
32 | For example, to work with **AWS S3**, you'll need to install `s3fs`:
33 |
34 | ```bash
35 | pip install s3fs
36 | # or better yet, use fsspec extras:
37 | pip install "fsspec[s3]"
38 | ```
39 |
40 | Here are some common filesystem extras you might need:
41 |
42 | | Filesystem | Install Command |
43 | |------------|----------------|
44 | | **AWS S3** | `pip install "fsspec[s3]"` |
45 | | **Google Cloud Storage** | `pip install "fsspec[gcs]"` |
46 | | **Azure Blob Storage** | `pip install "fsspec[azure]"` |
47 | | **HTTP/HTTPS** | `pip install "fsspec[http]"` |
48 | | **GitHub** | `pip install "fsspec[github]"` |
49 | | **SSH/SFTP** | `pip install "fsspec[ssh]"` |
50 |
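Before using a filesystem, it can help to check whether its backend package is importable. The sketch below does this with the standard library only. `missing_modules` and the `BACKEND_MODULES` mapping are hypothetical helpers written for this example, and the mapping covers just three commonly used backend packages.

```python
from importlib.util import find_spec

# Assumed mapping from protocol to the top-level module of its backend package.
BACKEND_MODULES = {"s3": "s3fs", "gcs": "gcsfs", "abfs": "adlfs"}


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(BACKEND_MODULES.values())
if missing:
    print("Backends not installed:", ", ".join(missing))
```

`find_spec` only looks the module up and does not import it, so the check stays cheap even when a backend has heavy dependencies.
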
51 | ## Adding to Your Project
52 |
53 | When adding `universal-pathlib` to your project, specify the filesystem extras you need. Here's a `pyproject.toml` example for a project using **S3** and **HTTP**:
54 |
55 | ```toml
56 | [project]
57 | name = "myproject"
58 | requires-python = ">=3.9"
59 | dependencies = [
60 | "universal_pathlib>=0.3.7",
61 | "fsspec[s3,http]", # Add the filesystems you need
62 | ]
63 | ```
64 |
65 | !!! tip "Complete List of Filesystem Extras"
66 |
67 | For a complete overview of all available filesystem extras and their dependencies, check out the [fsspec pyproject.toml file][fsspec-pyproject-toml]. It includes extras for:
68 |
69 | - Cloud storage (S3, GCS, Azure, etc.)
70 | - Remote protocols (HTTP, FTP, SSH, etc.)
71 | - Specialized systems (HDFS, WebDAV, SMB, etc.)
72 |
73 | [fsspec-pyproject-toml]: https://github.com/fsspec/filesystem_spec/blob/master/pyproject.toml#L26
74 |
75 | ---
76 |
77 |
78 |
79 | **Ready to get started?** Learn about [Universal Pathlib Concepts](concepts/index.md) :rocket:
80 |
81 |
82 |
--------------------------------------------------------------------------------
/upath/tests/test_drive_root_anchor_parts.py:
--------------------------------------------------------------------------------
1 | from pathlib import Path
2 |
3 | import pytest
4 |
5 | from upath import UPath
6 |
7 | DRIVE_ROOT_ANCHOR_TESTS = [
8 |     # cloud
9 |     ("s3://bucket", "bucket", "/", "bucket/", ("bucket/",)),
10 |     ("s3://bucket/a", "bucket", "/", "bucket/", ("bucket/", "a")),
11 |     ("gs://bucket", "bucket", "/", "bucket/", ("bucket/",)),
12 |     ("gs://bucket/a", "bucket", "/", "bucket/", ("bucket/", "a")),
13 |     ("az://bucket", "bucket", "/", "bucket/", ("bucket/",)),
14 |     ("az://bucket/a", "bucket", "/", "bucket/", ("bucket/", "a")),
15 |     # data
16 |     (
17 |         "data:text/plain,A%20brief%20note",
18 |         "",
19 |         "",
20 |         "",
21 |         ("data:text/plain,A%20brief%20note",),
22 |     ),
23 |     # github
24 |     ("github://user:token@repo/abc", "", "", "", ("abc",)),
25 |     # hdfs
26 |     ("hdfs://a/b/c", "", "/", "/", ("/", "b", "c")),
27 |     ("hdfs:///a/b/c", "", "/", "/", ("/", "a", "b", "c")),
28 |     # http
29 |     ("http://a/", "http://a", "/", "http://a/", ("http://a/", "")),
30 |     ("http://a/b/c", "http://a", "/", "http://a/", ("http://a/", "b", "c")),
31 |     ("https://a/b/c", "https://a", "/", "https://a/", ("https://a/", "b", "c")),
32 |     # memory
33 |     ("memory://a/b/c", "", "/", "/", ("/", "a", "b", "c")),
34 |     ("memory:///a/b/c", "", "/", "/", ("/", "a", "b", "c")),
35 |     # sftp
36 |     ("sftp://a/b/c", "", "/", "/", ("/", "b", "c")),
37 |     ("sftp:///a/b/c", "", "/", "/", ("/", "a", "b", "c")),
38 |     # smb
39 |     ("smb://a/b/c", "", "/", "/", ("/", "b", "c")),
40 |     ("smb:///a/b/c", "", "/", "/", ("/", "a", "b", "c")),
41 |     # webdav (both the http and https variants)
42 |     ("webdav+http://host.com/a/b/c", "", "", "", ("a", "b", "c")),
43 |     ("webdav+https://host.com/a/b/c", "", "", "", ("a", "b", "c")),
44 |     # local
45 |     (
46 |         "file:///a/b/c",
47 |         Path("/a/b/c").absolute().drive,
48 |         Path("/").absolute().root.replace("\\", "/"),
49 |         Path("/").absolute().anchor.replace("\\", "/"),
50 |         tuple(x.replace("\\", "/") for x in Path("/a/b/c").absolute().parts),
51 |     ),
52 | ]
53 |
54 |
55 | @pytest.mark.parametrize(
56 |     "path,drive,root,anchor",
57 |     [x[0:4] for x in DRIVE_ROOT_ANCHOR_TESTS],
58 | )
59 | def test_drive_root_anchor(path, drive, root, anchor):
60 |     p = UPath(path)
61 |     assert (p.drive, p.root, p.anchor) == (drive, root, anchor)
62 |
63 |
64 | @pytest.mark.parametrize(
65 |     "path,parts",
66 |     [(x[0], x[4]) for x in DRIVE_ROOT_ANCHOR_TESTS],
67 | )
68 | def test_parts(path, parts):
69 |     p = UPath(path)
70 |     assert p.parts == parts
71 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | share/python-wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
28 |
29 | # PyInstaller
30 | # Usually these files are written by a python script from a template
31 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
32 | *.manifest
33 | *.spec
34 |
35 | # Installer logs
36 | pip-log.txt
37 | pip-delete-this-directory.txt
38 |
39 | # Unit test / coverage reports
40 | htmlcov/
41 | .tox/
42 | .nox/
43 | .coverage
44 | .coverage.*
45 | .cache
46 | nosetests.xml
47 | coverage.xml
48 | *.cover
49 | *.py,cover
50 | .hypothesis/
51 | .pytest_cache/
52 | cover/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 | docs/changelog.md
74 |
75 | # PyBuilder
76 | .pybuilder/
77 | target/
78 |
79 | # Jupyter Notebook
80 | .ipynb_checkpoints
81 |
82 | # IPython
83 | profile_default/
84 | ipython_config.py
85 |
86 | # pyenv
87 | # For a library or package, you might want to ignore these files since the code is
88 | # intended to run in multiple environments; otherwise, check them in:
89 | # .python-version
90 |
91 | # pipenv
92 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
93 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
94 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
95 | # install all needed dependencies.
96 | #Pipfile.lock
97 |
98 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
99 | __pypackages__/
100 |
101 | # Celery stuff
102 | celerybeat-schedule
103 | celerybeat.pid
104 |
105 | # SageMath parsed files
106 | *.sage.py
107 |
108 | # Environments
109 | .env
110 | .venv
111 | env/
112 | venv/
113 | venv*/
114 | ENV/
115 | env.bak/
116 | venv.bak/
117 |
118 | # Spyder project settings
119 | .spyderproject
120 | .spyproject
121 |
122 | # Rope project settings
123 | .ropeproject
124 |
125 | # mkdocs documentation
126 | /site
127 |
128 | # mypy
129 | .mypy_cache/
130 | .dmypy.json
131 | dmypy.json
132 |
133 | # Pyre type checker
134 | .pyre/
135 |
136 | # pytype static type analyzer
137 | .pytype/
138 |
139 | # Cython debug symbols
140 | cython_debug/
141 |
142 | # setuptools_scm
143 | upath/_version.py
144 |
145 | # vscode workspace settings
146 | .vscode/
147 |
148 | # mac
149 | **/.DS_Store
150 |
--------------------------------------------------------------------------------
/upath/implementations/smb.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | import warnings
5 | from typing import TYPE_CHECKING
6 | from typing import Any
7 |
8 | from upath.core import UPath
9 | from upath.types import UNSET_DEFAULT
10 | from upath.types import JoinablePathLike
11 | from upath.types import WritablePathLike
12 |
13 | if TYPE_CHECKING:
14 |     from typing import Literal
15 |
16 |     if sys.version_info >= (3, 11):
17 |         from typing import Self
18 |         from typing import Unpack
19 |     else:
20 |         from typing_extensions import Self
21 |         from typing_extensions import Unpack
22 |
23 |     from upath._chain import FSSpecChainParser
24 |     from upath.types.storage_options import SMBStorageOptions
25 |
26 |
27 | class SMBPath(UPath):
28 |     __slots__ = ()
29 |
30 |     if TYPE_CHECKING:
31 |
32 |         def __init__(
33 |             self,
34 |             *args: JoinablePathLike,
35 |             protocol: Literal["smb"] | None = ...,
36 |             chain_parser: FSSpecChainParser = ...,
37 |             **storage_options: Unpack[SMBStorageOptions],
38 |         ) -> None: ...
39 |
40 |     @property
41 |     def path(self) -> str:
42 |         path = super().path
43 |         if len(path) > 1:
44 |             return path.removesuffix("/")
45 |         return path
46 |
47 |     def __str__(self) -> str:
48 |         path_str = super().__str__()
49 |         if path_str.startswith("smb:///"):
50 |             return path_str.removesuffix("/")
51 |         return path_str
52 |
53 |     def mkdir(
54 |         self,
55 |         mode: int = 0o777,
56 |         parents: bool = False,
57 |         exist_ok: bool = False,
58 |     ) -> None:
59 |         # smbclient does not support setting mode externally
60 |         from smbprotocol.exceptions import SMBOSError
61 |
62 |         if parents and not exist_ok and self.exists():
63 |             raise FileExistsError(str(self))
64 |         try:
65 |             self.fs.mkdir(
66 |                 self.path,
67 |                 create_parents=parents,
68 |             )
69 |         except SMBOSError:
70 |             if not exist_ok:
71 |                 raise FileExistsError(str(self))
72 |             if not self.is_dir():
73 |                 raise FileExistsError(str(self))
74 |
75 |     def rename(
76 |         self,
77 |         target: WritablePathLike,
78 |         *,
79 |         recursive: bool = UNSET_DEFAULT,
80 |         maxdepth: int | None = UNSET_DEFAULT,
81 |         **kwargs: Any,
82 |     ) -> Self:
83 |         if recursive is not UNSET_DEFAULT:
84 |             warnings.warn(
85 |                 "SMBPath.rename(): recursive is currently ignored.",
86 |                 UserWarning,
87 |                 stacklevel=2,
88 |             )
89 |         if maxdepth is not UNSET_DEFAULT:
90 |             warnings.warn(
91 |                 "SMBPath.rename(): maxdepth is currently ignored.",
92 |                 UserWarning,
93 |                 stacklevel=2,
94 |             )
95 |         return super().rename(target, **kwargs)
96 |
--------------------------------------------------------------------------------
/CONTRIBUTING.rst:
--------------------------------------------------------------------------------
1 | Contributor Guide
2 | =================
3 |
4 | Thank you for your interest in improving this project.
5 | This project is open-source under the `MIT license`_ and
6 | welcomes contributions in the form of bug reports, feature requests, and pull requests.
7 |
8 | Here is a list of important resources for contributors:
9 |
10 | - `Source Code`_
11 | - `Issue Tracker`_
12 | - `Code of Conduct`_
13 |
14 | .. _MIT license: https://opensource.org/licenses/MIT
15 | .. _Source Code: https://github.com/fsspec/universal_pathlib
16 | .. _Issue Tracker: https://github.com/fsspec/universal_pathlib/issues
17 |
18 | How to report a bug
19 | -------------------
20 |
21 | Report bugs on the `Issue Tracker`_.
22 |
23 | When filing an issue, make sure to answer these questions:
24 |
25 | - Which operating system and Python version are you using?
26 | - Which version of this project are you using?
27 | - What did you do?
28 | - What did you expect to see?
29 | - What did you see instead?
30 |
31 | The best way to get your bug fixed is to provide a test case,
32 | and/or steps to reproduce the issue.
33 |
34 |
35 | How to request a feature
36 | ------------------------
37 |
38 | Request features on the `Issue Tracker`_.
39 |
40 |
41 | How to set up your development environment
42 | ------------------------------------------
43 |
44 | You need Python 3.9+ and the following tools:
45 |
46 | - Nox_
47 |
48 | Install the package with development requirements:
49 |
50 | .. code:: console
51 |
52 |    $ pip install nox
53 |
54 | .. _Nox: https://nox.thea.codes/
55 |
56 |
57 | How to test the project
58 | -----------------------
59 |
60 | Run the full test suite:
61 |
62 | .. code:: console
63 |
64 |    $ nox
65 |
66 | List the available Nox sessions:
67 |
68 | .. code:: console
69 |
70 |    $ nox --list-sessions
71 |
72 | You can also run a specific Nox session.
73 | For example, invoke the unit test suite like this:
74 |
75 | .. code:: console
76 |
77 |    $ nox --session=tests
78 |
79 | Unit tests are located in the ``tests`` directory,
80 | and are written using the pytest_ testing framework.
81 |
82 | .. _pytest: https://pytest.readthedocs.io/
83 |
84 |
85 | How to submit changes
86 | ---------------------
87 |
88 | Open a `pull request`_ to submit changes to this project.
89 |
90 | Your pull request needs to meet the following guidelines for acceptance:
91 |
92 | - The Nox test suite must pass without errors and warnings.
93 | - Include unit tests. This project maintains 100% code coverage.
94 | - If your changes add functionality, update the documentation accordingly.
95 |
96 | Feel free to submit early, though; we can always iterate on this.
97 |
98 | To run linting and code formatting checks, invoke the ``lint`` session in Nox:
99 |
100 | .. code:: console
101 |
102 |    $ nox -s lint
103 |
104 | It is recommended to open an issue before starting work on anything.
105 | This will allow a chance to talk it over with the owners and validate your approach.
106 |
107 | .. _pull request: https://github.com/fsspec/universal_pathlib/pulls
108 | .. github-only
109 | .. _Code of Conduct: CODE_OF_CONDUCT.rst
110 |
--------------------------------------------------------------------------------
/upath/implementations/webdav.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from collections.abc import Mapping
5 | from collections.abc import Sequence
6 | from typing import TYPE_CHECKING
7 | from typing import Any
8 | from urllib.parse import urlsplit
9 |
10 | from fsspec.registry import known_implementations
11 | from fsspec.registry import register_implementation
12 |
13 | from upath.core import UPath
14 | from upath.types import JoinablePathLike
15 |
16 | if TYPE_CHECKING:
17 |     from typing import Literal
18 |
19 |     if sys.version_info >= (3, 11):
20 |         from typing import Unpack
21 |     else:
22 |         from typing_extensions import Unpack
23 |
24 |     from upath._chain import FSSpecChainParser
25 |     from upath.types.storage_options import WebdavStorageOptions
26 |
27 | __all__ = ["WebdavPath"]
28 |
29 | # webdav was only registered in fsspec>=2022.5.0
30 | if "webdav" not in known_implementations:
31 |     import webdav4.fsspec
32 |
33 |     register_implementation("webdav", webdav4.fsspec.WebdavFileSystem)
34 |
35 |
36 | class WebdavPath(UPath):
37 |     __slots__ = ()
38 |
39 |     if TYPE_CHECKING:
40 |
41 |         def __init__(
42 |             self,
43 |             *args: JoinablePathLike,
44 |             protocol: Literal["webdav+http", "webdav+https"] | None = ...,
45 |             chain_parser: FSSpecChainParser = ...,
46 |             **storage_options: Unpack[WebdavStorageOptions],
47 |         ) -> None: ...
48 |
49 |     @classmethod
50 |     def _transform_init_args(
51 |         cls,
52 |         args: tuple[JoinablePathLike, ...],
53 |         protocol: str,
54 |         storage_options: dict[str, Any],
55 |     ) -> tuple[tuple[JoinablePathLike, ...], str, dict[str, Any]]:
56 |         if not args:
57 |             args = ("/",)
58 |         elif args and protocol in {"webdav+http", "webdav+https"}:
59 |             args0, *argsN = args
60 |             url = urlsplit(str(args0))
61 |             base = url._replace(scheme=protocol.split("+")[1], path="").geturl()
62 |             args0 = url._replace(scheme="", netloc="").geturl() or "/"
63 |             storage_options["base_url"] = base
64 |             args = (args0, *argsN)
65 |         if "base_url" not in storage_options:
66 |             raise ValueError(
67 |                 f"must provide `base_url` storage option for args: {args!r}"
68 |             )
69 |         return super()._transform_init_args(args, "webdav", storage_options)
70 |
71 |     @classmethod
72 |     def _parse_storage_options(
73 |         cls,
74 |         urlpath: str,
75 |         protocol: str,
76 |         storage_options: Mapping[str, Any],
77 |     ) -> dict[str, Any]:
78 |         so = dict(storage_options)
79 |         if urlpath.startswith(("webdav+http:", "webdav+https:")):
80 |             url = urlsplit(str(urlpath))
81 |             base = url._replace(scheme=url.scheme.split("+")[1], path="").geturl()
82 |             urlpath = url._replace(scheme="", netloc="").geturl() or "/"
83 |             so.setdefault("base_url", base)
84 |         return super()._parse_storage_options(urlpath, "webdav", so)
85 |
86 |     @property
87 |     def parts(self) -> Sequence[str]:
88 |         parts = super().parts
89 |         if parts and parts[0] == "/":
90 |             return parts[1:]
91 |         else:
92 |             return parts
93 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_hf.py:
--------------------------------------------------------------------------------
1 | import pytest
2 | from fsspec import get_filesystem_class
3 |
4 | from upath import UPath
5 | from upath.implementations.cloud import HfPath
6 |
7 | from ..cases import BaseTests
8 |
9 | try:
10 |     get_filesystem_class("hf")
11 | except ImportError:
12 |     pytestmark = pytest.mark.skip
13 |
14 |
15 | def test_hfpath():
16 |     path = UPath("hf://HuggingFaceTB/SmolLM2-135M")
17 |     assert isinstance(path, HfPath)
18 |     try:
19 |         assert path.exists()
20 |     except AssertionError:
21 |         from httpx import ConnectError
22 |         from huggingface_hub import HfApi
23 |
24 |         try:
25 |             HfApi().repo_info("HuggingFaceTB/SmolLM2-135M")
26 |         except ConnectError:
27 |             pytest.xfail("No internet connection")
28 |         except Exception as err:
29 |             if "Service Unavailable" in str(err):
30 |                 pytest.xfail("HuggingFace API not reachable")
31 |         raise
32 |
33 |
34 | class TestUPathHf(BaseTests):
35 |     @pytest.fixture(autouse=True, scope="function")
36 |     def path(self, hf_fixture_with_readonly_mocked_hf_api):
37 |         self.path = UPath(hf_fixture_with_readonly_mocked_hf_api)
38 |
39 |     @pytest.mark.skip
40 |     def test_mkdir(self):
41 |         pass
42 |
43 |     @pytest.mark.skip
44 |     def test_mkdir_exists_ok_false(self):
45 |         pass
46 |
47 |     @pytest.mark.skip
48 |     def test_mkdir_exists_ok_true(self):
49 |         pass
50 |
51 |     @pytest.mark.skip
52 |     def test_mkdir_parents_true_exists_ok_true(self):
53 |         pass
54 |
55 |     @pytest.mark.skip
56 |     def test_mkdir_parents_true_exists_ok_false(self):
57 |         pass
58 |
59 |     @pytest.mark.skip
60 |     def test_makedirs_exist_ok_true(self):
61 |         pass
62 |
63 |     @pytest.mark.skip
64 |     def test_makedirs_exist_ok_false(self):
65 |         pass
66 |
67 |     @pytest.mark.skip
68 |     def test_touch(self):
69 |         pass
70 |
71 |     @pytest.mark.skip
72 |     def test_touch_unlink(self):
73 |         pass
74 |
75 |     @pytest.mark.skip
76 |     def test_write_bytes(self, pathlib_base):
77 |         pass
78 |
79 |     @pytest.mark.skip
80 |     def test_write_text(self, pathlib_base):
81 |         pass
82 |
83 |     def test_fsspec_compat(self):
84 |         pass
85 |
86 |     def test_rename(self):
87 |         pass
88 |
89 |     def test_rename2(self):
90 |         pass
91 |
92 |     def test_move_local(self, tmp_path):
93 |         pass
94 |
95 |     def test_move_into_local(self, tmp_path):
96 |         pass
97 |
98 |     def test_move_memory(self, clear_fsspec_memory_cache):
99 |         pass
100 |
101 |     def test_move_into_memory(self, clear_fsspec_memory_cache):
102 |         pass
103 |
104 |     @pytest.mark.skip(reason="HfPath does not support listing repositories")
105 |     def test_iterdir(self, local_testdir):
106 |         pass
107 |
108 |     @pytest.mark.skip(reason="HfPath does not support listing repositories")
109 |     def test_iterdir2(self, local_testdir):
110 |         pass
111 |
112 |     @pytest.mark.skip(reason="HfPath does not currently test write")
113 |     def test_rename_with_target_absolute(self, target_factory):
114 |         return super().test_rename_with_target_absolute(target_factory)
115 |
--------------------------------------------------------------------------------
/upath/tests/test_pydantic.py:
--------------------------------------------------------------------------------
1 | import json
2 | from os.path import abspath
3 |
4 | import pydantic
5 | import pydantic_core
6 | import pytest
7 | from fsspec.implementations.http import get_client
8 |
9 | from upath import UPath
10 |
11 |
12 | @pytest.mark.parametrize(
13 |     "path",
14 |     [
15 |         "/abc",
16 |         "file:///abc",
17 |         "memory://abc",
18 |         "s3://bucket/key",
19 |         "https://www.example.com",
20 |     ],
21 | )
22 | @pytest.mark.parametrize("source", ["json", "python"])
23 | def test_validate_from_str(path, source):
24 |     expected = UPath(path)
25 |
26 |     ta = pydantic.TypeAdapter(UPath)
27 |     if source == "json":
28 |         actual = ta.validate_json(json.dumps(path))
29 |     else:  # source == "python"
30 |         actual = ta.validate_python(path)
31 |
32 |     assert abspath(actual.path) == abspath(expected.path)
33 |     assert actual.protocol == expected.protocol
34 |
35 |
36 | @pytest.mark.parametrize(
37 |     "dct",
38 |     [
39 |         {
40 |             "path": "/my/path",
41 |             "protocol": "file",
42 |             "storage_options": {"foo": "bar", "baz": 3},
43 |         }
44 |     ],
45 | )
46 | @pytest.mark.parametrize("source", ["json", "python"])
47 | def test_validate_from_dict(dct, source):
48 |     ta = pydantic.TypeAdapter(UPath)
49 |     if source == "json":
50 |         output = ta.validate_json(json.dumps(dct))
51 |     else:  # source == "python"
52 |         output = ta.validate_python(dct)
53 |
54 |     assert abspath(output.path) == abspath(dct["path"])
55 |     assert output.protocol == dct["protocol"]
56 |     assert output.storage_options == dct["storage_options"]
57 |
58 |
59 | @pytest.mark.parametrize(
60 |     "path",
61 |     [
62 |         "/abc",
63 |         "file:///abc",
64 |         "memory://abc",
65 |         "s3://bucket/key",
66 |         "https://www.example.com",
67 |     ],
68 | )
69 | def test_validate_from_instance(path):
70 |     input = UPath(path)
71 |
72 |     output = pydantic.TypeAdapter(UPath).validate_python(input)
73 |
74 |     assert output is input
75 |
76 |
77 | @pytest.mark.parametrize(
78 |     ("args", "kwargs"),
79 |     [
80 |         (
81 |             ("/my/path",),
82 |             {
83 |                 "protocol": "file",
84 |                 "foo": "bar",
85 |                 "baz": 3,
86 |             },
87 |         )
88 |     ],
89 | )
90 | @pytest.mark.parametrize("mode", ["json", "python"])
91 | def test_dump(args, kwargs, mode):
92 |     u = UPath(*args, **kwargs)
93 |
94 |     output = pydantic.TypeAdapter(UPath).dump_python(u, mode=mode)
95 |
96 |     assert output["path"] == u.path
97 |     assert output["protocol"] == u.protocol
98 |     assert output["storage_options"] == u.storage_options
99 |
100 |
101 | def test_dump_non_serializable_python():
102 |     output = pydantic.TypeAdapter(UPath).dump_python(
103 |         UPath("https://www.example.com", get_client=get_client), mode="python"
104 |     )
105 |
106 |     assert output["storage_options"]["get_client"] is get_client
107 |
108 |
109 | def test_dump_non_serializable_json():
110 |     with pytest.raises(pydantic_core.PydanticSerializationError, match="unknown type"):
111 |         pydantic.TypeAdapter(UPath).dump_python(
112 |             UPath("https://www.example.com", get_client=get_client), mode="json"
113 |         )
114 |
115 |
116 | def test_json_schema():
117 |     ta = pydantic.TypeAdapter(UPath)
118 |     ta.json_schema()
119 |
--------------------------------------------------------------------------------
/upath/tests/test_stat.py:
--------------------------------------------------------------------------------
1 | import os
2 | from datetime import datetime
3 | from datetime import timezone
4 |
5 | import pytest
6 |
7 | import upath
8 |
9 |
10 | @pytest.fixture
11 | def pth_file(tmp_path):
12 |     f = tmp_path.joinpath("abc.txt")
13 |     f.write_bytes(b"a")
14 |     p = upath.UPath(f"file://{f.absolute().as_posix()}")
15 |     yield p
16 |
17 |
18 | def test_stat_repr(pth_file):
19 |     assert repr(pth_file.stat()).startswith("UPathStatResult")
20 |
21 |
22 | def test_stat_as_info(pth_file):
23 |     dct = pth_file.stat().as_info()
24 |     assert dct["size"] == pth_file.stat().st_size
25 |
26 |
27 | def test_stat_atime(pth_file):
28 |     atime = pth_file.stat().st_atime
29 |     assert isinstance(atime, (float, int))
30 |
31 |
32 | @pytest.mark.xfail(reason="fsspec does not return 'atime'")
33 | def test_stat_atime_value(pth_file):
34 |     atime = pth_file.stat().st_atime
35 |     assert atime > 0
36 |
37 |
38 | def test_stat_mtime(pth_file):
39 |     mtime = pth_file.stat().st_mtime
40 |     assert isinstance(mtime, (float, int))
41 |
42 |
43 | def test_stat_mtime_value(pth_file):
44 |     mtime = pth_file.stat().st_mtime
45 |     assert mtime > 0
46 |
47 |
48 | def test_stat_ctime(pth_file):
49 |     ctime = pth_file.stat().st_ctime
50 |     assert isinstance(ctime, (float, int))
51 |
52 |
53 | @pytest.mark.xfail(reason="fsspec returns 'created' but not 'ctime'")
54 | def test_stat_ctime_value(pth_file):
55 |     ctime = pth_file.stat().st_ctime
56 |     assert ctime > 0
57 |
58 |
59 | def test_stat_birthtime(pth_file):
60 |     birthtime = pth_file.stat().st_birthtime
61 |     assert isinstance(birthtime, (float, int))
62 |
63 |
64 | def test_stat_birthtime_value(pth_file):
65 |     birthtime = pth_file.stat().st_birthtime
66 |     assert birthtime > 0
67 |
68 |
69 | def test_stat_seq_interface(pth_file):
70 |     assert len(tuple(pth_file.stat())) == os.stat_result.n_sequence_fields
71 |     assert isinstance(pth_file.stat().index(0), int)
72 |     assert isinstance(pth_file.stat().count(0), int)
73 |     assert isinstance(pth_file.stat()[0], int)
74 |
75 |
76 | def test_stat_warn_if_dict_interface(pth_file):
77 |     with pytest.warns(DeprecationWarning):
78 |         pth_file.stat().keys()
79 |
80 |     with pytest.warns(DeprecationWarning):
81 |         pth_file.stat().items()
82 |
83 |     with pytest.warns(DeprecationWarning):
84 |         pth_file.stat().values()
85 |
86 |     with pytest.warns(DeprecationWarning):
87 |         pth_file.stat().get("size")
88 |
89 |     with pytest.warns(DeprecationWarning):
90 |         pth_file.stat().copy()
91 |
92 |     with pytest.warns(DeprecationWarning):
93 |         _ = pth_file.stat()["size"]
94 |
95 |
96 | @pytest.mark.parametrize(
97 |     "timestamp",
98 |     [
99 |         10,
100 |         datetime(1970, 1, 1, 0, 0, 10, tzinfo=timezone.utc),
101 |         "1970-01-01T00:00:10Z",
102 |         "1970-01-01T00:00:10+00:00",
103 |     ],
104 | )
105 | def test_timestamps(timestamp):
106 |     from upath._stat import UPathStatResult
107 |
108 |     s = UPathStatResult(
109 |         [0] * 10,
110 |         {
111 |             "ctime": timestamp,
112 |             "atime": timestamp,
113 |             "mtime": timestamp,
114 |             "created": timestamp,
115 |         },
116 |     )
117 |     assert s.st_atime == 10.0
118 |     assert s.st_ctime == 10.0
119 |     assert s.st_mtime == 10.0
120 |
121 |
122 | def test_bad_timestamp():
123 |     from upath._stat import UPathStatResult
124 |
125 |     with (
126 |         pytest.raises(TypeError),
127 |         pytest.warns(RuntimeWarning, match="universal_pathlib/issues"),
128 |     ):
129 |         s = UPathStatResult([0] * 10, {"ctime": "bad"})
130 |         _ = s.st_ctime
131 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_tar.py:
--------------------------------------------------------------------------------
1 | import tarfile
2 |
3 | import pytest
4 |
5 | from upath import UPath
6 | from upath.implementations.tar import TarPath
7 |
8 | from ..cases import BaseTests
9 |
10 |
11 | @pytest.fixture(scope="function")
12 | def tarred_testdir_file(local_testdir, tmp_path_factory):
13 |     base = tmp_path_factory.mktemp("tarpath")
14 |     tar_path = base / "test.tar"
15 |     with tarfile.TarFile(tar_path, "w") as tf:
16 |         tf.add(local_testdir, arcname="", recursive=True)
17 |     return str(tar_path)
18 |
19 |
20 | class TestTarPath(BaseTests):
21 |
22 |     @pytest.fixture(autouse=True)
23 |     def path(self, tarred_testdir_file):
24 |         self.path = UPath("tar://", fo=tarred_testdir_file)
25 |         # self.prepare_file_system() done outside of UPath
26 |
27 |     def test_is_TarPath(self):
28 |         assert isinstance(self.path, TarPath)
29 |
30 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
31 |     def test_mkdir(self):
32 |         pass
33 |
34 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
35 |     def test_mkdir_exists_ok_false(self):
36 |         pass
37 |
38 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
39 |     def test_mkdir_parents_true_exists_ok_false(self):
40 |         pass
41 |
42 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
43 |     def test_rename(self):
44 |         pass
45 |
46 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
47 |     def test_rename2(self):
48 |         pass
49 |
50 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
51 |     def test_touch(self):
52 |         pass
53 |
54 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
55 |     def test_touch_unlink(self):
56 |         pass
57 |
58 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
59 |     def test_write_bytes(self):
60 |         pass
61 |
62 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
63 |     def test_write_text(self):
64 |         pass
65 |
66 |     @pytest.mark.skip(reason="Tar filesystem is read-only")
67 |     def test_fsspec_compat(self):
68 |         pass
69 |
70 |     @pytest.mark.skip(reason="Only testing read on TarPath")
71 |     def test_move_local(self, tmp_path):
72 |         pass
73 |
74 |     @pytest.mark.skip(reason="Only testing read on TarPath")
75 |     def test_move_into_local(self, tmp_path):
76 |         pass
77 |
78 |     @pytest.mark.skip(reason="Only testing read on TarPath")
79 |     def test_move_memory(self, clear_fsspec_memory_cache):
80 |         pass
81 |
82 |     @pytest.mark.skip(reason="Only testing read on TarPath")
83 |     def test_move_into_memory(self, clear_fsspec_memory_cache):
84 |         pass
85 |
86 |     @pytest.mark.skip(reason="Only testing read on TarPath")
87 |     def test_rename_with_target_absolute(self, target_factory):
88 |         return super().test_rename_with_target_absolute(target_factory)
89 |
90 |     @pytest.mark.skip(reason="Only testing read on TarPath")
91 |     def test_write_text_encoding(self):
92 |         return super().test_write_text_encoding()
93 |
94 |     @pytest.mark.skip(reason="Only testing read on TarPath")
95 |     def test_write_text_errors(self):
96 |         return super().test_write_text_errors()
97 |
98 |
99 | @pytest.fixture(scope="function")
100 | def tarred_testdir_file_in_memory(tarred_testdir_file, clear_fsspec_memory_cache):
101 |     p = UPath(tarred_testdir_file, protocol="file")
102 |     t = p.move(UPath("memory:///mytarfile.tar"))
103 |     assert t.protocol == "memory"
104 |     assert t.exists()
105 |     yield t.as_uri()
106 |
107 |
108 | class TestChainedTarPath(TestTarPath):
109 |
110 |     @pytest.fixture(autouse=True)
111 |     def path(self, tarred_testdir_file_in_memory):
112 |         self.path = UPath("tar://::memory:///mytarfile.tar")
113 |
--------------------------------------------------------------------------------
/upath/tests/test_chain.py:
--------------------------------------------------------------------------------
1 | import os
2 | from pathlib import Path
3 |
4 | import pytest
5 | from fsspec.implementations.memory import MemoryFileSystem
6 |
7 | from upath import UPath
8 | from upath._chain import FSSpecChainParser
9 |
10 |
11 | @pytest.mark.parametrize(
12 |     "urlpath,expected",
13 |     [
14 |         ("simplecache::file://tmp", "simplecache"),
15 |         ("zip://file.txt::file://tmp.zip", "zip"),
16 |     ],
17 | )
18 | def test_chaining_upath_protocol(urlpath, expected):
19 |     pth = UPath(urlpath)
20 |     assert pth.protocol == expected
21 |
22 |
23 | def add_current_drive_on_windows(pth: str) -> str:
24 |     drive = os.path.splitdrive(Path.cwd().as_posix())[0]
25 |     return f"{drive}{pth}"
26 |
27 |
28 | @pytest.mark.parametrize(
29 |     "urlpath,expected",
30 |     [
31 |         pytest.param(
32 |             "simplecache::file:///tmp",
33 |             add_current_drive_on_windows("/tmp"),
34 |         ),
35 |         pytest.param(
36 |             "zip://file.txt::file:///tmp.zip",
37 |             "file.txt",
38 |         ),
39 |         pytest.param(
40 |             "zip://a/b/c.txt::simplecache::memory://zipfile.zip",
41 |             "a/b/c.txt",
42 |         ),
43 |     ],
44 | )
45 | def test_chaining_upath_path(urlpath, expected):
46 |     pth = UPath(urlpath)
47 |     assert pth.path == expected
48 |
49 |
50 | @pytest.mark.parametrize(
51 |     "urlpath,expected",
52 |     [
53 |         (
54 |             "simplecache::file:///tmp",
55 |             {
56 |                 "target_protocol": "file",
57 |                 "target_options": {},
58 |             },
59 |         ),
60 |     ],
61 | )
62 | def test_chaining_upath_storage_options(urlpath, expected):
63 |     pth = UPath(urlpath)
64 |     assert dict(pth.storage_options) == expected
65 |
66 |
67 | @pytest.mark.parametrize(
68 |     "urlpath,expected",
69 |     [
70 |         ("simplecache::memory://tmp", ("/", "tmp")),
71 |     ],
72 | )
73 | def test_chaining_upath_parts(urlpath, expected):
74 |     pth = UPath(urlpath)
75 |     assert pth.parts == expected
76 |
77 |
78 | @pytest.mark.parametrize(
79 |     "urlpath,expected",
80 |     [
81 |         ("simplecache::memory:///tmp", "simplecache::memory:///tmp"),
82 |     ],
83 | )
84 | def test_chaining_upath_str(urlpath, expected):
85 |     pth = UPath(urlpath)
86 |     assert str(pth) == expected
87 |
88 |
89 | @pytest.fixture
90 | def clear_memory_fs():
91 |     fs = MemoryFileSystem()
92 |     store = dict(fs.store)  # snapshot a copy, not an alias of the live dict
93 |     pseudo_dirs = list(fs.pseudo_dirs)  # likewise copy the live list
94 |     try:
95 |         yield fs
96 |     finally:
97 |         fs.store.clear()
98 |         fs.store.update(store)
99 |         fs.pseudo_dirs[:] = pseudo_dirs
100 |
101 |
102 | @pytest.fixture
103 | def memory_file_urlpath(clear_memory_fs):
104 |     fs = clear_memory_fs
105 |     fs.pipe_file("/abc/file.txt", b"hello world")
106 |     yield fs.unstrip_protocol("/abc/file.txt")
107 |
108 |
109 | def test_read_file(memory_file_urlpath):
110 |     pth = UPath(f"simplecache::{memory_file_urlpath}")
111 |     assert pth.read_bytes() == b"hello world"
112 |
113 |
114 | def test_write_file(clear_memory_fs):
115 |     pth = UPath("simplecache::memory://abc.txt")
116 |     pth.write_bytes(b"hello world")
117 |     assert clear_memory_fs.cat_file("abc.txt") == b"hello world"
118 |
119 |
120 | @pytest.mark.parametrize(
121 |     "urlpath",
122 |     [
123 |         "memory:///file.txt",
124 |         "simplecache::memory:///tmp",
125 |         "zip://file.txt::memory:///tmp.zip",
126 |         "zip://a/b/c.txt::simplecache::memory:///zipfile.zip",
127 |         "simplecache::zip://a/b/c.txt::tar://blah.zip::memory:///file.tar",
128 |     ],
129 | )
130 | def test_chain_parser_roundtrip(urlpath: str):
131 |     parser = FSSpecChainParser()
132 |     segments = parser.unchain(urlpath, protocol=None, storage_options={})
133 |     rechained, kw = parser.chain(segments)
134 |     assert rechained == urlpath
135 |     assert kw == {}
136 |
--------------------------------------------------------------------------------
/upath/types/__init__.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import enum
4 | import sys
5 | from os import PathLike
6 | from typing import TYPE_CHECKING
7 | from typing import Any
8 | from typing import Protocol
9 | from typing import Union
10 | from typing import runtime_checkable
11 |
12 | from upath.types._abc import JoinablePath
13 | from upath.types._abc import PathInfo
14 | from upath.types._abc import PathParser
15 | from upath.types._abc import ReadablePath
16 | from upath.types._abc import WritablePath
17 |
18 | if TYPE_CHECKING:
19 |
20 | if sys.version_info >= (3, 12):
21 | from typing import TypeAlias
22 | else:
23 | from typing_extensions import TypeAlias
24 |
25 | __all__ = [
26 | "JoinablePath",
27 | "ReadablePath",
28 | "WritablePath",
29 | "JoinablePathLike",
30 | "ReadablePathLike",
31 | "WritablePathLike",
32 | "SupportsPathLike",
33 | "PathInfo",
34 | "StatResultType",
35 | "PathParser",
36 | "UPathParser",
37 | "UNSET_DEFAULT",
38 | ]
39 |
40 |
41 | class VFSPathLike(Protocol):
42 | def __vfspath__(self) -> str: ...
43 |
44 |
45 | SupportsPathLike: TypeAlias = Union[VFSPathLike, PathLike[str]]
46 | JoinablePathLike: TypeAlias = Union[JoinablePath, SupportsPathLike, str]
47 | ReadablePathLike: TypeAlias = Union[ReadablePath, SupportsPathLike, str]
48 | WritablePathLike: TypeAlias = Union[WritablePath, SupportsPathLike, str]
49 |
50 |
51 | class _DefaultValue(enum.Enum):
52 | UNSET = enum.auto()
53 |
54 |
55 | UNSET_DEFAULT: Any = _DefaultValue.UNSET
56 |
57 | # We can't assume this, because pathlib_abc==0.5.1 is ahead of stdlib 3.14
58 | # if sys.version_info >= (3, 14):
59 | # JoinablePath.register(pathlib.PurePath)
60 | # ReadablePath.register(pathlib.Path)
61 | # WritablePath.register(pathlib.Path)
62 |
63 |
64 | @runtime_checkable
65 | class StatResultType(Protocol):
66 | """duck-type for os.stat_result"""
67 |
68 | @property
69 | def st_mode(self) -> int: ...
70 | @property
71 | def st_ino(self) -> int: ...
72 | @property
73 | def st_dev(self) -> int: ...
74 | @property
75 | def st_nlink(self) -> int: ...
76 | @property
77 | def st_uid(self) -> int: ...
78 | @property
79 | def st_gid(self) -> int: ...
80 | @property
81 | def st_size(self) -> int: ...
82 | @property
83 | def st_atime(self) -> float: ...
84 | @property
85 | def st_mtime(self) -> float: ...
86 | @property
87 | def st_ctime(self) -> float: ...
88 | @property
89 | def st_atime_ns(self) -> int: ...
90 | @property
91 | def st_mtime_ns(self) -> int: ...
92 | @property
93 | def st_ctime_ns(self) -> int: ...
94 |
95 | # st_birthtime is available on Windows (3.12+), FreeBSD, and macOS
96 | # On Linux it's currently unavailable
97 | # see: https://discuss.python.org/t/st-birthtime-not-available/104350/2
98 | if (sys.platform == "win32" and sys.version_info >= (3, 12)) or (
99 | sys.platform == "darwin" or sys.platform.startswith("freebsd")
100 | ):
101 |
102 | @property
103 | def st_birthtime(self) -> float: ...
104 |
105 |
106 | @runtime_checkable
107 | class UPathParser(PathParser, Protocol):
108 | """duck-type for upath.core.UPathParser"""
109 |
110 | def split(self, path: JoinablePathLike) -> tuple[str, str]: ...
111 | def splitext(self, path: JoinablePathLike) -> tuple[str, str]: ...
112 | def normcase(self, path: JoinablePathLike) -> str: ...
113 |
114 | def strip_protocol(self, path: JoinablePathLike) -> str: ...
115 |
116 | def join(
117 | self,
118 | path: JoinablePathLike,
119 | *paths: JoinablePathLike,
120 | ) -> str: ...
121 |
122 | def isabs(self, path: JoinablePathLike) -> bool: ...
123 |
124 | def splitdrive(self, path: JoinablePathLike) -> tuple[str, str]: ...
125 |
126 | def splitroot(self, path: JoinablePathLike) -> tuple[str, str, str]: ...
127 |
--------------------------------------------------------------------------------
/mkdocs.yml:
--------------------------------------------------------------------------------
1 | site_name: Universal Pathlib
2 | site_description: A universal pathlib implementation for Python
3 | strict: true
4 | # site_url: !ENV READTHEDOCS_CANONICAL_URL
5 |
6 | theme:
7 | name: 'material'
8 | logo: assets/logo-128x128-white.svg
9 | favicon: 'assets/favicon.png'
10 | palette:
11 | - media: "(prefers-color-scheme: light)"
12 | toggle:
13 | icon: material/lightbulb-outline
14 | name: "Switch to dark mode"
15 | - media: "(prefers-color-scheme: dark)"
16 | scheme: slate
17 | toggle:
18 | icon: material/lightbulb
19 | name: "Switch to light mode"
20 | features:
21 | # - content.tabs.link
22 | - content.code.annotate
23 | - content.code.copy
24 | - announce.dismiss
25 | - navigation.tabs
26 |
27 | extra_css:
28 | - css/extra.css
29 |
30 | repo_name: fsspec/universal_pathlib
31 | repo_url: https://github.com/fsspec/universal_pathlib
32 | edit_uri: edit/main/docs/
33 |
34 | validation:
35 | omitted_files: warn
36 | absolute_links: warn
37 | unrecognized_links: warn
38 |
39 | nav:
40 | - Home:
41 | - Introduction: index.md
42 | - Why use Universal Pathlib: why.md
43 | - Installation: install.md
44 | - Contributing: contributing.md
45 | - Changelog: changelog.md
46 | - Concepts:
47 | - Overview: concepts/index.md
48 | - Filesystem Spec: concepts/fsspec.md
49 | - Standard Library Pathlib: concepts/pathlib.md
50 | - Universal Pathlib: concepts/upath.md
51 | - Usage:
52 | - Basic Usage: usage.md
53 | - API Reference:
54 | - Core: api/index.md
55 | - Registry: api/registry.md
56 | - Implementations: api/implementations.md
57 | - Extensions: api/extensions.md
58 | - Types: api/types.md
59 | - Migration Guide: migration.md
60 |
61 | markdown_extensions:
62 | - tables
63 | - attr_list
64 | - toc:
65 | permalink: true
66 | title: Page contents
67 | - admonition
68 | - pymdownx.details
69 | - pymdownx.highlight:
70 | pygments_lang_class: true
71 | - pymdownx.extra
72 | - pymdownx.emoji:
73 | emoji_index: !!python/name:material.extensions.emoji.twemoji
74 | emoji_generator: !!python/name:material.extensions.emoji.to_svg
75 | - pymdownx.tasklist:
76 | custom_checkbox: true
77 | - pymdownx.tabbed:
78 | alternate_style: true
79 | - pymdownx.superfences:
80 | custom_fences:
81 | - name: mermaid
82 | class: mermaid
83 | format: !!python/name:pymdownx.superfences.fence_code_format
84 |
85 | watch:
86 | - docs/
87 | - upath/
88 |
89 | plugins:
90 | - search
91 | - mkdocstrings:
92 | handlers:
93 | python:
94 | inventories:
95 | - https://docs.python.org/3/objects.inv
96 | paths: [.]
97 | options:
98 | preload_modules:
99 | - __future__
100 | - typing
101 | - abc
102 | - asyncio
103 | - pathlib
104 | - pathlib_abc
105 | - fsspec
106 | - upath.types._abc
107 | - upath.types
108 | - upath.registry
109 | - upath.core
110 | docstring_style: numpy
111 | docstring_section_style: list
112 | group_by_category: false
113 | members_order: source
114 | docstring_options:
115 | ignore_init_summary: true
116 | docstring_section_style: spacy
117 | merge_init_into_class: true
118 | show_source: true
119 | show_root_heading: true
120 | show_root_toc_entry: true
121 | allow_inspection: true
122 | separate_signature: true
123 | show_signature: true
124 | show_signature_annotations: true
125 | show_signature_type_parameters: true
126 | signature_crossrefs: true
127 | show_symbol_type_heading: true
128 | - exclude:
129 | glob:
130 | - _plugins/*
131 | - __pycache__/*
132 | - tests/*
133 | - test_*.py
134 | hooks:
135 | - docs/_plugins/copy_changelog.py
136 |
--------------------------------------------------------------------------------
/upath/_protocol.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import re
4 | from collections import ChainMap
5 | from pathlib import PurePath
6 | from typing import TYPE_CHECKING
7 | from typing import Any
8 |
9 | from fsspec.registry import known_implementations as _known_implementations
10 | from fsspec.registry import registry as _registry
11 |
12 | if TYPE_CHECKING:
13 | from upath.types import JoinablePathLike
14 |
15 | __all__ = [
16 | "get_upath_protocol",
17 | "normalize_empty_netloc",
18 | "compatible_protocol",
19 | ]
20 |
21 | # Regular expression to match fsspec style protocols.
22 | # Matches single slash usage too for compatibility.
23 | _PROTOCOL_RE = re.compile(
24 | r"^(?P<protocol>[A-Za-z][A-Za-z0-9+]+):(?:(?P<slashes>//?)|:)(?P<path>.*)"
25 | )
26 |
27 | # Matches data URIs
28 | _DATA_URI_RE = re.compile(r"^data:[^,]*,")
29 |
30 |
31 | def _match_protocol(pth: str) -> str:
32 | if m := _PROTOCOL_RE.match(pth):
33 | return m.group("protocol")
34 | elif _DATA_URI_RE.match(pth):
35 | return "data"
36 | return ""
37 |
38 |
39 | _fsspec_registry_map = ChainMap(_registry, _known_implementations)
40 |
41 |
42 | def _fsspec_protocol_equals(p0: str, p1: str) -> bool:
43 | """check if two fsspec protocols are equivalent"""
44 | p0 = p0 or "file"
45 | p1 = p1 or "file"
46 | if p0 == p1:
47 | return True
48 |
49 | try:
50 | o0 = _fsspec_registry_map[p0]
51 | except KeyError:
52 | raise ValueError(f"Protocol not known: {p0!r}")
53 | try:
54 | o1 = _fsspec_registry_map[p1]
55 | except KeyError:
56 | raise ValueError(f"Protocol not known: {p1!r}")
57 |
58 | return o0 == o1
59 |
60 |
61 | def get_upath_protocol(
62 | pth: JoinablePathLike,
63 | *,
64 | protocol: str | None = None,
65 | storage_options: dict[str, Any] | None = None,
66 | ) -> str:
67 | """return the filesystem spec protocol"""
68 | from upath.core import UPath
69 |
70 | if isinstance(pth, str):
71 | pth_protocol = _match_protocol(pth)
72 | elif isinstance(pth, UPath):
73 | pth_protocol = pth.protocol
74 | elif isinstance(pth, PurePath):
75 | pth_protocol = getattr(pth, "protocol", "")
76 | elif hasattr(pth, "__vfspath__"):
77 | pth_protocol = _match_protocol(pth.__vfspath__())
78 | elif hasattr(pth, "__fspath__"):
79 | pth_protocol = _match_protocol(pth.__fspath__())
80 | else:
81 | pth_protocol = _match_protocol(str(pth))
82 | # if storage_options and not protocol and not pth_protocol:
83 | # protocol = "file"
84 | if protocol is None:
85 | return pth_protocol or ""
86 | elif (
87 | protocol
88 | and pth_protocol
89 | and not _fsspec_protocol_equals(pth_protocol, protocol)
90 | ):
91 | raise ValueError(
92 | f"requested protocol {protocol!r} incompatible with {pth_protocol!r}"
93 | )
94 | elif protocol == "" and pth_protocol:
95 | # explicitly requested empty protocol, but path has non-empty protocol
96 | raise ValueError(
97 | f"explicitly requested empty protocol {protocol!r}"
98 | f" incompatible with {pth_protocol!r}"
99 | )
100 | return protocol or pth_protocol or ""
101 |
102 |
103 | def normalize_empty_netloc(pth: str) -> str:
104 | if m := _PROTOCOL_RE.match(pth):
105 | if m.group("slashes") == "/":
106 | protocol = m.group("protocol")
107 | path = m.group("path")
108 | pth = f"{protocol}:///{path}"
109 | return pth
110 |
111 |
112 | def compatible_protocol(
113 | protocol: str,
114 | *args: JoinablePathLike,
115 | ) -> bool:
116 | """check if UPath protocols are compatible"""
117 | from upath.core import UPath
118 |
119 | for arg in args:
120 | if isinstance(arg, UPath) and not arg.is_absolute():
121 | # relative UPath are always compatible
122 | continue
123 | other_protocol = get_upath_protocol(arg)
124 | # consider protocols equivalent if they match up to the first "+"
125 | other_protocol = other_protocol.partition("+")[0]
126 | # protocols: only identical (or empty "") protocols can combine
127 | if other_protocol and not _fsspec_protocol_equals(other_protocol, protocol):
128 | return False
129 | return True
130 |
--------------------------------------------------------------------------------
/docs/api/types.md:
--------------------------------------------------------------------------------
1 | # Types :label:
2 |
3 | The types module provides type hints, protocols, and type aliases for working with UPath
4 | and filesystem operations. This includes abstract base classes, type aliases for path-like
5 | objects, and typed dictionaries for filesystem-specific storage options.
6 |
7 | ## pathlib-abc base classes
8 |
9 | These abstract base classes and protocols are re-exported from [pathlib-abc](https://github.com/barneygale/pathlib-abc).
10 | They define the core path interfaces that stdlib pathlib and UPath implementations conform to.
11 |
12 | ::: upath.types.JoinablePath
13 | options:
14 | heading_level: 3
15 | show_root_heading: true
16 | show_root_full_path: false
17 | members: false
18 | show_bases: true
19 |
20 | ::: upath.types.ReadablePath
21 | options:
22 | heading_level: 3
23 | show_root_heading: true
24 | show_root_full_path: false
25 | members: false
26 | show_bases: true
27 |
28 | ::: upath.types.WritablePath
29 | options:
30 | heading_level: 3
31 | show_root_heading: true
32 | show_root_full_path: false
33 | members: false
34 | show_bases: true
35 |
36 | ::: upath.types.PathInfo
37 | options:
38 | heading_level: 3
39 | show_root_heading: true
40 | show_root_full_path: false
41 | members: false
42 | show_bases: true
43 |
44 | ::: upath.types.PathParser
45 | options:
46 | heading_level: 3
47 | show_root_heading: true
48 | show_root_full_path: false
49 | members: false
50 | show_bases: true
51 |
52 | ---
53 |
54 | ## UPath specific protocols
55 |
56 | ::: upath.types.UPathParser
57 | options:
58 | heading_level: 3
59 | show_root_heading: true
60 | show_root_full_path: false
61 | members: false
62 | show_bases: true
63 |
64 | ---
65 |
66 | ## Type Aliases
67 |
68 | Convenient type aliases for path-like objects used throughout UPath.
69 |
70 | ::: upath.types.JoinablePathLike
71 | options:
72 | heading_level: 3
73 | show_root_heading: true
74 | show_root_full_path: false
75 |
76 | Union of types that can be joined as path segments.
77 |
78 | ::: upath.types.ReadablePathLike
79 | options:
80 | heading_level: 3
81 | show_root_heading: true
82 | show_root_full_path: false
83 |
84 | Union of types that can be read from.
85 |
86 | ::: upath.types.WritablePathLike
87 | options:
88 | heading_level: 3
89 | show_root_heading: true
90 | show_root_full_path: false
91 |
92 | Union of types that can be written to.
93 |
94 | ::: upath.types.SupportsPathLike
95 | options:
96 | heading_level: 3
97 | show_root_heading: true
98 | show_root_full_path: false
99 |
100 | Union of objects that support `__fspath__()` or `__vfspath__()` protocols.
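
As a rough illustration of what this alias accepts, the sketch below uses only the standard library. `LocalThing`, `VirtualThing`, and `to_path_string` are hypothetical names introduced for this example and are not part of `upath`; the lookup order (`__vfspath__` first, then `__fspath__`) is an assumption chosen to mirror how protocol detection is ordered elsewhere in the package.

```python
import os


class LocalThing:
    """Hypothetical object implementing the stdlib os.PathLike protocol."""

    def __fspath__(self) -> str:
        return "/tmp/example.txt"


class VirtualThing:
    """Hypothetical object exposing only the __vfspath__ protocol."""

    def __vfspath__(self) -> str:
        return "memory:///example.txt"


def to_path_string(obj) -> str:
    """Illustrative helper (not a upath API): prefer __vfspath__, then __fspath__."""
    if hasattr(obj, "__vfspath__"):
        return obj.__vfspath__()
    return os.fspath(obj)


local_str = to_path_string(LocalThing())
virtual_str = to_path_string(VirtualThing())
```

Either object can be passed where a `SupportsPathLike` is expected; plain strings are covered separately by the `*PathLike` unions above.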
101 |
102 | ::: upath.types.StatResultType
103 | options:
104 | heading_level: 3
105 | show_root_heading: true
106 | show_root_full_path: false
107 | members: false
108 |
109 | Protocol for `os.stat_result`-like objects.
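
To show how a runtime-checkable protocol of this kind behaves, here is a minimal sketch using a reduced stand-in. `HasSizeAndMtime` is a hypothetical protocol defined only for this example; it is not `StatResultType` itself and checks just two attributes.

```python
import os
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasSizeAndMtime(Protocol):
    """Hypothetical, reduced stand-in for a stat-result protocol."""

    @property
    def st_size(self) -> int: ...
    @property
    def st_mtime(self) -> float: ...


result = os.stat(os.curdir)

# isinstance() on a runtime-checkable protocol only verifies that the
# attributes exist; it does not validate their types.
is_stat_like = isinstance(result, HasSizeAndMtime)
is_plain_object_stat_like = isinstance(object(), HasSizeAndMtime)
```

Because the check is structural, any object exposing the expected `st_*` attributes qualifies, whether or not it derives from `os.stat_result`.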
110 |
111 | ---
112 |
113 | ## Storage Options
114 |
115 | Typed dictionaries providing type hints for filesystem-specific configuration options.
116 | These help ensure correct parameter names and types when configuring different filesystems.
117 |
118 | ::: upath.types.storage_options
119 | options:
120 | heading_level: 3
121 | show_root_heading: true
122 | show_root_full_path: false
123 | show_bases: false
124 | members:
125 | - SimpleCacheStorageOptions
126 | - GCSStorageOptions
127 | - S3StorageOptions
128 | - AzureStorageOptions
129 | - DataStorageOptions
130 | - FTPStorageOptions
131 | - GitHubStorageOptions
132 | - HDFSStorageOptions
133 | - HTTPStorageOptions
134 | - FileStorageOptions
135 | - MemoryStorageOptions
136 | - SFTPStorageOptions
137 | - SMBStorageOptions
138 | - WebdavStorageOptions
139 | - ZipStorageOptions
140 | - TarStorageOptions
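
The general idea behind typed storage options can be sketched with a plain `TypedDict`. Everything named below (`ExampleStorageOptions`, its keys, and `connect`) is hypothetical and chosen for illustration; the real option names are defined by the classes listed above.

```python
from typing import Any, TypedDict


class ExampleStorageOptions(TypedDict, total=False):
    """Hypothetical options dict; the keys are illustrative only."""

    timeout_seconds: float
    read_only: bool


def connect(protocol: str, **storage_options: Any) -> dict[str, Any]:
    """Illustrative stand-in for a constructor that accepts storage options."""
    return {"protocol": protocol, **storage_options}


# A type checker can flag misspelled keys or wrong value types here.
opts: ExampleStorageOptions = {"timeout_seconds": 2.5, "read_only": True}
settings = connect("memory", **opts)
```

At runtime a `TypedDict` is an ordinary `dict`, so the benefit is purely static: editors and type checkers can validate keys before the options reach the filesystem.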
141 |
142 | ---
143 |
144 | ## See Also :link:
145 |
146 | - [UPath](index.md) - Main UPath class documentation
147 | - [Implementations](implementations.md) - Built-in UPath subclasses
148 | - [Extensions](extensions.md) - Extending UPath functionality
149 | - [Registry](registry.md) - Implementation registry
150 |
--------------------------------------------------------------------------------
/upath/implementations/http.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | import warnings
5 | from collections.abc import Iterator
6 | from itertools import chain
7 | from typing import TYPE_CHECKING
8 | from typing import Any
9 | from urllib.parse import urlsplit
10 |
11 | from fsspec.asyn import sync
12 |
13 | from upath._stat import UPathStatResult
14 | from upath.core import UPath
15 | from upath.types import JoinablePathLike
16 | from upath.types import StatResultType
17 |
18 | if TYPE_CHECKING:
19 | from typing import Literal
20 |
21 | if sys.version_info >= (3, 11):
22 | from typing import Self
23 | from typing import Unpack
24 | else:
25 | from typing_extensions import Self
26 | from typing_extensions import Unpack
27 |
28 | from upath._chain import FSSpecChainParser
29 | from upath.types.storage_options import HTTPStorageOptions
30 |
31 | __all__ = ["HTTPPath"]
32 |
33 |
34 | class HTTPPath(UPath):
35 | __slots__ = ()
36 |
37 | if TYPE_CHECKING:
38 |
39 | def __init__(
40 | self,
41 | *args: JoinablePathLike,
42 | protocol: Literal["http", "https"] | None = ...,
43 | chain_parser: FSSpecChainParser = ...,
44 | **storage_options: Unpack[HTTPStorageOptions],
45 | ) -> None: ...
46 |
47 | @classmethod
48 | def _transform_init_args(
49 | cls,
50 | args: tuple[JoinablePathLike, ...],
51 | protocol: str,
52 | storage_options: dict[str, Any],
53 | ) -> tuple[tuple[JoinablePathLike, ...], str, dict[str, Any]]:
54 | # allow initialization via a path argument and protocol keyword
55 | if args and not str(args[0]).startswith(protocol):
56 | args = (f"{protocol}://{str(args[0]).lstrip('/')}", *args[1:])
57 | return args, protocol, storage_options
58 |
59 | def __str__(self) -> str:
60 | sr = urlsplit(super().__str__())
61 | return sr._replace(path=sr.path or "/").geturl()
62 |
63 | @property
64 | def path(self) -> str:
65 | sr = urlsplit(super().path)
66 | return sr._replace(path=sr.path or "/").geturl()
67 |
68 | def stat(self, follow_symlinks: bool = True) -> StatResultType:
69 | if not follow_symlinks:
70 | warnings.warn(
71 | f"{type(self).__name__}.stat(follow_symlinks=False):"
72 | " is currently ignored.",
73 | UserWarning,
74 | stacklevel=2,
75 | )
76 | info = self.fs.info(self.path)
77 | if "url" in info:
78 | info["type"] = "directory" if info["url"].endswith("/") else "file"
79 | return UPathStatResult.from_info(info)
80 |
81 | def iterdir(self) -> Iterator[Self]:
82 | it = iter(super().iterdir())
83 | try:
84 | item0 = next(it)
85 | except (StopIteration, NotADirectoryError):
86 | raise NotADirectoryError(str(self))
87 | except FileNotFoundError:
88 | raise FileNotFoundError(str(self))
89 | else:
90 | yield from chain([item0], it)
91 |
92 | def resolve(
93 | self,
94 | strict: bool = False,
95 | follow_redirects: bool = True,
96 | ) -> Self:
97 | """Normalize the path and resolve redirects."""
98 | # special handling of trailing slash behaviour
99 | parts = list(self.parts)
100 | if parts[-1:] == ["."]:
101 | parts[-1:] = [""]
102 | if parts[-2:] == ["", ".."]:
103 | parts[-2:] = [""]
104 | pth = self.with_segments(*parts)
105 | resolved_path = super(HTTPPath, pth).resolve(strict=strict)
106 |
107 | if follow_redirects:
108 | cls = type(self)
109 | # Get the fsspec fs
110 | fs = self.fs
111 | url = str(self)
112 | # Ensure we have a session
113 | session = sync(fs.loop, fs.set_session)
114 | # Use HEAD requests if the server allows it, falling back to GETs
115 | for method in (session.head, session.get):
116 | r = sync(fs.loop, method, url, allow_redirects=True)
117 | try:
118 | r.raise_for_status()
119 | except Exception as exc:
120 | if method == session.get:
121 | raise FileNotFoundError(self) from exc
122 | else:
123 | resolved_path = cls(str(r.url))
124 | break
125 |
126 | return resolved_path
127 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_github.py:
--------------------------------------------------------------------------------
1 | import functools
2 | import os
3 | import platform
4 | import sys
5 |
6 | import pytest
7 |
8 | from upath import UPath
9 | from upath.implementations.github import GitHubPath
10 | from upath.tests.cases import BaseTests
11 |
12 | pytestmark = pytest.mark.skipif(
13 | os.environ.get("CI")
14 | and not (
15 | platform.system() == "Linux" and sys.version_info[:2] in {(3, 9), (3, 13)}
16 | ),
17 | reason="Skipping GitHubPath tests to prevent rate limiting on GitHub API.",
18 | )
19 |
20 |
21 | def xfail_on_github_connection_error(func):
22 | """Method decorator to xfail tests on GitHub rate limit or connection errors."""
23 |
24 | @functools.wraps(func)
25 | def wrapper(self, *args, **kwargs):
26 | try:
27 | return func(self, *args, **kwargs)
28 | except Exception as e:
29 | str_e = str(e)
30 | if "rate limit exceeded" in str_e or "too many requests for url" in str_e:
31 | pytest.xfail("GitHub API rate limit exceeded")
32 | elif (
33 | "nodename nor servname provided, or not known" in str_e
34 | or "Network is unreachable" in str_e
35 | ):
36 | pytest.xfail("No internet connection")
37 | else:
38 | raise
39 |
40 | return wrapper
41 |
42 |
43 | def wrap_all_tests(decorator):
44 | """Class decorator factory to wrap all test methods with a given decorator."""
45 |
46 | def class_decorator(cls):
47 | for attr_name in dir(cls):
48 | if attr_name.startswith("test_"):
49 | orig_method = getattr(cls, attr_name)
50 | setattr(cls, attr_name, decorator(orig_method))
51 | return cls
52 |
53 | return class_decorator
54 |
55 |
56 | @wrap_all_tests(xfail_on_github_connection_error)
57 | class TestUPathGitHubPath(BaseTests):
58 | """
59 | Unit-tests for the GitHubPath implementation of UPath.
60 | """
61 |
62 | @pytest.fixture(autouse=True)
63 | def path(self):
64 | """
65 | Fixture for the UPath instance to be tested.
66 | """
67 | path = "github://ap--:universal_pathlib@test_data/data"
68 | self.path = UPath(path)
69 |
70 | def test_is_GitHubPath(self):
71 | """
72 | Test that the path is a GitHubPath instance.
73 | """
74 | assert isinstance(self.path, GitHubPath)
75 |
76 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
77 | def test_mkdir(self):
78 | pass
79 |
80 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
81 | def test_mkdir_exists_ok_false(self):
82 | pass
83 |
84 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
85 | def test_mkdir_parents_true_exists_ok_false(self):
86 | pass
87 |
88 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
89 | def test_rename(self):
90 | pass
91 |
92 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
93 | def test_rename2(self):
94 | pass
95 |
96 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
97 | def test_touch(self):
98 | pass
99 |
100 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
101 | def test_touch_unlink(self):
102 | pass
103 |
104 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
105 | def test_write_bytes(self):
106 | pass
107 |
108 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
109 | def test_write_text(self):
110 | pass
111 |
112 | @pytest.mark.skip(reason="GitHub filesystem is read-only")
113 | def test_fsspec_compat(self):
114 | pass
115 |
116 | @pytest.mark.skip(reason="Only testing read on GithubPath")
117 | def test_move_local(self, tmp_path):
118 | pass
119 |
120 | @pytest.mark.skip(reason="Only testing read on GithubPath")
121 | def test_move_into_local(self, tmp_path):
122 | pass
123 |
124 | @pytest.mark.skip(reason="Only testing read on GithubPath")
125 | def test_move_memory(self, clear_fsspec_memory_cache):
126 | pass
127 |
128 | @pytest.mark.skip(reason="Only testing read on GithubPath")
129 | def test_move_into_memory(self, clear_fsspec_memory_cache):
130 | pass
131 |
132 | @pytest.mark.skip(reason="Only testing read on GithubPath")
133 | def test_rename_with_target_absolute(self, target_factory):
134 | return super().test_rename_with_target_str_absolute(target_factory)
135 |
136 | @pytest.mark.skip(reason="Only testing read on GithubPath")
137 | def test_write_text_encoding(self):
138 | return super().test_write_text_encoding()
139 |
140 | @pytest.mark.skip(reason="Only testing read on GithubPath")
141 | def test_write_text_errors(self):
142 | return super().test_write_text_errors()
143 |
--------------------------------------------------------------------------------
/upath/types/_abc.pyi:
--------------------------------------------------------------------------------
1 | """pathlib_abc exports for compatibility with pathlib."""
2 |
3 | import sys
4 | from abc import ABC
5 | from abc import abstractmethod
6 | from typing import Any
7 | from typing import BinaryIO
8 | from typing import Callable
9 | from typing import Iterator
10 | from typing import Literal
11 | from typing import Protocol
12 | from typing import Sequence
13 | from typing import TextIO
14 | from typing import TypeVar
15 | from typing import runtime_checkable
16 |
17 | if sys.version_info >= (3, 11):
18 | from typing import Self
19 | else:
20 | from typing_extensions import Self
21 |
22 | class JoinablePath(ABC):
23 | __slots__ = ()
24 |
25 | @property
26 | @abstractmethod
27 | def parser(self) -> PathParser: ...
28 | @abstractmethod
29 | def with_segments(self, *pathsegments: str) -> Self: ...
30 | @abstractmethod
31 | def __vfspath__(self) -> str: ...
32 | @property
33 | def anchor(self) -> str: ...
34 | @property
35 | def name(self) -> str: ...
36 | @property
37 | def suffix(self) -> str: ...
38 | @property
39 | def suffixes(self) -> list[str]: ...
40 | @property
41 | def stem(self) -> str: ...
42 | def with_name(self, name: str) -> Self: ...
43 | def with_stem(self, stem: str) -> Self: ...
44 | def with_suffix(self, suffix: str) -> Self: ...
45 | @property
46 | def parts(self) -> Sequence[str]: ...
47 | def joinpath(self, *pathsegments: str) -> Self: ...
48 | def __truediv__(self, key: str) -> Self: ...
49 | def __rtruediv__(self, key: str) -> Self: ...
50 | @property
51 | def parent(self) -> Self: ...
52 | @property
53 | def parents(self) -> Sequence[Self]: ...
54 | def full_match(self, pattern: str) -> bool: ...
55 |
56 | OnErrorCallable = Callable[[Exception], Any]
57 | T = TypeVar("T", bound="WritablePath")
58 |
59 | class ReadablePath(JoinablePath):
60 | __slots__ = ()
61 |
62 | @property
63 | @abstractmethod
64 | def info(self) -> PathInfo: ...
65 | @abstractmethod
66 | def __open_reader__(self) -> BinaryIO: ...
67 | def read_bytes(self) -> bytes: ...
68 | def read_text(
69 | self,
70 | encoding: str | None = ...,
71 | errors: str | None = ...,
72 | newline: str | None = ...,
73 | ) -> str: ...
74 | @abstractmethod
75 | def iterdir(self) -> Iterator[Self]: ...
76 | def glob(self, pattern: str, *, recurse_symlinks: bool = ...) -> Iterator[Self]: ...
77 | def walk(
78 | self,
79 | top_down: bool = ...,
80 | on_error: OnErrorCallable | None = ...,
81 | follow_symlinks: bool = ...,
82 | ) -> Iterator[tuple[Self, list[str], list[str]]]: ...
83 | @abstractmethod
84 | def readlink(self) -> Self: ...
85 | def copy(self, target: T, **kwargs: Any) -> T: ...
86 | def copy_into(self, target_dir: T, **kwargs: Any) -> T: ...
87 |
88 | class WritablePath(JoinablePath):
89 | __slots__ = ()
90 |
91 | @abstractmethod
92 | def symlink_to(
93 | self, target: ReadablePath, target_is_directory: bool = ...
94 | ) -> None: ...
95 | @abstractmethod
96 | def mkdir(self) -> None: ...
97 | @abstractmethod
98 | def __open_writer__(self, mode: Literal["a", "w", "x"]) -> BinaryIO: ...
99 | def write_bytes(self, data: bytes) -> int: ...
100 | def write_text(
101 | self,
102 | data: str,
103 | encoding: str | None = ...,
104 | errors: str | None = ...,
105 | newline: str | None = ...,
106 | ) -> int: ...
107 | def _copy_from(self, source: ReadablePath, follow_symlinks: bool = ...) -> None: ...
108 |
109 | @runtime_checkable
110 | class PathParser(Protocol):
111 | sep: str
112 | altsep: str | None
113 |
114 | def split(self, path: str) -> tuple[str, str]: ...
115 | def splitext(self, path: str) -> tuple[str, str]: ...
116 | def normcase(self, path: str) -> str: ...
117 |
118 | @runtime_checkable
119 | class PathInfo(Protocol):
120 | def exists(self, *, follow_symlinks: bool = True) -> bool: ...
121 | def is_dir(self, *, follow_symlinks: bool = True) -> bool: ...
122 | def is_file(self, *, follow_symlinks: bool = True) -> bool: ...
123 | def is_symlink(self) -> bool: ...
124 |
125 | class SupportsOpenReader(Protocol):
126 | def __open_reader__(self) -> BinaryIO: ...
127 |
128 | class SupportsOpenWriter(Protocol):
129 | def __open_writer__(self, mode: Literal["a", "w", "x"]) -> BinaryIO: ...
130 |
131 | class SupportsOpenUpdater(Protocol):
132 | def __open_updater__(self, mode: Literal["r+", "w+", "+r", "+w"]) -> BinaryIO: ...
133 |
134 | def vfsopen(
135 | obj: SupportsOpenReader | SupportsOpenWriter | SupportsOpenUpdater,
136 | mode="r",
137 | buffering: int = -1,
138 | encoding: str | None = None,
139 | errors: str | None = None,
140 | newline: str | None = None,
141 | ) -> BinaryIO | TextIO: ...
142 |
143 | class SupportsVFSPath(Protocol):
144 | def __vfspath__(self) -> str: ...
145 |
146 | def vfspath(obj: SupportsVFSPath) -> str: ...
147 |
--------------------------------------------------------------------------------
/docs/assets/logo-128x128-white.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
116 |
--------------------------------------------------------------------------------
/upath/tests/test_registry.py:
--------------------------------------------------------------------------------
1 | import random
2 | import string
3 |
4 | import pytest
5 | from fsspec.implementations.local import LocalFileSystem
6 | from fsspec.registry import _registry as fsspec_registry_private
7 | from fsspec.registry import known_implementations as fsspec_known_implementations
8 | from fsspec.registry import register_implementation as fsspec_register_implementation
9 | from fsspec.registry import registry as fsspec_registry
10 |
11 | from upath import UPath
12 | from upath.registry import available_implementations
13 | from upath.registry import get_upath_class
14 | from upath.registry import register_implementation
15 |
16 | IMPLEMENTATIONS = {
17 | "abfs",
18 | "abfss",
19 | "adl",
20 | "az",
21 | "data",
22 | "file",
23 | "ftp",
24 | "gcs",
25 | "gs",
26 | "hdfs",
27 | "hf",
28 | "http",
29 | "https",
30 | "local",
31 | "memory",
32 | "s3",
33 | "s3a",
34 | "simplecache",
35 | "sftp",
36 | "smb",
37 | "ssh",
38 | "webdav",
39 | "webdav+http",
40 | "webdav+https",
41 | "github",
42 | "zip",
43 | "tar",
44 | }
45 |
46 |
47 | @pytest.fixture(autouse=True)
48 | def reset_registry():
49 | from upath.registry import _registry
50 |
51 | try:
52 | yield
53 | finally:
54 | _registry._m.maps[0].clear() # type: ignore
55 |
56 |
57 | @pytest.fixture()
58 | def fake_entrypoint():
59 | from importlib.metadata import EntryPoint
60 |
61 | from upath.registry import _registry
62 |
63 | ep = EntryPoint(
64 | name="myeps",
65 | value="upath.core:UPath",
66 | group="universal_pathlib.implementations",
67 | )
68 | old_registry = _registry._entries.copy()
69 |
70 | try:
71 | _registry._entries["myeps"] = ep
72 | yield
73 | finally:
74 | _registry._entries.clear()
75 | _registry._entries.update(old_registry)
76 |
77 |
78 | def test_available_implementations():
79 | impl = available_implementations()
80 | assert len(impl) == len(set(impl))
81 | assert set(impl) == IMPLEMENTATIONS
82 |
83 |
84 | @pytest.fixture
85 | def fake_registered_proto():
86 | fake_proto = "".join(random.choices(string.ascii_lowercase, k=8))
87 |
88 | class FakeRandomFS(LocalFileSystem):
89 | protocol = fake_proto
90 |
91 | fsspec_register_implementation(fake_proto, FakeRandomFS)
92 | try:
93 | yield fake_proto
94 | finally:
95 | fsspec_registry_private.pop(fake_proto, None)
96 |
97 |
98 | def test_available_implementations_with_fallback(fake_registered_proto):
99 | impl = available_implementations(fallback=True)
100 | assert fake_registered_proto in impl
101 | assert set(impl) == IMPLEMENTATIONS.union(
102 | {
103 | *fsspec_known_implementations,
104 | *fsspec_registry,
105 | }
106 | )
107 |
108 |
109 | def test_available_implementations_with_entrypoint(fake_entrypoint):
110 | impl = available_implementations()
111 | assert set(impl) == IMPLEMENTATIONS.union({"myeps"})
112 |
113 |
114 | def test_register_implementation():
115 | class MyProtoPath(UPath):
116 | pass
117 |
118 | register_implementation("myproto", MyProtoPath)
119 |
120 | assert get_upath_class("myproto") is MyProtoPath
121 |
122 |
123 | def test_register_implementation_wrong_input():
124 | with pytest.raises(TypeError):
125 | register_implementation(None, UPath) # type: ignore
126 | with pytest.raises(ValueError):
127 | register_implementation("incorrect**protocol", UPath)
128 | with pytest.raises(ValueError):
129 | register_implementation("myproto", object, clobber=True) # type: ignore
130 | with pytest.raises(ValueError):
131 | register_implementation("file", UPath, clobber=False)
132 | assert set(available_implementations()) == IMPLEMENTATIONS
133 |
134 |
135 | @pytest.mark.parametrize("protocol", IMPLEMENTATIONS)
136 | def test_get_upath_class(protocol):
137 | upath_cls = get_upath_class(protocol)
138 | assert issubclass(upath_cls, UPath)
139 |
140 |
141 | def test_get_upath_class_without_implementation(clear_registry):
142 | with pytest.warns(
143 | UserWarning, match="UPath 'mock' filesystem not explicitly implemented."
144 | ):
145 | upath_cls = get_upath_class("mock")
146 | assert issubclass(upath_cls, UPath)
147 |
148 |
149 | def test_get_upath_class_without_implementation_no_fallback(clear_registry):
150 | assert get_upath_class("mock", fallback=False) is None
151 |
152 |
153 | def test_get_upath_class_unknown_protocol(clear_registry):
154 | assert get_upath_class("doesnotexist") is None
155 |
156 |
157 | def test_get_upath_class_from_entrypoint(fake_entrypoint):
158 | assert issubclass(get_upath_class("myeps"), UPath)
159 |
160 |
161 | @pytest.mark.parametrize(
162 | "protocol", [pytest.param("", id="empty-str"), pytest.param(None, id="none")]
163 | )
164 | def test_get_upath_class_falsey_protocol(protocol):
165 | assert issubclass(get_upath_class(protocol), UPath)
166 |
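The tests above exercise a layered registry: built-in protocols stay in place, runtime registrations can be cleared between tests via `_registry._m.maps[0].clear()`, and `register_implementation` rejects bad input. A minimal sketch of that behaviour using `collections.ChainMap` follows. It is not the library's implementation; `BasePath`, `FilePath`, `Registry`, and the protocol regex are hypothetical stand-ins chosen for illustration.

```python
import re
from collections import ChainMap


class BasePath:
    """Hypothetical stand-in for UPath in this sketch."""


class FilePath(BasePath):
    """Hypothetical built-in implementation."""


# Assumed validation rule: lowercase letters, digits, and "+_.-" only.
_PROTOCOL_RE = re.compile(r"^[a-z][a-z0-9+_.\-]*$")


class Registry:
    def __init__(self, builtin):
        # maps[0] collects runtime registrations, maps[1] holds built-ins.
        self._m = ChainMap({}, dict(builtin))

    def register(self, protocol, cls, *, clobber=False):
        if not isinstance(protocol, str):
            raise TypeError(f"protocol must be a str, got {type(protocol).__name__}")
        if not _PROTOCOL_RE.match(protocol):
            raise ValueError(f"invalid protocol name: {protocol!r}")
        if not (isinstance(cls, type) and issubclass(cls, BasePath)):
            raise ValueError(f"{cls!r} is not a BasePath subclass")
        if not clobber and protocol in self._m:
            raise ValueError(f"{protocol!r} is already registered")
        # ChainMap writes always go to maps[0], so built-ins are never mutated.
        self._m[protocol] = cls

    def get(self, protocol):
        return self._m.get(protocol)

    def reset(self):
        # Drop runtime registrations only, as the reset_registry fixture does.
        self._m.maps[0].clear()


registry = Registry({"file": FilePath})
```

Because writes land in the first map only, clearing that map restores the built-in state without re-importing anything, which is what makes an autouse reset fixture cheap.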
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.rst:
--------------------------------------------------------------------------------
1 | Contributor Covenant Code of Conduct
2 | ====================================
3 |
4 | Our Pledge
5 | ----------
6 |
7 | We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socioeconomic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
8 |
9 | We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
10 |
11 |
12 | Our Standards
13 | -------------
14 |
15 | Examples of behavior that contributes to a positive environment for our community include:
16 |
17 | - Demonstrating empathy and kindness toward other people
18 | - Being respectful of differing opinions, viewpoints, and experiences
19 | - Giving and gracefully accepting constructive feedback
20 | - Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
21 | - Focusing on what is best not just for us as individuals, but for the overall community
22 |
23 | Examples of unacceptable behavior include:
24 |
25 | - The use of sexualized language or imagery, and sexual attention or
26 | advances of any kind
27 | - Trolling, insulting or derogatory comments, and personal or political attacks
28 | - Public or private harassment
29 | - Publishing others' private information, such as a physical or email
30 | address, without their explicit permission
31 | - Other conduct which could reasonably be considered inappropriate in a
32 | professional setting
33 |
34 | Enforcement Responsibilities
35 | ----------------------------
36 |
37 | Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
38 |
39 | Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
40 |
41 |
42 | Scope
43 | -----
44 |
45 | This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
46 |
47 |
48 | Enforcement
49 | -----------
50 |
51 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at andrewfulton9@gmail.com. All complaints will be reviewed and investigated promptly and fairly.
52 |
53 | All community leaders are obligated to respect the privacy and security of the reporter of any incident.
54 |
55 |
56 | Enforcement Guidelines
57 | ----------------------
58 |
59 | Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
60 |
61 |
62 | 1. Correction
63 | ~~~~~~~~~~~~~
64 |
65 | **Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
66 |
67 | **Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
68 |
69 |
70 | 2. Warning
71 | ~~~~~~~~~~
72 |
73 | **Community Impact**: A violation through a single incident or series of actions.
74 |
75 | **Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
76 |
77 |
78 | 3. Temporary Ban
79 | ~~~~~~~~~~~~~~~~
80 |
81 | **Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
82 |
83 | **Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
84 |
85 |
86 | 4. Permanent Ban
87 | ~~~~~~~~~~~~~~~~
88 |
89 | **Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
90 |
91 | **Consequence**: A permanent ban from any sort of public interaction within the community.
92 |
93 |
94 | Attribution
95 | -----------
96 |
97 | This Code of Conduct is adapted from the `Contributor Covenant <https://www.contributor-covenant.org>`__, version 2.0,
98 | available at https://www.contributor-covenant.org/version/2/0/code_of_conduct/.
99 |
100 | Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.
101 |
102 | .. _homepage: https://www.contributor-covenant.org
103 |
104 | For answers to common questions about this code of conduct, see the FAQ at
105 | https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
106 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_s3.py:
--------------------------------------------------------------------------------
1 | """see upath/tests/conftest.py for fixtures"""
2 |
3 | import sys
4 |
5 | import fsspec
6 | import pytest # noqa: F401
7 |
8 | from upath import UPath
9 | from upath.implementations.cloud import S3Path
10 |
11 | from ..cases import BaseTests
12 |
13 |
14 | def silence_botocore_datetime_deprecation(cls):
15 | # botocore calls datetime.datetime.utcnow(), which is deprecated as of Python 3.12
16 | # see: https://github.com/boto/boto3/issues/3889#issuecomment-1751296363
17 | if sys.version_info >= (3, 12):
18 | return pytest.mark.filterwarnings(
19 | "ignore"
20 | r":datetime.datetime.utcnow\(\) is deprecated"
21 | ":DeprecationWarning"
22 | )(cls)
23 | else:
24 | return cls
25 |
26 |
27 | @silence_botocore_datetime_deprecation
28 | class TestUPathS3(BaseTests):
29 | SUPPORTS_EMPTY_DIRS = False
30 |
31 | @pytest.fixture(autouse=True)
32 | def path(self, s3_fixture):
33 | path, anon, s3so = s3_fixture
34 | self.path = UPath(path, anon=anon, **s3so)
35 | self.anon = anon
36 | self.s3so = s3so
37 |
38 | def test_is_S3Path(self):
39 | assert isinstance(self.path, S3Path)
40 |
41 | def test_chmod(self):
42 | # todo
43 | pass
44 |
45 | def test_rmdir(self):
46 | dirname = "rmdir_test"
47 | mock_dir = self.path.joinpath(dirname)
48 | mock_dir.joinpath("test.txt").touch()
49 | mock_dir.rmdir()
50 | assert not mock_dir.exists()
51 | with pytest.raises(NotADirectoryError):
52 | self.path.joinpath("file1.txt").rmdir()
53 |
54 | def test_relative_to(self):
55 | assert "file.txt" == str(
56 | UPath("s3://test_bucket/file.txt").relative_to(UPath("s3://test_bucket"))
57 | )
58 |
59 | def test_iterdir_root(self):
60 | client_kwargs = self.path.storage_options["client_kwargs"]
61 | bucket_path = UPath("s3://other_test_bucket", client_kwargs=client_kwargs)
62 | bucket_path.mkdir()
63 |
64 | (bucket_path / "test1.txt").touch()
65 | (bucket_path / "test2.txt").touch()
66 |
67 | for x in bucket_path.iterdir():
68 | assert x.name != ""
69 | assert x.exists()
70 |
71 | @pytest.mark.parametrize(
72 | "joiner", [["bucket", "path", "file"], ["bucket/path/file"]]
73 | )
74 | def test_no_bucket_joinpath(self, joiner):
75 | path = UPath("s3://", anon=self.anon, **self.s3so)
76 | path = path.joinpath(*joiner)
77 | assert str(path) == "s3://bucket/path/file"
78 |
79 | def test_creating_s3path_with_bucket(self):
80 | path = UPath("s3://", bucket="bucket", anon=self.anon, **self.s3so)
81 | assert str(path) == "s3://bucket/"
82 |
83 | def test_iterdir_with_plus_in_name(self, s3_with_plus_chr_name):
84 | bucket, anon, s3so = s3_with_plus_chr_name
85 | p = UPath(
86 | f"s3://{bucket}/manual__2022-02-19T14:31:25.891270+00:00",
87 | anon=True,
88 | **s3so,
89 | )
90 |
91 | files = list(p.iterdir())
92 | assert len(files) == 1
93 | (file,) = files
94 | assert file == p.joinpath("file.txt")
95 |
96 | @pytest.mark.xfail(reason="fsspec/universal_pathlib#144")
97 | def test_rglob_with_double_fwd_slash(self, s3_with_double_fwd_slash_files):
98 | import boto3
99 | import botocore.exceptions
100 |
101 | bucket, anon, s3so = s3_with_double_fwd_slash_files
102 |
103 | conn = boto3.resource("s3", **s3so["client_kwargs"])
104 | # ensure there's no s3://bucket/key.txt object
105 | with pytest.raises(botocore.exceptions.ClientError, match=".*Not Found.*"):
106 | conn.Object(bucket, "key.txt").load()
107 | # ensure there's a s3://bucket//key.txt object
108 | assert conn.Object(bucket, "/key.txt").get()["Body"].read() == b"hello world"
109 |
110 | p0 = UPath(f"s3://{bucket}//key.txt", **s3so)
111 | assert p0.read_bytes() == b"hello world"
112 | p1 = UPath(f"s3://{bucket}", **s3so)
113 | assert list(p1.rglob("*.txt")) == [p0]
114 |
115 |
116 | @pytest.fixture
117 | def s3_with_plus_chr_name(s3_server):
118 | anon, s3so = s3_server
119 | s3 = fsspec.filesystem("s3", anon=False, **s3so)
120 | bucket = "plus_chr_bucket"
121 | path = f"{bucket}/manual__2022-02-19T14:31:25.891270+00:00"
122 | s3.mkdir(path)
123 | s3.touch(f"{path}/file.txt")
124 | s3.invalidate_cache()
125 | try:
126 | yield bucket, anon, s3so
127 | finally:
128 | if s3.exists(bucket):
129 | for dir, _, keys in s3.walk(bucket):
130 | for key in keys:
131 | if key.rstrip("/"):
132 | s3.rm(f"{dir}/{key}")
133 |
134 |
135 | @pytest.fixture
136 | def s3_with_double_fwd_slash_files(s3_server):
137 | anon, s3so = s3_server
138 | s3 = fsspec.filesystem("s3", anon=False, **s3so)
139 | bucket = "double_fwd_slash_bucket"
140 | s3.mkdir(bucket + "/")
141 | s3.pipe_file(f"{bucket}//key.txt", b"hello world")
142 | try:
143 | yield bucket, anon, s3so
144 | finally:
145 | if s3.exists(bucket):
146 | for dir, _, keys in s3.walk(bucket):
147 | for key in keys:
148 | if key.rstrip("/"):
149 | s3.rm(f"{dir}/{key}")
150 |
151 |
152 | def test_path_with_hash_and_space():
153 | assert "with#hash and space" in UPath("s3://bucket/with#hash and space/abc").parts
154 |
155 |
156 | def test_pathlib_consistent_join():
157 | b0 = UPath("s3://mybucket/withkey/").joinpath("subfolder/myfile.txt")
158 | b1 = UPath("s3://mybucket/withkey").joinpath("subfolder/myfile.txt")
159 | assert b0 == b1
160 | assert "s3://mybucket/withkey/subfolder/myfile.txt" == str(b0) == str(b1)
161 |
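The `silence_botocore_datetime_deprecation` decorator in the file above passes pytest a filter string of the form `action:message:category`. As a rough illustration of what such a filter does, the sketch below applies the equivalent filter with the standard-library `warnings` module instead of pytest. The helper names `emit` and `collect_warnings` are made up for this example.

```python
import warnings


def emit(message: str) -> None:
    # Raise a DeprecationWarning attributed to the caller.
    warnings.warn(message, DeprecationWarning, stacklevel=2)


def collect_warnings() -> list[str]:
    """Return the messages that survive the utcnow() filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # The message is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message=r"datetime.datetime.utcnow\(\) is deprecated",
            category=DeprecationWarning,
        )
        emit("datetime.datetime.utcnow() is deprecated and scheduled for removal")
        emit("some other deprecation")
    return [str(w.message) for w in caught]


remaining = collect_warnings()
```

Only the warning whose text starts with the escaped pattern is dropped, so unrelated deprecations still surface in the test run.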
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [build-system]
2 | requires = ["setuptools>=64", "setuptools_scm>=8"]
3 | build-backend = "setuptools.build_meta"
4 |
5 | [project]
6 | name = "universal_pathlib"
7 | license = "MIT"
8 | authors = [
9 | {name = "Andrew Fulton", email = "andrewfulton9@gmail.com"},
10 | ]
11 | description = "pathlib api extended to use fsspec backends"
12 | maintainers = [
13 | {name = "Andreas Poehlmann"},
14 | {name = "Andreas Poehlmann", email = "andreas@poehlmann.io"},
15 | {name = "Norman Rzepka"},
16 | ]
17 | requires-python = ">=3.9"
18 | dependencies = [
19 | "fsspec >=2024.5.0",
20 | "pathlib-abc >=0.5.1,<0.6.0",
21 | ]
22 | classifiers = [
23 | "Programming Language :: Python :: 3",
24 | "Programming Language :: Python :: 3.9",
25 | "Programming Language :: Python :: 3.10",
26 | "Programming Language :: Python :: 3.11",
27 | "Programming Language :: Python :: 3.12",
28 | "Programming Language :: Python :: 3.13",
29 | "Programming Language :: Python :: 3.14",
30 | "Development Status :: 4 - Beta",
31 | ]
32 | keywords = ["filesystem-spec", "pathlib"]
33 | dynamic = ["version", "readme"]
34 |
35 | [tool.setuptools.dynamic]
36 | readme = {file = ["README.md"], content-type = "text/markdown"}
37 |
38 | [project.optional-dependencies]
39 | tests = [
40 | "pytest >=8",
41 | "pytest-sugar >=0.9.7",
42 | "pytest-cov >=4.1.0",
43 | "pytest-mock >=3.12.0",
44 | "pylint >=2.17.4",
45 | "mypy >=1.10.0",
46 | "pydantic >=2",
47 | "pytest-mypy-plugins >=3.1.2",
48 | "packaging",
49 | ]
50 | typechecking = [
51 | "mypy >=1.10.0",
52 | "pytest-mypy-plugins >=3.1.2",
53 | ]
54 | dev = [
55 | "fsspec[adl,http,github,gcs,s3,ssh,smb] >=2024.5.0",
56 | "s3fs >=2024.5.0",
57 | "gcsfs >=2024.5.0",
58 | "adlfs >=2024",
59 | "huggingface_hub",
60 | "webdav4[fsspec]",
61 | # testing
62 | "moto[s3,server]",
63 | "wsgidav",
64 | "cheroot",
65 | # "hadoop-test-cluster",
66 | # "pyarrow",
67 | "pyftpdlib",
68 | "typing_extensions; python_version<'3.11'",
69 | ]
70 | dev-third-party = [
71 | "pydantic",
72 | "pydantic-settings",
73 | ]
74 |
75 | [project.urls]
76 | Homepage = "https://github.com/fsspec/universal_pathlib"
77 | Changelog = "https://github.com/fsspec/universal_pathlib/blob/main/CHANGELOG.md"
78 |
79 | [tool.setuptools]
80 | include-package-data = false
81 |
82 | [tool.setuptools.package-data]
83 | upath = ["py.typed"]
84 |
85 | [tool.setuptools.packages.find]
86 | exclude = [
87 | "upath.tests",
88 | "upath.tests.*",
89 | ]
90 | namespaces = false
91 |
92 | [tool.setuptools_scm]
93 | write_to = "upath/_version.py"
94 | version_scheme = "post-release"
95 |
96 | [tool.black]
97 | line-length = 88
98 | include = '\.pyi?$'
99 | exclude = '''
100 | /(
101 | \.eggs
102 | | \.git
103 | | \.hg
104 | | \.mypy_cache
105 | | \.tox
106 | | \.venv
107 | | _build
108 | | buck-out
109 | | build
110 | | dist
111 | )/
112 | '''
113 | force-exclude = '''
114 | (
115 | ^/upath/tests/pathlib/_test_support\.py
116 | |^/upath/tests/pathlib/test_pathlib_.*\.py
117 | )
118 | '''
119 |
120 | [tool.isort]
121 | profile = "black"
122 | known_first_party = ["upath"]
123 | force_single_line = true
124 | line_length = 88
125 |
126 | [tool.pytest.ini_options]
127 | addopts = "-ra -m 'not hdfs' -p no:pytest-mypy-plugins"
128 | markers = [
129 | "hdfs: mark test as hdfs",
130 | "pathlib: mark cpython pathlib tests",
131 | ]
132 |
133 | [tool.coverage.run]
134 | branch = true
135 | source = ["upath"]
136 |
137 | [tool.coverage.report]
138 | show_missing = true
139 | exclude_lines = [
140 | "pragma: no cover",
141 | "if __name__ == .__main__.:",
142 | "if typing.TYPE_CHECKING:",
143 | "if TYPE_CHECKING:",
144 | "raise NotImplementedError",
145 | "raise AssertionError",
146 | "@overload",
147 | "except ImportError",
148 | ]
149 |
150 | [tool.mypy]
151 | # Error output
152 | show_column_numbers = false
153 | show_error_codes = true
154 | show_error_context = true
155 | show_traceback = true
156 | pretty = true
157 | check_untyped_defs = false
158 | # Warnings
159 | warn_no_return = true
160 | warn_redundant_casts = true
161 | warn_unreachable = true
162 | files = ["upath"]
163 | exclude = "^notebooks|^venv.*|tests.*|^noxfile.py"
164 |
165 | [[tool.mypy.overrides]]
166 | module = "fsspec.*"
167 | ignore_missing_imports = true
168 |
169 | [[tool.mypy.overrides]]
170 | module = "webdav4.*"
171 | ignore_missing_imports = true
172 |
173 | [[tool.mypy.overrides]]
174 | module = "pathlib_abc.*"
175 | ignore_missing_imports = true
176 |
177 | [[tool.mypy.overrides]]
178 | module = "smbprotocol.*"
179 | ignore_missing_imports = true
180 |
181 | [[tool.mypy.overrides]]
182 | module = "pydantic.*"
183 | ignore_missing_imports = true
184 | ignore_errors = true
185 |
186 | [[tool.mypy.overrides]]
187 | module = "pydantic_core.*"
188 | ignore_missing_imports = true
189 | ignore_errors = true
190 |
191 | [[tool.mypy.overrides]]
192 | module = "typing_inspection.*"
193 | ignore_errors = true
194 |
195 | [[tool.mypy.overrides]]
196 | module = "annotated_types.*"
197 | ignore_errors = true
198 |
199 | [tool.pylint.format]
200 | max-line-length = 88
201 |
202 | [tool.pylint.message_control]
203 | enable = ["c-extension-no-member", "no-else-return"]
204 |
205 | [tool.pylint.variables]
206 | dummy-variables-rgx = "_+$|(_[a-zA-Z0-9_]*[a-zA-Z0-9]+?$)|dummy|^ignored_|^unused_"
207 | ignored-argument-names = "_.*|^ignored_|^unused_|args|kwargs"
208 |
209 | [tool.codespell]
210 | ignore-words-list = " "
211 |
212 | [tool.bandit]
213 | exclude_dirs = ["tests"]
214 | skips = ["B101"]
215 |
216 | [dependency-groups]
217 | docs = [
218 | "mkdocs>=1.6.1",
219 | "click!=8.2.2,!=8.3.0", # https://github.com/mkdocs/mkdocs/issues/4032
220 | "mkdocs-material>=9.6.22",
221 | "mkdocstrings[python]>=0.30.1",
222 | "mkdocs-exclude>=1.0.2",
223 | "pymdown-extensions>=10.7.0",
224 | "ruff>=0.14.1",
225 | ]
226 |
--------------------------------------------------------------------------------
/noxfile.py:
--------------------------------------------------------------------------------
1 | """Automation using nox."""
2 |
3 | import glob
4 | import os
5 | import sys
6 |
7 | import nox
8 |
9 | nox.options.reuse_existing_virtualenvs = True
10 | nox.options.error_on_external_run = True
11 |
12 | nox.needs_version = ">=2024.04.15"
13 | nox.options.default_venv_backend = "uv"
14 |
15 | nox.options.sessions = "lint", "tests", "type-checking", "type-safety"
16 | locations = ("upath",)
17 | running_in_ci = os.environ.get("CI", "") != ""
18 |
19 | SUPPORTED_PYTHONS = ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
20 | BASE_PYTHON = SUPPORTED_PYTHONS[-3]
21 | MIN_PYTHON = SUPPORTED_PYTHONS[0]
22 |
23 |
24 | @(lambda f: f())
25 | def FSSPEC_MIN_VERSION() -> str:
26 | """Get the minimum fsspec version boundary from pyproject.toml."""
27 | try:
28 | from packaging.requirements import Requirement
29 |
30 | if sys.version_info >= (3, 11):
31 | from tomllib import load as toml_load
32 | else:
33 | from tomli import load as toml_load
34 | except ImportError as err:
35 | raise RuntimeError(
36 | "We rely on nox>=2024.04.15 depending on `packaging` and `tomli/tomllib`."
37 | " Please report if you see this error."
38 | ) from err
39 |
40 | with open("pyproject.toml", "rb") as f:
41 | pyproject_data = toml_load(f)
42 |
43 | for requirement in pyproject_data["project"]["dependencies"]:
44 | req = Requirement(requirement)
45 | if req.name == "fsspec":
46 | for specifier in req.specifier:
47 | if specifier.operator == ">=":
48 | return str(specifier.version)
49 | raise RuntimeError("Could not find fsspec minimum version in pyproject.toml")
50 |
51 |
52 | @nox.session(python=SUPPORTED_PYTHONS)
53 | def tests(session: nox.Session) -> None:
54 | """Run the test suite."""
55 | # workaround in case no aiohttp binary wheels are available
56 | if session.python == "3.14":
57 | session.env["AIOHTTP_NO_EXTENSIONS"] = "1"
58 | session.install(".[tests,dev]", "pydantic>=2.12.0a1")
59 | else:
60 | session.install(".[tests,dev,dev-third-party]")
61 | session.run("uv", "pip", "freeze", silent=not running_in_ci)
62 | session.run(
63 | "pytest",
64 | "-m",
65 | "not hdfs",
66 | "--cov",
67 | "--cov-config=pyproject.toml",
68 | *session.posargs,
69 | env={"COVERAGE_FILE": f".coverage.{session.python}"},
70 | )
71 |
72 |
73 | @nox.session(python=MIN_PYTHON, name="tests-minversion")
74 | def tests_minversion(session: nox.Session) -> None:
75 | session.install(f"fsspec=={FSSPEC_MIN_VERSION}", ".[tests,dev]")
76 | session.run("uv", "pip", "freeze", silent=not running_in_ci)
77 | session.run(
78 | "pytest",
79 | "-m",
80 | "not hdfs",
81 | "--cov",
82 | "--cov-config=pyproject.toml",
83 | *session.posargs,
84 | env={"COVERAGE_FILE": f".coverage.{session.python}"},
85 | )
86 |
87 |
88 | tests_minversion.__doc__ = f"Run the test suite with fsspec=={FSSPEC_MIN_VERSION}."
89 |
90 |
91 | @nox.session
92 | def lint(session: nox.Session) -> None:
93 | """Run pre-commit hooks."""
94 | session.install("pre-commit")
95 | session.install("-e", ".[tests]")
96 |
97 | args = *(session.posargs or ("--show-diff-on-failure",)), "--all-files"
98 | session.run("pre-commit", "run", *args)
99 |
100 |
101 | @nox.session
102 | def safety(session: nox.Session) -> None:
103 | """Scan dependencies for insecure packages."""
104 | session.install(".")
105 | session.install("safety")
106 | session.run("safety", "check", "--full-report")
107 |
108 |
109 | @nox.session
110 | def build(session: nox.Session) -> None:
111 | """Build sdists and wheels."""
112 | session.install("build", "setuptools", "twine")
113 | session.run("python", "-m", "build")
114 | dists = glob.glob("dist/*")
115 | session.run("twine", "check", *dists, silent=True)
116 |
117 |
118 | @nox.session
119 | def develop(session: nox.Session) -> None:
120 | """Set up a Python development environment for the project."""
121 | session.run("uv", "venv", external=True)
122 |
123 |
124 | @nox.session(name="type-checking", python=BASE_PYTHON)
125 | def type_checking(session):
126 | """Run mypy checks."""
127 | session.install("-e", ".[typechecking]")
128 | session.run("python", "-m", "mypy")
129 |
130 |
131 | @nox.session(name="type-safety", python=SUPPORTED_PYTHONS)
132 | def type_safety(session):
133 | """Run typesafety tests."""
134 | session.install("-e", ".[typechecking]")
135 | session.run(
136 | "python",
137 | "-m",
138 | "pytest",
139 | "-v",
140 | "-p",
141 | "pytest-mypy-plugins",
142 | "--mypy-pyproject-toml-file",
143 | "pyproject.toml",
144 | "typesafety",
145 | *session.posargs,
146 | )
147 |
148 |
149 | @nox.session(name="flavours-upgrade-deps", python=BASE_PYTHON)
150 | def upgrade_flavours(session):
151 | session.run("uvx", "pur", "-r", "dev/requirements.txt")
152 |
153 |
154 | @nox.session(name="flavours-codegen", python=BASE_PYTHON)
155 | def generate_flavours(session):
156 | session.install("-r", "dev/requirements.txt")
157 | with open("upath/_flavour_sources.py", "w") as target:
158 | session.run(
159 | "python",
160 | "dev/fsspec_inspector/generate_flavours.py",
161 | stdout=target,
162 | stderr=None,
163 | )
164 |
165 |
166 | @nox.session(name="docs-build", python=BASE_PYTHON)
167 | def docs_build(session):
168 | """Build the documentation in strict mode."""
169 | session.install("--group=docs", "-e", ".")
170 | session.run("mkdocs", "build")
171 |
172 |
173 | @nox.session(name="docs-serve", python=BASE_PYTHON)
174 | def docs_serve(session):
175 | """Serve the documentation with live reloading."""
176 | session.install("--group=docs", "-e", ".")
177 | session.run("mkdocs", "serve", "--no-strict")
178 |
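The noxfile above defines `FSSPEC_MIN_VERSION` with `@(lambda f: f())`, a decorator that calls the function once at import time and binds the name to its return value. The sketch below shows the same pattern in isolation. It assumes a hard-coded `DEPENDENCIES` list and uses naive string splitting in place of `packaging.requirements.Requirement`, so it is only an illustration of the idiom, not a replacement for the real parsing.

```python
# Assumed input, copied from the [project] dependencies shown in pyproject.toml.
DEPENDENCIES = ["fsspec >=2024.5.0", "pathlib-abc >=0.5.1,<0.6.0"]


def call_immediately(func):
    """Decorator that replaces the function with its return value."""
    return func()


@call_immediately
def FSSPEC_MIN() -> str:
    # Naive parsing for illustration: "<name> <spec>[,<spec>...]".
    for requirement in DEPENDENCIES:
        name, _, spec = requirement.partition(" ")
        if name == "fsspec":
            for clause in spec.split(","):
                if clause.startswith(">="):
                    return clause[2:].strip()
    raise RuntimeError("no lower bound found for fsspec")


# After decoration, FSSPEC_MIN is a plain string, not a callable.
```

Writing the decorator inline as `@(lambda f: f())` relies on the relaxed decorator grammar available from Python 3.9; a named helper such as `call_immediately` works on any version.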
--------------------------------------------------------------------------------
/upath/implementations/cloud.py:
--------------------------------------------------------------------------------
1 | from __future__ import annotations
2 |
3 | import sys
4 | from typing import TYPE_CHECKING
5 | from typing import Any
6 |
7 | from upath._chain import DEFAULT_CHAIN_PARSER
8 | from upath._flavour import upath_strip_protocol
9 | from upath.core import UPath
10 | from upath.types import JoinablePathLike
11 |
12 | if TYPE_CHECKING:
13 | from typing import Literal
14 |
15 | if sys.version_info >= (3, 11):
16 | from typing import Unpack
17 | else:
18 | from typing_extensions import Unpack
19 |
20 | from upath._chain import FSSpecChainParser
21 | from upath.types.storage_options import AzureStorageOptions
22 | from upath.types.storage_options import GCSStorageOptions
23 | from upath.types.storage_options import HfStorageOptions
24 | from upath.types.storage_options import S3StorageOptions
25 |
26 | __all__ = [
27 | "CloudPath",
28 | "GCSPath",
29 | "S3Path",
30 | "AzurePath",
31 | "HfPath",
32 | ]
33 |
34 |
35 | class CloudPath(UPath):
36 | __slots__ = ()
37 |
38 | @classmethod
39 | def _transform_init_args(
40 | cls,
41 | args: tuple[JoinablePathLike, ...],
42 | protocol: str,
43 | storage_options: dict[str, Any],
44 | ) -> tuple[tuple[JoinablePathLike, ...], str, dict[str, Any]]:
45 | for key in ["bucket", "netloc"]:
46 | bucket = storage_options.pop(key, None)
47 | if bucket:
48 | if str(args[0]).startswith("/"):
49 | args = (f"{protocol}://{bucket}{args[0]}", *args[1:])
50 | else:
51 | args0 = upath_strip_protocol(args[0])
52 | args = (f"{protocol}://{bucket}/", args0, *args[1:])
53 | break
54 | return super()._transform_init_args(args, protocol, storage_options)
55 |
56 | @property
57 | def root(self) -> str:
58 | if self._relative_base is not None:
59 | return ""
60 | return self.parser.sep
61 |
62 | def __str__(self) -> str:
63 | path = super().__str__()
64 | if self._relative_base is None:
65 | drive = self.parser.splitdrive(path)[0]
66 | if drive and path == f"{self.protocol}://{drive}":
67 | return f"{path}{self.root}"
68 | return path
69 |
70 | @property
71 | def path(self) -> str:
72 | self_path = super().path.rstrip(self.parser.sep)
73 | if (
74 | self._relative_base is None
75 | and self_path
76 | and self.parser.sep not in self_path
77 | ):
78 | return self_path + self.root
79 | return self_path
80 |
81 | def mkdir(
82 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False
83 | ) -> None:
84 | if not parents and not exist_ok and self.exists():
85 | raise FileExistsError(self.path)
86 | super().mkdir(mode=mode, parents=parents, exist_ok=exist_ok)
87 |
88 |
89 | class GCSPath(CloudPath):
90 | __slots__ = ()
91 |
92 | def __init__(
93 | self,
94 | *args: JoinablePathLike,
95 | protocol: Literal["gcs", "gs"] | None = None,
96 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER,
97 | **storage_options: Unpack[GCSStorageOptions],
98 | ) -> None:
99 | super().__init__(
100 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options
101 | )
102 | if not self.drive and len(self.parts) > 1:
103 | raise ValueError("non key-like path provided (bucket/container missing)")
104 |
105 | def mkdir(
106 | self, mode: int = 0o777, parents: bool = False, exist_ok: bool = False
107 | ) -> None:
108 | try:
109 | super().mkdir(mode=mode, parents=parents, exist_ok=exist_ok)
110 | except TypeError as err:
111 | if "unexpected keyword argument 'create_parents'" in str(err):
112 | self.fs.mkdir(self.path)
113 |
114 | def exists(self, *, follow_symlinks: bool = True) -> bool:
115 | # required for gcsfs<2025.5.0, see: https://github.com/fsspec/gcsfs/pull/676
116 | path = self.path
117 | if len(path) > 1:
118 | path = path.removesuffix(self.root)
119 | return self.fs.exists(path)
120 |
121 |
122 | class S3Path(CloudPath):
123 | __slots__ = ()
124 |
125 | def __init__(
126 | self,
127 | *args: JoinablePathLike,
128 | protocol: Literal["s3", "s3a"] | None = None,
129 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER,
130 | **storage_options: Unpack[S3StorageOptions],
131 | ) -> None:
132 | super().__init__(
133 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options
134 | )
135 | if not self.drive and len(self.parts) > 1:
136 | raise ValueError("non key-like path provided (bucket/container missing)")
137 |
138 |
139 | class AzurePath(CloudPath):
140 | __slots__ = ()
141 |
142 | def __init__(
143 | self,
144 | *args: JoinablePathLike,
145 | protocol: Literal["abfs", "abfss", "adl", "az"] | None = None,
146 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER,
147 | **storage_options: Unpack[AzureStorageOptions],
148 | ) -> None:
149 | super().__init__(
150 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options
151 | )
152 | if not self.drive and len(self.parts) > 1:
153 | raise ValueError("non key-like path provided (bucket/container missing)")
154 |
155 |
156 | class HfPath(CloudPath):
157 | __slots__ = ()
158 |
159 | def __init__(
160 | self,
161 | *args: JoinablePathLike,
162 | protocol: Literal["hf"] | None = None,
163 | chain_parser: FSSpecChainParser = DEFAULT_CHAIN_PARSER,
164 | **storage_options: Unpack[HfStorageOptions],
165 | ) -> None:
166 | super().__init__(
167 | *args, protocol=protocol, chain_parser=chain_parser, **storage_options
168 | )
169 |
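`CloudPath._transform_init_args` above lets callers pass the bucket as a `bucket=` or `netloc=` storage option and folds it back into the path arguments. The standalone sketch below mirrors that branching so the two cases are easier to follow. `strip_protocol` is a hypothetical simplification of `upath_strip_protocol`, and `apply_bucket_option` is a name invented for this example.

```python
from typing import Any


def strip_protocol(value: str) -> str:
    """Hypothetical stand-in for upath_strip_protocol: drop a leading 'proto://'."""
    _, sep, rest = value.partition("://")
    return rest if sep else value


def apply_bucket_option(
    args: tuple[str, ...], protocol: str, storage_options: dict[str, Any]
) -> tuple[tuple[str, ...], dict[str, Any]]:
    options = dict(storage_options)  # avoid mutating the caller's dict
    for key in ("bucket", "netloc"):
        bucket = options.pop(key, None)
        if bucket:
            first = str(args[0])
            if first.startswith("/"):
                # Absolute key: prefix it with protocol and bucket.
                args = (f"{protocol}://{bucket}{first}", *args[1:])
            else:
                # Otherwise make the bucket the root and join the rest onto it.
                args = (f"{protocol}://{bucket}/", strip_protocol(first), *args[1:])
            break
    return args, options


absolute = apply_bucket_option(("/key.txt",), "s3", {"bucket": "bucket"})
bare = apply_bucket_option(("s3://",), "s3", {"bucket": "bucket", "anon": True})
```

In both cases the bucket option is consumed, so it is not forwarded to the filesystem constructor along with the remaining storage options.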
--------------------------------------------------------------------------------
/upath/tests/test_extensions.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | from contextlib import nullcontext
4 |
5 | import pytest
6 |
7 | from upath import UnsupportedOperation
8 | from upath import UPath
9 | from upath.extensions import ProxyUPath
10 | from upath.implementations.local import FilePath
11 | from upath.implementations.local import PosixUPath
12 | from upath.implementations.local import WindowsUPath
13 | from upath.implementations.memory import MemoryPath
14 | from upath.tests.cases import BaseTests
15 |
16 |
17 | class TestProxyMemoryPath(BaseTests):
18 | @pytest.fixture(autouse=True)
19 | def path(self, local_testdir):
20 | if not local_testdir.startswith("/"):
21 | local_testdir = "/" + local_testdir
22 | self.path = ProxyUPath(f"memory:{local_testdir}")
23 | self.prepare_file_system()
24 |
25 | def test_is_ProxyUPath(self):
26 | assert isinstance(self.path, ProxyUPath)
27 |
28 | def test_is_not_MemoryPath(self):
29 | assert not isinstance(self.path, MemoryPath)
30 |
31 |
32 | class TestProxyFilePath(BaseTests):
33 | @pytest.fixture(autouse=True)
34 | def path(self, local_testdir):
35 | self.path = ProxyUPath(f"file://{local_testdir}")
36 | self.prepare_file_system()
37 |
38 | def test_is_ProxyUPath(self):
39 | assert isinstance(self.path, ProxyUPath)
40 |
41 | def test_is_not_FilePath(self):
42 | assert not isinstance(self.path, FilePath)
43 |
44 | def test_chmod(self):
45 | self.path.joinpath("file1.txt").chmod(0o777)
46 |
47 | def test_cwd(self):
48 | self.path.cwd()
49 | with pytest.raises(UnsupportedOperation):
50 | type(self.path).cwd()
51 |
52 |
53 | class TestProxyPathlibPath(BaseTests):
54 | @pytest.fixture(autouse=True)
55 | def path(self, local_testdir):
56 | self.path = ProxyUPath(f"{local_testdir}")
57 | self.prepare_file_system()
58 |
59 | def test_is_ProxyUPath(self):
60 | assert isinstance(self.path, ProxyUPath)
61 |
62 | def test_is_not_PosixUPath_WindowsUPath(self):
63 | assert not isinstance(self.path, (PosixUPath, WindowsUPath))
64 |
65 | def test_chmod(self):
66 | self.path.joinpath("file1.txt").chmod(0o777)
67 |
68 | @pytest.mark.skipif(
69 | sys.version_info < (3, 12), reason="storage options only handled in 3.12+"
70 | )
71 | def test_eq(self):
72 | super().test_eq()
73 |
74 | if sys.version_info < (3, 12):
75 |
76 | def test_storage_options_dont_affect_hash(self):
77 | # On Python < 3.12, storage_options trigger warnings for LocalPath
78 | with pytest.warns(
79 | UserWarning,
80 | match=r".*on python <= \(3, 11\) ignores protocol and storage_options",
81 | ):
82 | super().test_storage_options_dont_affect_hash()
83 |
84 | def test_group(self):
85 | pytest.importorskip("grp")
86 | self.path.group()
87 |
88 | def test_owner(self):
89 | pytest.importorskip("pwd")
90 | self.path.owner()
91 |
92 | def test_readlink(self):
93 | try:
94 | os.readlink
95 | except AttributeError:
96 | pytest.skip("os.readlink not available on this platform")
97 | with pytest.raises((OSError, UnsupportedOperation)):
98 | self.path.readlink()
99 |
100 | def test_protocol(self):
101 | assert self.path.protocol == ""
102 |
103 | def test_as_uri(self):
104 | assert self.path.as_uri().startswith("file://")
105 |
106 | if sys.version_info < (3, 10):
107 |
108 | def test_lstat(self):
109 | # On Python < 3.10, stat(follow_symlinks=False) triggers warnings
110 | with pytest.warns(
111 | UserWarning,
112 | match=r".*stat\(\) follow_symlinks=False is currently ignored",
113 | ):
114 | st = self.path.lstat()
115 | assert st is not None
116 |
117 | else:
118 |
119 | def test_lstat(self):
120 | st = self.path.lstat()
121 | assert st is not None
122 |
123 | def test_relative_to(self):
124 | base = self.path
125 | child = self.path / "folder1" / "file1.txt"
126 | relative = child.relative_to(base)
127 | assert str(relative) == f"folder1{os.sep}file1.txt"
128 |
129 | def test_cwd(self):
130 | self.path.cwd()
131 | with pytest.raises(UnsupportedOperation):
132 | type(self.path).cwd()
133 |
134 | def test_lchmod(self):
135 | # setup
136 | a = self.path.joinpath("a")
137 | b = self.path.joinpath("b")
138 | a.touch()
139 | b.symlink_to(a)
140 |
141 | # see: https://github.com/python/cpython/issues/108660#issuecomment-1854645898
142 | if hasattr(os, "lchmod") or os.chmod in os.supports_follow_symlinks:
143 | cm = nullcontext()
144 | else:
145 | cm = pytest.raises((UnsupportedOperation, NotImplementedError))
146 | with cm:
147 | b.lchmod(mode=0o777)
148 |
149 | def test_symlink_to(self):
150 | self.path.joinpath("link").symlink_to(self.path)
151 |
152 | def test_hardlink_to(self):
153 | try:
154 | self.path.joinpath("link").hardlink_to(self.path)
155 | except PermissionError:
156 | pass # hardlink may require elevated permissions
157 |
158 |
159 | def test_custom_subclass():
160 |
161 | class ReversePath(ProxyUPath):
162 | def read_bytes_reversed(self):
163 | return self.read_bytes()[::-1]
164 |
165 | def write_bytes_reversed(self, value):
166 | self.write_bytes(value[::-1])
167 |
168 | b = MemoryPath("memory://base")
169 |
170 | p = b.joinpath("file1")
171 | p.write_bytes(b"dlrow olleh")
172 |
173 | r = ReversePath("memory://base/file1")
174 | assert r.read_bytes_reversed() == b"hello world"
175 |
176 | r.parent.joinpath("file2").write_bytes_reversed(b"dlrow olleh")
177 | assert b.joinpath("file2").read_bytes() == b"hello world"
178 |
179 |
180 | def test_protocol_dispatch_deprecation_warning():
181 |
182 | class MyPath(UPath):
183 | _protocol_dispatch = False
184 |
185 | with pytest.warns(DeprecationWarning, match="_protocol_dispatch = False"):
186 | a = MyPath(".", protocol="memory")
187 |
188 | assert isinstance(a, MyPath)
189 |
--------------------------------------------------------------------------------
/docs/concepts/pathlib.md:
--------------------------------------------------------------------------------
1 | # Pathlib :snake:
2 |
3 | [pathlib](https://docs.python.org/3/library/pathlib.html) is a Python standard library module that provides an object-oriented interface for working with filesystem paths. It's the modern, pythonic way to handle file paths and filesystem operations, replacing the older string-based `os.path` approach.
4 |
5 | ## What is pathlib?
6 |
7 | Introduced in Python 3.4, pathlib represents filesystem paths as objects rather than strings.
8 |
9 | ### Path Objects
10 |
11 | In pathlib, paths are instances of `Path` (or platform-specific subclasses) that represent local filesystem paths:
12 |
13 | ```python
14 | from pathlib import Path
15 |
16 | # Create path objects
17 | p = Path("/home/user/documents")
18 | p = Path("relative/path/to/file.txt")
19 | p = Path.home() # User's home directory
20 | p = Path.cwd() # Current working directory
21 | ```
22 |
23 | ### Pure vs. Concrete Paths
24 |
25 | pathlib distinguishes between two types of paths:
26 |
27 | **Pure Paths** (`PurePath`, `PurePosixPath`, `PureWindowsPath`):
28 | - Only manipulate path strings
29 | - Don't access the filesystem
30 | - Work on any platform regardless of OS
31 | - Useful for path manipulation without I/O
32 |
33 | ```python
34 | from pathlib import PurePath, PurePosixPath, PureWindowsPath
35 |
36 | # Pure path - string manipulation only
37 | pure = PurePath("/home/user/file.txt")
38 | parent = pure.parent # Works
39 | name = pure.name # Works
40 | # exists = pure.exists() # AttributeError - no filesystem access
41 |
42 | # Platform-specific pure paths
43 | posix = PurePosixPath("/home/user/file.txt") # Always uses /
44 | windows = PureWindowsPath("C:\\Users\\file.txt") # Always uses \
45 | ```
46 |
47 | **Concrete Paths** (`Path`, `PosixPath`, `WindowsPath`):
48 | - Inherit from pure paths
49 | - Actually access the filesystem
50 | - Support operations like `.exists()`, `.stat()`, `.read_text()`
51 | - Platform-specific: `PosixPath` on Unix, `WindowsPath` on Windows
52 |
53 | ```python
54 | from pathlib import Path
55 |
56 | # Concrete path - filesystem operations
57 | p = Path("/home/user/file.txt")
58 | exists = p.exists() # Checks filesystem
59 | content = p.read_text() # Reads file
60 | size = p.stat().st_size # Gets file size
61 | ```
62 |
63 | ## When to use pathlib
64 |
65 | Use pathlib when you:
66 |
67 | - Work with local filesystem paths in Python
68 | - Need cross-platform path handling
69 | - Want object-oriented path manipulation
70 |
71 | ## What is pathlib-abc?
72 |
73 | [pathlib-abc](https://github.com/barneygale/pathlib-abc) is a Python library that defines abstract base classes (ABCs) for path-like objects. It provides a formal specification for the pathlib interface that can be implemented by different path types, not just local filesystem paths.
74 |
75 | ### Abstract Base Classes for Paths
76 |
77 | pathlib-abc extracts the core concepts from Python's pathlib module into abstract base classes. This allows library authors and framework developers to:
78 |
79 | 1. **Define path-like interfaces** that work across different storage backends
80 | 2. **Type hint** functions that accept any path-like object
81 | 3. **Implement custom path classes** that follow pathlib conventions
82 | 4. **Ensure compatibility** between different path implementations
83 |
84 | !!! info "Relationship to Python's pathlib"
85 |     Currently (as of Python 3.14), the standard library `pathlib.Path` does **not** inherit from public pathlib-abc classes. However, there is ongoing work to incorporate these ABCs into future Python releases.
86 |
87 | The library defines three main abstract base classes that represent different levels of path functionality:
88 |
89 | ### JoinablePath
90 |
91 | `JoinablePath` is the most basic path abstraction. It represents paths that can be constructed, manipulated, and joined together, but cannot necessarily access any actual filesystem.
92 |
93 | **Key capabilities:**
94 |
95 | - Path construction and manipulation
96 | - String operations on paths
97 | - Path component access (name, stem, suffix, parent, etc.)
98 | - Path joining with the `/` operator
99 | - Pattern matching
100 |
101 | Think of `JoinablePath` as equivalent to pathlib's `PurePath` - it only manipulates path strings.
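
As a rough illustration of these capabilities, the sketch below uses the standard library's `PurePosixPath` as a stand-in for a `JoinablePath`-like object. That substitution is an assumption made for illustration only; pathlib-abc's own classes are not imported here.

```python
from pathlib import PurePosixPath

# PurePosixPath stands in for a JoinablePath-like object (assumption):
# every operation below is pure string manipulation, no I/O happens.
base = PurePosixPath("/data/raw")
target = base / "2024" / "report.tar.gz"  # joining with the / operator

print(target)                 # /data/raw/2024/report.tar.gz
print(target.name)            # report.tar.gz
print(target.stem)            # report.tar
print(target.suffix)          # .gz
print(target.parent)          # /data/raw/2024
print(target.match("*.gz"))   # True
```

None of these calls needs the path to exist, which is what separates this level of the hierarchy from the readable and writable ones.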
102 |
103 | ### ReadablePath
104 |
105 | `ReadablePath` extends `JoinablePath` to add read-only filesystem operations. It represents paths where you can read data but not modify the filesystem.
106 |
107 | **Adds capabilities for:**
108 |
109 | - Reading file contents (`.read_text()`, `.read_bytes()`)
110 | - Opening files for reading
111 | - Checking file existence and type (`.exists()`, `.is_file()`, `.is_dir()`)
112 | - Listing directory contents (`.iterdir()`)
113 | - Globbing and pattern matching (`.glob()`, `.rglob()`)
114 | - Walking directory trees (`.walk()`)
115 | - Reading symlinks (`.readlink()`)
116 | - Accessing file metadata (`.info` property)
117 |
118 | ### WritablePath
119 |
120 | `WritablePath` extends `JoinablePath` (not `ReadablePath`) to add write operations. It represents paths where you can create, modify, and delete filesystem objects.
121 |
122 | **Adds capabilities for:**
123 |
124 | - Writing file contents (`.write_text()`, `.write_bytes()`)
125 | - Opening files for writing
126 | - Creating directories (`.mkdir()`)
127 | - Creating symlinks (`.symlink_to()`)
128 |
129 | !!! note "WritablePath Does Not Inherit from ReadablePath"
130 |     `WritablePath` does NOT inherit from `ReadablePath`. A path that is writable is not automatically readable. In practice, most filesystem paths are both readable and writable (like `UPath` which inherits from both), but the separation allows for specialized use cases like write-only destinations or read-only sources.
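
The shape of this hierarchy can be sketched with plain `abc` classes. The class names below (`Joinable`, `Readable`, `Writable`, `WriteOnlySink`, `DictPath`) are hypothetical stand-ins written for this page; they are not pathlib-abc's actual classes and only mirror the inheritance structure described above.

```python
from abc import ABC, abstractmethod


class Joinable(ABC):
    """Toy stand-in for a joinable path: it only stores the path string."""

    def __init__(self, path: str) -> None:
        self.path = path


class Readable(Joinable):
    @abstractmethod
    def read_bytes(self) -> bytes: ...


class Writable(Joinable):  # extends Joinable, not Readable
    @abstractmethod
    def write_bytes(self, data: bytes) -> int: ...


class WriteOnlySink(Writable):
    """Hypothetical write-only destination, e.g. an upload target."""

    def __init__(self, path: str) -> None:
        super().__init__(path)
        self.received = b""

    def write_bytes(self, data: bytes) -> int:
        self.received += data
        return len(data)


class DictPath(Readable, Writable):
    """Readable and writable, backed by a shared in-memory dict."""

    store: dict[str, bytes] = {}

    def read_bytes(self) -> bytes:
        return self.store[self.path]

    def write_bytes(self, data: bytes) -> int:
        self.store[self.path] = data
        return len(data)


sink = WriteOnlySink("upload/target.bin")
sink.write_bytes(b"payload")
print(isinstance(sink, Writable), isinstance(sink, Readable))  # True False

both = DictPath("notes.txt")
both.write_bytes(b"hello")
print(both.read_bytes())  # b'hello'
```

A function that only needs to write can then accept `Writable`, and a write-only sink satisfies it without pretending to support reads.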
131 |
132 | ## Learn More
133 |
134 | For comprehensive information about pathlib:
135 |
136 | - **Official documentation**: [Python pathlib documentation](https://docs.python.org/3/library/pathlib.html)
137 | - **PEP 428**: [The pathlib module – object-oriented filesystem paths](https://www.python.org/dev/peps/pep-0428/)
138 | - **Comparison with os.path**: [Correspondence to tools in the os module](https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module)
139 |
140 | For comprehensive information about pathlib-abc:
141 |
142 | - **GitHub repository**: [barneygale/pathlib-abc](https://github.com/barneygale/pathlib-abc)
143 |
144 | For using pathlib-style paths with remote and cloud filesystems, see [upath.md](upath.md).
145 |
--------------------------------------------------------------------------------
/upath/tests/implementations/test_zip.py:
--------------------------------------------------------------------------------
1 | import os
2 | import zipfile
3 |
4 | import pytest
5 |
6 | from upath import UPath
7 | from upath.implementations.zip import ZipPath
8 |
9 | from ..cases import BaseTests
10 |
11 |
12 | @pytest.fixture(scope="function")
13 | def zipped_testdir_file(local_testdir, tmp_path_factory):
14 | base = tmp_path_factory.mktemp("zippath")
15 | zip_path = base / "test.zip"
16 | with zipfile.ZipFile(zip_path, "w") as zf:
17 | for root, _, files in os.walk(local_testdir):
18 | for file in files:
19 | full_path = os.path.join(root, file)
20 | arcname = os.path.relpath(full_path, start=local_testdir)
21 | zf.write(full_path, arcname=arcname)
22 | return str(zip_path)
23 |
24 |
25 | @pytest.fixture(scope="function")
26 | def empty_zipped_testdir_file(tmp_path):
27 | tmp_path = tmp_path.joinpath("zippath")
28 | tmp_path.mkdir()
29 | zip_path = tmp_path / "test.zip"
30 |
31 | with zipfile.ZipFile(zip_path, "w"):
32 | pass
33 | return str(zip_path)
34 |
35 |
36 | class TestZipPath(BaseTests):
37 |
38 | @pytest.fixture(autouse=True)
39 | def path(self, zipped_testdir_file, request):
40 | try:
41 | (mode,) = request.param
42 | except (ValueError, TypeError, AttributeError):
43 | mode = "r"
44 | self.path = UPath("zip://", fo=zipped_testdir_file, mode=mode)
45 | try:
46 | yield
47 | finally:
48 | self.path.fs.clear_instance_cache()
49 |
50 | def test_is_ZipPath(self):
51 | assert isinstance(self.path, ZipPath)
52 |
53 | @pytest.mark.parametrize(
54 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
55 | )
56 | def test_mkdir(self):
57 | super().test_mkdir()
58 |
59 | @pytest.mark.parametrize(
60 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
61 | )
62 | def test_mkdir_exists_ok_true(self):
63 | super().test_mkdir_exists_ok_true()
64 |
65 | @pytest.mark.parametrize(
66 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
67 | )
68 | def test_mkdir_exists_ok_false(self):
69 | super().test_mkdir_exists_ok_false()
70 |
71 | @pytest.mark.parametrize(
72 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
73 | )
74 | def test_mkdir_parents_true_exists_ok_true(self):
75 | super().test_mkdir_parents_true_exists_ok_true()
76 |
77 | @pytest.mark.parametrize(
78 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
79 | )
80 | def test_mkdir_parents_true_exists_ok_false(self):
81 | super().test_mkdir_parents_true_exists_ok_false()
82 |
83 | def test_rename(self):
84 | with pytest.raises(NotImplementedError):
85 | super().test_rename() # delete is not implemented in fsspec
86 |
87 | def test_move_local(self, tmp_path):
88 | with pytest.raises(NotImplementedError):
89 | super().test_move_local(tmp_path) # delete is not implemented in fsspec
90 |
91 | def test_move_into_local(self, tmp_path):
92 | with pytest.raises(NotImplementedError):
93 | super().test_move_into_local(
94 | tmp_path
95 | ) # delete is not implemented in fsspec
96 |
97 | def test_move_memory(self, clear_fsspec_memory_cache):
98 | with pytest.raises(NotImplementedError):
99 | super().test_move_memory(clear_fsspec_memory_cache)
100 |
101 | def test_move_into_memory(self, clear_fsspec_memory_cache):
102 | with pytest.raises(NotImplementedError):
103 | super().test_move_into_memory(clear_fsspec_memory_cache)
104 |
105 | @pytest.mark.parametrize(
106 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
107 | )
108 | def test_touch(self):
109 | super().test_touch()
110 |
111 | @pytest.mark.parametrize(
112 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
113 | )
114 | def test_touch_unlink(self):
115 | with pytest.raises(NotImplementedError):
116 | super().test_touch_unlink() # delete is not implemented in fsspec
117 |
118 | @pytest.mark.parametrize(
119 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
120 | )
121 | def test_write_bytes(self):
122 | fn = "test_write_bytes.txt"
123 | s = b"hello_world"
124 | path = self.path.joinpath(fn)
125 | path.write_bytes(s)
126 | so = {**path.storage_options, "mode": "r"}
127 | urlpath = str(path)
128 | path.fs.close()
129 | assert UPath(urlpath, **so).read_bytes() == s
130 |
131 | @pytest.mark.parametrize(
132 | "path", [("w",)], ids=["zipfile_mode_write"], indirect=True
133 | )
134 | def test_write_text(self):
135 | fn = "test_write_text.txt"
136 | s = "hello_world"
137 | path = self.path.joinpath(fn)
138 | path.write_text(s)
139 | so = {**path.storage_options, "mode": "r"}
140 | urlpath = str(path)
141 | path.fs.close()
142 | assert UPath(urlpath, **so).read_text() == s
143 |
144 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode")
145 | def test_fsspec_compat(self):
146 | pass
147 |
148 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode")
149 | def test_rename_with_target_absolute(self, target_factory):
150 | return super().test_rename_with_target_absolute(target_factory)
151 |
152 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode")
153 | def test_write_text_encoding(self):
154 | return super().test_write_text_encoding()
155 |
156 | @pytest.mark.skip(reason="fsspec zipfile filesystem is either read xor write mode")
157 | def test_write_text_errors(self):
158 | return super().test_write_text_errors()
159 |
160 |
161 | @pytest.fixture(scope="function")
162 | def zipped_testdir_file_in_memory(zipped_testdir_file, clear_fsspec_memory_cache):
163 | p = UPath(zipped_testdir_file, protocol="file")
164 | t = p.move(UPath("memory:///myzipfile.zip"))
165 | assert t.protocol == "memory"
166 | assert t.exists()
167 | yield t.as_uri()
168 |
169 |
170 | class TestChainedZipPath(TestZipPath):
171 |
172 | @pytest.fixture(autouse=True)
173 | def path(self, zipped_testdir_file_in_memory, request):
174 | try:
175 | (mode,) = request.param
176 | except (ValueError, TypeError, AttributeError):
177 | mode = "r"
178 | self.path = UPath(
179 | "zip://", fo="/myzipfile.zip", mode=mode, target_protocol="memory"
180 | )
181 |
--------------------------------------------------------------------------------
/docs/concepts/fsspec.md:
--------------------------------------------------------------------------------
1 | # Filesystem Spec :file_folder:
2 |
3 | [fsspec](https://filesystem-spec.readthedocs.io/) is a Python library that provides a unified, pythonic interface for working with different storage backends. It abstracts away the differences between various storage systems, allowing you to interact with local files, cloud storage, remote systems, and specialty filesystems using a consistent API.
4 |
5 | ## What is fsspec?
6 |
7 | fsspec is both a **specification** and a **collection of implementations** for pythonic filesystems. The project defines a standard interface that filesystem implementations should follow, then provides concrete implementations for dozens of different storage backends.
8 |
9 | The core idea is simple: whether you're working with files on your local disk, objects in an S3 bucket, blobs in Azure storage, or data over HTTP, the API to interact with them should be the same.
10 |
11 | ### Core Functionality
12 |
13 | fsspec provides filesystem objects with methods for common operations. All filesystem implementations inherit from `fsspec.spec.AbstractFileSystem`, which defines the standard interface that all filesystems must implement:
14 |
15 | ```python
16 | import fsspec
17 |
18 | # Create a filesystem instance
19 | # Returns an AbstractFileSystem subclass for the specified protocol
20 | fs = fsspec.filesystem('s3', anon=True)
21 |
22 | # List files
23 | files = fs.ls('my-bucket/data/')
24 |
25 | # Check if file exists
26 | exists = fs.exists('my-bucket/data/file.txt')
27 |
28 | # Get file info
29 | info = fs.info('my-bucket/data/file.txt')
30 |
31 | # Read file
32 | with fs.open('my-bucket/data/file.txt', 'r') as f:
33 |     content = f.read()
34 |
35 | # Write file
36 | with fs.open('my-bucket/output.txt', 'w') as f:
37 |     f.write('Hello, World!')
38 |
39 | # Copy files
40 | fs.cp('my-bucket/source.txt', 'my-bucket/dest.txt')
41 |
42 | # Delete files
43 | fs.rm('my-bucket/file.txt')
44 | ```
45 |
46 | ### Protocols
47 |
48 | fsspec identifies filesystem types via **protocols**. Each protocol corresponds to a specific filesystem implementation:
49 |
50 | - `file://` - Local filesystem
51 | - `memory://` - In-memory filesystem (temporary, non-persistent)
52 | - `s3://` or `s3a://` - Amazon S3
53 | - `gs://` or `gcs://` - Google Cloud Storage
54 | - `az://` or `abfs://` - Azure Blob Storage
55 | - `adl://` - Azure Data Lake Gen1
56 | - `abfss://` - Azure Data Lake Gen2 (secure)
57 | - `http://` or `https://` - HTTP(S) access
58 | - `ftp://` - FTP
59 | - `sftp://` or `ssh://` - SFTP over SSH
60 | - `smb://` - Samba/Windows file shares
61 | - `webdav://` or `webdav+http://` - WebDAV
62 | - `hdfs://` - Hadoop Distributed File System
63 | - `hf://` - Hugging Face Hub
64 | - `github://` - GitHub repositories
65 | - `zip://` - ZIP archives
66 | - `tar://` - TAR archives
67 | - `gzip://` - GZIP compressed files
68 | - `cached://` - Caching layer over other filesystems
69 |
70 | And many more. See the [fsspec documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations) for the complete list.
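
To make the role of the protocol prefix concrete, here is a minimal sketch of splitting a urlpath into its protocol and path. `split_urlpath` is a hypothetical helper written for this page, not fsspec's API, and treating prefix-less paths as `file` is an assumption of the sketch.

```python
def split_urlpath(urlpath: str) -> tuple[str, str]:
    """Split 'protocol://path' into (protocol, path).

    Hypothetical helper: paths without a protocol prefix are
    assumed to refer to the local filesystem.
    """
    protocol, sep, rest = urlpath.partition("://")
    if not sep:
        return "file", urlpath
    return protocol, rest


print(split_urlpath("s3://my-bucket/data/file.txt"))  # ('s3', 'my-bucket/data/file.txt')
print(split_urlpath("/tmp/file.txt"))                 # ('file', '/tmp/file.txt')
```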
71 |
72 | ### Storage Options
73 |
74 | Each filesystem implementation accepts different configuration parameters called **storage options**. These control authentication, connection settings, caching behavior, and more.
75 | They are usually provided as keyword parameters to the
76 | specific filesystem class on instantiation.
77 |
78 | Common storage option patterns:
79 |
80 | ```python
81 | import fsspec
82 |
83 | # Authentication credentials
84 | fs = fsspec.filesystem('s3', key='...', secret='...')
85 |
86 | # Anonymous/public access
87 | fs = fsspec.filesystem('s3', anon=True)
88 |
89 | # Tokens and service accounts
90 | fs = fsspec.filesystem('gs', token='path/to/creds.json')
91 |
92 | # Connection settings
93 | fs = fsspec.filesystem('sftp', host='...', port=22, username='...')
94 |
95 | # Behavioral options
96 | fs = fsspec.filesystem('s3', use_ssl=True, default_block_size=5*2**20)
97 | ```
98 |
99 | Refer to the [fsspec documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations) for details on what each filesystem supports.
100 |
101 | ### URI-Based Access: urlpaths
102 |
103 | fsspec supports opening files directly using URIs. A resource is
104 | usually fully identified by its protocol, storage options, and path. The protocol and path can be
105 | combined into a single urlpath string:
106 |
107 | ```python
108 | import fsspec
109 |
110 | # resource
111 | protocol = "s3"
112 | storage_options = {"anon": True}
113 | path = "bucket/file.txt"
114 |
115 | # Create filesystem and open path
116 | fs = fsspec.filesystem(protocol, **storage_options)
117 | with fs.open(path, "r") as f:
118 |     content = f.read()
119 |
120 | # Or open a file via its urlpath with storage_options
121 | with fsspec.open(f"{protocol}://{path}", "r", **storage_options) as f:
122 |     content = f.read()
123 | ```
124 |
125 | ### Chained Filesystems
126 |
127 | fsspec supports composing filesystems together using the `::` separator. This allows one filesystem to be used as the target
128 | filesystem for another:
129 |
130 | ```python
131 | import fsspec
132 |
133 | # Access a file inside a ZIP archive on S3 (S3 options go under the `s3` key)
134 | with fsspec.open('zip://data.csv::s3://bucket/archive.zip', 'r', s3={'anon': True}) as f:
135 |     content = f.read()
136 |
137 | # Read a file from a TAR archive on S3
138 | with fsspec.open('tar://file.txt::s3://bucket/archive.tar', 'r', s3={'anon': True}) as f:
139 |     content = f.read()
140 | ```
141 |
142 | ### Caching
143 |
144 | fsspec includes powerful caching capabilities to improve performance when accessing remote files:
145 |
146 | ```python
147 | import fsspec
148 |
149 | # Directory listing cache
150 | fs = fsspec.filesystem(
151 |     's3',
152 |     anon=True,
153 |     use_listings_cache=True,
154 |     listings_expiry_time=600,  # Cache listings for 10 minutes
155 | )
156 |
157 | # File-level caching
158 | cached_fs = fsspec.filesystem(
159 |     'filecache',
160 |     target_protocol='s3',
161 |     target_options={'anon': True},
162 |     cache_storage='/tmp/fsspec-cache',
163 | )
164 | ```
165 |
166 | ## When to use fsspec directly
167 |
168 | You typically use fsspec directly when you:
169 |
170 | - Need filesystem-level operations (`ls`, `cp`, `rm`, `find`)
171 | - Want to work with file-like objects without path abstractions
172 | - Need low-level control over filesystem behavior
173 | - Are integrating with data libraries that accept fsspec URLs
174 | - Want to implement custom filesystem wrappers
175 | - Want to avoid the overhead of UPath instance creation
176 |
177 | ## Learn More
178 |
179 | For comprehensive information about fsspec:
180 |
181 | - **Official documentation**: [fsspec.readthedocs.io](https://filesystem-spec.readthedocs.io/)
182 | - **API reference**: [Built-in filesystem implementations](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations)
183 | - **GitHub repository**: [fsspec/filesystem_spec](https://github.com/fsspec/filesystem_spec)
184 | - **Usage guides**: [Examples and tutorials](https://filesystem-spec.readthedocs.io/en/latest/usage.html)
185 |
186 | For using fsspec with a pathlib-style interface, see [upath.md](upath.md).
187 |
--------------------------------------------------------------------------------
/docs/api/implementations.md:
--------------------------------------------------------------------------------
1 | # Implementations :file_folder:
2 |
3 | Universal Pathlib provides specialized UPath subclasses for different filesystem protocols.
4 | Each implementation is optimized for its respective filesystem and may provide additional
5 | protocol-specific functionality.
6 |
7 | ## upath.implementations.cloud
8 |
9 | ::: upath.implementations.cloud.S3Path
10 | options:
11 | heading_level: 3
12 | show_root_heading: true
13 | show_root_full_path: false
14 | members: []
15 | show_bases: true
16 |
17 | **Protocols:** `s3://`, `s3a://`
18 |
19 | Amazon S3 compatible object storage implementation.
20 |
21 | ::: upath.implementations.cloud.GCSPath
22 | options:
23 | heading_level: 3
24 | show_root_heading: true
25 | show_root_full_path: false
26 | members: []
27 | show_bases: true
28 |
29 | **Protocols:** `gs://`, `gcs://`
30 |
31 | Google Cloud Storage implementation.
32 |
33 | ::: upath.implementations.cloud.AzurePath
34 | options:
35 | heading_level: 3
36 | show_root_heading: true
37 | show_root_full_path: false
38 | members: []
39 | show_bases: true
40 |
41 | **Protocols:** `abfs://`, `abfss://`, `adl://`, `az://`
42 |
43 | Azure Blob Storage and Azure Data Lake implementation.
44 |
45 | ::: upath.implementations.cloud.HfPath
46 | options:
47 | heading_level: 3
48 | show_root_heading: true
49 | show_root_full_path: false
50 | members: []
51 | show_bases: true
52 |
53 | **Protocols:** `hf://`
54 |
55 | Hugging Face Hub implementation for accessing models, datasets, and spaces.
56 |
57 | ---
58 |
59 | ## upath.implementations.local
60 |
61 | ::: upath.implementations.local.PosixUPath
62 | options:
63 | heading_level: 3
64 | show_root_heading: true
65 | show_root_full_path: false
66 | members: []
67 | show_bases: true
68 |
69 | POSIX-style local filesystem paths (Linux, macOS, Unix).
70 |
71 | ::: upath.implementations.local.WindowsUPath
72 | options:
73 | heading_level: 3
74 | show_root_heading: true
75 | show_root_full_path: false
76 | members: []
77 | show_bases: true
78 |
79 | Windows-style local filesystem paths.
80 |
81 | ::: upath.implementations.local.FilePath
82 | options:
83 | heading_level: 3
84 | show_root_heading: true
85 | show_root_full_path: false
86 | members: []
87 | show_bases: true
88 |
89 | **Protocols:** `file://`, `local://`
90 |
91 | File URI implementation for local filesystem.
92 |
93 | ---
94 |
95 | ## upath.implementations.http
96 |
97 | ::: upath.implementations.http.HTTPPath
98 | options:
99 | heading_level: 3
100 | show_root_heading: true
101 | show_root_full_path: false
102 | members: []
103 | show_bases: true
104 |
105 | **Protocols:** `http://`, `https://`
106 |
107 | HTTP/HTTPS read-only filesystem implementation.
108 |
109 | ---
110 |
111 | ## upath.implementations.sftp
112 |
113 | ::: upath.implementations.sftp.SFTPPath
114 | options:
115 | heading_level: 3
116 | show_root_heading: true
117 | show_root_full_path: false
118 | members: []
119 | show_bases: true
120 |
121 | **Protocols:** `sftp://`, `ssh://`
122 |
123 | SFTP (SSH File Transfer Protocol) implementation.
124 |
125 | ---
126 |
127 | ## upath.implementations.smb
128 |
129 | ::: upath.implementations.smb.SMBPath
130 | options:
131 | heading_level: 3
132 | show_root_heading: true
133 | show_root_full_path: false
134 | members: []
135 | show_bases: true
136 |
137 | **Protocol:** `smb://`
138 |
139 | SMB/CIFS network filesystem implementation.
140 |
141 | ---
142 |
143 | ## upath.implementations.webdav
144 |
145 | ::: upath.implementations.webdav.WebdavPath
146 | options:
147 | heading_level: 3
148 | show_root_heading: true
149 | show_root_full_path: false
150 | members: []
151 | show_bases: true
152 |
153 | **Protocols:** `webdav://`, `webdav+http://`, `webdav+https://`
154 |
155 | WebDAV protocol implementation.
156 |
157 | ---
158 |
159 | ## upath.implementations.hdfs
160 |
161 | ::: upath.implementations.hdfs.HDFSPath
162 | options:
163 | heading_level: 3
164 | show_root_heading: true
165 | show_root_full_path: false
166 | members: []
167 | show_bases: true
168 |
169 | **Protocol:** `hdfs://`
170 |
171 | Hadoop Distributed File System implementation.
172 |
173 | ---
174 |
175 | ## upath.implementations.github
176 |
177 | ::: upath.implementations.github.GitHubPath
178 | options:
179 | heading_level: 3
180 | show_root_heading: true
181 | show_root_full_path: false
182 | members: []
183 | show_bases: true
184 |
185 | **Protocol:** `github://`
186 |
187 | GitHub repository file access implementation.
188 |
189 | ---
190 |
191 | ## upath.implementations.zip
192 |
193 | ::: upath.implementations.zip.ZipPath
194 | options:
195 | heading_level: 3
196 | show_root_heading: true
197 | show_root_full_path: false
198 | members: []
199 | show_bases: true
200 |
201 | **Protocol:** `zip://`
202 |
203 | ZIP archive filesystem implementation.
204 |
205 | ---
206 |
207 | ## upath.implementations.tar
208 |
209 | ::: upath.implementations.tar.TarPath
210 | options:
211 | heading_level: 3
212 | show_root_heading: true
213 | show_root_full_path: false
214 | members: []
215 | show_bases: true
216 |
217 | **Protocol:** `tar://`
218 |
219 | TAR archive filesystem implementation.
220 |
221 | ---
222 |
223 | ## upath.implementations.memory
224 |
225 | ::: upath.implementations.memory.MemoryPath
226 | options:
227 | heading_level: 3
228 | show_root_heading: true
229 | show_root_full_path: false
230 | members: []
231 | show_bases: true
232 |
233 | **Protocol:** `memory://`
234 |
235 | In-memory filesystem implementation for testing and temporary storage.
236 |
237 | ---
238 |
239 | ## upath.implementations.data
240 |
241 | ::: upath.implementations.data.DataPath
242 | options:
243 | heading_level: 3
244 | show_root_heading: true
245 | show_root_full_path: false
246 | members: []
247 | show_bases: true
248 |
249 | **Protocol:** `data://`
250 |
251 | Data URL scheme implementation for embedded data.
252 |
253 | ---
254 |
255 | ## upath.implementations.ftp
256 |
257 | ::: upath.implementations.ftp.FTPPath
258 | options:
259 | heading_level: 3
260 | show_root_heading: true
261 | show_root_full_path: false
262 | members: []
263 | show_bases: true
264 |
265 | **Protocol:** `ftp://`
266 |
267 | FTP (File Transfer Protocol) implementation.
268 |
269 | ---
270 |
271 | ## upath.implementations.cached
272 |
273 | ::: upath.implementations.cached.SimpleCachePath
274 | options:
275 | heading_level: 3
276 | show_root_heading: true
277 | show_root_full_path: false
278 | members: []
279 | show_bases: true
280 |
281 | **Protocol:** `simplecache://`
282 |
283 | Local caching wrapper for remote filesystems.
284 |
285 | ---
286 |
287 | ## See Also :link:
288 |
289 | - [UPath](index.md) - Main UPath class documentation
290 | - [Registry](registry.md) - Implementation registry
291 | - [Extensions](extensions.md) - Extending UPath functionality
292 |
--------------------------------------------------------------------------------
/docs/index.md:
--------------------------------------------------------------------------------
1 |
9 |
22 |
23 | ---
24 |
25 | **Universal Pathlib** is a Python library that extends the [`pathlib_abc.JoinablePath`][pathlib_abc], [`pathlib_abc.ReadablePath`][pathlib_abc], and [`pathlib_abc.WritablePath`][pathlib_abc] API to give you a unified, Pythonic interface for working with files, whether they're on your local machine, in S3, on GitHub, or anywhere else. Built on top of [`filesystem_spec`][fsspec], it brings the convenience of a [`pathlib.Path`][pathlib]-like interface to cloud storage, remote filesystems, and more! :sparkles:
26 |
27 | [pathlib_abc]: https://github.com/barneygale/pathlib-abc
28 | [pathlib]: https://docs.python.org/3/library/pathlib.html
29 | [fsspec]: https://filesystem-spec.readthedocs.io/en/latest/intro.html
30 |
31 | ---
32 |
33 | If you enjoy working with Python's [pathlib][pathlib] objects to operate on local file system paths,
34 | universal pathlib provides the same interface for many supported [filesystem_spec][fsspec]
35 | implementations, from cloud-native object storage like `Amazon's S3 Storage`, `Google Cloud Storage`,
36 | `Azure Blob Storage`, to `http`, `sftp`, `memory` stores, and many more...
37 |
38 | If you're familiar with [filesystem_spec][fsspec], then universal pathlib provides a convenient
39 | way to handle the path, protocol and storage options of an object stored on an fsspec filesystem in a
40 | single container (`upath.UPath`). And it further provides a pathlib interface to do path operations on the
41 | fsspec urlpath.
42 |
43 | The great part is, if you're familiar with the [pathlib.Path][pathlib] API, you can immediately
44 | switch from working with local paths to working on remote and virtual filesystems by simply using
45 | the `UPath` class:
46 |
47 | === "The Problem"
48 |
49 | ```python
50 | # Local files: use pathlib
51 | from pathlib import Path
52 | local_file = Path("data/file.txt")
53 | content = local_file.read_text()
54 |
55 | # S3 files: use boto3/s3fs
56 | import boto3
57 | s3 = boto3.client('s3')
58 | obj = s3.get_object(Bucket='bucket', Key='data/file.txt')
59 | content = obj['Body'].read().decode('utf-8')
60 |
61 | # Different APIs, different patterns 😫
62 | ```
63 |
64 | === "The Solution"
65 |
66 | ```python
67 | # All files: use UPath! ✨
68 | from upath import UPath
69 |
70 | local_file = UPath("data/file.txt")
71 | s3_file = UPath("s3://bucket/data/file.txt")
72 |
73 | # Same API everywhere! 🎉
74 | content = local_file.read_text()
75 | content = s3_file.read_text()
76 | ```
77 |
78 | [Learn more about why you should use Universal Pathlib →](why.md){ .md-button }
79 |
80 | ---
81 |
82 | ## Quick Start :rocket:
83 |
84 | ### Installation
85 |
86 | ```bash
87 | pip install universal-pathlib
88 | ```
89 |
90 | !!! tip "Installing for specific filesystems"
91 | To use cloud storage or other remote filesystems, install the necessary fsspec extras:
92 |
93 | ```bash
94 | pip install "universal-pathlib" "fsspec[s3,gcs,azure]"
95 | ```
96 |
97 | See the [Installation Guide](install.md) for more details.
98 |
99 | ### TL;DR Examples
100 |
101 | ```python
102 | from upath import UPath
103 |
104 | # Works with local paths
105 | local_path = UPath("documents/notes.txt")
106 | local_path.write_text("Hello, World!")
107 | print(local_path.read_text()) # "Hello, World!"
108 |
109 | # Works with S3
110 | s3_path = UPath("s3://my-bucket/data/processed/results.csv")
111 | if s3_path.exists():
112 |     data = s3_path.read_text()
113 |
114 | # Works with HTTP
115 | http_path = UPath("https://example.com/data/file.json")
116 | if http_path.exists():
117 |     content = http_path.read_bytes()
118 |
119 | # Works with many more! 🌟
120 | ```
121 |
122 | ---
123 |
124 | ## Currently supported filesystems
125 |
126 | - :fontawesome-solid-folder: `file:` and `local:` Local filesystem
127 | - :fontawesome-solid-memory: `memory:` Ephemeral filesystem in RAM
128 | - :fontawesome-brands-microsoft: `az:`, `adl:`, `abfs:` and `abfss:` Azure Storage _(requires `adlfs`)_
129 | - :fontawesome-solid-database: `data:` RFC 2397 style data URLs _(requires `fsspec>=2023.12.2`)_
130 | - :fontawesome-solid-network-wired: `ftp:` FTP filesystem
131 | - :fontawesome-brands-github: `github:` GitHub repository filesystem
132 | - :fontawesome-solid-globe: `http:` and `https:` HTTP(S)-based filesystem
133 | - :fontawesome-solid-server: `hdfs:` Hadoop distributed filesystem
134 | - :fontawesome-brands-google: `gs:` and `gcs:` Google Cloud Storage _(requires `gcsfs`)_
135 | - :simple-huggingface: `hf:` Hugging Face Hub _(requires `huggingface_hub`)_
136 | - :fontawesome-brands-aws: `s3:` and `s3a:` AWS S3 _(requires `s3fs`)_
137 | - :fontawesome-solid-network-wired: `sftp:` and `ssh:` SFTP and SSH filesystems _(requires `paramiko`)_
138 | - :fontawesome-solid-share-nodes: `smb:` SMB filesystems _(requires `smbprotocol`)_
139 | - :fontawesome-solid-cloud: `webdav:`, `webdav+http:` and `webdav+https:` WebDAV _(requires `webdav4[fsspec]`)_
140 |
141 | !!! info "Untested Filesystems"
142 | Other fsspec-compatible filesystems likely work through the default implementation. If you encounter issues, please [report them on our issue tracker](https://github.com/fsspec/universal_pathlib/issues)! We're happy to add official support!
143 |
144 | ---
145 |
146 | ## Getting Help :question:
147 |
148 | Need help? We're here for you!
149 |
150 | - :fontawesome-brands-github: [GitHub Issues](https://github.com/fsspec/universal_pathlib/issues) - Report bugs or request features
151 | - :material-book-open-variant: [Documentation](https://universal-pathlib.readthedocs.io/) - You're reading it!
152 |
153 | !!! tip "Before Opening an Issue"
154 | Please check if your question has already been answered in the documentation or existing issues.
155 |
156 | ---
157 |
158 | ## License :page_with_curl:
159 |
160 | Universal Pathlib is distributed under the [MIT license](https://github.com/fsspec/universal_pathlib/blob/main/LICENSE), making it free and open source software. Use it freely in your projects!
161 |
162 | ---
163 |
164 |
165 |
166 | **Ready to get started?**
167 |
168 | [Install Now](install.md){ .md-button .md-button--primary }
169 |
170 |
171 |
--------------------------------------------------------------------------------
/docs/why.md:
--------------------------------------------------------------------------------
1 | # Why Use Universal Pathlib? :sparkles:
2 |
3 | If you've ever worked with cloud storage or remote filesystems in Python, you've probably experienced the frustration of juggling different APIs. Universal Pathlib solves this problem by bringing the familiar `pathlib.Path` interface to *any* fsspec-supported filesystem.
4 |
5 | ---
6 |
7 | ## The Problem: Filesystem-Dependent APIs :broken_heart:
8 |
9 | Let's face it: working with files across different storage backends is messy.
10 |
11 | ### Example: The Old Way
12 |
13 | ```python
14 | # Local files
15 | from pathlib import Path
16 | local_file = Path("data/results.csv")
17 | with local_file.open('r') as f:
18 |     data = f.read()
19 |
20 | # S3 files
21 | import boto3
22 | s3 = boto3.resource('s3')
23 | obj = s3.Object('my-bucket', 'data/results.csv')
24 | data = obj.get()['Body'].read().decode('utf-8')
25 |
26 | # Azure Blob Storage
27 | from azure.storage.blob import BlobServiceClient
28 | service_client = BlobServiceClient.from_connection_string(conn_str)  # conn_str: your connection string
29 | container_client = service_client.get_container_client('my-container')
30 | blob_client = container_client.get_blob_client('data/results.csv')
31 | data = blob_client.download_blob().readall().decode('utf-8')
32 |
33 | # Three different APIs, three different patterns 😫
34 | ```
35 |
36 | Each storage backend has its own:
37 |
38 | - :material-api: **Different API** - Learn a new interface for each service
39 | - :material-puzzle: **Different patterns** - Different ways to read, write, and list files
40 | - :material-code-braces: **Different imports** - Manage multiple dependencies
41 | - :material-hammer-wrench: **Different configurations** - Each with unique setup requirements
42 |
43 | !!! danger "The Maintenance Nightmare"
44 | Want to switch from S3 to GCS? Rewrite your code. Need to support multiple backends? Write wrapper functions. Testing? Mock each service differently. This doesn't scale!
45 |
46 | ---
47 |
48 | ## The Solution: One API to Rule Them All :crown:
49 |
50 | Universal Pathlib provides a single, unified interface that works everywhere:
51 |
52 | ```python
53 | from upath import UPath
54 |
55 | # Local files
56 | local_file = UPath("data/results.csv")
57 |
58 | # S3 files
59 | s3_file = UPath("s3://my-bucket/data/results.csv")
60 |
61 | # Azure Blob Storage
62 | azure_file = UPath("az://my-container/data/results.csv")
63 |
64 | # Same API everywhere! ✨
65 | for path in [local_file, s3_file, azure_file]:
66 |     with path.open('r') as f:
67 |         data = f.read()
68 | ```
69 |
70 | !!! success "One API, Infinite Possibilities"
71 | Write your code once, run it anywhere. Switch backends by changing a URL. Test locally, deploy to the cloud. It just works! :sparkles:
72 |
73 | ---
74 |
75 | ## Key Benefits :trophy:
76 |
77 | ### 1. Familiar and Pythonic :snake:
78 |
79 | If you know Python's `pathlib`, you already know Universal Pathlib!
80 |
81 | ```python
82 | from upath import UPath
83 |
84 | # All the familiar pathlib operations
85 | path = UPath("s3://bucket/data/file.txt")
86 |
87 | print(path.name) # "file.txt"
88 | print(path.stem) # "file"
89 | print(path.suffix) # ".txt"
90 | print(path.parent) # UPath("s3://bucket/data")
91 |
92 | # Path joining
93 | output = path.parent / "processed" / "output.csv"
94 |
95 | # File operations
96 | path.write_text("Hello!")
97 | content = path.read_text()
98 |
99 | # Directory operations
100 | for item in path.parent.iterdir():
101 |     print(item)
102 | ```
103 |
104 | !!! tip "Zero Learning Curve"
105 | Your existing pathlib knowledge transfers directly. No new concepts to learn, no cognitive overhead!
106 |
107 | ### 2. Write Once, Run Anywhere :earth_americas:
108 |
109 | Change storage backends without changing code:
110 |
111 | === "Development (Local)"
112 |
113 | ```python
114 | from upath import UPath
115 |
116 | def process_data(input_path: str, output_path: str):
117 |     data_file = UPath(input_path)
118 |     result_file = UPath(output_path)
119 |
120 |     # Read, process, write
121 |     data = data_file.read_text()
122 |     processed = data.upper()
123 |     result_file.write_text(processed)
124 |
125 | # Local development
126 | process_data("data/input.txt", "data/output.txt")
127 | ```
128 |
129 | === "Production (S3)"
130 |
131 | ```python
132 | from upath import UPath
133 |
134 | def process_data(input_path: str, output_path: str):
135 |     data_file = UPath(input_path)
136 |     result_file = UPath(output_path)
137 |
138 |     # Same code! Just different paths
139 |     data = data_file.read_text()
140 |     processed = data.upper()
141 |     result_file.write_text(processed)
142 |
143 | # Production on S3
144 | process_data(
145 |     "s3://my-bucket/data/input.txt",
146 |     "s3://my-bucket/data/output.txt"
147 | )
148 | ```
149 |
150 | === "Testing (Memory)"
151 |
152 | ```python
153 | from upath import UPath
154 |
155 | def process_data(input_path: str, output_path: str):
156 |     data_file = UPath(input_path)
157 |     result_file = UPath(output_path)
158 |
159 |     # Same code! No mocking needed
160 |     data = data_file.read_text()
161 |     processed = data.upper()
162 |     result_file.write_text(processed)
163 |
164 | # Fast tests with in-memory filesystem
165 | process_data(
166 |     "memory://input.txt",
167 |     "memory://output.txt"
168 | )
169 | ```
170 |
171 | !!! success "Truly Portable Code"
172 | Your business logic stays clean and your application does not have to
173 | care about where the files live anymore.
174 |
175 | ### 3. Type-Safe and IDE-Friendly :computer:
176 |
177 | Universal Pathlib includes type hints for excellent IDE support:
178 |
179 | ```python
180 | from upath import UPath
181 | from pathlib import Path
182 |
183 | def process_file(path: UPath | Path) -> str:
184 |     # Your IDE knows about all methods!
185 |     if path.exists():  # ✓ Autocomplete
186 |         content = path.read_text()  # ✓ Type checked
187 |         return content.upper()
188 |     return ""
189 |
190 | # Works with both!
191 | local_result = process_file(UPath("file.txt"))
192 | s3_result = process_file(UPath("s3://bucket/file.txt"))
193 | ```
194 |
195 | !!! info "Editor Support"
196 | Get autocomplete, type checking, and inline documentation in VS Code, PyCharm, and other modern Python IDEs.
197 |
198 | ### 4. Extensively Tested :test_tube:
199 |
200 | Universal Pathlib runs a large subset of CPython's pathlib test suite:
201 |
202 | - :white_check_mark: **Compatibility tested** against standard library pathlib
203 | - :white_check_mark: **Cross-version tested** on Python 3.9-3.14
204 | - :white_check_mark: **Integration tested** with real filesystems
205 | - :white_check_mark: **Regression tested** for each release
206 |
207 | !!! quote "Extensively Tested"
208 | When we say "pathlib-compatible," we mean it.
209 |
210 | ### 5. Extensible and Future-Proof :rocket:
211 |
212 | Built on `fsspec`, the standard for Python filesystem abstractions:
213 |
214 | ```python
215 | # Works with many fsspec filesystems!
216 | UPath("s3://...", anon=True)
217 | UPath("gs://...", token='anon')
218 | UPath("az://...")
219 | UPath("https://...")
220 | ```
221 |
222 | Need a custom filesystem? Implement it once with fsspec, and UPath works automatically!
223 |
224 | !!! tip "Ecosystem Benefits"
225 | Leverage the entire fsspec ecosystem: caching, compression, callback hooks, and more!
226 |
227 | ---
228 |
229 | ## Next Steps :footprints:
230 |
231 | Ready to give Universal Pathlib a try?
232 |
233 | 1. **[Install Universal Pathlib](install.md)** - Get set up in minutes
234 | 2. **[Understand the concepts](concepts/index.md)** - See how pathlib, fsspec, and UPath fit together
235 | 3. **[Read the API docs](api/index.md)** - Learn about all the features
236 |
237 |
238 |
239 | [Install Now →](install.md){ .md-button .md-button--primary }
240 |
241 |
242 |
--------------------------------------------------------------------------------
/docs/concepts/upath.md:
--------------------------------------------------------------------------------
1 |
6 |
7 | # Universal Pathlib {: #upath-logo }
8 |
9 | **universal-pathlib** (imported as `upath`) bridges Python's [pathlib](https://docs.python.org/3/library/pathlib.html) API with [fsspec](https://filesystem-spec.readthedocs.io/)'s filesystem implementations. It provides a familiar, pathlib-style interface for working with files across local storage, cloud services, and remote systems.
10 |
11 | ## The Best of Both Worlds
12 |
13 | universal-pathlib combines:
14 |
15 | - **fsspec's filesystem support**: Access to S3, GCS, Azure, HDFS, HTTP, SFTP, and dozens more backends
16 | - **pathlib's elegant API**: Object-oriented paths, `/` operator, `.exists()`, `.read_text()`, etc.
17 |
18 | This means you can write code using the pathlib syntax you already know, and it works seamlessly across any storage system that fsspec supports.
19 |
20 | ## How UPath and Path Relate via pathlib-abc
21 |
22 | `UPath` and `pathlib.Path` are related through the abstract base classes defined in [pathlib-abc](https://github.com/barneygale/pathlib-abc). While they share a common API design, they serve different purposes and have distinct inheritance hierarchies.
23 |
24 | ### The Class Hierarchy
25 |
26 | The following diagram shows how `UPath` implementations relate to `pathlib` classes through the `pathlib_abc` abstract base classes:
27 |
28 | ```mermaid
29 | flowchart TB
30 |
31 | subgraph p0[pathlib_abc]
32 | X ----> Y
33 | X ----> Z
34 | end
35 |
36 | subgraph s0[pathlib]
37 | X -.-> A
38 |
39 | A----> B
40 | A--> AP
41 | A--> AW
42 |
43 | Y -.-> B
44 | Z -.-> B
45 |
46 | B--> BP
47 | AP----> BP
48 | B--> BW
49 | AW----> BW
50 | end
51 | subgraph s1[upath]
52 | Y ---> U
53 | Z ---> U
54 |
55 | U --> UP
56 | U --> UW
57 | BP ---> UP
58 | BW ---> UW
59 | U --> UL
60 | U --> US3
61 | U --> UH
62 | U -.-> UO
63 | end
64 |
65 | X(JoinablePath)
66 | Y(WritablePath)
67 | Z(ReadablePath)
68 |
69 | A(PurePath)
70 | AP(PurePosixPath)
71 | AW(PureWindowsPath)
72 | B(Path)
73 | BP(PosixPath)
74 | BW(WindowsPath)
75 |
76 | U(UPath)
77 | UP(PosixUPath)
78 | UW(WindowsUPath)
79 | UL(FilePath)
80 | US3(S3Path)
81 | UH(HttpPath)
82 | UO(...Path)
83 |
84 | classDef na fill:#f7f7f7,stroke:#02a822,stroke-width:2px,color:#333
85 | classDef np fill:#f7f7f7,stroke:#2166ac,stroke-width:2px,color:#333
86 | classDef nu fill:#f7f7f7,stroke:#b2182b,stroke-width:2px,color:#333
87 |
88 | class X,Y,Z na
89 | class A,AP,AW,B,BP,BW,UP,UW np
90 | class U,UL,US3,UH,UO nu
91 |
92 | style UO stroke-dasharray: 3 3
93 |
94 | style p0 fill:none,stroke:#0a2,stroke-width:3px,stroke-dasharray:3,color:#0a2
95 | style s0 fill:none,stroke:#07b,stroke-width:3px,stroke-dasharray:3,color:#07b
96 | style s1 fill:none,stroke:#d02,stroke-width:3px,stroke-dasharray:3,color:#d02
97 | ```
98 |
99 | **Legend:**
100 |
101 | - **Green (pathlib_abc)**: Abstract base classes defining the path interface
102 | - **Blue (pathlib)**: Standard library path classes for local filesystems
103 | - **Red (upath)**: Universal pathlib classes for all filesystems
104 | - Solid lines: Direct inheritance
105 | - Dotted lines: Conceptual relationship (not actual inheritance yet)
106 |
107 | ### Understanding the Relationships
108 |
109 | **pathlib-abc Layer (Green):**
110 |
111 | - `JoinablePath` - Basic path manipulation without filesystem access
112 | - `ReadablePath` - Adds read-only filesystem operations
113 | - `WritablePath` - Adds write filesystem operations
114 |
115 | **pathlib Layer (Blue):**
116 |
117 | - `PurePath` - Pure path manipulation (similar to `JoinablePath` conceptually)
118 | - `Path` - Concrete local filesystem paths (conceptually similar to `ReadablePath` + `WritablePath`)
119 | - Platform-specific: `PosixPath`, `WindowsPath`, etc.
120 |
121 | **universal-pathlib Layer (Red):**
122 |
123 | - `UPath` - Universal path for any filesystem backend
124 | - Local implementations: `PosixUPath`, `WindowsUPath`, `FilePath`
125 | - Remote implementations: `S3Path`, `HttpPath`, and others
126 |
127 | ### Key Differences
128 |
129 | **Current State (Python 3.9-3.13):**
130 |
131 | ```python
132 | from pathlib import Path
133 | from upath import UPath
134 | from upath.types import JoinablePath, ReadablePath, WritablePath
135 |
136 | # UPath explicitly implements pathlib-abc
137 | path = UPath("s3://bucket/file.txt")
138 | assert isinstance(path, JoinablePath) # True
139 | assert isinstance(path, ReadablePath) # True
140 | assert isinstance(path, WritablePath) # True
141 |
142 | # pathlib.Path does NOT (yet) inherit from pathlib-abc
143 | local = Path("/home/user/file.txt")
144 | assert not isinstance(local, JoinablePath)  # isinstance(...) is False
145 | assert not isinstance(local, ReadablePath)  # isinstance(...) is False
146 | assert not isinstance(local, WritablePath)  # isinstance(...) is False
147 | ```
148 |
149 | **Important Note:** The dotted lines in the diagram represent a conceptual relationship. While `pathlib.Path` doesn't currently inherit from `pathlib_abc` classes, it implements a compatible API. Future Python versions may formalize this relationship.
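
To make "implements a compatible API" concrete, here is a minimal sketch using only the standard library. `SupportsReadText` is a hypothetical protocol defined for this illustration; it is not one of the `pathlib_abc` classes. It shows that `pathlib.Path` can satisfy an interface structurally without inheriting from it.

```python
import tempfile
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsReadText(Protocol):
    """Hypothetical protocol: any object that offers read_text()."""

    def read_text(self) -> str: ...


def shout(source: SupportsReadText) -> str:
    """Read the text behind `source` and return it upper-cased."""
    return source.read_text().upper()


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.txt"
    demo.write_text("hello")

    # Path does not inherit from SupportsReadText, yet it matches structurally.
    is_compatible = isinstance(demo, SupportsReadText)
    result = shout(demo)
```

The runtime `isinstance` check here only verifies that a `read_text` attribute exists, so it is a weaker guarantee than the explicit inheritance that `UPath` has from the pathlib-abc classes.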
150 |
151 | ### Local Path Compatibility
152 |
153 | For local filesystem paths, `UPath` provides implementations that are 100% compatible with stdlib `pathlib`:
154 |
155 | ```python
156 | from pathlib import Path, PosixPath, WindowsPath
157 | from upath import UPath
158 |
159 | # Without protocol -> returns platform-specific UPath
160 | local = UPath("/home/user/file.txt")
161 | assert isinstance(local, UPath) # True
162 | assert isinstance(local, PosixPath) # True (on Unix systems)
163 | assert isinstance(local, Path) # True
164 |
165 | # With file:// protocol -> returns FilePath (fsspec-based)
166 | file_path = UPath("file:///home/user/file.txt")
167 | assert isinstance(file_path, UPath) # True
168 | assert not isinstance(file_path, Path)  # Not a pathlib.Path (uses fsspec instead)
169 | ```
170 |
171 | **PosixUPath and WindowsUPath:**
172 | - Subclass both `UPath` and `pathlib.Path`
173 | - 100% compatible with stdlib pathlib for local paths
174 | - Tested against CPython's pathlib test suite
175 | - Implement `os.PathLike` protocol
176 |
177 | **FilePath:**
178 | - Subclass of `UPath` only
179 | - Uses fsspec's `LocalFileSystem` for file access
180 | - Useful for consistent fsspec-based access across all backends
181 | - Implements `os.PathLike` protocol
182 |
183 | ### Remote and Cloud Paths
184 |
185 | For remote filesystems, `UPath` implementations provide the pathlib API backed by fsspec:
186 |
187 | ```python
188 | from pathlib import Path
189 | from upath import UPath
190 | # S3Path
191 | s3 = UPath("s3://bucket/file.txt")
192 | assert isinstance(s3, UPath)
193 | assert not isinstance(s3, Path) # Not a local path
194 |
195 | # HttpPath
196 | http = UPath("https://example.com/data.json")
197 | assert isinstance(http, UPath)
198 | assert not isinstance(http, Path) # Not a local path
199 | ```
200 |
201 | ### Why This Design?
202 |
203 | This architecture provides several benefits:
204 |
205 | 1. **Unified API**: Same pathlib interface works across all backends
206 | 2. **Type Safety**: pathlib-abc provides formal type hints for path operations
207 | 3. **Local Compatibility**: `PosixUPath`/`WindowsUPath` maintain full stdlib compatibility
208 | 4. **Flexibility**: Easy to add new filesystem implementations
209 | 5. **Future-Proof**: Ready for potential stdlib integration of pathlib-abc
210 |
211 | ### Writing Filesystem-Agnostic Code
212 |
213 | Use pathlib-abc types to write code that works with both `Path` and `UPath`:
214 |
215 | ```python
216 | from upath.types import ReadablePath, WritablePath
217 |
218 | def process_file(input_path: ReadablePath, output_path: WritablePath) -> None:
219 |     """Works with Path, UPath, or any ReadablePath/WritablePath implementation."""
220 |     data = input_path.read_text()
221 |     processed = data.upper()
222 |     output_path.write_text(processed)
222 | output_path.write_text(processed)
223 |
224 | # Works with stdlib Path
225 | from pathlib import Path
226 | process_file(Path("input.txt"), Path("output.txt"))
227 |
228 | # Works with UPath for cloud storage
229 | from upath import UPath
230 | process_file(
231 |     UPath("s3://input-bucket/data.txt", anon=True),
232 |     UPath("s3://output-bucket/result.txt")
233 | )
234 |
235 | # Mix local and remote
236 | process_file(
237 |     UPath("https://example.com/data.txt"),
238 |     Path("/tmp/result.txt")
239 | )
240 | ```
241 |
242 | ## Learn More
243 |
244 | - **pathlib concepts**: See [pathlib.md](pathlib.md) for details on the pathlib API
245 | - **fsspec backends**: See [fsspec.md](fsspec.md) for information about available filesystems
246 | - **API reference**: Check the [API documentation](../api/index.md) for complete method details
247 | - **fsspec details**: Visit [fsspec documentation](https://filesystem-spec.readthedocs.io/) for filesystem-specific options
248 |
--------------------------------------------------------------------------------