├── debian ├── compat ├── source │ └── format ├── .gitignore ├── changelog ├── rules ├── control └── copyright ├── tests ├── steps │ ├── __init__.py │ ├── zookeeper.py │ ├── s3.py │ ├── failure_mockers.py │ └── chadmin.py ├── unit │ ├── __init__.py │ ├── common │ │ ├── __init__.py │ │ ├── query │ │ │ ├── __init__.py │ │ │ └── test_query.py │ │ ├── type │ │ │ ├── __init__.py │ │ │ └── test_typed_enum.py │ │ ├── clickhouse │ │ │ ├── __init__.py │ │ │ ├── metadata │ │ │ │ ├── table_summing_merge_tree.sql │ │ │ │ ├── table_replacing_merge_tree.sql │ │ │ │ ├── table_collapsing_merge_tree.sql │ │ │ │ ├── table_versioned_collapsing_merge_tree.sql │ │ │ │ ├── table_replicated_aggregating_merge_tree.sql │ │ │ │ ├── table_replicated_collapsing_merge_tree.sql │ │ │ │ ├── table_replicated_replacing_merge_tree.sql │ │ │ │ ├── table_replicated_summing_merge_tree.sql │ │ │ │ ├── table_aggregating_merge_tree.sql │ │ │ │ ├── broken_no_uuid.sql │ │ │ │ ├── broken_no_uuid_full.sql │ │ │ │ ├── broken_no_engine_full.sql │ │ │ │ ├── broken_uuid.sql │ │ │ │ ├── broken_no_engine.sql │ │ │ │ ├── table_replicated_versioned_collapsing_merge_tree.sql │ │ │ │ ├── table_merge_tree.sql │ │ │ │ ├── table_merge_tree_field_uuid.sql │ │ │ │ ├── table_merge_tree_field_engine.sql │ │ │ │ ├── table_replicated_merge_tree.sql │ │ │ │ └── table_replicated_merge_tree_ver.sql │ │ │ └── test_zk_path_escape.py │ │ └── test_utils.py │ ├── monrun │ │ └── __init__.py │ └── chadmin │ │ └── test_stat_dict.py ├── modules │ ├── __init__.py │ ├── typing.py │ ├── steps.py │ ├── logs.py │ ├── chadmin.py │ ├── minio.py │ ├── utils.py │ └── s3.py ├── images │ ├── clickhouse │ │ └── config │ │ │ ├── regions_hierarchy.txt │ │ │ ├── dbaas.conf │ │ │ ├── regions_names_ru.txt │ │ │ ├── monitor-ch-backup │ │ │ ├── clickhouse-keyring.gpg │ │ │ ├── supervisor │ │ │ ├── conf.d │ │ │ │ ├── sshd.conf │ │ │ │ └── clickhouse-server.conf │ │ │ └── supervisord.conf │ │ │ ├── ch-backup.conf │ │ │ └── users.xml │ ├── http_mock │ │ ├── 
Dockerfile │ │ └── service.py │ ├── minio │ │ ├── config │ │ │ └── mc.json │ │ └── Dockerfile │ └── zookeeper │ │ ├── config │ │ ├── zoo.cfg │ │ ├── zookeeper.conf │ │ ├── log4j.properties │ │ └── start_zk.sh │ │ └── Dockerfile ├── features │ ├── chadmin_perf_diag.feature │ ├── chs3_backup_cleanup.feature │ ├── monrun_keeper.feature │ └── s3_credentials.feature ├── configuration.py └── environment.py ├── .python-version ├── ch_tools ├── chadmin │ ├── __init__.py │ ├── cli │ │ ├── __init__.py │ │ ├── restore_replica_command.py │ │ ├── list_macros_command.py │ │ ├── list_events_command.py │ │ ├── list_metrics_command.py │ │ ├── config_command.py │ │ ├── list_async_metrics_command.py │ │ ├── list_functions_command.py │ │ ├── metadata.py │ │ ├── list_settings_command.py │ │ ├── stack_trace_command.py │ │ ├── diagnostics_command.py │ │ ├── crash_log_group.py │ │ ├── dictionary_group.py │ │ ├── disk_group.py │ │ ├── move_group.py │ │ ├── replicated_fetch_group.py │ │ ├── chadmin_group.py │ │ ├── merge_group.py │ │ ├── s3_credentials_config_group.py │ │ ├── process_group.py │ │ └── chs3_backup_group.py │ ├── internal │ │ ├── __init__.py │ │ ├── diagnostics │ │ │ ├── __init__.py │ │ │ └── utils.py │ │ ├── object_storage │ │ │ ├── __init__.py │ │ │ ├── orphaned_objects_state.py │ │ │ ├── obj_list_item.py │ │ │ ├── s3_iterator.py │ │ │ ├── s3_cleanup.py │ │ │ └── s3_cleanup_stats.py │ │ ├── table_info.py │ │ ├── database_replica.py │ │ ├── backup.py │ │ ├── dictionary.py │ │ ├── system.py │ │ ├── partition.py │ │ ├── clickhouse_disks.py │ │ └── database.py │ └── README.md ├── common │ ├── __init__.py │ ├── cli │ │ ├── __init__.py │ │ ├── progress_bar.py │ │ ├── context_settings.py │ │ ├── utils.py │ │ └── locale_resolver.py │ ├── clickhouse │ │ ├── __init__.py │ │ ├── client │ │ │ ├── __init__.py │ │ │ ├── error.py │ │ │ ├── utils.py │ │ │ ├── retry.py │ │ │ ├── query.py │ │ │ └── query_output_format.py │ │ └── config │ │ │ ├── path.py │ │ │ ├── zookeeper.py │ │ │ ├── 
__init__.py │ │ │ ├── users.py │ │ │ ├── clickhouse_keeper.py │ │ │ ├── utils.py │ │ │ └── clickhouse.py │ ├── type │ │ ├── __init__.py │ │ └── typed_enum.py │ ├── yaml.py │ ├── process_pool.py │ ├── result.py │ └── dbaas.py ├── monrun_checks │ ├── __init__.py │ ├── utils.py │ ├── ch_s3_backup_orphaned.py │ ├── README.md │ ├── ch_geobase.py │ ├── status.py │ ├── exceptions.py │ ├── clickhouse_info.py │ ├── ch_replication_lag.py │ ├── ch_keeper.py │ ├── ch_system_metrics.py │ ├── ch_tls.py │ ├── ch_core_dumps.py │ ├── ch_ro_replica.py │ ├── ch_resetup_state.py │ ├── ch_log_errors.py │ ├── ch_dist_tables.py │ ├── ch_system_queues.py │ └── ch_s3_credentials_config.py ├── monrun_checks_keeper │ ├── __init__.py │ ├── README.md │ └── status.py └── __init__.py ├── .sourcery.yaml ├── resources ├── logrotate │ ├── keeper-monitoring.logrotate │ ├── clickhouse-monitoring.logrotate │ └── chadmin.logrotate └── completion │ ├── chadmin-completion.bash │ ├── ch-monitoring-completion.bash │ └── keeper-monitoring-completion.bash ├── .github ├── dependabot.yml └── workflows │ ├── test_clickhouse_version.yml │ └── link_startrek.yml ├── .gitignore ├── AUTHORS ├── LICENSE ├── Dockerfile-deb-build ├── README.md └── CONTRIBUTING.md /debian/compat: -------------------------------------------------------------------------------- 1 | 10 2 | -------------------------------------------------------------------------------- /tests/steps/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/unit/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- 1 | 3.10 2 | -------------------------------------------------------------------------------- 
/ch_tools/chadmin/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/common/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/modules/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/common/cli/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/unit/common/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/unit/monrun/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- 
/debian/source/format: -------------------------------------------------------------------------------- 1 | 3.0 (native) 2 | -------------------------------------------------------------------------------- /tests/unit/common/query/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/unit/common/type/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks_keeper/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/diagnostics/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/regions_hierarchy.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /debian/.gitignore: -------------------------------------------------------------------------------- 1 | *.log 2 | .debhelper/ 3 | clickhouse-tools 4 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/dbaas.conf: -------------------------------------------------------------------------------- 1 | {{ conf.dbaas_conf | json }} 2 | -------------------------------------------------------------------------------- /.sourcery.yaml: 
-------------------------------------------------------------------------------- 1 | rule_settings: 2 | disable: 3 | - use-named-expression 4 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/regions_names_ru.txt: -------------------------------------------------------------------------------- 1 | 1 Москва и Московская область 2 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/monitor-ch-backup: -------------------------------------------------------------------------------- 1 | monitor ALL=NOPASSWD: /usr/bin/ch-backup list * 2 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/clickhouse-keyring.gpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yandex/ch-tools/main/tests/images/clickhouse/config/clickhouse-keyring.gpg -------------------------------------------------------------------------------- /ch_tools/chadmin/README.md: -------------------------------------------------------------------------------- 1 | # chadmin 2 | 3 | ClickHouse administration tool. 4 | 5 | To get the list of available commands, run 6 | ```shell 7 | $ chadmin -h 8 | ``` 9 | -------------------------------------------------------------------------------- /debian/changelog: -------------------------------------------------------------------------------- 1 | clickhouse-tools (1.0.0) UNRELEASED; urgency=low 2 | 3 | * Initial Release. 4 | 5 | -- Dmitry Starov Thu, 01 Jun 2023 16:00:00 +0300 6 | -------------------------------------------------------------------------------- /ch_tools/common/type/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | Useful types. 
3 | """ 4 | 5 | from .typed_enum import IntEnum, StrEnum 6 | 7 | __all__ = [ 8 | "IntEnum", 9 | "StrEnum", 10 | ] 11 | -------------------------------------------------------------------------------- /resources/logrotate/keeper-monitoring.logrotate: -------------------------------------------------------------------------------- 1 | /var/log/keeper-monitoring/keeper-monitoring.log { 2 | rotate 7 3 | daily 4 | compress 5 | missingok 6 | nodateext 7 | copytruncate 8 | } 9 | -------------------------------------------------------------------------------- /resources/logrotate/clickhouse-monitoring.logrotate: -------------------------------------------------------------------------------- 1 | /var/log/clickhouse-monitoring/clickhouse-monitoring.log { 2 | rotate 7 3 | daily 4 | compress 5 | missingok 6 | nodateext 7 | copytruncate 8 | } 9 | -------------------------------------------------------------------------------- /ch_tools/__init__.py: -------------------------------------------------------------------------------- 1 | """A set of tools for administration and diagnostics of ClickHouse DBMS.""" 2 | 3 | from importlib.resources import files 4 | 5 | __version__ = files(__name__).joinpath("version.txt").read_text().strip() 6 | -------------------------------------------------------------------------------- /resources/logrotate/chadmin.logrotate: -------------------------------------------------------------------------------- 1 | /var/log/chadmin/chadmin.log { 2 | rotate 5 3 | monthly 4 | compress 5 | missingok 6 | nodateext 7 | notifempty 8 | copytruncate 9 | size 1M 10 | } 11 | -------------------------------------------------------------------------------- /tests/images/http_mock/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM python:3-alpine 2 | RUN pip3 install flask --no-cache 3 | ENV FLASK_APP=/service.py 4 | COPY tests/images/http_mock/service.py / 5 | CMD ["python3", "-m", "flask", "run", "--host=0.0.0.0", 
"--port=8080"] 6 | -------------------------------------------------------------------------------- /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | version: 2 2 | updates: 3 | - package-ecosystem: pip 4 | directory: / 5 | schedule: 6 | interval: daily 7 | - package-ecosystem: github-actions 8 | directory: / 9 | schedule: 10 | interval: daily 11 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_summing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '5f55555d-c9f7-47ac-8d87-b3ac8a889161' 2 | ( 3 | `key` UInt32, 4 | `value` UInt32 5 | ) 6 | ENGINE = SummingMergeTree 7 | ORDER BY key 8 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/modules/typing.py: -------------------------------------------------------------------------------- 1 | """ 2 | Type definitions. 
3 | """ 4 | 5 | from types import SimpleNamespace 6 | from typing import Union 7 | 8 | from behave.runner import Context 9 | 10 | ContextT = Union[Context, SimpleNamespace] # pylint: disable=invalid-name 11 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | .idea 3 | .install-deps 4 | .mypy_cache/ 5 | .pytype/ 6 | .ruff_cache/ 7 | .session_conf.sav 8 | __pycache__ 9 | build/ 10 | cython_debug/ 11 | dist/ 12 | out/ 13 | tests/reports/ 14 | tests/staging/ 15 | .venv/ 16 | ch_tools/version.txt 17 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replacing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID 'c322e832-2628-45f9-b2f5-fd659078c5c2' 2 | ( 3 | `key` Int64, 4 | `someCol` String, 5 | `eventTime` DateTime 6 | ) 7 | ENGINE = ReplacingMergeTree 8 | ORDER BY key 9 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/restore_replica_command.py: -------------------------------------------------------------------------------- 1 | from click import ClickException 2 | from cloup import command 3 | 4 | 5 | @command("restore-replica") 6 | def restore_replica_command() -> None: 7 | raise ClickException( 8 | 'The command has been superseded by "replica restore" command.' 
9 | ) 10 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/diagnostics/utils.py: -------------------------------------------------------------------------------- 1 | from functools import partial, wraps 2 | from typing import Any, Callable 3 | 4 | 5 | def delayed(f: Any) -> Callable: 6 | @wraps(f) 7 | def wrapper(*args: Any, **kwargs: Any) -> Any: 8 | return partial(f, *args, **kwargs) 9 | 10 | return wrapper 11 | -------------------------------------------------------------------------------- /tests/images/minio/config/mc.json: -------------------------------------------------------------------------------- 1 | { 2 | "version": "10", 3 | "aliases": { 4 | "local": { 5 | "url": "http://localhost:9000", 6 | "accessKey": "{{conf.s3.access_key_id}}", 7 | "secretKey": "{{conf.s3.access_secret_key}}", 8 | "api": "S3v4", 9 | "path": "auto" 10 | } 11 | } 12 | } 13 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/supervisor/conf.d/sshd.conf: -------------------------------------------------------------------------------- 1 | [program:sshd] 2 | command=/usr/sbin/sshd -D 3 | process_name=%(program_name)s 4 | autostart=true 5 | autorestart=true 6 | stopsignal=TERM 7 | user=root 8 | stdout_logfile=/dev/stderr 9 | stdout_logfile_maxbytes=0 10 | stderr_logfile=/dev/stderr 11 | stderr_logfile_maxbytes=0 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_collapsing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '122b369e-3866-4c2b-8ca1-9e07c75ecee0' 2 | ( 3 | `UserID` UInt64, 4 | `PageViews` UInt8, 5 | `Duration` UInt8, 6 | `Sign` Int8 7 | ) 8 | ENGINE = CollapsingMergeTree(Sign) 9 | ORDER BY UserID 10 | SETTINGS index_granularity = 8192 
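The `table_collapsing_merge_tree.sql` fixture above declares its engine as `ENGINE = CollapsingMergeTree(Sign)`. As a minimal sketch of how the engine name could be read from such a statement, the hypothetical helper `extract_engine` below looks for an `ENGINE = <name>` clause at the start of a line. The function name and the regular expression are assumptions made for illustration; they are not taken from the ch-tools sources.

```python
import re
from typing import Optional

# Assumption: the engine clause starts its own line, as in the fixtures.
# Anchoring at line start keeps a column named `ENGINE` from matching.
_ENGINE_RE = re.compile(r"^ENGINE\s*=\s*(\w+)", re.MULTILINE)


def extract_engine(create_statement: str) -> Optional[str]:
    """Return the engine name without its arguments, or None if absent."""
    match = _ENGINE_RE.search(create_statement)
    return match.group(1) if match else None
```

Under these assumptions, a statement with a bare `ENGINE` keyword and no `=` (as in the `broken_no_engine.sql` fixture) yields `None`, as does one with no engine clause at all.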
-------------------------------------------------------------------------------- /ch_tools/common/clickhouse/client/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | ClickHouse client. 3 | """ 4 | 5 | from .clickhouse_client import ClickhouseClient 6 | from .error import ClickhouseError 7 | from .query_output_format import OutputFormat 8 | 9 | __all__ = [ 10 | "ClickhouseClient", 11 | "ClickhouseError", 12 | "OutputFormat", 13 | ] 14 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/object_storage/__init__.py: -------------------------------------------------------------------------------- 1 | from ch_tools.chadmin.internal.object_storage.obj_list_item import ObjListItem 2 | from ch_tools.chadmin.internal.object_storage.s3_cleanup import ( 3 | cleanup_s3_object_storage, 4 | ) 5 | from ch_tools.chadmin.internal.object_storage.s3_iterator import ( 6 | ObjectSummary, 7 | s3_object_storage_iterator, 8 | ) 9 | -------------------------------------------------------------------------------- /ch_tools/common/cli/progress_bar.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Generator, Sequence, TypeVar, Union 2 | 3 | from tqdm import tqdm 4 | 5 | __all__ = ["progress"] 6 | 7 | T = TypeVar("T") 8 | 9 | 10 | def progress( 11 | i: Sequence[Union[T, Any]], description: str 12 | ) -> Generator[Union[T, Any], None, None]: 13 | yield from tqdm(i, desc=description, colour="green") 14 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_versioned_collapsing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '42089e02-c13c-4b52-a1bf-4f9aa3e84e56' 2 | ( 3 | `UserID` UInt64, 4 | `PageViews` UInt8, 5 | `Duration` UInt8, 6 | `Sign` Int8, 7 | `Version` UInt8 8 | ) 9 | 
ENGINE = VersionedCollapsingMergeTree(Sign, Version) 10 | ORDER BY UserID 11 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_aggregating_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '8ac44a5e-091e-4dc4-9eb0-0ba577b3afd7' 2 | ( 3 | `id` UInt32, 4 | `value` AggregateFunction(sum, UInt32) 5 | ) 6 | ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/example_replicated_aggregating_mergetree', '{replica}') 7 | ORDER BY id 8 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_collapsing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '9317cb30-1efd-44bd-ab88-d0e3a025965a' 2 | ( 3 | `id` UInt32, 4 | `value` Int32, 5 | `sign` Int8 6 | ) 7 | ENGINE = ReplicatedCollapsingMergeTree('/clickhouse/tables/{shard}/example_replicated_collapsing_mergetree', '{replica}', sign) 8 | ORDER BY id 9 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_replacing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '4ce817e2-8043-4655-869e-eeab3edeae6a' 2 | ( 3 | `D` Date, 4 | `ID` Int64, 5 | `Ver` UInt64 6 | ) 7 | ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/tableName/{shard}/', '{replica}', Ver) 8 | PARTITION BY toYYYYMM(D) 9 | ORDER BY ID 10 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_summing_merge_tree.sql: 
-------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '72b4c520-9cc2-4549-ba6c-bd952bb049d8' 2 | ( 3 | `D` Date, 4 | `ID` Int64, 5 | `Ver` UInt64 6 | ) 7 | ENGINE = ReplicatedSummingMergeTree('/clickhouse/tables/tableName/{shard}/1', '{replica}', Ver) 8 | PARTITION BY toYYYYMM(D) 9 | ORDER BY ID 10 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/images/clickhouse/config/supervisor/conf.d/clickhouse-server.conf: -------------------------------------------------------------------------------- 1 | [program:clickhouse-server] 2 | command=/usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml 3 | process_name=%(program_name)s 4 | autostart=true 5 | autorestart=true 6 | stopsignal=TERM 7 | user=clickhouse 8 | stdout_logfile=/dev/stderr 9 | stdout_logfile_maxbytes=0 10 | stderr_logfile=/dev/stderr 11 | stderr_logfile_maxbytes=0 -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/list_macros_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | @command("macros") 8 | @pass_context 9 | def list_macros_command(ctx: Context) -> None: 10 | """ 11 | Show macros. 
12 | """ 13 | logging.info(execute_query(ctx, "SELECT * FROM system.macros")) 14 | -------------------------------------------------------------------------------- /tests/images/http_mock/service.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | from flask import Flask 4 | 5 | app = Flask(__name__) 6 | 7 | 8 | @app.route("/computeMetadata/v1/instance/service-accounts/default/token") 9 | def token() -> str: 10 | return json.dumps( 11 | {"access_token": "IAM_TOKEN", "expires_in": 0, "token_type": "Bearer"} 12 | ) 13 | 14 | 15 | @app.route("/") 16 | def ping() -> str: 17 | return "OK" 18 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_aggregating_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '40c7c9a8-7451-4a10-8b43-443436f33413' 2 | ( 3 | `StartDate` DateTime64(3), 4 | `CounterID` UInt64, 5 | `Visits` AggregateFunction(sum, Nullable(Int32)), 6 | `Users` AggregateFunction(uniq, Nullable(Int32)) 7 | ) 8 | ENGINE = AggregatingMergeTree 9 | ORDER BY (StartDate, CounterID) 10 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/broken_no_uuid.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = MergeTree 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/broken_no_uuid_full.sql: 
-------------------------------------------------------------------------------- 1 | ATTACH TABLE _ 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = MergeTree 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/list_events_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | @command("events") 8 | @pass_context 9 | def list_events_command(ctx: Context) -> None: 10 | """ 11 | Show metrics from system.events. 12 | """ 13 | logging.info(execute_query(ctx, "SELECT * FROM system.events")) 14 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/list_metrics_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | @command("metrics") 8 | @pass_context 9 | def list_metrics_command(ctx: Context) -> None: 10 | """ 11 | Show metrics from system.metrics. 
12 | """ 13 | logging.info(execute_query(ctx, "SELECT * FROM system.metrics")) 14 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/broken_no_engine_full.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '9612256b-b461-4df5-8015-72f9727d1f95' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | PARTITION BY toMonth(date_key) 10 | ORDER BY (generation, date_key) 11 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 12 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/table_info.py: -------------------------------------------------------------------------------- 1 | from typing import List, TypedDict 2 | 3 | 4 | class TableInfo(TypedDict): 5 | """Table information.""" 6 | 7 | database: str 8 | name: str 9 | uuid: str 10 | engine: str 11 | create_table_query: str 12 | metadata_path: str 13 | metadata_modification_time: str 14 | data_paths: List[str] 15 | disk_size: int 16 | partitions: int 17 | parts: int 18 | rows: int 19 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/broken_uuid.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID 'b461-4df5-8015-72f9727d1f95' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = MergeTree 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- 
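The `broken_uuid.sql` fixture above carries a truncated UUID (`'b461-4df5-8015-72f9727d1f95'`). As a minimal sketch of how such metadata could be checked, the hypothetical helper `extract_table_uuid` below pulls the quoted value that follows `UUID` in an `ATTACH TABLE` statement and validates it with the standard `uuid` module. The function name and the regular expression are assumptions made for illustration, not the parser that ch-tools ships.

```python
import re
import uuid
from typing import Optional

# Hypothetical pattern: the quoted token that follows "ATTACH TABLE <name> UUID".
_UUID_RE = re.compile(r"^ATTACH TABLE\s+\S+\s+UUID\s+'([^']*)'", re.IGNORECASE)


def extract_table_uuid(create_statement: str) -> Optional[str]:
    """Return the table UUID, or None if it is missing or malformed."""
    match = _UUID_RE.match(create_statement.lstrip())
    if match is None:
        return None
    candidate = match.group(1)
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return None
    # uuid.UUID also accepts forms without hyphens; require the canonical form.
    return candidate if str(parsed) == candidate.lower() else None
```

Because the pattern is anchored at the start of the statement, a column that happens to be named `UUID` (as in `table_merge_tree_field_uuid.sql`) does not affect the result.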
/tests/unit/common/clickhouse/metadata/broken_no_engine.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '40c7c9a8-7451-4a10-8b43-443436f33413' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_versioned_collapsing_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '10ccbec1-6b78-48fe-a51a-fb7c9f7fbe4a' 2 | ( 3 | `id` UInt32, 4 | `value` Int32, 5 | `sign` Int8, 6 | `version` UInt32 7 | ) 8 | ENGINE = ReplicatedVersionedCollapsingMergeTree('/clickhouse/tables/{shard}/example_replicated_versioned_collapsing_mergetree', '{replica}', sign, version) 9 | ORDER BY id 10 | SETTINGS index_granularity = 8192 -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/config_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, pass_context 2 | 3 | from ch_tools.common.cli.formatting import print_response 4 | from ch_tools.common.clickhouse.config import ClickhouseConfig 5 | 6 | 7 | @command("config") 8 | @pass_context 9 | def config_command(ctx: Context) -> None: 10 | """ 11 | Output ClickHouse config. 
12 | """ 13 | config = ClickhouseConfig.load() 14 | print_response(ctx, config.dump()) 15 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/client/error.py: -------------------------------------------------------------------------------- 1 | import re 2 | 3 | from requests import Response 4 | 5 | 6 | class ClickhouseError(Exception): 7 | """ 8 | ClickHouse interaction error. 9 | """ 10 | 11 | def __init__(self, query: str, response: Response) -> None: 12 | self.query = re.sub(r"\s*\n\s*", " ", query.strip()) 13 | self.response = response 14 | super().__init__(f"{self.response.text.strip()}\n\nQuery: {self.query}") 15 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '9612256b-b461-4df5-8015-72f9727d1f95' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = MergeTree 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /tests/images/zookeeper/config/zoo.cfg: -------------------------------------------------------------------------------- 1 | autopurge.purgeInterval=1 2 | autopurge.snapRetainCount=2 3 | clientPort=2181 4 | dataDir=/var/lib/zookeeper 5 | forceSync=no 6 | fsync.warningthresholdms=500 7 | initLimit=7200 8 | jute.maxbuffer=16777216 9 | leaderServes=yes 10 | maxClientCnxns=2000 11 | cnxTimeout=3000 12 | maxSessionTimeout=60000 13 | quorumListenOnAllIPs=true 14 | skipACL=yes 15 | snapCount=10000 16 | syncLimit=20 17 | tickTime=2000 18 | 4lw.commands.whitelist=* 19 | 
-------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_merge_tree_field_uuid.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '9612256b-b461-4df5-8015-72f9727d1f95' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `UUID` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = MergeTree 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/list_async_metrics_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | @command("async-metrics") 8 | @pass_context 9 | def list_async_metrics_command(ctx: Context) -> None: 10 | """ 11 | Show metrics from system.asynchronous_metrics.
12 | """ 13 | logging.info(execute_query(ctx, "SELECT * FROM system.asynchronous_metrics")) 14 | -------------------------------------------------------------------------------- /tests/images/minio/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM minio/mc:RELEASE.2022-01-07T06-01-38Z 2 | 3 | FROM minio/minio:RELEASE.2022-01-08T03-11-54Z 4 | 5 | COPY --from=0 /usr/bin/mc /usr/bin/mc 6 | 7 | ENV MINIO_ACCESS_KEY {{ conf.s3.access_key_id }} 8 | ENV MINIO_SECRET_KEY {{ conf.s3.access_secret_key }} 9 | 10 | ENTRYPOINT ["/usr/bin/docker-entrypoint.sh"] 11 | 12 | HEALTHCHECK --interval=30s --timeout=5s CMD /usr/bin/healthcheck.sh 13 | 14 | EXPOSE 9000 15 | 16 | CMD ["server", "/export"] 17 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_merge_tree_field_engine.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID '9612256b-b461-4df5-8015-72f9727d1f95' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `ENGINE` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = MergeTree 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | TTL expired + toIntervalMinute(3) TO DISK 'object_storage' 13 | SETTINGS index_granularity = 819 -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_merge_tree.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID 'f438d816-605d-4fe0-a9cb-4edba3ce72dd' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test_table_repl', '{replica}') 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | SETTINGS 
index_granularity = 8192 -------------------------------------------------------------------------------- /ch_tools/monrun_checks/utils.py: -------------------------------------------------------------------------------- 1 | from datetime import timedelta 2 | 3 | from click import Context 4 | 5 | from ch_tools.common import logging 6 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 7 | 8 | 9 | def get_uptime(ctx: Context) -> timedelta: 10 | try: 11 | return clickhouse_client(ctx).get_uptime() 12 | except Exception: 13 | logging.warning("Failed to get ClickHouse uptime", exc_info=True) 14 | return timedelta() 15 | -------------------------------------------------------------------------------- /ch_tools/common/cli/context_settings.py: -------------------------------------------------------------------------------- 1 | from cloup import Context, HelpFormatter, HelpTheme 2 | 3 | __all__ = [ 4 | "CONTEXT_SETTINGS", 5 | ] 6 | 7 | CONTEXT_SETTINGS = Context.settings( 8 | help_option_names=["-h", "--help"], 9 | terminal_width=120, 10 | align_option_groups=False, 11 | align_sections=True, 12 | show_constraints=True, 13 | show_default=True, 14 | formatter_settings=HelpFormatter.settings( 15 | theme=HelpTheme.light(), 16 | ), 17 | ) 18 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/metadata/table_replicated_merge_tree_ver.sql: -------------------------------------------------------------------------------- 1 | ATTACH TABLE _ UUID 'f438d816-605d-4fe0-a9cb-4edba3ce72dd' 2 | ( 3 | `generation` UInt64, 4 | `date_key` DateTime, 5 | `number` UInt64, 6 | `text` String, 7 | `expired` DateTime DEFAULT now() 8 | ) 9 | ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test_table_repl1', '{replica}', ver) 10 | PARTITION BY toMonth(date_key) 11 | ORDER BY (generation, date_key) 12 | SETTINGS index_granularity = 8192 
-------------------------------------------------------------------------------- /tests/images/clickhouse/config/supervisor/supervisord.conf: -------------------------------------------------------------------------------- 1 | [unix_http_server] 2 | file=/var/run/supervisor.sock 3 | chmod=0700 4 | 5 | [supervisord] 6 | logfile=/dev/null 7 | logfile_maxbytes=0 8 | pidfile=/var/run/supervisord.pid 9 | minfds=1024 10 | nodaemon=true 11 | 12 | [rpcinterface:supervisor] 13 | supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface 14 | 15 | [supervisorctl] 16 | serverurl=unix:///var/run/supervisor.sock 17 | 18 | [include] 19 | files = /etc/supervisor/conf.d/*.conf 20 | -------------------------------------------------------------------------------- /tests/unit/common/clickhouse/test_zk_path_escape.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from ch_tools.chadmin.internal.zookeeper import escape_for_zookeeper 4 | 5 | # type: ignore 6 | 7 | 8 | @pytest.mark.parametrize( 9 | "hostname, result", 10 | [ 11 | pytest.param( 12 | "zone-hostname.database.urs.net", 13 | "zone%2Dhostname%2Edatabase%2Eurs%2Enet", 14 | ), 15 | ], 16 | ) 17 | def test_config(hostname: str, result: str) -> None: 18 | 19 | assert escape_for_zookeeper(hostname) == result 20 | -------------------------------------------------------------------------------- /tests/modules/steps.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utility functions to use in implementation of test steps. 3 | """ 4 | 5 | from typing import Any 6 | 7 | import yaml 8 | 9 | from .templates import render_template 10 | from .typing import ContextT 11 | 12 | 13 | def get_step_data(context: ContextT) -> Any: 14 | """ 15 | Return step data deserialized from YAML representation and processed by 16 | template engine. 
17 | """ 18 | if not context.text: 19 | return {} 20 | 21 | return yaml.load(render_template(context, context.text), yaml.SafeLoader) 22 | -------------------------------------------------------------------------------- /ch_tools/common/type/typed_enum.py: -------------------------------------------------------------------------------- 1 | """ 2 | Typed enumerations returning their values on `__str__`. 3 | """ 4 | 5 | from enum import Enum 6 | 7 | 8 | class TypedEnum(Enum): 9 | """ 10 | Base class for typed enumerations. 11 | """ 12 | 13 | def __str__(self) -> str: 14 | return str(self.value) 15 | 16 | 17 | class StrEnum(str, TypedEnum): 18 | """ 19 | String-value enumeration. 20 | """ 21 | 22 | pass 23 | 24 | 25 | class IntEnum(int, TypedEnum): 26 | """ 27 | Integer-value enumeration. 28 | """ 29 | 30 | pass 31 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/client/utils.py: -------------------------------------------------------------------------------- 1 | from typing import Optional 2 | 3 | 4 | def _format_str_match(value: Optional[str]) -> Optional[str]: 5 | # pylint: disable=consider-using-f-string 6 | 7 | if value is None: 8 | return None 9 | 10 | if value.find(",") < 0: 11 | return f"LIKE '{value}'" 12 | 13 | return "IN ({0})".format(",".join(f"'{item.strip()}'" for item in value.split(","))) 14 | 15 | 16 | def _format_str_imatch(value: Optional[str]) -> Optional[str]: 17 | if value is None: 18 | return None 19 | 20 | return _format_str_match(value.lower()) 21 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks_keeper/README.md: -------------------------------------------------------------------------------- 1 | # keeper-monitoring 2 | 3 | ClickHouse Keeper / ZooKeeper monitoring tool. 
4 | 5 | It provides monitoring for: 6 | - Aliveness of Keeper 7 | - Average latency 8 | - Minimum latency 9 | - Maximum latency 10 | - Request queue size 11 | - Open file descriptors 12 | - Version of Keeper 13 | - Presence of snapshots 14 | - Presence of `NullPointerException` in logs for 24 hours 15 | 16 | Each monitoring check outputs in the following format: 17 | ``` 18 | <status>;<message> 19 | ``` 20 | Where `<status>` is one of 21 | - `0` - OK 22 | - `1` - WARN 23 | - `2` - CRIT 24 | 25 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/database_replica.py: -------------------------------------------------------------------------------- 1 | from click import Context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | 5 | 6 | def system_database_drop_replica( 7 | ctx: Context, database_zk_path: str, replica: str, dry_run: bool = False 8 | ) -> None: 9 | """ 10 | Perform "SYSTEM DROP DATABASE REPLICA" query. 11 | """ 12 | timeout = ctx.obj["config"]["clickhouse"]["drop_replica_timeout"] 13 | query = f"SYSTEM DROP DATABASE REPLICA '{replica}' FROM ZKPATH '{database_zk_path}'" 14 | execute_query(ctx, query, timeout=timeout, echo=True, dry_run=dry_run, format_=None) 15 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/object_storage/orphaned_objects_state.py: -------------------------------------------------------------------------------- 1 | import json 2 | from dataclasses import asdict, dataclass 3 | 4 | 5 | @dataclass 6 | class OrphanedObjectsState: 7 | orphaned_objects_size: int 8 | error_msg: str 9 | 10 | @classmethod 11 | def from_json(cls, json_str: str) -> "OrphanedObjectsState": 12 | data = json.loads(json_str) 13 | return cls( 14 | orphaned_objects_size=data["orphaned_objects_size"], 15 | error_msg=data["error_msg"], 16 | ) 17 | 18 | def to_json(self) -> str: 19 | return json.dumps(asdict(self), indent=4) 20 | 
-------------------------------------------------------------------------------- /tests/images/zookeeper/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu/zookeeper 2 | 3 | RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y supervisor python3-pip && \ 4 | rm -rf /var/lib/apt/lists/* /var/cache/debconf && \ 5 | apt-get clean 6 | 7 | COPY tests/images/zookeeper/config/zookeeper.conf /etc/supervisor/supervisord.conf 8 | COPY tests/images/zookeeper/config/zoo.cfg /etc/zookeeper/conf/zoo.cfg 9 | COPY tests/images/zookeeper/config/log4j.properties /etc/zookeeper/conf/log4j.properties 10 | 11 | COPY dist/*.whl / 12 | RUN python3 -m pip install *.whl 13 | 14 | ENTRYPOINT ["supervisord", "-c", "/etc/supervisor/supervisord.conf"] 15 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/list_functions_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, option, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | @command("functions") 8 | @option("--name") 9 | @pass_context 10 | def list_functions_command(ctx: Context, name: str) -> None: 11 | """ 12 | Show available functions. 
13 | """ 14 | query = """ 15 | SELECT * 16 | FROM system.functions 17 | {% if name %} 18 | WHERE lower(name) {{ format_str_imatch(name) }} 19 | {% endif %} 20 | """ 21 | logging.info(execute_query(ctx, query, name=name)) 22 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/client/retry.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Tuple, Type, Union 2 | 3 | import tenacity 4 | 5 | 6 | def retry( 7 | exception_types: Union[Type[BaseException], Tuple[Type[BaseException], ...]], 8 | max_attempts: int = 5, 9 | max_interval: int = 5, 10 | ) -> Any: 11 | """ 12 | Function decorator that retries wrapped function on failures. 13 | """ 14 | return tenacity.retry( 15 | retry=tenacity.retry_if_exception_type(exception_types), 16 | wait=tenacity.wait_random_exponential(multiplier=0.5, max=max_interval), 17 | stop=tenacity.stop_after_attempt(max_attempts), 18 | reraise=True, 19 | ) 20 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/ch-backup.conf: -------------------------------------------------------------------------------- 1 | backup: 2 | path_root: ch_backup/ 3 | deduplicate_parts: True 4 | retain_time: 5 | days: 1 6 | retain_count: 1 7 | deduplication_age_limit: 8 | days: 1 9 | 10 | main: 11 | ca_bundle: [] 12 | 13 | encryption: 14 | type: nacl 15 | key: {{ conf.ch_backup.encrypt_key }} 16 | 17 | storage: 18 | type: s3 19 | credentials: 20 | endpoint_url: '{{ conf.s3.endpoint }}' 21 | access_key_id: {{ conf.s3.access_key_id }} 22 | secret_access_key: {{ conf.s3.access_secret_key }} 23 | bucket: {{ conf.s3.bucket }} 24 | 25 | zookeeper: 26 | hosts: 'zookeeper01:2181' 27 | root_path: '/' 28 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/path.py: 
-------------------------------------------------------------------------------- 1 | CLICKHOUSE_SERVER_PREPROCESSED_CONFIG_PATH = ( 2 | "/var/lib/clickhouse/preprocessed_configs/config.xml" 3 | ) 4 | CLICKHOUSE_SERVER_CONFIG_PATH = "/etc/clickhouse-server/config.xml" 5 | CLICKHOUSE_RESETUP_CONFIG_PATH = "/etc/clickhouse-server/config.d/resetup_config.xml" 6 | CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH = ( 7 | "/etc/clickhouse-server/config.d/s3_credentials.xml" 8 | ) 9 | CLICKHOUSE_KEEPER_CONFIG_PATH = "/etc/clickhouse-keeper/config.xml" 10 | CLICKHOUSE_USERS_XML_CONFIG_PATH = "/etc/clickhouse-server/users.xml" 11 | CLICKHOUSE_USERS_YAML_CONFIG_PATH = "/etc/clickhouse-server/users.yaml" 12 | CLICKHOUSE_CERT_PATH_DEFAULT = "/etc/clickhouse-server/ssl/allCAs.pem" 13 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/metadata.py: -------------------------------------------------------------------------------- 1 | import re 2 | import uuid 3 | 4 | UUID_TOKEN = "UUID" 5 | ENGINE_TOKEN = "ENGINE" 6 | UUID_PATTERN = re.compile(r"UUID\s+'([a-f0-9-]+)'", re.IGNORECASE) 7 | 8 | 9 | def _is_valid_uuid(uuid_str: str) -> bool: 10 | try: 11 | val = uuid.UUID(uuid_str) 12 | except ValueError: 13 | return False 14 | return str(val) == uuid_str 15 | 16 | 17 | def parse_uuid(line: str) -> str: 18 | match = UUID_PATTERN.search(line) 19 | 20 | if not match: 21 | raise RuntimeError("Failed parse UUID from metadata.") 22 | 23 | result = match.group(1) 24 | if not _is_valid_uuid(result): 25 | raise RuntimeError("Failed parse UUID from metadata.") 26 | return result 27 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_s3_backup_orphaned.py: -------------------------------------------------------------------------------- 1 | import click 2 | 3 | from ch_tools.common.backup import get_orphaned_chs3_backups 4 | from ch_tools.common.result import OK, WARNING, Result 5 | 6 | 7 | 
@click.command("orphaned-backups") 8 | def orphaned_backups_command() -> Result: 9 | """ 10 | Check for orphaned backups. 11 | """ 12 | orphaned_backups = get_orphaned_chs3_backups() 13 | if not orphaned_backups: 14 | return Result(OK) 15 | 16 | orphaned_backups_str = ", ".join(sorted(orphaned_backups)[:3]) 17 | if len(orphaned_backups) > 3: 18 | orphaned_backups_str += ", ..." 19 | 20 | return Result( 21 | WARNING, 22 | f"There are {len(orphaned_backups)} orphaned S3 backups: {orphaned_backups_str}", 23 | ) 24 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/zookeeper.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Optional 2 | 3 | 4 | class ClickhouseZookeeperConfig: 5 | """ 6 | ZooKeeper section of ClickHouse server config. 7 | """ 8 | 9 | def __init__(self, config: dict) -> None: 10 | self._config = config 11 | 12 | def is_empty(self) -> bool: 13 | return not bool(self._config) 14 | 15 | @property 16 | def nodes(self) -> list: 17 | value = self._config["node"] 18 | if isinstance(value, list): 19 | return value 20 | 21 | return [value] 22 | 23 | @property 24 | def root(self) -> Optional[Any]: 25 | return self._config.get("root") 26 | 27 | @property 28 | def identity(self) -> Optional[Any]: 29 | return self._config.get("identity") 30 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/README.md: -------------------------------------------------------------------------------- 1 | # ch-monitoring 2 | 3 | ClickHouse monitoring tool. 
4 | 5 | It provides monitoring for: 6 | - Backup: 7 | - Validity 8 | - Age 9 | - Count 10 | - Restoration failures 11 | - Presence of orphaned S3 backups 12 | - Core dumps 13 | - Old chunks on Distributed tables 14 | - Presence of geobase 15 | - Aliveness of ClickHouse Keeper 16 | - Count of errors in logs 17 | - Ping-ability of ClickHouse 18 | - Replication lag between replicas 19 | - Re-setup state 20 | - Read-only replicas 21 | - System queues 22 | - TLS certificate validity 23 | - Size of S3 orphaned objects 24 | - System metrics 25 | 26 | Each monitoring check outputs in the following format: 27 | ``` 28 | <status>;<message> 29 | ``` 30 | Where `<status>` is one of 31 | - `0` - OK 32 | - `1` - WARN 33 | - `2` - CRIT 34 | -------------------------------------------------------------------------------- /.github/workflows/test_clickhouse_version.yml: -------------------------------------------------------------------------------- 1 | name: test_clickhouse_version 2 | 3 | run-name: ${{ github.workflow }}_${{ inputs.clickhouse_version }}_${{ inputs.id || github.run_number }} 4 | 5 | on: 6 | workflow_dispatch: 7 | inputs: 8 | clickhouse_version: 9 | description: 'ClickHouse version' 10 | required: true 11 | type: string 12 | id: 13 | description: 'Run identifier' 14 | required: false 15 | type: string 16 | default: "" 17 | 18 | jobs: 19 | test_integration: 20 | runs-on: ubuntu-latest 21 | env: 22 | CLICKHOUSE_VERSION: ${{ inputs.clickhouse_version }} 23 | steps: 24 | - uses: actions/checkout@v6 25 | - uses: astral-sh/setup-uv@v7 26 | - name: run integration tests 27 | run: make test-integration 28 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/list_settings_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, option, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | 
@command("settings") 8 | @option("--name") 9 | @option("--changed", is_flag=True) 10 | @pass_context 11 | def list_settings_command(ctx: Context, name: str, changed: bool) -> None: 12 | """ 13 | Show settings. 14 | """ 15 | query = """ 16 | SELECT * 17 | FROM system.settings 18 | WHERE 1 19 | {% if name %} 20 | AND lower(name) {{ format_str_imatch(name) }} 21 | {% endif %} 22 | {% if changed %} 23 | AND changed 24 | {% endif %} 25 | """ 26 | logging.info(execute_query(ctx, query, name=name, changed=changed)) 27 | -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | The following authors have created the source code of "clickhouse-tools" 2 | published and distributed by YANDEX LLC as the owner: 3 | 4 | Alexander Burmak 5 | Dmitry Starov 6 | Anton Ivashkin 7 | Grigorii Pervakov 8 | Petr Nuzhnov 9 | Egor Medvedev 10 | Aleksei Filatov 11 | Evgeny Dyukov 12 | Evgeny Strizhnev 13 | Vadim Volodin 14 | Anton Chaporgin 15 | Evgenii Kopanev 16 | Mikhail Kot 17 | Mikhail Burdukov 18 | Kirill Garbar 19 | Konstantin Morozov 20 | -------------------------------------------------------------------------------- /resources/completion/chadmin-completion.bash: -------------------------------------------------------------------------------- 1 | # Generated by "_CHADMIN_COMPLETE=bash_source chadmin" 2 | _chadmin_completion() { 3 | local IFS=$'\n' 4 | local response 5 | 6 | response=$(env COMP_WORDS="${COMP_WORDS[*]}" COMP_CWORD=$COMP_CWORD _CHADMIN_COMPLETE=bash_complete $1) 7 | 8 | for completion in $response; do 9 | IFS=',' read type value <<< "$completion" 10 | 11 | if [[ $type == 'dir' ]]; then 12 | COMPREPLY=() 13 | compopt -o dirnames 14 | elif [[ $type == 'file' ]]; then 15 | COMPREPLY=() 16 | compopt -o default 17 | elif [[ $type == 'plain' ]]; then 18 | COMPREPLY+=($value) 19 | fi 20 | done 21 | 22 | return 0 23 | } 24 | 25 | _chadmin_completion_setup() { 26
| complete -o nosort -F _chadmin_completion chadmin 27 | } 28 | 29 | _chadmin_completion_setup; 30 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/__init__.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | from click import Context 4 | 5 | from .clickhouse import ClickhouseConfig 6 | from .clickhouse_keeper import ClickhouseKeeperConfig 7 | from .users import ClickhouseUsersConfig 8 | from .zookeeper import ClickhouseZookeeperConfig 9 | 10 | __all__ = [ 11 | "ClickhouseConfig", 12 | "ClickhouseKeeperConfig", 13 | "ClickhouseUsersConfig", 14 | "ClickhouseZookeeperConfig", 15 | ] 16 | 17 | 18 | def get_clickhouse_config(ctx: Context) -> ClickhouseConfig: 19 | if "clickhouse_config" not in ctx.obj: 20 | ctx.obj["clickhouse_config"] = ClickhouseConfig.load() 21 | 22 | return ctx.obj["clickhouse_config"] 23 | 24 | 25 | def get_macros(ctx: Context) -> dict[str, Any]: 26 | return get_clickhouse_config(ctx).macros 27 | 28 | 29 | def get_cluster_name(ctx: Context) -> Any: 30 | return get_clickhouse_config(ctx).cluster_name 31 | -------------------------------------------------------------------------------- /tests/images/zookeeper/config/zookeeper.conf: -------------------------------------------------------------------------------- 1 | [supervisord] 2 | logfile=/dev/null 3 | logfile_maxbytes=0 4 | pidfile=/var/run/supervisord.pid 5 | minfds=1024 6 | nodaemon=true 7 | 8 | [unix_http_server] 9 | file=/var/run/supervisor.sock 10 | chmod=0700 11 | 12 | [supervisord] 13 | logfile=/dev/null 14 | logfile_maxbytes=0 15 | pidfile=/var/run/supervisord.pid 16 | minfds=1024 17 | nodaemon=true 18 | 19 | [rpcinterface:supervisor] 20 | supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface 21 | 22 | [supervisorctl] 23 | serverurl=unix:///var/run/supervisor.sock 24 | 25 | 26 | [program:zookeeper] 27 | command=bash 
/opt/kafka/bin/zookeeper-server-start.sh /etc/zookeeper/conf/zoo.cfg 28 | process_name=%(program_name)s 29 | autostart=true 30 | autorestart=true 31 | stopsignal=QUIT 32 | user=root 33 | stdout_logfile=/dev/stderr 34 | stdout_logfile_maxbytes=0 35 | stderr_logfile=/dev/stderr 36 | stderr_logfile_maxbytes=0 37 | -------------------------------------------------------------------------------- /resources/completion/ch-monitoring-completion.bash: -------------------------------------------------------------------------------- 1 | # Generated by "_CH_MONITORING_COMPLETE=bash_source ch-monitoring" 2 | _ch_monitoring_completion() { 3 | local IFS=$'\n' 4 | local response 5 | 6 | response=$(env COMP_WORDS="${COMP_WORDS[*]}" COMP_CWORD=$COMP_CWORD _CH_MONITORING_COMPLETE=bash_complete $1) 7 | 8 | for completion in $response; do 9 | IFS=',' read type value <<< "$completion" 10 | 11 | if [[ $type == 'dir' ]]; then 12 | COMPREPLY=() 13 | compopt -o dirnames 14 | elif [[ $type == 'file' ]]; then 15 | COMPREPLY=() 16 | compopt -o default 17 | elif [[ $type == 'plain' ]]; then 18 | COMPREPLY+=($value) 19 | fi 20 | done 21 | 22 | return 0 23 | } 24 | 25 | _ch_monitoring_completion_setup() { 26 | complete -o nosort -F _ch_monitoring_completion ch-monitoring 27 | } 28 | 29 | _ch_monitoring_completion_setup; 30 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/backup.py: -------------------------------------------------------------------------------- 1 | from click import Context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | 5 | 6 | def unfreeze_table( 7 | ctx: Context, database: str, table: str, backup_name: str, dry_run: bool = False 8 | ) -> None: 9 | """ 10 | Perform "ALTER TABLE UNFREEZE".
11 | """ 12 | timeout = ctx.obj["config"]["clickhouse"]["unfreeze_timeout"] 13 | query = f"ALTER TABLE `{database}`.`{table}` UNFREEZE WITH NAME '{backup_name}'" 14 | execute_query(ctx, query, timeout=timeout, echo=True, format_=None, dry_run=dry_run) 15 | 16 | 17 | def unfreeze_backup(ctx: Context, backup_name: str, dry_run: bool = False) -> None: 18 | """ 19 | Perform "SYSTEM UNFREEZE". 20 | """ 21 | timeout = ctx.obj["config"]["clickhouse"]["unfreeze_timeout"] 22 | query = f"SYSTEM UNFREEZE WITH NAME '{backup_name}'" 23 | execute_query(ctx, query, timeout=timeout, echo=True, format_=None, dry_run=dry_run) 24 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_geobase.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | import click 4 | import requests 5 | 6 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 7 | from ch_tools.common.result import CRIT, OK, Result 8 | 9 | 10 | @click.command("geobase") 11 | @click.pass_context 12 | def geobase_command(ctx: click.Context) -> Result: 13 | """ 14 | Check that embedded geobase is configured. 
15 | """ 16 | 17 | try: 18 | response = clickhouse_client(ctx).query_json_data( 19 | query="SELECT regionToName(CAST(1 AS UInt32))" 20 | )[0][0] 21 | expected = "Москва и Московская область" 22 | if response != expected: 23 | return Result( 24 | CRIT, f"Geobase error, expected ({expected}), but got ({response})" 25 | ) 26 | except requests.exceptions.HTTPError as exc: 27 | return Result(CRIT, repr(exc)) 28 | 29 | return Result(OK) 30 | -------------------------------------------------------------------------------- /resources/completion/keeper-monitoring-completion.bash: -------------------------------------------------------------------------------- 1 | # Generated by "_KEEPER_MONITORING_COMPLETE=bash_source keeper-monitoring" 2 | _keeper_monitoring_completion() { 3 | local IFS=$'\n' 4 | local response 5 | 6 | response=$(env COMP_WORDS="${COMP_WORDS[*]}" COMP_CWORD=$COMP_CWORD _KEEPER_MONITORING_COMPLETE=bash_complete $1) 7 | 8 | for completion in $response; do 9 | IFS=',' read type value <<< "$completion" 10 | 11 | if [[ $type == 'dir' ]]; then 12 | COMPREPLY=() 13 | compopt -o dirnames 14 | elif [[ $type == 'file' ]]; then 15 | COMPREPLY=() 16 | compopt -o default 17 | elif [[ $type == 'plain' ]]; then 18 | COMPREPLY+=($value) 19 | fi 20 | done 21 | 22 | return 0 23 | } 24 | 25 | _keeper_monitoring_completion_setup() { 26 | complete -o nosort -F _keeper_monitoring_completion keeper-monitoring 27 | } 28 | 29 | _keeper_monitoring_completion_setup; 30 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/object_storage/obj_list_item.py: -------------------------------------------------------------------------------- 1 | import json 2 | from dataclasses import dataclass 3 | from datetime import datetime 4 | 5 | from ch_tools.chadmin.internal.utils import DATETIME_FORMAT 6 | 7 | 8 | @dataclass 9 | class ObjListItem: 10 | """ 11 | Item of object storage listing.
12 | """ 13 | 14 | last_modified: datetime 15 | path: str 16 | size: int 17 | 18 | @classmethod 19 | def from_tab_separated(cls, value: str) -> "ObjListItem": 20 | time_str, path, size = value.split("\t") 21 | last_modified = datetime.strptime(time_str, DATETIME_FORMAT) 22 | return cls(last_modified, path, int(size)) 23 | 24 | @classmethod 25 | def from_json(cls, value: str) -> "ObjListItem": 26 | parsed_json = json.loads(value) 27 | last_modified = datetime.strptime(parsed_json["last_modified"], DATETIME_FORMAT) 28 | return cls(last_modified, parsed_json["obj_path"], int(parsed_json["obj_size"])) 29 | -------------------------------------------------------------------------------- /debian/rules: -------------------------------------------------------------------------------- 1 | #!/usr/bin/make -f 2 | 3 | PYTHON_MAJOR := $(shell python3 -c 'import sys; print(sys.version_info[0])') 4 | PYTHON_MINOR := $(shell python3 -c 'import sys; print(sys.version_info[1])') 5 | 6 | PYTHON_FROM := $(PYTHON_MAJOR).$(PYTHON_MINOR) 7 | PYTHON_TO := $(PYTHON_MAJOR).$(shell echo $$(( $(PYTHON_MINOR) + 1 ))) 8 | 9 | # Use conditional python3 dependency because package for Bionic requires python3.6, 10 | # but package for Jammy requires python3.10. 
11 | # 12 | # All this is due to the fact that we put the entire virtual environment in a deb package 13 | # and that venv links to the system python 14 | SUBSTVARS := -Vpython:Depends="python3 (>= $(PYTHON_FROM)), python3 (<< $(PYTHON_TO))" 15 | 16 | %: 17 | dh $@ 18 | 19 | override_dh_auto_build: 20 | dh_auto_build 21 | 22 | override_dh_gencontrol: 23 | dh_gencontrol -- $(SUBSTVARS) 24 | 25 | override_dh_auto_clean: ; 26 | 27 | override_dh_strip: ; 28 | 29 | override_dh_shlibdeps: ; 30 | 31 | override_dh_auto_test: ; 32 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/stack_trace_command.py: -------------------------------------------------------------------------------- 1 | from click import Context, command, pass_context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | from ch_tools.common import logging 5 | 6 | 7 | @command("stack-trace") 8 | @pass_context 9 | def stack_trace_command(ctx: Context) -> None: 10 | """ 11 | Collect stack traces. 
12 | """ 13 | query_str = r""" 14 | SELECT 15 | thread_name, 16 | min(thread_id) AS min_thread_id, 17 | count() AS threads, 18 | '\n' || arrayStringConcat( 19 | arrayMap( 20 | x, 21 | y -> concat(x, ': ', y), 22 | arrayMap(x -> addressToLine(x), trace), 23 | arrayMap(x -> demangle(addressToSymbol(x)), trace)), 24 | '\n') AS trace 25 | FROM system.stack_trace 26 | GROUP BY thread_name, trace 27 | ORDER BY min_thread_id 28 | """ 29 | logging.info(execute_query(ctx, query_str, format_="Vertical")) 30 | -------------------------------------------------------------------------------- /debian/control: -------------------------------------------------------------------------------- 1 | Source: clickhouse-tools 2 | Section: database 3 | Priority: optional 4 | Maintainer: Yandex LLC 5 | Uploaders: Alexander Burmak , 6 | Dmitry Starov , 7 | Anton Ivashkin , 8 | Grigory Pervakov , 9 | Petr Nuzhnov , 10 | Egor Medvedev , 11 | Aleksei Filatov , 12 | Evgenii Kopanev , 13 | Mikhail Kot 14 | Build-Depends: debhelper (>= 10~), python3, python3-venv, python3-pip, python3-setuptools 15 | Standards-Version: 4.1.4 16 | Homepage: https://github.com/yandex/ch-tools 17 | Vcs-Browser: https://github.com/yandex/ch-tools.git 18 | Vcs-Git: git://github.com:yandex/ch-tools.git 19 | X-Python3-Version: >= 3.9 20 | 21 | Package: clickhouse-tools 22 | Architecture: any 23 | Description: A set of tools for administration and diagnostics of ClickHouse DBMS. 
24 | Depends: ${python:Depends} 25 | Replaces: mdb-ch-tools, ch-tools 26 | Conflicts: mdb-ch-tools, ch-tools 27 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/status.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | import click 4 | import tabulate 5 | 6 | DEFAULT_COLOR = "\033[0m" 7 | 8 | COLOR_MAP = { 9 | 0: "\033[92m", 10 | 1: "\033[93m", 11 | 2: "\033[91m", 12 | } 13 | 14 | 15 | def status_command(commands: Any) -> Any: 16 | @click.command("status") 17 | @click.pass_context 18 | def status_impl(ctx: Any) -> None: 19 | """ 20 | Perform all checks. 21 | """ 22 | config = ctx.obj["config"]["ch-monitoring"] 23 | ctx.obj["status_mode"] = True 24 | ctx.default_map = config 25 | 26 | checks_status = [] 27 | for cmd in commands: 28 | if not config.get(cmd.name, {}).get("@disabled"): 29 | status = ctx.invoke(cmd) 30 | checks_status.append( 31 | ( 32 | cmd.name, 33 | f"{COLOR_MAP[status.code]}{status.message}{DEFAULT_COLOR}", 34 | ) 35 | ) 36 | 37 | print(tabulate.tabulate(checks_status)) 38 | 39 | return status_impl 40 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks_keeper/status.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | import click 4 | import tabulate 5 | 6 | DEFAULT_COLOR = "\033[0m" 7 | 8 | COLOR_MAP = { 9 | 0: "\033[92m", 10 | 1: "\033[93m", 11 | 2: "\033[91m", 12 | } 13 | 14 | 15 | def status_command(commands: Any) -> Any: 16 | @click.command("status") 17 | @click.pass_context 18 | def status_impl(ctx: Any) -> None: 19 | """ 20 | Perform all checks. 
21 | """ 22 | config = ctx.obj["config"]["keeper-monitoring"] 23 | ctx.obj.update({"status_mode": True}) 24 | ctx.default_map = config 25 | 26 | checks_status = [] 27 | for cmd in commands: 28 | if not config.get(cmd.name, {}).get("@disabled"): 29 | status = ctx.invoke(cmd) 30 | checks_status.append( 31 | ( 32 | cmd.name, 33 | f"{COLOR_MAP[status.code]}{status.message}{DEFAULT_COLOR}", 34 | ) 35 | ) 36 | 37 | print(tabulate.tabulate(checks_status)) 38 | 39 | return status_impl 40 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | Copyright (c) 2023 YANDEX LLC 3 | 4 | Permission is hereby granted, free of charge, to any person obtaining a copy 5 | of this software and associated documentation files (the "Software"), to deal 6 | in the Software without restriction, including without limitation the rights 7 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 8 | copies of the Software, and to permit persons to whom the Software is 9 | furnished to do so, subject to the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be included in 12 | all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 15 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 16 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 17 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 18 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 19 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 20 | THE SOFTWARE. 
21 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/object_storage/s3_iterator.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Iterator 2 | 3 | import boto3 # type: ignore[import] 4 | from botocore.client import Config 5 | 6 | from ch_tools.common.clickhouse.config.storage_configuration import S3DiskConfiguration 7 | 8 | ObjectSummary = Any 9 | IGNORED_OBJECT_NAME_PREFIXES = ["operations", ".SCHEMA_VERSION"] 10 | 11 | 12 | def s3_object_storage_iterator( 13 | disk: S3DiskConfiguration, 14 | *, 15 | object_name_prefix: str = "", 16 | skip_ignoring: bool = False, 17 | ) -> Iterator[ObjectSummary]: 18 | s3 = boto3.resource( 19 | "s3", 20 | endpoint_url=disk.endpoint_url, 21 | aws_access_key_id=disk.access_key_id, 22 | aws_secret_access_key=disk.secret_access_key, 23 | config=Config(s3={"addressing_style": "auto"}), 24 | ) 25 | bucket = s3.Bucket(disk.bucket_name) 26 | 27 | for obj in bucket.objects.filter(Prefix=object_name_prefix): 28 | if not skip_ignoring and _is_ignored(obj.key): 29 | continue 30 | yield obj 31 | 32 | 33 | def _is_ignored(name: str) -> bool: 34 | return any(p in name for p in IGNORED_OBJECT_NAME_PREFIXES) 35 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/exceptions.py: -------------------------------------------------------------------------------- 1 | from requests import RequestException 2 | 3 | from ch_tools.common.result import Status 4 | 5 | 6 | def user_warning(exc: UserWarning, status: Status) -> Status: 7 | code, message = exc.args 8 | status.append(message) 9 | status.set_code(code) 10 | return status 11 | 12 | 13 | def unknown_exception(exc: Exception, status: Status) -> Status: 14 | status.append(f"Unknown error: {exc}") 15 | status.set_code(1) 16 | return status 17 | 18 | 19 | def requests_error(exc: RequestException, status: Status) -> Status: 20 | 
status.append(f"ClickHouse connection error: {exc.__class__.__name__}") 21 | status.set_code(1) 22 | return status 23 | 24 | 25 | EXC_MAP = { 26 | UserWarning: user_warning, 27 | RequestException: requests_error, 28 | } 29 | 30 | 31 | def translate_to_status(exc: Exception, status: Status) -> Status: 32 | handler = unknown_exception 33 | if exc.__class__ in EXC_MAP: 34 | handler = EXC_MAP[exc.__class__] # type: ignore 35 | return handler(exc, status) 36 | 37 | 38 | def die(status_code: int, message: str) -> None: 39 | raise UserWarning(status_code, message) 40 | -------------------------------------------------------------------------------- /tests/images/clickhouse/config/users.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | ::/0 8 | 9 | default 10 | 1 11 | 12 | <_monitor> 13 | 14 | 15 | ::/0 16 | 17 | default 18 | 19 | <_admin> 20 | 21 | 22 | ::/0 23 | 24 | default 25 | 1 26 | 27 | 28 | 29 | 30 | 20 31 | 1 32 | 1 33 | 34 | 35 | 36 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/diagnostics_command.py: -------------------------------------------------------------------------------- 1 | import cloup 2 | from click import Context, pass_context 3 | 4 | from ch_tools.chadmin.internal.diagnostics.diagnose import diagnose 5 | from ch_tools.common.cli.parameters import env_var_help 6 | 7 | 8 | @cloup.command("diagnostics") 9 | @cloup.option( 10 | "-o", 11 | "--format", 12 | "output_format", 13 | type=cloup.Choice( 14 | choices=["json", "yaml", "json.gz", "yaml.gz", "wiki", "wiki.gz"], 15 | case_sensitive=False, 16 | ), 17 | default="wiki", 18 | envvar="CHADMIN_DIAGNOSTICS_FORMAT", 19 | help="Output format for gathered diagnostics data. 
" 20 | + env_var_help("CHADMIN_DIAGNOSTICS_FORMAT"), 21 | ) 22 | @cloup.option( 23 | "-n", 24 | "--normalize-queries", 25 | is_flag=True, 26 | envvar="CHADMIN_DIAGNOSTICS_NORMALIZE_QUERIES", 27 | help="Whether to normalize queries for ClickHouse client. " 28 | + env_var_help("CHADMIN_DIAGNOSTICS_NORMALIZE_QUERIES"), 29 | ) 30 | @pass_context 31 | def diagnostics_command( 32 | ctx: Context, output_format: str, normalize_queries: bool 33 | ) -> None: 34 | """ 35 | Collect diagnostics data. 36 | """ 37 | diagnose(ctx, output_format, normalize_queries) 38 | -------------------------------------------------------------------------------- /.github/workflows/link_startrek.yml: -------------------------------------------------------------------------------- 1 | name: Link with Startrek 2 | 3 | on: 4 | pull_request: { branches: [main] } 5 | 6 | env: 7 | ISSUE_PATTERN: '[A-Z]+-[0-9]+' 8 | 9 | jobs: 10 | link: 11 | runs-on: ubuntu-latest 12 | steps: 13 | - name: Parse issue ID 14 | run: | 15 | LAST_COMMIT_MESSAGE="$(curl -s -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" "${{ github.event.pull_request.commits_url }}" | jq -r .[-1].commit.message)" 16 | if [[ "$LAST_COMMIT_MESSAGE" =~ \[($ISSUE_PATTERN)\] || "${{ github.head_ref }}" =~ ^($ISSUE_PATTERN) ]]; then 17 | echo ISSUE_NUMBER="${BASH_REMATCH[1]}" >> "${GITHUB_ENV}" 18 | fi 19 | - name: Link issue 20 | if: env.ISSUE_NUMBER 21 | uses: fjogeleit/http-request-action@v1 22 | with: 23 | url: 'https://st-api.yandex-team.ru/v2/issues/${{ env.ISSUE_NUMBER }}' 24 | method: 'LINK' 25 | customHeaders: > 26 | { 27 | "Link": "<${{ github.server_url }}/${{ github.repository }}/pull/${{ github.event.number }}>; rel=\"relates\"", 28 | "Authorization": "OAuth ${{ secrets.OAUTH_STARTREK_TOKEN }}" 29 | } 30 | ignoreStatusCodes: 409 31 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/clickhouse_info.py: 
-------------------------------------------------------------------------------- 1 | import functools 2 | from typing import Any, List 3 | 4 | from click import Context 5 | 6 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 7 | 8 | 9 | class ClickhouseInfo: 10 | @classmethod 11 | @functools.lru_cache(maxsize=1) 12 | def get_replicas(cls: Any, ctx: Context) -> List[str]: 13 | """ 14 | Get hostnames of replicas. 15 | """ 16 | cluster = cls.get_cluster(ctx) 17 | query = f""" 18 | SELECT host_name 19 | FROM system.clusters 20 | WHERE cluster = '{cluster}' 21 | AND shard_num = (SELECT shard_num FROM system.clusters 22 | WHERE host_name = hostName() AND cluster = '{cluster}') 23 | """ 24 | return [row[0] for row in clickhouse_client(ctx).query_json_data(query=query)] 25 | 26 | @classmethod 27 | @functools.lru_cache(maxsize=1) 28 | def get_cluster(cls: Any, ctx: Context) -> Any: 29 | """ 30 | Get cluster identifier. 31 | """ 32 | query = "SELECT substitution FROM system.macros WHERE macro = 'cluster'" 33 | return clickhouse_client(ctx).query_json_data(query=query)[0][0] 34 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/users.py: -------------------------------------------------------------------------------- 1 | import os.path 2 | from typing import Any 3 | 4 | from .path import CLICKHOUSE_USERS_XML_CONFIG_PATH, CLICKHOUSE_USERS_YAML_CONFIG_PATH 5 | from .utils import dump_config, load_config 6 | 7 | 8 | class ClickhouseUsersConfig: 9 | """ 10 | ClickHouse users config (users.xml). 
11 | """ 12 | 13 | def __init__(self, config: Any) -> None: 14 | self._config = config 15 | 16 | def dump(self, mask_secrets: bool = True) -> Any: 17 | return dump_config(self._config, mask_secrets=mask_secrets) 18 | 19 | def dump_xml(self, mask_secrets: bool = True) -> Any: 20 | return dump_config(self._config, mask_secrets=mask_secrets, xml_format=True) 21 | 22 | @staticmethod 23 | def load() -> "ClickhouseUsersConfig": 24 | config_path = None 25 | for path in ( 26 | CLICKHOUSE_USERS_XML_CONFIG_PATH, 27 | CLICKHOUSE_USERS_YAML_CONFIG_PATH, 28 | ): 29 | if os.path.exists(path): 30 | config_path = path 31 | break 32 | 33 | if not config_path: 34 | raise RuntimeError("Users configuration file not found") 35 | 36 | return ClickhouseUsersConfig(load_config(config_path, "users.d")) 37 | -------------------------------------------------------------------------------- /Dockerfile-deb-build: -------------------------------------------------------------------------------- 1 | ARG BASE_IMAGE=ubuntu:22.04 2 | FROM --platform=$TARGETPLATFORM $BASE_IMAGE 3 | 4 | ARG DEBIAN_FRONTEND=noninteractive 5 | 6 | RUN set -ex \ 7 | && apt-get update \ 8 | && apt-get install -y --no-install-recommends \ 9 | # Debian packaging tools 10 | build-essential \ 11 | debhelper \ 12 | devscripts \ 13 | fakeroot \ 14 | # Managing keys for debian package signing 15 | gpg \ 16 | gpg-agent \ 17 | # Python packaging tools 18 | python3-dev \ 19 | python3-pip \ 20 | python3-setuptools \ 21 | python3-venv \ 22 | # Misc 23 | curl \ 24 | locales \ 25 | # Configure locales 26 | && locale-gen en_US.UTF-8 \ 27 | && update-locale LANG=en_US.UTF-8 \ 28 | # Ensure that `python` refers to `python3` so that poetry works. 
29 | # It makes sense for ubuntu:18.04 30 | && ln -s /usr/bin/python3 /usr/bin/python \ 31 | # Install `uv` 32 | && python3 -m pip install --upgrade pip \ 33 | && python3 -m pip install uv \ 34 | && ln -sf /usr/local/bin/uv /usr/bin/uv 35 | 36 | # Project directory must be mounted here 37 | VOLUME /src 38 | WORKDIR /src 39 | 40 | CMD ["make", "build-deb-package-local"] 41 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/dictionary.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Optional 2 | 3 | from click import Context 4 | 5 | from ch_tools.chadmin.internal.utils import execute_query 6 | 7 | 8 | def list_dictionaries( 9 | ctx: Context, *, name: Optional[str] = None, status: Optional[str] = None 10 | ) -> Any: 11 | """ 12 | List external dictionaries. 13 | """ 14 | query = """ 15 | SELECT 16 | database, 17 | name, 18 | status, 19 | type, 20 | source 21 | FROM system.dictionaries 22 | WHERE 1 23 | {% if name %} 24 | AND name = '{{ name }}' 25 | {% endif %} 26 | {% if status %} 27 | AND status = '{{ status }}' 28 | {% endif %} 29 | """ 30 | return execute_query(ctx, query, name=name, status=status, format_="JSON")["data"] 31 | 32 | 33 | def reload_dictionary( 34 | ctx: Context, *, name: str, database: Optional[str] = None 35 | ) -> None: 36 | """ 37 | Reload external dictionary. 
38 | """ 39 | if database: 40 | full_name = f"`{database}`.`{name}`" 41 | else: 42 | full_name = f"`{name}`" 43 | 44 | query = f"""SYSTEM RELOAD DICTIONARY {full_name}""" 45 | execute_query(ctx, query, format_=None) 46 | -------------------------------------------------------------------------------- /tests/unit/common/test_utils.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from ch_tools.common.utils import version_ge, version_lt 4 | 5 | 6 | @pytest.mark.parametrize( 7 | "version1,version2,expected", 8 | [ 9 | ("22.8.21.38", "22.8.21.38", True), 10 | ("24.10.4.191", "22.8.21.38", True), 11 | ("24.10.4.191", "22.8", True), 12 | ("22.8.21.38", "24.10.4.191", False), 13 | ("24.10.4.191.dev", "22.8.21.38", True), 14 | ("24.10.4.191-dev.1", "22.8.21.38", True), 15 | ("22.8.21.38", "24.10.4.191-dev.1", False), 16 | ("24.10.4.191-dev.1", "24.10.4.191", True), 17 | ], 18 | ) 19 | def test_version_ge(version1: str, version2: str, expected: bool) -> None: 20 | assert version_ge(version1, version2) == expected 21 | 22 | 23 | @pytest.mark.parametrize( 24 | "version1,version2,expected", 25 | [ 26 | ("22.8.21.38", "22.8.21.38", False), 27 | ("24.10.4.191", "22.8.21.38", False), 28 | ("22.8.21.38", "24.10.4.191", True), 29 | ("22.8.21.38", "24.10", True), 30 | ("22.8.21.38", "24.10.4.191.dev", True), 31 | ("22.8.21.38", "24.10.4.191-dev.1", True), 32 | ("24.10.4.191-dev.1", "24.10.4.191", False), 33 | ], 34 | ) 35 | def test_version_lt(version1: str, version2: str, expected: bool) -> None: 36 | assert version_lt(version1, version2) == expected 37 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/system.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | 3 | from click import Context 4 | 5 | from ch_tools.chadmin.internal.utils import clickhouse_client 6 | from ch_tools.common.utils import 
version_ge 7 | 8 | 9 | def get_version(ctx: Context) -> str: 10 | """ 11 | Get ClickHouse version. 12 | """ 13 | 14 | ch_version_from_config = ctx.obj["config"]["clickhouse"]["version"] 15 | if ch_version_from_config: 16 | return ch_version_from_config 17 | return clickhouse_client(ctx).get_clickhouse_version() 18 | 19 | 20 | def match_ch_version(ctx: Context, min_version: str) -> bool: 21 | """ 22 | Returns True if ClickHouse version >= min_version. 23 | """ 24 | return version_ge(get_version(ctx), min_version) 25 | 26 | 27 | def match_ch_backup_version(min_version: str) -> bool: 28 | """ 29 | Returns True if ch-backup version >= min_version. 30 | """ 31 | cmd = ["ch-backup", "version"] 32 | proc = subprocess.run( 33 | cmd, 34 | shell=False, 35 | check=False, 36 | stdout=subprocess.PIPE, 37 | stderr=subprocess.PIPE, 38 | ) 39 | 40 | if proc.returncode: 41 | raise RuntimeError( 42 | f"Failed to get ch-backup version: retcode {proc.returncode}, stderr: {proc.stderr.decode()}" 43 | ) 44 | 45 | return version_ge(proc.stdout.decode(), min_version) 46 | -------------------------------------------------------------------------------- /tests/modules/logs.py: -------------------------------------------------------------------------------- 1 | """ 2 | Logs management. 3 | """ 4 | 5 | import json 6 | import os 7 | 8 | from docker.models.containers import Container 9 | 10 | from ch_tools.common import logging 11 | 12 | from .docker import copy_container_dir, get_containers 13 | from .minio import export_s3_data 14 | from .typing import ContextT 15 | 16 | 17 | def save_logs(context: ContextT) -> None: 18 | """ 19 | Save logs and support materials.
20 | """ 21 | try: 22 | logs_dir = os.path.join(context.conf["staging_dir"], "logs") 23 | 24 | for container in get_containers(context): 25 | _save_container_logs(container, logs_dir) 26 | 27 | with open( 28 | os.path.join(logs_dir, "session_conf.json"), "w", encoding="utf-8" 29 | ) as out: 30 | json.dump(context.conf, out, default=repr, indent=4) 31 | 32 | export_s3_data(context, logs_dir) 33 | 34 | except Exception: 35 | logging.exception("Failed to save logs") 36 | raise 37 | 38 | 39 | def _save_container_logs(container: Container, logs_dir: str) -> None: 40 | base = os.path.join(logs_dir, container.name) 41 | os.makedirs(base, exist_ok=True) 42 | with open(os.path.join(base, "docker.log"), "wb") as out: 43 | out.write(container.logs(stdout=True, stderr=True, timestamps=True)) 44 | 45 | copy_container_dir(container, "/var/log", base) 46 | -------------------------------------------------------------------------------- /ch_tools/common/yaml.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | from collections import OrderedDict 4 | from typing import Any, Optional 5 | 6 | import yaml 7 | import yaml.representer 8 | 9 | 10 | def dict_representer(dumper: Any, data: Any) -> Any: 11 | return yaml.representer.SafeRepresenter.represent_dict(dumper, data.items()) 12 | 13 | 14 | def str_representer(dumper: Any, data: Any) -> Any: 15 | if "\n" in data: 16 | style = "|" 17 | else: 18 | style = None 19 | 20 | return yaml.representer.SafeRepresenter.represent_scalar( 21 | dumper, "tag:yaml.org,2002:str", data, style=style 22 | ) 23 | 24 | 25 | yaml.add_representer(dict, dict_representer) 26 | yaml.add_representer(OrderedDict, dict_representer) 27 | yaml.add_representer(str, str_representer) 28 | 29 | 30 | def load_yaml(file_path: str) -> Any: 31 | with open(os.path.expanduser(file_path), "r", encoding="utf-8") as f: 32 | return yaml.safe_load(f) 33 | 34 | 35 | def dump_yaml(data: Any, file_path: Optional[str] = 
None) -> Any: 36 | if not file_path: 37 | return yaml.dump( 38 | data, default_flow_style=False, allow_unicode=True, width=sys.maxsize 39 | ) 40 | 41 | with open(os.path.expanduser(file_path), "w", encoding="utf-8") as f: 42 | return yaml.dump( 43 | data, f, default_flow_style=False, allow_unicode=True, width=sys.maxsize 44 | ) 45 | -------------------------------------------------------------------------------- /debian/copyright: -------------------------------------------------------------------------------- 1 | Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ 2 | Upstream-Name: clickhouse-tools 3 | Source: https://github.com/yandex/ch-tools 4 | 5 | Files: * 6 | Copyright: 2023 Yandex LLC 7 | License: MIT 8 | The MIT License (MIT) 9 | Copyright (c) 2023 YANDEX LLC 10 | . 11 | Permission is hereby granted, free of charge, to any person obtaining a copy 12 | of this software and associated documentation files (the "Software"), to deal 13 | in the Software without restriction, including without limitation the rights 14 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 15 | copies of the Software, and to permit persons to whom the Software is 16 | furnished to do so, subject to the following conditions: 17 | . 18 | The above copyright notice and this permission notice shall be included in 19 | all copies or substantial portions of the Software. 20 | . 21 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 22 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 23 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 24 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 25 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 26 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 27 | THE SOFTWARE. 
28 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/client/query.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Dict, Optional 2 | 3 | 4 | class Query: 5 | mask = "*****" 6 | 7 | def __init__( 8 | self, 9 | value: str, 10 | sensitive_args: Optional[Dict[str, str]] = None, 11 | ): 12 | self.value = value 13 | self.sensitive_args = sensitive_args or {} 14 | 15 | def for_execute(self) -> str: 16 | return self._render(False) 17 | 18 | def _render(self, mask_sensitive: bool = True) -> str: 19 | if not self.sensitive_args: 20 | return self.value 21 | sensitive_args = ( 22 | self._sensitive_args_mask() if mask_sensitive else self.sensitive_args 23 | ) 24 | return self.value.format(**sensitive_args) 25 | 26 | def _sensitive_args_mask(self) -> Dict[str, str]: 27 | return {key: self.mask for key in self.sensitive_args} 28 | 29 | def __str__(self) -> str: 30 | return self._render(True) 31 | 32 | def __repr__(self) -> str: 33 | return f"{self.__class__.__name__}(value='{str(self)}', sensitive_args={self._sensitive_args_mask()})" 34 | 35 | def __eq__(self, other: Any) -> bool: 36 | return isinstance(other, self.__class__) and repr(self) == repr(other) 37 | 38 | def __hash__(self) -> int: 39 | return hash(repr(self)) 40 | 41 | def __add__(self, other: str) -> "Query": 42 | return Query(self.value + other, self.sensitive_args) 43 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/object_storage/s3_cleanup.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Iterator, List 2 | 3 | import boto3 4 | from botocore.client import Config 5 | 6 | from ch_tools.chadmin.internal.object_storage.s3_cleanup_stats import ResultStat 7 | from ch_tools.chadmin.internal.utils import chunked 8 | from ch_tools.common.clickhouse.config.storage_configuration 
import S3DiskConfiguration 9 | 10 | from .obj_list_item import ObjListItem 11 | 12 | BULK_DELETE_CHUNK_SIZE = 1000 13 | 14 | 15 | def cleanup_s3_object_storage( 16 | disk: S3DiskConfiguration, 17 | keys: Iterator[ObjListItem], 18 | stat: ResultStat, 19 | dry_run: bool = False, 20 | ) -> None: 21 | s3 = boto3.resource( 22 | "s3", 23 | endpoint_url=disk.endpoint_url, 24 | aws_access_key_id=disk.access_key_id, 25 | aws_secret_access_key=disk.secret_access_key, 26 | config=Config( 27 | s3={ 28 | "addressing_style": "auto", 29 | }, 30 | ), 31 | ) 32 | bucket = s3.Bucket(disk.bucket_name) 33 | 34 | for chunk in chunked(keys, BULK_DELETE_CHUNK_SIZE): 35 | if not dry_run: 36 | _bulk_delete(bucket, chunk) 37 | for item in chunk: 38 | stat.update_by_item(item) 39 | 40 | 41 | def _bulk_delete(bucket: Any, items: List[ObjListItem]) -> None: 42 | objects = [{"Key": item.path} for item in items] 43 | bucket.delete_objects(Delete={"Objects": objects, "Quiet": False}) 44 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/crash_log_group.py: -------------------------------------------------------------------------------- 1 | from click import Context, group, option, pass_context 2 | 3 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 4 | from ch_tools.chadmin.internal.utils import execute_query 5 | from ch_tools.common import logging 6 | from ch_tools.common.clickhouse.config import get_cluster_name 7 | 8 | 9 | @group("crash-log", cls=Chadmin) 10 | def crash_log_group() -> None: 11 | """ 12 | Commands for retrieving information from system.crash_log. 
13 | """ 14 | pass 15 | 16 | 17 | @crash_log_group.command("list") 18 | @option( 19 | "--cluster", 20 | "--on-cluster", 21 | "on_cluster", 22 | is_flag=True, 23 | help="Get log records from all hosts in the cluster.", 24 | ) 25 | @pass_context 26 | def list_crashes_command(ctx: Context, on_cluster: bool) -> None: 27 | cluster = get_cluster_name(ctx) if on_cluster else None 28 | query_str = """ 29 | SELECT 30 | {% if cluster %} 31 | hostName() "host", 32 | {% endif %} 33 | event_time, 34 | signal, 35 | thread_id, 36 | query_id, 37 | '\n' || arrayStringConcat(trace_full, '\n') AS trace, 38 | version 39 | {% if cluster %} 40 | FROM clusterAllReplicas({{ cluster }}, system.crash_log) 41 | {% else %} 42 | FROM system.crash_log 43 | {% endif %} 44 | ORDER BY event_time DESC 45 | """ 46 | logging.info(execute_query(ctx, query_str, cluster=cluster, format_="Vertical")) 47 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_replication_lag.py: -------------------------------------------------------------------------------- 1 | import click 2 | 3 | from ch_tools.common.commands.replication_lag import estimate_replication_lag 4 | from ch_tools.common.result import Result 5 | 6 | 7 | @click.command("replication-lag") 8 | @click.option( 9 | "-x", 10 | "--exec-critical", 11 | "xcrit", 12 | type=int, 13 | help="Critical threshold for one task execution.", 14 | ) 15 | @click.option( 16 | "-c", 17 | "--critical", 18 | "crit", 19 | type=int, 20 | help="Critical threshold for lag with errors.", 21 | ) 22 | @click.option("-w", "--warning", "warn", type=int, help="Warning threshold.") 23 | @click.option( 24 | "-M", 25 | "--merges-critical", 26 | "mcrit", 27 | type=click.FloatRange(0.0, 100.0), 28 | help="Critical threshold in percent of max_replicated_merges_in_queue.", 29 | ) 30 | @click.option( 31 | "-m", 32 | "--merges-warning", 33 | "mwarn", 34 | type=click.FloatRange(0.0, 100.0), 35 | help="Warning threshold in percent of 
max_replicated_merges_in_queue.", 36 | ) 37 | @click.option( 38 | "-v", 39 | "--verbose", 40 | "verbose", 41 | type=int, 42 | count=True, 43 | default=0, 44 | help="Show details about lag.", 45 | ) 46 | @click.pass_context 47 | def replication_lag_command( 48 | ctx: click.Context, 49 | xcrit: int, 50 | crit: int, 51 | warn: int, 52 | mwarn: float, 53 | mcrit: float, 54 | verbose: int, 55 | ) -> Result: 56 | return estimate_replication_lag(ctx, xcrit, crit, warn, mwarn, mcrit, verbose) 57 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_keeper.py: -------------------------------------------------------------------------------- 1 | import cloup 2 | from kazoo.client import KazooClient, KazooException 3 | from kazoo.handlers.threading import KazooTimeoutError 4 | 5 | from ch_tools.common.clickhouse.config import ClickhouseKeeperConfig 6 | from ch_tools.common.result import CRIT, OK, Result 7 | 8 | 9 | @cloup.command("keeper") 10 | @cloup.option( 11 | "-r", 12 | "--retries", 13 | "retries", 14 | type=int, 15 | default=3, 16 | help="Connection retries", 17 | ) 18 | @cloup.option( 19 | "-t", 20 | "--timeout", 21 | "timeout", 22 | type=int, 23 | default=10, 24 | help="Connection timeout (s)", 25 | ) 26 | @cloup.option( 27 | "-n", 28 | "--no-verify-ssl-certs", 29 | "no_verify_ssl_certs", 30 | is_flag=True, 31 | default=False, 32 | help="Allow unverified SSL certificates, e.g. self-signed ones", 33 | ) 34 | def keeper_command(retries: int, timeout: int, no_verify_ssl_certs: bool) -> Result: 35 | """ 36 | Check ClickHouse Keeper is alive. 
37 | """ 38 | keeper_port, use_ssl = ClickhouseKeeperConfig.load().port_pair 39 | if not keeper_port: 40 | return Result(OK, "Disabled") 41 | 42 | client = KazooClient( 43 | f"127.0.0.1:{keeper_port}", 44 | connection_retry=retries, 45 | command_retry=retries, 46 | timeout=timeout, 47 | use_ssl=use_ssl, 48 | verify_certs=not no_verify_ssl_certs, 49 | ) 50 | try: 51 | client.start() 52 | client.get("/") 53 | client.stop() 54 | except (KazooException, KazooTimeoutError) as e: 55 | return Result(CRIT, repr(e)) 56 | 57 | return Result(OK) 58 | -------------------------------------------------------------------------------- /ch_tools/common/process_pool.py: -------------------------------------------------------------------------------- 1 | from concurrent.futures import ThreadPoolExecutor, as_completed 2 | from dataclasses import dataclass 3 | from typing import Any, Callable, Dict, List 4 | 5 | from ch_tools.common import logging 6 | 7 | 8 | @dataclass 9 | class WorkerTask: 10 | identifier: str 11 | function: Callable 12 | kwargs: Dict[str, Any] 13 | 14 | 15 | def execute_tasks_in_parallel( 16 | tasks: List[WorkerTask], max_workers: int = 4, keep_going: bool = False 17 | ) -> Dict[str, Any]: 18 | with ThreadPoolExecutor(max_workers=max_workers) as executor: 19 | # Can't use map function here. The map method returns a generator 20 | # and it is not possible to resume a generator after an exception occurs. 
21 | # https://peps.python.org/pep-0255/#specification-generators-and-exception-propagation 22 | futures_to_identifier = { 23 | executor.submit( 24 | task.function, 25 | **task.kwargs, 26 | ): task.identifier 27 | for task in tasks 28 | } 29 | result: Dict[str, Any] = {} 30 | for future in as_completed(futures_to_identifier): 31 | idf = futures_to_identifier[future] 32 | try: 33 | result[idf] = future.result() 34 | except Exception as e: 35 | if keep_going: 36 | logging.warning( 37 | "Ignoring the exception raised while executing {} due to keep-going flag: {!r}", 38 | idf, 39 | e, 40 | ) 41 | else: 42 | raise 43 | return result 44 | -------------------------------------------------------------------------------- /tests/unit/common/type/test_typed_enum.py: -------------------------------------------------------------------------------- 1 | from functools import reduce 2 | from typing import Sequence, TypeVar 3 | 4 | from hamcrest import assert_that, equal_to 5 | from pytest import mark 6 | 7 | from ch_tools.common.type.typed_enum import IntEnum, StrEnum, TypedEnum 8 | 9 | T = TypeVar("T", int, str) 10 | 11 | 12 | class SEnum(StrEnum): 13 | A = "AAA" 14 | B = "BBB" 15 | 16 | 17 | class IEnum(IntEnum): 18 | A = 1 19 | B = 2 20 | 21 | 22 | @mark.parametrize( 23 | ["inputs", "stringified_expected", "summed_expected"], 24 | [ 25 | ((SEnum.A,), ["AAA"], "AAA"), 26 | ((SEnum.B,), ["BBB"], "BBB"), 27 | ((SEnum.A, SEnum.A), ["AAA", "AAA"], "AAAAAA"), 28 | ((SEnum.A, SEnum.B), ["AAA", "BBB"], "AAABBB"), 29 | ((SEnum.B, SEnum.A), ["BBB", "AAA"], "BBBAAA"), 30 | ((SEnum.B, SEnum.B), ["BBB", "BBB"], "BBBBBB"), 31 | ((SEnum.B, SEnum.B, SEnum.B), ["BBB", "BBB", "BBB"], "BBBBBBBBB"), 32 | ((IEnum.A,), ["1"], 1), 33 | ((IEnum.B,), ["2"], 2), 34 | ((IEnum.A, IEnum.A), ["1", "1"], 2), 35 | ((IEnum.A, IEnum.B), ["1", "2"], 3), 36 | ((IEnum.B, IEnum.A), ["2", "1"], 3), 37 | ((IEnum.B, IEnum.B), ["2", "2"], 4), 38 | ((IEnum.B, IEnum.B, IEnum.B), ["2", "2", "2"], 6), 39 | ], 40 | ) 41 | 
def test_typed_enum( 42 | inputs: Sequence[TypedEnum], stringified_expected: Sequence[str], summed_expected: T 43 | ) -> None: 44 | stringified: Sequence[str] = [str(i) for i in inputs] 45 | assert_that(stringified, equal_to(stringified_expected)) 46 | 47 | summed = reduce(lambda a, b: a + b, inputs) # type: ignore 48 | assert_that(summed, equal_to(summed_expected)) 49 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_system_metrics.py: -------------------------------------------------------------------------------- 1 | import click 2 | import requests 3 | 4 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 5 | from ch_tools.common.result import CRIT, OK, WARNING, Result 6 | 7 | 8 | @click.command("system-metrics") 9 | @click.option("-n", "--name", "name", type=str, help="Metric's name to check") 10 | @click.option("-c", "--critical", "crit", type=int, help="Critical threshold.") 11 | @click.option("-w", "--warning", "warn", type=int, help="Warning threshold.") 12 | @click.pass_context 13 | def system_metrics_command( 14 | ctx: click.Context, 15 | name: str, 16 | crit: int, 17 | warn: int, 18 | ) -> Result: 19 | """ 20 | Check system metric. 21 | """ 22 | try: 23 | metric = _get_metric(ctx, name) 24 | except IndexError: 25 | return Result(CRIT, "Metric not available") 26 | except requests.exceptions.HTTPError: 27 | return Result(CRIT, "Failed to get metric") 28 | 29 | metric_name, value = metric[0], int(metric[1]) 30 | if value > crit: 31 | return Result(CRIT, f'"{metric_name}", crit = {crit}, count = {value}') 32 | if value > warn: 33 | return Result( 34 | WARNING, 35 | f'"{metric_name}", warn = {warn}, count = {value}', 36 | ) 37 | 38 | return Result(OK) 39 | 40 | 41 | def _get_metric(ctx: click.Context, name: str) -> dict: 42 | """ 43 | Select and return metric from system.metrics. 
44 | """ 45 | query = "SELECT * from system.metrics WHERE lower(metric) = '{{ name }}'" 46 | return clickhouse_client(ctx).query_json_data_first_row( 47 | query=query, query_args={"name": name.lower()} 48 | ) 49 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/dictionary_group.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | from click import Context, group, option, pass_context 4 | 5 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 6 | from ch_tools.chadmin.internal.dictionary import list_dictionaries, reload_dictionary 7 | from ch_tools.common import logging 8 | from ch_tools.common.cli.formatting import print_response 9 | 10 | 11 | @group("dictionary", cls=Chadmin) 12 | def dictionary_group() -> None: 13 | """Commands to manage external dictionaries.""" 14 | pass 15 | 16 | 17 | @dictionary_group.command("list") 18 | @option("--name") 19 | @option("--status") 20 | @pass_context 21 | def list_command(ctx: Context, name: str, status: str) -> None: 22 | """ 23 | List dictionaries. 24 | """ 25 | dictionaries = list_dictionaries(ctx, name=name, status=status) 26 | print_response( 27 | ctx, 28 | dictionaries, 29 | default_format="table", 30 | ) 31 | 32 | 33 | @dictionary_group.command("reload") 34 | @option("--name") 35 | @option("--status") 36 | @pass_context 37 | def reload_command(ctx: Context, name: str, status: str) -> None: 38 | """ 39 | Reload one or several dictionaries. 
40 | """ 41 | dictionaries = list_dictionaries(ctx, name=name, status=status) 42 | for dictionary in dictionaries: 43 | logging.info("Reloading dictionary {}", _full_name(dictionary)) 44 | reload_dictionary(ctx, database=dictionary["database"], name=dictionary["name"]) 45 | 46 | 47 | def _full_name(dictionary: Any) -> str: 48 | db_name = dictionary["database"] 49 | dict_name = dictionary["name"] 50 | 51 | if db_name: 52 | return f"`{db_name}`.`{dict_name}`" 53 | 54 | return f"`{dict_name}`" 55 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_tls.py: -------------------------------------------------------------------------------- 1 | from typing import List, Optional 2 | 3 | import click 4 | 5 | from ch_tools.common.clickhouse.client.clickhouse_client import ( 6 | ClickhousePort, 7 | clickhouse_client, 8 | ) 9 | from ch_tools.common.result import Result 10 | from ch_tools.common.tls import check_cert_on_ports 11 | 12 | CERTIFICATE_PATH = "/etc/clickhouse-server/ssl/server.crt" 13 | 14 | 15 | @click.command("tls") 16 | @click.option("-c", "--critical", "crit", type=int, help="Critical threshold.") 17 | @click.option("-w", "--warning", "warn", type=int, help="Warning threshold.") 18 | @click.option( 19 | "-p", 20 | "--ports", 21 | "ports", 22 | type=str, 23 | default=None, 24 | help="Comma-separated list of ports. By default, ports are read from the ClickHouse config.", 25 | ) 26 | @click.option("--chain", "chain", is_flag=True, help="Verify certificate chain.") 27 | @click.pass_context 28 | def tls_command( 29 | ctx: click.Context, 30 | crit: int, 31 | warn: int, 32 | ports: Optional[str], 33 | chain: bool, 34 | ) -> Result: 35 | """ 36 | Check the TLS certificate for expiration and that the certificate from the filesystem is the one actually in use. 
37 | """ 38 | return check_cert_on_ports( 39 | get_ports(ctx, ports), crit, warn, chain, CERTIFICATE_PATH 40 | ) 41 | 42 | 43 | def get_ports(ctx: click.Context, ports: Optional[str]) -> List[str]: 44 | if ports: 45 | return ports.split(",") 46 | client = clickhouse_client(ctx) 47 | result = [] 48 | if client.check_port(ClickhousePort.HTTPS): 49 | result.append(client.get_port(ClickhousePort.HTTPS)) 50 | if client.check_port(ClickhousePort.TCP_SECURE): 51 | result.append(client.get_port(ClickhousePort.TCP_SECURE)) 52 | return [str(port) for port in result] 53 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![license](https://img.shields.io/github/license/yandex/ch-tools)](https://github.com/yandex/ch-tools/blob/main/LICENSE) 2 | [![tests status](https://img.shields.io/github/actions/workflow/status/yandex/ch-tools/.github%2Fworkflows%2Fworkflow.yml?event=push&label=tests&logo=github)](https://github.com/yandex/ch-tools/actions/workflows/workflow.yml?query=event%3Apush) 3 | [![chat](https://img.shields.io/badge/telegram-chat-blue)](https://t.me/+O4gURpLnQ604OTE6) 4 | 5 | # clickhouse-tools 6 | 7 | **clickhouse-tools** is a set of tools for the administration and diagnostics of the [ClickHouse](https://clickhouse.com/) DBMS. 8 | 9 | ## Tools 10 | 11 | **clickhouse-tools** consists of the following components: 12 | - [chadmin](./ch_tools/chadmin/README.md) - ClickHouse administration tool 13 | - [ch-monitoring](./ch_tools/monrun_checks/README.md) - ClickHouse monitoring tool 14 | - [keeper-monitoring](./ch_tools/monrun_checks_keeper/README.md) - ClickHouse Keeper / ZooKeeper monitoring tool 15 | 16 | All of these tools must be run on the same host as the ClickHouse server. 
17 | 18 | ## Local development 19 | 20 | Requirements: 21 | * GNU Make version > 3.81 22 | * [uv](https://docs.astral.sh/uv) 23 | * Docker 24 | 25 | ```sh 26 | # lint 27 | make lint 28 | 29 | # unit tests 30 | make test-unit 31 | make test-unit PYTEST_ARGS="-k test_name" 32 | 33 | # integration tests (rebuild docker images using a .whl file) 34 | make test-integration 35 | make test-integration BEHAVE_ARGS="-i feature_name" 36 | 37 | # integration tests (supply a custom ClickHouse version to test against) 38 | CLICKHOUSE_VERSION="1.2.3.4" make test-integration 39 | # If you want to have containers running on failure, supply a flag: 40 | # BEHAVE_ARGS="-D no_stop_on_fail" 41 | 42 | # For building deb packages 43 | make build-deb-package 44 | ``` 45 | -------------------------------------------------------------------------------- /ch_tools/common/result.py: -------------------------------------------------------------------------------- 1 | import re 2 | 3 | from click import Context 4 | 5 | OK = 0 6 | WARNING = 1 7 | CRIT = 2 8 | 9 | 10 | class Result: 11 | def __init__(self, code: int = OK, message: str = "OK", verbose: str = "") -> None: 12 | self.code = code 13 | self.message = message 14 | self.verbose = verbose 15 | 16 | 17 | class Status: 18 | """Class for holding Juggler status.""" 19 | 20 | def __init__(self) -> None: 21 | self.code = 0 22 | self.text: list[str] = [] 23 | self.verbose: list[str] = [] 24 | 25 | @property 26 | def message(self) -> str: 27 | """Result message.""" 28 | # concatenate all received statuses 29 | message = ". 
".join(self.text) 30 | if not message and self.code == 0: 31 | message = "OK" 32 | 33 | return message 34 | 35 | def set_code(self, new_code: int) -> None: 36 | """Set the code if it is greater than the current.""" 37 | if new_code > self.code: 38 | self.code = new_code 39 | 40 | def append(self, new_text: str) -> None: 41 | """Accumulate the status text.""" 42 | self.text.append(new_text) 43 | 44 | def add_verbose(self, new_text: str) -> None: 45 | """Add detail info.""" 46 | self.verbose.append(new_text) 47 | 48 | def report(self, ctx: Context) -> None: 49 | """Output formatted status message.""" 50 | message = self.message 51 | for rule in ctx.obj["config"]["monitoring"]["output"]["escaping_rules"]: 52 | message = re.sub(rule["pattern"], rule["replacement"], message) 53 | 54 | print(f"{self.code};{message}") 55 | if self.verbose: 56 | for v in self.verbose: 57 | if v: 58 | print("\n") 59 | print(v) 60 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_core_dumps.py: -------------------------------------------------------------------------------- 1 | import pathlib 2 | import time 3 | from datetime import datetime 4 | from typing import Any, List, Optional 5 | 6 | import click 7 | 8 | from ch_tools.common.result import Result 9 | 10 | 11 | @click.command("core-dumps") 12 | @click.option( 13 | "-t", 14 | "--core-directory", 15 | "core_directory", 16 | help="Core dump directory.", 17 | ) 18 | @click.option( 19 | "-n", 20 | "--crit-interval-seconds", 21 | "crit_seconds", 22 | type=int, 23 | help="Time interval to check in seconds.", 24 | ) 25 | def core_dumps_command(core_directory: str, crit_seconds: int) -> Result: 26 | """ 27 | Check for core dumps. 
28 | """ 29 | status = 0 30 | 31 | core_dir = pathlib.Path(core_directory) 32 | if core_dir.exists(): 33 | dumps = get_core_dumps(core_dir, crit_seconds) 34 | if dumps: 35 | status = 2 36 | else: 37 | # look for old dumps 38 | dumps = get_core_dumps(core_dir) 39 | if dumps: 40 | status = 1 41 | message = ";".join([f"{f} [{dt}]" for f, dt in dumps]) 42 | else: 43 | status = 1 44 | message = f"Core dump directory does not exist: {core_dir}" 45 | return Result(status, message or "OK") 46 | 47 | 48 | def get_core_dumps(core_dir: Any, interval_seconds: Optional[int] = None) -> List: 49 | """ 50 | Get core dumps dumped during the last `interval_seconds`. 51 | """ 52 | result = [] 53 | for f in core_dir.iterdir(): 54 | if not (f.is_file() and f.owner() == "clickhouse"): 55 | continue 56 | ctime = f.stat().st_ctime 57 | dt = datetime.fromtimestamp(ctime) 58 | if interval_seconds is None or (ctime > time.time() - interval_seconds): 59 | result.append((f, dt)) 60 | 61 | return result 62 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_ro_replica.py: -------------------------------------------------------------------------------- 1 | import click 2 | 3 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 4 | from ch_tools.common.result import CRIT, OK, Result 5 | 6 | 7 | @click.command("ro-replica") 8 | @click.option( 9 | "-v", 10 | "--verbose", 11 | is_flag=True, 12 | help="Show details about ro tables.", 13 | ) 14 | @click.pass_context 15 | def ro_replica_command(ctx: click.Context, verbose: bool = False) -> Result: 16 | """ 17 | Check for readonly replicated tables. 
18 | """ 19 | query = """ 20 | SELECT database, table, replica_path, last_queue_update_exception, zookeeper_exception 21 | FROM system.replicas WHERE is_readonly 22 | """ 23 | response = clickhouse_client(ctx).query_json_data(query, compact=False) 24 | if response: 25 | msg_verbose = "" 26 | 27 | if verbose: 28 | headers = [ 29 | "database", 30 | "table", 31 | "replica_path", 32 | "last_queue_update_exception", 33 | "zookeeper_exception", 34 | ] 35 | 36 | formatted_data = [] 37 | 38 | for item in response: 39 | formatted_row = "\n".join( 40 | [ 41 | f"{header}: {item[header]}" 42 | for header in headers 43 | if header in item 44 | ] 45 | ) 46 | formatted_data.append(formatted_row) 47 | 48 | msg_verbose = "\n\n".join(data for data in formatted_data) 49 | 50 | tables_str = ", ".join( 51 | f"{item['database']}.{item['table']}" for item in response 52 | ) 53 | 54 | return Result( 55 | CRIT, f"Readonly replica tables: {tables_str}", verbose=msg_verbose 56 | ) 57 | 58 | return Result(OK) 59 | -------------------------------------------------------------------------------- /tests/features/chadmin_perf_diag.feature: -------------------------------------------------------------------------------- 1 | Feature: chadmin performance diagnostics. 
2 | 3 | Background: 4 | Given default configuration 5 | And a working s3 6 | And a working zookeeper 7 | And a working clickhouse on clickhouse01 8 | And a working clickhouse on clickhouse02 9 | 10 | @require_version_23.8 11 | Scenario: Sanity checks: 12 | When we execute command on clickhouse01 13 | """ 14 | chadmin flamegraph collect-by-interval --trace-type CPU 15 | """ 16 | And we execute command on clickhouse01 17 | """ 18 | chadmin flamegraph setup --trace-type MemorySample 19 | """ 20 | And we execute command on clickhouse01 21 | """ 22 | clickhouse client --query-id 123 --query ' SELECT count(*) FROM numbers(4000000) AS l LEFT JOIN (select rand32()%1000000 AS number FROM numbers(40000000)) AS r ON l.number=r.number SETTINGS use_query_cache=0;' 23 | """ 24 | And we execute command on clickhouse01 25 | """ 26 | chadmin flamegraph collect-by-query --query-id 123 --trace-type MemorySample 27 | """ 28 | And we execute command on clickhouse01 29 | """ 30 | chadmin flamegraph cleanup --trace-type MemorySample 31 | """ 32 | And we execute command on clickhouse01 33 | """ 34 | chadmin flamegraph setup --trace-type Real 35 | """ 36 | And we execute command on clickhouse01 37 | """ 38 | clickhouse client --query-id 1234 --query ' SELECT count(*) FROM numbers(4000000) AS l LEFT JOIN (select rand32()%1000000 AS number FROM numbers(40000000)) AS r ON l.number=r.number SETTINGS use_query_cache=0;' 39 | """ 40 | And we execute command on clickhouse01 41 | """ 42 | chadmin flamegraph collect-by-query --query-id 1234 --trace-type Real 43 | """ 44 | And we execute command on clickhouse01 45 | """ 46 | chadmin flamegraph cleanup --trace-type Real 47 | """ 48 | Then it completes successfully 49 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_resetup_state.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | import subprocess 4 | from typing import Any 
5 | 6 | import click 7 | import psutil 8 | 9 | from ch_tools.common.clickhouse.config.path import CLICKHOUSE_RESETUP_CONFIG_PATH 10 | from ch_tools.common.result import CRIT, OK, Result 11 | from ch_tools.monrun_checks.exceptions import die 12 | 13 | 14 | # TODO: delete unused ssl and ca_bundle options after some time 15 | @click.command("resetup-state") 16 | @click.option("-s", "--ssl", "_ssl", is_flag=True, help="Use HTTPS rather than HTTP.") 17 | @click.option("--ca_bundle", "_ca_bundle", help="Path to CA bundle to use.") 18 | def resetup_state_command(_ssl: bool, _ca_bundle: Any) -> Any: 19 | """ 20 | Check state of resetup process. 21 | """ 22 | 23 | check_resetup_running() 24 | check_resetup_required() 25 | 26 | if os.path.isfile(CLICKHOUSE_RESETUP_CONFIG_PATH): 27 | return Result( 28 | CRIT, "Detected resetup config, but couldn't find running resetup process" 29 | ) 30 | 31 | return Result(OK) 32 | 33 | 34 | def check_resetup_running() -> None: 35 | """ 36 | Check for currently running resetup 37 | """ 38 | for proc in psutil.process_iter(): 39 | if {"/usr/bin/ch-backup", "restore-schema"}.issubset(proc.cmdline()): 40 | die(0, "resetup is running (restore schema)") 41 | if {"/usr/bin/chadmin", "wait", "replication-sync"}.issubset(proc.cmdline()): 42 | die(0, "resetup is running (wait for replication sync)") 43 | 44 | 45 | def check_resetup_required() -> None: 46 | """ 47 | Check resetup conditions 48 | """ 49 | cmd = [ 50 | "sudo", 51 | "salt-call", 52 | "mdb_clickhouse.resetup_required", 53 | "--out", 54 | "json", 55 | "--local", 56 | ] 57 | output = subprocess.check_output(cmd, stderr=subprocess.STDOUT) 58 | if json.loads(output)["local"]: 59 | die(0, "OK") 60 | -------------------------------------------------------------------------------- /tests/unit/common/query/test_query.py: -------------------------------------------------------------------------------- 1 | from ch_tools.common.clickhouse.client.query import Query 2 | 3 | query_1 = Query( 4 | 
"SELECT * FROM users WHERE name = {{name}} AND password = {password}", 5 | {"password": "123"}, 6 | ) 7 | query_2 = Query("SELECT * FROM users WHERE name = {name}", {}) 8 | query_3 = Query("SELECT * FROM users WHERE password = {password}", {"password": "123"}) 9 | 10 | 11 | def test_for_execute() -> None: 12 | assert ( 13 | query_1.for_execute() 14 | == "SELECT * FROM users WHERE name = {name} AND password = 123" 15 | ) 16 | assert query_2.for_execute() == "SELECT * FROM users WHERE name = {name}" 17 | assert query_3.for_execute() == "SELECT * FROM users WHERE password = 123" 18 | 19 | 20 | def test_str() -> None: 21 | assert ( 22 | str(query_1) == "SELECT * FROM users WHERE name = {name} AND password = *****" 23 | ) 24 | assert str(query_2) == "SELECT * FROM users WHERE name = {name}" 25 | assert str(query_3) == "SELECT * FROM users WHERE password = *****" 26 | 27 | 28 | def test_repr() -> None: 29 | assert ( 30 | repr(query_1) 31 | == "Query(value='SELECT * FROM users WHERE name = {name} AND password = *****', sensitive_args={'password': '*****'})" 32 | ) 33 | assert ( 34 | repr(query_2) 35 | == "Query(value='SELECT * FROM users WHERE name = {name}', sensitive_args={})" 36 | ) 37 | assert ( 38 | repr(query_3) 39 | == "Query(value='SELECT * FROM users WHERE password = *****', sensitive_args={'password': '*****'})" 40 | ) 41 | 42 | 43 | def test_eq_and_hash() -> None: 44 | query_2 = Query(query_1.value, query_1.sensitive_args) 45 | assert query_1 == query_2 46 | assert hash(query_1) == hash(query_2) 47 | 48 | 49 | def test_add() -> None: 50 | added_query = query_1 + " LIMIT 10" 51 | assert ( 52 | str(added_query) 53 | == "SELECT * FROM users WHERE name = {name} AND password = ***** LIMIT 10" 54 | ) 55 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Notice to external contributors 2 | 3 | ## General info 4 | 5 | Hello! 
In order for us (YANDEX LLC) to accept patches and other contributions from you, you will have to adopt our Yandex Contributor License Agreement (the "**CLA**"). The current version of the CLA can be found here: 6 | 1) https://yandex.ru/legal/cla/?lang=en (in English) and 7 | 2) https://yandex.ru/legal/cla/?lang=ru (in Russian). 8 | 9 | By adopting the CLA, you state the following: 10 | 11 | * You obviously wish and are willingly licensing your contributions to us for our open source projects under the terms of the CLA, 12 | * You have read the terms and conditions of the CLA and agree with them in full, 13 | * You are legally able to provide and license your contributions as stated, 14 | * We may use your contributions for our open source projects and for any other our project too, 15 | * We rely on your assurances concerning the rights of third parties in relation to your contributions. 16 | 17 | If you agree with these principles, please read and adopt our CLA. By providing us your contributions, you hereby declare that you have already read and adopt our CLA, and we may freely merge your contributions with our corresponding open source project and use it in further in accordance with terms and conditions of the CLA. 18 | 19 | ## Provide contributions 20 | 21 | If you have already adopted terms and conditions of the CLA, you are able to provide your contributions. When you submit your pull request, please add the following information into it: 22 | 23 | ``` 24 | I hereby agree to the terms of the CLA available at: [link]. 25 | ``` 26 | 27 | Replace the bracketed text as follows: 28 | * [link] is the link to the current version of the CLA: https://yandex.ru/legal/cla/?lang=en (in English) or https://yandex.ru/legal/cla/?lang=ru (in Russian). 29 | 30 | It is enough to provide us such notification once. 31 | 32 | ## Other questions 33 | 34 | If you have any questions, please mail us at opensource@yandex-team.ru. 
35 | -------------------------------------------------------------------------------- /tests/unit/chadmin/test_stat_dict.py: -------------------------------------------------------------------------------- 1 | from ch_tools.chadmin.internal.object_storage.obj_list_item import ObjListItem 2 | from ch_tools.chadmin.internal.object_storage.s3_cleanup_stats import ( 3 | ResultStat, 4 | StatisticsPeriod, 5 | ) 6 | 7 | 8 | def test_default_keys() -> None: 9 | stat = ResultStat() 10 | 11 | assert stat.total == {"total_size": 0, "deleted": 0} 12 | assert len(stat.items()) == 1 13 | 14 | 15 | def test_update() -> None: 16 | stat = ResultStat() 17 | item = ObjListItem.from_tab_separated("2025-11-05 10:15:20\tsome/path/on/s3\t4") 18 | stat.update_by_item(item) 19 | 20 | assert len(stat.items()) == 1 21 | assert stat.total == {"total_size": 4, "deleted": 1} 22 | 23 | stat.update_by_item(item) 24 | assert stat.total == {"total_size": 8, "deleted": 2} 25 | 26 | 27 | def test_partitioning_month() -> None: 28 | stat = ResultStat(StatisticsPeriod.MONTH) 29 | item1 = ObjListItem.from_tab_separated("2025-11-05 10:15:20\tsome/path/on/s3\t4") 30 | item2 = ObjListItem.from_tab_separated("2025-10-05 10:15:20\tsome/path/on/s3\t2") 31 | 32 | stat.update_by_item(item1) 33 | stat.update_by_item(item2) 34 | 35 | assert len(stat.items()) == 3 36 | assert stat.total == {"total_size": 6, "deleted": 2} 37 | assert stat["2025-11"] == {"total_size": 4, "deleted": 1} 38 | assert stat["2025-10"] == {"total_size": 2, "deleted": 1} 39 | 40 | 41 | def test_partitioning_day() -> None: 42 | stat = ResultStat(StatisticsPeriod.DAY) 43 | item1 = ObjListItem.from_tab_separated("2025-11-05 10:15:20\tsome/path/on/s3\t4") 44 | item2 = ObjListItem.from_tab_separated("2025-11-06 10:15:20\tsome/path/on/s3\t2") 45 | 46 | stat.update_by_item(item1) 47 | stat.update_by_item(item2) 48 | 49 | assert len(stat.items()) == 3 50 | assert stat.total == {"total_size": 6, "deleted": 2} 51 | assert stat["2025-11-05"] == 
{"total_size": 4, "deleted": 1} 52 | assert stat["2025-11-06"] == {"total_size": 2, "deleted": 1} 53 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/object_storage/s3_cleanup_stats.py: -------------------------------------------------------------------------------- 1 | from collections import defaultdict 2 | from datetime import datetime 3 | from enum import Enum 4 | from typing import TypedDict 5 | 6 | from ch_tools.chadmin.internal.object_storage.obj_list_item import ObjListItem 7 | from ch_tools.chadmin.internal.utils import DATETIME_FORMAT 8 | 9 | 10 | class StatisticsPeriod(str, Enum): 11 | """ 12 | How to partition stats of deleted orphaned objects 13 | """ 14 | 15 | DAY = "day" 16 | MONTH = "month" 17 | ALL = "all" 18 | 19 | 20 | class StatDict(TypedDict): 21 | deleted: int 22 | total_size: int 23 | 24 | 25 | class ResultStat(defaultdict): 26 | def _default_factory(self) -> StatDict: 27 | return {"deleted": 0, "total_size": 0} 28 | 29 | def __init__( 30 | self, stat_partitioning: StatisticsPeriod = StatisticsPeriod.ALL 31 | ) -> None: 32 | super().__init__(self._default_factory) 33 | self._stat_partitioning = stat_partitioning 34 | 35 | @property 36 | def total(self) -> StatDict: 37 | return self["Total"] 38 | 39 | def update_by_item(self, item: ObjListItem) -> None: 40 | self.total["deleted"] += 1 41 | self.total["total_size"] += item.size 42 | 43 | if self._stat_partitioning == StatisticsPeriod.ALL: 44 | return 45 | 46 | key = self._get_stat_key(item.last_modified) 47 | self[key]["deleted"] += 1 48 | self[key]["total_size"] += item.size 49 | 50 | def _get_stat_key(self, timestamp: datetime) -> str: 51 | if self._stat_partitioning == StatisticsPeriod.ALL: 52 | return "Total" 53 | time_str = timestamp.strftime(DATETIME_FORMAT) 54 | if self._stat_partitioning == StatisticsPeriod.MONTH: 55 | ymd = time_str.split("-") 56 | return "-".join(ymd[:2]) 57 | if self._stat_partitioning == 
StatisticsPeriod.DAY: 58 | ymd_hms = time_str.split(" ") 59 | return ymd_hms[0] 60 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_log_errors.py: -------------------------------------------------------------------------------- 1 | import re 2 | from datetime import datetime, timedelta 3 | from typing import Any 4 | 5 | import click 6 | from file_read_backwards import FileReadBackwards 7 | 8 | from ch_tools.common.cli.parameters import RegexpParamType 9 | from ch_tools.common.result import CRIT, OK, WARNING, Result 10 | 11 | REGEXP = re.compile( 12 | r"^([0-9]{4}\.[0-9]{2}\.[0-9]{2}\ [0-9]{2}\:[0-9]{2}\:[0-9]{2}).*?<(Error|Fatal)>" 13 | ) 14 | 15 | 16 | @click.command("log-errors") 17 | @click.option("-c", "--critical", "crit", type=int, help="Critical threshold.") 18 | @click.option("-w", "--warning", "warn", type=int, help="Warning threshold.") 19 | @click.option( 20 | "-n", 21 | "--watch-seconds", 22 | "watch_seconds", 23 | type=int, 24 | help="Watch seconds.", 25 | ) 26 | @click.option( 27 | "-e", 28 | "--exclude", 29 | "exclude", 30 | type=RegexpParamType(), 31 | help="Excluded error.", 32 | ) 33 | @click.option( 34 | "-f", 35 | "--logfile", 36 | "logfile", 37 | help="Log file path.", 38 | ) 39 | def log_errors_command( 40 | crit: int, warn: int, watch_seconds: int, exclude: Any, logfile: str 41 | ) -> Result: 42 | """ 43 | Check errors in ClickHouse server logs. 
44 | """ 45 | datetime_start = datetime.now() - timedelta(seconds=watch_seconds) 46 | errors = 0 47 | 48 | with FileReadBackwards(logfile, encoding="utf-8") as f: 49 | for line in f: 50 | if exclude.search(line): 51 | continue 52 | match = REGEXP.match(line) 53 | if match is None: 54 | continue 55 | date = match.group(1) 56 | if datetime.strptime(date, "%Y.%m.%d %H:%M:%S") < datetime_start: 57 | break 58 | errors += 1 59 | 60 | msg = f"{errors} errors for last {watch_seconds} seconds" 61 | if errors >= crit: 62 | return Result(CRIT, msg) 63 | if errors >= warn: 64 | return Result(WARNING, msg) 65 | return Result(OK, f"OK, {msg}") 66 | -------------------------------------------------------------------------------- /tests/images/zookeeper/config/log4j.properties: -------------------------------------------------------------------------------- 1 | # Define some default values that can be overridden by system properties 2 | zookeeper.root.logger=INFO, ROLLINGFILE 3 | 4 | zookeeper.console.threshold=INFO 5 | 6 | zookeeper.log.dir=/var/log/zookeeper/ 7 | zookeeper.log.file=zookeeper.log 8 | zookeeper.log.threshold=INFO 9 | zookeeper.log.maxfilesize=256MB 10 | zookeeper.log.maxbackupindex=2 11 | 12 | zookeeper.tracelog.dir=/var/log/zookeeper 13 | zookeeper.tracelog.file=zookeeper_trace.log 14 | 15 | log4j.rootLogger=${zookeeper.root.logger} 16 | 17 | # 18 | # console 19 | # Add "console" to rootlogger above if you want to use this 20 | # 21 | log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender 22 | log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold} 23 | log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout 24 | log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n 25 | 26 | # 27 | # Add ROLLINGFILE to rootLogger to get log file output 28 | # 29 | log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender 30 | log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold} 31 | 
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file} 32 | log4j.appender.ROLLINGFILE.MaxFileSize=${zookeeper.log.maxfilesize} 33 | log4j.appender.ROLLINGFILE.MaxBackupIndex=${zookeeper.log.maxbackupindex} 34 | log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout 35 | log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n 36 | 37 | # 38 | # Add TRACEFILE to rootLogger to get log file output 39 | # Log TRACE level and above messages to a log file 40 | # 41 | log4j.appender.TRACEFILE=org.apache.log4j.FileAppender 42 | log4j.appender.TRACEFILE.Threshold=TRACE 43 | log4j.appender.TRACEFILE.File=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file} 44 | 45 | log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout 46 | ### Notice we are including log4j's NDC here (%x) 47 | log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L][%x] - %m%n 48 | 49 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/partition.py: -------------------------------------------------------------------------------- 1 | from click import Context 2 | 3 | from ch_tools.chadmin.internal.utils import execute_query 4 | 5 | 6 | def attach_partition( 7 | ctx: Context, database: str, table: str, partition_id: str, dry_run: bool = False 8 | ) -> None: 9 | """ 10 | Attach the specified table partition. 11 | """ 12 | query = f"ALTER TABLE `{database}`.`{table}` ATTACH PARTITION ID '{partition_id}'" 13 | _execute_query(ctx, query, dry_run) 14 | 15 | 16 | def detach_partition( 17 | ctx: Context, database: str, table: str, partition_id: str, dry_run: bool = False 18 | ) -> None: 19 | """ 20 | Detach the specified table partition. 
21 | """ 22 | query = f"ALTER TABLE `{database}`.`{table}` DETACH PARTITION ID '{partition_id}'" 23 | _execute_query(ctx, query, dry_run) 24 | 25 | 26 | def drop_partition( 27 | ctx: Context, database: str, table: str, partition_id: str, dry_run: bool = False 28 | ) -> None: 29 | """ 30 | Drop the specified table partition. 31 | """ 32 | query = f"ALTER TABLE `{database}`.`{table}` DROP PARTITION ID '{partition_id}'" 33 | _execute_query(ctx, query, dry_run) 34 | 35 | 36 | def optimize_partition( 37 | ctx: Context, database: str, table: str, partition_id: str, dry_run: bool = False 38 | ) -> None: 39 | """ 40 | Optimize the specified table partition. 41 | """ 42 | query = f"OPTIMIZE TABLE `{database}`.`{table}` PARTITION ID '{partition_id}'" 43 | _execute_query(ctx, query, dry_run) 44 | 45 | 46 | def materialize_ttl_in_partition( 47 | ctx: Context, database: str, table: str, partition_id: str, dry_run: bool = False 48 | ) -> None: 49 | """ 50 | Materialize TTL for the specified table partition. 
51 | """ 52 | query = f"ALTER TABLE `{database}`.`{table}` MATERIALIZE TTL IN PARTITION ID '{partition_id}'" 53 | _execute_query(ctx, query, dry_run) 54 | 55 | 56 | def _execute_query(ctx: Context, query: str, dry_run: bool) -> None: 57 | timeout = ctx.obj["config"]["clickhouse"]["alter_table_timeout"] 58 | execute_query(ctx, query, timeout=timeout, format_=None, echo=True, dry_run=dry_run) 59 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/disk_group.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | import shutil 4 | 5 | from click import group, option 6 | 7 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 8 | from ch_tools.common import logging 9 | 10 | 11 | @group("disks", cls=Chadmin) 12 | def disks_group() -> None: 13 | """Commands to manage disks.""" 14 | pass 15 | 16 | 17 | @disks_group.command("check-s3-metadata") 18 | @option( 19 | "--path", 20 | "path", 21 | default="/var/lib/clickhouse/disks/object_storage/store", 22 | help="Path to S3 metadata.", 23 | ) 24 | @option("--cleanup", is_flag=True, help="Remove parts with corrupted S3 metadata.") 25 | def check_s3_metadata_command(path: str, cleanup: bool) -> None: 26 | check_dir(path, cleanup) 27 | 28 | 29 | def check_dir(path: str, cleanup: bool) -> None: 30 | corrupted_dirs = [] 31 | for dirpath, _, filenames in os.walk(path): 32 | for filename in filenames: 33 | if not check_file(f"{dirpath}/{filename}"): 34 | logging.info("{}/{}", dirpath, filename) 35 | if dirpath not in corrupted_dirs: 36 | corrupted_dirs.append(dirpath) 37 | if cleanup: 38 | for dirpath in corrupted_dirs: 39 | logging.info('Remove directory "{}"', dirpath) 40 | shutil.rmtree(dirpath) 41 | 42 | 43 | def check_file(filename: str) -> bool: 44 | with open(filename, mode="r", encoding="latin-1") as file: 45 | lines = file.readlines(1024) 46 | if len(lines) != 5: 47 | file.close() 48 | return False 49 | result 
= True 50 | if not re.match("[123]\n", lines[0]): # version 1-3 51 | result = False 52 | elif not re.match("1\\s+\\d+\n", lines[1]): # object count=1 & size 53 | result = False 54 | elif not re.match("\\d+\\s+\\S+\n", lines[2]): # size & object name 55 | result = False 56 | elif not re.match("\\d+\n", lines[3]): # refcount 57 | result = False 58 | elif not re.match("[01]\n?", lines[4]): # is readonly 59 | result = False 60 | 61 | return result 62 | -------------------------------------------------------------------------------- /ch_tools/common/cli/utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utility functions. 3 | """ 4 | 5 | from collections import defaultdict 6 | from datetime import datetime, timedelta 7 | from typing import Any, Optional, Tuple, Union 8 | 9 | import humanfriendly 10 | from click import Context 11 | from dateutil.tz import gettz, tzfile 12 | from deepdiff import DeepDiff 13 | 14 | 15 | def parse_timespan(value: str) -> timedelta: 16 | """ 17 | Parse time span value. 18 | """ 19 | return timedelta(seconds=humanfriendly.parse_timespan(value)) 20 | 21 | 22 | def now(ctx: Context) -> datetime: 23 | """ 24 | Like `datetime.now`, but with timezone information. 25 | """ 26 | return datetime.now(get_timezone(ctx)) 27 | 28 | 29 | def get_timezone(ctx: Context) -> tzfile: 30 | if "timezone" not in ctx.obj: 31 | config = ctx.obj["config"] 32 | ctx.obj["timezone"] = gettz(config.get("timezone", "UTC")) 33 | 34 | return ctx.obj["timezone"] 35 | 36 | 37 | def diff_objects(value1: Any, value2: Any) -> DeepDiff: 38 | """ 39 | Calculate structural diff between 2 values. 40 | """ 41 | return DeepDiff( 42 | value1, 43 | value2, 44 | verbose_level=2, 45 | view="tree", 46 | ignore_type_in_groups=[(dict, defaultdict)], 47 | ) 48 | 49 | 50 | class Nullable: 51 | """ 52 | Nullable wrapper type. It helps to distinguish the cases when a value is not specified vs. 53 | it's specified None value. 
54 | """ 55 | 56 | def __init__(self, value: Optional[Any] = None) -> None: 57 | self.value = value 58 | 59 | 60 | def flatten_nullable(value: Union[Any, Nullable]) -> Tuple[bool, Optional[Any]]: 61 | """ 62 | Flatten a Nullable wrapper into a tuple (is_specified, value). 63 | """ 64 | if value is None: 65 | return False, None 66 | 67 | if isinstance(value, Nullable): 68 | value = value.value 69 | 70 | return True, value 71 | 72 | 73 | def is_not_null(value: Union[Any, Nullable]) -> bool: 74 | """ 75 | Return True if the value is not null. 76 | """ 77 | if isinstance(value, Nullable): 78 | value = value.value 79 | 80 | return value is not None 81 | -------------------------------------------------------------------------------- /tests/features/chs3_backup_cleanup.feature: -------------------------------------------------------------------------------- 1 | Feature: Cleanup of orphaned S3 disk backups 2 | 3 | Background: 4 | Given default configuration 5 | And a working s3 6 | And a working zookeeper 7 | And a working clickhouse on clickhouse01 8 | 9 | Scenario: Cleanup of orphaned S3 disk backups 10 | Given we have executed queries on clickhouse01 11 | """ 12 | CREATE DATABASE db1; 13 | 14 | CREATE TABLE db1.table1 (n Int32) 15 | ENGINE = ReplicatedMergeTree('/tables/db1_table1', '{replica}') 16 | ORDER BY n PARTITION BY n SETTINGS storage_policy = 'hybrid_storage'; 17 | 18 | INSERT INTO db1.table1 (n) VALUES (1) (11); 19 | ALTER TABLE db1.table1 MOVE PARTITION ID '11' TO DISK 'object_storage' 20 | """ 21 | And we have executed command on clickhouse01 22 | """ 23 | ch-backup backup --name backup1_partially_deleted 24 | """ 25 | And we have executed queries on clickhouse01 26 | """ 27 | INSERT INTO db1.table1 (n) VALUES (2) (12); 28 | ALTER TABLE db1.table1 MOVE PARTITION ID '12' TO DISK 'object_storage' 29 | """ 30 | And we have executed command on clickhouse01 31 | """ 32 | ch-backup backup --name backup2 33 | """ 34 | And we have executed command on clickhouse01 
35 | """ 36 | ch-backup delete backup1_partially_deleted 37 | """ 38 | And we have executed queries on clickhouse01 39 | """ 40 | ALTER TABLE db1.table1 FREEZE WITH NAME 'backup1_partially_deleted'; 41 | ALTER TABLE db1.table1 FREEZE WITH NAME 'backup3_deleted'; 42 | """ 43 | 44 | When we execute command on clickhouse01 45 | """ 46 | ch-monitoring orphaned-backups 47 | """ 48 | Then we get response 49 | """ 50 | 1;There are 2 orphaned S3 backups: backup1_partially_deleted, backup3_deleted 51 | """ 52 | 53 | When we execute command on clickhouse01 54 | """ 55 | chadmin chs3-backup cleanup 56 | """ 57 | And we execute command on clickhouse01 58 | """ 59 | ch-monitoring orphaned-backups 60 | """ 61 | Then we get response 62 | """ 63 | 0;OK 64 | """ 65 | When we execute command on clickhouse01 66 | """ 67 | ls /var/lib/clickhouse/disks/object_storage/shadow 68 | """ 69 | Then we get response 70 | """ 71 | backup2 72 | """ 73 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/clickhouse_disks.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | from typing import Optional, Tuple 3 | 4 | import xmltodict 5 | 6 | from ch_tools.common import logging 7 | from ch_tools.common.clickhouse.config import ClickhouseConfig 8 | from ch_tools.common.utils import version_ge 9 | 10 | CLICKHOUSE_PATH = "/var/lib/clickhouse" 11 | CLICKHOUSE_STORE_PATH = CLICKHOUSE_PATH + "/store" 12 | CLICKHOUSE_DATA_PATH = CLICKHOUSE_PATH + "/data" 13 | CLICKHOUSE_METADATA_PATH = CLICKHOUSE_PATH + "/metadata" 14 | S3_PATH = CLICKHOUSE_PATH + "/disks/object_storage" 15 | S3_METADATA_STORE_PATH = S3_PATH + "/store" 16 | 17 | OBJECT_STORAGE_DISK_TYPES = ["s3", "object_storage", "ObjectStorage"] 18 | 19 | 20 | def make_ch_disks_config(disk: str) -> str: 21 | disk_config = ClickhouseConfig.load().storage_configuration.get_disk_config(disk) 22 | disk_config_path = 
f"/tmp/chadmin-ch-disks-{disk}.xml" 23 | logging.info("Create a conf for {} disk: {}", disk, disk_config_path) 24 | with open(disk_config_path, "w", encoding="utf-8") as f: 25 | xmltodict.unparse( 26 | { 27 | "clickhouse": { 28 | "storage_configuration": {"disks": {disk: disk_config}}, 29 | } 30 | }, 31 | f, 32 | pretty=True, 33 | ) 34 | return disk_config_path 35 | 36 | 37 | def remove_from_ch_disk( 38 | disk: str, 39 | path: str, 40 | ch_version: str, 41 | disk_config_path: Optional[str] = None, 42 | dry_run: bool = False, 43 | ) -> Tuple[int, bytes]: 44 | cmd = f"clickhouse-disks {'-C ' + disk_config_path if disk_config_path else ''} --disk {disk}" 45 | if version_ge(ch_version, "24.7"): 46 | cmd += f' --query "remove {path} --recursive"' 47 | else: 48 | cmd += f" remove {path}" 49 | 50 | logging.info("Run : {}", cmd) 51 | 52 | if dry_run: 53 | return (0, b"") 54 | 55 | proc = subprocess.run( 56 | cmd, 57 | shell=True, 58 | check=False, 59 | stdout=subprocess.PIPE, 60 | stderr=subprocess.PIPE, 61 | ) 62 | 63 | logging.info( 64 | "clickhouse-disks remove command has finished: retcode {}, stderr: {}", 65 | proc.returncode, 66 | proc.stderr.decode(), 67 | ) 68 | return (proc.returncode, proc.stderr) 69 | -------------------------------------------------------------------------------- /tests/images/zookeeper/config/start_zk.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Basically, a cannibalized version of ubuntu upstart script 3 | # + contents of /etc/zookeeper/conf/environment 4 | ulimit -n 8196 5 | 6 | NAME=zookeeper 7 | JMXDISABLE="yes, please" 8 | ZOOCFGDIR=/etc/$NAME/conf 9 | 10 | # TODO this is really ugly 11 | # How to find out, which jars are needed? 
12 | # seems, that log4j requires the log4j.properties file to be in the classpath 13 | CLASSPATH="$ZOOCFGDIR:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar" 14 | 15 | EXEC_OVERRIDES="-Dzookeeper.forceSync=no \ 16 | -Djute.maxbuffer=16777216 \ 17 | -Dzookeeper.snapCount=10000" 18 | 19 | ZOOCFG="$ZOOCFGDIR/zoo.cfg" 20 | ZOO_LOG_DIR=/var/log/$NAME 21 | USER=$NAME 22 | GROUP=$NAME 23 | PIDDIR=/var/run/$NAME 24 | PIDFILE=$PIDDIR/$NAME.pid 25 | JAVA=/usr/bin/java 26 | ZOOMAIN="org.apache.zookeeper.server.quorum.QuorumPeerMain" 27 | ZOO_LOG4J_PROP="INFO,ROLLINGFILE" 28 | JMXLOCALONLY="true" 29 | JAVA_OPTS="-XX:+UseG1GC -Xmx256M -XX:+PrintGCDateStamps -Xloggc:/var/log/zookeeper/gc.log -Djava.net.preferIPv6Addresses=true -Djava.net.preferIPv4Stack=false ${EXEC_OVERRIDES}" 30 | 31 | 32 | [ -r "/usr/share/java/zookeeper.jar" ] || exit 0 33 | [ -d $ZOO_LOG_DIR ] || mkdir -p $ZOO_LOG_DIR 34 | chown $USER:$GROUP $ZOO_LOG_DIR 35 | 36 | [ -r /etc/default/zookeeper ] && . /etc/default/zookeeper 37 | if [ -z "$JMXDISABLE" ]; then 38 | JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=$JMXLOCALONLY" 39 | fi 40 | # Set "myid" with a number from hostname: zookeeper01.domain -> 1 41 | myid=$(hostname -f | awk -F '.' '{printf("%s", $1)}' | tail -c 1) 42 | datadir=$(awk -F'=' '/dataDir/ {print $2}' "${ZOOCFG}") 43 | [ -z ${myid} ] && exit 1 44 | echo ${myid} > /etc/zookeeper/conf/myid 45 | rm -f ${datadir}/myid 46 | ln -s /etc/zookeeper/conf/myid ${datadir}/myid 47 | 48 | # Start process. ZK`s main class never detaches, so start-stop stays in foreground. 
49 | exec start-stop-daemon --start -c $USER --exec $JAVA --name zookeeper -- \ 50 | -cp $CLASSPATH $JAVA_OPTS -Dzookeeper.log.dir=${ZOO_LOG_DIR} \ 51 | -Dzookeeper.root.logger=${ZOO_LOG4J_PROP} $ZOOMAIN $ZOOCFG 52 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/move_group.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict 2 | from typing import Any 3 | 4 | from click import Context 5 | from cloup import group, option, pass_context 6 | 7 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 8 | from ch_tools.chadmin.internal.process import list_moves 9 | from ch_tools.common.cli.formatting import format_bytes, format_float, print_response 10 | from ch_tools.common.clickhouse.config import get_cluster_name 11 | 12 | FIELD_FORMATTERS = { 13 | "part_size": format_bytes, 14 | "elapsed": format_float, 15 | } 16 | 17 | 18 | @group("move", cls=Chadmin) 19 | def move_group() -> None: 20 | """Commands to manage moves (retrieve information from system.moves).""" 21 | pass 22 | 23 | 24 | @move_group.command("list") 25 | @option( 26 | "-d", 27 | "--database", 28 | help="Filter in moves to output by the specified database.", 29 | ) 30 | @option( 31 | "-t", 32 | "--table", 33 | help="Filter in moves to output by the specified table.", 34 | ) 35 | @option( 36 | "--cluster", 37 | "--on-cluster", 38 | "on_cluster", 39 | is_flag=True, 40 | help="Get moves from all hosts in the cluster.", 41 | ) 42 | @option( 43 | "-l", 44 | "--limit", 45 | type=int, 46 | default=1000, 47 | help="Limit the max number of objects in the output.", 48 | ) 49 | @pass_context 50 | def list_command(ctx: Context, on_cluster: bool, limit: int, **kwargs: Any) -> None: 51 | """List executing moves.""" 52 | 53 | def _table_formatter(item: Any) -> OrderedDict: 54 | return OrderedDict( 55 | ( 56 | ("database", item["database"]), 57 | ("table", item["table"]), 58 | ("elapsed", 
item["elapsed"]), 59 | ("target_disk", item["target_disk_name"]), 60 | ("target_path", item["target_disk_path"]), 61 | ("part_name", item["part_name"]), 62 | ("part_size", item["part_size"]), 63 | ) 64 | ) 65 | 66 | cluster = get_cluster_name(ctx) if on_cluster else None 67 | 68 | print_response( 69 | ctx, 70 | list_moves(ctx, cluster=cluster, limit=limit, **kwargs), 71 | default_format="table", 72 | table_formatter=_table_formatter, 73 | field_formatters=FIELD_FORMATTERS, 74 | ) 75 | -------------------------------------------------------------------------------- /tests/modules/chadmin.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | from ch_tools.common import logging 4 | 5 | 6 | class Chadmin: 7 | def __init__(self, container: Any) -> None: 8 | self._container = container 9 | 10 | def exec_cmd(self, cmd: str) -> Any: 11 | ch_admin_cmd = f"chadmin {cmd}" 12 | logging.debug("chadmin command: {}", ch_admin_cmd) 13 | 14 | result = self._container.exec_run(["bash", "-c", ch_admin_cmd], user="root") 15 | return result 16 | 17 | def create_zk_node( 18 | self, zk_node: str, no_ch_config: bool = False, recursive: bool = True 19 | ) -> Any: 20 | cmd = "zookeeper {use_config} create {make_parents} {node}".format( 21 | use_config="--no-ch-config" if no_ch_config else "", 22 | make_parents="--make-parents" if recursive else "", 23 | node=zk_node, 24 | ) 25 | return self.exec_cmd(cmd) 26 | 27 | def zk_delete(self, zk_nodes: str, no_ch_config: bool = False) -> Any: 28 | cmd = "zookeeper {use_config} delete {nodes}".format( 29 | use_config="--no-ch-config" if no_ch_config else "", 30 | nodes=zk_nodes, 31 | ) 32 | return self.exec_cmd(cmd) 33 | 34 | def zk_list(self, zk_node: str, no_ch_config: bool = False) -> Any: 35 | cmd = "zookeeper {use_config} list {node}".format( 36 | use_config="--no-ch-config" if no_ch_config else "", 37 | node=zk_node, 38 | ) 39 | return self.exec_cmd(cmd) 40 | 41 | def zk_cleanup( 
42 | self, 43 | fqdn: str, 44 | zk_root: Any = None, 45 | no_ch_config: bool = False, 46 | dry_run: bool = False, 47 | ) -> Any: 48 | cmd = "zookeeper {use_config} {root} cleanup-removed-hosts-metadata {hosts} {dry}".format( 49 | use_config="--no-ch-config" if no_ch_config else "", 50 | root=f"--chroot {zk_root}" if zk_root else "", 51 | hosts=fqdn, 52 | dry="" if not dry_run else "--dry-run", 53 | ) 54 | return self.exec_cmd(cmd) 55 | 56 | def zk_cleanup_table(self, fqdn: str, zk_table_path_: str) -> Any: 57 | cmd = "zookeeper remove-hosts-from-table {zk_table_path} {hosts}".format( 58 | zk_table_path=zk_table_path_, 59 | hosts=fqdn, 60 | ) 61 | return self.exec_cmd(cmd) 62 | -------------------------------------------------------------------------------- /tests/features/monrun_keeper.feature: -------------------------------------------------------------------------------- 1 | Feature: keeper-monitoring tool 2 | 3 | Background: 4 | Given default configuration 5 | And a working s3 6 | And a working zookeeper 7 | And a working clickhouse on clickhouse01 8 | 9 | Scenario: Check status command not throwing 10 | When we execute command on zookeeper01 11 | """ 12 | keeper-monitoring status 13 | """ 14 | 15 | Scenario: Check Zookeeper alive with keeper monitoring 16 | When we execute command on zookeeper01 17 | """ 18 | keeper-monitoring -n alive 19 | """ 20 | Then we get response 21 | """ 22 | 0;OK 23 | """ 24 | When we execute command on zookeeper01 25 | """ 26 | supervisorctl stop zookeeper 27 | """ 28 | When we execute command on zookeeper01 29 | """ 30 | keeper-monitoring -n alive 31 | """ 32 | Then we get response 33 | """ 34 | 2;KazooTimeoutError('Connection time-out') 35 | """ 36 | 37 | Scenario: Check ZooKeeper version 38 | When we execute command on zookeeper01 39 | """ 40 | keeper-monitoring version 41 | """ 42 | Then we get response contains 43 | """ 44 | 0; 45 | """ 46 | When we execute command on zookeeper01 47 | """ 48 | supervisorctl stop zookeeper 49 | 
""" 50 | When we execute command on zookeeper01 51 | """ 52 | keeper-monitoring version 53 | """ 54 | Then we get response 55 | """ 56 | 1;ConnectionRefusedError(111, 'Connection refused') 57 | """ 58 | 59 | Scenario: Check CH keeper alive with keeper monitoring 60 | Given a working keeper on clickhouse01 61 | When we execute command on clickhouse01 62 | """ 63 | keeper-monitoring -n alive 64 | """ 65 | Then we get response 66 | """ 67 | 0;OK 68 | """ 69 | When we execute command on clickhouse01 70 | """ 71 | supervisorctl stop clickhouse-server 72 | """ 73 | When we execute command on clickhouse01 74 | """ 75 | keeper-monitoring -n alive 76 | """ 77 | Then we get response 78 | """ 79 | 2;KazooTimeoutError('Connection time-out') 80 | """ 81 | # check that keeper-monitoring works fine without CH configs 82 | When we execute command on clickhouse01 83 | """ 84 | rm -fr /etc/clickhouse* 85 | keeper-monitoring -n alive 86 | """ 87 | Then we get response 88 | """ 89 | 2;KazooTimeoutError('Connection time-out') 90 | """ 91 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/replicated_fetch_group.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict 2 | from typing import Any 3 | 4 | from click import Context, group, option, pass_context 5 | 6 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 7 | from ch_tools.chadmin.internal.process import list_replicated_fetches 8 | from ch_tools.common.cli.formatting import ( 9 | format_bytes, 10 | format_float, 11 | format_percents, 12 | print_response, 13 | ) 14 | from ch_tools.common.clickhouse.config import get_cluster_name 15 | 16 | FIELD_FORMATTERS = { 17 | "total_size_bytes_compressed": format_bytes, 18 | "elapsed": format_float, 19 | "progress": format_percents, 20 | } 21 | 22 | 23 | @group("replicated-fetch", cls=Chadmin) 24 | def replicated_fetch_group() -> None: 25 | """Commands to manage 
fetches (retrieve information from system.replicated_fetches).""" 26 | pass 27 | 28 | 29 | @replicated_fetch_group.command("list") 30 | @option( 31 | "-d", "--database", help="Filter in fetches to output by the specified database." 32 | ) 33 | @option("-t", "--table", help="Filter in fetches to output by the specified table.") 34 | @option( 35 | "--cluster", 36 | "--on-cluster", 37 | "on_cluster", 38 | is_flag=True, 39 | help="Get fetches from all hosts in the cluster.", 40 | ) 41 | @option( 42 | "-l", 43 | "--limit", 44 | type=int, 45 | default=1000, 46 | help="Limit the max number of objects in the output.", 47 | ) 48 | @pass_context 49 | def list_command(ctx: Context, on_cluster: bool, limit: int, **kwargs: Any) -> None: 50 | """List executing fetches.""" 51 | 52 | def _table_formatter(fetch: Any) -> OrderedDict: 53 | return OrderedDict( 54 | ( 55 | ("database", fetch["database"]), 56 | ("table", fetch["table"]), 57 | ("result_part", fetch["result_part_name"]), 58 | ("elapsed", fetch["elapsed"]), 59 | ("progress", fetch["progress"]), 60 | ("source_replica", fetch["source_replica_hostname"]), 61 | ("total_size", fetch["total_size_bytes_compressed"]), 62 | ) 63 | ) 64 | 65 | cluster = get_cluster_name(ctx) if on_cluster else None 66 | 67 | fetches = list_replicated_fetches(ctx, cluster=cluster, limit=limit, **kwargs) 68 | 69 | print_response( 70 | ctx, 71 | fetches, 72 | default_format="table", 73 | table_formatter=_table_formatter, 74 | field_formatters=FIELD_FORMATTERS, 75 | ) 76 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/chadmin_group.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from functools import wraps 3 | from typing import Any, Optional 4 | 5 | import click 6 | import cloup 7 | 8 | from ch_tools import __version__ 9 | from ch_tools.common import logging 10 | from ch_tools.common.utils import get_full_command_name 11 | 12 | # pylint: 
disable=too-many-ancestors 13 | 14 | 15 | class Chadmin(cloup.Group): 16 | def add_command( 17 | self, 18 | cmd: click.Command, 19 | name: Optional[str] = None, 20 | section: Optional[cloup.Section] = None, 21 | fallback_to_default_section: bool = True, 22 | ) -> None: 23 | if cmd.callback is None: 24 | super().add_command( 25 | cmd, 26 | name=name, 27 | section=section, 28 | fallback_to_default_section=fallback_to_default_section, 29 | ) 30 | return 31 | 32 | cmd_callback = cmd.callback 33 | 34 | @wraps(cmd_callback) 35 | @cloup.pass_context 36 | def wrapper(ctx: Any, *a: Any, **kw: Any) -> None: 37 | logging.configure( 38 | ctx.obj["config"]["loguru"], 39 | "chadmin", 40 | {"cmd_name": get_full_command_name(ctx)}, 41 | ) 42 | 43 | logging.debug( 44 | "Command starts executing, params: {}, args: {}, version: {}", 45 | { 46 | **ctx.parent.params, 47 | **ctx.params, 48 | }, 49 | ctx.args, 50 | __version__, 51 | ) 52 | 53 | try: 54 | cmd_callback(*a, **kw) 55 | logging.debug("Command completed") 56 | except Exception: 57 | logging.exception("Command failed with error:", short_stdout=True) 58 | sys.exit(1) 59 | 60 | cmd.callback = wrapper 61 | super().add_command( 62 | cmd, 63 | name=name, 64 | section=section, 65 | fallback_to_default_section=fallback_to_default_section, 66 | ) 67 | 68 | def add_group( 69 | self, 70 | cmd: click.Group, 71 | name: Optional[str] = None, 72 | section: Optional[cloup.Section] = None, 73 | fallback_to_default_section: bool = True, 74 | ) -> None: 75 | super().add_command( 76 | cmd, 77 | name=name, 78 | section=section, 79 | fallback_to_default_section=fallback_to_default_section, 80 | ) 81 | -------------------------------------------------------------------------------- /ch_tools/common/cli/locale_resolver.py: -------------------------------------------------------------------------------- 1 | import locale 2 | import os 3 | import subprocess 4 | from typing import List, Tuple 5 | 6 | __all__ = [ 7 | "LocaleResolver", 8 | ] 9 | 10 | 11 | 
class LocaleResolver: 12 | """ 13 | Sets the locale for Click. Otherwise, it may fail with an error like 14 | 15 | ``` 16 | RuntimeError: Click discovered that you exported a UTF-8 locale 17 | but the locale system could not pick up from it because it does not exist. 18 | The exported locale is 'en_US.UTF-8' but it is not supported. 19 | ``` 20 | """ 21 | 22 | @staticmethod 23 | def resolve() -> None: 24 | lang, _ = locale.getlocale() 25 | locales, has_c, has_en_us = LocaleResolver._get_utf8_locales() 26 | 27 | langs = map(lambda loc: str.lower(loc[0]), locales) 28 | if lang is None or lang.lower() not in langs: 29 | if has_c: 30 | lang = "C" 31 | elif has_en_us: 32 | lang = "en_US" 33 | else: 34 | raise RuntimeError( 35 | f'Locale "{lang}" is not supported. ' 36 | 'We tried to use "C" and "en_US" but they\'re absent on your machine.', 37 | ) 38 | 39 | for locale_ in locales: 40 | if lang != locale_[0]: 41 | continue 42 | 43 | os.environ["LC_ALL"] = f"{lang}.{locale_[1]}" 44 | os.environ["LANG"] = f"{lang}.{locale_[1]}" 45 | 46 | @staticmethod 47 | def _get_utf8_locales() -> Tuple[List[Tuple[str, str]], bool, bool]: 48 | try: 49 | with subprocess.Popen( 50 | ["locale", "-a"], 51 | stdout=subprocess.PIPE, 52 | stderr=subprocess.PIPE, 53 | encoding="ascii", 54 | errors="replace", 55 | ) as proc: 56 | stdout, _ = proc.communicate() 57 | except OSError: 58 | stdout = "" 59 | 60 | langs = [] 61 | encodings = [] 62 | 63 | has_c = False 64 | has_en_us = False 65 | 66 | for line in stdout.splitlines(): 67 | locale_ = line.strip() 68 | if not locale_.lower().endswith(("utf-8", "utf8")): 69 | continue 70 | 71 | lang, encoding = locale_.split(".") 72 | 73 | langs.append(lang) 74 | encodings.append(encoding) 75 | 76 | has_c |= lang.lower() == "c" 77 | has_en_us |= lang.lower() == "en_us" 78 | 79 | res = list(zip(langs, encodings)) 80 | 81 | return res, has_c, has_en_us 82 | -------------------------------------------------------------------------------- 
/ch_tools/monrun_checks/ch_dist_tables.py: -------------------------------------------------------------------------------- 1 | import pathlib 2 | import time 3 | from typing import Any 4 | from urllib.parse import quote 5 | 6 | import click 7 | 8 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 9 | from ch_tools.common.result import Result 10 | 11 | 12 | @click.command("dist-tables") 13 | @click.option( 14 | "-c", "--critical", "crit", type=int, default=3600, help="Critical threshold." 15 | ) 16 | @click.option( 17 | "-w", "--warning", "warn", type=int, default=600, help="Warning threshold." 18 | ) 19 | @click.pass_context 20 | def dist_tables_command(ctx: click.Context, crit: int, warn: int) -> Result: 21 | """ 22 | Check for old chunks on Distributed tables. 23 | """ 24 | 25 | status = 0 26 | issues = [] 27 | 28 | ch_client = clickhouse_client(ctx) 29 | 30 | query = "SELECT database, name FROM system.tables WHERE engine = 'Distributed'" 31 | distributed_tables = ch_client.query_json_data(query=query, compact=False) 32 | for table in distributed_tables: 33 | tss = get_chunk_timestamps(table) 34 | if tss["broken"]: 35 | issues.append( 36 | f'{table["database"]}.{table["name"]}: {len(tss["broken"])} broken chunks' 37 | ) 38 | status = max(1, status) 39 | 40 | oldest_ts, oldest_fn = tss["root"] and tss["root"][0] or (None, None) 41 | if not oldest_ts: 42 | continue 43 | timespan = int(time.time()) - oldest_ts 44 | if timespan < warn: 45 | continue 46 | 47 | if timespan < crit: 48 | status = max(1, status) 49 | else: 50 | status = 2 51 | 52 | issues.append( 53 | f'{table["database"]}.{table["name"]}: {oldest_fn} ({int(timespan)})' 54 | ) 55 | 56 | message = ", ".join(issues) 57 | return Result(status, message or "OK") 58 | 59 | 60 | def get_chunk_timestamps(table: Any) -> Any: 61 | """ 62 | Return timestamps of files contained within dist table directory. 
63 | """ 64 | path = pathlib.Path(get_table_path(table)) 65 | 66 | patterns = { 67 | "broken": "*/broken/*", 68 | "root": "*/*", 69 | } 70 | return { 71 | subdir: sorted( 72 | [(f.stat().st_atime, f.name) for f in path.glob(pattern) if f.is_file()] 73 | ) 74 | for subdir, pattern in patterns.items() 75 | } 76 | 77 | 78 | def get_table_path(table: Any) -> str: 79 | """ 80 | Return path to table directory on file system. 81 | """ 82 | db_name = quote(table["database"], safe="") 83 | table_name = quote(table["name"], safe="") 84 | return f"/var/lib/clickhouse/data/{db_name}/{table_name}" 85 | -------------------------------------------------------------------------------- /tests/features/s3_credentials.feature: -------------------------------------------------------------------------------- 1 | Feature: ch_s3_credentials tool 2 | 3 | Background: 4 | Given default configuration 5 | And a working s3 6 | And a working zookeeper 7 | And a working clickhouse on clickhouse01 8 | And a working clickhouse on clickhouse02 9 | And a working http server 10 | 11 | Scenario Outline: chadmin s3 check work correctly 12 | When we execute command on clickhouse01 13 | """ 14 | ch-monitoring --setting cloud.metadata_service_endpoint http://http_mock01:8080 s3-credentials-config --missing 15 | """ 16 | Then we get response 17 | """ 18 | 0;OK 19 | """ 20 | When we execute command on clickhouse01 21 | """ 22 | chadmin --setting cloud.metadata_service_endpoint http://http_mock01:8080 s3-credentials-config update --endpoint=storage.com 23 | """ 24 | And we execute command on clickhouse01 25 | """ 26 | ch-monitoring --setting cloud.metadata_service_endpoint http://http_mock01:8080 s3-credentials-config --present 27 | """ 28 | Then we get response 29 | """ 30 | 0;OK 31 | """ 32 | When we execute command on clickhouse01 33 | """ 34 | cat /etc/clickhouse-server/config.d/s3_credentials.xml 35 | """ 36 | Then we get response 37 | """ 38 | 39 | 40 | 41 | 42 | storage.com 43 | 
<>X-YaCloud-SubjectToken: IAM_TOKEN> 44 | 45 | 46 | 47 | """ 48 | @require_version_24.11 49 | Examples: 50 | | header_tag_name | 51 | | access_header | 52 | 53 | @require_version_less_than_24.11 54 | Examples: 55 | | header_tag_name | 56 | | header | 57 | 58 | Scenario: Offline token update. 59 | Given installed clickhouse-tools config with version on clickhouse01 60 | When we execute command on clickhouse01 61 | """ 62 | ch-monitoring --setting cloud.metadata_service_endpoint http://http_mock01:8080 s3-credentials-config --missing 63 | """ 64 | When we execute command on clickhouse01 65 | """ 66 | supervisorctl stop clickhouse-server 67 | """ 68 | When we execute command on clickhouse01 69 | """ 70 | chadmin --setting cloud.metadata_service_endpoint http://http_mock01:8080 s3-credentials-config update --endpoint=storage.com 71 | """ 72 | And we execute command on clickhouse01 73 | """ 74 | ch-monitoring --setting cloud.metadata_service_endpoint http://http_mock01:8080 s3-credentials-config --present 75 | """ 76 | Then we get response 77 | """ 78 | 0;OK 79 | """ 80 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/clickhouse_keeper.py: -------------------------------------------------------------------------------- 1 | import os 2 | from typing import Any, Optional, Tuple 3 | 4 | from ...utils import first_value 5 | from .path import ( 6 | CLICKHOUSE_KEEPER_CONFIG_PATH, 7 | CLICKHOUSE_SERVER_PREPROCESSED_CONFIG_PATH, 8 | ) 9 | from .utils import dump_config, load_config 10 | 11 | 12 | class ClickhouseKeeperConfig: 13 | """ 14 | ClickHouse keeper server config (config.xml). 
15 | """ 16 | 17 | def __init__(self, config: Any, config_path: str) -> None: 18 | self._config = config 19 | self._config_path = config_path 20 | 21 | @property 22 | def _clickhouse(self) -> Any: 23 | return first_value(self._config) 24 | 25 | @property 26 | def _keeper_server(self) -> Any: 27 | return self._clickhouse.get("keeper_server", {}) 28 | 29 | @property 30 | def port_pair(self) -> Tuple[int, bool]: 31 | """ 32 | :returns tuple (ClickHouse port, port is secure) 33 | If both <tcp_port> and <tcp_port_secure> are present, a secure port 34 | is returned. 35 | """ 36 | secure_port = self._keeper_server.get("tcp_port_secure") 37 | if secure_port is not None: 38 | return int(secure_port), True 39 | 40 | return int(self._keeper_server.get("tcp_port", 0)), False 41 | 42 | @property 43 | def tls_cert_path(self) -> Optional[str]: 44 | return ( 45 | self._clickhouse.get("openSSL", {}) 46 | .get("server", {}) 47 | .get("certificateFile", None) 48 | ) 49 | 50 | @property 51 | def snapshots_dir(self) -> Optional[str]: 52 | return self._keeper_server.get("snapshot_storage_path") 53 | 54 | @property 55 | def storage_dir(self) -> Optional[str]: 56 | return self._keeper_server.get("storage_path") 57 | 58 | @property 59 | def separated(self) -> bool: 60 | """ 61 | Return True if ClickHouse Keeper is configured to run in a separate process. 
62 | """ 63 | return self._config_path == CLICKHOUSE_KEEPER_CONFIG_PATH 64 | 65 | def dump(self, mask_secrets: bool = True) -> str: 66 | return dump_config(self._config, mask_secrets=mask_secrets) 67 | 68 | def dump_xml(self, mask_secrets: bool = True) -> str: 69 | return dump_config(self._config, mask_secrets=mask_secrets, xml_format=True) 70 | 71 | @staticmethod 72 | def load() -> "ClickhouseKeeperConfig": 73 | if os.path.exists(CLICKHOUSE_KEEPER_CONFIG_PATH): 74 | config_path = CLICKHOUSE_KEEPER_CONFIG_PATH 75 | else: 76 | config_path = CLICKHOUSE_SERVER_PREPROCESSED_CONFIG_PATH 77 | 78 | config = load_config(config_path) 79 | return ClickhouseKeeperConfig(config, config_path) 80 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/merge_group.py: -------------------------------------------------------------------------------- 1 | from collections import OrderedDict 2 | from typing import Any 3 | 4 | from click import Context, group, option, pass_context 5 | 6 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 7 | from ch_tools.chadmin.internal.process import list_merges 8 | from ch_tools.common.cli.formatting import ( 9 | format_bytes, 10 | format_float, 11 | format_percents, 12 | print_response, 13 | ) 14 | from ch_tools.common.clickhouse.config import get_cluster_name 15 | 16 | FIELD_FORMATTERS = { 17 | "total_size_bytes_compressed": format_bytes, 18 | "bytes_read_uncompressed": format_bytes, 19 | "bytes_written_uncompressed": format_bytes, 20 | "memory_usage": format_bytes, 21 | "elapsed": format_float, 22 | "progress": format_percents, 23 | } 24 | 25 | 26 | @group("merge", cls=Chadmin) 27 | def merge_group() -> None: 28 | """Commands to manage merges (retrieve information from system.merges).""" 29 | pass 30 | 31 | 32 | @merge_group.command("list") 33 | @option( 34 | "-d", "--database", help="Filter in merges to output by the specified database." 
35 | ) 36 | @option("-t", "--table", help="Filter in merges to output by the specified table.") 37 | @option("--mutation", "is_mutation", is_flag=True) 38 | @option( 39 | "--cluster", 40 | "--on-cluster", 41 | "on_cluster", 42 | is_flag=True, 43 | help="Get merges from all hosts in the cluster.", 44 | ) 45 | @option( 46 | "-l", 47 | "--limit", 48 | type=int, 49 | default=1000, 50 | help="Limit the max number of objects in the output.", 51 | ) 52 | @pass_context 53 | def list_command(ctx: Context, on_cluster: bool, limit: int, **kwargs: Any) -> None: 54 | """List executing merges.""" 55 | 56 | def _table_formatter(merge: Any) -> OrderedDict: 57 | if merge["is_mutation"]: 58 | merge_type = "mutation" 59 | else: 60 | merge_type = f"{merge['merge_type']} {merge['merge_algorithm']} merge" 61 | return OrderedDict( 62 | ( 63 | ("database", merge["database"]), 64 | ("table", merge["table"]), 65 | ("result_part", merge["result_part_name"]), 66 | ("source_parts", "\n".join(merge["source_part_names"])), 67 | ("type", merge_type), 68 | ("elapsed", merge["elapsed"]), 69 | ("progress", merge["progress"]), 70 | ("total_size", merge["total_size_bytes_compressed"]), 71 | ("memory_usage", merge["memory_usage"]), 72 | ) 73 | ) 74 | 75 | cluster = get_cluster_name(ctx) if on_cluster else None 76 | 77 | merges = list_merges(ctx, cluster=cluster, limit=limit, **kwargs) 78 | 79 | print_response( 80 | ctx, 81 | merges, 82 | default_format="table", 83 | table_formatter=_table_formatter, 84 | field_formatters=FIELD_FORMATTERS, 85 | ) 86 | -------------------------------------------------------------------------------- /tests/steps/zookeeper.py: -------------------------------------------------------------------------------- 1 | """ 2 | Steps for interacting with ZooKeeper or Clickhouse Keeper. 
3 | """ 4 | 5 | import os 6 | from typing import Any 7 | 8 | from behave import given, then 9 | from click import Context 10 | from kazoo.client import KazooClient 11 | from modules.docker import get_container, get_exposed_port 12 | from tenacity import retry, stop_after_attempt, wait_fixed 13 | 14 | from ch_tools.common import logging 15 | 16 | 17 | @given("a working zookeeper") 18 | @retry(wait=wait_fixed(0.5), stop=stop_after_attempt(40)) 19 | def step_wait_for_zookeeper_alive(context: Context) -> None: 20 | """ 21 | Ensure that ZK is ready to accept incoming requests. 22 | """ 23 | client = _zk_client(context) 24 | try: 25 | client.start() 26 | finally: 27 | client.stop() 28 | 29 | 30 | @given("a working keeper on {node:w}") 31 | @retry(wait=wait_fixed(0.5), stop=stop_after_attempt(20)) 32 | def step_wait_for_keeper_alive(context: Context, node: Any) -> None: 33 | """ 34 | Wait until clickhouse keeper is ready to accept incoming requests. 35 | """ 36 | client = _zk_client(context, instance_name=node, port=2281, use_ssl=True) 37 | try: 38 | client.start() 39 | client.get("/") 40 | except Exception: 41 | client.stop() 42 | raise 43 | finally: 44 | client.stop() 45 | 46 | 47 | @given("we have removed ZK metadata for {node:w}") 48 | def clean_zk_tables_metadata_for_host(context: Context, node: Any) -> None: 49 | """ 50 | Remove all metadata for specified host from ZK 51 | """ 52 | 53 | def recursive_remove_node_data(zk_client: Any, path: Any, node: Any) -> None: 54 | for subpath in zk_client.get_children(path): 55 | if subpath == node: 56 | zk_client.delete(os.path.join(path, subpath), recursive=True) 57 | else: 58 | recursive_remove_node_data(zk_client, os.path.join(path, subpath), node) 59 | 60 | client = _zk_client(context) 61 | 62 | try: 63 | client.start() 64 | recursive_remove_node_data(client, "/", node) 65 | finally: 66 | client.stop() 67 | 68 | 69 | @then('we get zookeeper node with "{path}" path') 70 | def step_get_zk_node(context: Context, path: Any) -> 
None: 71 | client = _zk_client(context) 72 | 73 | try: 74 | client.start() 75 | result = client.get(path)[0].decode().strip() 76 | finally: 77 | client.stop() 78 | 79 | print(result) 80 | 81 | 82 | def _zk_client( 83 | context: Context, 84 | instance_name: str = "zookeeper01", 85 | port: int = 2181, 86 | use_ssl: bool = False, 87 | ) -> Any: 88 | logging.set_module_log_level("kazoo", logging.CRITICAL) 89 | 90 | zk_container = get_container(context, instance_name) 91 | host, port = get_exposed_port(zk_container, port) 92 | 93 | return KazooClient(f"{host}:{port}", use_ssl=use_ssl, verify_certs=False) 94 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/s3_credentials_config_group.py: -------------------------------------------------------------------------------- 1 | import json 2 | import random 3 | import time 4 | from typing import Any 5 | from xml.dom import minidom 6 | 7 | import requests 8 | from click import Context, group, option, pass_context 9 | 10 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 11 | from ch_tools.chadmin.internal.system import match_ch_version 12 | from ch_tools.common.clickhouse.config.path import CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH 13 | 14 | 15 | @group("s3-credentials-config", cls=Chadmin) 16 | def s3_credentials_config_group() -> None: 17 | """ 18 | Commands to manage S3 credentials config. 
19 | """ 20 | pass 21 | 22 | 23 | @s3_credentials_config_group.command("update") 24 | @option( 25 | "-e", 26 | "--endpoint", 27 | "s3_endpoint", 28 | type=str, 29 | required=True, 30 | help="S3 endpoint.", 31 | ) 32 | @option( 33 | "-s", 34 | "--random-sleep", 35 | "random_sleep", 36 | is_flag=True, 37 | default=False, 38 | help="Perform random sleep before updating S3 credentials config.", 39 | ) 40 | @pass_context 41 | def update_s3_credentials(ctx: Context, s3_endpoint: str, random_sleep: bool) -> None: 42 | """Update S3 credentials config.""" 43 | if random_sleep: 44 | time.sleep(random.randint(0, 30)) 45 | 46 | doc = minidom.Document() 47 | storage = _add_xml_node( 48 | doc, 49 | _add_xml_node(doc, _add_xml_node(doc, doc, "clickhouse"), "s3"), 50 | "cloud_storage", 51 | ) 52 | endpoint_header = ( 53 | "access_header" if match_ch_version(ctx, min_version="24.11") else "header" 54 | ) 55 | _add_xml_node(doc, storage, "endpoint").appendChild(doc.createTextNode(s3_endpoint)) 56 | _add_xml_node(doc, storage, endpoint_header).appendChild( 57 | doc.createTextNode(f"X-YaCloud-SubjectToken: {_get_token(ctx)}") 58 | ) 59 | 60 | with open(CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH, "wb") as file: 61 | file.write(doc.toprettyxml(indent=4 * " ", encoding="utf-8")) 62 | 63 | 64 | def _add_xml_node(document: minidom.Document, root: Any, name: str) -> minidom.Element: 65 | node = document.createElement(name) 66 | root.appendChild(node) 67 | return node 68 | 69 | 70 | def _get_token(ctx: Context) -> str: 71 | response = _request_token(ctx) 72 | if response.status_code != 200: 73 | raise RuntimeError(f"Can't get token. Response {response.status_code}") 74 | data = json.loads(response.content) 75 | if data["token_type"] != "Bearer": 76 | raise RuntimeError(f"Can't get token. 
Invalid Token type {data['token_type']}") 77 | return data["access_token"] 78 | 79 | 80 | def _request_token(ctx: Context) -> requests.Response: 81 | endpoint = ctx.obj["config"]["cloud"]["metadata_service_endpoint"] 82 | return requests.get( 83 | f"{endpoint}/computeMetadata/v1/instance/service-accounts/default/token", 84 | headers={"Metadata-Flavor": "Google"}, 85 | timeout=60, 86 | ) 87 | -------------------------------------------------------------------------------- /tests/steps/s3.py: -------------------------------------------------------------------------------- 1 | """ 2 | Steps for interacting with S3. 3 | """ 4 | 5 | from behave import given, then, when 6 | from hamcrest import assert_that, equal_to, greater_than 7 | from modules import minio, s3 8 | from modules.steps import get_step_data 9 | from modules.typing import ContextT 10 | 11 | 12 | @given("a working S3") 13 | def step_wait_for_s3_alive(context: ContextT) -> None: 14 | """ 15 | Ensure that S3 is ready to accept incoming requests. 
16 | """ 17 | minio.initialize(context) 18 | 19 | 20 | @then("S3 contains {count:d} objects") 21 | def step_s3_contains_files(context: ContextT, count: int) -> None: 22 | s3_client = s3.S3Client(context) 23 | objects = s3_client.list_objects("") 24 | assert_that( 25 | len(objects), 26 | equal_to(count), 27 | f"Objects count = {len(objects)}, expected {count}, objects {objects}", 28 | ) 29 | 30 | 31 | @then("S3 contains greater than {count:d} objects") 32 | def step_s3_contains_greater_than_files(context: ContextT, count: int) -> None: 33 | s3_client = s3.S3Client(context) 34 | objects = s3_client.list_objects("") 35 | assert_that( 36 | len(objects), 37 | greater_than(count), 38 | f"Objects count = {len(objects)}, expected greater than {count}, objects {objects}", 39 | ) 40 | 41 | 42 | @then("S3 bucket {bucket} contains {count:d} objects") 43 | def step_cloud_storage_bucket_contains_files( 44 | context: ContextT, bucket: str, count: int 45 | ) -> None: 46 | s3_client = s3.S3Client(context, bucket) 47 | objects = s3_client.list_objects("") 48 | assert_that( 49 | len(objects), 50 | equal_to(count), 51 | f"Objects count = {len(objects)}, expected {count}, objects {objects}", 52 | ) 53 | 54 | 55 | @when("we put object in S3") 56 | def step_put_file_in_s3(context: ContextT) -> None: 57 | conf = get_step_data(context) 58 | s3_client = s3.S3Client(context, conf["bucket"]) 59 | s3_client.upload_data(conf["data"], conf["path"]) 60 | assert s3_client.path_exists(conf["path"]) 61 | 62 | 63 | @when("we put {count:d} objects in S3") 64 | def step_put_file_count_in_s3(context: ContextT, count: int) -> None: 65 | conf = get_step_data(context) 66 | s3_client = s3.S3Client(context, conf["bucket"]) 67 | for i in range(count): 68 | path = f"{conf['path'].format(i)}" 69 | s3_client.upload_data(conf["data"], path) 70 | assert s3_client.path_exists(path) 71 | 72 | 73 | @when("we delete object in S3") 74 | def stop_delete_file_in_S3(context: ContextT) -> None: 75 | conf = 
get_step_data(context) 76 | s3_client = s3.S3Client(context, conf["bucket"]) 77 | s3_client.delete_data(conf["path"]) 78 | assert not s3_client.path_exists(conf["path"]) 79 | 80 | 81 | @then("Path does not exist in S3") 82 | def step_create_file_in_s3(context: ContextT) -> None: 83 | conf = get_step_data(context) 84 | s3_client = s3.S3Client(context, conf["bucket"]) 85 | assert not s3_client.path_exists(conf["path"]) 86 | -------------------------------------------------------------------------------- /ch_tools/chadmin/internal/database.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Optional, Union 2 | 3 | from click import Context 4 | 5 | from ch_tools.chadmin.internal.utils import execute_query 6 | 7 | 8 | def list_databases( 9 | ctx: Context, 10 | database: Optional[str] = None, 11 | exclude_database: Optional[str] = None, 12 | engine_pattern: Optional[str] = None, 13 | exclude_engine_pattern: Optional[str] = None, 14 | active_parts: Optional[bool] = None, 15 | format_: str = "JSON", 16 | ) -> Union[Any, dict]: 17 | query = """ 18 | SELECT 19 | database, 20 | engine, 21 | tables, 22 | formatReadableSize(bytes_on_disk) "disk_size", 23 | partitions, 24 | parts, 25 | rows 26 | FROM ( 27 | SELECT 28 | name "database", 29 | engine 30 | FROM system.databases 31 | ) q1 32 | ALL LEFT JOIN ( 33 | SELECT 34 | database, 35 | count() "tables", 36 | sum(bytes_on_disk) "bytes_on_disk", 37 | sum(partitions) "partitions", 38 | sum(parts) "parts", 39 | sum(rows) "rows" 40 | FROM ( 41 | SELECT 42 | database, 43 | name "table" 44 | FROM system.tables 45 | ) q2_1 46 | ALL LEFT JOIN ( 47 | SELECT 48 | database, 49 | table, 50 | uniq(partition) "partitions", 51 | count() "parts", 52 | sum(rows) "rows", 53 | sum(bytes_on_disk) "bytes_on_disk" 54 | FROM system.parts 55 | {% if active_parts %} 56 | WHERE active 57 | {% endif %} 58 | GROUP BY database, table 59 | ) q2_2 60 | USING database, table 61 | GROUP BY database 62 
| ) q2 63 | USING database 64 | {% if database %} 65 | WHERE database {{ format_str_match(database) }} 66 | {% else %} 67 | WHERE database NOT IN ('system', 'information_schema', 'INFORMATION_SCHEMA') 68 | {% endif %} 69 | {% if exclude_database %} 70 | AND database != '{{ exclude_database }}' 71 | {% endif %} 72 | {% if engine_pattern %} 73 | AND engine {{ format_str_match(engine_pattern) }} 74 | {% endif %} 75 | {% if exclude_engine_pattern %} 76 | AND engine NOT {{ format_str_match(exclude_engine_pattern) }} 77 | {% endif %} 78 | ORDER BY database 79 | """ 80 | res = execute_query( 81 | ctx, 82 | query, 83 | database=database, 84 | exclude_database=exclude_database, 85 | engine_pattern=engine_pattern, 86 | exclude_engine_pattern=exclude_engine_pattern, 87 | active_parts=active_parts, 88 | format_=format_, 89 | ) 90 | return res["data"] if format_ == "JSON" else res 91 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/utils.py: -------------------------------------------------------------------------------- 1 | import os.path 2 | from copy import deepcopy 3 | from typing import Any, MutableMapping 4 | 5 | import xmltodict 6 | import yaml 7 | 8 | from ch_tools.common.utils import first_value 9 | 10 | 11 | def load_config(config_path: str, configd_dir: str = "config.d") -> Any: 12 | """ 13 | Load ClickHouse config file. 14 | """ 15 | # Load main config file. 16 | config = _load_config(config_path) 17 | 18 | # Load config files from config.d/ directory. 19 | configd_path = os.path.join(os.path.dirname(config_path), configd_dir) 20 | if os.path.exists(configd_path): 21 | for file in os.listdir(configd_path): 22 | file_path = os.path.join(configd_path, file) 23 | _merge_configs(config, _load_config(file_path)) 24 | 25 | # Process includes. 
26 | root_section = first_value(config) 27 | include_file = root_section.get("include_from") 28 | if include_file: 29 | include_config = first_value(_load_config(include_file)) 30 | _apply_config_directives(root_section, include_config) 31 | 32 | return config 33 | 34 | 35 | def dump_config( 36 | config: Any, *, mask_secrets: bool = True, xml_format: bool = False 37 | ) -> Any: 38 | """ 39 | Dump ClickHouse config. 40 | """ 41 | result = deepcopy(config) 42 | 43 | if mask_secrets: 44 | _mask_secrets(result) 45 | 46 | if xml_format: 47 | result = xmltodict.unparse(result, pretty=True) 48 | 49 | return result 50 | 51 | 52 | def _load_config(config_path: str) -> Any: 53 | with open(config_path, "r", encoding="utf-8") as file: 54 | if config_path.endswith(".xml"): 55 | return xmltodict.parse(file.read(), disable_entities=False) 56 | return {"clickhouse": yaml.safe_load(file)} 57 | 58 | 59 | def _merge_configs(main_config: Any, additional_config: Any) -> None: 60 | for key, value in additional_config.items(): 61 | if key not in main_config: 62 | main_config[key] = value 63 | continue 64 | 65 | if isinstance(main_config[key], dict) and isinstance(value, dict): 66 | _merge_configs(main_config[key], value) 67 | continue 68 | 69 | if value is not None: 70 | main_config[key] = value 71 | 72 | 73 | def _apply_config_directives(config_section: dict, include_config: dict) -> None: 74 | for key, item in config_section.items(): 75 | if not isinstance(item, dict): 76 | continue 77 | 78 | include = item.get("@incl") 79 | if include: 80 | config_section[key] = include_config[include] 81 | continue 82 | 83 | _apply_config_directives(item, include_config) 84 | 85 | 86 | def _mask_secrets(config: Any) -> None: 87 | if isinstance(config, MutableMapping): 88 | for key, value in list(config.items()): 89 | if isinstance(value, MutableMapping): 90 | _mask_secrets(config[key]) 91 | elif key in ("password", "secret_access_key", "header", "identity"): 92 | config[key] = "*****" 93 | 
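The merge and masking rules in `utils.py` above are easy to misread, so here is a minimal standalone sketch of them. It restates the logic of `_merge_configs` and `_mask_secrets` under the hypothetical public names `merge_configs` and `mask_secrets`; the sample `main` and `override` dictionaries are invented for illustration and are not taken from any real ClickHouse config.

```python
from collections.abc import MutableMapping
from copy import deepcopy
from typing import Any

# Same key list as in _mask_secrets above.
SECRET_KEYS = ("password", "secret_access_key", "header", "identity")


def merge_configs(main_config: dict, additional_config: dict) -> None:
    """Recursively merge additional_config into main_config, in place."""
    for key, value in additional_config.items():
        if key not in main_config:
            # New keys are added as-is.
            main_config[key] = value
        elif isinstance(main_config[key], dict) and isinstance(value, dict):
            # Two sections with the same name are merged key by key.
            merge_configs(main_config[key], value)
        elif value is not None:
            # Non-empty values override; None (an empty element) is ignored.
            main_config[key] = value


def mask_secrets(config: Any) -> None:
    """Replace values of known secret keys with a placeholder, in place."""
    if isinstance(config, MutableMapping):
        for key, value in list(config.items()):
            if isinstance(value, MutableMapping):
                mask_secrets(value)
            elif key in SECRET_KEYS:
                config[key] = "*****"


# Hypothetical sample data, for illustration only.
main = {"clickhouse": {"tcp_port": "9000", "users": {"default": {"password": "qwerty"}}}}
override = {"clickhouse": {"tcp_port": "9440", "logger": {"level": "debug"}, "users": None}}

merged = deepcopy(main)
merge_configs(merged, override)

masked = deepcopy(merged)
mask_secrets(masked)
```

In this sketch the override replaces `tcp_port`, adds `logger`, and leaves `users` untouched because its override value is `None`. Masking only descends into mappings, so a secret stored inside a list would not be visited by this traversal.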
-------------------------------------------------------------------------------- /ch_tools/common/clickhouse/client/query_output_format.py: -------------------------------------------------------------------------------- 1 | """ 2 | Query output format enumeration. 3 | https://clickhouse.com/docs/en/interfaces/formats 4 | """ 5 | 6 | from ch_tools.common.type import StrEnum 7 | 8 | # pylint: disable=invalid-name 9 | 10 | 11 | class OutputFormat(StrEnum): 12 | Default = "PrettyCompact" 13 | TabSeparated = "TabSeparated" 14 | TabSeparatedRaw = "TabSeparatedRaw" 15 | TabSeparatedWithNames = "TabSeparatedWithNames" 16 | TabSeparatedWithNamesAndTypes = "TabSeparatedWithNamesAndTypes" 17 | TabSeparatedRawWithNames = "TabSeparatedRawWithNames" 18 | TabSeparatedRawWithNamesAndTypes = "TabSeparatedRawWithNamesAndTypes" 19 | CSV = "CSV" 20 | CSVWithNames = "CSVWithNames" 21 | CSVWithNamesAndTypes = "CSVWithNamesAndTypes" 22 | SQLInsert = "SQLInsert" 23 | Values = "Values" 24 | Vertical = "Vertical" 25 | JSON = "JSON" 26 | JSONStrings = "JSONStrings" 27 | JSONColumns = "JSONColumns" 28 | JSONColumnsWithMetadata = "JSONColumnsWithMetadata" 29 | JSONCompact = "JSONCompact" 30 | JSONCompactStrings = "JSONCompactStrings" 31 | JSONCompactColumns = "JSONCompactColumns" 32 | JSONEachRow = "JSONEachRow" 33 | PrettyJSONEachRow = "PrettyJSONEachRow" 34 | JSONEachRowWithProgress = "JSONEachRowWithProgress" 35 | JSONStringsEachRow = "JSONStringsEachRow" 36 | JSONStringsEachRowWithProgress = "JSONStringsEachRowWithProgress" 37 | JSONCompactEachRow = "JSONCompactEachRow" 38 | JSONCompactEachRowWithNames = "JSONCompactEachRowWithNames" 39 | JSONCompactEachRowWithNamesAndTypes = "JSONCompactEachRowWithNamesAndTypes" 40 | JSONCompactStringsEachRow = "JSONCompactStringsEachRow" 41 | JSONCompactStringsEachRowWithNames = "JSONCompactStringsEachRowWithNames" 42 | JSONCompactStringsEachRowWithNamesAndTypes = ( 43 | "JSONCompactStringsEachRowWithNamesAndTypes" 44 | ) 45 | JSONObjectEachRow = 
"JSONObjectEachRow" 46 | BSONEachRow = "BSONEachRow" 47 | TSKV = "TSKV" 48 | Pretty = "Pretty" 49 | PrettyNoEscapes = "PrettyNoEscapes" 50 | PrettyMonoBlock = "PrettyMonoBlock" 51 | PrettyNoEscapesMonoBlock = "PrettyNoEscapesMonoBlock" 52 | PrettyCompact = "PrettyCompact" 53 | PrettyCompactNoEscapes = "PrettyCompactNoEscapes" 54 | PrettyCompactMonoBlock = "PrettyCompactMonoBlock" 55 | PrettyCompactNoEscapesMonoBlock = "PrettyCompactNoEscapesMonoBlock" 56 | PrettySpace = "PrettySpace" 57 | PrettySpaceNoEscapes = "PrettySpaceNoEscapes" 58 | PrettySpaceMonoBlock = "PrettySpaceMonoBlock" 59 | PrettySpaceNoEscapesMonoBlock = "PrettySpaceNoEscapesMonoBlock" 60 | Prometheus = "Prometheus" 61 | Protobuf = "Protobuf" 62 | ProtobufSingle = "ProtobufSingle" 63 | Avro = "Avro" 64 | Parquet = "Parquet" 65 | Arrow = "Arrow" 66 | ORC = "ORC" 67 | RowBinary = "RowBinary" 68 | RowBinaryWithNames = "RowBinaryWithNames" 69 | RowBinaryWithNamesAndTypes = "RowBinaryWithNamesAndTypes" 70 | Native = "Native" 71 | Null = "Null" 72 | XML = "XML" 73 | CapnProto = "CapnProto" 74 | LineAsString = "LineAsString" 75 | RawBLOB = "RawBLOB" 76 | MsgPack = "MsgPack" 77 | Markdown = "Markdown" 78 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_system_queues.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | 3 | from cloup import command, option, pass_context 4 | 5 | from ch_tools.common.clickhouse.client.clickhouse_client import clickhouse_client 6 | from ch_tools.common.result import CRIT, OK, WARNING, Result 7 | 8 | 9 | @command("system-queues") 10 | @option("--merges-in-queue-warn", "merges_in_queue_warn", type=int) 11 | @option("--merges-in-queue-crit", "merges_in_queue_crit", type=int) 12 | @option("--future-parts-warn", "future_parts_warn", type=int) 13 | @option("--future-parts-crit", "future_parts_crit", type=int) 14 | @option("--parts-to-check-warn", 
"parts_to_check_warn", type=int) 15 | @option("--parts-to-check-crit", "parts_to_check_crit", type=int) 16 | @option("--queue-size-warn", "queue_size_warn", type=int) 17 | @option("--queue-size-crit", "queue_size_crit", type=int) 18 | @option("--inserts-in-queue-warn", "inserts_in_queue_warn", type=int) 19 | @option("--inserts-in-queue-crit", "inserts_in_queue_crit", type=int) 20 | @pass_context 21 | def system_queues_command( 22 | ctx: Any, 23 | merges_in_queue_warn: int, 24 | merges_in_queue_crit: int, 25 | future_parts_warn: int, 26 | future_parts_crit: int, 27 | parts_to_check_warn: int, 28 | parts_to_check_crit: int, 29 | queue_size_warn: int, 30 | queue_size_crit: int, 31 | inserts_in_queue_warn: int, 32 | inserts_in_queue_crit: int, 33 | ) -> Result: 34 | """ 35 | Check system queues. 36 | """ 37 | thresholds = [ 38 | ("merges_in_queue", merges_in_queue_warn, merges_in_queue_crit), 39 | ("future_parts", future_parts_warn, future_parts_crit), 40 | ("parts_to_check", parts_to_check_warn, parts_to_check_crit), 41 | ("queue_size", queue_size_warn, queue_size_crit), 42 | ("inserts_in_queue", inserts_in_queue_warn, inserts_in_queue_crit), 43 | ] 44 | 45 | issues = [] 46 | for item in _get_metrics(ctx): 47 | table_full_name = f"{item['database']}.{item['table']}" 48 | for parameter, warn, crit in thresholds: 49 | value = item[parameter] 50 | if value > crit: 51 | issues.append( 52 | ( 53 | CRIT, 54 | f"{table_full_name}: {parameter} {value} > {crit} (crit);", 55 | ) 56 | ) 57 | elif value > warn: 58 | issues.append( 59 | ( 60 | WARNING, 61 | f"{table_full_name}: {parameter} {value} > {warn} (warn);", 62 | ) 63 | ) 64 | 65 | if issues: 66 | issues.sort(reverse=True, key=lambda x: x[0]) 67 | status = issues[0][0] 68 | message = " ".join(x[1] for x in issues) 69 | return Result(status, message) 70 | 71 | return Result(OK) 72 | 73 | 74 | def _get_metrics(ctx: Any) -> list[dict]: 75 | """ 76 | Select and return metrics from system.replicas.
77 | """ 78 | query = ( 79 | "SELECT database, table, future_parts, parts_to_check, queue_size," 80 | " inserts_in_queue, merges_in_queue FROM system.replicas" 81 | ) 82 | return clickhouse_client(ctx).query_json_data(query=query, compact=False) 83 | -------------------------------------------------------------------------------- /ch_tools/common/dbaas.py: -------------------------------------------------------------------------------- 1 | import json 2 | from typing import Any, Dict, List 3 | 4 | 5 | class DbaasConfig: 6 | def __init__(self, config: Dict[str, Any]) -> None: 7 | self._config = config 8 | 9 | @property 10 | def vtype(self) -> Any: 11 | return self._config["vtype"] 12 | 13 | @property 14 | def cloud_id(self) -> Any: 15 | return self._config["cloud"]["cloud_ext_id"] 16 | 17 | @property 18 | def folder_id(self) -> Any: 19 | return self._config["folder"]["folder_ext_id"] 20 | 21 | @property 22 | def cluster_id(self) -> Any: 23 | return self._config["cluster_id"] 24 | 25 | @property 26 | def cluster_name(self) -> Any: 27 | return self._config["cluster_name"] 28 | 29 | @property 30 | def created_at(self) -> Any: 31 | return self._config["created_at"] 32 | 33 | @property 34 | def shard_count(self) -> int: 35 | subcluster = self._clickhouse_subcluster() 36 | return len(subcluster["shards"]) 37 | 38 | @property 39 | def host_count(self) -> int: 40 | return len(self._config["cluster_hosts"]) 41 | 42 | @property 43 | def clickhouse_host_count(self) -> int: 44 | subcluster = self._clickhouse_subcluster() 45 | count = 0 46 | for shard in subcluster["shards"].values(): 47 | count += len(shard["hosts"]) 48 | return count 49 | 50 | @property 51 | def shard_hosts(self) -> Any: 52 | return self._config["shard_hosts"] 53 | 54 | @property 55 | def replicas(self) -> List[Any]: 56 | return [host for host in self.shard_hosts if host != self.fqdn] 57 | 58 | @property 59 | def fqdn(self) -> Any: 60 | return self._config["fqdn"] 61 | 62 | @property 63 | def disk_type(self) 
-> Any: 64 | return self._config["disk_type_id"] 65 | 66 | @property 67 | def disk_size(self) -> Any: 68 | return self._config["space_limit"] 69 | 70 | @property 71 | def flavor(self) -> Any: 72 | return self._config["flavor"]["name"] 73 | 74 | @property 75 | def cpu_fraction(self) -> Any: 76 | return self._config["flavor"]["cpu_fraction"] 77 | 78 | @property 79 | def cpu_limit(self) -> Any: 80 | return self._config["flavor"]["cpu_limit"] 81 | 82 | @property 83 | def cpu_guarantee(self) -> Any: 84 | return self._config["flavor"]["cpu_guarantee"] 85 | 86 | @property 87 | def memory_limit(self) -> Any: 88 | return self._config["flavor"]["memory_limit"] 89 | 90 | @property 91 | def memory_guarantee(self) -> Any: 92 | return self._config["flavor"]["memory_guarantee"] 93 | 94 | def _clickhouse_subcluster(self) -> Dict[str, Any]: 95 | for subcluster in self._config["cluster"]["subclusters"].values(): 96 | if "clickhouse_cluster" in subcluster["roles"]: 97 | return subcluster 98 | raise RuntimeError("Unreachable") 99 | 100 | @staticmethod 101 | def load() -> "DbaasConfig": 102 | with open("/etc/dbaas.conf", "r", encoding="utf-8") as file: 103 | return DbaasConfig(json.load(file)) 104 | -------------------------------------------------------------------------------- /tests/steps/failure_mockers.py: -------------------------------------------------------------------------------- 1 | """ 2 | Steps for interacting with ClickHouse DBMS. 
3 | """ 4 | 5 | import os 6 | 7 | from behave import when 8 | from hamcrest import assert_that, equal_to 9 | from modules import s3 10 | from modules.clickhouse import execute_query 11 | from modules.docker import get_container 12 | from modules.steps import get_step_data 13 | from modules.typing import ContextT 14 | 15 | 16 | @when("we remove key from s3 for partitions database {database} on {node:w}") 17 | def step_remove_keys_from_s3_for_partition( 18 | context: ContextT, database: str, node: str 19 | ) -> None: 20 | data = get_step_data(context) 21 | keys_to_remove = [] 22 | 23 | # Get the list of keys in s3 to break the specified partitions. 24 | for database, database_info in data.items(): 25 | for table, table_info in database_info.items(): 26 | for partition in table_info: 27 | get_parts_info_query = f"SELECT name, partition, path FROM system.parts where database='{database}' and table='{table}' and partition='{partition}'" 28 | # Get local path on disk to the single data part. 29 | part_local_path = execute_query( 30 | context, node, get_parts_info_query, format_="JSONCompact" 31 | )["data"][0][2] 32 | # For this part, get the single object key in s3.
33 | get_object_key_query = f"SELECT concat(path, local_path) AS full_path, remote_path from system.remote_data_paths WHERE disk_name='object_storage' and startsWith(full_path, '{os.path.join(part_local_path, 'columns.txt')}')" 34 | data_object_key = execute_query( 35 | context, node, get_object_key_query, format_="JSONCompact" 36 | )["data"][0][1] 37 | keys_to_remove.append(data_object_key) 38 | 39 | s3_client = s3.S3Client(context) 40 | for key in keys_to_remove: 41 | s3_client.delete_data(key) 42 | 43 | 44 | @when("we move parts as broken_on_start for table {database}.{table} on {node:w}") 45 | def step_mark_parts_as_broken_on_start( 46 | context: ContextT, database: str, table: str, node: str 47 | ) -> None: 48 | part_list_query = f"SELECT name FROM system.parts WHERE database='{database}' and table='{table}' and active" 49 | 50 | for resp in execute_query(context, node, part_list_query, format_="JSONCompact")[ 51 | "data" 52 | ]: 53 | detach_part = f"ALTER TABLE {database}.{table} DETACH PART '{resp[0]}'" 54 | execute_query(context, node, detach_part) 55 | 56 | broken_prefix = "broken-on-start_" 57 | detached_part_list_query = f"SELECT path FROM system.detached_parts WHERE database='{database}' and table='{table}'" 58 | container = get_container(context, node) 59 | 60 | for resp in execute_query( 61 | context, node, detached_part_list_query, format_="JSONCompact" 62 | )["data"]: 63 | path = resp[0] 64 | broken_path = path.split("/") 65 | broken_path[-1] = broken_prefix + broken_path[-1] 66 | broken_path = os.path.join("/", *broken_path) 67 | result = container.exec_run( 68 | ["bash", "-c", f"mv {path} {broken_path}"], user="root" 69 | ) 70 | assert_that(result.exit_code, equal_to(0)) 71 | -------------------------------------------------------------------------------- /tests/modules/minio.py: -------------------------------------------------------------------------------- 1 | """ 2 | Interface to Minio S3 server.
3 | """ 4 | 5 | import json 6 | import os 7 | 8 | from docker.models.containers import Container 9 | from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed 10 | 11 | from .docker import copy_container_dir, get_container 12 | from .typing import ContextT 13 | 14 | 15 | class MinioException(Exception): 16 | """ 17 | Minio exception. 18 | """ 19 | 20 | def __init__(self, response: dict) -> None: 21 | super().__init__(self._fmt_message(response)) 22 | self.response = response 23 | 24 | @staticmethod 25 | def _fmt_message(response: dict) -> str: 26 | try: 27 | error = response["error"] 28 | message = f'{error["message"]} Cause: {error["cause"]["message"]}' 29 | 30 | code = error["cause"]["error"].get("Code") 31 | if code: 32 | message = f"{message} [{code}]" 33 | 34 | return message 35 | 36 | except Exception: 37 | return f"Failed with response: {response}" 38 | 39 | 40 | class BucketAlreadyOwnedByYou(MinioException): 41 | """ 42 | BucketAlreadyOwnedByYou Minio exception. 43 | """ 44 | 45 | pass 46 | 47 | 48 | def initialize(context: ContextT) -> None: 49 | """ 50 | Initialize Minio server. 51 | """ 52 | _configure_s3_credentials(context) 53 | _create_s3_bucket(context) 54 | 55 | 56 | def export_s3_data(context: ContextT, path: str) -> None: 57 | """ 58 | Export S3 data to the specified directory. 59 | """ 60 | local_dir = os.path.join(path, "minio") 61 | copy_container_dir(_container(context), "/export", local_dir) 62 | 63 | 64 | @retry( 65 | retry=retry_if_exception_type(MinioException), 66 | wait=wait_fixed(1), 67 | stop=stop_after_attempt(10), 68 | ) 69 | def _configure_s3_credentials(context: ContextT) -> None: 70 | """ 71 | Configure S3 credentials in mc (Minio client).
72 | """ 73 | access_key = context.conf["s3"]["access_key_id"] 74 | secret_key = context.conf["s3"]["access_secret_key"] 75 | _mc_execute( 76 | context, 77 | f"config host add local http://localhost:9000 {access_key} {secret_key}", 78 | ) 79 | 80 | 81 | def _create_s3_bucket(context: ContextT) -> None: 82 | """ 83 | Create S3 bucket specified in the config. 84 | """ 85 | bucket = context.conf["s3"]["bucket"] 86 | try: 87 | _mc_execute(context, f"mb local/{bucket}") 88 | except BucketAlreadyOwnedByYou: 89 | pass 90 | 91 | 92 | def _container(context: ContextT) -> Container: 93 | return get_container(context, "minio01") 94 | 95 | 96 | def _mc_execute(context: ContextT, command: str) -> dict: 97 | """ 98 | Execute mc (Minio client) command. 99 | """ 100 | output = _container(context).exec_run(f"mc --json {command}").output.decode() 101 | response = json.loads(output) 102 | if response["status"] == "success": 103 | return response 104 | 105 | error_code = response["error"]["cause"]["error"].get("Code") 106 | exception_types = { 107 | "BucketAlreadyOwnedByYou": BucketAlreadyOwnedByYou, 108 | } 109 | raise exception_types.get(error_code, MinioException)(response) 110 | -------------------------------------------------------------------------------- /tests/steps/chadmin.py: -------------------------------------------------------------------------------- 1 | """ 2 | Steps for interacting with chadmin. 
3 | """ 4 | 5 | from behave import then, when 6 | from hamcrest import assert_that, equal_to 7 | from modules.chadmin import Chadmin 8 | from modules.docker import get_container 9 | from modules.typing import ContextT 10 | 11 | 12 | @when("we execute chadmin create zk nodes on {node:w}") 13 | def step_create_(context: ContextT, node: str) -> None: 14 | container = get_container(context, node) 15 | nodes = context.text.strip().split("\n") 16 | chadmin = Chadmin(container) 17 | 18 | for node in nodes: 19 | result = chadmin.create_zk_node(node) 20 | assert result.exit_code == 0, f" output:\n {result.output.decode().strip()}" 21 | 22 | 23 | @when("we do hosts cleanup on {node} with fqdn {fqdn} and zk root {zk_root}") 24 | def step_host_cleanup_with_zk_root( 25 | context: ContextT, node: str, fqdn: str, zk_root: str 26 | ) -> None: 27 | container = get_container(context, node) 28 | result = Chadmin(container).zk_cleanup(fqdn, zk_root) 29 | assert result.exit_code == 0, f" output:\n {result.output.decode().strip()}" 30 | 31 | 32 | @when("we do hosts dry cleanup on {node} with fqdn {fqdn} and zk root {zk_root}") 33 | def step_host_dry_cleanup_with_zk_root( 34 | context: ContextT, node: str, fqdn: str, zk_root: str 35 | ) -> None: 36 | container = get_container(context, node) 37 | result = Chadmin(container).zk_cleanup(fqdn, zk_root, dry_run=True) 38 | assert result.exit_code == 0, f" output:\n {result.output.decode().strip()}" 39 | 40 | 41 | @when("we do hosts cleanup on {node} with fqdn {fqdn}") 42 | def step_host_cleanup(context: ContextT, node: str, fqdn: str) -> None: 43 | container = get_container(context, node) 44 | result = Chadmin(container).zk_cleanup(fqdn, no_ch_config=False) 45 | assert result.exit_code == 0, f" output:\n {result.output.decode().strip()}" 46 | 47 | 48 | @when( 49 | "we do table cleanup on {node} with fqdn {fqdn} from table with {zk_table_path} zookeeper path" 50 | ) 51 | def step_table_cleanup( 52 | context: ContextT, node: str, fqdn: str, 
zk_table_path: str 53 | ) -> None: 54 | container = get_container(context, node) 55 | result = Chadmin(container).zk_cleanup_table(fqdn, zk_table_path) 56 | assert result.exit_code == 0, f" output:\n {result.output.decode().strip()}" 57 | 58 | 59 | @then("the list of children on {node:w} for zk node {zk_node} is equal to") 60 | def step_children_list(context: ContextT, node: str, zk_node: str) -> None: 61 | container = get_container(context, node) 62 | result = Chadmin(container).zk_list(zk_node) 63 | assert_that(result.output.decode(), equal_to(context.text + "\n")) 64 | 65 | 66 | @then("the list of children on {node:w} for zk node {zk_node} is empty") 67 | def step_children_list_empty(context: ContextT, node: str, zk_node: str) -> None: 68 | container = get_container(context, node) 69 | result = Chadmin(container).zk_list(zk_node) 70 | assert_that(result.output.decode(), equal_to("\n")) 71 | 72 | 73 | @when("we delete zookeepers nodes {zk_nodes} on {node:w}") 74 | def step_delete_command(context: ContextT, zk_nodes: str, node: str) -> None: 75 | container = get_container(context, node) 76 | result = Chadmin(container).zk_delete(zk_nodes) 77 | assert result.exit_code == 0, f" output:\n {result.output.decode().strip()}" 78 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/process_group.py: -------------------------------------------------------------------------------- 1 | from typing import Any, Optional 2 | 3 | from click import Context 4 | from cloup import Choice, argument, group, option, option_group, pass_context 5 | from cloup.constraints import RequireAtLeast 6 | 7 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 8 | from ch_tools.chadmin.internal.process import get_process, kill_process, list_processes 9 | from ch_tools.chadmin.internal.utils import format_query 10 | from ch_tools.common.cli.formatting import print_response 11 | from ch_tools.common.clickhouse.config import get_cluster_name 12 | 
13 | FIELD_FORMATTERS = { 14 | "query": format_query, 15 | } 16 | 17 | 18 | @group("process", cls=Chadmin) 19 | def process_group() -> None: 20 | """ 21 | Commands to manage processes. 22 | """ 23 | pass 24 | 25 | 26 | @process_group.command("get") 27 | @argument("query_id") 28 | @pass_context 29 | def get_process_command(ctx: Any, query_id: Any) -> None: 30 | """ 31 | Get process. 32 | """ 33 | process = get_process(ctx, query_id) 34 | print_response( 35 | ctx, process, default_format="yaml", field_formatters=FIELD_FORMATTERS 36 | ) 37 | 38 | 39 | @process_group.command("list") 40 | @option("-u", "--user") 41 | @option("-U", "--exclude-user") 42 | @option("--query") 43 | @option("-v", "--verbose", is_flag=True, help="Verbose mode.") 44 | @option( 45 | "--cluster", 46 | "--on-cluster", 47 | "on_cluster", 48 | is_flag=True, 49 | help="Get records from all hosts in the cluster.", 50 | ) 51 | @option( 52 | "--order-by", 53 | type=Choice(["elapsed", "memory_usage"]), 54 | default="elapsed", 55 | help="Sorting order.", 56 | ) 57 | @option( 58 | "-l", "--limit", type=int, help="Limit the max number of objects in the output." 59 | ) 60 | @pass_context 61 | def list_processes_command( 62 | ctx: Context, 63 | user: Any, 64 | exclude_user: Any, 65 | query: Any, 66 | verbose: Any, 67 | on_cluster: Any, 68 | order_by: Any, 69 | limit: Any, 70 | ) -> None: 71 | """ 72 | List processes. 
73 | """ 74 | cluster = get_cluster_name(ctx) if on_cluster else None 75 | 76 | processes = list_processes( 77 | ctx, 78 | user=user, 79 | exclude_user=exclude_user, 80 | query_pattern=query, 81 | cluster=cluster, 82 | limit=limit, 83 | order_by=order_by, 84 | verbose=verbose, 85 | ) 86 | 87 | print_response( 88 | ctx, processes, default_format="yaml", field_formatters=FIELD_FORMATTERS 89 | ) 90 | 91 | 92 | @process_group.command("kill") 93 | @option_group( 94 | "Process selection options", 95 | option("-a", "--all", "_all", is_flag=True, help="Kill all processes."), 96 | option("-q", "--query", "query_id"), 97 | option("-u", "--user"), 98 | option("-U", "--exclude-user"), 99 | constraint=RequireAtLeast(1), 100 | ) 101 | @pass_context 102 | def kill_process_command( 103 | ctx: Context, 104 | _all: Any, 105 | query_id: Optional[str], 106 | user: Optional[str], 107 | exclude_user: Optional[str], 108 | ) -> None: 109 | """ 110 | Kill one or several processes using "KILL QUERY" query. 111 | """ 112 | kill_process(ctx, query_id=query_id, user=user, exclude_user=exclude_user) 113 | -------------------------------------------------------------------------------- /tests/modules/utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utility functions. 3 | """ 4 | 5 | import string 6 | from functools import wraps 7 | from random import choice as random_choise 8 | from types import SimpleNamespace 9 | from typing import Any, Callable, Mapping, MutableMapping 10 | 11 | from ch_tools.common import logging, utils 12 | 13 | from .typing import ContextT 14 | 15 | 16 | def merge( 17 | original: MutableMapping[Any, Any], update: Mapping[Any, Any] 18 | ) -> MutableMapping[Any, Any]: 19 | """ 20 | Recursively merge update dict into original. 
21 | """ 22 | for key in update: 23 | recurse_conditions = [ 24 | key in original, 25 | isinstance(original.get(key), MutableMapping), 26 | isinstance(update.get(key), Mapping), 27 | ] 28 | if all(recurse_conditions): 29 | merge(original[key], update[key]) 30 | else: 31 | original[key] = update[key] 32 | return original 33 | 34 | 35 | def env_stage(event: str, fail: bool = False) -> Callable: 36 | """ 37 | Log an environment stage; exceptions are re-raised only if `fail` is True. 38 | """ 39 | 40 | def wrapper(fun: Callable) -> Callable: 41 | @wraps(fun) 42 | def _wrapped_fun(*args: Any, **kwargs: Any) -> Any: 43 | stage_name = f"{fun.__module__}.{fun.__name__}" 44 | logging.info("initiating {} stage {}", event, stage_name) 45 | try: 46 | return fun(*args, **kwargs) 47 | except Exception as e: 48 | logging.error("{} failed: {!r}", stage_name, e) 49 | if fail: 50 | raise 51 | 52 | return _wrapped_fun 53 | 54 | return wrapper 55 | 56 | 57 | def generate_random_string(length: int = 64) -> str: 58 | """ 59 | Generate a random alphanumeric string. 60 | """ 61 | return "".join( 62 | random_choise(string.ascii_letters + string.digits) for _ in range(length) 63 | ) 64 | 65 | 66 | def context_to_dict(context: ContextT) -> dict: 67 | """ 68 | Convert context to dict representation. 69 | 70 | The context type can be either types.SimpleNamespace or behave.Context. 71 | """ 72 | if isinstance(context, SimpleNamespace): 73 | return context.__dict__ 74 | 75 | result: dict = {} 76 | for frame in context._stack: # pylint: disable=protected-access 77 | for key, value in frame.items(): 78 | if key not in result: 79 | result[key] = value 80 | 81 | return result 82 | 83 | 84 | def version_ge(current_version: str, comparing_version: str) -> bool: 85 | """ 86 | Return True if `current_version` is greater than or equal to `comparing_version`, or False otherwise. 
87 | """ 88 | # "latest" is greater than or equal to any known version 89 | if current_version == "latest": 90 | return True 91 | 92 | return utils.version_ge(current_version, comparing_version) 93 | 94 | 95 | def version_lt(current_version: str, comparing_version: str) -> bool: 96 | """ 97 | Return True if `current_version` is less than `comparing_version`, or False otherwise. 98 | """ 99 | # "latest" is not less than any known version 100 | if current_version == "latest": 101 | return False 102 | 103 | return utils.version_lt(current_version, comparing_version) 104 | -------------------------------------------------------------------------------- /tests/modules/s3.py: -------------------------------------------------------------------------------- 1 | """ 2 | S3 client. 3 | """ 4 | 5 | from typing import List, Optional 6 | 7 | import boto3 8 | from botocore.client import Config 9 | from botocore.errorfactory import ClientError 10 | 11 | from ch_tools.common import logging 12 | 13 | from . import docker 14 | from .typing import ContextT 15 | 16 | 17 | class S3Client: 18 | """ 19 | S3 client. 
20 | """ 21 | 22 | def __init__(self, context: ContextT, bucket: Optional[str] = None) -> None: 23 | config = context.conf["s3"] 24 | boto_config = config["boto_config"] 25 | self._s3_session = boto3.session.Session( 26 | aws_access_key_id=config["access_key_id"], 27 | aws_secret_access_key=config["access_secret_key"], 28 | region_name=boto_config["region_name"], 29 | ) 30 | 31 | host, port = docker.get_exposed_port( 32 | docker.get_container(context, context.conf["s3"]["container"]), 33 | context.conf["s3"]["port"], 34 | ) 35 | endpoint_url = f"http://{host}:{port}" 36 | self._s3_client = self._s3_session.client( 37 | service_name="s3", 38 | endpoint_url=endpoint_url, 39 | config=Config( 40 | s3={ 41 | "addressing_style": boto_config["addressing_style"], 42 | "region_name": boto_config["region_name"], 43 | }, 44 | ), 45 | ) 46 | 47 | self._s3_bucket_name = bucket if bucket else config["bucket"] 48 | 49 | for module_logger in ("boto3", "botocore", "s3transfer", "urllib3"): 50 | logging.set_module_log_level(module_logger, logging.CRITICAL) 51 | 52 | def upload_data(self, data: bytes, remote_path: str) -> None: 53 | """ 54 | Upload given bytes or file-like object. 55 | """ 56 | remote_path = remote_path.lstrip("/") 57 | self._s3_client.put_object( 58 | Body=data, Bucket=self._s3_bucket_name, Key=remote_path 59 | ) 60 | 61 | def delete_data(self, remote_path: str) -> None: 62 | """ 63 | Delete file from storage. 64 | """ 65 | remote_path = remote_path.lstrip("/") 66 | self._s3_client.delete_object(Bucket=self._s3_bucket_name, Key=remote_path) 67 | 68 | def path_exists(self, remote_path: str) -> bool: 69 | """ 70 | Check if remote path exists. 71 | """ 72 | try: 73 | self._s3_client.head_object(Bucket=self._s3_bucket_name, Key=remote_path) 74 | return True 75 | except ClientError: 76 | return False 77 | 78 | def list_objects(self, prefix: str) -> List[str]: 79 | """ 80 | List all objects with given prefix. 
81 | """ 82 | contents = [] 83 | paginator = self._s3_client.get_paginator("list_objects") 84 | list_object_kwargs = dict(Bucket=self._s3_bucket_name, Prefix=prefix) 85 | 86 | for result in paginator.paginate(**list_object_kwargs): 87 | if result.get("CommonPrefixes") is not None: 88 | for dir_prefix in result.get("CommonPrefixes"): 89 | contents.append(dir_prefix.get("Prefix")) 90 | 91 | if result.get("Contents") is not None: 92 | for file_key in result.get("Contents"): 93 | contents.append(file_key.get("Key")) 94 | 95 | return contents 96 | -------------------------------------------------------------------------------- /ch_tools/monrun_checks/ch_s3_credentials_config.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | from datetime import timedelta 4 | from typing import Any 5 | 6 | import requests 7 | from click import Context, pass_context 8 | from cloup import command, option 9 | 10 | from ch_tools.common import logging 11 | from ch_tools.common.clickhouse.config.path import ( 12 | CLICKHOUSE_RESETUP_CONFIG_PATH, 13 | CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH, 14 | ) 15 | from ch_tools.common.result import CRIT, OK, WARNING, Result 16 | 17 | 18 | @command("s3-credentials-config") 19 | @option( 20 | "-p", 21 | "--present/--missing", 22 | default=False, 23 | is_flag=True, 24 | help="Whether S3 credentials config should be present or not.", 25 | ) 26 | @pass_context 27 | def s3_credentials_configs_command(ctx: Context, present: bool) -> Result: 28 | """ 29 | Check S3 credentials config. 
30 | """ 31 | # pylint: disable=too-many-return-statements 32 | try: 33 | if not present: 34 | if not os.path.exists(CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH): 35 | return Result(OK) 36 | return Result(CRIT, "S3 credentials config exists, but shouldn't") 37 | 38 | if os.path.isfile(CLICKHOUSE_RESETUP_CONFIG_PATH): 39 | return Result(OK, "Skipped as resetup is in progress") 40 | 41 | if os.path.exists(CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH): 42 | delta = timedelta( 43 | seconds=time.time() 44 | - os.path.getmtime(CLICKHOUSE_S3_CREDENTIALS_CONFIG_PATH) 45 | ) 46 | if delta < timedelta(hours=2): 47 | return Result(OK) 48 | if delta < timedelta(hours=4): 49 | return Result( 50 | WARNING, 51 | f"S3 token expire in {_delta_to_hours(timedelta(hours=12) - delta)} hours", 52 | ) 53 | 54 | if delta < timedelta(hours=12): 55 | msg = f"S3 token expire in {_delta_to_hours(timedelta(hours=12) - delta)} hours" 56 | else: 57 | msg = f"S3 token expired {_delta_to_hours(delta - timedelta(hours=12))} hours ago" 58 | else: 59 | msg = "S3 credentials config is missing" 60 | 61 | endpoint = ctx.obj["config"]["cloud"]["metadata_service_endpoint"] 62 | code = _request_token(endpoint).status_code 63 | if code == 404: 64 | if "default" in requests.get( 65 | f"{endpoint}/computeMetadata/v1/instance/?recursive=true", 66 | headers={"Metadata-Flavor": "Google"}, 67 | timeout=60, 68 | ).json().get("serviceAccounts", {}): 69 | return Result(WARNING, "service account deleted") 70 | 71 | return Result(CRIT, "service account not linked") 72 | 73 | return Result(CRIT, f"{msg}, IAM code {code}") 74 | 75 | except Exception: 76 | logging.exception("Failed to check S3 credentials config") 77 | return Result(CRIT, "Internal error") 78 | 79 | 80 | def _request_token(metadata_service_endpoint: str) -> Any: 81 | return requests.get( 82 | f"{metadata_service_endpoint}/computeMetadata/v1/instance/service-accounts/default/token", 83 | headers={"Metadata-Flavor": "Google"}, 84 | timeout=60, 85 | ) 86 | 87 | 88 | def 
_delta_to_hours(delta: timedelta) -> str: 89 | return f"{(delta.total_seconds() / 3600):.2f}" 90 | -------------------------------------------------------------------------------- /tests/configuration.py: -------------------------------------------------------------------------------- 1 | """ 2 | Variables that influence testing behavior are defined here. 3 | """ 4 | 5 | import os 6 | 7 | 8 | def create() -> dict: 9 | """ 10 | Create test configuration (non-idempotent function). 11 | """ 12 | network_name = "ch_tools_test" 13 | services: dict = { 14 | "clickhouse": { 15 | "instances": ["clickhouse01", "clickhouse02"], 16 | "expose": { 17 | "http": 8123, 18 | "clickhouse": 9000, 19 | "keeper": 2281, 20 | }, 21 | "depends_on": ["zookeeper"], 22 | "args": { 23 | "CLICKHOUSE_VERSION": "${CLICKHOUSE_VERSION:-latest}", 24 | }, 25 | "db": { 26 | "user": "reader", 27 | "password": "reader_password", 28 | }, 29 | }, 30 | "zookeeper": { 31 | "instances": ["zookeeper01"], 32 | "expose": { 33 | "tcp": 2181, 34 | }, 35 | }, 36 | "minio": { 37 | "instances": ["minio01"], 38 | "expose": { 39 | "http": 9000, 40 | }, 41 | }, 42 | "http_mock": { 43 | "instances": ["http_mock01"], 44 | "expose": { 45 | "tcp": 8080, 46 | }, 47 | }, 48 | } 49 | 50 | s3 = { 51 | "endpoint": "http://minio01:9000", 52 | "port": 9000, 53 | "access_secret_key": "test_secret", 54 | "access_key_id": "test_key", 55 | "bucket": "cloud-storage-test", 56 | "boto_config": { 57 | "addressing_style": "auto", 58 | "region_name": "us-east-1", 59 | }, 60 | "container": "minio01", 61 | } 62 | 63 | return { 64 | "ch_version": os.getenv("CLICKHOUSE_VERSION", "latest"), 65 | "images_dir": "images", 66 | "staging_dir": "staging", 67 | "network_name": network_name, 68 | "s3": s3, 69 | "ch_backup": { 70 | "encrypt_key": "keyshouldbe32byteskeyshouldbe32b", 71 | }, 72 | "services": services, 73 | "dbaas_conf": _dbaas_conf(services, network_name), 74 | } 75 | 76 | 77 | def _dbaas_conf(services: dict, network_name: str) -> dict: 
78 | """ 79 | Generate dbaas.conf contents. 80 | """ 81 | 82 | def _fqdn(instance_name: str) -> str: 83 | return f"{instance_name}.{network_name}" 84 | 85 | return { 86 | "cluster_id": "cid1", 87 | "created_at": "2022-01-01T12:00:00.000000+03:00", 88 | "cluster": { 89 | "subclusters": { 90 | "subcid1": { 91 | "roles": ["clickhouse_cluster"], 92 | "shards": { 93 | "shard_id1": { 94 | "name": "shard1", 95 | "hosts": { 96 | _fqdn(instance_name): {} 97 | for instance_name in services["clickhouse"]["instances"] 98 | }, 99 | }, 100 | }, 101 | }, 102 | "subcid2": { 103 | "roles": ["zk"], 104 | "hosts": { 105 | _fqdn(services["zookeeper"]["instances"][0]): {}, 106 | }, 107 | }, 108 | }, 109 | }, 110 | } 111 | -------------------------------------------------------------------------------- /ch_tools/common/clickhouse/config/clickhouse.py: -------------------------------------------------------------------------------- 1 | import os.path 2 | from enum import Enum 3 | from typing import Any, Dict 4 | 5 | from ch_tools.common.clickhouse.config.storage_configuration import ( 6 | ClickhouseStorageConfiguration, 7 | ) 8 | 9 | from ...utils import first_value 10 | from .path import ( 11 | CLICKHOUSE_CERT_PATH_DEFAULT, 12 | CLICKHOUSE_SERVER_CONFIG_PATH, 13 | CLICKHOUSE_SERVER_PREPROCESSED_CONFIG_PATH, 14 | ) 15 | from .utils import dump_config, load_config 16 | from .zookeeper import ClickhouseZookeeperConfig 17 | 18 | 19 | class ClickhousePort(Enum): 20 | TCP = 1 21 | TCP_SECURE = 2 22 | HTTP = 3 23 | HTTPS = 4 24 | 25 | 26 | class ClickhouseConfig: 27 | """ 28 | ClickHouse server config (config.xml). 29 | """ 30 | 31 | def __init__(self, config: Any, preprocessed: Any) -> None: 32 | self._config = config 33 | self.preprocessed = preprocessed 34 | 35 | @property 36 | def _config_root(self) -> dict: 37 | return first_value(self._config) 38 | 39 | @property 40 | def macros(self) -> dict: 41 | """ 42 | ClickHouse macros. 
43 | """ 44 | macros = self._config_root.get("macros", {}) 45 | return {key: value for key, value in macros.items() if not key.startswith("@")} 46 | 47 | @property 48 | def cluster_name(self) -> Any: 49 | return self.macros["cluster"] 50 | 51 | @property 52 | def zookeeper(self) -> ClickhouseZookeeperConfig: 53 | """ 54 | ZooKeeper configuration. 55 | """ 56 | return ClickhouseZookeeperConfig(self._config_root.get("zookeeper", {})) 57 | 58 | @property 59 | def storage_configuration(self) -> ClickhouseStorageConfiguration: 60 | return ClickhouseStorageConfiguration( 61 | self._config_root.get("storage_configuration", {}) 62 | ) 63 | 64 | @property 65 | def ports(self) -> Dict[ClickhousePort, int]: 66 | settings = { 67 | "tcp_port": ClickhousePort.TCP, 68 | "tcp_port_secure": ClickhousePort.TCP_SECURE, 69 | "http_port": ClickhousePort.HTTP, 70 | "https_port": ClickhousePort.HTTPS, 71 | } 72 | 73 | result = {} 74 | for setting_name, port in settings.items(): 75 | value = self._config_root.get(setting_name) 76 | if value: 77 | result[port] = int(value) 78 | 79 | return result 80 | 81 | @property 82 | def cert_path(self) -> str: 83 | openssl_server_config = self._config_root.get("openSSL", {}).get("server", {}) 84 | return openssl_server_config.get("caConfig", CLICKHOUSE_CERT_PATH_DEFAULT) 85 | 86 | def dump(self, mask_secrets: bool = True) -> Any: 87 | return dump_config(self._config, mask_secrets=mask_secrets) 88 | 89 | def dump_xml(self, mask_secrets: bool = True) -> Any: 90 | return dump_config(self._config, mask_secrets=mask_secrets, xml_format=True) 91 | 92 | @staticmethod 93 | def load(try_preprocessed: bool = False) -> "ClickhouseConfig": 94 | if try_preprocessed and os.path.exists( 95 | CLICKHOUSE_SERVER_PREPROCESSED_CONFIG_PATH 96 | ): 97 | config = load_config(CLICKHOUSE_SERVER_PREPROCESSED_CONFIG_PATH) 98 | return ClickhouseConfig(config, preprocessed=True) 99 | 100 | config = load_config(CLICKHOUSE_SERVER_CONFIG_PATH) 101 | return ClickhouseConfig(config, 
preprocessed=False) 102 | -------------------------------------------------------------------------------- /tests/environment.py: -------------------------------------------------------------------------------- 1 | """ 2 | Behave entry point. 3 | """ 4 | 5 | import re 6 | import sys 7 | from typing import Optional 8 | 9 | import env_control 10 | from behave import model 11 | from modules.logs import save_logs 12 | from modules.typing import ContextT 13 | from modules.utils import version_ge, version_lt 14 | 15 | from ch_tools.common import logging 16 | from ch_tools.common.config import load_config 17 | 18 | try: 19 | import ipdb as pdb 20 | except ImportError: 21 | import pdb # type: ignore 22 | 23 | 24 | def before_all(context: ContextT) -> None: 25 | """ 26 | Prepare environment for tests. 27 | """ 28 | config = load_config() 29 | logging.configure(config["loguru"], "test") 30 | logging.add( 31 | sys.stdout, 32 | level="INFO", 33 | format_="{time:YYYY-MM-DD HH:mm:ss,SSS} [{level:8}]:\t{message}", 34 | ) 35 | if not context.config.userdata.getbool("skip_setup"): 36 | env_control.create(context) 37 | 38 | 39 | def before_feature(context: ContextT, _feature: model.Feature) -> None: 40 | """ 41 | Cleanup function executing per feature. 42 | """ 43 | if "dependent-scenarios" in _feature.tags: 44 | env_control.restart(context) 45 | 46 | 47 | def before_scenario(context: ContextT, scenario: model.Scenario) -> None: 48 | """ 49 | Cleanup function executing per scenario. 50 | """ 51 | if "dependent-scenarios" not in context.feature.tags and _check_tags( 52 | context, scenario 53 | ): 54 | env_control.restart(context) 55 | 56 | 57 | def after_step(context: ContextT, step: model.Step) -> None: 58 | """ 59 | Save logs after failed step. 60 | """ 61 | if step.status == "failed": 62 | save_logs(context) 63 | if context.config.userdata.getbool("debug"): 64 | pdb.post_mortem(step.exc_traceback) 65 | 66 | 67 | def after_all(context: ContextT) -> None: 68 | """ 69 | Clean up. 
70 | """ 71 | if (context.failed and not context.aborted) and context.config.userdata.getbool( 72 | "no_stop_on_fail" 73 | ): 74 | logging.info("Not stopping containers on failure as requested") 75 | return 76 | env_control.stop(context) 77 | 78 | 79 | def _check_tags(context: ContextT, scenario: model.Scenario) -> bool: 80 | ch_version = context.conf["ch_version"] 81 | 82 | require_version = _parse_version_tag(scenario.tags, "require_version") 83 | if require_version: 84 | if not version_ge(ch_version, require_version): 85 | logging.info("Skipping scenario due to require_version mismatch") 86 | scenario.mark_skipped() 87 | return False 88 | 89 | require_lt_version = _parse_version_tag(scenario.tags, "require_version_less_than") 90 | if require_lt_version: 91 | if not version_lt(ch_version, require_lt_version): 92 | logging.info("Skipping scenario due to require_version_less_than mismatch") 93 | scenario.mark_skipped() 94 | return False 95 | 96 | if "skip" in scenario.tags: 97 | logging.info("Skipping scenario due to skip tag") 98 | scenario.mark_skipped() 99 | return False 100 | 101 | return True 102 | 103 | 104 | def _parse_version_tag(tags: list, prefix: str) -> Optional[str]: 105 | tag_pattern = prefix + r"_(?P<version>[\d\.]+)" 106 | for tag in tags: 107 | match = re.fullmatch(tag_pattern, tag) 108 | if match: 109 | return match.group("version") 110 | 111 | return None 112 | -------------------------------------------------------------------------------- /ch_tools/chadmin/cli/chs3_backup_group.py: -------------------------------------------------------------------------------- 1 | import os 2 | from typing import List 3 | 4 | from click import ClickException, Context, argument, group, option, pass_context 5 | 6 | from ch_tools.chadmin.cli.chadmin_group import Chadmin 7 | from ch_tools.chadmin.internal.backup import unfreeze_backup 8 | from ch_tools.common import logging 9 | from ch_tools.common.backup import ( 10 | DEFAULT_CHS3_BACKUPS_DIRECTORY, 11 | get_chs3_backups, 
12 | get_orphaned_chs3_backups, 13 | ) 14 | from ch_tools.common.utils import clear_empty_directories_recursively 15 | 16 | 17 | @group("chs3-backup", cls=Chadmin) 18 | def chs3_backup_group() -> None: 19 | """Commands to manage ClickHouse over S3 backups (backups for data stored in S3).""" 20 | pass 21 | 22 | 23 | @chs3_backup_group.command("list") 24 | @option("--orphaned", is_flag=True) 25 | def list_backups(orphaned: bool) -> None: 26 | """List backups.""" 27 | backups = get_orphaned_chs3_backups() if orphaned else get_chs3_backups() 28 | for backup in backups: 29 | logging.info(backup) 30 | 31 | 32 | @chs3_backup_group.command("delete") 33 | @argument("backup") 34 | @option( 35 | "-n", 36 | "--dry-run", 37 | is_flag=True, 38 | default=False, 39 | help="Enable dry run mode and do not perform any modifying actions.", 40 | ) 41 | @pass_context 42 | def delete_backup(ctx: Context, backup: str, dry_run: bool) -> None: 43 | """Delete backup.""" 44 | chs3_backups = get_chs3_backups() 45 | if backup not in chs3_backups: 46 | raise ClickException(f"Backup {backup} not found.") 47 | 48 | delete_chs3_backups(ctx, [backup], dry_run=dry_run) 49 | 50 | 51 | @chs3_backup_group.command("cleanup") 52 | @option("-k", "--keep-going", is_flag=True, help="Do not stop on the first error.") 53 | @option( 54 | "-n", 55 | "--dry-run", 56 | is_flag=True, 57 | default=False, 58 | help="Enable dry run mode and do not perform any modifying actions.", 59 | ) 60 | @pass_context 61 | def cleanup_backups(ctx: Context, dry_run: bool, keep_going: bool) -> None: 62 | """Remove unnecessary or orphaned backups.""" 63 | orphaned_chs3_backups = get_orphaned_chs3_backups() 64 | delete_chs3_backups( 65 | ctx, orphaned_chs3_backups, keep_going=keep_going, dry_run=dry_run 66 | ) 67 | 68 | 69 | def delete_chs3_backups( 70 | ctx: Context, 71 | chs3_backups: List[str], 72 | *, 73 | keep_going: bool = False, 74 | dry_run: bool = False, 75 | ) -> None: 76 | """ 77 | Delete CHS3 backups. 
78 | """ 79 | for chs3_backup in chs3_backups: 80 | try: 81 | unfreeze_backup(ctx, chs3_backup, dry_run=dry_run) 82 | except Exception as e: 83 | if keep_going: 84 | logging.warning("{!r}\n", e) 85 | else: 86 | raise 87 | 88 | 89 | def clear_empty_backup(orphaned_chs3_backup: str) -> None: 90 | backup_directory = os.path.join( 91 | DEFAULT_CHS3_BACKUPS_DIRECTORY, orphaned_chs3_backup 92 | ) 93 | try: 94 | backup_contents = os.listdir(backup_directory) 95 | clear_empty_directories_recursively(backup_directory) 96 | if len(os.listdir(backup_directory)) == 1 and "revision.txt" in backup_contents: 97 | os.remove(os.path.join(backup_directory, "revision.txt")) 98 | os.rmdir(backup_directory) 99 | except FileNotFoundError: 100 | logging.error( 101 | "Cannot remove backup directory {} as it doesn't exist.\nMaybe it was already removed.", 102 | backup_directory, 103 | ) 104 | --------------------------------------------------------------------------------