├── luigi ├── py.typed ├── __version__.py ├── static │ └── visualiser │ │ ├── fonts │ │ ├── FontAwesome.otf │ │ ├── fontawesome-webfont.eot │ │ ├── fontawesome-webfont.ttf │ │ ├── fontawesome-webfont.woff │ │ ├── fontawesome-webfont.woff2 │ │ ├── glyphicons-halflings-regular.eot │ │ ├── glyphicons-halflings-regular.ttf │ │ └── glyphicons-halflings-regular.woff │ │ ├── lib │ │ ├── datatables │ │ │ └── images │ │ │ │ ├── favicon.ico │ │ │ │ ├── sort_asc.png │ │ │ │ ├── sort_both.png │ │ │ │ ├── sort_desc.png │ │ │ │ ├── Sorting icons.psd │ │ │ │ ├── sort_asc_disabled.png │ │ │ │ └── sort_desc_disabled.png │ │ ├── jquery-ui │ │ │ └── css │ │ │ │ └── images │ │ │ │ ├── animated-overlay.gif │ │ │ │ ├── ui-icons_222222_256x240.png │ │ │ │ ├── ui-icons_2e83ff_256x240.png │ │ │ │ ├── ui-icons_454545_256x240.png │ │ │ │ ├── ui-icons_888888_256x240.png │ │ │ │ ├── ui-icons_cd0a0a_256x240.png │ │ │ │ ├── ui-bg_flat_0_aaaaaa_40x100.png │ │ │ │ ├── ui-bg_flat_75_ffffff_40x100.png │ │ │ │ ├── ui-bg_glass_55_fbf9ee_1x400.png │ │ │ │ ├── ui-bg_glass_65_ffffff_1x400.png │ │ │ │ ├── ui-bg_glass_75_dadada_1x400.png │ │ │ │ ├── ui-bg_glass_75_e6e6e6_1x400.png │ │ │ │ ├── ui-bg_glass_95_fef1ec_1x400.png │ │ │ │ └── ui-bg_highlight-soft_75_cccccc_1x100.png │ │ ├── bootstrap-toggle │ │ │ └── css │ │ │ │ └── bootstrap-toggle.min.css │ │ └── AdminLTE │ │ │ └── css │ │ │ └── skin-green.min.css │ │ ├── js │ │ └── util.js │ │ ├── mockdata │ │ ├── fetch_error │ │ ├── dep_graph │ │ └── task_list │ │ ├── test.html │ │ └── css │ │ └── tipsy.css ├── templates │ ├── menu.html │ ├── recent.html │ └── show.html ├── contrib │ ├── __init__.py │ ├── hdfs │ │ ├── error.py │ │ ├── clients.py │ │ ├── __init__.py │ │ └── abstract_client.py │ ├── gcp.py │ ├── sparkey.py │ ├── lsf_runner.py │ ├── external_daily_snapshot.py │ ├── target.py │ ├── mrrunner.py │ └── sge_runner.py ├── __main__.py ├── tools │ ├── __init__.py │ ├── deps_tree.py │ └── luigi_grep.py ├── configuration │ ├── __init__.py │ ├── 
base_parser.py │ ├── core.py │ └── toml_parser.py ├── task_status.py ├── cmdline.py ├── event.py ├── task_history.py ├── freezing.py └── metrics.py ├── test ├── conftest.py ├── contrib │ ├── __init__.py │ ├── hdfs │ │ └── webhdfs_client_test.py │ ├── bigquery_avro_test.py │ ├── external_daily_snapshot_test.py │ ├── scalding_test.py │ ├── _webhdfs_test.py │ ├── redis_test.py │ └── cascading_test.py ├── visualiser │ └── __init__.py ├── create_packages_archive_root │ ├── package.egg-info │ │ └── top_level.txt │ ├── module.py │ └── package │ │ ├── __init__.py │ │ ├── subpackage │ │ ├── __init__.py │ │ └── submodule.py │ │ ├── submodule_without_imports.py │ │ ├── submodule.py │ │ └── submodule_with_absolute_import.py ├── auto_namespace_test │ ├── __init__.py │ └── my_namespace_test.py ├── testconfig │ ├── luigi_local.toml │ ├── pyproject.toml │ ├── luigi.cfg │ ├── core-site.xml │ ├── luigi.toml │ ├── log4j.properties │ ├── logging.cfg │ └── luigi_logging.toml ├── gcloud-credentials.json.enc ├── other_module.py ├── hdfs_client_test.py ├── runtests.py ├── most_common_test.py ├── task_progress_percentage_test.py ├── dynamic_import_test.py ├── task_status_message_test.py ├── set_task_name_test.py ├── metrics_test.py ├── factorial_test.py ├── remote_scheduler_test.py ├── subtask_test.py ├── recursion_test.py ├── priority_test.py ├── helpers_test.py ├── task_history_test.py ├── _mysqldb_test.py ├── test_ssh.py ├── task_register_test.py ├── choice_parameter_test.py ├── import_test.py ├── fib_test.py ├── task_bulk_complete_test.py ├── clone_test.py ├── instance_test.py ├── worker_task_process_test.py ├── mypy_test.py ├── test_sigpipe.py ├── config_toml_test.py └── task_forwarded_attributes_test.py ├── doc ├── .gitignore ├── luigi.png ├── history.png ├── user_recs.png ├── web_server.png ├── history_by_id.png ├── task_breakdown.png ├── dependency_graph.png ├── execution_model.png ├── history_by_name.png ├── parameters_enum.png ├── task_parameters.png ├── aggregate_artists.png ├── 
history_by_task_id.png ├── task_with_targets.png ├── parameters_recursion.png ├── visualiser_front_page.png ├── parameters_date_algebra.png ├── tasks_with_dependencies.png ├── tasks_input_output_requires.png ├── mypy.rst ├── index.rst ├── logging.rst └── design_and_limitations.rst ├── scripts └── ci │ ├── stop_azurite.sh │ ├── conditional_tox.sh │ ├── install_start_azurite.sh │ └── setup_hadoop_env.sh ├── catalog-info.yaml ├── SECURITY.md ├── .readthedocs.yaml ├── bin ├── luigi └── luigid ├── .coveragerc ├── .github ├── CODEOWNERS ├── PULL_REQUEST_TEMPLATE.md ├── stale.yml ├── ISSUE_TEMPLATE.md └── workflows │ └── codeql.yml ├── examples ├── hello_world.py ├── __init__.py ├── top_artists_spark.py ├── config.toml ├── foo.py ├── foo_complex.py ├── kubernetes.py ├── wordcount_hadoop.py ├── wordcount.py └── ssh_remote_execution.py ├── RELEASE-PROCESS.rst ├── codecov.yml ├── CONTRIBUTING.rst └── .gitignore /luigi/py.typed: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /test/conftest.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /test/contrib/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /doc/.gitignore: -------------------------------------------------------------------------------- 1 | _static 2 | _build 3 | _templates 4 | -------------------------------------------------------------------------------- /luigi/__version__.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | VERSION = '3.6.0' 4 | -------------------------------------------------------------------------------- /test/visualiser/__init__.py: 
-------------------------------------------------------------------------------- 1 | # Tests for visualiser javascript. 2 | -------------------------------------------------------------------------------- /doc/luigi.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/luigi.png -------------------------------------------------------------------------------- /test/create_packages_archive_root/package.egg-info/top_level.txt: -------------------------------------------------------------------------------- 1 | package 2 | -------------------------------------------------------------------------------- /doc/history.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/history.png -------------------------------------------------------------------------------- /doc/user_recs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/user_recs.png -------------------------------------------------------------------------------- /doc/web_server.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/web_server.png -------------------------------------------------------------------------------- /doc/history_by_id.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/history_by_id.png -------------------------------------------------------------------------------- /doc/task_breakdown.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/task_breakdown.png -------------------------------------------------------------------------------- 
/doc/dependency_graph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/dependency_graph.png -------------------------------------------------------------------------------- /doc/execution_model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/execution_model.png -------------------------------------------------------------------------------- /doc/history_by_name.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/history_by_name.png -------------------------------------------------------------------------------- /doc/parameters_enum.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/parameters_enum.png -------------------------------------------------------------------------------- /doc/task_parameters.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/task_parameters.png -------------------------------------------------------------------------------- /doc/aggregate_artists.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/aggregate_artists.png -------------------------------------------------------------------------------- /doc/history_by_task_id.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/history_by_task_id.png -------------------------------------------------------------------------------- /doc/task_with_targets.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/task_with_targets.png -------------------------------------------------------------------------------- /test/auto_namespace_test/__init__.py: -------------------------------------------------------------------------------- 1 | import luigi 2 | 3 | luigi.auto_namespace(scope=__name__) 4 | -------------------------------------------------------------------------------- /test/testconfig/luigi_local.toml: -------------------------------------------------------------------------------- 1 | [hdfs] 2 | namenode_host = "localhost" 3 | namenode_port = 50030 4 | -------------------------------------------------------------------------------- /doc/parameters_recursion.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/parameters_recursion.png -------------------------------------------------------------------------------- /doc/visualiser_front_page.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/visualiser_front_page.png -------------------------------------------------------------------------------- /doc/parameters_date_algebra.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/parameters_date_algebra.png -------------------------------------------------------------------------------- /doc/tasks_with_dependencies.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/tasks_with_dependencies.png -------------------------------------------------------------------------------- /test/gcloud-credentials.json.enc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/spotify/luigi/HEAD/test/gcloud-credentials.json.enc -------------------------------------------------------------------------------- /test/testconfig/pyproject.toml: -------------------------------------------------------------------------------- 1 | [tool.mypy] 2 | plugins = ["luigi.mypy"] 3 | ignore_missing_imports = true 4 | -------------------------------------------------------------------------------- /doc/tasks_input_output_requires.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/doc/tasks_input_output_requires.png -------------------------------------------------------------------------------- /scripts/ci/stop_azurite.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | docker stop "$(docker ps -q --filter ancestor=mcr.microsoft.com/azure-storage/azurite)" -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/FontAwesome.otf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/FontAwesome.otf -------------------------------------------------------------------------------- /catalog-info.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: backstage.io/v1alpha1 2 | kind: Component 3 | metadata: 4 | name: luigi 5 | spec: 6 | type: library 7 | owner: dataex 8 | -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/fontawesome-webfont.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/fontawesome-webfont.eot -------------------------------------------------------------------------------- 
/luigi/static/visualiser/fonts/fontawesome-webfont.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/fontawesome-webfont.ttf -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/fontawesome-webfont.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/fontawesome-webfont.woff -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/fontawesome-webfont.woff2: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/fontawesome-webfont.woff2 -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/datatables/images/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/favicon.ico -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/datatables/images/sort_asc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/sort_asc.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/datatables/images/sort_both.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/sort_both.png -------------------------------------------------------------------------------- 
/luigi/static/visualiser/lib/datatables/images/sort_desc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/sort_desc.png -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/glyphicons-halflings-regular.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/glyphicons-halflings-regular.eot -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/glyphicons-halflings-regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/glyphicons-halflings-regular.ttf -------------------------------------------------------------------------------- /luigi/static/visualiser/fonts/glyphicons-halflings-regular.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/fonts/glyphicons-halflings-regular.woff -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/datatables/images/Sorting icons.psd: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/Sorting icons.psd -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/datatables/images/sort_asc_disabled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/sort_asc_disabled.png 
-------------------------------------------------------------------------------- /luigi/static/visualiser/lib/datatables/images/sort_desc_disabled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/datatables/images/sort_desc_disabled.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/animated-overlay.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/animated-overlay.gif -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_222222_256x240.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_222222_256x240.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_2e83ff_256x240.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_2e83ff_256x240.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_454545_256x240.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_454545_256x240.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_888888_256x240.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_888888_256x240.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_cd0a0a_256x240.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-icons_cd0a0a_256x240.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_flat_0_aaaaaa_40x100.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_flat_0_aaaaaa_40x100.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_flat_75_ffffff_40x100.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_flat_75_ffffff_40x100.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_55_fbf9ee_1x400.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_55_fbf9ee_1x400.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_65_ffffff_1x400.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_65_ffffff_1x400.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_75_dadada_1x400.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_75_dadada_1x400.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_75_e6e6e6_1x400.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_75_e6e6e6_1x400.png -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_95_fef1ec_1x400.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_glass_95_fef1ec_1x400.png -------------------------------------------------------------------------------- /test/testconfig/luigi.cfg: -------------------------------------------------------------------------------- 1 | [core] 2 | logging_conf_file: test/testconfig/logging.cfg 3 | 4 | [hdfs] 5 | client: hadoopcli 6 | snakebite_autoconfig: false 7 | namenode_host: localhost 8 | namenode_port: 50030 9 | -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_highlight-soft_75_cccccc_1x100.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/spotify/luigi/HEAD/luigi/static/visualiser/lib/jquery-ui/css/images/ui-bg_highlight-soft_75_cccccc_1x100.png -------------------------------------------------------------------------------- /scripts/ci/conditional_tox.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | set -ex 4 | 5 | ENDENV=$(echo $TOXENV | tail -c 7) 6 | if [[ $ENDENV == gcloud ]] 7 | then 8 | [[ $DIDNT_CREATE_GCP_CREDS = 1 ]] || tox 9 | else 10 | tox --hashseed 1 11 | fi 12 | -------------------------------------------------------------------------------- /luigi/static/visualiser/js/util.js: -------------------------------------------------------------------------------- 1 | function escapeHtml(unsafe) { 2 | return unsafe 3 | .replace(/&/g, "&amp;") 4 | .replace(/</g, "&lt;") 5 | .replace(/>/g, "&gt;") 6 | .replace(/"/g, "&quot;") 7 | .replace(/'/g, "&#039;"); 8 | } 9 | -------------------------------------------------------------------------------- /test/testconfig/core-site.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | <configuration> 5 | <property> 6 | <name>fs.defaultFS</name> 7 | <value>hdfs://localhost:50030/</value> 8 | </property> 9 | </configuration> 10 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Security Policy 2 | 3 | ## Reporting a Vulnerability 4 | 5 | Please report sensitive security issues via Spotify's [bug-bounty program](https://hackerone.com/spotify) by following this [instruction](https://docs.hackerone.com/programs/security-page.html), rather than GitHub. 
6 | -------------------------------------------------------------------------------- /test/testconfig/luigi.toml: -------------------------------------------------------------------------------- 1 | [core] 2 | logging_conf_file = "test/testconfig/logging.cfg" 3 | 4 | [hdfs] 5 | client = "hadoopcli" 6 | snakebite_autoconfig = false 7 | namenode_host = "must be overridden in local config" 8 | 9 | [SomeTask] 10 | param = {key1 = "value1", key2 = "value2"} 11 | -------------------------------------------------------------------------------- /.readthedocs.yaml: -------------------------------------------------------------------------------- 1 | version: 2 2 | 3 | build: 4 | os: ubuntu-22.04 5 | tools: 6 | python: "3.9" 7 | 8 | sphinx: 9 | configuration: doc/conf.py 10 | 11 | formats: 12 | - pdf 13 | - epub 14 | 15 | python: 16 | install: 17 | - method: pip 18 | path: . 19 | extra_requirements: 20 | - readthedocs 21 | -------------------------------------------------------------------------------- /bin/luigi: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import sys 4 | import warnings 5 | import luigi.cmdline 6 | 7 | 8 | def main(argv): 9 | warnings.warn("'bin/luigi' has moved to console script 'luigi'", DeprecationWarning) 10 | luigi.cmdline.luigi_run(argv) 11 | 12 | 13 | if __name__ == '__main__': 14 | main(sys.argv[1:]) 15 | -------------------------------------------------------------------------------- /bin/luigid: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import sys 4 | import warnings 5 | import luigi.cmdline 6 | 7 | 8 | def main(argv): 9 | warnings.warn("'bin/luigid' has moved to console script 'luigid'", DeprecationWarning) 10 | luigi.cmdline.luigid(argv) 11 | 12 | 13 | if __name__ == '__main__': 14 | main(sys.argv[1:]) 15 | -------------------------------------------------------------------------------- 
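The two `bin/` scripts above share one shim pattern: emit a `DeprecationWarning`, then pass the arguments to the real entry point. The sketch below illustrates that pattern in isolation. `run_cli` is a hypothetical stand-in for `luigi.cmdline.luigi_run`, used only so the snippet runs without Luigi installed, and the `catch_warnings` block is there just to make the warning observable.

```python
import warnings


def run_cli(argv):
    # Hypothetical stand-in for luigi.cmdline.luigi_run; it only echoes its arguments.
    return list(argv)


def main(argv):
    # Same shape as bin/luigi: warn first, then delegate to the real entry point.
    warnings.warn("'bin/luigi' has moved to console script 'luigi'", DeprecationWarning)
    return run_cli(argv)


# Record warnings instead of printing them, so the shim's behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = main(["--help"])

print(result)
print([str(w.message) for w in caught])
```

Python ignores `DeprecationWarning` by default unless it is triggered by code running in `__main__`, so a shim like this is most visible when the script is executed directly.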
/test/testconfig/log4j.properties: -------------------------------------------------------------------------------- 1 | hadoop.root.logger=INFO,stderr 2 | log4j.logger.org.apache.hadoop=INFO,stderr 3 | log4j.logger.org.apache.hadoop.util.NativeCodeLoader=Off 4 | 5 | log4j.appender.stderr = org.apache.log4j.ConsoleAppender 6 | log4j.appender.stderr.layout = org.apache.log4j.PatternLayout 7 | log4j.appender.stderr.Target = System.err -------------------------------------------------------------------------------- /.coveragerc: -------------------------------------------------------------------------------- 1 | [report] 2 | omit = 3 | luigi/mrrunner.py 4 | test/_test_time_generated_module*.py 5 | */python?.?/* 6 | */site-packages/nose/* 7 | *__init__* 8 | *test/* 9 | */.tox/* 10 | */setup.py 11 | */bin/luigidc 12 | hadoop_test.py 13 | minicluster.py 14 | 15 | [run] 16 | parallel=True 17 | concurrency=multiprocessing 18 | -------------------------------------------------------------------------------- /test/testconfig/logging.cfg: -------------------------------------------------------------------------------- 1 | [loggers] 2 | keys=root 3 | 4 | [handlers] 5 | keys=consoleHandler 6 | 7 | [formatters] 8 | keys=simpleFormatter 9 | 10 | [logger_root] 11 | level=DEBUG 12 | handlers=consoleHandler 13 | 14 | [handler_consoleHandler] 15 | class=StreamHandler 16 | level=DEBUG 17 | formatter=simpleFormatter 18 | args=(sys.stdout,) 19 | 20 | [formatter_simpleFormatter] 21 | format=%(levelname)s: %(message)s 22 | -------------------------------------------------------------------------------- /scripts/ci/install_start_azurite.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | echo "$DOCKERHUB_TOKEN" | docker login -u spotifyci --password-stdin 4 | 5 | docker pull mcr.microsoft.com/azure-storage/azurite 6 | mkdir -p blob_emulator 7 | $1/stop_azurite.sh 8 | docker run -p 10000:10000 -v blob_emulator:/data -e 
AZURITE_ACCOUNTS=devstoreaccount1:YXp1cml0ZQ== -d mcr.microsoft.com/azure-storage/azurite azurite-blob -l /data --blobHost 0.0.0.0 --blobPort 10000 9 | -------------------------------------------------------------------------------- /test/auto_namespace_test/my_namespace_test.py: -------------------------------------------------------------------------------- 1 | import luigi 2 | from helpers import LuigiTestCase 3 | 4 | 5 | class MyNamespaceTest(LuigiTestCase): 6 | def test_auto_namespace_scope(self): 7 | class MyTask(luigi.Task): 8 | pass 9 | self.assertTrue(self.run_locally(['auto_namespace_test.my_namespace_test.MyTask'])) 10 | self.assertEqual(MyTask.get_task_namespace(), 'auto_namespace_test.my_namespace_test') 11 | -------------------------------------------------------------------------------- /luigi/static/visualiser/mockdata/fetch_error: -------------------------------------------------------------------------------- 1 | { 2 | "response": { 3 | "taskId": "FactorTask(product=2)", 4 | "error": "Runtime error:\nTraceback (most recent call last):\n File '/Users/davw/projects/luigi-core/luigi/worker.py', line 164, in _run_task\n task.run()\n File '/Users/davw/projects/luigi-core/test/scheduler_visualisation_test.py', line 62, in run\n raise Exception('Error Message')\nException: Error Message\n" 5 | } 6 | } -------------------------------------------------------------------------------- /test/testconfig/luigi_logging.toml: -------------------------------------------------------------------------------- 1 | [logging] 2 | version = 1 3 | disable_existing_loggers = false 4 | 5 | [logging.formatters.mockformatter] 6 | format = "{levelname}: {message}" 7 | style = "{" 8 | 9 | [logging.handlers.mockhandler] 10 | class = "logging.StreamHandler" 11 | level = "INFO" 12 | formatter = "mockformatter" 13 | 14 | [logging.loggers.mocklogger] 15 | handlers = ["mockhandler"] 16 | level = 'INFO' 17 | disabled = false 18 | propagate = false 19 | 
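The `[logging]` table in `luigi_logging.toml` above has the shape of the standard-library `logging.config.dictConfig` schema (`version`, `formatters`, `handlers`, `loggers`). The sketch below shows how such a table could be parsed and applied. It is not Luigi's own loader: it assumes Python 3.11+ for `tomllib` (falling back to the third-party `tomli` package), and `load_logging_section` is a hypothetical helper introduced only for illustration.

```python
import logging
import logging.config

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # assumption: the tomli backport is installed on older versions
    import tomli as tomllib

# The same content as the luigi_logging.toml fixture, inlined to keep the example self-contained.
TOML_TEXT = """
[logging]
version = 1
disable_existing_loggers = false

[logging.formatters.mockformatter]
format = "{levelname}: {message}"
style = "{"

[logging.handlers.mockhandler]
class = "logging.StreamHandler"
level = "INFO"
formatter = "mockformatter"

[logging.loggers.mocklogger]
handlers = ["mockhandler"]
level = 'INFO'
disabled = false
propagate = false
"""


def load_logging_section(toml_text):
    # Hypothetical helper: parse the TOML text and return only the [logging] table.
    return tomllib.loads(toml_text)["logging"]


config = load_logging_section(TOML_TEXT)
logging.config.dictConfig(config)  # the table is handed over unchanged as a dictConfig dict

logger = logging.getLogger("mocklogger")
logger.info("hello")  # formatted by mockformatter and written to stderr by the StreamHandler
```

With `propagate = false`, records sent to `mocklogger` stop at `mockhandler` and do not reach the root logger's handlers.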
-------------------------------------------------------------------------------- /.github/CODEOWNERS: -------------------------------------------------------------------------------- 1 | # The following patterns are used to auto-assign review requests 2 | # to specific individuals. Order is important; the last matching 3 | # pattern takes the most precedence. 4 | 5 | # These owners will be the default owners for everything in 6 | # the repo. Unless a later match takes precedence, 7 | * @dlstadther @spotify/dataex 8 | 9 | # Specific files, directories, paths, or file types can be 10 | # assigned more specifically. 11 | contrib/redshift*.py @dlstadther 12 | 13 | -------------------------------------------------------------------------------- /luigi/static/visualiser/test.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Luigi Visualiser Tests 5 | 6 | 7 | 8 |
9 |
10 | 11 | 12 | 13 | 14 | 15 | -------------------------------------------------------------------------------- /examples/hello_world.py: -------------------------------------------------------------------------------- 1 | """ 2 | You can run this example like this: 3 | 4 | .. code:: console 5 | 6 | $ luigi --module examples.hello_world examples.HelloWorldTask --local-scheduler 7 | 8 | If that does not work, see :ref:`CommandLine`. 9 | """ 10 | import luigi 11 | 12 | 13 | class HelloWorldTask(luigi.Task): 14 | task_namespace = 'examples' 15 | 16 | def run(self): 17 | print("{task} says: Hello world!".format(task=self.__class__.__name__)) 18 | 19 | 20 | if __name__ == '__main__': 21 | luigi.run(['examples.HelloWorldTask', '--workers', '1', '--local-scheduler']) 22 | -------------------------------------------------------------------------------- /examples/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | -------------------------------------------------------------------------------- /luigi/templates/menu.html: -------------------------------------------------------------------------------- 1 | 5 | 6 | 7 | {% extends "layout.html" %} 8 | 9 | 10 | 11 | {% block content %} 12 | 13 |
14 | {% if tasknames %} 15 |

[ Task History ]

16 | 23 | {% end %} 24 |
25 | 26 | {% end %} 27 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/module.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/package/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/package/subpackage/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/package/submodule_without_imports.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/package/submodule.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import os # NOQA 19 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/package/subpackage/submodule.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | import os # NOQA 19 | -------------------------------------------------------------------------------- /luigi/contrib/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | """ 18 | Package containing optional add-on functionality. 19 | """ 20 | -------------------------------------------------------------------------------- /test/create_packages_archive_root/package/submodule_with_absolute_import.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License.
16 | # 17 | 18 | import os # NOQA 19 | -------------------------------------------------------------------------------- /luigi/__main__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2016 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | from luigi.cmdline import luigi_run 18 | 19 | if __name__ == '__main__': 20 | luigi_run() 21 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | ## Description 6 | 7 | 8 | ## Motivation and Context 9 | 10 | 11 | 12 | ## Have you tested this? If so, how? 
13 | 14 | 15 | 16 | 20 | -------------------------------------------------------------------------------- /examples/top_artists_spark.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | import operator 4 | import sys 5 | 6 | from pyspark.sql import SparkSession 7 | 8 | 9 | def main(argv): 10 | input_paths = argv[1].split(',') 11 | output_path = argv[2] 12 | 13 | spark = SparkSession.builder.getOrCreate() 14 | 15 | streams = spark.read.option('sep', '\t').csv(input_paths[0]) 16 | for stream_path in input_paths[1:]: 17 | streams = streams.union(spark.read.option('sep', '\t').csv(stream_path)) # union() returns a new DataFrame 18 | 19 | # The second field is the artist; count per artist on the underlying RDD, then convert back to a DataFrame 20 | counts = streams.rdd \ 21 | .map(lambda row: (row[1], 1)) \ 22 | .reduceByKey(operator.add).toDF() 23 | 24 | counts.write.option('sep', '\t').csv(output_path) 25 | 26 | 27 | if __name__ == '__main__': 28 | sys.exit(main(sys.argv)) 29 | -------------------------------------------------------------------------------- /luigi/tools/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # Copyright (c) 2014 Spotify AB 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); you may not 5 | # use this file except in compliance with the License. You may obtain a copy of 6 | # the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 12 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 13 | # License for the specific language governing permissions and limitations under 14 | # the License. 15 | 16 | """ 17 | Sort of a standard library for doing stuff with Tasks at a somewhat abstract level. 18 | 19 | Submodule introduced to stop growing util.py unstructured. 
20 | """ 21 | -------------------------------------------------------------------------------- /luigi/templates/recent.html: -------------------------------------------------------------------------------- 1 | {% extends "layout.html" %} 2 | {% block content %} 3 |

Luigi Task History

4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | {% for task in tasks %} 16 | 17 | 18 | 19 | 20 | 21 | 24 | 25 | {% end %} 26 | 27 |
NameHostLast ActionStatusParameters
{{task.name}}{{task.host}}{{task.events[0].ts}}{{task.events[0].event_name}}{% for (k, param) in task.parameters.items() %} 22 |
{{k}}{{param.value}}
23 | {% end %}
28 | {% end %} 29 | -------------------------------------------------------------------------------- /.github/stale.yml: -------------------------------------------------------------------------------- 1 | # Number of days of inactivity before an issue becomes stale 2 | daysUntilStale: 120 3 | # Number of days of inactivity before a stale issue is closed 4 | daysUntilClose: 14 5 | # Issues with these labels will never be considered stale 6 | exemptLabels: 7 | - pinned 8 | - security 9 | # Label to use when marking an issue as stale 10 | staleLabel: wontfix 11 | # Comment to post when marking an issue as stale. Set to `false` to disable 12 | markComment: > 13 | This issue has been automatically marked as stale because it has not had 14 | recent activity. It will be closed if no further activity occurs. 15 | If closed, you may revisit when your time allows and reopen! 16 | Thank you for your contributions. 17 | # Comment to post when closing a stale issue. Set to `false` to disable 18 | closeComment: false 19 | # Limit to only `issues` or `pulls` 20 | # only: issues 21 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | 20 | -------------------------------------------------------------------------------- /test/other_module.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import luigi 19 | 20 | 21 | class OtherModuleTask(luigi.Task): 22 | p = luigi.Parameter() 23 | 24 | def output(self): 25 | return luigi.LocalTarget(self.p) 26 | 27 | def run(self): 28 | with self.output().open('w') as f: 29 | f.write('Done!') 30 | -------------------------------------------------------------------------------- /doc/mypy.rst: -------------------------------------------------------------------------------- 1 | Mypy plugin 2 | -------------- 3 | 4 | The Mypy plugin provides type checking for ``luigi.Task``. 5 | 6 | Requires Python 3.8 or later. 7 | 8 | How to use 9 | ~~~~~~~~~~ 10 | 11 | Configure Mypy to use this plugin by adding the following to your ``mypy.ini`` file: 12 | 13 | .. code:: ini 14 | 15 | [mypy] 16 | plugins = luigi.mypy 17 | 18 | or by adding the following to your ``pyproject.toml`` file: 19 | 20 | .. code:: toml 21 | 22 | [tool.mypy] 23 | plugins = ["luigi.mypy"] 24 | 25 | Then, run Mypy as usual. 26 | 27 | Examples 28 | ~~~~~~~~ 29 | 30 | For example, the following code is linted by Mypy: 31 | 32 | .. 
code:: python 33 | 34 | import luigi 35 | 36 | 37 | class MyTask(luigi.Task): 38 | foo: int = luigi.IntParameter() 39 | bar: str = luigi.Parameter() 40 | 41 | MyTask(foo=1, bar='2') # OK 42 | MyTask(foo='1', bar='2') # Error: Argument "foo" to "MyTask" has incompatible type "str"; expected "int" 43 | -------------------------------------------------------------------------------- /luigi/configuration/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | from .cfg_parser import LuigiConfigParser 18 | from .core import get_config, add_config_path 19 | from .toml_parser import LuigiTomlParser 20 | 21 | 22 | __all__ = [ 23 | 'add_config_path', 24 | 'get_config', 25 | 'LuigiConfigParser', 26 | 'LuigiTomlParser', 27 | ] 28 | -------------------------------------------------------------------------------- /doc/index.rst: -------------------------------------------------------------------------------- 1 | .. Luigi documentation master file, created by 2 | sphinx-quickstart on Sat Feb 8 00:56:43 2014. 3 | You can adapt this file completely to your liking, but it should at least 4 | contain the root `toctree` directive. 5 | 6 | .. include:: ../README.rst 7 | 8 | Table of Contents 9 | ----------------- 10 | 11 | .. 
toctree:: 12 | :maxdepth: 2 13 | 14 | example_top_artists.rst 15 | workflows.rst 16 | tasks.rst 17 | parameters.rst 18 | running_luigi.rst 19 | central_scheduler.rst 20 | execution_model.rst 21 | luigi_patterns.rst 22 | configuration.rst 23 | logging.rst 24 | design_and_limitations.rst 25 | mypy.rst 26 | 27 | API Reference 28 | ------------- 29 | 30 | .. autosummary:: 31 | :toctree: api 32 | 33 | luigi 34 | luigi.contrib 35 | luigi.tools 36 | luigi.local_target 37 | 38 | 39 | Indices and tables 40 | ================== 41 | 42 | * :ref:`genindex` 43 | * :ref:`modindex` 44 | * :ref:`search` 45 | -------------------------------------------------------------------------------- /test/hdfs_client_test.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | import threading 3 | import unittest 4 | 5 | from luigi.contrib.hdfs import get_autoconfig_client 6 | 7 | 8 | class HdfsClientTest(unittest.TestCase): 9 | def test_get_autoconfig_client_cached(self): 10 | original_client = get_autoconfig_client() 11 | for _ in range(100): 12 | self.assertIs(original_client, get_autoconfig_client()) 13 | 14 | def test_threaded_clients_different(self): 15 | clients = [] 16 | 17 | def add_client(): 18 | clients.append(get_autoconfig_client()) 19 | 20 | # run a bunch of threads to get new clients in them 21 | threads = [threading.Thread(target=add_client) for _ in range(10)] 22 | for thread in threads: 23 | thread.start() 24 | for thread in threads: 25 | thread.join() 26 | 27 | for client1, client2 in itertools.combinations(clients, 2): 28 | self.assertIsNot(client1, client2) 29 | -------------------------------------------------------------------------------- /luigi/task_status.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this 
file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | """ 18 | Possible values for a Task's status in the Scheduler 19 | """ 20 | 21 | PENDING = 'PENDING' 22 | FAILED = 'FAILED' 23 | DONE = 'DONE' 24 | RUNNING = 'RUNNING' 25 | BATCH_RUNNING = 'BATCH_RUNNING' 26 | SUSPENDED = 'SUSPENDED' # Only kept for backward compatibility with old clients 27 | UNKNOWN = 'UNKNOWN' 28 | DISABLED = 'DISABLED' 29 | -------------------------------------------------------------------------------- /test/runtests.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | import sys 19 | import warnings 20 | 21 | import pytest 22 | 23 | if __name__ == '__main__': 24 | with warnings.catch_warnings(): 25 | warnings.simplefilter("default") 26 | warnings.filterwarnings( 27 | "ignore", 28 | message='(.*)outputs has no custom(.*)', 29 | category=UserWarning 30 | ) 31 | sys.exit(pytest.main(sys.argv[1:])) 32 | -------------------------------------------------------------------------------- /examples/config.toml: -------------------------------------------------------------------------------- 1 | 2 | [hdfs] 3 | client = "hadoopcli" 4 | namenode_host = "localhost" 5 | namenode_port = 50030 6 | 7 | # LOGGING 8 | 9 | [logging] 10 | version = 1 11 | disable_existing_loggers = false 12 | 13 | # logs format 14 | [logging.formatters.simple] 15 | format = "{levelname:8} {asctime} {module}:{lineno} {message}" 16 | style = "{" 17 | datefmt = "%Y-%m-%d %H:%M:%S" 18 | 19 | # write logs to console 20 | [logging.handlers.console] 21 | level = "DEBUG" 22 | class = "logging.StreamHandler" 23 | formatter = "simple" 24 | 25 | # luigi worker logging 26 | [logging.loggers.luigi-interface] 27 | handlers = ["console"] 28 | level = "INFO" 29 | disabled = false 30 | propagate = false 31 | 32 | # luigid logging 33 | [logging.loggers.luigi] 34 | handlers = ["console"] 35 | level = "INFO" 36 | disabled = false 37 | propagate = false 38 | 39 | # luigid is built on tornado 40 | [logging.loggers.tornado] 41 | handlers = ["console"] 42 | level = "INFO" 43 | disabled = false 44 | propagate = false 45 | 46 | # custom logger for "project" 47 | [logging.loggers.project] 48 | handlers = ["console"] 49 | level = "DEBUG" 50 | disabled = false 51 | propagate = false 52 | -------------------------------------------------------------------------------- /RELEASE-PROCESS.rst: -------------------------------------------------------------------------------- 1 | For maintainers of Luigi who have push access to PyPI. Here's how you upload 2 | Luigi to PyPI. 3 | 4 | #. 
Make sure `uv <https://github.com/astral-sh/uv>`_ is installed: ``curl -LsSf https://astral.sh/uv/install.sh | sh``. 5 | #. Update the version number in ``luigi/__version__.py``. 6 | #. Commit, perhaps simply with a commit message like ``Version x.y.z``. 7 | #. Push to GitHub at `spotify/luigi <https://github.com/spotify/luigi>`_. 8 | #. Clean up previous distributions by executing ``rm -rf dist``. 9 | #. Build a source distribution by executing ``uv build``. 10 | #. Set the PyPI token in an environment variable: ``export UV_PUBLISH_TOKEN="LUIGI_PYPI_TOKEN_HERE"``. 11 | #. Upload to PyPI by executing ``uv publish``. 12 | #. Add a tag on GitHub (https://github.com/spotify/luigi/releases), 13 | including a handwritten changelog, possibly inspired by previous notes. 14 | 15 | Currently, Luigi is not released on any particular schedule and it does not 16 | strictly abide by semantic versioning. Whenever possible, bump the major version when you make incompatible API changes, the minor version when you add functionality in a backwards compatible manner, and the patch version when you make backwards compatible bug fixes. 
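The version-bump rule above could be sketched as a small helper. This is only an illustration: ``bump_version`` is a hypothetical name, not part of Luigi or uv, and it assumes a plain ``x.y.z`` version string with no pre-release suffix.

```python
def bump_version(version, part):
    """Return an x.y.z version string with the requested part incremented.

    part is 'major' (incompatible API change), 'minor' (backwards compatible
    feature) or 'patch' (backwards compatible bug fix). Lower parts reset to 0.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


# Hypothetical starting version, used only to show the three cases.
print(bump_version("3.5.2", "major"))
print(bump_version("3.5.2", "minor"))
print(bump_version("3.5.2", "patch"))
```

The resulting string is what would be written to ``luigi/__version__.py`` in the "Update the version number" step.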
17 | -------------------------------------------------------------------------------- /codecov.yml: -------------------------------------------------------------------------------- 1 | codecov: 2 | require_ci_to_pass: yes 3 | notify: 4 | after_n_builds: 24 5 | wait_for_ci: yes 6 | 7 | coverage: 8 | precision: 2 # Just copied from default 9 | round: down # Just copied from default 10 | range: "70...100" # Just copied from default 11 | 12 | status: 13 | project: 14 | default: false # disable the default status that measures entire project 15 | core: 16 | target: 92% 17 | paths: 18 | - "luigi/*.py" 19 | patch: # Just copied from default 20 | default: 21 | if_no_uploads: error 22 | 23 | changes: true # Just copied from default 24 | 25 | ignore: 26 | - "examples/" 27 | - "luigi/tools" # These are tested as actual run commands without coverage 28 | # List modules whose tests are not run by Travis or 29 | # are run in subprocesses (like on cluster). 30 | - "luigi/contrib/gcs.py" 31 | - "luigi/contrib/bigquery.py" 32 | - "luigi/contrib/bigquery_avro.py" 33 | - "luigi/contrib/hdfs/" 34 | - "luigi/contrib/hadoop.py" 35 | - "luigi/contrib/mrrunner.py" 36 | - "luigi/contrib/kubernetes.py" 37 | 38 | # For luigi we do not want any comments 39 | comment: false 40 | -------------------------------------------------------------------------------- /test/most_common_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import unittest 19 | 20 | from luigi.tools.range import most_common 21 | 22 | 23 | class MostCommonTest(unittest.TestCase): 24 | 25 | def setUp(self): 26 | self.runs = [ 27 | ([1], (1, 1)), 28 | ([1, 1], (1, 2)), 29 | ([1, 1, 2], (1, 2)), 30 | ([1, 1, 2, 2, 2], (2, 3)) 31 | ] 32 | 33 | def test_runs(self): 34 | for args, result in self.runs: 35 | actual = most_common(args) 36 | expected = result 37 | self.assertEqual(expected, actual) 38 | -------------------------------------------------------------------------------- /luigi/contrib/hdfs/error.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ 19 | The implementations of the hdfs clients. 
20 | """ 21 | 22 | 23 | class HDFSCliError(Exception): 24 | 25 | def __init__(self, command, returncode, stdout, stderr): 26 | self.returncode = returncode 27 | self.stdout = stdout 28 | self.stderr = stderr 29 | msg = ("Command %r failed [exit code %d]\n" 30 | "---stdout---\n" 31 | "%s\n" 32 | "---stderr---\n" 33 | "%s" 34 | "------------") % (command, returncode, stdout, stderr) 35 | super(HDFSCliError, self).__init__(msg) 36 | -------------------------------------------------------------------------------- /luigi/static/visualiser/mockdata/dep_graph: -------------------------------------------------------------------------------- 1 | { 2 | "response": { 3 | "FactorTask(product=12)": { 4 | "deps": [ 5 | "FactorTask(product=2)", 6 | "FactorTask(product=6)" 7 | ], 8 | "start_time": 1369300552.60482, 9 | "status": "PENDING", 10 | "workers": [ 11 | "worker-641996460" 12 | ] 13 | }, 14 | "FactorTask(product=2)": { 15 | "deps": [], 16 | "start_time": 1369300552.60741, 17 | "status": "FAILED", 18 | "workers": [ 19 | "worker-641996460" 20 | ] 21 | }, 22 | "FactorTask(product=3)": { 23 | "deps": [], 24 | "start_time": 1369300552.61154, 25 | "status": "PENDING", 26 | "workers": [ 27 | "worker-641996460" 28 | ] 29 | }, 30 | "FactorTask(product=6)": { 31 | "deps": [ 32 | "FactorTask(product=2)", 33 | "FactorTask(product=3)" 34 | ], 35 | "start_time": 1369300552.609396, 36 | "status": "DONE", 37 | "workers": [ 38 | "worker-641996460" 39 | ] 40 | } 41 | } 42 | } 43 | -------------------------------------------------------------------------------- /luigi/static/visualiser/mockdata/task_list: -------------------------------------------------------------------------------- 1 | { 2 | "response": { 3 | "FactorTask(product=12)": { 4 | "deps": [ 5 | "FactorTask(product=2)", 6 | "FactorTask(product=6)" 7 | ], 8 | "start_time": 1369300552.60482, 9 | "status": "PENDING", 10 | "workers": [ 11 | "worker-641996460" 12 | ] 13 | }, 14 | "FactorTask(product=2)": { 15 | "deps": [], 16 | 
"start_time": 1369300552.60741, 17 | "status": "FAILED", 18 | "workers": [ 19 | "worker-641996460" 20 | ] 21 | }, 22 | "FactorTask(product=3)": { 23 | "deps": [], 24 | "start_time": 1369300552.61154, 25 | "status": "PENDING", 26 | "workers": [ 27 | "worker-641996460" 28 | ] 29 | }, 30 | "FactorTask(product=6)": { 31 | "deps": [ 32 | "FactorTask(product=2)", 33 | "FactorTask(product=3)" 34 | ], 35 | "start_time": 1369300552.609396, 36 | "status": "DONE", 37 | "workers": [ 38 | "worker-641996460" 39 | ] 40 | } 41 | } 42 | } 43 | -------------------------------------------------------------------------------- /test/task_progress_percentage_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import LuigiTestCase 19 | 20 | import luigi 21 | import luigi.scheduler 22 | import luigi.worker 23 | 24 | 25 | class TaskProgressPercentageTest(LuigiTestCase): 26 | 27 | def test_run(self): 28 | sch = luigi.scheduler.Scheduler() 29 | with luigi.worker.Worker(scheduler=sch) as w: 30 | class MyTask(luigi.Task): 31 | def run(self): 32 | self.set_progress_percentage(30) 33 | 34 | task = MyTask() 35 | w.add(task) 36 | w.run() 37 | 38 | self.assertEqual(sch.get_task_progress_percentage(task.task_id)["progressPercentage"], 30) 39 | -------------------------------------------------------------------------------- /test/dynamic_import_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import LuigiTestCase, temporary_unloaded_module 19 | 20 | import luigi 21 | import luigi.interface 22 | 23 | CONTENTS = b''' 24 | import luigi 25 | 26 | class FooTask(luigi.Task): 27 | x = luigi.IntParameter() 28 | 29 | def run(self): 30 | luigi._testing_glob_var = self.x 31 | ''' 32 | 33 | 34 | class CmdlineTest(LuigiTestCase): 35 | 36 | def test_dynamic_loading(self): 37 | with temporary_unloaded_module(CONTENTS) as temp_module_name: 38 | luigi.interface.run(['--module', temp_module_name, 'FooTask', '--x', '123', '--local-scheduler', '--no-lock']) 39 | self.assertEqual(luigi._testing_glob_var, 123) 40 | -------------------------------------------------------------------------------- /test/task_status_message_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import LuigiTestCase 19 | 20 | import luigi 21 | import luigi.scheduler 22 | import luigi.worker 23 | 24 | luigi.notifications.DEBUG = True 25 | 26 | 27 | class TaskStatusMessageTest(LuigiTestCase): 28 | 29 | def test_run(self): 30 | message = "test message" 31 | sch = luigi.scheduler.Scheduler() 32 | with luigi.worker.Worker(scheduler=sch) as w: 33 | class MyTask(luigi.Task): 34 | def run(self): 35 | self.set_status_message(message) 36 | 37 | task = MyTask() 38 | w.add(task) 39 | w.run() 40 | 41 | self.assertEqual(sch.get_task_status_message(task.task_id)["statusMessage"], message) 42 | -------------------------------------------------------------------------------- /test/set_task_name_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | 22 | 23 | def create_class(cls_name): 24 | class NewTask(luigi.WrapperTask): 25 | pass 26 | 27 | NewTask.__name__ = cls_name 28 | 29 | return NewTask 30 | 31 | 32 | create_class('MyNewTask') 33 | 34 | 35 | class SetTaskNameTest(unittest.TestCase): 36 | 37 | ''' I accidentally introduced an issue in this commit: 38 | https://github.com/spotify/luigi/commit/6330e9d0332e6152996292a39c42f752b9288c96 39 | 40 | This causes tasks not to get exposed if they change name later. Adding a unit test 41 | to resolve the issue. ''' 42 | 43 | def test_set_task_name(self): 44 | luigi.run(['--local-scheduler', '--no-lock', 'MyNewTask']) 45 | -------------------------------------------------------------------------------- /test/metrics_test.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | import luigi.metrics as metrics 4 | 5 | from luigi.contrib.datadog_metric import DatadogMetricsCollector 6 | from luigi.contrib.prometheus_metric import PrometheusMetricsCollector 7 | 8 | 9 | class TestMetricsCollectors(unittest.TestCase): 10 | def test_default_value(self): 11 | collector = metrics.MetricsCollectors.default 12 | output = metrics.MetricsCollectors.get(collector) 13 | 14 | assert type(output) is metrics.NoMetricsCollector 15 | 16 | def test_datadog_value(self): 17 | collector = metrics.MetricsCollectors.datadog 18 | output = metrics.MetricsCollectors.get(collector) 19 | 20 | assert type(output) is DatadogMetricsCollector 21 | 22 | def test_prometheus_value(self): 23 | collector = metrics.MetricsCollectors.prometheus 24 | output = metrics.MetricsCollectors.get(collector) 25 | 26 | assert type(output) is PrometheusMetricsCollector 27 | 28 | def test_none_value(self): 29 | collector = metrics.MetricsCollectors.none 30 | output = metrics.MetricsCollectors.get(collector) 31 | 32 | assert type(output) is metrics.NoMetricsCollector 33 | 34 | def 
test_other_value(self): 35 | collector = 'junk' 36 | 37 | with self.assertRaises(ValueError) as context: 38 | metrics.MetricsCollectors.get(collector) 39 | assert ("MetricsCollectors value ' junk ' isn't supported") in str(context.exception) 40 | -------------------------------------------------------------------------------- /luigi/configuration/base_parser.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | import logging 18 | 19 | 20 | # IMPORTANT: don't inherit from `object`! 21 | # ConfigParser has trouble in this case.
22 | # More info: https://stackoverflow.com/a/19323238 23 | class BaseParser: 24 | @classmethod 25 | def instance(cls, *args, **kwargs): 26 | """ Singleton getter """ 27 | if cls._instance is None: 28 | cls._instance = cls(*args, **kwargs) 29 | loaded = cls._instance.reload() 30 | logging.getLogger('luigi-interface').info('Loaded %r', loaded) 31 | 32 | return cls._instance 33 | 34 | @classmethod 35 | def add_config_path(cls, path): 36 | cls._config_paths.append(path) 37 | cls.reload() 38 | 39 | @classmethod 40 | def reload(cls): 41 | return cls.instance().read(cls._config_paths) 42 | -------------------------------------------------------------------------------- /test/contrib/hdfs/webhdfs_client_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2015 VNG Corporation 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | import unittest 18 | 19 | import pytest 20 | 21 | from helpers import with_config 22 | from luigi.contrib.hdfs import WebHdfsClient 23 | 24 | InsecureClient = pytest.importorskip('hdfs.InsecureClient') 25 | KerberosClient = pytest.importorskip('hdfs.ext.kerberos.KerberosClient') 26 | 27 | 28 | @pytest.mark.apache 29 | class TestWebHdfsClient(unittest.TestCase): 30 | 31 | @with_config({'webhdfs': {'client_type': 'insecure'}}) 32 | def test_insecure_client_type(self): 33 | client = WebHdfsClient(host='localhost').client 34 | self.assertIsInstance(client, InsecureClient) 35 | 36 | @with_config({'webhdfs': {'client_type': 'kerberos'}}) 37 | def test_kerberos_client_type(self): 38 | client = WebHdfsClient(host='localhost').client 39 | self.assertIsInstance(client, KerberosClient) 40 | -------------------------------------------------------------------------------- /luigi/templates/show.html: -------------------------------------------------------------------------------- 1 | {% extends "layout.html" %} 2 | {% block content %} 3 |
4 |
5 |

Info

6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 |
Task Id{{task.id}}
Task Name{{task.name}}
Host{{task.host}}
MoreAll "{{task.name}}" runs
26 |
27 |
28 |

Parameters

29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | {% for (k, param) in task.parameters.items() %} 38 | 39 | 40 | 41 | 42 | {% end %} 43 | 44 |
NameValue
{{k}}{{param.value}}
45 |

Actions

46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | {% for event in task.events %} 55 | 56 | 57 | 58 | 59 | {% end %} 60 | 61 | 62 |
StatusAction Time
{{event.event_name}}{{event.ts}}
63 | {% end %} 64 | -------------------------------------------------------------------------------- /luigi/cmdline.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import sys 3 | 4 | from luigi.retcodes import run_with_retcodes 5 | from luigi.setup_logging import DaemonLogging 6 | 7 | 8 | def luigi_run(argv=sys.argv[1:]): 9 | run_with_retcodes(argv) 10 | 11 | 12 | def luigid(argv=sys.argv[1:]): 13 | import luigi.server 14 | import luigi.process 15 | import luigi.configuration 16 | parser = argparse.ArgumentParser(description=u'Central luigi server') 17 | parser.add_argument(u'--background', help=u'Run in background mode', action='store_true') 18 | parser.add_argument(u'--pidfile', help=u'Write pidfile') 19 | parser.add_argument(u'--logdir', help=u'log directory') 20 | parser.add_argument(u'--state-path', help=u'Pickled state file') 21 | parser.add_argument(u'--address', help=u'Listening interface') 22 | parser.add_argument(u'--unix-socket', help=u'Unix socket path') 23 | parser.add_argument(u'--port', default=8082, help=u'Listening port') 24 | 25 | opts = parser.parse_args(argv) 26 | 27 | if opts.state_path: 28 | config = luigi.configuration.get_config() 29 | config.set('scheduler', 'state_path', opts.state_path) 30 | 31 | DaemonLogging.setup(opts) 32 | if opts.background: 33 | luigi.process.daemonize(luigi.server.run, api_port=opts.port, 34 | address=opts.address, pidfile=opts.pidfile, 35 | logdir=opts.logdir, unix_socket=opts.unix_socket) 36 | else: 37 | luigi.server.run(api_port=opts.port, address=opts.address, unix_socket=opts.unix_socket) 38 | -------------------------------------------------------------------------------- /test/factorial_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this 
file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | 22 | 23 | class Factorial(luigi.Task): 24 | 25 | ''' This calculates factorials *online* and does not write its results anywhere 26 | 27 | Demonstrates the ability for dependencies between Tasks and not just between their output. 28 | ''' 29 | n = luigi.IntParameter(default=100) 30 | 31 | def requires(self): 32 | if self.n > 1: 33 | return Factorial(self.n - 1) 34 | 35 | def run(self): 36 | if self.n > 1: 37 | self.value = self.n * self.requires().value 38 | else: 39 | self.value = 1 40 | self.complete = lambda: True 41 | 42 | def complete(self): 43 | return False 44 | 45 | 46 | class FactorialTest(unittest.TestCase): 47 | 48 | def test_invoke(self): 49 | luigi.build([Factorial(100)], local_scheduler=True) 50 | self.assertEqual(Factorial(42).value, 1405006117752879898543142606244511569936384000000000) 51 | -------------------------------------------------------------------------------- /test/remote_scheduler_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import os 19 | import tempfile 20 | 21 | import luigi.server 22 | import server_test 23 | 24 | tempdir = tempfile.mkdtemp() 25 | 26 | 27 | class DummyTask(luigi.Task): 28 | id = luigi.IntParameter() 29 | 30 | def run(self): 31 | f = self.output().open('w') 32 | f.close() 33 | 34 | def output(self): 35 | return luigi.LocalTarget(os.path.join(tempdir, str(self.id))) 36 | 37 | 38 | class RemoteSchedulerTest(server_test.ServerTestBase): 39 | 40 | def _test_run(self, workers): 41 | tasks = [DummyTask(id) for id in range(20)] 42 | luigi.build(tasks, workers=workers, scheduler_port=self.get_http_port()) 43 | 44 | for t in tasks: 45 | self.assertEqual(t.complete(), True) 46 | self.assertTrue(os.path.exists(t.output().path)) 47 | 48 | def test_single_worker(self): 49 | self._test_run(workers=1) 50 | 51 | def test_multiple_workers(self): 52 | self._test_run(workers=10) 53 | -------------------------------------------------------------------------------- /luigi/contrib/gcp.py: -------------------------------------------------------------------------------- 1 | """ 2 | Common code for GCP (google cloud services) integration 3 | """ 4 | import logging 5 | logger = logging.getLogger('luigi-interface') 6 | 7 | try: 8 | import httplib2 9 | import google.auth 10 | except ImportError: 11 | logger.warning("Loading GCP module without the python packages httplib2, google-auth. 
\ 12 | This *could* crash at runtime if no other credentials are provided.") 13 | 14 | 15 | def get_authenticate_kwargs(oauth_credentials=None, http_=None): 16 | """Returns a dictionary with keyword arguments for use with discovery 17 | 18 | Prioritizes oauth_credentials or a http client provided by the user 19 | If none provided, falls back to default credentials provided by google's command line 20 | utilities. If that also fails, tries using httplib2.Http() 21 | 22 | Used by `gcs.GCSClient` and `bigquery.BigQueryClient` to initiate the API Client 23 | """ 24 | if oauth_credentials: 25 | authenticate_kwargs = { 26 | "credentials": oauth_credentials 27 | } 28 | elif http_: 29 | authenticate_kwargs = { 30 | "http": http_ 31 | } 32 | else: 33 | # neither http_ or credentials provided 34 | try: 35 | # try default credentials 36 | credentials, _ = google.auth.default() 37 | authenticate_kwargs = { 38 | "credentials": credentials 39 | } 40 | except google.auth.exceptions.DefaultCredentialsError: 41 | # try http using httplib2 42 | authenticate_kwargs = { 43 | "http": httplib2.Http() 44 | } 45 | 46 | return authenticate_kwargs 47 | -------------------------------------------------------------------------------- /examples/foo.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | """ 18 | You can run this example like this: 19 | 20 | .. code:: console 21 | 22 | $ rm -rf '/tmp/bar' 23 | $ luigi --module examples.foo examples.Foo --workers 2 --local-scheduler 24 | 25 | """ 26 | import time 27 | 28 | import luigi 29 | 30 | 31 | class Foo(luigi.WrapperTask): 32 | task_namespace = 'examples' 33 | 34 | def run(self): 35 | print("Running Foo") 36 | 37 | def requires(self): 38 | for i in range(10): 39 | yield Bar(i) 40 | 41 | 42 | class Bar(luigi.Task): 43 | task_namespace = 'examples' 44 | num = luigi.IntParameter() 45 | 46 | def run(self): 47 | time.sleep(1) 48 | self.output().open('w').close() 49 | 50 | def output(self): 51 | """ 52 | Returns the target output for this task. 53 | 54 | :return: the target output for this task. 55 | :rtype: object (:py:class:`~luigi.target.Target`) 56 | """ 57 | time.sleep(1) 58 | return luigi.LocalTarget('/tmp/bar/%d' % self.num) 59 | -------------------------------------------------------------------------------- /test/subtask_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | import abc 19 | from helpers import unittest 20 | 21 | import luigi 22 | 23 | 24 | class AbstractTask(luigi.Task): 25 | k = luigi.IntParameter() 26 | 27 | @property 28 | @abc.abstractmethod 29 | def foo(self): 30 | raise NotImplementedError 31 | 32 | @abc.abstractmethod 33 | def helper_function(self): 34 | raise NotImplementedError 35 | 36 | def run(self): 37 | return ",".join([self.foo, self.helper_function()]) 38 | 39 | 40 | class Implementation(AbstractTask): 41 | 42 | @property 43 | def foo(self): 44 | return "bar" 45 | 46 | def helper_function(self): 47 | return "hello" * self.k 48 | 49 | 50 | class AbstractSubclassTest(unittest.TestCase): 51 | 52 | def test_instantiate_abstract(self): 53 | def try_instantiate(): 54 | AbstractTask(k=1) 55 | 56 | self.assertRaises(TypeError, try_instantiate) 57 | 58 | def test_instantiate(self): 59 | self.assertEqual("bar,hellohello", Implementation(k=2).run()) 60 | -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/bootstrap-toggle/css/bootstrap-toggle.min.css: -------------------------------------------------------------------------------- 1 | /*! 
======================================================================== 2 | * Bootstrap Toggle: bootstrap-toggle.css v2.2.0 3 | * http://www.bootstraptoggle.com 4 | * ======================================================================== 5 | * Copyright 2014 Min Hur, The New York Times Company 6 | * Licensed under MIT 7 | * ======================================================================== */ 8 | .checkbox label .toggle,.checkbox-inline .toggle{margin-left:-20px;margin-right:5px} 9 | .toggle{position:relative;overflow:hidden} 10 | .toggle input[type=checkbox]{display:none} 11 | .toggle-group{position:absolute;width:200%;top:0;bottom:0;left:0;transition:left .35s;-webkit-transition:left .35s;-moz-user-select:none;-webkit-user-select:none} 12 | .toggle.off .toggle-group{left:-100%} 13 | .toggle-on{position:absolute;top:0;bottom:0;left:0;right:50%;margin:0;border:0;border-radius:0} 14 | .toggle-off{position:absolute;top:0;bottom:0;left:50%;right:0;margin:0;border:0;border-radius:0} 15 | .toggle-handle{position:relative;margin:0 auto;padding-top:0;padding-bottom:0;height:100%;width:0;border-width:0 1px} 16 | .toggle.btn{min-width:59px;min-height:34px} 17 | .toggle-on.btn{padding-right:24px} 18 | .toggle-off.btn{padding-left:24px} 19 | .toggle.btn-lg{min-width:79px;min-height:45px} 20 | .toggle-on.btn-lg{padding-right:31px} 21 | .toggle-off.btn-lg{padding-left:31px} 22 | .toggle-handle.btn-lg{width:40px} 23 | .toggle.btn-sm{min-width:50px;min-height:30px} 24 | .toggle-on.btn-sm{padding-right:20px} 25 | .toggle-off.btn-sm{padding-left:20px} 26 | .toggle.btn-xs{min-width:35px;min-height:22px} 27 | .toggle-on.btn-xs{padding-right:12px} 28 | .toggle-off.btn-xs{padding-left:12px} -------------------------------------------------------------------------------- /test/recursion_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the 
Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import datetime 19 | from helpers import unittest 20 | 21 | import luigi 22 | import luigi.interface 23 | from luigi.mock import MockTarget 24 | 25 | 26 | class Popularity(luigi.Task): 27 | date = luigi.DateParameter(default=datetime.date.today() - datetime.timedelta(1)) 28 | 29 | def output(self): 30 | return MockTarget('/tmp/popularity/%s.txt' % self.date.strftime('%Y-%m-%d')) 31 | 32 | def requires(self): 33 | return Popularity(self.date - datetime.timedelta(1)) 34 | 35 | def run(self): 36 | f = self.output().open('w') 37 | for line in self.input().open('r'): 38 | print(int(line.strip()) + 1, file=f) 39 | 40 | f.close() 41 | 42 | 43 | class RecursionTest(unittest.TestCase): 44 | 45 | def setUp(self): 46 | MockTarget.fs.get_all_data()['/tmp/popularity/2009-01-01.txt'] = b'0\n' 47 | 48 | def test_invoke(self): 49 | luigi.build([Popularity(datetime.date(2009, 1, 5))], local_scheduler=True) 50 | 51 | self.assertEqual(MockTarget.fs.get_data('/tmp/popularity/2009-01-05.txt'), b'4\n') 52 | -------------------------------------------------------------------------------- /test/contrib/bigquery_avro_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2019 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ 19 | These are the unit tests for the BigQueryLoadAvro class. 20 | """ 21 | 22 | import unittest 23 | import avro 24 | import avro.schema 25 | from luigi.contrib.bigquery_avro import BigQueryLoadAvro 26 | 27 | 28 | class BigQueryAvroTest(unittest.TestCase): 29 | 30 | def test_writer_schema_method_existence(self): 31 | schema_json = """ 32 | { 33 | "namespace": "example.avro", 34 | "type": "record", 35 | "name": "User", 36 | "fields": [ 37 | {"name": "name", "type": "string"}, 38 | {"name": "favorite_number", "type": ["int", "null"]}, 39 | {"name": "favorite_color", "type": ["string", "null"]} 40 | ] 41 | } 42 | """ 43 | avro_schema = avro.schema.Parse(schema_json) 44 | reader = avro.io.DatumReader(avro_schema, avro_schema) 45 | actual_schema = BigQueryLoadAvro._get_writer_schema(reader) 46 | self.assertEqual(actual_schema, avro_schema, 47 | "writer(s) avro_schema attribute not found") 48 | # otherwise AttributeError is thrown 49 | -------------------------------------------------------------------------------- /test/priority_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | import luigi.notifications 22 | 23 | luigi.notifications.DEBUG = True 24 | 25 | 26 | class PrioTask(luigi.Task): 27 | prio = luigi.Parameter() 28 | run_counter = 0 29 | 30 | @property 31 | def priority(self): 32 | return self.prio 33 | 34 | def requires(self): 35 | if self.prio > 10: 36 | return PrioTask(self.prio - 10) 37 | 38 | def run(self): 39 | self.t = PrioTask.run_counter 40 | PrioTask.run_counter += 1 41 | 42 | def complete(self): 43 | return hasattr(self, 't') 44 | 45 | 46 | class PriorityTest(unittest.TestCase): 47 | 48 | def test_priority(self): 49 | p, q, r = PrioTask(1), PrioTask(2), PrioTask(3) 50 | luigi.build([p, q, r], local_scheduler=True) 51 | self.assertTrue(r.t < q.t < p.t) 52 | 53 | def test_priority_w_dep(self): 54 | x, y, z = PrioTask(25), PrioTask(15), PrioTask(5) 55 | a, b, c = PrioTask(24), PrioTask(14), PrioTask(4) 56 | luigi.build([a, b, c, x, y, z], local_scheduler=True) 57 | self.assertTrue(z.t < y.t < x.t < c.t < b.t < a.t) 58 | -------------------------------------------------------------------------------- /doc/logging.rst: -------------------------------------------------------------------------------- 1 | Configure logging 2 | ----------------- 3 | 4 | 5 | Config options: 6 | ~~~~~~~~~~~~~~~ 7 | 8 | Some config options for config [core] section 9 | 10 | log_level 11 | The default log level to use when no logging_conf_file is set. Must be 12 | a valid name of a `Python log level 13 | `_. 14 | Default is ``DEBUG``. 
15 | logging_conf_file 16 | Location of the logging configuration file. 17 | no_configure_logging 18 | If true, logging is not configured. Defaults to false. 19 | 20 | 21 | Config section 22 | ~~~~~~~~~~~~~~ 23 | 24 | If you use TOML for your configuration file, you can configure logging 25 | via the ``logging`` section in this file. See `example 26 | `_ 27 | for more details. 28 | 29 | Luigid CLI options: 30 | ~~~~~~~~~~~~~~~~~~~ 31 | 32 | ``--background`` 33 | Run the daemon in background mode. Disables logging setup 34 | and sets the log level to INFO for the root logger. 35 | ``--logdir`` 36 | Set up logging at INFO level with output to the ``$logdir/luigi-server.log`` file. 37 | 38 | 39 | Worker CLI options: 40 | ~~~~~~~~~~~~~~~~~~~ 41 | 42 | ``--logging-conf-file`` 43 | Configuration file for logging. 44 | ``--log-level`` 45 | Default log level. 46 | Available values: NOTSET, DEBUG, INFO, WARNING, ERROR, CRITICAL. 47 | Default DEBUG. See `Python documentation 48 | `_ 49 | for information about the differences between levels. 50 | 51 | 52 | Configuration options resolution order: 53 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 54 | 55 | 1. no_configure_logging option 56 | 2. ``--background`` 57 | 3. ``--logdir`` 58 | 4. ``--logging-conf-file`` 59 | 5. logging_conf_file option 60 | 6. ``logging`` section 61 | 7. ``--log-level`` 62 | 8. log_level option 63 | -------------------------------------------------------------------------------- /luigi/event.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ Definitions needed for events. See :ref:`Events` for info on how to use them.""" 19 | 20 | 21 | class Event: 22 | # TODO nice descriptive subclasses of Event instead of strings? pass their instances to the callback instead of an undocumented arg list? 23 | DEPENDENCY_DISCOVERED = "event.core.dependency.discovered" # triggered for every (task, upstream task) pair discovered in a jobflow 24 | DEPENDENCY_MISSING = "event.core.dependency.missing" 25 | DEPENDENCY_PRESENT = "event.core.dependency.present" 26 | BROKEN_TASK = "event.core.task.broken" 27 | START = "event.core.start" 28 | #: This event can be fired by the task itself while running. The purpose is 29 | #: for the task to report progress, metadata or any generic info so that 30 | #: an event handler listening for this can keep track of the progress of the running task.
31 | PROGRESS = "event.core.progress" 32 | FAILURE = "event.core.failure" 33 | SUCCESS = "event.core.success" 34 | PROCESSING_TIME = "event.core.processing_time" 35 | TIMEOUT = "event.core.timeout" # triggered if a task times out 36 | PROCESS_FAILURE = "event.core.process_failure" # triggered if the process a task is running in dies unexpectedly 37 | -------------------------------------------------------------------------------- /test/helpers_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2016 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | import luigi 18 | import luigi.date_interval 19 | import luigi.interface 20 | import luigi.notifications 21 | from helpers import LuigiTestCase, RunOnceTask 22 | 23 | 24 | class LuigiTestCaseTest(LuigiTestCase): 25 | 26 | def test_1(self): 27 | class MyClass(luigi.Task): 28 | pass 29 | 30 | self.assertTrue(self.run_locally(['MyClass'])) 31 | 32 | def test_2(self): 33 | class MyClass(luigi.Task): 34 | pass 35 | 36 | self.assertTrue(self.run_locally(['MyClass'])) 37 | 38 | 39 | class RunOnceTaskTest(LuigiTestCase): 40 | 41 | def test_complete_behavior(self): 42 | """ 43 | Verify that RunOnceTask works as expected. 44 | 45 | This task will fail if it is a normal ``luigi.Task``, because 46 | RequiringTask will not run (missing dependency at runtime). 
47 | """ 48 | class MyTask(RunOnceTask): 49 | pass 50 | 51 | class RequiringTask(luigi.Task): 52 | counter = 0 53 | 54 | def requires(self): 55 | yield MyTask() 56 | 57 | def run(self): 58 | RequiringTask.counter += 1 59 | 60 | self.run_locally(['RequiringTask']) 61 | self.assertEqual(1, RequiringTask.counter) 62 | -------------------------------------------------------------------------------- /test/task_history_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import LuigiTestCase 19 | 20 | import luigi 21 | import luigi.scheduler 22 | import luigi.task_history 23 | import luigi.worker 24 | 25 | luigi.notifications.DEBUG = True 26 | 27 | 28 | class SimpleTaskHistory(luigi.task_history.TaskHistory): 29 | 30 | def __init__(self): 31 | self.actions = [] 32 | 33 | def task_scheduled(self, task): 34 | self.actions.append(('scheduled', task.id)) 35 | 36 | def task_finished(self, task, successful): 37 | self.actions.append(('finished', task.id)) 38 | 39 | def task_started(self, task, worker_host): 40 | self.actions.append(('started', task.id)) 41 | 42 | 43 | class TaskHistoryTest(LuigiTestCase): 44 | 45 | def test_run(self): 46 | th = SimpleTaskHistory() 47 | sch = luigi.scheduler.Scheduler(task_history_impl=th) 48 | with luigi.worker.Worker(scheduler=sch) as w: 49 | class MyTask(luigi.Task): 50 | pass 51 | 52 | task = MyTask() 53 | w.add(task) 54 | w.run() 55 | 56 | self.assertEqual(th.actions, [ 57 | ('scheduled', task.task_id), 58 | ('started', task.task_id), 59 | ('finished', task.task_id) 60 | ]) 61 | -------------------------------------------------------------------------------- /test/_mysqldb_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import unittest 19 | 20 | import mysql.connector 21 | from luigi.contrib.mysqldb import MySqlTarget 22 | 23 | host = 'localhost' 24 | port = 3306 25 | database = 'luigi_test' 26 | username = None 27 | password = None 28 | table_updates = 'table_updates' 29 | 30 | 31 | def _create_test_database(): 32 | con = mysql.connector.connect(user=username, 33 | password=password, 34 | host=host, 35 | port=port, 36 | autocommit=True) 37 | con.cursor().execute('CREATE DATABASE IF NOT EXISTS %s' % database) 38 | 39 | 40 | _create_test_database() 41 | target = MySqlTarget(host, database, username, password, '', 'update_id') 42 | 43 | 44 | class MySqlTargetTest(unittest.TestCase): 45 | 46 | def test_touch_and_exists(self): 47 | drop() 48 | self.assertFalse(target.exists(), 49 | 'Target should not exist before touching it') 50 | target.touch() 51 | self.assertTrue(target.exists(), 52 | 'Target should exist after touching it') 53 | 54 | 55 | def drop(): 56 | con = target.connect(autocommit=True) 57 | con.cursor().execute('DROP TABLE IF EXISTS %s' % table_updates) 58 | -------------------------------------------------------------------------------- /test/test_ssh.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | import subprocess 19 | from helpers import unittest 20 | 21 | from luigi.contrib.ssh import RemoteContext 22 | 23 | 24 | class TestMockedRemoteContext(unittest.TestCase): 25 | 26 | def test_subprocess_delegation(self): 27 | """ Test subprocess call structure using mock module """ 28 | orig_Popen = subprocess.Popen 29 | self.last_test = None 30 | 31 | def Popen(cmd, **kwargs): 32 | self.last_test = cmd 33 | 34 | subprocess.Popen = Popen 35 | context = RemoteContext( 36 | "some_host", 37 | username="luigi", 38 | key_file="/some/key.pub" 39 | ) 40 | context.Popen(["ls"]) 41 | self.assertTrue("ssh" in self.last_test) 42 | self.assertTrue("-i" in self.last_test) 43 | self.assertTrue("/some/key.pub" in self.last_test) 44 | self.assertTrue("luigi@some_host" in self.last_test) 45 | self.assertTrue("ls" in self.last_test) 46 | 47 | subprocess.Popen = orig_Popen 48 | 49 | def test_check_output_fail_connect(self): 50 | """ Test check_output to a non-existing host """ 51 | context = RemoteContext("__NO_HOST_LIKE_THIS__", connect_timeout=1) 52 | self.assertRaises( 53 | subprocess.CalledProcessError, 54 | context.check_output, ["ls"] 55 | ) 56 | -------------------------------------------------------------------------------- /test/task_register_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2017 VNG Corporation 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | from helpers import LuigiTestCase 18 | 19 | import luigi 20 | from luigi.task_register import (Register, 21 | TaskClassNotFoundException, 22 | TaskClassAmbigiousException, 23 | ) 24 | 25 | 26 | class TaskRegisterTest(LuigiTestCase): 27 | 28 | def test_externalize_taskclass(self): 29 | with self.assertRaises(TaskClassNotFoundException): 30 | Register.get_task_cls('scooby.Doo') 31 | 32 | class Task1(luigi.Task): 33 | @classmethod 34 | def get_task_family(cls): 35 | return "scooby.Doo" 36 | 37 | self.assertEqual(Task1, Register.get_task_cls('scooby.Doo')) 38 | 39 | class Task2(luigi.Task): 40 | @classmethod 41 | def get_task_family(cls): 42 | return "scooby.Doo" 43 | 44 | with self.assertRaises(TaskClassAmbigiousException): 45 | Register.get_task_cls('scooby.Doo') 46 | 47 | class Task3(luigi.Task): 48 | @classmethod 49 | def get_task_family(cls): 50 | return "scooby.Doo" 51 | 52 | # There previously was a rare bug where the third installed class could 53 | # "undo" class ambiguity. 54 | with self.assertRaises(TaskClassAmbigiousException): 55 | Register.get_task_cls('scooby.Doo') 56 | -------------------------------------------------------------------------------- /luigi/contrib/hdfs/clients.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ 19 | The implementations of the hdfs clients. 20 | """ 21 | import logging 22 | import threading 23 | 24 | from luigi.contrib.hdfs import config as hdfs_config 25 | from luigi.contrib.hdfs import webhdfs_client as hdfs_webhdfs_client 26 | from luigi.contrib.hdfs import hadoopcli_clients as hdfs_hadoopcli_clients 27 | 28 | logger = logging.getLogger('luigi-interface') 29 | 30 | _AUTOCONFIG_CLIENT = threading.local() 31 | 32 | 33 | def get_autoconfig_client(client_cache=_AUTOCONFIG_CLIENT): 34 | """ 35 | Creates the client as specified in the `luigi.cfg` configuration. 36 | """ 37 | try: 38 | return client_cache.client 39 | except AttributeError: 40 | configured_client = hdfs_config.get_configured_hdfs_client() 41 | if configured_client == "webhdfs": 42 | client_cache.client = hdfs_webhdfs_client.WebHdfsClient() 43 | elif configured_client == "hadoopcli": 44 | client_cache.client = hdfs_hadoopcli_clients.create_hadoopcli_client() 45 | else: 46 | raise Exception("Unknown hdfs client " + configured_client) 47 | return client_cache.client 48 | 49 | 50 | def _with_ac(method_name): 51 | def result(*args, **kwargs): 52 | return getattr(get_autoconfig_client(), method_name)(*args, **kwargs) 53 | return result 54 | 55 | 56 | exists = _with_ac('exists') 57 | rename = _with_ac('rename') 58 | remove = _with_ac('remove') 59 | mkdir = _with_ac('mkdir') 60 | listdir = _with_ac('listdir') 61 | -------------------------------------------------------------------------------- /examples/foo_complex.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | """ 18 | You can run this example like this: 19 | 20 | .. code:: console 21 | 22 | $ rm -rf '/tmp/bar' 23 | $ luigi --module examples.foo_complex examples.Foo --workers 2 --local-scheduler 24 | 25 | """ 26 | import time 27 | import random 28 | 29 | import luigi 30 | 31 | max_depth = 10 32 | max_total_nodes = 50 33 | current_nodes = 0 34 | 35 | 36 | class Foo(luigi.Task): 37 | task_namespace = 'examples' 38 | 39 | def run(self): 40 | print("Running Foo") 41 | 42 | def requires(self): 43 | global current_nodes 44 | for i in range(30 // max_depth): 45 | current_nodes += 1 46 | yield Bar(i) 47 | 48 | 49 | class Bar(luigi.Task): 50 | task_namespace = 'examples' 51 | 52 | num = luigi.IntParameter() 53 | 54 | def run(self): 55 | time.sleep(1) 56 | self.output().open('w').close() 57 | 58 | def requires(self): 59 | global current_nodes 60 | 61 | if max_total_nodes > current_nodes: 62 | valor = int(random.uniform(1, 30)) 63 | for i in range(valor // max_depth): 64 | current_nodes += 1 65 | yield Bar(current_nodes) 66 | 67 | def output(self): 68 | """ 69 | Returns the target output for this task. 70 | 71 | :return: the target output for this task. 
72 | :rtype: object (:py:class:`~luigi.target.Target`) 73 | """ 74 | time.sleep(1) 75 | return luigi.LocalTarget('/tmp/bar/%d' % self.num) 76 | -------------------------------------------------------------------------------- /luigi/contrib/sparkey.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import luigi 19 | 20 | 21 | class SparkeyExportTask(luigi.Task): 22 | """ 23 | A luigi task that writes to a local sparkey log file. 24 | 25 | Subclasses should implement the requires and output methods. The output 26 | must be a luigi.LocalTarget. 27 | 28 | The resulting sparkey log file will contain one entry for every line in 29 | the input, mapping from the first value to a tab-separated list of the 30 | rest of the line. 31 | 32 | To generate a simple key-value index, yield "key", "value" pairs from the input(s) to this task. 
33 | """ 34 | 35 | # the separator used to split input lines 36 | separator = '\t' 37 | 38 | def __init__(self, *args, **kwargs): 39 | super(SparkeyExportTask, self).__init__(*args, **kwargs) 40 | 41 | def run(self): 42 | self._write_sparkey_file() 43 | 44 | def _write_sparkey_file(self): 45 | import sparkey 46 | 47 | infile = self.input() 48 | outfile = self.output() 49 | if not isinstance(outfile, luigi.LocalTarget): 50 | raise TypeError("output must be a LocalTarget") 51 | 52 | # write job output to temporary sparkey file 53 | temp_output = luigi.LocalTarget(is_tmp=True) 54 | w = sparkey.LogWriter(temp_output.path) 55 | for line in infile.open('r'): 56 | k, v = line.strip().split(self.separator, 1) 57 | w[k] = v 58 | w.close() 59 | 60 | # move finished sparkey file to final destination 61 | temp_output.move(outfile.path) 62 | -------------------------------------------------------------------------------- /luigi/static/visualiser/css/tipsy.css: -------------------------------------------------------------------------------- 1 | .tipsy { font-size: 10px; position: absolute; padding: 5px; z-index: 100000; } 2 | .tipsy-inner { background-color: #000; color: #FFF; max-width: 200px; padding: 5px 8px 4px 8px; text-align: center; } 3 | 4 | /* Rounded corners */ 5 | .tipsy-inner { border-radius: 3px; -moz-border-radius: 3px; -webkit-border-radius: 3px; } 6 | 7 | /* Uncomment for shadow */ 8 | .tipsy-inner { box-shadow: 0 0 5px #000000; -webkit-box-shadow: 0 0 5px #000000; -moz-box-shadow: 0 0 5px #000000; } 9 | 10 | .tipsy-arrow { position: absolute; width: 0; height: 0; line-height: 0; border: 5px dashed #000; } 11 | 12 | /* Rules to colour arrows */ 13 | .tipsy-arrow-n { border-bottom-color: #000; } 14 | .tipsy-arrow-s { border-top-color: #000; } 15 | .tipsy-arrow-e { border-left-color: #000; } 16 | .tipsy-arrow-w { border-right-color: #000; } 17 | 18 | .tipsy-n .tipsy-arrow { top: 0px; left: 50%; margin-left: -5px; border-bottom-style: solid; border-top: none; 
border-left-color: transparent; border-right-color: transparent; } 19 | .tipsy-nw .tipsy-arrow { top: 0; left: 10px; border-bottom-style: solid; border-top: none; border-left-color: transparent; border-right-color: transparent;} 20 | .tipsy-ne .tipsy-arrow { top: 0; right: 10px; border-bottom-style: solid; border-top: none; border-left-color: transparent; border-right-color: transparent;} 21 | .tipsy-s .tipsy-arrow { bottom: 0; left: 50%; margin-left: -5px; border-top-style: solid; border-bottom: none; border-left-color: transparent; border-right-color: transparent; } 22 | .tipsy-sw .tipsy-arrow { bottom: 0; left: 10px; border-top-style: solid; border-bottom: none; border-left-color: transparent; border-right-color: transparent; } 23 | .tipsy-se .tipsy-arrow { bottom: 0; right: 10px; border-top-style: solid; border-bottom: none; border-left-color: transparent; border-right-color: transparent; } 24 | .tipsy-e .tipsy-arrow { right: 0; top: 50%; margin-top: -5px; border-left-style: solid; border-right: none; border-top-color: transparent; border-bottom-color: transparent; } 25 | .tipsy-w .tipsy-arrow { left: 0; top: 50%; margin-top: -5px; border-right-style: solid; border-left: none; border-top-color: transparent; border-bottom-color: transparent; } 26 | -------------------------------------------------------------------------------- /examples/kubernetes.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2015 Outlier Bio, LLC 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | """ 18 | Example Kubernetes Job Task. 19 | 20 | Requires: 21 | 22 | - pykube: ``pip install pykube-ng`` 23 | - A local minikube cluster up and running: http://kubernetes.io/docs/getting-started-guides/minikube/ 24 | 25 | **WARNING**: For Python versions < 3.5 the kubeconfig file must point to a Kubernetes API 26 | hostname, and NOT to an IP address. 27 | 28 | You can run this code example like this: 29 | 30 | .. code:: console 31 | $ luigi --module examples.kubernetes_job PerlPi --local-scheduler 32 | 33 | Running this code will create a pi-luigi-uuid kubernetes job within the cluster 34 | pointed to by the default context in "~/.kube/config". 35 | 36 | If running within a kubernetes cluster, set auth_method = "service-account" to 37 | access the local cluster.
38 | """ 39 | 40 | # import os 41 | # import luigi 42 | from luigi.contrib.kubernetes import KubernetesJobTask 43 | 44 | 45 | class PerlPi(KubernetesJobTask): 46 | 47 | name = "pi" 48 | max_retrials = 3 49 | spec_schema = { 50 | "containers": [{ 51 | "name": "pi", 52 | "image": "perl", 53 | "command": ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"] 54 | }] 55 | } 56 | 57 | # defining the two functions below allows for dependency checking, 58 | # but isn't a requirement 59 | # def signal_complete(self): 60 | # with self.output().open('w') as output: 61 | # output.write('') 62 | # 63 | # def output(self): 64 | # target = os.path.join("/tmp", "PerlPi") 65 | # return luigi.LocalTarget(target) 66 | -------------------------------------------------------------------------------- /test/contrib/external_daily_snapshot_test.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2013 Spotify AB 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); you may not 4 | # use this file except in compliance with the License. You may obtain a copy of 5 | # the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 11 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | # License for the specific language governing permissions and limitations under 13 | # the License. 
14 | 15 | import unittest 16 | import luigi 17 | from luigi.contrib.external_daily_snapshot import ExternalDailySnapshot 18 | from luigi.mock import MockTarget 19 | import datetime 20 | 21 | 22 | class DataDump(ExternalDailySnapshot): 23 | param = luigi.Parameter() 24 | a = luigi.Parameter(default='zebra') 25 | aa = luigi.Parameter(default='Congo') 26 | 27 | def output(self): 28 | return MockTarget('data-%s-%s-%s-%s' % (self.param, self.a, self.aa, self.date)) 29 | 30 | 31 | class ExternalDailySnapshotTest(unittest.TestCase): 32 | def test_latest(self): 33 | MockTarget('data-xyz-zebra-Congo-2012-01-01').open('w').close() 34 | d = DataDump.latest(date=datetime.date(2012, 1, 10), param='xyz') 35 | self.assertEqual(d.date, datetime.date(2012, 1, 1)) 36 | 37 | def test_latest_not_exists(self): 38 | MockTarget('data-abc-zebra-Congo-2012-01-01').open('w').close() 39 | d = DataDump.latest(date=datetime.date(2012, 1, 11), param='abc', lookback=5) 40 | self.assertEqual(d.date, datetime.date(2012, 1, 7)) 41 | 42 | def test_deterministic(self): 43 | MockTarget('data-pqr-zebra-Congo-2012-01-01').open('w').close() 44 | d = DataDump.latest(date=datetime.date(2012, 1, 10), param='pqr', a='zebra', aa='Congo') 45 | self.assertEqual(d.date, datetime.date(2012, 1, 1)) 46 | 47 | MockTarget('data-pqr-zebra-Congo-2012-01-05').open('w').close() 48 | d = DataDump.latest(date=datetime.date(2012, 1, 10), param='pqr', aa='Congo', a='zebra') 49 | self.assertEqual(d.date, datetime.date(2012, 1, 1)) # Should still be the same 50 | -------------------------------------------------------------------------------- /luigi/task_history.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | """ 18 | Abstract class for task history. 19 | Currently the only subclass is :py:class:`~luigi.db_task_history.DbTaskHistory`. 20 | """ 21 | 22 | import abc 23 | import logging 24 | 25 | 26 | logger = logging.getLogger('luigi-interface') 27 | 28 | 29 | class StoredTask: 30 | """ 31 | Interface for methods on TaskHistory 32 | """ 33 | 34 | # TODO : do we need this task as distinct from luigi.scheduler.Task? 35 | # this only records host and record_id in addition to task parameters. 36 | 37 | def __init__(self, task, status, host=None): 38 | self._task = task 39 | self.status = status 40 | self.record_id = None 41 | self.host = host 42 | 43 | @property 44 | def task_family(self): 45 | return self._task.family 46 | 47 | @property 48 | def parameters(self): 49 | return self._task.params 50 | 51 | 52 | class TaskHistory(metaclass=abc.ABCMeta): 53 | """ 54 | Abstract Base Class for updating the run history of a task 55 | """ 56 | 57 | @abc.abstractmethod 58 | def task_scheduled(self, task): 59 | pass 60 | 61 | @abc.abstractmethod 62 | def task_finished(self, task, successful): 63 | pass 64 | 65 | @abc.abstractmethod 66 | def task_started(self, task, worker_host): 67 | pass 68 | 69 | # TODO(erikbern): should web method (find_latest_runs etc) be abstract? 
70 | 71 | 72 | class NopHistory(TaskHistory): 73 | 74 | def task_scheduled(self, task): 75 | pass 76 | 77 | def task_finished(self, task, successful): 78 | pass 79 | 80 | def task_started(self, task, worker_host): 81 | pass 82 | -------------------------------------------------------------------------------- /test/contrib/scalding_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | import luigi 19 | from luigi.contrib import scalding 20 | 21 | import mock 22 | import os 23 | import random 24 | import shutil 25 | import tempfile 26 | import unittest 27 | 28 | import pytest 29 | 30 | 31 | class MyScaldingTask(scalding.ScaldingJobTask): 32 | scala_source = luigi.Parameter() 33 | 34 | def source(self): 35 | return self.scala_source 36 | 37 | 38 | @pytest.mark.contrib 39 | class ScaldingTest(unittest.TestCase): 40 | def setUp(self): 41 | self.scalding_home = os.path.join(tempfile.gettempdir(), 'scalding-%09d' % random.randint(0, 999999999)) 42 | os.mkdir(self.scalding_home) 43 | self.lib_dir = os.path.join(self.scalding_home, 'lib') 44 | os.mkdir(self.lib_dir) 45 | os.mkdir(os.path.join(self.scalding_home, 'provided')) 46 | os.mkdir(os.path.join(self.scalding_home, 'libjars')) 47 | f = open(os.path.join(self.lib_dir, 'scalding-core-foo'), 'w') 48 | f.close() 49 | 50 | self.scala_source = os.path.join(self.scalding_home, 'my_source.scala') 51 | f = open(self.scala_source, 'w') 52 | f.write('class foo extends Job') 53 | f.close() 54 | 55 | os.environ['SCALDING_HOME'] = self.scalding_home 56 | 57 | def tearDown(self): 58 | shutil.rmtree(self.scalding_home) 59 | 60 | @mock.patch('subprocess.check_call') 61 | @mock.patch('luigi.contrib.hadoop.run_and_track_hadoop_job') 62 | def test_scalding(self, check_call, track_job): 63 | success = luigi.run(['MyScaldingTask', '--scala-source', self.scala_source, '--local-scheduler', '--no-lock']) 64 | self.assertTrue(success) 65 | # TODO: check more stuff 66 | -------------------------------------------------------------------------------- /test/contrib/_webhdfs_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import os 19 | from helpers import unittest 20 | 21 | from luigi.contrib import webhdfs 22 | 23 | import pytest 24 | 25 | 26 | @pytest.mark.apache 27 | class TestWebHdfsTarget(unittest.TestCase): 28 | 29 | ''' 30 | This test requires a running Hadoop cluster with WebHdfs enabled. 31 | It also requires the luigi.cfg file to have a `hdfs` section 32 | with the namenode_host, namenode_port and user settings. 33 | ''' 34 | 35 | def setUp(self): 36 | self.testDir = "/tmp/luigi-test" 37 | self.path = os.path.join(self.testDir, 'out.txt') 38 | self.client = webhdfs.WebHdfsClient() 39 | self.target = webhdfs.WebHdfsTarget(self.path) 40 | 41 | def tearDown(self): 42 | if self.client.exists(self.testDir): 43 | self.client.remove(self.testDir, recursive=True) 44 | 45 | def test_write(self): 46 | self.assertFalse(self.client.exists(self.path)) 47 | output = self.target.open('w') 48 | output.write('this is line 1\n') 49 | output.write('this is line #2\n') 50 | output.close() 51 | self.assertTrue(self.client.exists(self.path)) 52 | 53 | def test_read(self): 54 | self.test_write() 55 | input_ = self.target.open('r') 56 | all_test = 'this is line 1\nthis is line #2\n' 57 | self.assertEqual(all_test, input_.read()) 58 | input_.close() 59 | 60 | def test_read_lines(self): 61 | self.test_write() 62 | input_ = self.target.open('r') 63 | lines = list(input_.readlines()) 64 | self.assertEqual(lines[0], 'this is line 1') 65 | self.assertEqual(lines[1], 'this is line #2') 66 | input_.close() 67 | 
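A note on `test_read_lines` above: its assertions compare each line without a trailing newline. With an ordinary Python file object, `readlines()` keeps the line terminator, so the test appears to rely on the WebHDFS reader yielding lines with the delimiter already removed. That behaviour is inferred from the assertions only and is not shown in this file. The sketch below uses just the standard library (`io.StringIO` is a stand-in for the target, not part of the luigi API) to show the difference:

```python
import io

# Same payload that test_write() stores in the target.
payload = 'this is line 1\nthis is line #2\n'

# A plain file-like object keeps the '\n' terminator on every line.
raw_lines = io.StringIO(payload).readlines()

# Stripping the terminator gives the form asserted in test_read_lines.
stripped_lines = [line.rstrip('\n') for line in raw_lines]

print(raw_lines)
print(stripped_lines)
```

If a similar test were written against a local file or an in-memory buffer, the expected values would likely need the trailing `\n`, or the lines would need to be stripped first.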
-------------------------------------------------------------------------------- /doc/design_and_limitations.rst: -------------------------------------------------------------------------------- 1 | Design and limitations 2 | ---------------------- 3 | 4 | Luigi is the successor to a couple of attempts that we weren't fully happy with. 5 | We learned a lot from our mistakes and some design decisions include: 6 | 7 | - Straightforward command-line integration. 8 | - As little boilerplate as possible. 9 | - Focus on job scheduling and dependency resolution, not a particular platform. 10 | In particular, this means no limitation to Hadoop. 11 | Though Hadoop/HDFS support is built-in and is easy to use, 12 | this is just one of many types of things you can run. 13 | - A file system abstraction where code doesn't have to care about where files are located. 14 | - Atomic file system operations through this abstraction. 15 | If a task crashes it won't lead to a broken state. 16 | - The dependencies are decentralized. 17 | No big config file in XML. 18 | Each task just specifies which inputs it needs and cross-module dependencies are trivial. 19 | - A web server that renders the dependency graph and does locking, etc. for free. 20 | - Trivial to extend with new file systems, file formats, and job types. 21 | You can easily write jobs that insert a Tokyo Cabinet into Cassandra. 22 | Adding support for new systems is generally not very hard. 23 | (Feel free to send us a patch when you're done!) 24 | - Date algebra included. 25 | - Lots of unit tests of the most basic stuff. 26 | 27 | It wouldn't be fair not to mention some limitations with the current design: 28 | 29 | - Its focus is on batch processing so 30 | it's probably less useful for near real-time pipelines or continuously running processes. 31 | - The assumption is that each task is a sizable chunk of work.
32 | While you can probably schedule a few thousand jobs, 33 | it's not meant to scale beyond tens of thousands. 34 | - Luigi does not support distribution of execution. 35 | When you have workers running thousands of jobs daily, this starts to matter, 36 | because the worker nodes get overloaded. 37 | There are some ways to mitigate this (trigger from many nodes, use resources), 38 | but none of them are ideal. 39 | - Luigi does not come with built-in triggering, and you still need to rely on something like 40 | crontab to trigger workflows periodically. 41 | 42 | Also, it should be mentioned that Luigi is named after the world's second most famous plumber. 43 | -------------------------------------------------------------------------------- /luigi/freezing.py: -------------------------------------------------------------------------------- 1 | """Internal-only module with immutable data structures. 2 | 3 | Please, do not use it outside of Luigi codebase itself. 4 | """ 5 | 6 | 7 | from collections import OrderedDict 8 | try: 9 | from collections.abc import Mapping 10 | except ImportError: 11 | from collections import Mapping # type: ignore 12 | import operator 13 | import functools 14 | 15 | 16 | class FrozenOrderedDict(Mapping): 17 | """ 18 | It is an immutable wrapper around ordered dictionaries that implements the complete :py:class:`collections.Mapping` 19 | interface. It can be used as a drop-in replacement for dictionaries where immutability and ordering are desired. 
20 | """ 21 | 22 | def __init__(self, *args, **kwargs): 23 | self.__dict = OrderedDict(*args, **kwargs) 24 | self.__hash = None 25 | 26 | def __getitem__(self, key): 27 | return self.__dict[key] 28 | 29 | def __iter__(self): 30 | return iter(self.__dict) 31 | 32 | def __len__(self): 33 | return len(self.__dict) 34 | 35 | def __repr__(self): 36 | # We should use short representation for beautiful console output 37 | return repr(dict(self.__dict)) 38 | 39 | def __hash__(self): 40 | if self.__hash is None: 41 | hashes = map(hash, self.items()) 42 | self.__hash = functools.reduce(operator.xor, hashes, 0) 43 | 44 | return self.__hash 45 | 46 | def get_wrapped(self): 47 | return self.__dict 48 | 49 | 50 | def recursively_freeze(value): 51 | """ 52 | Recursively walks ``Mapping``s and ``list``s and converts them to ``FrozenOrderedDict`` and ``tuples``, respectively. 53 | """ 54 | if isinstance(value, Mapping): 55 | return FrozenOrderedDict(((k, recursively_freeze(v)) for k, v in value.items())) 56 | elif isinstance(value, list) or isinstance(value, tuple): 57 | return tuple(recursively_freeze(v) for v in value) 58 | return value 59 | 60 | 61 | def recursively_unfreeze(value): 62 | """ 63 | Recursively walks ``FrozenOrderedDict``s and ``tuple``s and converts them to ``dict`` and ``list``, respectively. 64 | """ 65 | if isinstance(value, Mapping): 66 | return dict(((k, recursively_unfreeze(v)) for k, v in value.items())) 67 | elif isinstance(value, list) or isinstance(value, tuple): 68 | return list(recursively_unfreeze(v) for v in value) 69 | return value 70 | -------------------------------------------------------------------------------- /test/contrib/redis_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 
7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | # pylint: disable=F0401 19 | from time import sleep 20 | from helpers import unittest 21 | 22 | import pytest 23 | 24 | try: 25 | import redis 26 | except ImportError: 27 | raise unittest.SkipTest('Unable to load redis module') 28 | 29 | from luigi.contrib.redis_store import RedisTarget 30 | 31 | HOST = 'localhost' 32 | PORT = 6379 33 | DB = 15 34 | PASSWORD = None 35 | SOCKET_TIMEOUT = None 36 | MARKER_PREFIX = 'luigi_test' 37 | EXPIRE = 5 38 | 39 | 40 | @pytest.mark.contrib 41 | class RedisTargetTest(unittest.TestCase): 42 | 43 | """ Test touch, exists and target expiration""" 44 | 45 | def test_touch_and_exists(self): 46 | target = RedisTarget(HOST, PORT, DB, 'update_id', PASSWORD) 47 | target.marker_prefix = MARKER_PREFIX 48 | flush() 49 | self.assertFalse(target.exists(), 50 | 'Target should not exist before touching it') 51 | target.touch() 52 | self.assertTrue(target.exists(), 53 | 'Target should exist after touching it') 54 | flush() 55 | 56 | def test_expiration(self): 57 | target = RedisTarget( 58 | HOST, PORT, DB, 'update_id', PASSWORD, None, EXPIRE) 59 | target.marker_prefix = MARKER_PREFIX 60 | flush() 61 | target.touch() 62 | self.assertTrue(target.exists(), 63 | 'Target should exist after touching it and before expiring') 64 | sleep(EXPIRE) 65 | self.assertFalse(target.exists(), 66 | 'Target should not exist after expiring') 67 | flush() 68 | 69 | 70 | def flush(): 71 | """ Flush test DB""" 72 | redis_client = redis.StrictRedis( 73 | host=HOST, port=PORT, db=DB, socket_timeout=SOCKET_TIMEOUT) 
74 | redis_client.flushdb() 75 | -------------------------------------------------------------------------------- /test/choice_parameter_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | 22 | 23 | class ChoiceParameterTest(unittest.TestCase): 24 | def test_parse_str(self): 25 | d = luigi.ChoiceParameter(choices=["1", "2", "3"]) 26 | self.assertEqual("3", d.parse("3")) 27 | 28 | def test_parse_int(self): 29 | d = luigi.ChoiceParameter(var_type=int, choices=[1, 2, 3]) 30 | self.assertEqual(3, d.parse(3)) 31 | 32 | def test_parse_int_conv(self): 33 | d = luigi.ChoiceParameter(var_type=int, choices=[1, 2, 3]) 34 | self.assertEqual(3, d.parse("3")) 35 | 36 | def test_invalid_choice(self): 37 | d = luigi.ChoiceParameter(choices=["1", "2", "3"]) 38 | self.assertRaises(ValueError, lambda: d.parse("xyz")) 39 | 40 | def test_invalid_choice_type(self): 41 | self.assertRaises(AssertionError, lambda: luigi.ChoiceParameter(var_type=int, choices=[1, 2, "3"])) 42 | 43 | def test_choices_parameter_exception(self): 44 | self.assertRaises(luigi.parameter.ParameterException, lambda: luigi.ChoiceParameter(var_type=int)) 45 | 46 | def test_hash_str(self): 47 | class Foo(luigi.Task): 48 | args = 
luigi.ChoiceParameter(var_type=str, choices=["1", "2", "3"]) 49 | p = luigi.ChoiceParameter(var_type=str, choices=["3", "2", "1"]) 50 | self.assertEqual(hash(Foo(args="3").args), hash(p.parse("3"))) 51 | 52 | def test_serialize_parse(self): 53 | a = luigi.ChoiceParameter(var_type=str, choices=["1", "2", "3"]) 54 | b = "3" 55 | self.assertEqual(b, a.parse(a.serialize(b))) 56 | 57 | def test_invalid_choice_task(self): 58 | class Foo(luigi.Task): 59 | args = luigi.ChoiceParameter(var_type=str, choices=["1", "2", "3"]) 60 | self.assertRaises(ValueError, lambda: Foo(args="4")) 61 | -------------------------------------------------------------------------------- /test/import_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import os 19 | 20 | from helpers import unittest 21 | 22 | 23 | class ImportTest(unittest.TestCase): 24 | 25 | def import_test(self): 26 | """Test that all module can be imported 27 | """ 28 | 29 | luigidir = os.path.join( 30 | os.path.dirname(os.path.abspath(__file__)), 31 | '..' 
32 | ) 33 | 34 | packagedir = os.path.join(luigidir, 'luigi') 35 | 36 | for root, subdirs, files in os.walk(packagedir): 37 | package = os.path.relpath(root, luigidir).replace('/', '.') 38 | 39 | if '__init__.py' in files: 40 | __import__(package) 41 | 42 | for f in files: 43 | if f.endswith('.py') and not f.startswith('_'): 44 | __import__(package + '.' + f[:-3]) 45 | 46 | def import_luigi_test(self): 47 | """ 48 | Test that the top luigi package can be imported and contains the usual suspects. 49 | """ 50 | import luigi 51 | 52 | # These should exist (if not, this will cause AttributeErrors) 53 | expected = [ 54 | luigi.Event, 55 | luigi.Config, 56 | luigi.Task, luigi.ExternalTask, luigi.WrapperTask, 57 | luigi.Target, luigi.LocalTarget, 58 | luigi.namespace, 59 | luigi.RemoteScheduler, 60 | luigi.RPCError, 61 | luigi.run, luigi.build, 62 | luigi.Parameter, 63 | luigi.DateHourParameter, luigi.DateMinuteParameter, luigi.DateSecondParameter, luigi.DateParameter, 64 | luigi.MonthParameter, luigi.YearParameter, 65 | luigi.DateIntervalParameter, luigi.TimeDeltaParameter, 66 | luigi.IntParameter, luigi.FloatParameter, 67 | luigi.BoolParameter, 68 | ] 69 | self.assertGreater(len(expected), 0) 70 | -------------------------------------------------------------------------------- /CONTRIBUTING.rst: -------------------------------------------------------------------------------- 1 | Code of conduct 2 | --------------- 3 | 4 | This project adheres to the `Open Code of Conduct 5 | `_. By 6 | participating, you are expected to honor this code. 7 | 8 | Running the tests 9 | ----------------- 10 | 11 | 12 | We are always happy to receive Pull Requests. When you open a PR, it will 13 | automatically build on Travis. So you're not strictly required to test the 14 | patch locally before submitting it. 15 | 16 | If you do want to run the tests locally you'll need to run the commands below 17 | .. 
code:: bash 18 | curl -LsSf https://astral.sh/uv/install.sh | sh 19 | uv tool install tox --with tox-uv 20 | 21 | You will need a ``tox --version`` of at least 4.22. 22 | 23 | .. code:: bash 24 | 25 | # These commands are pretty fast and will tell you if you've 26 | # broken something major: 27 | tox run -e flake8 28 | tox run -e py38-core 29 | 30 | # You can also test particular files for even faster iterations 31 | tox run -e py38-core -- test/rpc_test.py 32 | 33 | # The visualiser tests require phantomjs to be installed on your path 34 | tox run -e visualiser 35 | 36 | # And some of the others involve downloading and running Hadoop: 37 | tox run -e py38-cdh 38 | tox run -e py39-hdp 39 | 40 | Here ``flake8`` is the lint check and ``py38`` is Python 3.8. 41 | ``core`` covers tests that do not require external components, and ``cdh`` and 42 | ``hdp`` are two different Hadoop distributions. For most local development it's 43 | usually enough to run the lint check and one Python version for ``core`` 44 | and let Travis run for the whole matrix. 45 | 46 | For ``cdh`` and ``hdp``, tox will download the Hadoop distribution for you. You 47 | do, however, need Java installed and the ``JAVA_HOME`` environment variable 48 | set. 49 | 50 | For more details, check out the ``.github/workflows/pythonbuild.yml`` and ``tox.ini`` files. 51 | 52 | Writing documentation 53 | ===================== 54 | 55 | All documentation for Luigi is written in `reStructuredText/Sphinx markup 56 | `_ and lives both in the 57 | code as docstrings and in `.rst` files. Pull requests should come with documentation 58 | when appropriate. 59 | 60 | You can verify that your documentation code compiles by running 61 | 62 | .. code:: bash 63 | 64 | tox run -e docs 65 | 66 | After that, you can check how it renders locally with your browser 67 | 68 | .. 
code:: bash 69 | 70 | firefox doc/_build/html/index.html 71 | -------------------------------------------------------------------------------- /test/fib_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | import luigi.interface 22 | from luigi.mock import MockTarget 23 | 24 | # Calculates Fibonacci numbers :) 25 | 26 | 27 | class Fib(luigi.Task): 28 | n = luigi.IntParameter(default=100) 29 | 30 | def requires(self): 31 | if self.n >= 2: 32 | return [Fib(self.n - 1), Fib(self.n - 2)] 33 | else: 34 | return [] 35 | 36 | def output(self): 37 | return MockTarget('/tmp/fib_%d' % self.n) 38 | 39 | def run(self): 40 | if self.n == 0: 41 | s = 0 42 | elif self.n == 1: 43 | s = 1 44 | else: 45 | s = 0 46 | for input in self.input(): 47 | for line in input.open('r'): 48 | s += int(line.strip()) 49 | 50 | f = self.output().open('w') 51 | f.write('%d\n' % s) 52 | f.close() 53 | 54 | 55 | class FibTestBase(unittest.TestCase): 56 | 57 | def setUp(self): 58 | MockTarget.fs.clear() 59 | 60 | 61 | class FibTest(FibTestBase): 62 | 63 | def test_invoke(self): 64 | luigi.build([Fib(100)], local_scheduler=True) 65 | self.assertEqual(MockTarget.fs.get_data('/tmp/fib_10'), b'55\n') 66 | 
self.assertEqual(MockTarget.fs.get_data('/tmp/fib_100'), b'354224848179261915075\n') 67 | 68 | def test_cmdline(self): 69 | luigi.run(['--local-scheduler', '--no-lock', 'Fib', '--n', '100']) 70 | 71 | self.assertEqual(MockTarget.fs.get_data('/tmp/fib_10'), b'55\n') 72 | self.assertEqual(MockTarget.fs.get_data('/tmp/fib_100'), b'354224848179261915075\n') 73 | 74 | def test_build_internal(self): 75 | luigi.build([Fib(100)], local_scheduler=True) 76 | 77 | self.assertEqual(MockTarget.fs.get_data('/tmp/fib_10'), b'55\n') 78 | self.assertEqual(MockTarget.fs.get_data('/tmp/fib_100'), b'354224848179261915075\n') 79 | -------------------------------------------------------------------------------- /luigi/contrib/lsf_runner.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | """ 4 | .. Copyright 2012-2015 Spotify AB 5 | Copyright 2018 6 | Copyright 2018 EMBL-European Bioinformatics Institute 7 | 8 | Licensed under the Apache License, Version 2.0 (the "License"); 9 | you may not use this file except in compliance with the License. 10 | You may obtain a copy of the License at 11 | 12 | http://www.apache.org/licenses/LICENSE-2.0 13 | 14 | Unless required by applicable law or agreed to in writing, software 15 | distributed under the License is distributed on an "AS IS" BASIS, 16 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 17 | See the License for the specific language governing permissions and 18 | limitations under the License. 
19 | """ 20 | 21 | import os 22 | import sys 23 | try: 24 | # Dill is used for handling pickling and unpickling if there is a difference 25 | # in server setups between the LSF submission node and the nodes in the 26 | # cluster 27 | import dill as pickle 28 | except ImportError: 29 | import pickle 30 | import logging 31 | from luigi.safe_extractor import SafeExtractor 32 | 33 | 34 | def do_work_on_compute_node(work_dir): 35 | # Extract the necessary dependencies 36 | extract_packages_archive(work_dir) 37 | 38 | # Open up the pickle file with the work to be done (pickle data is binary) 39 | os.chdir(work_dir) 40 | with open("job-instance.pickle", "rb") as pickle_file_handle: 41 | job = pickle.load(pickle_file_handle) 42 | 43 | # Do the work contained 44 | job.work() 45 | 46 | 47 | def extract_packages_archive(work_dir): 48 | package_file = os.path.join(work_dir, "packages.tar") 49 | if not os.path.exists(package_file): 50 | return 51 | 52 | curdir = os.path.abspath(os.curdir) 53 | 54 | os.chdir(work_dir) 55 | extractor = SafeExtractor(work_dir) 56 | extractor.safe_extract(package_file) 57 | if '' not in sys.path: 58 | sys.path.insert(0, '') 59 | 60 | os.chdir(curdir) 61 | 62 | 63 | def main(args=sys.argv): 64 | """Run the work() method from the class instance in the file "job-instance.pickle". 65 | """ 66 | try: 67 | # Set up logging. 
68 | logging.basicConfig(level=logging.WARN) 69 | work_dir = args[1] 70 | assert os.path.exists(work_dir), "First argument to lsf_runner.py must be a directory that exists" 71 | do_work_on_compute_node(work_dir) 72 | except Exception as exc: 73 | # Dump encoded data that we will try to fetch using mechanize 74 | print(exc) 75 | raise 76 | 77 | 78 | if __name__ == '__main__': 79 | main() 80 | -------------------------------------------------------------------------------- /test/task_bulk_complete_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2016 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import unittest 19 | from luigi import Task 20 | from luigi import Parameter 21 | from luigi.task import MixinNaiveBulkComplete 22 | 23 | COMPLETE_TASKS = ["A", "B", "C"] 24 | 25 | 26 | class MockTask(MixinNaiveBulkComplete, Task): 27 | param_a = Parameter() 28 | param_b = Parameter(default="Not Mandatory") 29 | 30 | def complete(self): 31 | return self.param_a in COMPLETE_TASKS 32 | 33 | 34 | class MixinNaiveBulkCompleteTest(unittest.TestCase): 35 | """ 36 | Test that the MixinNaiveBulkComplete can handle 37 | input as 38 | - iterable of parameters (for single param tasks) 39 | - iterable of parameter tuples (for multi param tasks) 40 | - iterable of parameter dicts (for multi param tasks) 41 | """ 42 | def test_single_arg_list(self): 43 | single_arg_list = ["A", "B", "x"] 44 | expected_single_arg_list = {p for p in single_arg_list if p in COMPLETE_TASKS} 45 | self.assertEqual( 46 | expected_single_arg_list, 47 | set(MockTask.bulk_complete(single_arg_list)) 48 | ) 49 | 50 | def test_multiple_arg_tuple(self): 51 | multiple_arg_tuple = (("A", "1"), ("B", "2"), ("X", "3"), ("C", "2")) 52 | expected_multiple_arg_tuple = {p for p in multiple_arg_tuple if p[0] in COMPLETE_TASKS} 53 | self.assertEqual( 54 | expected_multiple_arg_tuple, 55 | set(MockTask.bulk_complete(multiple_arg_tuple)) 56 | ) 57 | 58 | def test_multiple_arg_dict(self): 59 | multiple_arg_dict = ( 60 | {"param_a": "X", "param_b": "1"}, 61 | {"param_a": "C", "param_b": "1"} 62 | ) 63 | expected_multiple_arg_dict = ( 64 | [p for p in multiple_arg_dict if p["param_a"] in COMPLETE_TASKS] 65 | ) 66 | self.assertEqual( 67 | expected_multiple_arg_dict, 68 | MockTask.bulk_complete(multiple_arg_dict) 69 | ) 70 | -------------------------------------------------------------------------------- /test/clone_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed 
under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | import luigi.notifications 22 | 23 | luigi.notifications.DEBUG = True 24 | 25 | 26 | class LinearSum(luigi.Task): 27 | lo = luigi.IntParameter() 28 | hi = luigi.IntParameter() 29 | 30 | def requires(self): 31 | if self.hi > self.lo: 32 | return self.clone(hi=self.hi - 1) 33 | 34 | def run(self): 35 | if self.hi > self.lo: 36 | self.s = self.requires().s + self.f(self.hi - 1) 37 | else: 38 | self.s = 0 39 | self.complete = lambda: True # workaround since we don't write any output 40 | 41 | def complete(self): 42 | return False 43 | 44 | def f(self, x): 45 | return x 46 | 47 | 48 | class PowerSum(LinearSum): 49 | p = luigi.IntParameter() 50 | 51 | def f(self, x): 52 | return x ** self.p 53 | 54 | 55 | class CloneTest(unittest.TestCase): 56 | 57 | def test_args(self): 58 | t = LinearSum(lo=42, hi=45) 59 | self.assertEqual(t.param_args, (42, 45)) 60 | self.assertEqual(t.param_kwargs, {'lo': 42, 'hi': 45}) 61 | 62 | def test_recursion(self): 63 | t = LinearSum(lo=42, hi=45) 64 | luigi.build([t], local_scheduler=True) 65 | self.assertEqual(t.s, 42 + 43 + 44) 66 | 67 | def test_inheritance(self): 68 | t = PowerSum(lo=42, hi=45, p=2) 69 | luigi.build([t], local_scheduler=True) 70 | self.assertEqual(t.s, 42 ** 2 + 43 ** 2 + 44 ** 2) 71 | 72 | def test_inheritance_from_non_parameter(self): 73 | """ 74 | Cloning can pull 
non-source-parameters from source to target parameter. 75 | """ 76 | 77 | class SubTask(luigi.Task): 78 | lo = 1 79 | 80 | @property 81 | def hi(self): 82 | return 2 83 | 84 | t1 = SubTask() 85 | t2 = t1.clone(cls=LinearSum) 86 | self.assertEqual(t2.lo, 1) 87 | self.assertEqual(t2.hi, 2) 88 | -------------------------------------------------------------------------------- /luigi/contrib/external_daily_snapshot.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2017 Spotify AB. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, 12 | # software distributed under the License is distributed on an 13 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | # KIND, either express or implied. See the License for the 15 | # specific language governing permissions and limitations 16 | # under the License. 17 | # 18 | import datetime 19 | import logging 20 | 21 | import luigi 22 | 23 | logger = logging.getLogger('luigi-interface') 24 | 25 | 26 | class ExternalDailySnapshot(luigi.ExternalTask): 27 | """ 28 | Abstract class containing a helper method to fetch the latest snapshot. 29 | 30 | Example:: 31 | 32 | class MyTask(luigi.Task): 33 | def requires(self): 34 | return PlaylistContent.latest() 35 | 36 | All tasks subclassing :class:`ExternalDailySnapshot` must have a :class:`luigi.DateParameter` 37 | named ``date``. 38 | 39 | You can also provide additional parameters to the class and also configure 40 | lookback size. 
41 | 42 | Example:: 43 | 44 | ServiceLogs.latest(service="radio", lookback=21) 45 | 46 | """ 47 | date = luigi.DateParameter() 48 | __cache = [] 49 | 50 | @classmethod 51 | def latest(cls, *args, **kwargs): 52 | """This is cached so that requires() is deterministic.""" 53 | date = kwargs.pop("date", datetime.date.today()) 54 | lookback = kwargs.pop("lookback", 14) 55 | # hashing kwargs deterministically would be hard. Let's just lookup by equality 56 | key = (cls, args, kwargs, lookback, date) 57 | for k, v in ExternalDailySnapshot.__cache: 58 | if k == key: 59 | return v 60 | val = cls.__latest(date, lookback, args, kwargs) 61 | ExternalDailySnapshot.__cache.append((key, val)) 62 | return val 63 | 64 | @classmethod 65 | def __latest(cls, date, lookback, args, kwargs): 66 | assert lookback > 0 67 | t = None 68 | for i in range(lookback): 69 | d = date - datetime.timedelta(i) 70 | t = cls(date=d, *args, **kwargs) 71 | if t.complete(): 72 | return t 73 | logger.debug("Could not find last dump for %s (looked back %d days)", 74 | cls.__name__, lookback) 75 | return t 76 | -------------------------------------------------------------------------------- /.github/workflows/codeql.yml: -------------------------------------------------------------------------------- 1 | name: "CodeQL" 2 | 3 | on: 4 | push: 5 | branches: [ 'master' ] 6 | pull_request: 7 | # The branches below must be a subset of the branches above 8 | branches: [ 'master' ] 9 | schedule: 10 | - cron: '29 18 * * 0' 11 | 12 | jobs: 13 | analyze: 14 | name: Analyze 15 | runs-on: ubuntu-latest 16 | permissions: 17 | actions: read 18 | contents: read 19 | security-events: write 20 | 21 | strategy: 22 | fail-fast: false 23 | matrix: 24 | language: [ 'python', 'javascript' ] 25 | # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ] 26 | # Use only 'java' to analyze code written in Java, Kotlin or both 27 | # Use only 'javascript' to analyze code written in JavaScript, TypeScript or 
both 28 | # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support 29 | 30 | steps: 31 | - name: Checkout repository 32 | uses: actions/checkout@v4 33 | 34 | # Initializes the CodeQL tools for scanning. 35 | - name: Initialize CodeQL 36 | uses: github/codeql-action/init@v2 37 | with: 38 | languages: ${{ matrix.language }} 39 | # If you wish to specify custom queries, you can do so here or in a config file. 40 | # By default, queries listed here will override any specified in a config file. 41 | # Prefix the list here with "+" to use these queries and those in the config file. 42 | 43 | # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs 44 | # queries: security-extended,security-and-quality 45 | 46 | 47 | # Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift). 48 | # If this step fails, then you should remove it and run the build manually (see below) 49 | - name: Autobuild 50 | uses: github/codeql-action/autobuild@v2 51 | 52 | # ℹ️ Command-line programs to run using the OS shell. 53 | # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun 54 | 55 | # If the Autobuild fails above, remove it and uncomment the following three lines. 56 | # modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance. 
57 | 58 | # - run: | 59 | # echo "Run, Build Application using script" 60 | # ./location_of_script_within_repo/buildscript.sh 61 | 62 | - name: Perform CodeQL Analysis 63 | uses: github/codeql-action/analyze@v2 64 | with: 65 | category: "/language:${{matrix.language}}" 66 | -------------------------------------------------------------------------------- /scripts/ci/setup_hadoop_env.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | HADOOP_DISTRO=${HADOOP_DISTRO:-"hdp"} 4 | 5 | ONLY_DOWNLOAD=${ONLY_DOWNLOAD:-false} 6 | ONLY_EXTRACT=${ONLY_EXTRACT:-false} 7 | 8 | while test $# -gt 0; do 9 | case "$1" in 10 | -h|--help) 11 | echo "Setup environment for snakebite tests" 12 | echo " " 13 | echo "options:" 14 | echo -e "\t-h, --help show brief help" 15 | echo -e "\t-o, --only-download just download hadoop tar(s)" 16 | echo -e "\t-e, --only-extract just extract hadoop tar(s)" 17 | echo -e "\t-d, --distro select distro (hdp|cdh)" 18 | exit 0 19 | ;; 20 | -o|--only-download) 21 | shift 22 | ONLY_DOWNLOAD=true 23 | ;; 24 | -e|--only-extract) 25 | shift 26 | ONLY_EXTRACT=true 27 | ;; 28 | -d|--distro) 29 | shift 30 | if test $# -gt 0; then 31 | HADOOP_DISTRO=$1 32 | else 33 | echo "No Hadoop distro specified - abort" >&2 34 | exit 1 35 | fi 36 | shift 37 | ;; 38 | *) 39 | echo "Unknown options: $1" >&2 40 | exit 1 41 | ;; 42 | esac 43 | done 44 | 45 | if $ONLY_DOWNLOAD && $ONLY_EXTRACT; then 46 | echo "Both only-download and only-extract specified - abort" >&2 47 | exit 1 48 | fi 49 | 50 | mkdir -p $HADOOP_HOME 51 | 52 | if [ $HADOOP_DISTRO = "cdh" ]; then 53 | URL="http://archive.cloudera.com/cdh5/cdh/5/hadoop-latest.tar.gz" 54 | elif [ $HADOOP_DISTRO = "hdp" ]; then 55 | # This site provides good URLs: 56 | # https://github.com/saltstack-formulas/hadoop-formula/blob/5034a2204da691eceb9c2d8cd8260f11d5cc06f3/hadoop/settings.sls 57 | 
URL="http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.6.0/tars/hadoop-2.6.0.2.2.6.0-2800.tar.gz" 58 | else 59 | echo "No/bad HADOOP_DISTRO='${HADOOP_DISTRO}' specified" >&2 60 | exit 1 61 | fi 62 | 63 | if ! $ONLY_EXTRACT && [ ! -e ${HADOOP_HOME}/hadoop.tar.gz ] ; then 64 | echo "Downloading Hadoop from $URL to ${HADOOP_HOME}/hadoop.tar.gz" 65 | curl -z ${HADOOP_HOME}/hadoop.tar.gz -o ${HADOOP_HOME}/hadoop.tar.gz -L $URL 66 | 67 | if [ $? != 0 ]; then 68 | echo "Failed to download Hadoop from $URL - abort" >&2 69 | exit 1 70 | fi 71 | fi 72 | 73 | if $ONLY_DOWNLOAD; then 74 | exit 0 75 | fi 76 | 77 | echo "Extracting ${HADOOP_HOME}/hadoop.tar.gz into $HADOOP_HOME" 78 | tar zxf ${HADOOP_HOME}/hadoop.tar.gz --strip-components 1 -C $HADOOP_HOME 79 | 80 | if [ $? != 0 ]; then 81 | echo "Failed to extract Hadoop from ${HADOOP_HOME}/hadoop.tar.gz to ${HADOOP_HOME} - abort" >&2 82 | exit 1 83 | fi 84 | -------------------------------------------------------------------------------- /luigi/tools/deps_tree.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | This module parses commands exactly the same as the luigi task runner. You must specify the module, the task and task parameters. 4 | Instead of executing a task, this module prints the significant parameters and state of the task and its dependencies in a tree format. 5 | Use this to visualize the execution plan in the terminal. 6 | 7 | .. code-block:: none 8 | 9 | $ luigi-deps-tree --module foo_complex examples.Foo 10 | ... 
11 | └─--[Foo-{} (PENDING)] 12 | |---[Bar-{'num': '0'} (PENDING)] 13 | | |---[Bar-{'num': '4'} (PENDING)] 14 | | └─--[Bar-{'num': '5'} (PENDING)] 15 | |---[Bar-{'num': '1'} (PENDING)] 16 | └─--[Bar-{'num': '2'} (PENDING)] 17 | └─--[Bar-{'num': '6'} (PENDING)] 18 | |---[Bar-{'num': '7'} (PENDING)] 19 | | |---[Bar-{'num': '9'} (PENDING)] 20 | | └─--[Bar-{'num': '10'} (PENDING)] 21 | | └─--[Bar-{'num': '11'} (PENDING)] 22 | └─--[Bar-{'num': '8'} (PENDING)] 23 | └─--[Bar-{'num': '12'} (PENDING)] 24 | """ 25 | 26 | from luigi.task import flatten 27 | from luigi.cmdline_parser import CmdlineParser 28 | import sys 29 | import warnings 30 | 31 | 32 | class bcolors: 33 | ''' 34 | colored output for task status 35 | ''' 36 | OKBLUE = '\033[94m' 37 | OKGREEN = '\033[92m' 38 | ENDC = '\033[0m' 39 | 40 | 41 | def print_tree(task, indent='', last=True): 42 | ''' 43 | Return a string representation of the tasks, their statuses/parameters in a dependency tree format 44 | ''' 45 | # don't bother printing out warnings about tasks with no output 46 | with warnings.catch_warnings(): 47 | warnings.filterwarnings(action='ignore', message='Task .* without outputs has no custom complete\\(\\) method') 48 | is_task_complete = task.complete() 49 | is_complete = (bcolors.OKGREEN + 'COMPLETE' if is_task_complete else bcolors.OKBLUE + 'PENDING') + bcolors.ENDC 50 | name = task.__class__.__name__ 51 | params = task.to_str_params(only_significant=True) 52 | result = '\n' + indent 53 | if last: 54 | result += '└─--' 55 | indent += ' ' 56 | else: 57 | result += '|---' 58 | indent += '| ' 59 | result += '[{0}-{1} ({2})]'.format(name, params, is_complete) 60 | children = flatten(task.requires()) 61 | for index, child in enumerate(children): 62 | result += print_tree(child, indent, (index+1) == len(children)) 63 | return result 64 | 65 | 66 | def main(): 67 | cmdline_args = sys.argv[1:] 68 | with CmdlineParser.global_instance(cmdline_args) as cp: 69 | task = cp.get_task_obj() 70 | 
print(print_tree(task)) 71 | 72 | 73 | if __name__ == '__main__': 74 | main() 75 | -------------------------------------------------------------------------------- /luigi/metrics.py: -------------------------------------------------------------------------------- 1 | import abc 2 | import importlib 3 | 4 | from enum import Enum 5 | 6 | 7 | class MetricsCollectors(Enum): 8 | custom = -1 9 | default = 1 10 | none = 1 11 | datadog = 2 12 | prometheus = 3 13 | 14 | @classmethod 15 | def get(cls, which, custom_import=None): 16 | if which == MetricsCollectors.none: 17 | return NoMetricsCollector() 18 | elif which == MetricsCollectors.datadog: 19 | from luigi.contrib.datadog_metric import DatadogMetricsCollector 20 | return DatadogMetricsCollector() 21 | elif which == MetricsCollectors.prometheus: 22 | from luigi.contrib.prometheus_metric import PrometheusMetricsCollector 23 | return PrometheusMetricsCollector() 24 | elif which == MetricsCollectors.custom: 25 | if custom_import is None: 26 | raise ValueError(f"MetricsCollectors value ' {which} ' is -1 and custom_import is None") 27 | 28 | split_import_string = custom_import.split(".") 29 | 30 | import_path = ".".join(split_import_string[:-1]) 31 | import_class_string = split_import_string[-1] 32 | 33 | mod = importlib.import_module(import_path) 34 | metrics_class = getattr(mod, import_class_string) 35 | 36 | if issubclass(metrics_class, MetricsCollector): 37 | return metrics_class() 38 | else: 39 | raise ValueError(f"Custom Import: {custom_import} is not a subclass of MetricsCollector") 40 | else: 41 | raise ValueError(f"MetricsCollectors value ' {which} ' isn't supported") 42 | 43 | 44 | class MetricsCollector(metaclass=abc.ABCMeta): 45 | """Abstract MetricsCollector base class that can be replaced by a 46 | tool-specific implementation.
47 | """ 48 | 49 | @abc.abstractmethod 50 | def __init__(self): 51 | pass 52 | 53 | @abc.abstractmethod 54 | def handle_task_started(self, task): 55 | pass 56 | 57 | @abc.abstractmethod 58 | def handle_task_failed(self, task): 59 | pass 60 | 61 | @abc.abstractmethod 62 | def handle_task_disabled(self, task, config): 63 | pass 64 | 65 | @abc.abstractmethod 66 | def handle_task_done(self, task): 67 | pass 68 | 69 | def handle_task_statistics(self, task, statistics): 70 | pass 71 | 72 | def generate_latest(self): 73 | return 74 | 75 | def configure_http_handler(self, http_handler): 76 | pass 77 | 78 | 79 | class NoMetricsCollector(MetricsCollector): 80 | """Empty MetricsCollector when no collector is being used 81 | """ 82 | 83 | def __init__(self): 84 | pass 85 | 86 | def handle_task_started(self, task): 87 | pass 88 | 89 | def handle_task_failed(self, task): 90 | pass 91 | 92 | def handle_task_disabled(self, task, config): 93 | pass 94 | 95 | def handle_task_done(self, task): 96 | pass 97 | -------------------------------------------------------------------------------- /luigi/static/visualiser/lib/AdminLTE/css/skin-green.min.css: -------------------------------------------------------------------------------- 1 | .skin-green .main-header .navbar{background-color:#00a65a}.skin-green .main-header .navbar .nav>li>a{color:#fff}.skin-green .main-header .navbar .nav>li>a:hover,.skin-green .main-header .navbar .nav>li>a:active,.skin-green .main-header .navbar .nav>li>a:focus,.skin-green .main-header .navbar .nav .open>a,.skin-green .main-header .navbar .nav .open>a:hover,.skin-green .main-header .navbar .nav .open>a:focus{background:rgba(0,0,0,0.1);color:#f6f6f6}.skin-green .main-header .navbar .sidebar-toggle{color:#fff}.skin-green .main-header .navbar .sidebar-toggle:hover{color:#f6f6f6;background:rgba(0,0,0,0.1)}.skin-green .main-header .navbar .sidebar-toggle{color:#fff}.skin-green .main-header .navbar .sidebar-toggle:hover{background-color:#008d4c}@media 
(max-width:767px){.skin-green .main-header .navbar .dropdown-menu li.divider{background-color:rgba(255,255,255,0.1)}.skin-green .main-header .navbar .dropdown-menu li a{color:#fff}.skin-green .main-header .navbar .dropdown-menu li a:hover{background:#008d4c}}.skin-green .main-header .logo{background-color:#008d4c;color:#fff;border-bottom:0 solid transparent}.skin-green .main-header .logo:hover{background-color:#008749}.skin-green .main-header li.user-header{background-color:#00a65a}.skin-green .content-header{background:transparent}.skin-green .wrapper,.skin-green .main-sidebar,.skin-green .left-side{background-color:#222d32}.skin-green .user-panel>.info,.skin-green .user-panel>.info>a{color:#fff}.skin-green .sidebar-menu>li.header{color:#4b646f;background:#1a2226}.skin-green .sidebar-menu>li>a{border-left:3px solid transparent}.skin-green .sidebar-menu>li:hover>a,.skin-green .sidebar-menu>li.active>a{color:#fff;background:#1e282c;border-left-color:#00a65a}.skin-green .sidebar-menu>li>.treeview-menu{margin:0 1px;background:#2c3b41}.skin-green .sidebar a{color:#b8c7ce}.skin-green .sidebar a:hover{text-decoration:none}.skin-green .treeview-menu>li>a{color:#8aa4af}.skin-green .treeview-menu>li.active>a,.skin-green .treeview-menu>li>a:hover{color:#fff}.skin-green .sidebar-form{border-radius:3px;border:1px solid #374850;margin:10px 10px}.skin-green .sidebar-form input[type="text"],.skin-green .sidebar-form .btn{box-shadow:none;background-color:#374850;border:1px solid transparent;height:35px;-webkit-transition:all .3s ease-in-out;-o-transition:all .3s ease-in-out;transition:all .3s ease-in-out}.skin-green .sidebar-form input[type="text"]{color:#666;border-top-left-radius:2px !important;border-top-right-radius:0 !important;border-bottom-right-radius:0 !important;border-bottom-left-radius:2px !important}.skin-green .sidebar-form input[type="text"]:focus,.skin-green .sidebar-form input[type="text"]:focus+.input-group-btn .btn{background-color:#fff;color:#666}.skin-green 
.sidebar-form input[type="text"]:focus+.input-group-btn .btn{border-left-color:#fff}.skin-green .sidebar-form .btn{color:#999;border-top-left-radius:0 !important;border-top-right-radius:2px !important;border-bottom-right-radius:2px !important;border-bottom-left-radius:0 !important} -------------------------------------------------------------------------------- /test/instance_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import unittest 19 | 20 | import luigi 21 | import luigi.worker 22 | import luigi.date_interval 23 | import luigi.notifications 24 | 25 | luigi.notifications.DEBUG = True 26 | 27 | 28 | class InstanceTest(unittest.TestCase): 29 | 30 | def test_simple(self): 31 | class DummyTask(luigi.Task): 32 | x = luigi.Parameter() 33 | 34 | dummy_1 = DummyTask(1) 35 | dummy_2 = DummyTask(2) 36 | dummy_1b = DummyTask(1) 37 | 38 | self.assertNotEqual(dummy_1, dummy_2) 39 | self.assertEqual(dummy_1, dummy_1b) 40 | 41 | def test_dep(self): 42 | test = self 43 | 44 | class A(luigi.Task): 45 | task_namespace = 'instance' # to prevent task name conflict between tests 46 | 47 | def __init__(self): 48 | self.has_run = False 49 | super(A, self).__init__() 50 | 51 | def run(self): 52 | self.has_run = True 53 | 54 | class B(luigi.Task): 55 | x = luigi.Parameter() 56 | 57 | def requires(self): 58 | return A() # This will end up referring to the same object 59 | 60 | def run(self): 61 | test.assertTrue(self.requires().has_run) 62 | 63 | luigi.build([B(1), B(2)], local_scheduler=True) 64 | 65 | def test_external_instance_cache(self): 66 | class A(luigi.Task): 67 | task_namespace = 'instance' # to prevent task name conflict between tests 68 | pass 69 | 70 | class OtherA(luigi.ExternalTask): 71 | task_family = "A" 72 | 73 | oa = OtherA() 74 | a = A() 75 | self.assertNotEqual(oa, a) 76 | 77 | def test_date(self): 78 | ''' Adding unit test because we had a problem with this ''' 79 | class DummyTask(luigi.Task): 80 | x = luigi.DateIntervalParameter() 81 | 82 | dummy_1 = DummyTask(luigi.date_interval.Year(2012)) 83 | dummy_2 = DummyTask(luigi.date_interval.Year(2013)) 84 | dummy_1b = DummyTask(luigi.date_interval.Year(2012)) 85 | 86 | self.assertNotEqual(dummy_1, dummy_2) 87 | self.assertEqual(dummy_1, dummy_1b) 88 | 89 | def test_unhashable_type(self): 90 | # See #857 91 | class DummyTask(luigi.Task): 92 | x = luigi.Parameter() 93 | 94 | dummy = DummyTask(x={}) # NOQA 
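The tests above rely on tasks with equal parameters comparing equal (and, in test_dep, on A() referring to the same object), while an unhashable parameter value such as {} must not break construction. The following is a minimal, self-contained sketch of that behaviour using a hypothetical CachedInstances metaclass. It is an illustration only, not Luigi's actual registration code, and it keys the cache on the raw call arguments rather than on normalized parameter values, so DummyTask(1) and DummyTask(x=1) would be cached separately here.

```python
class CachedInstances(type):
    """Hypothetical metaclass: reuse one instance per (class, arguments) key."""

    _cache = {}

    def __call__(cls, *args, **kwargs):
        # Build a cache key from the class and the raw call arguments.
        key = (cls, args, tuple(sorted(kwargs.items())))
        try:
            hash(key)
        except TypeError:
            # An argument is unhashable (for example a dict): skip the cache
            # and fall back to creating a fresh, uncached instance.
            return super().__call__(*args, **kwargs)
        if key not in cls._cache:
            cls._cache[key] = super().__call__(*args, **kwargs)
        return cls._cache[key]


class DummyTask(metaclass=CachedInstances):
    # Stand-in for a task with a single parameter ``x``; not a luigi.Task.
    def __init__(self, x):
        self.x = x


same = DummyTask(1) is DummyTask(1)            # equal arguments share one object
different = DummyTask(1) is not DummyTask(2)   # different arguments do not
uncached = DummyTask({}) is not DummyTask({})  # unhashable arguments bypass the cache
```

Falling back to an uncached instance keeps construction working for unhashable values at the cost of losing identity for those instances, which is the trade-off test_unhashable_type appears to guard.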
95 | -------------------------------------------------------------------------------- /test/worker_task_process_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import LuigiTestCase, temporary_unloaded_module 19 | import luigi 20 | from luigi.worker import Worker 21 | import multiprocessing 22 | 23 | 24 | class ContextManagedTaskProcessTest(LuigiTestCase): 25 | 26 | def _test_context_manager(self, force_multiprocessing): 27 | CONTEXT_MANAGER_MODULE = b''' 28 | class MyContextManager: 29 | def __init__(self, task_process): 30 | self.task = task_process.task 31 | def __enter__(self): 32 | assert not self.task.run_event.is_set(), "the task should not have run yet" 33 | self.task.enter_event.set() 34 | return self 35 | def __exit__(self, exc_type=None, exc_value=None, traceback=None): 36 | assert self.task.run_event.is_set(), "the task should have run" 37 | self.task.exit_event.set() 38 | ''' 39 | 40 | class DummyEventRecordingTask(luigi.Task): 41 | def __init__(self, *args, **kwargs): 42 | self.enter_event = multiprocessing.Event() 43 | self.exit_event = multiprocessing.Event() 44 | self.run_event = multiprocessing.Event() 45 | super(DummyEventRecordingTask, self).__init__(*args, **kwargs) 46 | 47 | def run(self): 48 | assert 
self.enter_event.is_set(), "the context manager should have been entered" 49 | assert not self.exit_event.is_set(), "the context manager should not have been exited yet" 50 | assert not self.run_event.is_set(), "the task should not have run yet" 51 | self.run_event.set() 52 | 53 | def complete(self): 54 | return self.run_event.is_set() 55 | 56 | with temporary_unloaded_module(CONTEXT_MANAGER_MODULE) as module_name: 57 | t = DummyEventRecordingTask() 58 | w = Worker(task_process_context=module_name + '.MyContextManager', 59 | force_multiprocessing=force_multiprocessing) 60 | w.add(t) 61 | self.assertTrue(w.run()) 62 | self.assertTrue(t.complete()) 63 | self.assertTrue(t.enter_event.is_set()) 64 | self.assertTrue(t.exit_event.is_set()) 65 | 66 | def test_context_manager_without_multiprocessing(self): 67 | self._test_context_manager(False) 68 | 69 | def test_context_manager_with_multiprocessing(self): 70 | self._test_context_manager(True) 71 | -------------------------------------------------------------------------------- /examples/wordcount_hadoop.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | import luigi 19 | import luigi.contrib.hadoop 20 | import luigi.contrib.hdfs 21 | 22 | 23 | # To make this run, you probably want to edit /etc/luigi/client.cfg and add something like: 24 | # 25 | # [hadoop] 26 | # jar: /usr/lib/hadoop-xyz/hadoop-streaming-xyz-123.jar 27 | 28 | 29 | class InputText(luigi.ExternalTask): 30 | """ 31 | This task is a :py:class:`luigi.task.ExternalTask`, which means it doesn't generate the 32 | :py:meth:`~.InputText.output` target on its own, instead relying on the execution of something outside of Luigi 33 | to produce it. 34 | """ 35 | 36 | date = luigi.DateParameter() 37 | 38 | def output(self): 39 | """ 40 | Returns the target output for this task. 41 | In this case, it expects a file to be present in HDFS. 42 | 43 | :return: the target output for this task. 44 | :rtype: object (:py:class:`luigi.target.Target`) 45 | """ 46 | return luigi.contrib.hdfs.HdfsTarget(self.date.strftime('/tmp/text/%Y-%m-%d.txt')) 47 | 48 | 49 | class WordCount(luigi.contrib.hadoop.JobTask): 50 | """ 51 | This task runs a :py:class:`luigi.contrib.hadoop.JobTask` 52 | over the target data returned by :py:meth:`~.InputText.output` and 53 | writes the result into its :py:meth:`~.WordCount.output` target. 54 | 55 | This class uses :py:meth:`luigi.contrib.hadoop.JobTask.run`. 56 | """ 57 | 58 | date_interval = luigi.DateIntervalParameter() 59 | 60 | def requires(self): 61 | """ 62 | This task's dependencies: 63 | 64 | * :py:class:`~.InputText` 65 | 66 | :return: list of object (:py:class:`luigi.task.Task`) 67 | """ 68 | return [InputText(date) for date in self.date_interval.dates()] 69 | 70 | def output(self): 71 | """ 72 | Returns the target output for this task. 73 | In this case, a successful execution of this task will create a file in HDFS. 74 | 75 | :return: the target output for this task.
76 | :rtype: object (:py:class:`luigi.target.Target`) 77 | """ 78 | return luigi.contrib.hdfs.HdfsTarget('/tmp/text-count/%s' % self.date_interval) 79 | 80 | def mapper(self, line): 81 | for word in line.strip().split(): 82 | yield word, 1 83 | 84 | def reducer(self, key, values): 85 | yield key, sum(values) 86 | 87 | 88 | if __name__ == '__main__': 89 | luigi.run() 90 | -------------------------------------------------------------------------------- /luigi/contrib/target.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | import logging 19 | from types import MethodType 20 | 21 | import luigi.target 22 | 23 | logger = logging.getLogger('luigi-interface') 24 | 25 | 26 | class CascadingClient: 27 | """ 28 | A FilesystemClient that will cascade failing function calls through a list of clients. 29 | 30 | Which clients are used are specified at time of construction. 31 | """ 32 | 33 | # This constant member is supposed to include all methods, feel free to add 34 | # methods here. If you want full control of which methods that should be 35 | # created, pass the kwarg to the constructor. 
36 | ALL_METHOD_NAMES = ['exists', 'rename', 'remove', 'chmod', 'chown', 37 | 'count', 'copy', 'get', 'put', 'mkdir', 'list', 'listdir', 38 | 'getmerge', 39 | 'isdir', 40 | 'rename_dont_move', 41 | 'touchz', 42 | ] 43 | 44 | def __init__(self, clients, method_names=None): 45 | self.clients = clients 46 | if method_names is None: 47 | method_names = self.ALL_METHOD_NAMES 48 | 49 | for method_name in method_names: 50 | new_method = self._make_method(method_name) 51 | real_method = MethodType(new_method, self) 52 | setattr(self, method_name, real_method) 53 | 54 | @classmethod 55 | def _make_method(cls, method_name): 56 | def new_method(self, *args, **kwargs): 57 | return self._chained_call(method_name, *args, **kwargs) 58 | return new_method 59 | 60 | def _chained_call(self, method_name, *args, **kwargs): 61 | for i in range(len(self.clients)): 62 | client = self.clients[i] 63 | try: 64 | result = getattr(client, method_name)(*args, **kwargs) 65 | return result 66 | except luigi.target.FileSystemException: 67 | # For exceptions that are semantical, we must throw along 68 | raise 69 | except BaseException: 70 | is_last_iteration = (i + 1) >= len(self.clients) 71 | if is_last_iteration: 72 | raise 73 | else: 74 | logger.warning('The %s failed to %s, using fallback class %s', 75 | client.__class__.__name__, method_name, self.clients[i + 1].__class__.__name__) 76 | -------------------------------------------------------------------------------- /luigi/tools/luigi_grep.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | import argparse 4 | import json 5 | from collections import defaultdict 6 | 7 | from urllib.request import urlopen 8 | 9 | 10 | class LuigiGrep: 11 | 12 | def __init__(self, host, port): 13 | self._host = host 14 | self._port = port 15 | 16 | @property 17 | def graph_url(self): 18 | return "http://{0}:{1}/api/graph".format(self._host, self._port) 19 | 20 | def _fetch_json(self): 21 | 
"""Returns the json representation of the dep graph""" 22 | print("Fetching from url: " + self.graph_url) 23 | resp = urlopen(self.graph_url).read() 24 | return json.loads(resp.decode('utf-8')) 25 | 26 | def _build_results(self, jobs, job): 27 | job_info = jobs[job] 28 | deps = job_info['deps'] 29 | deps_status = defaultdict(list) 30 | for j in deps: 31 | if j in jobs: 32 | deps_status[jobs[j]['status']].append(j) 33 | else: 34 | deps_status['UNKNOWN'].append(j) 35 | return {"name": job, "status": job_info['status'], "deps_by_status": deps_status} 36 | 37 | def prefix_search(self, job_name_prefix): 38 | """Searches for jobs matching the given ``job_name_prefix``.""" 39 | json = self._fetch_json() 40 | jobs = json['response'] 41 | for job in jobs: 42 | if job.startswith(job_name_prefix): 43 | yield self._build_results(jobs, job) 44 | 45 | def status_search(self, status): 46 | """Searches for jobs matching the given ``status``.""" 47 | json = self._fetch_json() 48 | jobs = json['response'] 49 | for job in jobs: 50 | job_info = jobs[job] 51 | if job_info['status'].lower() == status.lower(): 52 | yield self._build_results(jobs, job) 53 | 54 | 55 | def main(): 56 | parser = argparse.ArgumentParser( 57 | "luigi-grep is used to search for workflows using the luigi scheduler's json api") 58 | parser.add_argument( 59 | "--scheduler-host", default="localhost", help="hostname of the luigi scheduler") 60 | parser.add_argument( 61 | "--scheduler-port", default="8082", help="port of the luigi scheduler") 62 | parser.add_argument("--prefix", help="prefix of a task query to search for", default=None) 63 | parser.add_argument("--status", help="search for jobs with the given status", default=None) 64 | 65 | args = parser.parse_args() 66 | grep = LuigiGrep(args.scheduler_host, args.scheduler_port) 67 | 68 | results = [] 69 | if args.prefix: 70 | results = grep.prefix_search(args.prefix) 71 | elif args.status: 72 | results = grep.status_search(args.status) 73 | 74 | for job in 
results: 75 | print("{name}: {status}, Dependencies:".format(name=job['name'], status=job['status'])) 76 | for status, jobs in job['deps_by_status'].items(): 77 | print(" status={status}".format(status=status)) 78 | for job in jobs: 79 | print(" {job}".format(job=job)) 80 | 81 | 82 | if __name__ == '__main__': 83 | main() 84 | -------------------------------------------------------------------------------- /test/mypy_test.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import tempfile 3 | import unittest 4 | 5 | from mypy import api 6 | 7 | 8 | class TestMyMypyPlugin(unittest.TestCase): 9 | def test_plugin_no_issue(self): 10 | if sys.version_info[:2] < (3, 8): 11 | return 12 | 13 | test_code = """ 14 | import luigi 15 | from uuid import UUID 16 | 17 | 18 | class UUIDParameter(luigi.Parameter): 19 | def parse(self, s): 20 | return UUID(s) 21 | 22 | 23 | class MyTask(luigi.Task): 24 | foo: int = luigi.IntParameter() 25 | bar: str = luigi.Parameter() 26 | uniq: UUID = UUIDParameter() 27 | baz: str = luigi.Parameter(default="baz") 28 | 29 | MyTask(foo=1, bar='bar', uniq=UUID("9b0591d7-a167-4978-bc6d-41f7d84a288c")) 30 | """ 31 | 32 | with tempfile.NamedTemporaryFile(suffix=".py") as test_file: 33 | test_file.write(test_code.encode("utf-8")) 34 | test_file.flush() 35 | result = api.run( 36 | [ 37 | "--no-incremental", 38 | "--cache-dir=/dev/null", 39 | "--config-file", 40 | "test/testconfig/pyproject.toml", 41 | test_file.name, 42 | ] 43 | ) 44 | self.assertIn("Success: no issues found", result[0]) 45 | 46 | def test_plugin_invalid_arg(self): 47 | if sys.version_info[:2] < (3, 8): 48 | return 49 | 50 | test_code = """ 51 | import luigi 52 | 53 | 54 | class MyTask(luigi.Task): 55 | foo: int = luigi.IntParameter() 56 | bar: str = luigi.Parameter() 57 | baz: str = luigi.Parameter(default=1) # invalid assignment to str with default value int 58 | 59 | # issue: 60 | # - foo is int 61 | # - unknown is unknown parameter 
62 | # - baz is invalid assignment to str with default value int 63 | MyTask(foo='1', bar="bar", unknown="unknown") 64 | """ 65 | 66 | with tempfile.NamedTemporaryFile(suffix=".py") as test_file: 67 | test_file.write(test_code.encode("utf-8")) 68 | test_file.flush() 69 | result = api.run( 70 | [ 71 | "--no-incremental", 72 | "--cache-dir=/dev/null", 73 | "--config-file", 74 | "test/testconfig/pyproject.toml", 75 | test_file.name, 76 | ] 77 | ) 78 | 79 | self.assertIn( 80 | 'error: Incompatible types in assignment (expression has type "int", variable has type "str") [assignment]', 81 | result[0], 82 | ) # check default value assignment 83 | self.assertIn( 84 | 'error: Argument "foo" to "MyTask" has incompatible type "str"; expected "int" [arg-type]', 85 | result[0], 86 | ) # check foo argument 87 | self.assertIn( 88 | 'error: Unexpected keyword argument "unknown" for "MyTask" [call-arg]', 89 | result[0], 90 | ) # check unknown argument 91 | self.assertIn("Found 3 errors in 1 file (checked 1 source file)", result[0]) 92 | -------------------------------------------------------------------------------- /luigi/configuration/core.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | import logging 18 | import os 19 | import warnings 20 | 21 | from .cfg_parser import LuigiConfigParser 22 | from .toml_parser import LuigiTomlParser 23 | 24 | 25 | logger = logging.getLogger('luigi-interface') 26 | 27 | 28 | PARSERS = { 29 | 'cfg': LuigiConfigParser, 30 | 'conf': LuigiConfigParser, 31 | 'ini': LuigiConfigParser, 32 | 'toml': LuigiTomlParser, 33 | } 34 | 35 | DEFAULT_PARSER = 'cfg' 36 | 37 | 38 | def _get_default_parser(): 39 | parser = os.environ.get('LUIGI_CONFIG_PARSER', DEFAULT_PARSER) 40 | if parser not in PARSERS: 41 | warnings.warn("Invalid parser: {parser}".format(parser=parser)) 42 | parser = DEFAULT_PARSER 43 | return parser 44 | 45 | 46 | def _check_parser(parser_class, parser): 47 | if not parser_class.enabled: 48 | msg = ( 49 | "Parser not installed yet. " 50 | "Please, install luigi with required parser:\n" 51 | "pip install luigi[{parser}]" 52 | ) 53 | raise ImportError(msg.format(parser=parser)) 54 | 55 | 56 | def get_config(parser=None): 57 | """Get configs singleton for parser 58 | """ 59 | if parser is None: 60 | parser = _get_default_parser() 61 | parser_class = PARSERS[parser] 62 | _check_parser(parser_class, parser) 63 | return parser_class.instance() 64 | 65 | 66 | def add_config_path(path): 67 | """Select config parser by file extension and add path into parser. 68 | """ 69 | if not os.path.isfile(path): 70 | warnings.warn("Config file does not exist: {path}".format(path=path)) 71 | return False 72 | 73 | # select parser by file extension 74 | default_parser = _get_default_parser() 75 | _base, ext = os.path.splitext(path) 76 | if ext and ext[1:] in PARSERS: 77 | parser = ext[1:] 78 | else: 79 | parser = default_parser 80 | parser_class = PARSERS[parser] 81 | 82 | _check_parser(parser_class, parser) 83 | if parser != default_parser: 84 | msg = ( 85 | "Config for {added} parser added, but used {used} parser. 
" 86 | "Set up right parser via env var: " 87 | "export LUIGI_CONFIG_PARSER={added}" 88 | ) 89 | warnings.warn(msg.format(added=parser, used=default_parser)) 90 | 91 | # add config path to parser 92 | parser_class.add_config_path(path) 93 | return True 94 | 95 | 96 | if 'LUIGI_CONFIG_PATH' in os.environ: 97 | add_config_path(os.environ['LUIGI_CONFIG_PATH']) 98 | -------------------------------------------------------------------------------- /luigi/contrib/hdfs/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ 19 | Provides access to HDFS using the :py:class:`HdfsTarget`, a subclass of :py:class:`~luigi.target.Target`. 20 | You can configure what client by setting the "client" config under the "hdfs" section in the configuration, or using the ``--hdfs-client`` command line option. 21 | "hadoopcli" is the slowest, but should work out of the box. 22 | 23 | Since the hdfs functionality is quite big in luigi, it's split into smaller 24 | files under ``luigi/contrib/hdfs/*.py``. But for the sake of convenience and 25 | API stability, everything is reexported under :py:mod:`luigi.contrib.hdfs`. 
26 | """ 27 | 28 | # imports 29 | from luigi.contrib.hdfs import config as hdfs_config 30 | from luigi.contrib.hdfs import clients as hdfs_clients 31 | from luigi.contrib.hdfs import error as hdfs_error 32 | from luigi.contrib.hdfs import hadoopcli_clients as hdfs_hadoopcli_clients 33 | from luigi.contrib.hdfs import webhdfs_client as hdfs_webhdfs_client 34 | from luigi.contrib.hdfs import format as hdfs_format 35 | from luigi.contrib.hdfs import target as hdfs_target 36 | 37 | 38 | # config.py 39 | hdfs = hdfs_config.hdfs 40 | load_hadoop_cmd = hdfs_config.load_hadoop_cmd 41 | get_configured_hadoop_version = hdfs_config.get_configured_hadoop_version 42 | get_configured_hdfs_client = hdfs_config.get_configured_hdfs_client 43 | tmppath = hdfs_config.tmppath 44 | 45 | 46 | # clients 47 | HDFSCliError = hdfs_error.HDFSCliError 48 | call_check = hdfs_hadoopcli_clients.HdfsClient.call_check 49 | HdfsClient = hdfs_hadoopcli_clients.HdfsClient 50 | WebHdfsClient = hdfs_webhdfs_client.WebHdfsClient 51 | HdfsClientCdh3 = hdfs_hadoopcli_clients.HdfsClientCdh3 52 | HdfsClientApache1 = hdfs_hadoopcli_clients.HdfsClientApache1 53 | create_hadoopcli_client = hdfs_hadoopcli_clients.create_hadoopcli_client 54 | get_autoconfig_client = hdfs_clients.get_autoconfig_client 55 | exists = hdfs_clients.exists 56 | rename = hdfs_clients.rename 57 | remove = hdfs_clients.remove 58 | mkdir = hdfs_clients.mkdir 59 | listdir = hdfs_clients.listdir 60 | 61 | 62 | # format.py 63 | HdfsReadPipe = hdfs_format.HdfsReadPipe 64 | HdfsAtomicWritePipe = hdfs_format.HdfsAtomicWritePipe 65 | HdfsAtomicWriteDirPipe = hdfs_format.HdfsAtomicWriteDirPipe 66 | PlainFormat = hdfs_format.PlainFormat 67 | PlainDirFormat = hdfs_format.PlainDirFormat 68 | Plain = hdfs_format.Plain 69 | PlainDir = hdfs_format.PlainDir 70 | CompatibleHdfsFormat = hdfs_format.CompatibleHdfsFormat 71 | 72 | 73 | # target.py 74 | HdfsTarget = hdfs_target.HdfsTarget 75 | HdfsFlagTarget = hdfs_target.HdfsFlagTarget 76 | 
-------------------------------------------------------------------------------- /luigi/contrib/mrrunner.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # 4 | # Copyright 2012-2015 Spotify AB 5 | # 6 | # Licensed under the Apache License, Version 2.0 (the "License"); 7 | # you may not use this file except in compliance with the License. 8 | # You may obtain a copy of the License at 9 | # 10 | # http://www.apache.org/licenses/LICENSE-2.0 11 | # 12 | # Unless required by applicable law or agreed to in writing, software 13 | # distributed under the License is distributed on an "AS IS" BASIS, 14 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | # See the License for the specific language governing permissions and 16 | # limitations under the License. 17 | # 18 | 19 | """ 20 | Since Luigi 2.5.0, this is a private module to Luigi. Luigi users should 21 | not rely on being able to import this module. Furthermore, "luigi mr streaming" 22 | has been largely superseded by technologies like Spark, Hive, etc. 23 | 24 | The hadoop runner. 25 | 26 | This module contains the main() method which will be used to run the 27 | mapper, combiner, or reducer on the Hadoop nodes. 28 | """ 29 | 30 | import pickle 31 | import logging 32 | import os 33 | import sys 34 | import tarfile 35 | import traceback 36 | 37 | 38 | class Runner: 39 | """ 40 | Run the mapper, combiner, or reducer on hadoop nodes.
41 | """ 42 | 43 | def __init__(self, job=None): 44 | self.extract_packages_archive() 45 | self.job = job or pickle.load(open("job-instance.pickle", "rb")) 46 | self.job._setup_remote() 47 | 48 | def run(self, kind, stdin=sys.stdin, stdout=sys.stdout): 49 | if kind == "map": 50 | self.job.run_mapper(stdin, stdout) 51 | elif kind == "combiner": 52 | self.job.run_combiner(stdin, stdout) 53 | elif kind == "reduce": 54 | self.job.run_reducer(stdin, stdout) 55 | else: 56 | raise Exception('weird command: %s' % kind) 57 | 58 | def extract_packages_archive(self): 59 | if not os.path.exists("packages.tar"): 60 | return 61 | 62 | tar = tarfile.open("packages.tar") 63 | for tarinfo in tar: 64 | tar.extract(tarinfo) 65 | tar.close() 66 | if '' not in sys.path: 67 | sys.path.insert(0, '') 68 | 69 | 70 | def print_exception(exc): 71 | tb = traceback.format_exc() 72 | print('luigi-exc-hex=%s' % tb.encode('utf-8').hex(), file=sys.stderr) 73 | 74 | 75 | def main(args=None, stdin=sys.stdin, stdout=sys.stdout, print_exception=print_exception): 76 | """ 77 | Run either the mapper, combiner, or reducer from the class instance in the file "job-instance.pickle". 78 | 79 | Arguments: 80 | 81 | kind -- is either map, combiner, or reduce 82 | """ 83 | try: 84 | # Set up logging.
85 | logging.basicConfig(level=logging.WARN) 86 | 87 | kind = args is not None and args[1] or sys.argv[1] 88 | Runner().run(kind, stdin=stdin, stdout=stdout) 89 | except Exception as exc: 90 | # Dump encoded data that we will try to fetch using mechanize 91 | print_exception(exc) 92 | raise 93 | 94 | 95 | if __name__ == '__main__': 96 | main() 97 | -------------------------------------------------------------------------------- /examples/wordcount.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | import luigi 18 | 19 | 20 | class InputText(luigi.ExternalTask): 21 | """ 22 | This class represents something that was created elsewhere by an external process, 23 | so all we want to do is to implement the output method. 24 | """ 25 | date = luigi.DateParameter() 26 | 27 | def output(self): 28 | """ 29 | Returns the target output for this task. 30 | In this case, it expects a file to be present in the local file system. 31 | 32 | :return: the target output for this task. 
33 | :rtype: object (:py:class:`luigi.target.Target`) 34 | """ 35 | return luigi.LocalTarget(self.date.strftime('/var/tmp/text/%Y-%m-%d.txt')) 36 | 37 | 38 | class WordCount(luigi.Task): 39 | date_interval = luigi.DateIntervalParameter() 40 | 41 | def requires(self): 42 | """ 43 | This task's dependencies: 44 | 45 | * :py:class:`~.InputText` 46 | 47 | :return: list of object (:py:class:`luigi.task.Task`) 48 | """ 49 | return [InputText(date) for date in self.date_interval.dates()] 50 | 51 | def output(self): 52 | """ 53 | Returns the target output for this task. 54 | In this case, a successful execution of this task will create a file on the local filesystem. 55 | 56 | :return: the target output for this task. 57 | :rtype: object (:py:class:`luigi.target.Target`) 58 | """ 59 | return luigi.LocalTarget('/var/tmp/text-count/%s' % self.date_interval) 60 | 61 | def run(self): 62 | """ 63 | 1. count the words for each of the :py:meth:`~.InputText.output` targets created by :py:class:`~.InputText` 64 | 2. 
write the count into the :py:meth:`~.WordCount.output` target 65 | """ 66 | count = {} 67 | 68 | # NOTE: self.input() returns one element per InputText.output() target 69 | for f in self.input(): # The input() method is a wrapper around requires() that returns Target objects 70 | for line in f.open('r'): # Target objects are a file system/format abstraction and this will return a file stream object 71 | for word in line.strip().split(): 72 | count[word] = count.get(word, 0) + 1 73 | 74 | # output data 75 | f = self.output().open('w') 76 | for word, n in count.items(): 77 | f.write("%s\t%d\n" % (word, n)) 78 | f.close() # WARNING: the target is written atomically (to a temporary file that is moved into place on close), so if you don't close the file the output is never created 79 | -------------------------------------------------------------------------------- /examples/ssh_remote_execution.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from collections import defaultdict 19 | 20 | import luigi 21 | from luigi.contrib.ssh import RemoteContext, RemoteTarget 22 | from luigi.mock import MockTarget 23 | 24 | SSH_HOST = "some.accessible.host" 25 | 26 | 27 | class CreateRemoteData(luigi.Task): 28 | """ 29 | Dump info on running processes on remote host.
30 | Data is still stored on the remote host 31 | """ 32 | 33 | def output(self): 34 | """ 35 | Returns the target output for this task. 36 | In this case, a successful execution of this task will create a file on a remote server using SSH. 37 | 38 | :return: the target output for this task. 39 | :rtype: object (:py:class:`~luigi.target.Target`) 40 | """ 41 | return RemoteTarget( 42 | "/tmp/stuff", 43 | SSH_HOST 44 | ) 45 | 46 | def run(self): 47 | remote = RemoteContext(SSH_HOST) 48 | print(remote.check_output([ 49 | "ps aux > {0}".format(self.output().path) 50 | ])) 51 | 52 | 53 | class ProcessRemoteData(luigi.Task): 54 | """ 55 | Create a toplist of users based on how many running processes they have on a remote machine. 56 | 57 | In this example the processed data is stored in a MockTarget. 58 | """ 59 | 60 | def requires(self): 61 | """ 62 | This task's dependencies: 63 | 64 | * :py:class:`~.CreateRemoteData` 65 | 66 | :return: object (:py:class:`luigi.task.Task`) 67 | """ 68 | return CreateRemoteData() 69 | 70 | def run(self): 71 | processes_per_user = defaultdict(int) 72 | with self.input().open('r') as infile: 73 | for line in infile: 74 | username = line.split()[0] 75 | processes_per_user[username] += 1 76 | 77 | toplist = sorted( 78 | processes_per_user.items(), 79 | key=lambda x: x[1], 80 | reverse=True 81 | ) 82 | 83 | with self.output().open('w') as outfile: 84 | for user, n_processes in toplist: 85 | print(n_processes, user, file=outfile) 86 | 87 | def output(self): 88 | """ 89 | Returns the target output for this task. 90 | In this case, a successful execution of this task will simulate the creation of a file in a filesystem. 91 | 92 | :return: the target output for this task. 
93 | :rtype: object (:py:class:`~luigi.target.Target`) 94 | """ 95 | return MockTarget("output", mirror_on_stderr=True) 96 | -------------------------------------------------------------------------------- /luigi/contrib/sge_runner.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ 19 | The SunGrid Engine runner 20 | 21 | The main() function of this module will be executed on the 22 | compute node by the submitted job. 
It accepts as a single 23 | argument the shared temp folder containing the package archive 24 | and pickled task to run, and carries out these steps: 25 | 26 | - extract tarfile of package dependencies and place on the path 27 | - unpickle SGETask instance created on the master node 28 | - run SGETask.work() 29 | 30 | On completion, SGETask on the master node will detect that 31 | the job has left the queue, delete the temporary folder, and 32 | return from SGETask.run() 33 | """ 34 | 35 | import os 36 | import sys 37 | import pickle 38 | import logging 39 | from luigi.safe_extractor import SafeExtractor 40 | 41 | 42 | def _do_work_on_compute_node(work_dir, tarball=True): 43 | 44 | if tarball: 45 | # Extract the necessary dependencies 46 | # This can create a lot of I/O overhead when running many SGEJobTasks, 47 | # so is optional if the luigi project is accessible from the cluster node 48 | _extract_packages_archive(work_dir) 49 | 50 | # Open up the pickle file with the work to be done (pickle needs binary mode) 51 | os.chdir(work_dir) 52 | with open("job-instance.pickle", "rb") as f: 53 | job = pickle.load(f) 54 | 55 | # Do the work contained 56 | job.work() 57 | 58 | 59 | def _extract_packages_archive(work_dir): 60 | package_file = os.path.join(work_dir, "packages.tar") 61 | if not os.path.exists(package_file): 62 | return 63 | 64 | curdir = os.path.abspath(os.curdir) 65 | 66 | os.chdir(work_dir) 67 | extractor = SafeExtractor(work_dir) 68 | extractor.safe_extract(package_file) 69 | if '' not in sys.path: 70 | sys.path.insert(0, '') 71 | 72 | os.chdir(curdir) 73 | 74 | 75 | def main(args=sys.argv): 76 | """Run the work() method from the class instance in the file "job-instance.pickle". 77 | """ 78 | try: 79 | tarball = "--no-tarball" not in args 80 | # Set up logging.
81 | logging.basicConfig(level=logging.WARN) 82 | work_dir = args[1] 83 | assert os.path.exists(work_dir), "First argument to sge_runner.py must be a directory that exists" 84 | project_dir = args[2] 85 | sys.path.append(project_dir) 86 | _do_work_on_compute_node(work_dir, tarball) 87 | except Exception as e: 88 | # Print the error so that it shows up in the job's output, then re-raise 89 | print(e) 90 | raise 91 | 92 | 93 | if __name__ == '__main__': 94 | main() 95 | -------------------------------------------------------------------------------- /test/test_sigpipe.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License.
16 | # 17 | 18 | import os 19 | from helpers import unittest 20 | 21 | from luigi.format import InputPipeProcessWrapper 22 | 23 | 24 | BASH_SCRIPT = """ 25 | #!/bin/bash 26 | 27 | trap "touch /tmp/luigi_sigpipe.marker; exit 141" SIGPIPE 28 | 29 | 30 | for i in {1..3} 31 | do 32 | sleep 0.1 33 | echo "Welcome $i times" 34 | done 35 | """ 36 | 37 | FAIL_SCRIPT = BASH_SCRIPT + """ 38 | exit 1 39 | """ 40 | 41 | 42 | class TestSigpipe(unittest.TestCase): 43 | 44 | def setUp(self): 45 | with open("/tmp/luigi_test_sigpipe.sh", "w") as fp: 46 | fp.write(BASH_SCRIPT) 47 | 48 | def tearDown(self): 49 | os.remove("/tmp/luigi_test_sigpipe.sh") 50 | if os.path.exists("/tmp/luigi_sigpipe.marker"): 51 | os.remove("/tmp/luigi_sigpipe.marker") 52 | 53 | def test_partial_read(self): 54 | p1 = InputPipeProcessWrapper(["bash", "/tmp/luigi_test_sigpipe.sh"]) 55 | self.assertEqual(p1.readline().decode('utf8'), "Welcome 1 times\n") 56 | p1.close() 57 | self.assertTrue(os.path.exists("/tmp/luigi_sigpipe.marker")) 58 | 59 | def test_full_read(self): 60 | p1 = InputPipeProcessWrapper(["bash", "/tmp/luigi_test_sigpipe.sh"]) 61 | counter = 1 62 | for line in p1: 63 | self.assertEqual(line.decode('utf8'), "Welcome %i times\n" % counter) 64 | counter += 1 65 | p1.close() 66 | self.assertFalse(os.path.exists("/tmp/luigi_sigpipe.marker")) 67 | 68 | 69 | class TestSubprocessException(unittest.TestCase): 70 | 71 | def setUp(self): 72 | with open("/tmp/luigi_test_sigpipe.sh", "w") as fp: 73 | fp.write(FAIL_SCRIPT) 74 | 75 | def tearDown(self): 76 | os.remove("/tmp/luigi_test_sigpipe.sh") 77 | if os.path.exists("/tmp/luigi_sigpipe.marker"): 78 | os.remove("/tmp/luigi_sigpipe.marker") 79 | 80 | def test_partial_read(self): 81 | p1 = InputPipeProcessWrapper(["bash", "/tmp/luigi_test_sigpipe.sh"]) 82 | self.assertEqual(p1.readline().decode('utf8'), "Welcome 1 times\n") 83 | p1.close() 84 | self.assertTrue(os.path.exists("/tmp/luigi_sigpipe.marker")) 85 | 86 | def test_full_read(self): 87 | def run(): 
88 | p1 = InputPipeProcessWrapper(["bash", "/tmp/luigi_test_sigpipe.sh"]) 89 | counter = 1 90 | for line in p1: 91 | self.assertEqual(line.decode('utf8'), "Welcome %i times\n" % counter) 92 | counter += 1 93 | p1.close() 94 | 95 | self.assertRaises(RuntimeError, run) 96 | -------------------------------------------------------------------------------- /test/contrib/cascading_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | 18 | from helpers import unittest 19 | 20 | import pytest 21 | 22 | import luigi.target 23 | from luigi.contrib.target import CascadingClient 24 | 25 | 26 | @pytest.mark.contrib 27 | class CascadingClientTest(unittest.TestCase): 28 | 29 | def setUp(self): 30 | class FirstClient: 31 | 32 | def exists(self, pos_arg, kw_arg='first'): 33 | if pos_arg < 10: 34 | return pos_arg 35 | elif pos_arg < 20: 36 | return kw_arg 37 | elif kw_arg == 'raise_fae': 38 | raise luigi.target.FileAlreadyExists('oh noes!') 39 | else: 40 | raise Exception() 41 | 42 | class SecondClient: 43 | 44 | def exists(self, pos_arg, other_kw_arg='second', 45 | kw_arg='for-backwards-compatibility'): 46 | if pos_arg < 30: 47 | return -pos_arg 48 | elif pos_arg < 40: 49 | return other_kw_arg 50 | else: 51 | raise Exception() 52 | 53 | self.clients = [FirstClient(), SecondClient()] 54 | self.client = CascadingClient(self.clients) 55 | 56 | def test_successes(self): 57 | self.assertEqual(5, self.client.exists(5)) 58 | self.assertEqual('yay', self.client.exists(15, kw_arg='yay')) 59 | 60 | def test_fallbacking(self): 61 | self.assertEqual(-25, self.client.exists(25)) 62 | self.assertEqual('lol', self.client.exists(35, kw_arg='yay', 63 | other_kw_arg='lol')) 64 | # Note: the first client's method doesn't accept the other keyword argument 65 | self.assertEqual(-15, self.client.exists(15, kw_arg='yay', 66 | other_kw_arg='lol')) 67 | 68 | def test_failings(self): 69 | self.assertRaises(Exception, lambda: self.client.exists(45)) 70 | self.assertRaises(AttributeError, lambda: self.client.mkdir()) 71 | 72 | def test_FileAlreadyExists_propagation(self): 73 | self.assertRaises(luigi.target.FileAlreadyExists, 74 | lambda: self.client.exists(25, kw_arg='raise_fae')) 75 | 76 | def test_method_names_kwarg(self): 77 | self.client = CascadingClient(self.clients, method_names=[]) 78 | self.assertRaises(AttributeError, lambda: self.client.exists()) 79 | self.client = CascadingClient(self.clients, method_names=['exists']) 
80 | self.assertEqual(5, self.client.exists(5)) 81 | -------------------------------------------------------------------------------- /luigi/contrib/hdfs/abstract_client.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | """ 19 | Module containing abstract class about hdfs clients. 20 | """ 21 | 22 | import abc 23 | import luigi.target 24 | 25 | 26 | class HdfsFileSystem(luigi.target.FileSystem, metaclass=abc.ABCMeta): 27 | """ 28 | This client uses Apache 2.x syntax for file system commands, which also matched CDH4. 29 | """ 30 | 31 | def rename(self, path, dest): 32 | """ 33 | Rename or move a file. 34 | 35 | In hdfs land, "mv" is often called rename. So we add an alias for 36 | ``move()`` called ``rename()``. This is also to keep backward 37 | compatibility since ``move()`` became standardized in luigi's 38 | filesystem interface. 39 | """ 40 | return self.move(path, dest) 41 | 42 | def rename_dont_move(self, path, dest): 43 | """ 44 | Override this method with an implementation that uses rename2, 45 | which is a rename operation that never moves. 
46 | 47 | rename2 - 48 | https://github.com/apache/hadoop/blob/ae91b13/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java 49 | (lines 483-523) 50 | """ 51 | # We only override this method to be able to provide a more specific 52 | # docstring. 53 | return super(HdfsFileSystem, self).rename_dont_move(path, dest) 54 | 55 | @abc.abstractmethod 56 | def remove(self, path, recursive=True, skip_trash=False): 57 | pass 58 | 59 | @abc.abstractmethod 60 | def chmod(self, path, permissions, recursive=False): 61 | pass 62 | 63 | @abc.abstractmethod 64 | def chown(self, path, owner, group, recursive=False): 65 | pass 66 | 67 | @abc.abstractmethod 68 | def count(self, path): 69 | """ 70 | Count contents in a directory 71 | """ 72 | pass 73 | 74 | @abc.abstractmethod 75 | def copy(self, path, destination): 76 | pass 77 | 78 | @abc.abstractmethod 79 | def put(self, local_path, destination): 80 | pass 81 | 82 | @abc.abstractmethod 83 | def get(self, path, local_destination): 84 | pass 85 | 86 | @abc.abstractmethod 87 | def mkdir(self, path, parents=True, raise_if_exists=False): 88 | pass 89 | 90 | @abc.abstractmethod 91 | def listdir(self, path, ignore_directories=False, ignore_files=False, 92 | include_size=False, include_type=False, include_time=False, recursive=False): 93 | pass 94 | 95 | @abc.abstractmethod 96 | def touchz(self, path): 97 | pass 98 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .coverage.* 2 | doc/api/*.rst 3 | test/gcloud-credentials.json 4 | .hypothesis/ 5 | 6 | .nicesetup 7 | 8 | client.cfg 9 | luigi.cfg 10 | 11 | hadoop_test.py 12 | minicluster.py 13 | mrrunner.py 14 | pig_property_file 15 | 16 | packages.tar 17 | 18 | # Ignore the data files 19 | data 20 | test/data 21 | examples/data 22 | 23 | Vagrantfile 24 | 25 | *.pickle 26 | *.rej 27 | *.orig 28 | 29 | # Created by 
https://www.gitignore.io 30 | 31 | ### Python ### 32 | # Byte-compiled / optimized / DLL files 33 | __pycache__/ 34 | *.py[cod] 35 | 36 | # C extensions 37 | *.so 38 | 39 | # Distribution / packaging 40 | .Python 41 | env/ 42 | build/ 43 | develop-eggs/ 44 | dist/ 45 | downloads/ 46 | eggs/ 47 | # NOTE : lib/ prevents inclusion of static/visualiser/lib 48 | #lib/ 49 | lib64/ 50 | parts/ 51 | sdist/ 52 | var/ 53 | *.egg-info/ 54 | .installed.cfg 55 | *.egg 56 | 57 | # PyInstaller 58 | # Usually these files are written by a python script from a template 59 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 60 | *.manifest 61 | *.spec 62 | 63 | # Installer logs 64 | pip-log.txt 65 | pip-delete-this-directory.txt 66 | 67 | # Unit test / coverage reports 68 | htmlcov/ 69 | .tox/ 70 | .coverage 71 | .coverage.* 72 | .cache 73 | nosetests.xml 74 | coverage.xml 75 | my_dir 76 | 77 | # Translations 78 | *.mo 79 | *.pot 80 | 81 | # Django stuff: 82 | *.log 83 | 84 | # Sphinx documentation 85 | doc/_build/ 86 | 87 | # PyBuilder 88 | target/ 89 | 90 | 91 | ### Vim ### 92 | [._]*.s[a-w][a-z] 93 | [._]s[a-w][a-z] 94 | *.un~ 95 | Session.vim 96 | .netrwhist 97 | *~ 98 | 99 | 100 | ### PyCharm ### 101 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm 102 | 103 | *.iml 104 | 105 | ## Directory-based project format: 106 | .idea/ 107 | # if you remove the above rule, at least ignore the following: 108 | 109 | # User-specific stuff: 110 | # .idea/workspace.xml 111 | # .idea/tasks.xml 112 | # .idea/dictionaries 113 | 114 | # Sensitive or high-churn files: 115 | # .idea/dataSources.ids 116 | # .idea/dataSources.xml 117 | # .idea/sqlDataSources.xml 118 | # .idea/dynamic.xml 119 | # .idea/uiDesigner.xml 120 | 121 | # Gradle: 122 | # .idea/gradle.xml 123 | # .idea/libraries 124 | 125 | # Mongo Explorer plugin: 126 | # .idea/mongoSettings.xml 127 | 128 | ## File-based project format: 129 | *.ipr 130 | *.iws 131 | 132 | ## Plugin-specific 
files: 133 | 134 | # IntelliJ 135 | out/ 136 | 137 | # mpeltonen/sbt-idea plugin 138 | .idea_modules/ 139 | 140 | # JIRA plugin 141 | atlassian-ide-plugin.xml 142 | 143 | # Crashlytics plugin (for Android Studio and IntelliJ) 144 | com_crashlytics_export_strings.xml 145 | crashlytics.properties 146 | crashlytics-build.properties 147 | 148 | 149 | ### Vagrant ### 150 | .vagrant/ 151 | 152 | 153 | ### OSX ### 154 | .DS_Store 155 | .AppleDouble 156 | .LSOverride 157 | 158 | # Icon must end with two \r 159 | Icon 160 | 161 | 162 | # Thumbnails 163 | ._* 164 | 165 | # Files that might appear on external disk 166 | .Spotlight-V100 167 | .Trashes 168 | 169 | # Directories potentially created on remote AFP share 170 | .AppleDB 171 | .AppleDesktop 172 | Network Trash Folder 173 | Temporary Items 174 | .apdisk 175 | 176 | .python-version 177 | -------------------------------------------------------------------------------- /luigi/configuration/toml_parser.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2018 Vote Inc. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | import os.path 18 | from configparser import ConfigParser 19 | from typing import Any, Dict 20 | 21 | try: 22 | import toml 23 | toml_enabled = True 24 | except ImportError: 25 | toml_enabled = False 26 | 27 | from .base_parser import BaseParser 28 | from ..freezing import recursively_freeze 29 | 30 | 31 | class LuigiTomlParser(BaseParser, ConfigParser): 32 | NO_DEFAULT = object() 33 | enabled = bool(toml_enabled) 34 | data: Dict[str, Any] = dict() 35 | _instance = None 36 | _config_paths = [ 37 | '/etc/luigi/luigi.toml', 38 | 'luigi.toml', 39 | ] 40 | 41 | @staticmethod 42 | def _update_data(data, new_data): 43 | if not new_data: 44 | return data 45 | if not data: 46 | return new_data 47 | for section, content in new_data.items(): 48 | if section not in data: 49 | data[section] = dict() 50 | data[section].update(content) 51 | return data 52 | 53 | def read(self, config_paths): 54 | self.data = dict() 55 | for path in config_paths: 56 | if os.path.isfile(path): 57 | self.data = self._update_data(self.data, toml.load(path)) 58 | 59 | # freeze dict params 60 | for section, content in self.data.items(): 61 | for key, value in content.items(): 62 | if isinstance(value, dict): 63 | self.data[section][key] = recursively_freeze(value) 64 | 65 | return self.data 66 | 67 | def get(self, section, option, default=NO_DEFAULT, **kwargs): 68 | try: 69 | return self.data[section][option] 70 | except KeyError: 71 | if default is self.NO_DEFAULT: 72 | raise 73 | return default 74 | 75 | def getboolean(self, section, option, default=NO_DEFAULT): 76 | return self.get(section, option, default) 77 | 78 | def getint(self, section, option, default=NO_DEFAULT): 79 | return self.get(section, option, default) 80 | 81 | def getfloat(self, section, option, default=NO_DEFAULT): 82 | return self.get(section, option, default) 83 | 84 | def getintdict(self, section): 85 | return self.data.get(section, {}) 86 | 87 | def set(self, section, option, value=None): 88 | if section not in 
self.data: 89 | self.data[section] = {} 90 | self.data[section][option] = value 91 | 92 | def has_option(self, section, option): 93 | return section in self.data and option in self.data[section] 94 | 95 | def __getitem__(self, name): 96 | return self.data[name] 97 | -------------------------------------------------------------------------------- /test/config_toml_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2018 Vote inc. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 
16 | # 17 | from luigi.configuration import LuigiTomlParser, get_config, add_config_path 18 | 19 | 20 | from helpers import LuigiTestCase 21 | 22 | 23 | class TomlConfigParserTest(LuigiTestCase): 24 | @classmethod 25 | def setUpClass(cls): 26 | add_config_path('test/testconfig/luigi.toml') 27 | add_config_path('test/testconfig/luigi_local.toml') 28 | 29 | def setUp(self): 30 | LuigiTomlParser._instance = None 31 | super(TomlConfigParserTest, self).setUp() 32 | 33 | def test_get_config(self): 34 | config = get_config('toml') 35 | self.assertIsInstance(config, LuigiTomlParser) 36 | 37 | def test_file_reading(self): 38 | config = get_config('toml') 39 | self.assertIn('hdfs', config.data) 40 | 41 | def test_get(self): 42 | config = get_config('toml') 43 | 44 | # test getting 45 | self.assertEqual(config.get('hdfs', 'client'), 'hadoopcli') 46 | self.assertEqual(config.get('hdfs', 'client', 'test'), 'hadoopcli') 47 | 48 | # test default 49 | self.assertEqual(config.get('hdfs', 'test', 'check'), 'check') 50 | with self.assertRaises(KeyError): 51 | config.get('hdfs', 'test') 52 | 53 | # test override 54 | self.assertEqual(config.get('hdfs', 'namenode_host'), 'localhost') 55 | # test non-string values 56 | self.assertEqual(config.get('hdfs', 'namenode_port'), 50030) 57 | 58 | def test_set(self): 59 | config = get_config('toml') 60 | 61 | self.assertEqual(config.get('hdfs', 'client'), 'hadoopcli') 62 | config.set('hdfs', 'client', 'test') 63 | self.assertEqual(config.get('hdfs', 'client'), 'test') 64 | config.set('hdfs', 'check', 'test me') 65 | self.assertEqual(config.get('hdfs', 'check'), 'test me') 66 | 67 | def test_has_option(self): 68 | config = get_config('toml') 69 | self.assertTrue(config.has_option('hdfs', 'client')) 70 | self.assertFalse(config.has_option('hdfs', 'nope')) 71 | self.assertFalse(config.has_option('nope', 'client')) 72 | 73 | 74 | class HelpersTest(LuigiTestCase): 75 | def test_add_without_install(self): 76 | enabled = LuigiTomlParser.enabled 77 | 
LuigiTomlParser.enabled = False 78 | with self.assertRaises(ImportError): 79 | add_config_path('test/testconfig/luigi.toml') 80 | LuigiTomlParser.enabled = enabled 81 | 82 | def test_get_without_install(self): 83 | enabled = LuigiTomlParser.enabled 84 | LuigiTomlParser.enabled = False 85 | with self.assertRaises(ImportError): 86 | get_config('toml') 87 | LuigiTomlParser.enabled = enabled 88 | -------------------------------------------------------------------------------- /test/task_forwarded_attributes_test.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Copyright 2012-2015 Spotify AB 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | 18 | from helpers import LuigiTestCase, RunOnceTask 19 | 20 | import luigi 21 | import luigi.scheduler 22 | import luigi.worker 23 | 24 | 25 | FORWARDED_ATTRIBUTES = set(luigi.worker.TaskProcess.forward_reporter_attributes.values()) 26 | 27 | 28 | class NonYieldingTask(RunOnceTask): 29 | 30 | # need to accept messages in order for the "scheduler_message" attribute to be not None 31 | accepts_messages = True 32 | 33 | def gather_forwarded_attributes(self): 34 | """ 35 | Returns a set of names of attributes that are forwarded by the TaskProcess and that are not 36 | *None*. The tests in this file check if and which attributes are present at different times, 37 | e.g. 
while running, or before and after a dynamic dependency was yielded. 38 | """ 39 | attrs = set() 40 | for attr in FORWARDED_ATTRIBUTES: 41 | if getattr(self, attr, None) is not None: 42 | attrs.add(attr) 43 | return attrs 44 | 45 | def run(self): 46 | # store names of forwarded attributes which are only available within the run method 47 | self.attributes_while_running = self.gather_forwarded_attributes() 48 | 49 | # invoke the run method of the RunOnceTask which marks this task as complete 50 | RunOnceTask.run(self) 51 | 52 | 53 | class YieldingTask(NonYieldingTask): 54 | 55 | def run(self): 56 | # as TaskProcess._run_get_new_deps handles generators in a specific way, store names of 57 | # forwarded attributes before and after yielding a dynamic dependency, so we can explicitly 58 | # validate the attribute forwarding implementation 59 | self.attributes_before_yield = self.gather_forwarded_attributes() 60 | yield RunOnceTask() 61 | self.attributes_after_yield = self.gather_forwarded_attributes() 62 | 63 | # invoke the run method of the RunOnceTask which marks this task as complete 64 | RunOnceTask.run(self) 65 | 66 | 67 | class TaskForwardedAttributesTest(LuigiTestCase): 68 | 69 | def run_task(self, task): 70 | sch = luigi.scheduler.Scheduler() 71 | with luigi.worker.Worker(scheduler=sch) as w: 72 | w.add(task) 73 | w.run() 74 | return task 75 | 76 | def test_non_yielding_task(self): 77 | task = self.run_task(NonYieldingTask()) 78 | 79 | self.assertEqual(task.attributes_while_running, FORWARDED_ATTRIBUTES) 80 | 81 | def test_yielding_task(self): 82 | task = self.run_task(YieldingTask()) 83 | 84 | self.assertEqual(task.attributes_before_yield, FORWARDED_ATTRIBUTES) 85 | self.assertEqual(task.attributes_after_yield, FORWARDED_ATTRIBUTES) 86 | --------------------------------------------------------------------------------