├── .gitignore ├── LICENSE ├── README.md ├── examples ├── __init__.py └── basic_example.py ├── package.bat ├── requirements-dev.txt ├── setup.py └── tqdm_multiprocess ├── __init__.py ├── logger.py └── std.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.log 2 | .vs* 3 | tqdm-multiprocess.pyproj 4 | tqdm-multiprocess.sln 5 | __pycache__* 6 | tqdm_multiprocess.egg-info* 7 | build* 8 | dist* 9 | .pypirc 10 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 EleutherAI 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # tqdm-multiprocess 2 | Using queues, tqdm-multiprocess supports multiple worker processes, each with multiple tqdm progress bars, displaying them cleanly through the main process. The worker processes also have access to a single global tqdm for aggregate progress monitoring. 3 | 4 | Logging is also redirected from the subprocesses to the root logger in the main process. 5 | 6 | tqdm(iterator) is currently not supported; you will need to initialize your worker tqdms with a total and update them manually. 7 | 8 | Due to the performance limits of the default Python multiprocessing queue, you need to update your global and worker process tqdms infrequently to avoid flooding the main process. I will attempt to implement a lock-free ring buffer at some point to see if things can be improved. 9 | 10 | ## Installation 11 | 12 | ```bash 13 | pip install tqdm-multiprocess 14 | ``` 15 | 16 | ## Usage 17 | 18 | *TqdmMultiProcessPool* creates a standard Python multiprocessing pool with the desired number of processes. Under the hood it uses apply_async with an event loop to monitor a tqdm queue and a logging queue, allowing the worker processes to redirect both their tqdm objects and logging messages to your main process. There is also a queue for the workers to update the single global tqdm. 19 | 20 | As shown below, you create a list of tasks, each containing a function and a tuple of its parameters. The functions you pass in will need the extra arguments "tqdm_func, global_tqdm" on the end of their signatures. You must use tqdm_func when initializing your tqdms for the redirection to work. As mentioned above, passing iterators into the tqdm function is currently not supported, so set total=total_steps when setting up your tqdm, and then update the progress manually with the update() method. 
All other arguments to tqdm should work fine. 21 | 22 | Once you have your task list, call the map() method on your pool, passing in global_tqdm (or None) and your task list, as well as error and done callback functions; the process count was already set when you constructed the pool. The error callback will be triggered if your task function returns anything evaluating as False (`if not task_result` in the source code). The done callback will be called when the task successfully completes. 23 | 24 | The map method returns a list containing the returned results for all your tasks, in the original order. 25 | 26 | ### examples/basic_example.py 27 | 28 | ```python 29 | from time import sleep 30 | import multiprocessing 31 | import tqdm 32 | 33 | import logging 34 | from tqdm_multiprocess.logger import setup_logger_tqdm 35 | logger = logging.getLogger(__name__) 36 | 37 | from tqdm_multiprocess import TqdmMultiProcessPool 38 | 39 | iterations1 = 100 40 | iterations2 = 5 41 | iterations3 = 2 42 | def some_other_function(tqdm_func, global_tqdm): 43 | 44 | total_iterations = iterations1 * iterations2 * iterations3 45 | with tqdm_func(total=total_iterations, dynamic_ncols=True) as progress3: 46 | progress3.set_description("outer") 47 | for i in range(iterations3): 48 | logger.info("outer") 49 | total_iterations = iterations1 * iterations2 50 | with tqdm_func(total=total_iterations, dynamic_ncols=True) as progress2: 51 | progress2.set_description("middle") 52 | for j in range(iterations2): 53 | logger.info("middle") 54 | #for k in tqdm_func(range(iterations1), dynamic_ncols=True, desc="inner"): 55 | with tqdm_func(total=iterations1, dynamic_ncols=True) as progress1: 56 | for j in range(iterations1): 57 | # logger.info("inner") # Spam slows down tqdm too much 58 | progress1.set_description("inner") 59 | sleep(0.01) 60 | progress1.update() 61 | progress2.update() 62 | progress3.update() 63 | global_tqdm.update() 64 | 65 | logger.warning(f"Warning test message. 
{multiprocessing.current_process().name}") 66 | logger.error(f"Error test message. {multiprocessing.current_process().name}") 67 | 68 | 69 | # Multiprocessed 70 | def example_multiprocessing_function(some_input, tqdm_func, global_tqdm): 71 | logger.debug(f"Debug test message - I won't show up in console. {multiprocessing.current_process().name}") 72 | logger.info(f"Info test message. {multiprocessing.current_process().name}") 73 | some_other_function(tqdm_func, global_tqdm) 74 | return True 75 | 76 | def error_callback(result): 77 | print("Error!") 78 | 79 | def done_callback(result): 80 | print("Done. Result: ", result) 81 | 82 | def example(): 83 | process_count = 4 84 | pool = TqdmMultiProcessPool(process_count) 85 | 86 | task_count = 10 87 | initial_tasks = [(example_multiprocessing_function, (i,)) for i in range(task_count)] 88 | total_iterations = iterations1 * iterations2 * iterations3 * task_count 89 | with tqdm.tqdm(total=total_iterations, dynamic_ncols=True) as global_progress: 90 | global_progress.set_description("global") 91 | results = pool.map(global_progress, initial_tasks, error_callback, done_callback) 92 | print(results) 93 | 94 | if __name__ == '__main__': 95 | logfile_path = "tqdm_multiprocessing_example.log" 96 | setup_logger_tqdm(logfile_path) # Logger will write messages using tqdm.write 97 | example() 98 | ``` 99 | -------------------------------------------------------------------------------- /examples/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/EleutherAI/tqdm-multiprocess/fccefc473595055bf3a5e74bcf8a75b3a9517638/examples/__init__.py -------------------------------------------------------------------------------- /examples/basic_example.py: -------------------------------------------------------------------------------- 1 | from time import sleep 2 | import multiprocessing 3 | import tqdm 4 | 5 | import logging 6 | from tqdm_multiprocess.logger import 
setup_logger_tqdm 7 | logger = logging.getLogger(__name__) 8 | 9 | from tqdm_multiprocess import TqdmMultiProcessPool 10 | 11 | iterations1 = 100 12 | iterations2 = 5 13 | iterations3 = 2 14 | def some_other_function(tqdm_func, global_tqdm): 15 | 16 | total_iterations = iterations1 * iterations2 * iterations3 17 | with tqdm_func(total=total_iterations, dynamic_ncols=True) as progress3: 18 | progress3.set_description("outer") 19 | for i in range(iterations3): 20 | logger.info("outer") 21 | total_iterations = iterations1 * iterations2 22 | with tqdm_func(total=total_iterations, dynamic_ncols=True) as progress2: 23 | progress2.set_description("middle") 24 | for j in range(iterations2): 25 | logger.info("middle") 26 | #for k in tqdm_func(range(iterations1), dynamic_ncols=True, desc="inner"): 27 | with tqdm_func(total=iterations1, dynamic_ncols=True) as progress1: 28 | for j in range(iterations1): 29 | # logger.info("inner") # Spam slows down tqdm too much 30 | progress1.set_description("inner") 31 | sleep(0.01) 32 | progress1.update() 33 | progress2.update() 34 | progress3.update() 35 | global_tqdm.update() 36 | 37 | logger.warning(f"Warning test message. {multiprocessing.current_process().name}") 38 | logger.error(f"Error test message. {multiprocessing.current_process().name}") 39 | 40 | 41 | # Multiprocessed 42 | def example_multiprocessing_function(some_input, tqdm_func, global_tqdm): 43 | logger.debug(f"Debug test message - I won't show up in console. {multiprocessing.current_process().name}") 44 | logger.info(f"Info test message. {multiprocessing.current_process().name}") 45 | some_other_function(tqdm_func, global_tqdm) 46 | return True 47 | 48 | def error_callback(result): 49 | print("Error!") 50 | 51 | def done_callback(result): 52 | print("Done. 
Result: ", result) 53 | 54 | def example(): 55 | process_count = 4 56 | pool = TqdmMultiProcessPool(process_count) 57 | 58 | task_count = 10 59 | initial_tasks = [(example_multiprocessing_function, (i,)) for i in range(task_count)] 60 | total_iterations = iterations1 * iterations2 * iterations3 * task_count 61 | with tqdm.tqdm(total=total_iterations, dynamic_ncols=True) as global_progress: 62 | global_progress.set_description("global") 63 | results = pool.map(global_progress, initial_tasks, error_callback, done_callback) 64 | print(results) 65 | 66 | if __name__ == '__main__': 67 | logfile_path = "tqdm_multiprocessing_example.log" 68 | setup_logger_tqdm(logfile_path) # Logger will write messages using tqdm.write 69 | example() -------------------------------------------------------------------------------- /package.bat: -------------------------------------------------------------------------------- 1 | del dist\* /Q 2 | python setup.py sdist bdist_wheel 3 | python -m twine upload --repository pypi dist/* -------------------------------------------------------------------------------- /requirements-dev.txt: -------------------------------------------------------------------------------- 1 | twine -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | import os 3 | from io import open as io_open 4 | 5 | src_dir = os.path.abspath(os.path.dirname(__file__)) 6 | 7 | with open("README.md", "r") as fh: 8 | long_description = fh.read() 9 | 10 | # Build requirements 11 | extras_require = {} 12 | requirements_dev = os.path.join(src_dir, 'requirements-dev.txt') 13 | with io_open(requirements_dev, mode='r') as fd: 14 | extras_require['dev'] = [i.strip().split('#', 1)[0].strip() 15 | for i in fd.read().strip().split('\n')] 16 | 17 | # Get version from tqdm/_version.py 18 | # __version__ = None 19 | # src_dir = 
os.path.abspath(os.path.dirname(__file__)) 20 | # version_file = os.path.join(src_dir, '_version.py') 21 | # with io_open(version_file, mode='r') as fd: 22 | # exec(fd.read()) 23 | 24 | install_requires = ["tqdm", "colorama"] 25 | 26 | setuptools.setup( 27 | name="tqdm-multiprocess", 28 | version="0.0.11", 29 | author="researcher2", 30 | author_email="2researcher2@gmail.com", 31 | description="Easy multiprocessing with tqdm and logging redirected to main process.", 32 | long_description=long_description, 33 | long_description_content_type="text/markdown", 34 | url="https://github.com/EleutherAI/tqdm-multiprocess", 35 | classifiers=[ 36 | "Programming Language :: Python :: 3", 37 | "License :: OSI Approved :: MIT License", 38 | "Operating System :: OS Independent", 39 | ], 40 | 41 | python_requires='>=3.6', 42 | extras_require=extras_require, 43 | install_requires=install_requires, 44 | packages=['tqdm_multiprocess'], 45 | package_data={'tqdm_multiprocess': ['LICENSE', 'examples/*.py', 'requirements-dev.txt']}, 46 | ) 47 | -------------------------------------------------------------------------------- /tqdm_multiprocess/__init__.py: -------------------------------------------------------------------------------- 1 | from .std import TqdmMultiProcessPool 2 | -------------------------------------------------------------------------------- /tqdm_multiprocess/logger.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import time 3 | from datetime import timedelta 4 | from tqdm import tqdm 5 | 6 | class LogFormatter(): 7 | 8 | def __init__(self): 9 | self.start_time = time.time() 10 | 11 | def format(self, record): 12 | elapsed_seconds = round(record.created - self.start_time) 13 | 14 | prefix = "%s - %s - %s" % ( 15 | record.levelname, 16 | time.strftime('%x %X'), 17 | timedelta(seconds=elapsed_seconds) 18 | ) 19 | message = record.getMessage() 20 | message 
= message.replace('\n', '\n' + ' ' * (len(prefix) + 3)) 21 | return "%s - %s" % (prefix, message) if message else '' 22 | 23 | def reset_time(self): 24 | self.start_time = time.time() 25 | 26 | def setup_logger(filepath=None, to_console=True, formatter=LogFormatter()): 27 | 28 | # create logger 29 | logger = logging.getLogger() 30 | logger.setLevel(logging.DEBUG) 31 | logger.propagate = False 32 | 33 | logger.handlers = [] 34 | 35 | # create file handler 36 | if filepath is not None: 37 | file_handler = logging.FileHandler(filepath, "a") 38 | file_handler.setLevel(logging.DEBUG) 39 | file_handler.setFormatter(formatter) 40 | logger.addHandler(file_handler) 41 | 42 | # create console handler 43 | if to_console: 44 | console_handler = logging.StreamHandler() 45 | console_handler.setLevel(logging.INFO) 46 | console_handler.setFormatter(formatter) 47 | logger.addHandler(console_handler) 48 | 49 | class ChildProcessHandler(logging.StreamHandler): 50 | def __init__(self, message_queue): 51 | self.message_queue = message_queue 52 | logging.StreamHandler.__init__(self) 53 | 54 | def emit(self, record): 55 | self.message_queue.put(record) 56 | 57 | def setup_logger_child_process(message_queue): 58 | # create logger 59 | logger = logging.getLogger() 60 | logger.setLevel(logging.DEBUG) 61 | logger.propagate = False 62 | 63 | logger.handlers = [] 64 | 65 | # create queue handler 66 | child_process_handler = ChildProcessHandler(message_queue) 67 | child_process_handler.setLevel(logging.INFO) 68 | logger.addHandler(child_process_handler) 69 | 70 | class TqdmHandler(logging.StreamHandler): 71 | def __init__(self): 72 | logging.StreamHandler.__init__(self) 73 | 74 | def emit(self, record): 75 | msg = self.format(record) 76 | tqdm.write(msg) 77 | 78 | def setup_logger_tqdm(filepath=None, formatter=LogFormatter()): 79 | 80 | # create logger 81 | logger = logging.getLogger() 82 | logger.setLevel(logging.DEBUG) 83 | logger.propagate = False 84 | 85 | logger.handlers = [] 86 | 87 | # 
create file handler 88 | if filepath is not None: 89 | file_handler = logging.FileHandler(filepath, "a") 90 | file_handler.setLevel(logging.DEBUG) 91 | file_handler.setFormatter(formatter) 92 | logger.addHandler(file_handler) 93 | 94 | # create tqdm handler 95 | tqdm_handler = TqdmHandler() 96 | tqdm_handler.setLevel(logging.INFO) 97 | tqdm_handler.setFormatter(formatter) 98 | logger.addHandler(tqdm_handler) -------------------------------------------------------------------------------- /tqdm_multiprocess/std.py: -------------------------------------------------------------------------------- 1 | import multiprocessing 2 | import signal 3 | from signal import SIGINT, SIG_IGN 4 | from queue import Empty as EmptyQueue 5 | import sys 6 | import tqdm 7 | from functools import partial 8 | 9 | import logging 10 | from .logger import setup_logger_child_process 11 | logger = logging.getLogger(__name__) 12 | 13 | class MultiProcessTqdm(object): 14 | def __init__(self, message_queue, tqdm_id, *args, **kwargs): 15 | self.message_queue = message_queue 16 | self.tqdm_id = tqdm_id 17 | message = (multiprocessing.current_process().name, "__init__", args, kwargs) 18 | self.message_queue.put((self.tqdm_id, message)) 19 | 20 | def __enter__(self, *args, **kwargs): 21 | message = (multiprocessing.current_process().name, "__enter__", args, kwargs) 22 | self.message_queue.put((self.tqdm_id, message)) 23 | return self 24 | 25 | def __exit__(self, *args, **kwargs): 26 | message = (multiprocessing.current_process().name, "__exit__", args, kwargs) 27 | self.message_queue.put((self.tqdm_id, message)) 28 | 29 | def __getattr__(self, method_name): 30 | def _missing(*args, **kwargs): 31 | message = (multiprocessing.current_process().name, method_name, args, kwargs) 32 | self.message_queue.put((self.tqdm_id, message)) 33 | return _missing 34 | 35 | class GlobalMultiProcessTqdm(MultiProcessTqdm): 36 | # We don't want to init so no message is passed. Also the id is not applicable. 
37 | def __init__(self, message_queue): 38 | self.message_queue = message_queue 39 | self.tqdm_id = 0 40 | 41 | def get_multi_tqdm(message_queue, tqdms_list, *args, **kwargs): 42 | tqdm_id = len(tqdms_list) 43 | # kwargs["mininterval"] = 1 # Slow it down 44 | multi_tqdm = MultiProcessTqdm(message_queue, tqdm_id, *args, **kwargs) 45 | tqdms_list.append(multi_tqdm) 46 | return multi_tqdm 47 | 48 | terminate = False 49 | def handler(signal_received, frame): 50 | global terminate 51 | terminate = True 52 | 53 | # Signal handling for multiprocessing. The "correct" answer doesn't work on Windows at all. 54 | # Using the version with a very slight race condition. Don't ctrl-c in that minuscule time window... 55 | # https://stackoverflow.com/questions/11312525/catch-ctrlc-sigint-and-exit-multiprocesses-gracefully-in-python 56 | def init_worker(logging_queue): 57 | setup_logger_child_process(logging_queue) 58 | signal.signal(SIGINT, SIG_IGN) 59 | 60 | def task_wrapper(tqdm_queue, global_tqdm_queue, operation, *args): 61 | tqdms_list = [] 62 | tqdm_partial = partial(get_multi_tqdm, tqdm_queue, tqdms_list) 63 | global_tqdm = GlobalMultiProcessTqdm(global_tqdm_queue) 64 | return operation(*args, tqdm_partial, global_tqdm) 65 | 66 | class TqdmMultiProcessPool(object): 67 | def __init__(self, process_count): 68 | self.mp_manager = multiprocessing.Manager() 69 | self.logging_queue = self.mp_manager.Queue() 70 | self.tqdm_queue = self.mp_manager.Queue() 71 | self.global_tqdm_queue = self.mp_manager.Queue() 72 | self.process_count = process_count 73 | worker_init_function = partial(init_worker, self.logging_queue) 74 | self.mp_pool = multiprocessing.Pool(self.process_count, worker_init_function) 75 | 76 | def map(self, global_tqdm, tasks, on_error, on_done): 77 | 78 | self.previous_signal_int = signal.signal(SIGINT, handler) 79 | 80 | tqdms = {} # Worker tqdm instances, keyed by worker number then tqdm id 81 | 82 | async_results = [] 83 | for operation, args in tasks: 84 | wrapper_args = 
tuple([self.tqdm_queue, self.global_tqdm_queue, operation] + list(args)) 85 | async_results.append(self.mp_pool.apply_async(task_wrapper, wrapper_args)) 86 | 87 | completion_status = [False for _ in async_results] 88 | countdown = len(completion_status) 89 | task_results = [None for _ in async_results] 90 | while countdown > 0 and not terminate: 91 | # Worker Logging 92 | try: 93 | logger_record = self.logging_queue.get_nowait() 94 | getattr(logger, logger_record.levelname.lower())(logger_record.getMessage()) 95 | except (EmptyQueue, InterruptedError): 96 | pass 97 | 98 | # Worker tqdms 99 | try: 100 | count = 0 101 | while True: 102 | tqdm_id, tqdm_message = self.tqdm_queue.get_nowait() 103 | process_id, method_name, args, kwargs = tqdm_message 104 | process_id = int(process_id.split("-")[-1]) # Worker number from process name, e.g. "ForkPoolWorker-12" 105 | if process_id not in tqdms: 106 | tqdms[process_id] = {} 107 | 108 | if method_name == "__init__": 109 | tqdms[process_id][tqdm_id] = tqdm.tqdm(*args, **kwargs) 110 | else: 111 | getattr(tqdms[process_id][tqdm_id], method_name)(*args, **kwargs) 112 | 113 | count += 1 114 | if count > 1000: 115 | logger.info("Tqdm worker queue flood.") 116 | except (EmptyQueue, InterruptedError): 117 | pass 118 | 119 | # Global tqdm 120 | try: 121 | count = 0 122 | while True: 123 | tqdm_id, tqdm_message = self.global_tqdm_queue.get_nowait() 124 | process_id, method_name, args, kwargs = tqdm_message 125 | getattr(global_tqdm, method_name)(*args, **kwargs) 126 | 127 | count += 1 128 | if count > 1000: 129 | logger.info("Tqdm global queue flood.") 130 | except (EmptyQueue, InterruptedError): 131 | pass 132 | 133 | # Task Completion 134 | for i, async_result in enumerate(async_results): 135 | if completion_status[i]: 136 | continue 137 | if async_result.ready(): 138 | task_result = async_result.get() 139 | task_results[i] = task_result 140 | completion_status[i] = True 141 | countdown -= 1 142 | 143 | # Task failed, do on_error 144 | if not task_result: 145 | on_error(task_result) 146 | 147 | 
on_done(task_result) 148 | 149 | if terminate: 150 | logger.info('SIGINT or CTRL-C detected, closing pool. Please wait.') 151 | self.mp_pool.close() 152 | 153 | # Clear out the remaining message queues. get_nowait sometimes returns garbage 154 | # without erroring, so we just catch all exceptions; we don't care that much 155 | # about logging messages at this point. 156 | try: 157 | while True: 158 | logger_record = self.logging_queue.get_nowait() 159 | getattr(logger, logger_record.levelname.lower())(logger_record.getMessage()) 160 | except (EmptyQueue, InterruptedError): 161 | pass 162 | except Exception: 163 | pass 164 | 165 | try: 166 | while True: 167 | tqdm_id, tqdm_message = self.global_tqdm_queue.get_nowait() 168 | process_id, method_name, args, kwargs = tqdm_message 169 | getattr(global_tqdm, method_name)(*args, **kwargs) 170 | except (EmptyQueue, InterruptedError): 171 | pass 172 | 173 | try: 174 | while True: 175 | tqdm_record = self.tqdm_queue.get_nowait() 176 | tqdm_id, tqdm_message = tqdm_record 177 | process_id, method_name, args, kwargs = tqdm_message 178 | process_id = int(process_id.split("-")[-1]) # Worker number from process name, e.g. "ForkPoolWorker-12" 179 | if method_name == "__init__": 180 | tqdms.setdefault(process_id, {})[tqdm_id] = tqdm.tqdm(*args, **kwargs) 181 | else: 182 | getattr(tqdms[process_id][tqdm_id], method_name)(*args, **kwargs) 183 | except (EmptyQueue, InterruptedError): 184 | pass 185 | 186 | if terminate: 187 | logger.info('Terminating.') 188 | for key, process_tqdms in tqdms.items(): 189 | for key, tqdm_instance in process_tqdms.items(): 190 | if tqdm_instance: 191 | tqdm_instance.close() 192 | sys.exit(0) # Will trigger __exit__ 193 | 194 | signal.signal(SIGINT, self.previous_signal_int) 195 | 196 | return task_results --------------------------------------------------------------------------------
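As a closing note, the callback convention the README describes (a task result that evaluates as False triggers the error callback; a successful result triggers the done callback) can be sketched in isolation. This is an illustrative sketch only: `dispatch_results` is a hypothetical helper, not part of tqdm-multiprocess.

```python
# Illustrative sketch (not library code) of the callback convention the README
# describes: a falsy task result fires on_error, a truthy one fires on_done.
def dispatch_results(task_results, on_error, on_done):
    for result in task_results:
        if not result:  # e.g. None, False, 0 -> treated as a failed task
            on_error(result)
        else:
            on_done(result)
    return task_results

errors, done = [], []
dispatch_results([True, None, 42], errors.append, done.append)
# errors == [None], done == [True, 42]
```

Returning a falsy sentinel such as None from a task function is therefore how a worker signals failure to the pool.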