ProcessPoolExecutor hangs when 1<max_tasks_per_child<num_submitted//max_workers #115634

Open
@dalleyg

Bug report

Bug description:

Starting in Python 3.11, where the max_tasks_per_child parameter was introduced, ProcessPoolExecutor hangs when max_tasks_per_child>1 and enough tasks have been submitted to trigger a worker restart.
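As a rough sketch, the condition from the issue title can be written as a small predicate. The function name and argument names are hypothetical, and the inequality is only my characterization of when the hang shows up, not something taken from the executor's implementation.

```python
def restart_hang_expected(max_workers, max_tasks_per_child, num_submitted):
    """Return True when the title's condition holds:
    1 < max_tasks_per_child < num_submitted // max_workers.

    Hypothetical helper: it encodes the observed pattern only and does not
    inspect ProcessPoolExecutor itself.
    """
    return 1 < max_tasks_per_child < num_submitted // max_workers


# Configuration used in the reproducer: 1 worker, 2 tasks per child, 10 tasks.
print(restart_hang_expected(1, 2, 10))
# max_tasks_per_child=1 is the case reported to work.
print(restart_hang_expected(1, 1, 10))
```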

The following reproducer hangs on a fresh installation of Python 3.11 on Linux.

import os
from concurrent.futures import ProcessPoolExecutor
with ProcessPoolExecutor(1, max_tasks_per_child=2) as exe:
    futs = [exe.submit(os.getpid) for _ in range(10)]
    for fut in futs:
        print(fut.result())

Issuing a keyboard interrupt results in the following stack trace:

Traceback (most recent call last):
  File "<string>", line 7, in <module>
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 451, in result
    self._condition.wait(timeout)
  File "/usr/lib/python3.11/threading.py", line 320, in wait
    waiter.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 4, in <module>
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 647, in __exit__
    self.shutdown(wait=True)
  File "/usr/lib/python3.11/concurrent/futures/process.py", line 825, in shutdown
    self._executor_manager_thread.join()
  File "/usr/lib/python3.11/threading.py", line 1112, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.11/threading.py", line 1132, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt

Notes:

  • Interleaving submissions and result checking does not help.
  • Adding a timeout to the result() calls does not help.
  • Increasing the pool size does not help, unless it's made large enough to not require any worker restarts.
  • Interestingly, setting max_tasks_per_child=1 works great. It never hangs, and a new process is correctly used for each task.
  • It does not hang if max_tasks_per_child is set high enough so that no worker restarts happen.
  • I have reproduced this problem in the following test environments:
    • On GitHub's default Linux CI environment using Python 3.11.
    • On GitHub's default Linux CI environment using Python 3.12.
    • On Ubuntu 22.04.3 LTS running under WSL2 using a fresh installation of Python 3.11.
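Since a timeout on result() does not help, one way to check an environment without blocking the shell is to run the reproducer in a separate interpreter and kill it after a deadline. This is a minimal sketch, assuming a POSIX system (it relies on start_new_session and os.killpg). The helper name finishes_within and the REPRODUCER constant are hypothetical; the reproducer source is the one given above.

```python
import os
import signal
import subprocess
import sys

# The reproducer from this report, run via "python -c" in a child interpreter.
REPRODUCER = """
import os
from concurrent.futures import ProcessPoolExecutor
with ProcessPoolExecutor(1, max_tasks_per_child=2) as exe:
    futs = [exe.submit(os.getpid) for _ in range(10)]
    for fut in futs:
        print(fut.result())
"""


def finishes_within(code, timeout):
    """Run `code` in a fresh interpreter.

    Returns True if the interpreter exits (with any status) before `timeout`
    seconds, and False if it had to be killed.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: child leads its own process group
    )
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # Kill the whole group so pool worker processes are not left behind.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()
        return False
```

On an affected interpreter I would expect finishes_within(REPRODUCER, 30) to return False. The 30-second deadline is an arbitrary choice.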

CPython versions tested on:

3.11, 3.12

Operating systems tested on:

Linux
