Description
Determine this is the right repository
- I determined this is the correct repository in which to report this bug.
Summary of the issue
Context
We have a service that uses Cloud Tasks to dispatch and run long-running tasks. It ran well for months, but since 15 May we have been seeing sporadic DeadlineExceeded errors raised from gRPC. We have not changed anything in our system or in the way we enqueue tasks.
We estimate that around 20% of tasks fail with DeadlineExceeded.
After seeing these errors, we added retries and a timeout when creating tasks. The configuration was similar to this:
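import requests
from google.api_core import exceptions
from google.api_core.retry import Retry, if_exception_type
from google.auth import exceptions as auth_exceptions

retry = Retry(
    predicate=if_exception_type(
        exceptions.TooManyRequests,
        exceptions.ServiceUnavailable,
        requests.exceptions.ConnectionError,
        requests.exceptions.ChunkedEncodingError,
        auth_exceptions.TransportError,
        exceptions.DeadlineExceeded,  # exception we started noticing
    ),
    initial=1.0,  # first backoff delay, in seconds
    maximum=2.0,  # cap on the backoff delay, in seconds
    multiplier=2.0,
    timeout=15.0,  # how long to keep retrying
)
client.create_task(
    request={"parent": parent, "task": task}, timeout=5.0, retry=retry
)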
The tasks that raised an exception were retried, but none of them was queued successfully before the retry timed out and re-raised DeadlineExceeded.
The tasks that are queued successfully complete very quickly, so the configured timeouts should have been enough.
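To put a number on "should have been enough", here is a rough worst-case estimate of how many attempts the configuration above allows. It is only a sketch and does not reproduce `google.api_core`'s actual retry loop: the helper `worst_case_attempts` is hypothetical, and it assumes every failing attempt consumes its full per-call timeout and every backoff sleep takes its capped value with no jitter.

```python
def worst_case_attempts(per_call_timeout, overall_timeout, initial, maximum, multiplier):
    """Count the attempts that can start before the overall retry budget runs out.

    Assumption: each attempt fails only after using its full per-call timeout,
    and each backoff sleep takes its capped, un-jittered value.
    """
    elapsed = 0.0
    delay = min(initial, maximum)
    attempts = 0
    while True:
        attempts += 1
        elapsed += per_call_timeout  # the attempt runs until its own deadline
        if elapsed + delay > overall_timeout:
            # No room left to sleep and start another attempt.
            return attempts
        elapsed += delay
        delay = min(delay * multiplier, maximum)


# Values taken from the Retry(...) and create_task(...) calls in this report.
attempts = worst_case_attempts(
    per_call_timeout=5.0, overall_timeout=15.0, initial=1.0, maximum=2.0, multiplier=2.0
)
print(attempts)
```

Under these assumptions, even slow failures leave room for more than one attempt, which matches the retries we observed.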
After these failed attempts, the only way we found to make it work, and to ensure that 100% of our tasks are enqueued, was to switch the transport from gRPC to HTTP (REST):
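from google.cloud.tasks_v2 import CloudTasksClient
from google.cloud.tasks_v2.services.cloud_tasks.transports import CloudTasksRestTransport

client = CloudTasksClient(
    transport=CloudTasksRestTransport(credentials=...)
)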
Is there anything we are missing, or has anything changed in the way the library uses gRPC that would explain why we started seeing this behaviour?
Expected Behavior:
All our tasks should be queued successfully when retrying on sporadic DeadlineExceeded errors.
Actual Behavior:
We are getting sporadic DeadlineExceeded errors, and the affected tasks never get queued, even on retry.
API client name and version
google-cloud-tasks 2.19.2
Reproduction steps: code
This is pseudocode using the gRPC transport, from before we switched to HTTP. Note that the DeadlineExceeded exception only seems to occur for ~20% of the tasks we queue. The rest of the tasks were queued successfully.
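file: main.py
from google.api_core import exceptions
from google.api_core.retry import Retry, if_exception_type
from google.cloud.tasks_v2 import CloudTasksClient

client = CloudTasksClient()  # default transport is gRPC
retry = Retry(
    predicate=if_exception_type(
        exceptions.DeadlineExceeded,  # exception we started noticing
    ),
    initial=1.0,
    maximum=2.0,
    multiplier=2.0,
    timeout=15.0,  # how long to keep retrying
)
# `parent` (the queue path) and `task` are defined elsewhere.
client.create_task(
    request={"parent": parent, "task": task}, timeout=5.0, retry=retry
)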
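To make a repro like this more informative, the duration and outcome of each `create_task` call could be recorded. The following is a minimal sketch, not part of our service: `timed` and `fake_create_task` are hypothetical names, and the fake function stands in for `client.create_task` so the example runs without Google Cloud credentials. Note that wrapping `create_task` times the whole call, including any retries inside it, not each individual attempt.

```python
import time


def timed(func, log):
    """Wrap func so each call appends (outcome, seconds) to log."""
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            # Record the exception type and how long the call took, then re-raise.
            log.append((type(exc).__name__, time.monotonic() - start))
            raise
        log.append(("ok", time.monotonic() - start))
        return result
    return wrapper


def fake_create_task(fail):
    """Hypothetical stand-in for client.create_task."""
    if fail:
        raise TimeoutError("simulated deadline")
    return "queued"


attempt_log = []
create = timed(fake_create_task, attempt_log)
create(fail=False)
try:
    create(fail=True)
except TimeoutError:
    pass
print([outcome for outcome, _ in attempt_log])
```

With the real client, the same wrapper could be applied as `timed(client.create_task, attempt_log)` to show whether failing calls run for the full configured timeout or fail sooner.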
Reproduction steps: supporting files
No response
Reproduction steps: actual results
No response
Reproduction steps: expected results
OS & version + platform
Debian Bookworm (12) on App Engine flex
Python environment
Python 3.12.10
Python dependencies
Here is the list of our dependencies related to gRPC
grpc-google-iam-v1 0.14.2
grpcio 1.71.0
grpcio-status 1.62.3
Additional context
Here is a sample of the traceback of the actual exception we've been seeing:
  File "/opt/.venv/lib/python3.12/site-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.DEADLINE_EXCEEDED
    details = "Deadline Exceeded"
    debug_error_string = "UNKNOWN:Error received from peer {created_time:"2025-05-21T14:14:37.173146012+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/.venv/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/opt/.venv/lib/python3.12/site-packages/google/api_core/timeout.py", line 130, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/.venv/lib/python3.12/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded