Custom timeout per task and retry doesn't seem possible #1281

saro2-a · 2025-01-10T11:32:51Z

I was trying to restart stalled jobs using custom timeouts.

We have several jobs that, depending on the input, can last anywhere from 1 minute to 3 hours, with a uniform distribution. At job submission time we know approximately how long the job is going to take, but when I call "get_stalled_jobs", the "started_at" of the event does not seem to be retained when the job object is created:

It is fetched:
SELECT job.id, status, task_name, priority, lock, queueing_lock, args, scheduled_at, queue_name, attempts, max(event.at) started_at

but not retained
https://github.com/procrastinate-org/procrastinate/blob/main/procrastinate/manager.py#L175
https://github.com/procrastinate-org/procrastinate/blob/main/procrastinate/jobs.py#L77

which seems to make the following task impossible?

        @self.app.periodic(cron="*/10 * * * *")
        @self.app.task(queueing_lock="retry_stalled_jobs", pass_context=True)
        async def retry_stalled_jobs(context, timestamp):
            stalled_jobs = await self.app.job_manager.get_stalled_jobs(
                nb_seconds=RUNNING_JOBS_MAX_TIME_SECONDS
            )
            # TODO: it is currently not possible to give some jobs a custom duration;
            # this needs to be solved at the library level.
            for job in stalled_jobs:
                proc_task_max_run_time = job.task_kwargs.get("proc_task_max_run_time")
                # Problem: `job.started_at` does not exist, so there is nowhere
                # to get the start time of the event from.
                if not proc_task_max_run_time or proc_task_max_run_time < now() - job.started_at:
                    await self.app.job_manager.retry_job(job)

Could we either:

- support proc_task_max_run_time as a first-class parameter (probably preferred), or
- pass the started_at?

Thank you
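
For illustration, the check the snippet above is reaching for can be sketched as a small pure function. This is only a sketch under assumptions: it presumes a `started_at` timestamp is somehow available (which is exactly what is missing today), and the name `should_retry` and its parameters are hypothetical, not part of procrastinate. As in the snippet, a job without a custom limit is always retried, since it was already selected by get_stalled_jobs.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_retry(
    started_at: datetime,
    now: datetime,
    max_run_time_seconds: Optional[float] = None,
) -> bool:
    """Decide whether a stalled job should be retried (hypothetical helper)."""
    if max_run_time_seconds is None:
        # No per-job limit: the global stalled-job threshold already applied.
        return True
    # Retry only once the job has outlived its own limit.
    return (now - started_at) > timedelta(seconds=max_run_time_seconds)


now = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
started = now - timedelta(minutes=5)
print(should_retry(started, now, max_run_time_seconds=60))    # True: 5 min > 1 min
print(should_retry(started, now, max_run_time_seconds=3600))  # False: 5 min < 1 h
```

Keeping the decision separate from the database access means it can be tested without a running worker, whichever way the start time ends up being exposed.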


ewjoachim · 2025-01-10T15:16:09Z

This looks similar to #702, which we wanted to tackle in #740 with heartbeats.

EDIT: well, no, timeouts and retrying are different. It's close but not the same. I'll try to look into it in more detail.

ewjoachim · 2025-01-10T15:21:36Z

I think you're right in that the manager doesn't get access to the "Events" table. I think what would make the most sense is the ability to inspect the events of a job.
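
As a rough sketch of what inspecting a job's events could enable: given a list of a job's events, the latest start time can be derived the same way the `max(event.at) started_at` expression in the quoted query does. The `JobEvent` record and the `"started"` event type name are assumptions made for illustration; no such API exists in the manager at the time of this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class JobEvent:
    """Hypothetical event record (not a procrastinate class)."""

    type: str
    at: datetime


def latest_started_at(events: Iterable[JobEvent]) -> Optional[datetime]:
    """Return the most recent 'started' timestamp, or None if never started."""
    starts = [event.at for event in events if event.type == "started"]
    return max(starts, default=None)


events = [
    JobEvent("deferred", datetime(2025, 1, 10, 9, 0, tzinfo=timezone.utc)),
    JobEvent("started", datetime(2025, 1, 10, 9, 5, tzinfo=timezone.utc)),
    JobEvent("started", datetime(2025, 1, 10, 10, 30, tzinfo=timezone.utc)),
]
print(latest_started_at(events))  # 2025-01-10 10:30:00+00:00
```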
