[RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed

linux-mediatek.lists.infradead.org archive mirror

* [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
@ 2025-06-15 13:10 Kuyo Chang
  2025-06-16 15:03 ` Juri Lelli
  2025-07-30 10:06 ` Geert Uytterhoeven
  0 siblings, 2 replies; 13+ messages in thread
From: Kuyo Chang @ 2025-06-15 13:10 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Matthias Brugger, AngeloGioacchino Del Regno
  Cc: jstultz, kuyo chang, linux-kernel, linux-arm-kernel,
	linux-mediatek

From: kuyo chang <kuyo.chang@mediatek.com>

[Symptom]
The fair server mechanism is intended to prevent starvation of fair
tasks when higher-priority tasks monopolize the CPU. However, it can
in turn starve RT tasks: RT tasks on the runqueue may not be scheduled
as expected.

[Analysis]
---------
The log message "sched: DL replenish lagged too much" was triggered.

Memory dump of dl_server:
--------------
    curr = 0xFFFFFF80D6A0AC00 (
      dl_server = 0xFFFFFF83CD5B1470(
        dl_runtime = 0x02FAF080,
        dl_deadline = 0x3B9ACA00,
        dl_period = 0x3B9ACA00,
        dl_bw = 0xCCCC,
        dl_density = 0xCCCC,
        runtime = 0x02FAF080,
        deadline = 0x0000082031EB0E80,
        flags = 0x0,
        dl_throttled = 0x0,
        dl_yielded = 0x0,
        dl_non_contending = 0x0,
        dl_overrun = 0x0,
        dl_server = 0x1,
        dl_server_active = 0x1,
        dl_defer = 0x1,
        dl_defer_armed = 0x0,
        dl_defer_running = 0x1,
        dl_timer = (
          node = (
            expires = 0x000008199756E700),
          _softexpires = 0x000008199756E700,
          function = 0xFFFFFFDB9AF44D30 = dl_task_timer,
          base = 0xFFFFFF83CD5A12C0,
          state = 0x0,
          is_rel = 0x0,
          is_soft = 0x0,
    clock_update_flags = 0x4,
    clock = 0x000008204A496900,

- The timer expiration time (rq->curr->dl_server->dl_timer->expires)
  is already in the past, indicating the timer has expired.
- The timer state (rq->curr->dl_server->dl_timer->state) is 0.
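
For reference, the hexadecimal fields in the dump can be decoded with a
small userspace sketch. This is not kernel code: bw_ratio() and
ns_behind() are hypothetical helpers, and the 20-bit shift is assumed to
match the scheduler's fixed-point bandwidth unit. It also assumes all
timestamps share one nanosecond timeline, which need not hold exactly
for the hrtimer expiry versus the rq clock.

```c
#include <stdint.h>

#define BW_SHIFT 20	/* assumed fixed-point shift for bandwidth values */

/* Values copied verbatim from the dump above. */
static const uint64_t DL_RUNTIME = 0x02FAF080ULL;
static const uint64_t DL_PERIOD  = 0x3B9ACA00ULL;
static const uint64_t DEADLINE   = 0x0000082031EB0E80ULL;
static const uint64_t RQ_CLOCK   = 0x000008204A496900ULL;
static const uint64_t TIMER_EXP  = 0x000008199756E700ULL;

/* Hypothetical helper: runtime/period as a fraction in 2^-BW_SHIFT units. */
static uint64_t bw_ratio(uint64_t runtime, uint64_t period)
{
	return (runtime << BW_SHIFT) / period;
}

/* Hypothetical helper: how far (ns) the clock is past a stamp, 0 if not past. */
static uint64_t ns_behind(uint64_t stamp, uint64_t clock)
{
	return clock > stamp ? clock - stamp : 0;
}
```

Under those assumptions, dl_runtime and dl_period decode to 50 ms and
1 s, bw_ratio() of the two reproduces dl_bw = 0xCCCC, and the clock is
roughly 409 ms past the deadline and roughly 28.8 s past the timer
expiry.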

[Suspected Root Cause]
--------------------
The relevant code flow in the throttle path of
update_curr_dl_se() is as follows:

dequeue_dl_entity(dl_se, 0);                // the DL entity is dequeued

if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
    if (dl_server(dl_se))                   // timer registration fails
        enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);//enqueue immediately
    ...
}

start_dl_timer() fails because it attempts to register a timer whose
expiration time is already in the past. When this situation persists,
the code repeatedly re-enqueues the DL entity without properly
replenishing it or restarting the timer, so RT tasks may not be
scheduled as expected.
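
The failure condition can be illustrated with a simplified userspace
model. It is only a sketch: struct dl_model, next_period() and
model_start_timer() are hypothetical names, the real function also
handles the deferred-server case and converts between the rq clock and
the hrtimer clock, and both of those steps are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the fields used below. */
struct dl_model {
	uint64_t deadline;	/* absolute deadline, ns */
	uint64_t dl_deadline;	/* relative deadline, ns */
	uint64_t dl_period;	/* period, ns */
};

/* Start of the next period: the instant the replenishment timer targets. */
static uint64_t next_period(const struct dl_model *se)
{
	return se->deadline - se->dl_deadline + se->dl_period;
}

/*
 * Returns true if a timer could be armed, false if the target instant
 * is already behind 'now' (the case described as a registration failure).
 */
static bool model_start_timer(const struct dl_model *se, uint64_t now)
{
	return (int64_t)(next_period(se) - now) >= 0;
}
```

Feeding in a deadline that is already behind the clock makes
model_start_timer() return false, which corresponds to the branch where
the caller falls back to an immediate re-enqueue.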

[Proposed Solution]:
------------------
Instead of immediately re-enqueuing the DL entity when timer
registration fails, this change ensures the DL entity is properly
replenished and the timer is restarted, preventing potential RT
starvation.

Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
---
 kernel/sched/deadline.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ad45a8fea245..e50cb76c961b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1556,10 +1556,12 @@ static void update_curr_dl_se(struct rq *rq, struct sched_dl_entity *dl_se, s64
 		}
 
 		if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
-			if (dl_server(dl_se))
-				enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
-			else
+			if (dl_server(dl_se)) {
+				replenish_dl_new_period(dl_se, rq);
+				start_dl_timer(dl_se);
+			} else {
 				enqueue_task_dl(rq, dl_task_of(dl_se), ENQUEUE_REPLENISH);
+			}
 		}
 
 		if (!is_leftmost(dl_se, &rq->dl))
-- 
2.45.2




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-15 13:10 [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed Kuyo Chang
@ 2025-06-16 15:03 ` Juri Lelli
  2025-06-18 14:20   ` Kuyo Chang
  2025-07-30 10:06 ` Geert Uytterhoeven
  1 sibling, 1 reply; 13+ messages in thread
From: Juri Lelli @ 2025-06-16 15:03 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Matthias Brugger, AngeloGioacchino Del Regno, jstultz,
	linux-kernel, linux-arm-kernel, linux-mediatek

Hello,

On 15/06/25 21:10, Kuyo Chang wrote:
> From: kuyo chang <kuyo.chang@mediatek.com>
> 
> [Symptom]
> The fair server mechanism, which is intended to prevent fair starvation
> when higher-priority tasks monopolize the CPU.
> Specifically, RT tasks on the runqueue may not be scheduled as expected.
> 
> [Analysis]
> ---------
> The log "sched: DL replenish lagged too much" triggered.
> 
> By memory dump of dl_server:
> --------------
>     curr = 0xFFFFFF80D6A0AC00 (
>       dl_server = 0xFFFFFF83CD5B1470(
>         dl_runtime = 0x02FAF080,
>         dl_deadline = 0x3B9ACA00,
>         dl_period = 0x3B9ACA00,
>         dl_bw = 0xCCCC,
>         dl_density = 0xCCCC,
>         runtime = 0x02FAF080,
>         deadline = 0x0000082031EB0E80,
>         flags = 0x0,
>         dl_throttled = 0x0,
>         dl_yielded = 0x0,
>         dl_non_contending = 0x0,
>         dl_overrun = 0x0,
>         dl_server = 0x1,
>         dl_server_active = 0x1,
>         dl_defer = 0x1,
>         dl_defer_armed = 0x0,
>         dl_defer_running = 0x1,
>         dl_timer = (
>           node = (
>             expires = 0x000008199756E700),
>           _softexpires = 0x000008199756E700,
>           function = 0xFFFFFFDB9AF44D30 = dl_task_timer,
>           base = 0xFFFFFF83CD5A12C0,
>           state = 0x0,
>           is_rel = 0x0,
>           is_soft = 0x0,
>     clock_update_flags = 0x4,
>     clock = 0x000008204A496900,
> 
> - The timer expiration time (rq->curr->dl_server->dl_timer->expires)
>   is already in the past, indicating the timer has expired.
> - The timer state (rq->curr->dl_server->dl_timer->state) is 0.
> 
> [Suspected Root Cause]
> --------------------
> The relevant code flow in the throttle path of
> update_curr_dl_se() as follows:
> 
> dequeue_dl_entity(dl_se, 0);                // the DL entity is dequeued
> 
> if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
>     if (dl_server(dl_se))                   // timer registration fails
>         enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);//enqueue immediately
>     ...
> }
> 
> The failure of `start_dl_timer` is caused by attempting to register a
> timer with an expiration time that is already in the past. When this
> situation persists, the code repeatedly re-enqueues the DL entity
> without properly replenishing or restarting the timer, resulting in RT
> task may not be scheduled as expected.
> 
> [Proposed Solution]:
> ------------------
> Instead of immediately re-enqueuing the DL entity on timer registration
> failure, this change ensures the DL entity is properly replenished and
> the timer is restarted, preventing RT potential starvation.
> 
> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> ---
>  kernel/sched/deadline.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index ad45a8fea245..e50cb76c961b 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1556,10 +1556,12 @@ static void update_curr_dl_se(struct rq *rq, struct sched_dl_entity *dl_se, s64
>  		}
>  
>  		if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
> -			if (dl_server(dl_se))
> -				enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
> -			else
> +			if (dl_server(dl_se)) {
> +				replenish_dl_new_period(dl_se, rq);
> +				start_dl_timer(dl_se);

But, even today, enqueue_dl_entity() is called with ENQUEUE_REPLENISH
flag, so I don't get why you say 're-enqueues the DL entity without
properly replenishing'.

Also, why restarting the replenishing timer right after having
replenished the entity?

Thanks,
Juri




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-16 15:03 ` Juri Lelli
@ 2025-06-18 14:20   ` Kuyo Chang
  2025-06-19 13:13     ` Juri Lelli
  0 siblings, 1 reply; 13+ messages in thread
From: Kuyo Chang @ 2025-06-18 14:20 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Matthias Brugger, AngeloGioacchino Del Regno, jstultz,
	linux-kernel, linux-arm-kernel, linux-mediatek

On Mon, 2025-06-16 at 17:03 +0200, Juri Lelli wrote:
> 
> Hello,
> 
> > 
> > [Proposed Solution]:
> > ------------------
> > Instead of immediately re-enqueuing the DL entity on timer
> > registration
> > failure, this change ensures the DL entity is properly replenished
> > and
> > the timer is restarted, preventing RT potential starvation.
> > 
> > Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> > ---
> >  kernel/sched/deadline.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index ad45a8fea245..e50cb76c961b 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1556,10 +1556,12 @@ static void update_curr_dl_se(struct rq
> > *rq, struct sched_dl_entity *dl_se, s64
> >               }
> > 
> >               if (unlikely(is_dl_boosted(dl_se) ||
> > !start_dl_timer(dl_se))) {
> > -                     if (dl_server(dl_se))
> > -                             enqueue_dl_entity(dl_se,
> > ENQUEUE_REPLENISH);
> > -                     else
> > +                     if (dl_server(dl_se)) {
> > +                             replenish_dl_new_period(dl_se, rq);
> > +                             start_dl_timer(dl_se);
> 
> But, even today, enqueue_dl_entity() is called with ENQUEUE_REPLENISH
> flag, so I don't get why you say 're-enqueues the DL entity without
> properly replenishing'.
> 
> Also, why restarting the replenishing timer right after having
> replenished the entity?
> 

When dl_defer_running = 1 and the runtime has been exhausted, the
dl_server should stop at this point. However, if start_dl_timer()
fails, it indicates that it took unexpectedly long to consume the
runtime.

At this point, there are two options:

[as-is] 1. Re-enqueue the dl entity with ENQUEUE_REPLENISH, which clears
the throttled flag and puts the dl entity back on the runqueue, so the
fair_server keeps running:

enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
=> replenish_dl_entity
  => replenish_dl_new_period(dl_se, rq);
  => dl_se->dl_yielded = 0;
  => dl_se->dl_throttled = 0;
=> __enqueue_dl_entity(dl_se);

[to-be] 2. To avoid RT latency, keep the fair_server throttled while
replenishing the dl_se. Once replenishing is complete, the timer can be
started successfully. When the timer fires, the throttled state is
cleared, so RT tasks can execute during the interval in between.

How to handle a start_dl_timer() failure is a policy decision. In my
opinion, the second approach is better for real-time (RT) latency, as
RT tasks must be prioritized.
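
The difference between the two options can be sketched as a userspace
state model. Everything here is hypothetical (struct server_model,
as_is(), to_be()); it models only the lagged case in which the deadline
is reset from the current clock, and it is not the kernel
implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the server state relevant to the two options. */
struct server_model {
	uint64_t deadline;
	int64_t  runtime;
	uint64_t dl_runtime, dl_deadline, dl_period;
	bool     throttled;
	bool     queued;
	uint64_t timer_expires;	/* 0 means no timer armed */
};

/* Start a fresh period from 'now' (models the lagged replenish case). */
static void new_period(struct server_model *s, uint64_t now)
{
	s->deadline = now + s->dl_deadline;
	s->runtime = (int64_t)s->dl_runtime;
}

/* Option 1 (as-is): replenish, clear throttled, put the server back. */
static void as_is(struct server_model *s, uint64_t now)
{
	new_period(s, now);
	s->throttled = false;
	s->queued = true;
}

/* Option 2 (to-be): replenish, stay throttled, arm the timer instead. */
static void to_be(struct server_model *s, uint64_t now)
{
	new_period(s, now);
	s->timer_expires = s->deadline - s->dl_deadline + s->dl_period;
	/* throttled and queued are deliberately left untouched */
}
```

In this model the as-is path leaves the server runnable right away,
while the to-be path keeps it off the runqueue until one dl_period after
'now', which is the window in which RT tasks could run.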

> Thanks,
> Juri
> 


* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-18 14:20   ` Kuyo Chang
@ 2025-06-19 13:13     ` Juri Lelli
  2025-06-20  3:00       ` Kuyo Chang
  0 siblings, 1 reply; 13+ messages in thread
From: Juri Lelli @ 2025-06-19 13:13 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Matthias Brugger, AngeloGioacchino Del Regno, jstultz,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 18/06/25 22:20, Kuyo Chang wrote:

...

> When dl_defer_running = 1 and the running time has been exhausted, 
> it means that the dl_server should stop at this point.
> However, if start_dl_timer() returns a failure, it indicates that the
> actual time spent consuming the running time was unexpectedly long. 
>  
> At this point, there are two options:
> [as-is] 1. re-enqueuing the dl entity with ENQUEUE_REPLENISH will clear
> the throttled flag 
> and re-enqueue the dl entity to keep the fair_server running. 
> enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
> => replenish_dl_entity
>   => replenish_dl_new_period(dl_se, rq);
>   => dl_se->dl_yielded = 0;
>   => dl_se->dl_throttled = 0;
> => __enqueue_dl_entity(dl_se);
> 
> [to-be] 2. To avoid RT latency, the fair_server should remain throttled
> while replenishing the dl_se. 
> Once replenishing is complete, we can ensure that a timer is
> successfully started. 
> When the timer is triggered, the throttled state will be cleared,
> ensuring that RT tasks can execute during this interval.
>  
> It is a policy decision for dealing with the case of failure in
> start_dl_timer().
> The second approach is better for real-time (RT) latency in my opinion,
> as RT tasks must be prioritized.

OK, I think I see your points, but I am still not sure I fully
understand the link with the issue you describe in the changelog - the
relation with "DL replenish lagged too much", that is.

Could you please expand on the details of the situation that is opening
up for the issue your patch is addressing? Do you know why we hit the
corner case that causes the warning in the first place?

I would like to understand exactly what we are trying to fix before
deciding how to fix it, sorry if I am being dense. :-)

Thanks,
Juri




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-19 13:13     ` Juri Lelli
@ 2025-06-20  3:00       ` Kuyo Chang
  2025-06-20 15:22         ` Juri Lelli
  0 siblings, 1 reply; 13+ messages in thread
From: Kuyo Chang @ 2025-06-20  3:00 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Matthias Brugger, AngeloGioacchino Del Regno, jstultz,
	linux-kernel, linux-arm-kernel, linux-mediatek

On Thu, 2025-06-19 at 15:13 +0200, Juri Lelli wrote:
> 
> On 18/06/25 22:20, Kuyo Chang wrote:
> 
> ...
> 
> > When dl_defer_running = 1 and the running time has been exhausted,
> > it means that the dl_server should stop at this point.
> > However, if start_dl_timer() returns a failure, it indicates that
> > the
> > actual time spent consuming the running time was unexpectedly long.
> > 
> > At this point, there are two options:
> > [as-is] 1. re-enqueuing the dl entity with ENQUEUE_REPLENISH will
> > clear
> > the throttled flag
> > and re-enqueue the dl entity to keep the fair_server running.
> > enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);
> > => replenish_dl_entity
> >   => replenish_dl_new_period(dl_se, rq);
> >   => dl_se->dl_yielded = 0;
> >   => dl_se->dl_throttled = 0;
> > => __enqueue_dl_entity(dl_se);
> > 
> > [to-be] 2. To avoid RT latency, the fair_server should remain
> > throttled
> > while replenishing the dl_se.
> > Once replenishing is complete, we can ensure that a timer is
> > successfully started.
> > When the timer is triggered, the throttled state will be cleared,
> > ensuring that RT tasks can execute during this interval.
> > 
> > It is a policy decision for dealing with the case of failure in
> > start_dl_timer().
> > The second approach is better for real-time (RT) latency in my
> > opinion,
> > as RT tasks must be prioritized.
> 
> OK, I think I see your points, but I am still not sure I fully
> understand the link with the issue you describe in the changelog -
> the
> relation with "DL replenish lagged too much", that is.
> 
> Could you please expand on the details of the situation that is
> opening
> up for the issue your patch is addressing? Do you know why we hit the
> corner case that causes the warning in the first place?
> 

"DL replenish lagged too much" means the fair_server took much longer
than expected to use up its runtime, so the deadline fell far behind
the clock (which is also why start_dl_timer() failed).
In this situation, replenishing by just one dl_period isn't enough to
catch up.

One corner case is when there are too many IRQs or IPIs in the system.
In this case, runtime is consumed very slowly, and the fair_server
keeps running without being throttled.
Even when the runtime is finally exhausted, the fair_server is
restarted immediately.
In the end, IRQs, IPIs, and fair tasks can take over the whole system,
leaving no chance for RT tasks to run.
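
A rough back-of-the-envelope model of the slow consumption, assuming
that time spent in IRQ context is not charged against the server's
runtime. wall_ns_to_exhaust() is a hypothetical helper and the constant
IRQ share is an idealization, not something measured on the platform.

```c
#include <stdint.h>

/*
 * Wall-clock time (ns) needed to drain 'runtime_ns' of budget when
 * 'irq_pct' percent of wall time is spent in IRQs and not charged.
 * Returns UINT64_MAX when the budget would never drain.
 */
static uint64_t wall_ns_to_exhaust(uint64_t runtime_ns, unsigned int irq_pct)
{
	if (irq_pct >= 100)
		return UINT64_MAX;
	return runtime_ns * 100 / (100 - irq_pct);
}
```

With the 50 ms runtime and 1 s period seen in the dump, this model says
that at a 95% IRQ share it takes a full period of wall time to exhaust
the budget, and at any higher share the runtime would run out only after
the deadline has already passed.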

> I would like to understand exactly what we are trying to fix before
> deciding how to fix it, sorry if I am being dense. :-)
> 
> Thanks,
> Juri
> 




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-20  3:00       ` Kuyo Chang
@ 2025-06-20 15:22         ` Juri Lelli
  2025-06-21  2:55           ` Kuyo Chang
  0 siblings, 1 reply; 13+ messages in thread
From: Juri Lelli @ 2025-06-20 15:22 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Matthias Brugger, AngeloGioacchino Del Regno, jstultz,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 20/06/25 11:00, Kuyo Chang wrote:

...

> "DL replenish lagged too much" means the fair_server took much longer
> than expected to use up its running time,
> so the deadline fell way behind the clock (which is also why
> start_dl_timer() failed). 
> In this situation, just replenishing one dl_period isn’t enough to
> catch up.
>  
> A corner case is when there are too many IRQs or IPIs in the system.
> In this case, runtime gets consumed very slowly, and the fair_server
> keep running without being throttled.
> Even the runtime is exhausted finally, the fair_server would be
> restarted immediately.
> In the end, IRQs, IPIs, and fair tasks can take over the whole system,
> no chance for RT tasks to run.

Thanks for the additional explanation.

The way I understand it now is the following (of course please correct
me if I am still not getting it :)

- a dl_server is actively servicing NORMAL tasks, but suffers a lot of
  IRQ load and cannot make much progress
- it does make progress anyway, but it reaches update_curr_dl_se@throttle
  only after rq_clock has already passed its current deadline
- dl_runtime_exceeded() branch is entered, but start_dl_timer() fails as
  the computed act is still in the past
- enqueue_dl_entity(REPLENISH) calls replenish_dl_entity(), which tries
  to add runtime and advance the deadline, but time has moved on so far
  that the deadline is still behind rq_clock() and so "DL replenish ..."
  is printed
- replenish_dl_new_period() updates runtime and deadline from the current
  clock and the dl-server is put back to run (so it continues to run
  over/starve FIFO tasks)
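
The replenish steps in the list above can be sketched in userspace as
follows. struct dl_model and model_replenish() are hypothetical names;
the sketch assumes dl_runtime > 0 and ignores priority inheritance and
the deferred-server flags, so it is an approximation of the flow rather
than the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the scheduling entity. */
struct dl_model {
	uint64_t deadline;	/* absolute deadline, ns */
	int64_t  runtime;	/* remaining budget, ns (<= 0 when exhausted) */
	uint64_t dl_runtime, dl_deadline, dl_period;
};

/*
 * Returns true when the "lagged too much" case is hit, i.e. the
 * deadline is still behind 'now' after the catch-up loop.
 * Assumes dl_runtime > 0, otherwise the loop would not terminate.
 */
static bool model_replenish(struct dl_model *se, uint64_t now)
{
	/* Catch up: one period and one budget per iteration. */
	while (se->runtime <= 0) {
		se->deadline += se->dl_period;
		se->runtime += (int64_t)se->dl_runtime;
	}

	/* Still behind the clock: restart from 'now' with a full budget. */
	if ((int64_t)(se->deadline - now) < 0) {
		se->deadline = now + se->dl_deadline;
		se->runtime = (int64_t)se->dl_runtime;
		return true;
	}
	return false;
}
```

Note that the loop stops as soon as the runtime is positive again, so a
small overrun advances the deadline by only one period; if the clock is
several periods ahead, the lag branch is what resets the deadline.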

It looks like your proposed fix might work in this particular corner
case, but I am not 100% comfortable with not trying to replenish
properly (catch up with runtime) at all. I wonder if we might then start
missing some other corner case. Maybe we could try to catch this
particular corner case before even attempting to start the dl_timer,
since we know it will fail, and do something at that point?

Thanks,
Juri




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-20 15:22         ` Juri Lelli
@ 2025-06-21  2:55           ` Kuyo Chang
  2025-07-23 22:22             ` Pierce Wen (溫彥翔)
  0 siblings, 1 reply; 13+ messages in thread
From: Kuyo Chang @ 2025-06-21  2:55 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Matthias Brugger, AngeloGioacchino Del Regno, jstultz,
	linux-kernel, linux-arm-kernel, linux-mediatek

On Fri, 2025-06-20 at 17:22 +0200, Juri Lelli wrote:
> 
> On 20/06/25 11:00, Kuyo Chang wrote:
> 
> ...
> 
> > 
> 
> Thanks for the additional explanation.
> 
> The way I understand it now is the following (of course please
> correct
> me if I am still not getting it :)
> 
> - a dl_server is actively servicing NORMAL tasks, but suffers lot of
> IRQ
>   load and cannot make much progress
> - it does anyway make progress, but it reaches
> update_curr_dl_se@throttle
>   only when its current deadline is past rq_clock
> - dl_runtime_exceeded() branch is entered, but start_dl_timer() fails
> as
>   the computed act is still in the past
> - enqueue_dl_entity(REPLENISH) call replenish_dl_entity() which tries
> to
>   add runtime and advance the deadline, but time moved on so far that
>   deadline is still behind rq_clock() and so "DL replenish ..." is
>   printed
> - replenish_dl_new_period() updates runtime and deadline from current
>   clock and the dl-server is put back to run (so it continues to run
>   over/starve FIFO tasks)
> 

Yes, "DL replenish ..." is the critical clue for identifying the root
cause of this issue.

> It looks like your proposed fix might work in this particular corner
> case, but I am not 100% comfortable with not trying to replenish
> properly (catch up with runtime) at all. I wonder if we might then
> start
> missing some other corner case. Maybe we could try to catch this
> particular corner case before even attempting to start the dl_timer,
> since we know it will fail, and do something at that point?
> 

You can consider the patch more as an error-proofing mechanism, and so
far, it has been working well on our platform.
However, it might be better to catch this particular corner case in
advance to prevent the issue.
> Thanks,
> Juri
> 




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-21  2:55           ` Kuyo Chang
@ 2025-07-23 22:22             ` Pierce Wen (溫彥翔)
  2025-07-24  8:23               ` juri.lelli
  0 siblings, 1 reply; 13+ messages in thread
From: Pierce Wen (溫彥翔) @ 2025-07-23 22:22 UTC (permalink / raw)
  To: juri.lelli@redhat.com, Kuyo Chang (張建文)
  Cc: bsegall@google.com, vschneid@redhat.com, dietmar.eggemann@arm.com,
	peterz@infradead.org, rostedt@goodmis.org, mingo@redhat.com,
	vincent.guittot@linaro.org, mgorman@suse.de, jstultz@google.com,
	matthias.bgg@gmail.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org,
	AngeloGioacchino Del Regno

On Sat, 2025-06-21 at 10:55 +0800, Kuyo Chang wrote:
> On Fri, 2025-06-20 at 17:22 +0200, Juri Lelli wrote:
> > 
> > On 20/06/25 11:00, Kuyo Chang wrote:
> > 
> > ...
> > 
> > > 
> > 
> > Thanks for the additional explanation.
> > 
> > The way I understand it now is the following (of course please
> > correct
> > me if I am still not getting it :)
> > 
> > - a dl_server is actively servicing NORMAL tasks, but suffers lot
> > of
> > IRQ
> >   load and cannot make much progress
> > - it does anyway make progress, but it reaches
> > update_curr_dl_se@throttle
> >   only when its current deadline is past rq_clock
> > - dl_runtime_exceeded() branch is entered, but start_dl_timer()
> > fails
> > as
> >   the computed act is still in the past
> > - enqueue_dl_entity(REPLENISH) call replenish_dl_entity() which
> > tries
> > to
> >   add runtime and advance the deadline, but time moved on so far
> > that
> >   deadline is still behind rq_clock() and so "DL replenish ..." is
> >   printed
> > - replenish_dl_new_period() updates runtime and deadline from
> > current
> >   clock and the dl-server is put back to run (so it continues to
> > run
> >   over/starve FIFO tasks)
> > 
> 
> Yes, "DL replenish ..." is the critical clue for identifying the root
> cause of this issue.
> 
> > It looks like your proposed fix might work in this particular
> > corner
> > case, but I am not 100% comfortable with not trying to replenish
> > properly (catch up with runtime) at all. I wonder if we might then
> > start
> > missing some other corner case. Maybe we could try to catch this
> > particular corner case before even attempting to start the
> > dl_timer,
> > since we know it will fail, and do something at that point?
> > 
> 
> You can consider the patch more as an error-proofing mechanism, and
> so
> far, it has been working well on our platform.
> However, it might be better to catch this particular corner case in
> advance to prevent the issue.
> > Thanks,
> > Juri
> > 
> 

Hi all,

I wanted to follow up on the discussion regarding the potential RT task
starvation issue and check if there have been any further updates or
feedback.

To recap and provide some additional context:

1. As discussed in the thread (see
https://lore.kernel.org/all/CANDhNCqYCpdhYS9afdKeY34Bmw8MXyqKWCSTxOZNLTjYrUaVXg@mail.gmail.com/
), it has been demonstrated that the use of a scaled timer can indeed
induce RT starvation under certain conditions.

2. Furthermore, since the delta_exec time calculation relies on the
clock_task member of struct rq, which is affected by IRQ time on the
runqueue, there is a risk that if IRQ time becomes excessively long in
some corner cases, it could also lead to RT starvation.

3. Based on these observations, we strongly recommend adopting a
recovery patch to address these critical scenarios and prevent RT task
starvation, especially in cases where the current logic may not be
sufficient.

Best regards,  
Pierce.


* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-07-23 22:22             ` Pierce Wen (溫彥翔)
@ 2025-07-24  8:23               ` juri.lelli
  0 siblings, 0 replies; 13+ messages in thread
From: juri.lelli @ 2025-07-24  8:23 UTC (permalink / raw)
  To: Pierce Wen (溫彥翔)
  Cc: Kuyo Chang (張建文), bsegall@google.com,
	vschneid@redhat.com, dietmar.eggemann@arm.com,
	peterz@infradead.org, rostedt@goodmis.org, mingo@redhat.com,
	vincent.guittot@linaro.org, mgorman@suse.de, jstultz@google.com,
	matthias.bgg@gmail.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org,
	AngeloGioacchino Del Regno

On 23/07/25 22:22, Pierce Wen (溫彥翔) wrote:

...

> Hi all,

Hello!

> I wanted to follow up on the discussion regarding the potential RT task
> starvation issue and check if there have been any further updates or
> feedback.
> 
> To recap and provide some additional context:
> 
> 1. As discussed in the thread (see
> https://lore.kernel.org/all/CANDhNCqYCpdhYS9afdKeY34Bmw8MXyqKWCSTxOZNLTjYrUaVXg@mail.gmail.com/
> ), it has been demonstrated that the use of a scaled timer can indeed
> induce RT starvation under certain conditions.

This first case has been handled by fc975cfb36393 ("sched/deadline: Fix
dl_server runtime calculation formula") on tip/master.

> 2. Furthermore, since the delta_exec time calculation relies on the
> clock_task member of struct rq, which is affected by IRQ time on the
> runqueue, there is a risk that if IRQ time becomes excessively long in
> some corner cases, it could also lead to RT starvation.
> 
> 3. Based on these observations, we strongly recommend adopting a
> recovery patch to address these critical scenarios and prevent RT task
> starvation, especially in cases where the current logic may not be
> sufficient.

As I was saying, I am not against the patch proposed to address the
starvation issue discussed in this thread. Maybe the patch can be
reposted with the addition of a comment on the code path related to
dl-server explaining why the special case is needed?

Thanks,
Juri




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-06-15 13:10 [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed Kuyo Chang
  2025-06-16 15:03 ` Juri Lelli
@ 2025-07-30 10:06 ` Geert Uytterhoeven
  2025-07-31 15:00   ` Christian Loehle
                     ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2025-07-30 10:06 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Matthias Brugger, AngeloGioacchino Del Regno,
	jstultz, linux-kernel, linux-arm-kernel, linux-mediatek

Hi Kuyo,

On Mon, 16 Jun 2025 at 14:39, Kuyo Chang <kuyo.chang@mediatek.com> wrote:
> From: kuyo chang <kuyo.chang@mediatek.com>
>
> [Symptom]
> The fair server mechanism, which is intended to prevent fair starvation
> when higher-priority tasks monopolize the CPU.
> Specifically, RT tasks on the runqueue may not be scheduled as expected.
>
> [Analysis]
> ---------
> The log "sched: DL replenish lagged too much" triggered.
>
> By memory dump of dl_server:
> --------------
>     curr = 0xFFFFFF80D6A0AC00 (
>       dl_server = 0xFFFFFF83CD5B1470(
>         dl_runtime = 0x02FAF080,
>         dl_deadline = 0x3B9ACA00,
>         dl_period = 0x3B9ACA00,
>         dl_bw = 0xCCCC,
>         dl_density = 0xCCCC,
>         runtime = 0x02FAF080,
>         deadline = 0x0000082031EB0E80,
>         flags = 0x0,
>         dl_throttled = 0x0,
>         dl_yielded = 0x0,
>         dl_non_contending = 0x0,
>         dl_overrun = 0x0,
>         dl_server = 0x1,
>         dl_server_active = 0x1,
>         dl_defer = 0x1,
>         dl_defer_armed = 0x0,
>         dl_defer_running = 0x1,
>         dl_timer = (
>           node = (
>             expires = 0x000008199756E700),
>           _softexpires = 0x000008199756E700,
>           function = 0xFFFFFFDB9AF44D30 = dl_task_timer,
>           base = 0xFFFFFF83CD5A12C0,
>           state = 0x0,
>           is_rel = 0x0,
>           is_soft = 0x0,
>     clock_update_flags = 0x4,
>     clock = 0x000008204A496900,
>
> - The timer expiration time (rq->curr->dl_server->dl_timer->expires)
>   is already in the past, indicating the timer has expired.
> - The timer state (rq->curr->dl_server->dl_timer->state) is 0.
>
> [Suspected Root Cause]
> --------------------
> The relevant code flow in the throttle path of
> update_curr_dl_se() as follows:
>
> dequeue_dl_entity(dl_se, 0);                // the DL entity is dequeued
>
> if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
>     if (dl_server(dl_se))                   // timer registration fails
>         enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);//enqueue immediately
>     ...
> }
>
> The failure of `start_dl_timer` is caused by attempting to register a
> timer with an expiration time that is already in the past. When this
> situation persists, the code repeatedly re-enqueues the DL entity
> without properly replenishing or restarting the timer, so RT tasks
> may not be scheduled as expected.
>
> [Proposed Solution]:
> ------------------
> Instead of immediately re-enqueuing the DL entity on timer registration
> failure, this change ensures the DL entity is properly replenished and
> the timer is restarted, preventing potential RT task starvation.
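
The behaviour described above can be modelled in user space. The sketch below
is a toy model, not the kernel code and not the patch itself: toy_dl_entity,
toy_start_timer(), toy_replenish() and toy_throttle() are hypothetical names,
and it assumes that the timer is armed at the entity's deadline and that
arming fails whenever that instant is not in the future.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a deadline entity; all times are in ns. */
struct toy_dl_entity {
    uint64_t deadline; /* absolute deadline */
    uint64_t period;   /* replenishment period */
    uint64_t budget;   /* full runtime granted per period */
    uint64_t runtime;  /* runtime left in the current period */
};

/* Arming fails when the expiry is not strictly in the future. */
static bool toy_start_timer(const struct toy_dl_entity *se, uint64_t now)
{
    return se->deadline > now;
}

/* Refill the budget; if the deadline is stale, start a new period at `now`. */
static void toy_replenish(struct toy_dl_entity *se, uint64_t now)
{
    assert(se->period > 0);
    se->runtime = se->budget;
    if (se->deadline <= now)
        se->deadline = now + se->period;
}

/*
 * Throttle handling in the spirit of the proposal: if arming fails,
 * replenish first and then retry, instead of re-enqueuing right away.
 * Returns true when the timer ends up armed.
 */
static bool toy_throttle(struct toy_dl_entity *se, uint64_t now)
{
    if (toy_start_timer(se, now))
        return true;
    toy_replenish(se, now);
    return toy_start_timer(se, now);
}
```

With a deadline of 100, a period of 1000 and now = 500, toy_start_timer()
fails at first; toy_throttle() then moves the deadline to 1500, refills the
runtime and arms the timer, so the entity stays throttled until its next
period instead of being re-enqueued immediately.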
>
> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>

Thanks, this fixes the issue I was seeing!

Closes: https://lore.kernel.org/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-07-30 10:06 ` Geert Uytterhoeven
@ 2025-07-31 15:00   ` Christian Loehle
  2025-08-15  4:35   ` Jiri Slaby
  2025-08-20 12:57   ` Diederik de Haas
  2 siblings, 0 replies; 13+ messages in thread
From: Christian Loehle @ 2025-07-31 15:00 UTC (permalink / raw)
  To: Geert Uytterhoeven, Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Matthias Brugger, AngeloGioacchino Del Regno,
	jstultz, linux-kernel, linux-arm-kernel, linux-mediatek

On 7/30/25 11:06, Geert Uytterhoeven wrote:
> Hi Kuyo,
> 
> On Mon, 16 Jun 2025 at 14:39, Kuyo Chang <kuyo.chang@mediatek.com> wrote:
>> From: kuyo chang <kuyo.chang@mediatek.com>
>>
>> [Symptom]
>> The fair server mechanism is intended to prevent fair starvation
>> when higher-priority tasks monopolize the CPU.
>> However, RT tasks on the runqueue may not be scheduled as expected.
>>
>> [Analysis]
>> ---------
>> The log "sched: DL replenish lagged too much" was triggered.
>>
>> From a memory dump of dl_server:
>> --------------
>>     curr = 0xFFFFFF80D6A0AC00 (
>>       dl_server = 0xFFFFFF83CD5B1470(
>>         dl_runtime = 0x02FAF080,
>>         dl_deadline = 0x3B9ACA00,
>>         dl_period = 0x3B9ACA00,
>>         dl_bw = 0xCCCC,
>>         dl_density = 0xCCCC,
>>         runtime = 0x02FAF080,
>>         deadline = 0x0000082031EB0E80,
>>         flags = 0x0,
>>         dl_throttled = 0x0,
>>         dl_yielded = 0x0,
>>         dl_non_contending = 0x0,
>>         dl_overrun = 0x0,
>>         dl_server = 0x1,
>>         dl_server_active = 0x1,
>>         dl_defer = 0x1,
>>         dl_defer_armed = 0x0,
>>         dl_defer_running = 0x1,
>>         dl_timer = (
>>           node = (
>>             expires = 0x000008199756E700),
>>           _softexpires = 0x000008199756E700,
>>           function = 0xFFFFFFDB9AF44D30 = dl_task_timer,
>>           base = 0xFFFFFF83CD5A12C0,
>>           state = 0x0,
>>           is_rel = 0x0,
>>           is_soft = 0x0,
>>     clock_update_flags = 0x4,
>>     clock = 0x000008204A496900,
>>
>> - The timer expiration time (rq->curr->dl_server->dl_timer->expires)
>>   is already in the past, indicating the timer has expired.
>> - The timer state (rq->curr->dl_server->dl_timer->state) is 0.
>>
>> [Suspected Root Cause]
>> --------------------
>> The relevant code flow in the throttle path of
>> update_curr_dl_se() is as follows:
>>
>> dequeue_dl_entity(dl_se, 0);                // the DL entity is dequeued
>>
>> if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) {
>>     if (dl_server(dl_se))                   // timer registration fails
>>         enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);//enqueue immediately
>>     ...
>> }
>>
>> The failure of `start_dl_timer` is caused by attempting to register a
>> timer with an expiration time that is already in the past. When this
>> situation persists, the code repeatedly re-enqueues the DL entity
>> without properly replenishing or restarting the timer, so RT tasks
>> may not be scheduled as expected.
>>
>> [Proposed Solution]:
>> ------------------
>> Instead of immediately re-enqueuing the DL entity on timer registration
>> failure, this change ensures the DL entity is properly replenished and
>> the timer is restarted, preventing potential RT task starvation.
>>
>> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> 
> Thanks, this fixes the issue I was seeing!
> 
> Closes: https://lore.kernel.org/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com
> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
> 

FWIW the reported issue is also present on an arm64 rk3399 and
$SUBJECT fixes that.




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-07-30 10:06 ` Geert Uytterhoeven
  2025-07-31 15:00   ` Christian Loehle
@ 2025-08-15  4:35   ` Jiri Slaby
  2025-08-20 12:57   ` Diederik de Haas
  2 siblings, 0 replies; 13+ messages in thread
From: Jiri Slaby @ 2025-08-15  4:35 UTC (permalink / raw)
  To: Geert Uytterhoeven, Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Matthias Brugger, AngeloGioacchino Del Regno,
	jstultz, linux-kernel, linux-arm-kernel, linux-mediatek

On 30. 07. 25, 12:06, Geert Uytterhoeven wrote:
>> [Proposed Solution]:
>> ------------------
>> Instead of immediately re-enqueuing the DL entity on timer registration
>> failure, this change ensures the DL entity is properly replenished and
>> the timer is restarted, preventing potential RT task starvation.
>>
>> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> 
> Thanks, this fixes the issue I was seeing!
> 
> Closes: https://lore.kernel.org/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com
> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

As well as:
Closes: https://lore.kernel.org/all/58c46200-95b0-4cd8-bb5e-44f963a66875@kernel.org/
Tested-by: Jiri Slaby <jirislaby@kernel.org>

thanks,
-- 
js
suse labs




* Re: [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed
  2025-07-30 10:06 ` Geert Uytterhoeven
  2025-07-31 15:00   ` Christian Loehle
  2025-08-15  4:35   ` Jiri Slaby
@ 2025-08-20 12:57   ` Diederik de Haas
  2 siblings, 0 replies; 13+ messages in thread
From: Diederik de Haas @ 2025-08-20 12:57 UTC (permalink / raw)
  To: Geert Uytterhoeven, Kuyo Chang
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Matthias Brugger, AngeloGioacchino Del Regno,
	jstultz, linux-kernel, linux-arm-kernel, linux-mediatek,
	linux-rockchip

On Wed Jul 30, 2025 at 12:06 PM CEST, Geert Uytterhoeven wrote:
> On Mon, 16 Jun 2025 at 14:39, Kuyo Chang <kuyo.chang@mediatek.com> wrote:
>> From: kuyo chang <kuyo.chang@mediatek.com>
>>
>> [Analysis]
>> ---------
>> The log "sched: DL replenish lagged too much" was triggered.
>>
>> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
>
> Thanks, this fixes the issue I was seeing!
>
> Closes: https://lore.kernel.org/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com
> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

I was seeing this warning as well on several of my Rockchip based
devices, but that is gone after applying this patch. Thanks!

Tested-by: Diederik de Haas <didi.debian@cknow.org>


Thread overview: 13+ messages
2025-06-15 13:10 [RFC PATCH 1/1] sched/deadline: Fix RT task potential starvation when expiry time passed Kuyo Chang
2025-06-16 15:03 ` Juri Lelli
2025-06-18 14:20   ` Kuyo Chang
2025-06-19 13:13     ` Juri Lelli
2025-06-20  3:00       ` Kuyo Chang
2025-06-20 15:22         ` Juri Lelli
2025-06-21  2:55           ` Kuyo Chang
2025-07-23 22:22             ` Pierce Wen (溫彥翔)
2025-07-24  8:23               ` juri.lelli
2025-07-30 10:06 ` Geert Uytterhoeven
2025-07-31 15:00   ` Christian Loehle
2025-08-15  4:35   ` Jiri Slaby
2025-08-20 12:57   ` Diederik de Haas