Re: [PATCH] sched/fair: Only increment deadline once on yield

From: Alexander Graf @ 2025-04-13 18:38 UTC
  To: Fernand Sieber, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	linux-kernel, nh-open-source, kvm


On 01.04.25 14:36, Fernand Sieber wrote:
> If a task yields, the scheduler may decide to pick it again. The task in
> turn may decide to yield immediately or shortly after, leading to a tight
> loop of yields.
>
> If there's another runnable task at this point, the deadline will be
> increased by the slice at each loop. This can cause the deadline to run away
> pretty quickly, leading to elevated run delays later on as the task
> doesn't get picked again. The reason the scheduler can pick the same task
> again and again despite its deadline increasing is because it may be the
> only eligible task at that point.
>
> Fix this by updating the deadline only to one slice ahead.
>
> Note, we might want to consider iterating on the implementation of yield as
> follow up:
> * the yielding task could be forfeiting its remaining slice by
>    incrementing its vruntime correspondingly
> * in case of yield_to the yielding task could be donating its remaining
>    slice to the target task
>
> Signed-off-by: Fernand Sieber <sieberf@amazon.com>


IMHO it's worth noting that this is not a theoretical issue. We have 
seen this in real life: A KVM virtual machine's vCPU which runs into a 
busy guest spin lock calls kvm_vcpu_yield_to() which eventually ends up 
in the yield_task_fair() function. We have seen such spin locks due to 
guest contention rather than host overcommit, which means we go into a 
loop of vCPU execution and spin loop exit, which results in an 
undesirable increase in the vCPU thread's deadline.
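
To make the effect concrete, here is a minimal user-space sketch of the
arithmetic. It is not the kernel code and not the patch itself: the
toy_entity struct and the toy_* helpers are hypothetical names, and
toy_yield_once() encodes one assumed reading of "one slice ahead", namely
that the deadline is placed one slice past the current vruntime.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy entity; NOT the kernel's struct sched_entity. */
struct toy_entity {
	uint64_t vruntime;	/* virtual runtime consumed so far */
	uint64_t deadline;	/* virtual deadline */
	uint64_t slice;		/* requested slice, same units as vruntime */
};

/* Behaviour described in the report: every yield adds one more slice. */
static void toy_yield_accumulate(struct toy_entity *se)
{
	se->deadline += se->slice;
}

/* Assumed reading of the fix: deadline sits one slice past vruntime. */
static void toy_yield_once(struct toy_entity *se)
{
	se->deadline = se->vruntime + se->slice;
}

/*
 * Repeat "run for `ran` units, then yield" `yields` times and return how
 * far the deadline ends up ahead of vruntime. Assumes ran <= slice, i.e.
 * the task yields before it has used up its slice.
 */
uint64_t toy_deadline_gap(unsigned int yields, uint64_t ran,
			  uint64_t slice, int accumulate)
{
	struct toy_entity se = {
		.vruntime = 0,
		.deadline = slice,
		.slice = slice,
	};

	for (unsigned int i = 0; i < yields; i++) {
		se.vruntime += ran;	/* short burst of execution */
		if (accumulate)
			toy_yield_accumulate(&se);
		else
			toy_yield_once(&se);
		assert(se.deadline >= se.vruntime);
	}
	return se.deadline - se.vruntime;
}
```

With a slice of 3000000 units and 1000 yields after 1000 units of runtime
each, toy_deadline_gap(1000, 1000, 3000000, 1) returns 3002000000, while
the same call with accumulate set to 0 returns 3000000, i.e. exactly one
slice.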

Given this impacts real workloads and is a bug present since the 
introduction of EEVDF, I would say it warrants a

Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling 
policy")

tag.


Alex




* [PATCH] Re: [PATCH] sched/fair: Only increment deadline once on yield
From: Wang Tao @ 2025-07-22 11:46 UTC
  To: graf
  Cc: kvm, linux-kernel, mingo, nh-open-source, peterz, sieberf,
	vincent.guittot, tanghui20

>> On 01/04/25 18:06, Fernand Sieber wrote:
>> If a task yields, the scheduler may decide to pick it again. The task in
>> turn may decide to yield immediately or shortly after, leading to a tight
>> loop of yields.
>>
>> If there's another runnable task at this point, the deadline will be
>> increased by the slice at each loop. This can cause the deadline to run away
>> pretty quickly, leading to elevated run delays later on as the task
>> doesn't get picked again. The reason the scheduler can pick the same task
>> again and again despite its deadline increasing is because it may be the
>> only eligible task at that point.
>>
>> Fix this by updating the deadline only to one slice ahead.
>>
>> Note, we might want to consider iterating on the implementation of yield as
>> follow up:
>> * the yielding task could be forfeiting its remaining slice by
>>    incrementing its vruntime correspondingly
>> * in case of yield_to the yielding task could be donating its remaining
>>    slice to the target task
>>
>> Signed-off-by: Fernand Sieber <sieberf@amazon.com>


>IMHO it's worth noting that this is not a theoretical issue. We have 
>seen this in real life: A KVM virtual machine's vCPU which runs into a 
>busy guest spin lock calls kvm_vcpu_yield_to() which eventually ends up 
>in the yield_task_fair() function. We have seen such spin locks due to 
>guest contention rather than host overcommit, which means we go into a 
>loop of vCPU execution and spin loop exit, which results in an 
>undesirable increase in the vCPU thread's deadline.

>Given this impacts real workloads and is a bug present since the 
>introduction of EEVDF, I would say it warrants a

>Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling 
>policy")

>tag.


>Alex

As Alex described, we hit the same issue in the following test scenario:
start qemu, bind its cores to a cpuset group with cpuset.cpus=1-3, and run
"taskset -c 1-3 ./stress-ng -c 20" for stress testing in qemu. qemu then
freezes and a soft lockup is reported in qemu. After applying this patch,
the problem was resolved.

Are there plans to merge this patch into mainline?
