From: Ankur Arora <ankur.a.arora@oracle.com>
To: Tianchen Ding <dtcccc@linux.alibaba.com>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
	tglx@linutronix.de, peterz@infradead.org,
	torvalds@linux-foundation.org, paulmck@kernel.org,
	rostedt@goodmis.org, mark.rutland@arm.com, juri.lelli@redhat.com,
	joel@joelfernandes.org, raghavendra.kt@amd.com,
	sshegde@linux.ibm.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com, Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 29/35] sched: handle preempt=voluntary under PREEMPT_AUTO
Date: Mon, 24 Jun 2024 18:12:50 -0700
Message-ID: <87ikxxakq5.fsf@oracle.com>
In-Reply-To: <dd6c29cf-3ca0-4aa3-8cfe-e85a35e300e4@linux.alibaba.com>


Tianchen Ding <dtcccc@linux.alibaba.com> writes:

> On 2024/6/22 02:58, Ankur Arora wrote:
>> Tianchen Ding <dtcccc@linux.alibaba.com> writes:
>>
>>> On 2024/5/28 08:35, Ankur Arora wrote:
>>>> The default preemption policy for voluntary preemption under
>>>> PREEMPT_AUTO is to schedule eagerly for tasks of higher scheduling
>>>> class, and lazily for well-behaved, non-idle tasks.
>>>> This is the same policy as preempt=none, with an eager handling of
>>>> higher priority scheduling classes.
>>>> A cyclictest workload run against a background kernel load of
>>>> 'stress-ng --mmap' shows that both the average and the maximum
>>>> latencies improve:
>>>>    # stress-ng --mmap 0 &
>>>>    # cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 -q -D 300
>>>>                                        Min     (  %stdev )    Act     (  %stdev )   Avg     (  %stdev )   Max      (  %stdev )
>>>>     PREEMPT_AUTO, preempt=voluntary    1.73  ( +-  25.43% )   62.16 ( +- 303.39% )  14.92 ( +-  17.96% )  2778.22 ( +-  15.04% )
>>>>     PREEMPT_DYNAMIC, preempt=voluntary 1.83  ( +-  20.76% )  253.45 ( +- 233.21% )  18.70 ( +-  15.88% )  2992.45 ( +-  15.95% )
>>>> The table above shows the aggregated latencies across all CPUs.
>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Cc: Juri Lelli <juri.lelli@redhat.com>
>>>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>>>> Originally-by: Thomas Gleixner <tglx@linutronix.de>
>>>> Link: https://lore.kernel.org/lkml/87jzshhexi.ffs@tglx/
>>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>>> ---
>>>>    kernel/sched/core.c  | 12 ++++++++----
>>>>    kernel/sched/sched.h |  6 ++++++
>>>>    2 files changed, 14 insertions(+), 4 deletions(-)
>>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>>> index c25cccc09b65..2bc3ae21a9d0 100644
>>>> --- a/kernel/sched/core.c
>>>> +++ b/kernel/sched/core.c
>>>> @@ -1052,6 +1052,9 @@ static resched_t resched_opt_translate(struct task_struct *curr,
>>>>    	if (preempt_model_preemptible())
>>>>    		return RESCHED_NOW;
>>>>
>>>> +	if (preempt_model_voluntary() && opt == RESCHED_PRIORITY)
>>>> +		return RESCHED_NOW;
>>>> +
>>>>    	if (is_idle_task(curr))
>>>>    		return RESCHED_NOW;
>>>>
>>>> @@ -2289,7 +2292,7 @@ void wakeup_preempt(struct rq *rq, struct task_struct *p, int flags)
>>>>    	if (p->sched_class == rq->curr->sched_class)
>>>>    		rq->curr->sched_class->wakeup_preempt(rq, p, flags);
>>>>    	else if (sched_class_above(p->sched_class, rq->curr->sched_class))
>>>> -		resched_curr(rq);
>>>> +		resched_curr_priority(rq);
>>>>
>>> Besides the conditions about higher class, can we do resched_curr_priority() in the same class?
>>> For example, in fair class, we can do it when SCHED_NORMAL vs SCHED_IDLE.
>> So, I agree about the specific case of SCHED_NORMAL vs SCHED_IDLE.
>> (And, that case is already handled by resched_opt_translate() explicitly
>> promoting idle tasks to TIF_NEED_RESCHED.)
>> But, on the general question of doing resched_curr_priority() in the
>> same class: I did consider it. But, it seemed to me that we want to
>> keep run-to-completion semantics for lazy scheduling, and so not
>> enforcing priority within a scheduling class was a good line to draw.
>>
>
> OK. On the general question, that was just a suggestion :-)
>
> Actually, my key point is about SCHED_IDLE. A SCHED_IDLE task is not a real
> idle task, but a normal task with the lowest priority, so is_idle_task() in
> resched_opt_translate() does not cover it. We should add
> task_has_idle_policy().
>
> However, even task_has_idle_policy() may not be enough, because of how the
> SCHED_IDLE policy works:
>   1. It has the lowest priority, but still belongs to fair_sched_class, the
>   same class as SCHED_NORMAL.
>   2. Not only tasks but also the *se of a cgroup* can be SCHED_IDLE
>   (introduced by commit 304000390f88d).

Thanks. That is useful to know. Let me see how best to incorporate this.
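
For illustration only (this is not part of the patch): a small userspace
model of the distinction raised above. The ordering of checks follows the
quoted hunk of resched_opt_translate(); everything else is an assumption
made so that the sketch compiles on its own. In particular, struct task,
the model and policy enums, RESCHED_LAZY, RESCHED_DEFAULT and the
check_idle_policy switch are hypothetical stand-ins, and the helper bodies
are simplified. It only models point 1 (a SCHED_IDLE task in the fair
class); it does not attempt to model SCHED_IDLE cgroup entities.

```c
#include <stdbool.h>

/* Userspace model; the names mirror the thread, the definitions are assumed. */
typedef enum { RESCHED_NOW, RESCHED_LAZY } resched_t;          /* RESCHED_LAZY: assumed name */
typedef enum { RESCHED_DEFAULT, RESCHED_PRIORITY } resched_opt_t; /* RESCHED_DEFAULT: assumed */

enum preempt_model { MODEL_NONE, MODEL_VOLUNTARY, MODEL_FULL }; /* hypothetical */
enum policy { POLICY_NORMAL, POLICY_IDLE };   /* stand-ins for SCHED_NORMAL / SCHED_IDLE */

struct task {
	bool is_idle_thread;   /* the per-CPU idle task: what is_idle_task() tests */
	enum policy policy;    /* scheduling policy of an ordinary task */
};

static bool is_idle_task(const struct task *t)
{
	return t->is_idle_thread;
}

static bool task_has_idle_policy(const struct task *t)
{
	return t->policy == POLICY_IDLE;
}

/*
 * Decide whether the current task is rescheduled eagerly or lazily.
 * check_idle_policy toggles the hypothetical extra test suggested in
 * the review; with it off, the logic matches the quoted hunk.
 */
static resched_t resched_opt_translate(const struct task *curr,
				       resched_opt_t opt,
				       enum preempt_model model,
				       bool check_idle_policy)
{
	if (model == MODEL_FULL)
		return RESCHED_NOW;

	/* preempt=voluntary: eager only for a higher scheduling class */
	if (model == MODEL_VOLUNTARY && opt == RESCHED_PRIORITY)
		return RESCHED_NOW;

	if (is_idle_task(curr))
		return RESCHED_NOW;

	/* Hypothetical: also treat a SCHED_IDLE current task as preemptible */
	if (check_idle_policy && task_has_idle_policy(curr))
		return RESCHED_NOW;

	return RESCHED_LAZY;
}
```

In this model, a current task with POLICY_IDLE that is not the idle thread
is rescheduled lazily unless the extra check is enabled, which is the gap
described in the review.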

Side question: are there any benchmarks that would exercise various types
of sched policy, idle and otherwise?

--
ankur
