Re: [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers

All of lore.kernel.org
 help / color / mirror / Atom feed

From: K Prateek Nayak <kprateek.nayak@amd.com>
To: Wanpeng Li <kernellwp@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	<linux-kernel@vger.kernel.org>, <kvm@vger.kernel.org>,
	Wanpeng Li <wanpengli@tencent.com>
Subject: Re: [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers
Date: Wed, 12 Nov 2025 12:10:15 +0530	[thread overview]
Message-ID: <015bfa4d-d89c-4d4e-be06-d6e46aec28cb@amd.com> (raw)
In-Reply-To: <20251110033232.12538-3-kernellwp@gmail.com>

Hello Wanpeng,

On 11/10/2025 9:02 AM, Wanpeng Li wrote:
> +/*
> + * High-frequency yield gating to reduce overhead on compute-intensive workloads.
> + * Returns true if the yield should be skipped due to frequency limits.
> + *
> + * Optimized: single threshold with READ_ONCE/WRITE_ONCE, refresh timestamp on every call.
> + */
> +static bool yield_deboost_rate_limit(struct rq *rq, u64 now_ns)
> +{
> +	u64 last = READ_ONCE(rq->yield_deboost_last_time_ns);
> +	bool limited = false;
> +
> +	if (last) {
> +		u64 delta = now_ns - last;
> +		limited = (delta <= 6000ULL * NSEC_PER_USEC);
> +	}
> +
> +	WRITE_ONCE(rq->yield_deboost_last_time_ns, now_ns);

We only look at local rq so READ_ONCE()/WRITE_ONCE() seems
unnecessary.

> +	return limited;
> +}
> +
> +/*
> + * Validate tasks and basic parameters for yield deboost operation.
> + * Performs comprehensive safety checks including feature enablement,
> + * NULL pointer validation, task state verification, and same-rq requirement.
> + * Returns false with appropriate debug logging if any validation fails,
> + * ensuring only safe and meaningful yield operations proceed.
> + */
> +static bool __maybe_unused yield_deboost_validate_tasks(struct rq *rq, struct task_struct *p_target,
> +					  struct task_struct **p_yielding_out,
> +					  struct sched_entity **se_y_out,
> +					  struct sched_entity **se_t_out)
> +{
> +	struct task_struct *p_yielding;
> +	struct sched_entity *se_y, *se_t;
> +	u64 now_ns;
> +
> +	if (!sysctl_sched_vcpu_debooster_enabled)
> +		return false;
> +
> +	if (!rq || !p_target)
> +		return false;
> +
> +	now_ns = rq->clock;

Brief look at Patch 5 suggests we are under the rq_lock so might
as well use the rq_clock(rq) helper. Also, you have to do a
update_rq_clock() since it isn't done until yield_task_fair().

> +
> +	if (yield_deboost_rate_limit(rq, now_ns))
> +		return false;
> +
> +	p_yielding = rq->curr;
> +	if (!p_yielding || p_yielding == p_target ||
> +	    p_target->sched_class != &fair_sched_class ||
> +	    p_yielding->sched_class != &fair_sched_class)
> +		return false;

yield_to() in syscall.c has already checked for the sched
class matching under double_rq_lock. That cannot change by the
time we are here.

> +
> +	se_y = &p_yielding->se;
> +	se_t = &p_target->se;
> +
> +	if (!se_t || !se_y || !se_t->on_rq || !se_y->on_rq)
> +		return false;
> +
> +	if (task_rq(p_yielding) != rq || task_rq(p_target) != rq)

yield_to() has already checked for this under double_rq_lock()
so this too should be unnecessary.

> +		return false;
> +
> +	*p_yielding_out = p_yielding;
> +	*se_y_out = se_y;
> +	*se_t_out = se_t;

Why do we need these pointers? Can't the caller simply do:

    if (!yield_deboost_validate_tasks(rq, target))
        return;

    p_yielding = rq->donor;
    se_y_out = &p_yielding->se;
    se_t = &target->se;

That reminds me - now that we have proxy execution, you need
to re-evaluate the usage of rq->curr (running context) vs
rq->donor (vruntime context) when looking at all this.

> +	return true;
> +}
> +
>  /*
>   * sched_yield() is very simple
>   */

-- 
Thanks and Regards,
Prateek

next prev parent reply	other threads:[~2025-11-12  6:40 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-10  3:32 [PATCH 00/10] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Wanpeng Li
2025-11-10  3:32 ` [PATCH 01/10] sched: Add vCPU debooster infrastructure Wanpeng Li
2025-11-10  3:32 ` [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers Wanpeng Li
2025-11-12  6:40   ` K Prateek Nayak [this message]
2025-11-12  6:44     ` K Prateek Nayak
2025-11-13 13:36       ` Wanpeng Li
2025-11-13 12:00     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 03/10] sched/fair: Add cgroup LCA finder for hierarchical yield Wanpeng Li
2025-11-12  6:50   ` K Prateek Nayak
2025-11-13  8:59     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 04/10] sched/fair: Add penalty calculation and application logic Wanpeng Li
2025-11-12  7:25   ` K Prateek Nayak
2025-11-13 13:25     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 05/10] sched/fair: Wire up yield deboost in yield_to_task_fair() Wanpeng Li
2025-11-10  5:16   ` kernel test robot
2025-11-10  5:16   ` kernel test robot
2025-11-10  3:32 ` [PATCH 06/10] KVM: Fix last_boosted_vcpu index assignment bug Wanpeng Li
2025-11-21  0:35   ` Sean Christopherson
2025-11-21  0:38     ` Sean Christopherson
2025-11-21 11:46     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 07/10] KVM: x86: Add IPI tracking infrastructure Wanpeng Li
2025-11-10  3:32 ` [PATCH 08/10] KVM: x86/lapic: Integrate IPI tracking with interrupt delivery Wanpeng Li
2025-11-10  3:32 ` [PATCH 09/10] KVM: Implement IPI-aware directed yield candidate selection Wanpeng Li
2025-11-10  3:39 ` [PATCH 10/10] KVM: Relaxed boost as safety net Wanpeng Li
2025-11-10 12:02 ` [PATCH 00/10] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Christian Borntraeger
2025-11-12  5:01   ` Wanpeng Li
2025-11-18  8:11     ` Christian Borntraeger
2025-11-18 14:19       ` Wanpeng Li
2025-11-11  6:28 ` K Prateek Nayak
2025-11-12  4:54   ` Wanpeng Li
2025-11-12  6:07     ` K Prateek Nayak
2025-11-13  5:37       ` Wanpeng Li
2025-11-13  4:42     ` K Prateek Nayak
2025-11-13  8:33       ` Wanpeng Li
2025-11-13  9:48         ` K Prateek Nayak
2025-11-13 13:56           ` Wanpeng Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=015bfa4d-d89c-4d4e-be06-d6e46aec28cb@amd.com \
    --to=kprateek.nayak@amd.com \
    --cc=juri.lelli@redhat.com \
    --cc=kernellwp@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.