Re: [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: K Prateek Nayak <kprateek.nayak@amd.com>
To: Wanpeng Li <kernellwp@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	<linux-kernel@vger.kernel.org>, <kvm@vger.kernel.org>,
	Wanpeng Li <wanpengli@tencent.com>
Subject: Re: [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers
Date: Wed, 12 Nov 2025 12:10:15 +0530	[thread overview]
Message-ID: <015bfa4d-d89c-4d4e-be06-d6e46aec28cb@amd.com> (raw)
In-Reply-To: <20251110033232.12538-3-kernellwp@gmail.com>

Hello Wanpeng,

On 11/10/2025 9:02 AM, Wanpeng Li wrote:
> +/*
> + * High-frequency yield gating to reduce overhead on compute-intensive workloads.
> + * Returns true if the yield should be skipped due to frequency limits.
> + *
> + * Optimized: single threshold with READ_ONCE/WRITE_ONCE, refresh timestamp on every call.
> + */
> +static bool yield_deboost_rate_limit(struct rq *rq, u64 now_ns)
> +{
> +	u64 last = READ_ONCE(rq->yield_deboost_last_time_ns);
> +	bool limited = false;
> +
> +	if (last) {
> +		u64 delta = now_ns - last;
> +		limited = (delta <= 6000ULL * NSEC_PER_USEC);
> +	}
> +
> +	WRITE_ONCE(rq->yield_deboost_last_time_ns, now_ns);

We only look at local rq so READ_ONCE()/WRITE_ONCE() seems
unnecessary.

> +	return limited;
> +}
> +
> +/*
> + * Validate tasks and basic parameters for yield deboost operation.
> + * Performs comprehensive safety checks including feature enablement,
> + * NULL pointer validation, task state verification, and same-rq requirement.
> + * Returns false with appropriate debug logging if any validation fails,
> + * ensuring only safe and meaningful yield operations proceed.
> + */
> +static bool __maybe_unused yield_deboost_validate_tasks(struct rq *rq, struct task_struct *p_target,
> +					  struct task_struct **p_yielding_out,
> +					  struct sched_entity **se_y_out,
> +					  struct sched_entity **se_t_out)
> +{
> +	struct task_struct *p_yielding;
> +	struct sched_entity *se_y, *se_t;
> +	u64 now_ns;
> +
> +	if (!sysctl_sched_vcpu_debooster_enabled)
> +		return false;
> +
> +	if (!rq || !p_target)
> +		return false;
> +
> +	now_ns = rq->clock;

Brief look at Patch 5 suggests we are under the rq_lock so might
as well use the rq_clock(rq) helper. Also, you have to do a
update_rq_clock() since it isn't done until yield_task_fair().

> +
> +	if (yield_deboost_rate_limit(rq, now_ns))
> +		return false;
> +
> +	p_yielding = rq->curr;
> +	if (!p_yielding || p_yielding == p_target ||
> +	    p_target->sched_class != &fair_sched_class ||
> +	    p_yielding->sched_class != &fair_sched_class)
> +		return false;

yield_to() in syscall.c has already checked for the sched
class matching under double_rq_lock. That cannot change by the
time we are here.

> +
> +	se_y = &p_yielding->se;
> +	se_t = &p_target->se;
> +
> +	if (!se_t || !se_y || !se_t->on_rq || !se_y->on_rq)
> +		return false;
> +
> +	if (task_rq(p_yielding) != rq || task_rq(p_target) != rq)

yield_to() has already checked for this under double_rq_lock()
so this too should be unnecessary.

> +		return false;
> +
> +	*p_yielding_out = p_yielding;
> +	*se_y_out = se_y;
> +	*se_t_out = se_t;

Why do we need these pointers? Can't the caller simply do:

    if (!yield_deboost_validate_tasks(rq, target))
        return;

    p_yielding = rq->donor;
    se_y_out = &p_yielding->se;
    se_t = &target->se;

That reminds me - now that we have proxy execution, you need
to re-evaluate the usage of rq->curr (running context) vs
rq->donor (vruntime context) when looking at all this.

> +	return true;
> +}
> +
>  /*
>   * sched_yield() is very simple
>   */

-- 
Thanks and Regards,
Prateek

next prev parent reply	other threads:[~2025-11-12  6:40 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-10  3:32 [PATCH 00/10] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Wanpeng Li
2025-11-10  3:32 ` [PATCH 01/10] sched: Add vCPU debooster infrastructure Wanpeng Li
2025-11-10  3:32 ` [PATCH 02/10] sched/fair: Add rate-limiting and validation helpers Wanpeng Li
2025-11-12  6:40   ` K Prateek Nayak [this message]
2025-11-12  6:44     ` K Prateek Nayak
2025-11-13 13:36       ` Wanpeng Li
2025-11-13 12:00     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 03/10] sched/fair: Add cgroup LCA finder for hierarchical yield Wanpeng Li
2025-11-12  6:50   ` K Prateek Nayak
2025-11-13  8:59     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 04/10] sched/fair: Add penalty calculation and application logic Wanpeng Li
2025-11-12  7:25   ` K Prateek Nayak
2025-11-13 13:25     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 05/10] sched/fair: Wire up yield deboost in yield_to_task_fair() Wanpeng Li
2025-11-10  5:16   ` kernel test robot
2025-11-10  5:16   ` kernel test robot
2025-11-10  3:32 ` [PATCH 06/10] KVM: Fix last_boosted_vcpu index assignment bug Wanpeng Li
2025-11-21  0:35   ` Sean Christopherson
2025-11-21  0:38     ` Sean Christopherson
2025-11-21 11:46     ` Wanpeng Li
2025-11-10  3:32 ` [PATCH 07/10] KVM: x86: Add IPI tracking infrastructure Wanpeng Li
2025-11-10  3:32 ` [PATCH 08/10] KVM: x86/lapic: Integrate IPI tracking with interrupt delivery Wanpeng Li
2025-11-10  3:32 ` [PATCH 09/10] KVM: Implement IPI-aware directed yield candidate selection Wanpeng Li
2025-11-10  3:39 ` [PATCH 10/10] KVM: Relaxed boost as safety net Wanpeng Li
2025-11-10 12:02 ` [PATCH 00/10] sched/kvm: Semantics-aware vCPU scheduling for oversubscribed KVM Christian Borntraeger
2025-11-12  5:01   ` Wanpeng Li
2025-11-18  8:11     ` Christian Borntraeger
2025-11-18 14:19       ` Wanpeng Li
2025-11-11  6:28 ` K Prateek Nayak
2025-11-12  4:54   ` Wanpeng Li
2025-11-12  6:07     ` K Prateek Nayak
2025-11-13  5:37       ` Wanpeng Li
2025-11-13  4:42     ` K Prateek Nayak
2025-11-13  8:33       ` Wanpeng Li
2025-11-13  9:48         ` K Prateek Nayak
2025-11-13 13:56           ` Wanpeng Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=015bfa4d-d89c-4d4e-be06-d6e46aec28cb@amd.com \
    --to=kprateek.nayak@amd.com \
    --cc=juri.lelli@redhat.com \
    --cc=kernellwp@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox