From: Hillf Danton <hdanton@sina.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Wanpeng Li <wanpengli@tencent.com>
Subject: Re: [PATCH v2 2/9] sched/fair: Add rate-limiting and validation helpers
Date: Sun,  4 Jan 2026 12:09:34 +0800
Message-ID: <20260104040936.1912-1-hdanton@sina.com>
In-Reply-To: <20251219035334.39790-3-kernellwp@gmail.com>

Hi Wanpeng 

On Fri, 19 Dec 2025 11:53:26 +0800
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> Implement core safety mechanisms for yield deboost operations.
> 
> Add yield_deboost_rate_limit() for high-frequency gating to prevent
> excessive overhead on compute-intensive workloads. The 6ms threshold
> balances responsiveness with overhead reduction.
> 
> Add yield_deboost_validate_tasks() for comprehensive validation ensuring
> both tasks are valid and distinct, both belong to fair_sched_class,
> target is on the same runqueue, and tasks are runnable.
> 
Given the IPI in subsequent patches, why is the same rq required?

> The rate limiter prevents pathological high-frequency cases while
> validation ensures only appropriate task pairs proceed. Both functions
> are static and will be integrated in subsequent patches.
> 
> v1 -> v2:
> - Remove unnecessary READ_ONCE/WRITE_ONCE for per-rq fields accessed
>   under rq->lock
> - Change rq->clock to rq_clock(rq) helper for consistency
> - Change yield_deboost_rate_limit() signature from (rq, now_ns) to (rq),
>   obtaining time internally via rq_clock()
> - Remove redundant sched_class check for p_yielding (already implied by
>   rq->donor being fair)
> - Simplify task_rq check to only verify p_target
> - Change rq->curr to rq->donor for correct EEVDF donor tracking
> - Move sysctl_sched_vcpu_debooster_enabled and NULL checks to caller
>   (yield_to_deboost) for early exit before update_rq_clock()
> - Simplify function signature by returning p_yielding directly instead
>   of using output pointer parameters
> - Add documentation explaining the 6ms rate limit threshold
> 
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  kernel/sched/fair.c | 62 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 87c30db2c853..2f327882bf4d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9040,6 +9040,68 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev, struct t
>  	}
>  }
>  
> +/*
> + * Rate-limit yield deboost operations to prevent excessive overhead.
> + * Returns true if the operation should be skipped due to rate limiting.
> + *
> + * The 6ms threshold balances responsiveness with overhead reduction:
> + * - Short enough to allow timely yield boosting for lock contention
> + * - Long enough to prevent pathological high-frequency penalty application
> + *
> + * Called under rq->lock, so direct field access is safe.
> + */
> +static bool yield_deboost_rate_limit(struct rq *rq)
> +{
> +	u64 now = rq_clock(rq);
> +	u64 last = rq->yield_deboost_last_time_ns;
> +
> +	if (last && (now - last) <= 6 * NSEC_PER_MSEC)
> +		return true;
> +
> +	rq->yield_deboost_last_time_ns = now;
> +	return false;
> +}
> +
> +/*
> + * Validate tasks for yield deboost operation.
> + * Returns the yielding task on success, NULL on validation failure.
> + *
> + * Checks: feature enabled, valid target, same runqueue, target is fair class,
> + * both on_rq. Called under rq->lock.
> + *
> + * Note: p_yielding (rq->donor) is guaranteed to be fair class by the caller
> + * (yield_to_task_fair is only called when curr->sched_class == p->sched_class).
> + */
> +static struct task_struct __maybe_unused *
> +yield_deboost_validate_tasks(struct rq *rq, struct task_struct *p_target)
> +{
> +	struct task_struct *p_yielding;
> +
> +	if (!sysctl_sched_vcpu_debooster_enabled)
> +		return NULL;
> +
> +	if (!p_target)
> +		return NULL;
> +
> +	if (yield_deboost_rate_limit(rq))
> +		return NULL;
> +
> +	p_yielding = rq->donor;
> +	if (!p_yielding || p_yielding == p_target)
> +		return NULL;
> +
> +	if (p_target->sched_class != &fair_sched_class)
> +		return NULL;
> +
> +	if (task_rq(p_target) != rq)
> +		return NULL;
> +
> +	if (!p_target->se.on_rq || !p_yielding->se.on_rq)
> +		return NULL;
> +
> +	return p_yielding;
> +}
> +
>  /*
>   * sched_yield() is very simple
>   */
> -- 
> 2.43.0
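
For reference, the rate-limit check quoted above can be modelled in
userspace as below. This is only a sketch: struct rq_model,
rate_limit_model() and rate_limit_model_selfcheck() are made-up names,
and the timestamp is passed in by the caller instead of being read via
rq_clock(rq). The comparison and the 6ms constant are copied from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical stand-in for struct rq; only the field the patch adds. */
struct rq_model {
	uint64_t yield_deboost_last_time_ns;
};

/*
 * Same logic as the quoted yield_deboost_rate_limit(), except that the
 * timestamp is an argument (assumption made for testability).
 * Returns true if the deboost should be skipped.
 */
static bool rate_limit_model(struct rq_model *rq, uint64_t now)
{
	uint64_t last = rq->yield_deboost_last_time_ns;

	if (last && (now - last) <= 6 * NSEC_PER_MSEC)
		return true;

	rq->yield_deboost_last_time_ns = now;
	return false;
}

/* Walks a few timestamps through the model; the asserts state the outcome. */
void rate_limit_model_selfcheck(void)
{
	struct rq_model rq = { 0 };

	assert(!rate_limit_model(&rq, 1 * NSEC_PER_MSEC));     /* first call passes */
	assert(rate_limit_model(&rq, 4 * NSEC_PER_MSEC));      /* 3ms later: skipped */
	assert(rate_limit_model(&rq, 7 * NSEC_PER_MSEC));      /* exactly 6ms: skipped */
	assert(!rate_limit_model(&rq, 7 * NSEC_PER_MSEC + 1)); /* 6ms + 1ns: passes */
	assert(rq.yield_deboost_last_time_ns == 7 * NSEC_PER_MSEC + 1);
}
```

Two properties fall out of the model: a skipped call does not move the
stored timestamp, so the window is measured from the last accepted
deboost, and a stored value of 0 is treated as "never ran", so a deboost
accepted at clock 0 would not arm the limiter.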
