From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 24 Feb 2026 17:35:52 +0100
Message-ID: <20260224163429.273068659@kernel.org>
User-Agent: quilt/0.68
From: Thomas Gleixner
To: LKML
Cc: Anna-Maria Behnsen, John Stultz, Stephen Boyd, Daniel Lezcano,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Valentin Schneider, x86@kernel.org,
 Peter Zijlstra, Frederic Weisbecker, Eric Dumazet
Subject: [patch 08/48] sched: Optimize hrtimer handling
References: <20260224163022.795809588@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

schedule() provides several mechanisms to update the hrtick timer:

 1) When the next task is picked

 2) When the balance callbacks are invoked before rq::lock is released

Each of them can result in a first expiring timer and cause a reprogram
of the clock event device.

Solve this by deferring the rearm to the end of schedule() right before
releasing rq::lock: a flag set on entry tells hrtick_start() to cache
the runtime constraint in rq::hrtick_delay without touching the timer
itself. Right before releasing rq::lock the flags are evaluated and the
hrtick timer is either rearmed or canceled.

Signed-off-by: Thomas Gleixner

---
 kernel/sched/core.c  |   57 ++++++++++++++++++++++++++++++++++++++++++---------
 kernel/sched/sched.h |    2 +
 2 files changed, 50 insertions(+), 9 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -872,6 +872,12 @@ void update_rq_clock(struct rq *rq)
  * Use HR-timers to deliver accurate preemption points.
  */
 
+enum {
+	HRTICK_SCHED_NONE	= 0,
+	HRTICK_SCHED_DEFER	= BIT(1),
+	HRTICK_SCHED_START	= BIT(2),
+};
+
 static void hrtick_clear(struct rq *rq)
 {
 	if (hrtimer_active(&rq->hrtick_timer))
@@ -932,6 +938,17 @@ void hrtick_start(struct rq *rq, u64 del
 	 * doesn't make sense and can cause timer DoS.
 	 */
 	delta = max_t(s64, delay, 10000LL);
+
+	/*
+	 * If this is in the middle of schedule() only note the delay
+	 * and let hrtick_schedule_exit() deal with it.
+	 */
+	if (rq->hrtick_sched) {
+		rq->hrtick_sched |= HRTICK_SCHED_START;
+		rq->hrtick_delay = delta;
+		return;
+	}
+
 	rq->hrtick_time = ktime_add_ns(ktime_get(), delta);
 
 	if (rq == this_rq())
@@ -940,19 +957,40 @@ void hrtick_start(struct rq *rq, u64 del
 		smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
 }
 
-static void hrtick_rq_init(struct rq *rq)
+static inline void hrtick_schedule_enter(struct rq *rq)
 {
-	INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
-	hrtimer_setup(&rq->hrtick_timer, hrtick, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
+	rq->hrtick_sched = HRTICK_SCHED_DEFER;
 }
-#else /* !CONFIG_SCHED_HRTICK: */
-static inline void hrtick_clear(struct rq *rq)
+
+static inline void hrtick_schedule_exit(struct rq *rq)
 {
+	if (rq->hrtick_sched & HRTICK_SCHED_START) {
+		rq->hrtick_time = ktime_add_ns(ktime_get(), rq->hrtick_delay);
+		__hrtick_restart(rq);
+	} else if (idle_rq(rq)) {
+		/*
+		 * No need for using hrtimer_is_active(). The timer is CPU local
+		 * and interrupts are disabled, so the callback cannot be
+		 * running and the queued state is valid.
+		 */
+		if (hrtimer_is_queued(&rq->hrtick_timer))
+			hrtimer_cancel(&rq->hrtick_timer);
+	}
+
+	rq->hrtick_sched = HRTICK_SCHED_NONE;
 }
-static inline void hrtick_rq_init(struct rq *rq)
+static void hrtick_rq_init(struct rq *rq)
 {
+	INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
+	rq->hrtick_sched = HRTICK_SCHED_NONE;
+	hrtimer_setup(&rq->hrtick_timer, hrtick, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
 }
+#else /* !CONFIG_SCHED_HRTICK: */
+static inline void hrtick_clear(struct rq *rq) { }
+static inline void hrtick_rq_init(struct rq *rq) { }
+static inline void hrtick_schedule_enter(struct rq *rq) { }
+static inline void hrtick_schedule_exit(struct rq *rq) { }
 #endif /* !CONFIG_SCHED_HRTICK */
 
 /*
@@ -5028,6 +5066,7 @@ static inline void finish_lock_switch(st
 	 */
 	spin_acquire(&__rq_lockp(rq)->dep_map, 0, 0, _THIS_IP_);
 	__balance_callbacks(rq, NULL);
+	hrtick_schedule_exit(rq);
 	raw_spin_rq_unlock_irq(rq);
 }
 
@@ -6781,9 +6820,6 @@ static void __sched notrace __schedule(i
 
 	schedule_debug(prev, preempt);
 
-	if (sched_feat(HRTICK) || sched_feat(HRTICK_DL))
-		hrtick_clear(rq);
-
 	klp_sched_try_switch(prev);
 
 	local_irq_disable();
@@ -6810,6 +6846,8 @@ static void __sched notrace __schedule(i
 	rq_lock(rq, &rf);
 	smp_mb__after_spinlock();
 
+	hrtick_schedule_enter(rq);
+
 	/* Promote REQ to ACT */
 	rq->clock_update_flags <<= 1;
 	update_rq_clock(rq);
@@ -6911,6 +6949,7 @@ static void __sched notrace __schedule(i
 		rq_unpin_lock(rq, &rf);
 		__balance_callbacks(rq, NULL);
+		hrtick_schedule_exit(rq);
 		raw_spin_rq_unlock_irq(rq);
 	}
 
 	trace_sched_exit_tp(is_switch);
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1285,6 +1285,8 @@ struct rq {
 	call_single_data_t	hrtick_csd;
 	struct hrtimer		hrtick_timer;
 	ktime_t			hrtick_time;
+	ktime_t			hrtick_delay;
+	unsigned int		hrtick_sched;
 #endif
 
 #ifdef CONFIG_SCHEDSTATS
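For illustration only, the deferral protocol of this patch can be modeled as a
small user-space state machine. This is NOT kernel code: struct rq_model and
model_enter()/model_start()/model_exit() are hypothetical stand-ins for
hrtick_schedule_enter(), hrtick_start() and hrtick_schedule_exit(), and a
reprogram counter stands in for clock event device writes. It merely sketches
why multiple hrtick_start() calls inside one schedule() now cost at most one
reprogram:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the deferred hrtick rearm (not kernel code). */
enum {
	SCHED_NONE  = 0,
	SCHED_DEFER = 1 << 1,
	SCHED_START = 1 << 2,
};

struct rq_model {
	unsigned int sched_flags;  /* mirrors rq::hrtick_sched */
	uint64_t     delay;        /* mirrors rq::hrtick_delay */
	unsigned int reprograms;   /* stand-in for clock event device writes */
	bool         timer_queued;
};

/* hrtick_schedule_enter(): arm the deferral on entry to schedule() */
static void model_enter(struct rq_model *rq)
{
	rq->sched_flags = SCHED_DEFER;
}

/* hrtick_start(): inside schedule() only cache the delay, don't touch timer */
static void model_start(struct rq_model *rq, uint64_t delay)
{
	if (rq->sched_flags) {
		rq->sched_flags |= SCHED_START;
		rq->delay = delay;
		return;
	}
	/* Outside schedule(): program the timer immediately */
	rq->timer_queued = true;
	rq->reprograms++;
}

/* hrtick_schedule_exit(): one rearm or cancel right before rq::lock release */
static void model_exit(struct rq_model *rq, bool idle)
{
	if (rq->sched_flags & SCHED_START) {
		rq->timer_queued = true;
		rq->reprograms++;
	} else if (idle && rq->timer_queued) {
		rq->timer_queued = false;
	}
	rq->sched_flags = SCHED_NONE;
}
```

In this model, two model_start() calls inside one model_enter()/model_exit()
pair result in a single "reprogram" at exit, using the last cached delay,
which is the effect the patch achieves for the real clock event device.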