Linux Power Management development


* [patch 12/12] alarmtimer: Remove unused interfaces
From: Thomas Gleixner @ 2026-04-07  8:55 UTC (permalink / raw)
  To: LKML
  Cc: John Stultz, Stephen Boyd, Calvin Owens, Peter Zijlstra,
	Anna-Maria Behnsen, Frederic Weisbecker, Ingo Molnar,
	Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel,
	Sebastian Reichel, linux-pm, Pablo Neira Ayuso, Florian Westphal,
	Phil Sutter, netfilter-devel, coreteam
In-Reply-To: <20260407083219.478203185@kernel.org>

All alarmtimer users are converted to alarmtimer_start(). Remove the now
unused interfaces.

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Stephen Boyd <sboyd@kernel.org>
---
 include/linux/alarmtimer.h |    3 ---
 kernel/time/alarmtimer.c   |   44 --------------------------------------------
 2 files changed, 47 deletions(-)

--- a/include/linux/alarmtimer.h
+++ b/include/linux/alarmtimer.h
@@ -50,9 +50,6 @@ static __always_inline ktime_t alarm_get
 void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
 		void (*function)(struct alarm *, ktime_t));
 bool alarmtimer_start(struct alarm *alarm, ktime_t expires, bool relative);
-void alarm_start(struct alarm *alarm, ktime_t start);
-void alarm_start_relative(struct alarm *alarm, ktime_t start);
-void alarm_restart(struct alarm *alarm);
 int alarm_try_to_cancel(struct alarm *alarm);
 int alarm_cancel(struct alarm *alarm);
 
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -333,39 +333,6 @@ void alarm_init(struct alarm *alarm, enu
 EXPORT_SYMBOL_GPL(alarm_init);
 
 /**
- * alarm_start - Sets an absolute alarm to fire
- * @alarm: ptr to alarm to set
- * @start: time to run the alarm
- */
-void alarm_start(struct alarm *alarm, ktime_t start)
-{
-	struct alarm_base *base = &alarm_bases[alarm->type];
-
-	scoped_guard(spinlock_irqsave, &base->lock) {
-		alarm->node.expires = start;
-		alarmtimer_enqueue(base, alarm);
-		hrtimer_start(&alarm->timer, alarm->node.expires, HRTIMER_MODE_ABS);
-	}
-
-	trace_alarmtimer_start(alarm, base->get_ktime());
-}
-EXPORT_SYMBOL_GPL(alarm_start);
-
-/**
- * alarm_start_relative - Sets a relative alarm to fire
- * @alarm: ptr to alarm to set
- * @start: time relative to now to run the alarm
- */
-void alarm_start_relative(struct alarm *alarm, ktime_t start)
-{
-	struct alarm_base *base = &alarm_bases[alarm->type];
-
-	start = ktime_add_safe(start, base->get_ktime());
-	alarm_start(alarm, start);
-}
-EXPORT_SYMBOL_GPL(alarm_start_relative);
-
-/**
  * alarmtimer_start - Sets an alarm to fire
  * @alarm:	Pointer to alarm to set
  * @expires:	Expiry time
@@ -393,17 +360,6 @@ bool alarmtimer_start(struct alarm *alar
 }
 EXPORT_SYMBOL_GPL(alarmtimer_start);
 
-void alarm_restart(struct alarm *alarm)
-{
-	struct alarm_base *base = &alarm_bases[alarm->type];
-
-	guard(spinlock_irqsave)(&base->lock);
-	hrtimer_set_expires(&alarm->timer, alarm->node.expires);
-	hrtimer_restart(&alarm->timer);
-	alarmtimer_enqueue(base, alarm);
-}
-EXPORT_SYMBOL_GPL(alarm_restart);
-
 /**
  * alarm_try_to_cancel - Tries to cancel an alarm timer
  * @alarm: ptr to alarm to be canceled


^ permalink raw reply

* [PATCH] cpufreq: Fix race between suspend/resume and CPU hotplug
From: Tianxiang Chen @ 2026-04-07  9:35 UTC (permalink / raw)
  To: rafael; +Cc: viresh.kumar, linux-pm, linux-kernel, lingyue, Tianxiang Chen

CPU hotplug operations can race with cpufreq_suspend()
and cpufreq_resume(), leading to null pointer dereferences
when accessing governor data. This occurs because there is
no synchronization between suspend/resume operations and
CPU hotplug, allowing concurrent access to
policy->governor_data while it is being freed or initialized.

Detailed race condition scenario:

1. Thread A (cpufreq_suspend) starts execution:
   - Iterates through active policies
   - Calls cpufreq_stop_governor(policy) for each policy
   - Sets cpufreq_suspended = true

2. Thread B (CPU hotplug) executes concurrently:
   - Calls cpu_down(cpu)
   - Calls cpuhp_cpufreq_offline(cpu)
   - Calls cpufreq_offline(cpu)
   - Inside cpufreq_offline():
     * Stops governor: policy->governor->stop(policy)
     * Exits governor: policy->governor->exit(policy)
     * Frees governor_data: kfree(policy->governor_data)
     * Sets policy->governor_data = NULL

3. Race window between step 1 and step 2:
   - Thread A is iterating policies and stopping governors
   - Thread B is concurrently executing CPU offline
   - Both threads may access the same policy->governor_data
   - Thread B frees governor_data while Thread A is still using it
   - Thread A accesses governor_data after it was freed or set to NULL
     → use-after-free or null pointer dereference

Similarly, cpufreq_resume() can race with CPU hotplug where governor_data
is being initialized while hotplug is trying to access it, leading to
accessing uninitialized data.

Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
---
 drivers/cpufreq/cpufreq.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1f794524a1d9..8b03785764fa 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
        if (!cpufreq_driver)
                return;

+       cpus_read_lock();
        if (!has_target() && !cpufreq_driver->suspend)
                goto suspend;

@@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)

 suspend:
        cpufreq_suspended = true;
+       cpus_read_unlock();
 }

 /**
@@ -2017,10 +2019,11 @@ void cpufreq_resume(void)
        if (unlikely(!cpufreq_suspended))
                return;

+       cpus_read_lock();
        cpufreq_suspended = false;

        if (!has_target() && !cpufreq_driver->resume)
-               return;
+               goto out;

        pr_debug("%s: Resuming Governors\n", __func__);

@@ -2038,6 +2041,9 @@ void cpufreq_resume(void)
                                       __func__, policy->cpu);
                }
        }
+
+out:
+       cpus_read_unlock();
 }

 /**
--
2.34.1
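
The locking change in the diff above follows the usual single-exit pattern: once the lock is taken, every return path has to pass through the unlock, which is why the early `return` in cpufreq_resume() becomes `goto out`. A minimal userspace sketch of that control flow, assuming nothing about the real cpufreq internals; `model_lock_depth`, `model_suspended`, `model_resume()` and its parameters are hypothetical stand-ins for cpus_read_lock()/cpus_read_unlock() and cpufreq_resume():

```c
#include <stdbool.h>

/* Hypothetical model state, not kernel data structures. */
static int model_lock_depth;	/* > 0 while the modelled read lock is held */
static bool model_suspended;

static void model_read_lock(void)   { model_lock_depth++; }
static void model_read_unlock(void) { model_lock_depth--; }

/*
 * Models the shape of the patched resume path.
 * Returns the number of policies that were "resumed".
 */
static int model_resume(bool has_work, int nr_policies)
{
	int resumed = 0;

	if (!model_suspended)
		return 0;	/* lock not taken yet, a plain return is fine */

	model_read_lock();
	model_suspended = false;

	if (!has_work)
		goto out;	/* must not return with the lock held */

	for (int i = 0; i < nr_policies; i++)
		resumed++;	/* stand-in for restarting one governor */

out:
	model_read_unlock();
	return resumed;
}
```

Whatever path is taken, `model_lock_depth` is back at zero when the function returns, which is the property the `goto out` conversion preserves.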


^ permalink raw reply related

* Re: [patch 01/12] clockevents: Prevent timer interrupt starvation
From: Peter Zijlstra @ 2026-04-07  9:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.562657657@kernel.org>

On Tue, Apr 07, 2026 at 10:54:17AM +0200, Thomas Gleixner wrote:


> @@ -324,16 +324,23 @@ int clockevents_program_event(struct clo
>  		return dev->set_next_ktime(expires, dev);
>  
>  	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
> -	if (delta <= 0)
> -		return force ? clockevents_program_min_delta(dev) : -ETIME;
>  
> -	delta = min(delta, (int64_t) dev->max_delta_ns);
> -	delta = max(delta, (int64_t) dev->min_delta_ns);
>  
> -	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> -	rc = dev->set_next_event((unsigned long) clc, dev);
>  
> -	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
>  }

> @@ -324,16 +324,23 @@ int clockevents_program_event(struct clo
>  		return dev->set_next_ktime(expires, dev);
>  
>  	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
>  
> +	if (delta > (int64_t)dev->min_delta_ns) {
> +		delta = min(delta, (int64_t) dev->max_delta_ns);
> +		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> +		if (!dev->set_next_event((unsigned long) clc, dev))
> +			return 0;
> +	}
>  
> +	if (dev->next_event_forced)
> +		return 0;
>  
> +	if (dev->set_next_event(dev->min_delta_ticks, dev)) {
> +		if (!force || clockevents_program_min_delta(dev))
> +			return -ETIME;
> +	}
> +	dev->next_event_forced = 1;
> +	return 0;
>  }

Looking at the implementation of clockevents_program_min_delta() doing
that dev->set_next_event(dev->min_delta_ticks,) right before it seems a
bit daft.

But yes, this is effectively also what the old code did.

The only thing that seems to be different, is that the old code would
return the ->set_next_event() error code, rather than 0 in the !force
case.

^ permalink raw reply

* Re: [patch 02/12] hrtimer: Provide hrtimer_start_range_ns_user()
From: Peter Zijlstra @ 2026-04-07  9:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.630389532@kernel.org>

On Tue, Apr 07, 2026 at 10:54:22AM +0200, Thomas Gleixner wrote:

> +enum {
> +	HRTIMER_REPROGRAM_NONE,
> +	HRTIMER_REPROGRAM,
> +	HRTIMER_REPROGRAM_FORCE,
> +};

> +static int hrtimer_start_range_ns_common(struct hrtimer *timer, ktime_t tim,
> +					 u64 delta_ns, const enum hrtimer_mode mode,
> +					 struct hrtimer_clock_base *base)

> @@ -1315,25 +1337,110 @@ void hrtimer_start_range_ns(struct hrtim
>  	struct hrtimer_clock_base *base;
>  	unsigned long flags;
>  
> -	/*
> -	 * Check whether the HRTIMER_MODE_SOFT bit and hrtimer.is_soft
> -	 * match on CONFIG_PREEMPT_RT = n. With PREEMPT_RT check the hard
> -	 * expiry mode because unmarked timers are moved to softirq expiry.
> -	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> -		WARN_ON_ONCE(!(mode & HRTIMER_MODE_SOFT) ^ !timer->is_soft);
> -	else
> -		WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard);
> -
>  	base = lock_hrtimer_base(timer, &flags);
>  
> -	if (__hrtimer_start_range_ns(timer, tim, delta_ns, mode, base))
> +	switch (hrtimer_start_range_ns_common(timer, tim, delta_ns, mode, base)) {
> +	case HRTIMER_REPROGRAM:
>  		hrtimer_reprogram(timer, true);
> +		break;
> +	case HRTIMER_REPROGRAM_FORCE:
> +		hrtimer_force_reprogram(timer->base->cpu_base, 1);
> +		break;
> +	}
>  
>  	unlock_hrtimer_base(timer, &flags);
>  }

Something is going to figure out that hrtimer_start_range_ns_common() is
really returning that enum and then complain you don't handle NONE :-)

Anyway, to me it would make sense to instead pass that value to
hrtimer_reprogram() as the second argument. But this works I suppose.
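
The complaint anticipated above is the kind compilers raise (for example GCC's -Wswitch) when a switch over an enum-typed value has neither a case for every enumerator nor a default label. A minimal userspace sketch, assuming the helper were declared to return the enum type rather than int as in the quoted patch; `enum reprogram_action`, `struct counters` and `apply()` are hypothetical names, not kernel code:

```c
/* Mirrors the three values of the quoted enum; the type name is made up. */
enum reprogram_action {
	REPROGRAM_NONE,
	REPROGRAM,
	REPROGRAM_FORCE,
};

/* Records which path was taken so the behavior can be checked. */
struct counters {
	int reprogram;
	int force;
};

static void apply(enum reprogram_action action, struct counters *c)
{
	switch (action) {
	case REPROGRAM_NONE:
		/* Nothing to do; listed explicitly so every enumerator is handled. */
		break;
	case REPROGRAM:
		c->reprogram++;		/* stand-in for hrtimer_reprogram() */
		break;
	case REPROGRAM_FORCE:
		c->force++;		/* stand-in for hrtimer_force_reprogram() */
		break;
	}
}
```

Listing the no-op case costs two lines and keeps the switch exhaustive if the enum ever gets a named type.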

^ permalink raw reply

* Re: [patch 02/12] hrtimer: Provide hrtimer_start_range_ns_user()
From: Peter Zijlstra @ 2026-04-07  9:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.630389532@kernel.org>

On Tue, Apr 07, 2026 at 10:54:22AM +0200, Thomas Gleixner wrote:

> +static inline bool hrtimer_check_user_timer(struct hrtimer *timer)
> +{
> +	struct hrtimer_cpu_base *cpu_base = timer->base->cpu_base;
> +	ktime_t expires;
> +
> +	/*
> +	 * This uses soft expires because that's the user provided
> +	 * expiry time, while expires can be further in the past
> +	 * due to a slack value added to the user expiry time.
> +	 */
> +	expires = hrtimer_get_softexpires(timer);
> +
> +	/* Convert to monotonic */
> +	expires = ktime_sub(expires, timer->base->offset);
> +
> +	/*
> +	 * Check whether this timer will end up as the first expiring timer in
> +	 * the CPU base. If not, no further checks required as it's then
> +	 * guaranteed to expire in the future.
> +	 */
> +	if (expires >= cpu_base->expires_next)
> +		return true;
> +
> +	/* Validate that the expiry time is in the future. */
> +	if (expires > ktime_get())
> +		return true;
> +
> +	debug_deactivate(timer);
> +	__remove_hrtimer(timer, timer->base, HRTIMER_STATE_INACTIVE, false);
> +	trace_hrtimer_start_expired(timer);
> +	return false;
> +}
> +
> +static bool hrtimer_reprogram_user(struct hrtimer *timer)
> +{
> +	if (!hrtimer_check_user_timer(timer))
> +		return false;
> +	hrtimer_reprogram(timer, true);
> +	return true;
> +}
> +
> +static bool hrtimer_force_reprogram_user(struct hrtimer *timer)
> +{
> +	bool ret = hrtimer_check_user_timer(timer);
> +
> +	/*
> +	 * The base must always be reevaluated, independent of the result
> +	 * above because the timer was the first pending timer.
> +	 */
> +	hrtimer_force_reprogram(timer->base->cpu_base, 1);
> +	return ret;
> +}
> +
> +/**
> + * hrtimer_start_range_ns_user - (re)start an user controlled hrtimer
> + * @timer:	the timer to be added
> + * @tim:	expiry time
> + * @delta_ns:	"slack" range for the timer
> + * @mode:	timer mode: absolute (HRTIMER_MODE_ABS) or
> + *		relative (HRTIMER_MODE_REL), and pinned (HRTIMER_MODE_PINNED);
> + *		softirq based mode is considered for debug purpose only!
> + *
> + * Returns: True when the timer was queued, false if it was already expired
> + *
> + * This function cannot invoke the timer callback for expired timers as it might
> + * be called under a lock which the timer callback needs to acquire. So the
> + * caller has to handle that case.
> + */
> +bool hrtimer_start_range_ns_user(struct hrtimer *timer, ktime_t tim,
> +				 u64 delta_ns, const enum hrtimer_mode mode)
> +{
> +	struct hrtimer_clock_base *base;
> +	unsigned long flags;
> +	bool ret = true;
> +
> +	base = lock_hrtimer_base(timer, &flags);
> +	switch (hrtimer_start_range_ns_common(timer, tim, delta_ns, mode, base)) {
> +	case HRTIMER_REPROGRAM:
> +		ret = hrtimer_reprogram_user(timer);
> +		break;
> +	case HRTIMER_REPROGRAM_FORCE:
> +		ret = hrtimer_force_reprogram_user(timer);
> +		break;
> +	}
> +	unlock_hrtimer_base(timer, &flags);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(hrtimer_start_range_ns_user);

Can we do that hrtimer_check_user_timer() in
hrtimer_start_range_ns_user() and then not duplicate
hrtimer_*reprogram() ?

^ permalink raw reply

* Re: [patch 03/12] hrtimer: Use hrtimer_start_expires_user() for hrtimer sleepers
From: Peter Zijlstra @ 2026-04-07  9:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, Calvin Owens,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.696142908@kernel.org>

On Tue, Apr 07, 2026 at 10:54:27AM +0200, Thomas Gleixner wrote:
> Most hrtimer sleepers are user controlled and user space can hand arbitrary
> expiry values in as long as they are valid timespecs. If the expiry value
> is in the past then this requires a full loop through reprogramming the
> clock event device, taking the hrtimer interrupt, waking the task and
> reprogram again.
> 
> Use hrtimer_start_expires_user() which avoids the full round trip by
> checking the timer for expiry on enqueue.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  kernel/time/hrtimer.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -2152,7 +2152,11 @@ void hrtimer_sleeper_start_expires(struc
>  	if (IS_ENABLED(CONFIG_PREEMPT_RT) && sl->timer.is_hard)
>  		mode |= HRTIMER_MODE_HARD;
>  
> -	hrtimer_start_expires(&sl->timer, mode);
> +	/* If already expired, clear the task pointer and set current state to running */
> +	if (!hrtimer_start_expires_user(&sl->timer, mode)) {
> +		sl->task = NULL;
> +		__set_current_state(TASK_RUNNING);
> +	}
>  }
>  EXPORT_SYMBOL_GPL(hrtimer_sleeper_start_expires);
>  
> 

^ permalink raw reply

* Re: [patch 04/12] posix-timers: Expand timer_[re]arm() callbacks with a boolean return value
From: Peter Zijlstra @ 2026-04-07 10:00 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, John Stultz, Stephen Boyd, Anna-Maria Behnsen,
	Frederic Weisbecker, Calvin Owens, Ingo Molnar, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.763539663@kernel.org>

On Tue, Apr 07, 2026 at 10:54:33AM +0200, Thomas Gleixner wrote:
> In order to catch expiry times which are already in the past the
> timer_arm() and timer_rearm() callbacks need to be able to report back to
> the caller whether the timer has been queued or not.
> 
> Change the function signature and let all implementations return true for
> now. While at it simplify posix_cpu_timer_rearm().
> 
> No functional change intended.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply

* Re: [patch 05/12] posix-timers: Handle the timer_[re]arm() return value
From: Peter Zijlstra @ 2026-04-07 10:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, Calvin Owens,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.831143104@kernel.org>

On Tue, Apr 07, 2026 at 10:54:38AM +0200, Thomas Gleixner wrote:
> The [re]arm callbacks will return true when the timer was queued and false
> if it was already expired at enqueue time.
> 
> In both cases the call sites can trivially queue the signal right there,
> when the timer was already expired.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply

* Re: [patch 06/12] posix-timers: Switch to hrtimer_start_expires_user()
From: Peter Zijlstra @ 2026-04-07 10:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, Calvin Owens,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.898494239@kernel.org>

On Tue, Apr 07, 2026 at 10:54:43AM +0200, Thomas Gleixner wrote:
> Switch the arm and rearm callbacks for hrtimer based posix timers over to
> hrtimer_start_expires_user() so that already expired timers are not
> queued. Hand the result back to the caller, which then queues the signal.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply

* Re: [patch 07/12] alarmtimer: Provide alarmtimer_start()
From: Peter Zijlstra @ 2026-04-07 10:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, John Stultz, Stephen Boyd, Calvin Owens, Anna-Maria Behnsen,
	Frederic Weisbecker, Ingo Molnar, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083247.965539525@kernel.org>

On Tue, Apr 07, 2026 at 10:54:48AM +0200, Thomas Gleixner wrote:
> Alarm timers utilize hrtimers for normal operation and only switch to the
> RTC on suspend. In order to catch already expired timers early and without
> going through a timer interrupt cycle, provide a new start function which
> internally uses hrtimer_start_range_ns_user().
> 
> If hrtimer_start_range_ns_user() detects an already expired timer, it does
> not queue it. In that case remove the timer from the alarm base as well.
> 
> Return the status queued or not back to the caller to handle the early
> expiry.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

Not familiar with this code, but my head hurts from the:

alarm_
alarm_timer_
alarmtimer_

prefixes, what's what?

^ permalink raw reply

* Re: [patch 09/12] fs/timerfd: Use the new alarm/hrtimer functions
From: Peter Zijlstra @ 2026-04-07 10:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexander Viro, Christian Brauner, Jan Kara,
	Anna-Maria Behnsen, Frederic Weisbecker, linux-fsdevel,
	Calvin Owens, Ingo Molnar, John Stultz, Stephen Boyd,
	Sebastian Reichel, linux-pm, Pablo Neira Ayuso, Florian Westphal,
	Phil Sutter, netfilter-devel, coreteam
In-Reply-To: <20260407083248.102440187@kernel.org>

On Tue, Apr 07, 2026 at 10:54:58AM +0200, Thomas Gleixner wrote:

> +static u64 timerfd_alarm_restart(struct timerfd_ctx *ctx)
> +{
> +	u64 ticks = alarm_forward_now(&ctx->t.alarm, ctx->tintv) - 1;

(still confused on the alarm_forward_now() vs alarmtimer_start()
namespacing)

> +
> +	timerfd_alarm_start(ctx, alarm_get_expires(&ctx->t.alarm), false);
> +	return ticks;
> +}
> +
> +static void timerfd_hrtimer_start(struct timerfd_ctx *ctx, ktime_t exp,
> +				  const enum hrtimer_mode mode)
> +{
> +	/* Start the timer. If it's expired already, handle the callback. */
> +	if (!hrtimer_start_range_ns_user(&ctx->t.tmr, exp, 0, mode))
> +		__timerfd_triggered(ctx);
> +}
> +
> +static u64 timerfd_hrtimer_restart(struct timerfd_ctx *ctx)
> +{
> +	u64 ticks = hrtimer_forward_now(&ctx->t.tmr, ctx->tintv) - 1;
> +
> +	timerfd_hrtimer_start(ctx, hrtimer_get_expires(&ctx->t.tmr), HRTIMER_MODE_ABS);
> +	return ticks;
> +}

> -		if (ctx->expired && ctx->tintv) {
> -			/*
> -			 * If tintv != 0, this is a periodic timer that
> -			 * needs to be re-armed. We avoid doing it in the timer
> -			 * callback to avoid DoS attacks specifying a very
> -			 * short timer period.
> -			 */
> -			if (isalarm(ctx)) {
> -				ticks += alarm_forward_now(
> -					&ctx->t.alarm, ctx->tintv) - 1;
> -				alarm_restart(&ctx->t.alarm);
> -			} else {
> -				ticks += hrtimer_forward_now(&ctx->t.tmr,
> -							     ctx->tintv) - 1;
> -				hrtimer_restart(&ctx->t.tmr);
> -			}
> -		}
> +		ticks = ctx->ticks;
>  		ctx->expired = 0;
>  		ctx->ticks = 0;
> +
> +		/*
> +		 * If tintv != 0, this is a periodic timer that needs to be
> +		 * re-armed. We avoid doing it in the timer callback to avoid
> +		 * DoS attacks specifying a very short timer period.
> +		 */
> +		if (expired && ctx->tintv)
> +			ticks += timerfd_restart(ctx);
>  	}
>  	spin_unlock_irq(&ctx->wqh.lock);
>  	if (ticks) {
> @@ -526,18 +554,7 @@ static int do_timerfd_gettime(int ufd, s
>  	spin_lock_irq(&ctx->wqh.lock);
>  	if (ctx->expired && ctx->tintv) {
>  		ctx->expired = 0;
> -
> -		if (isalarm(ctx)) {
> -			ctx->ticks +=
> -				alarm_forward_now(
> -					&ctx->t.alarm, ctx->tintv) - 1;
> -			alarm_restart(&ctx->t.alarm);
> -		} else {
> -			ctx->ticks +=
> -				hrtimer_forward_now(&ctx->t.tmr, ctx->tintv)
> -				- 1;

(argh!)

> -			hrtimer_restart(&ctx->t.tmr);
> -		}
> +		ctx->ticks += timerfd_restart(ctx);
>  	}
>  	t->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx));
>  	t->it_interval = ktime_to_timespec64(ctx->tintv);

What's with the -1 thing?

Anyway, this looks about right.
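
On the -1 question: hrtimer_forward_now() and alarm_forward_now() return the number of periods by which the expiry was moved forward. The sketch below is a simplified userspace model of that forwarding step, under the assumption (taken from the existing timerfd code, not from this patch) that the timer callback has already counted one tick for the expiry that set ctx->expired, so only the additional periods are added. `struct model_timer`, `model_forward()` and `model_restart()` are hypothetical names and times are plain nanosecond counts:

```c
#include <stdint.h>

/* Simplified periodic timer: only the absolute expiry time is modelled. */
struct model_timer {
	int64_t expires;
};

/*
 * Advance t->expires past 'now' in whole multiples of 'interval' and return
 * how many periods were added. Returns 0 if the timer has not expired yet
 * or the interval is not positive.
 */
static uint64_t model_forward(struct model_timer *t, int64_t now, int64_t interval)
{
	int64_t delta = now - t->expires;
	uint64_t orun = 1;

	if (interval <= 0 || delta < 0)
		return 0;

	if (delta >= interval) {
		int64_t whole = delta / interval;	/* fully elapsed periods */

		t->expires += interval * whole;
		orun += (uint64_t)whole;
	}
	t->expires += interval;		/* move strictly past 'now' */
	return orun;
}

/*
 * Model of the restart helpers: the expiry that fired was already counted
 * by the callback, so it is subtracted again. Only valid for an expired timer.
 */
static uint64_t model_restart(struct model_timer *t, int64_t now, int64_t interval)
{
	return model_forward(t, now, interval) - 1;
}
```

With an expiry at 100, a period of 10 and a read at 135, the model forwards four periods (100, 110, 120, 130), of which the first was already accounted for, leaving three extra ticks and a new expiry at 140.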

^ permalink raw reply

* Re: [patch 10/12] power: supply: charger-manager: Switch to alarmtimer_start()
From: Peter Zijlstra @ 2026-04-07 10:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Sebastian Reichel, linux-pm, Calvin Owens,
	Anna-Maria Behnsen, Frederic Weisbecker, Ingo Molnar, John Stultz,
	Stephen Boyd, Alexander Viro, Christian Brauner, Jan Kara,
	linux-fsdevel, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407083248.169005310@kernel.org>

On Tue, Apr 07, 2026 at 10:55:03AM +0200, Thomas Gleixner wrote:

> +		exp = ktime_set(wakeup_ms / MSEC_PER_SEC,
>  				(wakeup_ms % MSEC_PER_SEC) * NSEC_PER_MSEC);

Surely we can write this less insane?

  exp = wakeup_ms * NSEC_PER_MSEC;

comes to mind? And yes, we then seem to lose that KTIME_SEC_MAX check,
but urgh.
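
A minimal userspace sketch of the two forms, to show where they agree and where they differ. It assumes ktime_t is a signed 64-bit nanosecond count and that ktime_set() saturates to KTIME_MAX once the seconds value reaches KTIME_SEC_MAX; `model_ktime_set()`, `exp_split()` and `exp_direct()` are hypothetical names, and the constants are re-declared locally rather than taken from kernel headers:

```c
#include <stdint.h>

#define NSEC_PER_SEC	1000000000LL
#define NSEC_PER_MSEC	1000000LL
#define MSEC_PER_SEC	1000ULL
#define KTIME_MAX	INT64_MAX
#define KTIME_SEC_MAX	(KTIME_MAX / NSEC_PER_SEC)

/* Model of ktime_set(): saturates instead of overflowing. */
static int64_t model_ktime_set(int64_t secs, unsigned long nsecs)
{
	if (secs >= KTIME_SEC_MAX)
		return KTIME_MAX;
	return secs * NSEC_PER_SEC + (int64_t)nsecs;
}

/* The form used in the quoted hunk: split into seconds and nanoseconds. */
static int64_t exp_split(uint64_t wakeup_ms)
{
	return model_ktime_set((int64_t)(wakeup_ms / MSEC_PER_SEC),
			       (unsigned long)((wakeup_ms % MSEC_PER_SEC) * NSEC_PER_MSEC));
}

/* The suggested form: no saturation, caller must keep wakeup_ms in range. */
static int64_t exp_direct(uint64_t wakeup_ms)
{
	return (int64_t)wakeup_ms * NSEC_PER_MSEC;
}
```

For any realistic polling interval both forms give the same result; they only diverge once the value reaches KTIME_SEC_MAX seconds, where the split form clamps to KTIME_MAX and the direct form does not.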

^ permalink raw reply

* [PATCH] power: supply: fix OF node reference imbalance
From: Johan Hovold @ 2026-04-07 10:40 UTC (permalink / raw)
  To: Sebastian Reichel
  Cc: Hans de Goede, Krzysztof Kozlowski, Marek Szyprowski,
	Sebastian Krzyszkowiak, Purism Kernel Team, linux-pm,
	linux-kernel, Johan Hovold, stable, Dzmitry Sankouski

The driver reuses the OF node of the parent multi-function device but
fails to take another reference to balance the one dropped by the
platform bus code when unbinding the MFD and deregistering the child
devices.

Fix this by using the intended helper for reusing OF nodes.

Fixes: 0cd4f1f77ad4 ("power: supply: max17042: add platform driver variant")
Cc: stable@vger.kernel.org	# 6.14
Cc: Dzmitry Sankouski <dsankouski@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
---
 drivers/power/supply/max17042_battery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/supply/max17042_battery.c b/drivers/power/supply/max17042_battery.c
index acea176101fa..914f18ce79b3 100644
--- a/drivers/power/supply/max17042_battery.c
+++ b/drivers/power/supply/max17042_battery.c
@@ -1165,7 +1165,8 @@ static int max17042_platform_probe(struct platform_device *pdev)
 	if (!i2c)
 		return -EINVAL;
 
-	dev->of_node = dev->parent->of_node;
+	device_set_of_node_from_dev(dev, dev->parent);
+
 	id = platform_get_device_id(pdev);
 	irq = platform_get_irq(pdev, 0);
 
-- 
2.52.0
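
The imbalance described in the commit message can be illustrated with a small userspace refcount model. This is a sketch under stated assumptions, not the driver core: `struct model_node`, `struct model_dev`, `assign_plain()`, `assign_with_ref()` and `unregister_child()` are hypothetical names; `assign_with_ref()` models what device_set_of_node_from_dev() does (drop the old node, take a reference on the new one, mark it reused), and `unregister_child()` models the reference drop on child deregistration mentioned above:

```c
#include <stdbool.h>
#include <stddef.h>

struct model_node {
	int refcount;
};

struct model_dev {
	struct model_dev *parent;
	struct model_node *of_node;
	bool of_node_reused;
};

static struct model_node *node_get(struct model_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct model_node *n)
{
	if (n)
		n->refcount--;
}

/* Old code: borrow the parent's pointer without taking a reference. */
static void assign_plain(struct model_dev *dev)
{
	dev->of_node = dev->parent->of_node;
}

/* Model of the helper used by the fix. */
static void assign_with_ref(struct model_dev *dev, const struct model_dev *src)
{
	node_put(dev->of_node);
	dev->of_node = node_get(src->of_node);
	dev->of_node_reused = true;
}

/* Model of child deregistration dropping the child's node reference. */
static void unregister_child(struct model_dev *dev)
{
	node_put(dev->of_node);
	dev->of_node = NULL;
}
```

With the plain assignment, unregistering the child drops the reference that belongs to the parent; with the helper, the count returns to where it started.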


^ permalink raw reply related

* Re: [PATCH v2 2/2] interconnect: qcom: add Hawi interconnect provider driver
From: Konrad Dybcio @ 2026-04-07 10:54 UTC (permalink / raw)
  To: Vivek Aknurwar, Georgi Djakov, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: linux-arm-msm, linux-pm, devicetree, linux-kernel, Mike Tipton,
	Krzysztof Kozlowski
In-Reply-To: <20260406-icc-hawi-v2-2-6cfee87a1d25@oss.qualcomm.com>

On 4/7/26 1:04 AM, Vivek Aknurwar wrote:
> Add driver for the Qualcomm interconnect buses found in Hawi
> based platforms. The topology consists of several NoCs that are
> controlled by a remote processor that collects the aggregated
> bandwidth for each master-slave pair.
> 
> Signed-off-by: Vivek Aknurwar <vivek.aknurwar@oss.qualcomm.com>
> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
> ---

You lost my and Dmitry's r-bs from v1

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>

Konrad

^ permalink raw reply

* Re: [patch 01/12] clockevents: Prevent timer interrupt starvation
From: Thomas Gleixner @ 2026-04-07 11:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407094206.GL2872@noisy.programming.kicks-ass.net>

On Tue, Apr 07 2026 at 11:42, Peter Zijlstra wrote:
> On Tue, Apr 07, 2026 at 10:54:17AM +0200, Thomas Gleixner wrote:
>> @@ -324,16 +324,23 @@ int clockevents_program_event(struct clo
>>  		return dev->set_next_ktime(expires, dev);
>>  
>>  	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
>>  
>> +	if (delta > (int64_t)dev->min_delta_ns) {
>> +		delta = min(delta, (int64_t) dev->max_delta_ns);
>> +		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
>> +		if (!dev->set_next_event((unsigned long) clc, dev))
>> +			return 0;
>> +	}
>>  
>> +	if (dev->next_event_forced)
>> +		return 0;
>>  
>> +	if (dev->set_next_event(dev->min_delta_ticks, dev)) {
>> +		if (!force || clockevents_program_min_delta(dev))
>> +			return -ETIME;
>> +	}
>> +	dev->next_event_forced = 1;
>> +	return 0;
>>  }
>
> Looking at the implementation of clockevents_program_min_delta() doing
> that dev->set_next_event(dev->min_delta_ticks,) right before it seems a
> bit daft.
>
> But yes, this is effectively also what the old code did.

Yes. I looked at that and didn't come up with a good plan.

> The only thing that seems to be different, is that the old code would
> return the ->set_next_event() error code, rather than 0 in the !force
> case.

You mean when dev->next_event_forced is set and the ->set_next_event()
callback above failed?

Thanks,

        tglx



^ permalink raw reply

* Re: [patch 02/12] hrtimer: Provide hrtimer_start_range_ns_user()
From: Thomas Gleixner @ 2026-04-07 11:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407095421.GM2872@noisy.programming.kicks-ass.net>

On Tue, Apr 07 2026 at 11:54, Peter Zijlstra wrote:
> On Tue, Apr 07, 2026 at 10:54:22AM +0200, Thomas Gleixner wrote:
>> -	if (__hrtimer_start_range_ns(timer, tim, delta_ns, mode, base))
>> +	switch (hrtimer_start_range_ns_common(timer, tim, delta_ns, mode, base)) {
>> +	case HRTIMER_REPROGRAM:
>>  		hrtimer_reprogram(timer, true);
>> +		break;
>> +	case HRTIMER_REPROGRAM_FORCE:
>> +		hrtimer_force_reprogram(timer->base->cpu_base, 1);
>> +		break;
>> +	}
>>  
>>  	unlock_hrtimer_base(timer, &flags);
>>  }
>
> Something is going to figure out that hrtimer_start_range_ns_common() is
> really returning that enum and then complain you don't handle NONE :-)

:)

> Anyway, to me it would make sense to instead pass that value to
> hrtimer_reprogram() as the second argument. But this works I suppose.

I can do that too. Splitting it this way made me more comfortable to
validate the logic I was implementing.

^ permalink raw reply

* Re: [PATCH] cpufreq: governor: Fix race between sysfs store and dbs work handler
From: Zhongqiu Han @ 2026-04-07 11:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: viresh.kumar, venkatesh.pallipadi, davej, trenn, linux-pm,
	linux-kernel, zhongqiu.han
In-Reply-To: <CAJZ5v0j7z0oQTDLAA7wtCpF-33ZSw6wxEoL30eohDpBTofAGVw@mail.gmail.com>

On 4/6/2026 11:21 PM, Rafael J. Wysocki wrote:
> On Mon, Apr 6, 2026 at 1:01 PM Zhongqiu Han
> <zhongqiu.han@oss.qualcomm.com> wrote:
>>
>> gov_update_cpu_data() resets per-CPU prev_cpu_idle and prev_cpu_nice
>> for every CPU in the governed domain. It is called from sysfs store
>> callbacks (e.g. ignore_nice_load_store) which run under
>> attr_set->update_lock, held by the surrounding governor_store().
>>
>> Concurrently, dbs_work_handler() calls gov->gov_dbs_update() (which
>> calls dbs_update()) under policy_dbs->update_mutex. dbs_update() both
>> reads and writes the same prev_cpu_idle / prev_cpu_nice fields. The
>> potential race path is:
>>
>> Path A (sysfs write, holds attr_set->update_lock only):
>>
>>    governor_store()
>>      mutex_lock(&attr_set->update_lock)
>>      ignore_nice_load_store()
>>        dbs_data->ignore_nice_load = input
>>        gov_update_cpu_data(dbs_data)
>>          list_for_each_entry(policy_dbs, ...)
>>            for_each_cpu(j, ...)
>>              j_cdbs->prev_cpu_idle = get_cpu_idle_time(...)  /* write */
>>              j_cdbs->prev_cpu_nice = kcpustat_field(...)     /* write */
>>      mutex_unlock(&attr_set->update_lock)
>>
>> Path B (work queue, holds policy_dbs->update_mutex only):
>>
>>    dbs_work_handler()
>>      mutex_lock(&policy_dbs->update_mutex)
>>      gov->gov_dbs_update(policy)
>>        dbs_update()
>>          for_each_cpu(j, policy->cpus)
>>            idle_time = cur - j_cdbs->prev_cpu_idle           /* read  */
>>            j_cdbs->prev_cpu_idle = cur_idle_time             /* write */
>>            idle_time += cur_nice - j_cdbs->prev_cpu_nice     /* read  */
>>            j_cdbs->prev_cpu_nice = cur_nice                  /* write */
>>      mutex_unlock(&policy_dbs->update_mutex)
>>
>> Because attr_set->update_lock and policy_dbs->update_mutex are two
>> completely independent locks, the two paths are not mutually exclusive.
>> This results in a data race on cpu_dbs_info.prev_cpu_idle and
>> cpu_dbs_info.prev_cpu_nice.
>>
>> Fix this by also acquiring policy_dbs->update_mutex in
>> gov_update_cpu_data() for each policy, so that path A participates in
>> the mutual exclusion already established by dbs_work_handler(). Also
>> update the function comment to accurately reflect the two-level locking
>> contract.
>>
>> The root of this race dates back to the original ondemand/conservative
>> governors. Before commit ee88415caf73 ("[CPUFREQ] Cleanup locking in
>> conservative governor") and commit 5a75c82828e7 ("[CPUFREQ] Cleanup
>> locking in ondemand governor"), all accesses to prev_cpu_idle and
>> prev_cpu_nice in cpufreq_governor_dbs() (path X), store_ignore_nice_load()
>> (path Y), and do_dbs_timer() (path Z) were serialised by the same
>> dbs_mutex, so no race existed. Those two commits switched do_dbs_timer()
>> from dbs_mutex to a per-policy/per-cpu timer_mutex to reduce lock
>> contention, but left store_ignore_nice_load() still holding dbs_mutex.
>> As a result, path Y (store) and path Z (do_dbs_timer) no longer shared a
>> common lock, introducing a potential race on prev_cpu_idle/prev_cpu_nice
>> between store_ignore_nice_load() and dbs_check_cpu().
>>
>> Commit 326c86deaed54a ("[CPUFREQ] Remove unneeded locks") then removed
>> dbs_mutex from store_ignore_nice_load() entirely, introducing an
>> additional potential race between store_ignore_nice_load() (path Y, now
>> lockless) and cpufreq_governor_dbs() (path X, still holding dbs_mutex),
>> while the race between path Y and path Z remained.
>>
>> Fixes: ee88415caf736b ("[CPUFREQ] Cleanup locking in conservative governor")
>> Fixes: 5a75c82828e7c0 ("[CPUFREQ] Cleanup locking in ondemand governor")
>> Fixes: 326c86deaed54a ("[CPUFREQ] Remove unneeded locks")
>> Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
>> ---
>>   drivers/cpufreq/cpufreq_governor.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 86f35e451914..56ef793362db 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -90,7 +90,8 @@ EXPORT_SYMBOL_GPL(sampling_rate_store);
>>    * (that may be a single policy or a bunch of them if governor tunables are
>>    * system-wide).
>>    *
>> - * Call under the @dbs_data mutex.
>> + * Call under the @dbs_data->attr_set.update_lock. The per-policy
>> + * update_mutex is acquired and released internally for each policy.
>>    */
>>   void gov_update_cpu_data(struct dbs_data *dbs_data)
>>   {
>> @@ -99,6 +100,7 @@ void gov_update_cpu_data(struct dbs_data *dbs_data)
>>          list_for_each_entry(policy_dbs, &dbs_data->attr_set.policy_list, list) {
>>                  unsigned int j;
>>
>> +               mutex_lock(&policy_dbs->update_mutex);
>>                  for_each_cpu(j, policy_dbs->policy->cpus) {
>>                          struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
>>
>> @@ -107,6 +109,7 @@ void gov_update_cpu_data(struct dbs_data *dbs_data)
>>                          if (dbs_data->ignore_nice_load)
>>                                  j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
>>                  }
>> +               mutex_unlock(&policy_dbs->update_mutex);
>>          }
>>   }
>>   EXPORT_SYMBOL_GPL(gov_update_cpu_data);
>> --
> 
> Please have a look at
> 
> https://sashiko.dev/#/patchset/20260406110113.3475920-1-zhongqiu.han%40oss.qualcomm.com
> 
> and let me know what you think.
> 
> Thanks!

Hi Rafael,

Thanks for the review.

sashiko.dev points out two pre‑existing gaps rather than regressions
introduced by this patch.

Issue 1: Sysfs writes can race with the unprotected initialization loop
in cpufreq_dbs_governor_start(). I will address this by protecting that
loop with policy_dbs->update_mutex.

Issue 2: There is a residual race where dbs_update() may observe a newly
updated tunable together with stale counters, leading to an inflated
idle delta. I plan to address this by always updating prev_cpu_nice in
dbs_update(), regardless of ignore_nice, to avoid stale state. I will
double‑check the logic and address this in patch v2.
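
A minimal userspace sketch of the arithmetic behind Issue 2, assuming
dbs_update() currently refreshes prev_cpu_nice only while ignore_nice is
set (which is what "always updating" implies). All names below are
hypothetical stand-ins, not kernel symbols, and locking is left out on
purpose so that only the stale-counter effect is visible.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU nice counter snapshot. */
struct model_cdbs {
	uint64_t prev_cpu_nice;
};

/* Assumed current behaviour: snapshot refreshed only when ignore_nice is set. */
static uint64_t nice_delta_current(struct model_cdbs *c, int ignore_nice,
				   uint64_t cur_nice)
{
	uint64_t delta = 0;

	if (ignore_nice) {
		delta = cur_nice - c->prev_cpu_nice;
		c->prev_cpu_nice = cur_nice;
	}
	return delta;
}

/* Planned v2 idea: refresh the snapshot on every update. */
static uint64_t nice_delta_always(struct model_cdbs *c, int ignore_nice,
				  uint64_t cur_nice)
{
	uint64_t delta = cur_nice - c->prev_cpu_nice;

	c->prev_cpu_nice = cur_nice;
	/* The delta only counts as idle time when nice load is ignored. */
	return ignore_nice ? delta : 0;
}
```

With a snapshot of 0, one update at cur_nice = 1000 with ignore_nice
clear followed by one at cur_nice = 1050 with ignore_nice set yields a
delta of 1050 in the first variant and 50 in the second.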


-- 
Thx and BRs,
Zhongqiu Han

^ permalink raw reply

* Re: [patch 02/12] hrtimer: Provide hrtimer_start_range_ns_user()
From: Thomas Gleixner @ 2026-04-07 11:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407095758.GN2872@noisy.programming.kicks-ass.net>

On Tue, Apr 07 2026 at 11:57, Peter Zijlstra wrote:
> On Tue, Apr 07, 2026 at 10:54:22AM +0200, Thomas Gleixner wrote:
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(hrtimer_start_range_ns_user);
>
> Can we do that hrtimer_check_user_timer() in
> hrtimer_start_range_ns_user() and then not duplicate
> hrtimer_*reprogram() ?

We probably can. Let me have a look.

^ permalink raw reply

* Re: [patch 07/12] alarmtimer: Provide alarmtimer_start()
From: Thomas Gleixner @ 2026-04-07 11:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, John Stultz, Stephen Boyd, Calvin Owens, Anna-Maria Behnsen,
	Frederic Weisbecker, Ingo Molnar, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <20260407100427.GS2872@noisy.programming.kicks-ass.net>

On Tue, Apr 07 2026 at 12:04, Peter Zijlstra wrote:
> On Tue, Apr 07, 2026 at 10:54:48AM +0200, Thomas Gleixner wrote:
>> Alarm timers utilize hrtimers for normal operation and only switch to the
>> RTC on suspend. In order to catch already expired timers early and without
>> going through a timer interrupt cycle, provide a new start function which
>> internally uses hrtimer_start_range_ns_user().
>> 
>> If hrtimer_start_range_ns_user() detects an already expired timer, it does
>> not queue it. In that case remove the timer from the alarm base as well.
>> 
>> Return the status queued or not back to the caller to handle the early
>> expiry.
>> 
>> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
>
> Not familiar with this code, but my head hurts from the:
>
> alarm_
> alarm_timer_
> alarmtimer_
>
> prefixes, what's what?

Yeah. I should have named it alarm_timer_start(). Let me fix this.

^ permalink raw reply

* Re: [patch 09/12] fs/timerfd: Use the new alarm/hrtimer functions
From: Thomas Gleixner @ 2026-04-07 11:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Alexander Viro, Christian Brauner, Jan Kara,
	Anna-Maria Behnsen, Frederic Weisbecker, linux-fsdevel,
	Calvin Owens, Ingo Molnar, John Stultz, Stephen Boyd,
	Sebastian Reichel, linux-pm, Pablo Neira Ayuso, Florian Westphal,
	Phil Sutter, netfilter-devel, coreteam
In-Reply-To: <20260407100920.GT2872@noisy.programming.kicks-ass.net>

On Tue, Apr 07 2026 at 12:09, Peter Zijlstra wrote:
>> -			ctx->ticks +=
>> -				hrtimer_forward_now(&ctx->t.tmr, ctx->tintv)
>> -				- 1;
>
> (argh!)
>
>> -			hrtimer_restart(&ctx->t.tmr);
>> -		}
>> +		ctx->ticks += timerfd_restart(ctx);
>>  	}
>>  	t->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx));
>>  	t->it_interval = ktime_to_timespec64(ctx->tintv);
>
> What's with the -1 thing?

Magic :)

Reading the timerfd returns the number of expired ticks since the last
read or since the timer was armed.

The expiry callback increments ticks by one, hrtimer_forward_now()
returns the number of expired ticks relative to the previous expiry
time. So it would double account that.

Not pretty, but we need to increment ticks in the callback because of
non-interval timers as for those we don't invoke the forwarding.
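
The accounting described above can be sketched as a small userspace
model. This is an illustration only: the struct and the model_*()
helpers are hypothetical, and model_forward() merely imitates what
hrtimer_forward_now() is described as returning here.

```c
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel symbols. */
struct model_timer {
	uint64_t expires;	/* next expiry time, arbitrary units */
	uint64_t interval;	/* 0 for a one-shot timer */
	uint64_t ticks;		/* expirations not yet reported to the reader */
	int expired;		/* set by the callback, cleared on catch-up */
};

/*
 * Stand-in for hrtimer_forward_now(): push the expiry past @now in whole
 * intervals and return how many intervals were added.
 */
static uint64_t model_forward(struct model_timer *t, uint64_t now)
{
	uint64_t overruns;

	if (!t->interval || now < t->expires)
		return 0;
	overruns = (now - t->expires) / t->interval + 1;
	t->expires += overruns * t->interval;
	return overruns;
}

/* Expiry callback: counts one tick for interval and one-shot timers alike. */
static void model_callback(struct model_timer *t)
{
	t->ticks++;
	t->expired = 1;
}

/* Read side: catch up an interval timer whose callback already ran once. */
static void model_catch_up(struct model_timer *t, uint64_t now)
{
	uint64_t overruns;

	if (!t->expired || !t->interval)
		return;
	overruns = model_forward(t, now);
	/* The first expiry was already counted by the callback, hence -1. */
	if (overruns)
		t->ticks += overruns - 1;
	t->expired = 0;
}
```

For a timer with expiry 100 and interval 10 that is read at time 135,
the expirations at 100, 110, 120 and 130 add up to four ticks: one from
the callback and 4 - 1 from the forwarding. Without the -1 the reader
would see five.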

Thanks,

        tglx

^ permalink raw reply

* Re: [patch 01/12] clockevents: Prevent timer interrupt starvation
From: Peter Zijlstra @ 2026-04-07 11:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Calvin Owens, Anna-Maria Behnsen, Frederic Weisbecker,
	Ingo Molnar, John Stultz, Stephen Boyd, Alexander Viro,
	Christian Brauner, Jan Kara, linux-fsdevel, Sebastian Reichel,
	linux-pm, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	netfilter-devel, coreteam
In-Reply-To: <87o6jv57od.ffs@tglx>

On Tue, Apr 07, 2026 at 01:30:42PM +0200, Thomas Gleixner wrote:
> On Tue, Apr 07 2026 at 11:42, Peter Zijlstra wrote:
> > On Tue, Apr 07, 2026 at 10:54:17AM +0200, Thomas Gleixner wrote:
> >> @@ -324,16 +324,23 @@ int clockevents_program_event(struct clo
> >>  		return dev->set_next_ktime(expires, dev);
> >>  
> >>  	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
> >>  
> >> +	if (delta > (int64_t)dev->min_delta_ns) {
> >> +		delta = min(delta, (int64_t) dev->max_delta_ns);
> >> +		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> >> +		if (!dev->set_next_event((unsigned long) clc, dev))
> >> +			return 0;
> >> +	}
> >>  
> >> +	if (dev->next_event_forced)
> >> +		return 0;
> >>  
> >> +	if (dev->set_next_event(dev->min_delta_ticks, dev)) {
> >> +		if (!force || clockevents_program_min_delta(dev))
> >> +			return -ETIME;
> >> +	}
> >> +	dev->next_event_forced = 1;
> >> +	return 0;
> >>  }
> >
> > Looking at the implementation of clockevents_program_min_delta() doing
> > that dev->set_next_event(dev->min_delta_ticks,) right before it seems a
> > bit daft.
> >
> > But yes, this is effectively also what the old code did.
> 
> Yes. I looked at that and didn't come up with a good plan.
> 
> > The only thing that seems to be different, is that the old code would
> > return the ->set_next_event() error code, rather than 0 in the !force
> > case.
> 
> You mean when dev->next_event_forced is set and the set_event() callback
> above failed?

next_event_forced = 0;
force = 0;

Then the old code would return rc (return value of ->set_next_event),
while the new code will return -ETIME.

(not 0 like I said).

I suppose ->set_next_event() will only ever fail with -ETIME?
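
A rough userspace model of the return-code difference described above.
Everything here is a simplification under stated assumptions: struct
model_dev and the model_*() functions are hypothetical, the two
->set_next_event() calls in the quoted hunk are collapsed into a single
return value, and clockevents_program_min_delta() is replaced by a
trivial stub.

```c
#include <errno.h>

/* Hypothetical stand-in for a clock event device; not the kernel struct. */
struct model_dev {
	int next_event_forced;
	int set_next_event_rc;	/* what the driver callback is assumed to return */
};

/* Stub for clockevents_program_min_delta(); the real function retries. */
static int model_program_min_delta(struct model_dev *dev)
{
	return dev->set_next_event_rc ? -ETIME : 0;
}

/* Tail of the new code as quoted: the minimum-delta programming path. */
static int model_new_tail(struct model_dev *dev, int force)
{
	if (dev->next_event_forced)
		return 0;
	if (dev->set_next_event_rc) {
		if (!force || model_program_min_delta(dev))
			return -ETIME;
	}
	dev->next_event_forced = 1;
	return 0;
}

/* Old behaviour as described in this mail: rc is passed through if !force. */
static int model_old_tail(struct model_dev *dev, int force)
{
	int rc = dev->set_next_event_rc;

	return (rc && force) ? model_program_min_delta(dev) : rc;
}
```

If a driver callback ever failed with something other than -ETIME, say
-EIO, the old tail would hand -EIO back to the caller in the !force case
while the new tail would report -ETIME. The two only agree if -ETIME is
the sole failure code.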

^ permalink raw reply

* Re: [PATCH] cpufreq: Fix race between suspend/resume and CPU hotplug
From: Rafael J. Wysocki @ 2026-04-07 11:50 UTC (permalink / raw)
  To: Tianxiang Chen; +Cc: rafael, viresh.kumar, linux-pm, linux-kernel, lingyue
In-Reply-To: <20260407093529.4527-1-nanmu@xiaomi.com>

On Tue, Apr 7, 2026 at 11:35 AM Tianxiang Chen <nanmu@xiaomi.com> wrote:
>
> CPU hotplug operations can race with cpufreq_suspend()
> and cpufreq_resume(), leading to null pointer dereferences
> when accessing governor data.

So how exactly would CPU hotplug be started during a system suspend or resume?

> This occurs because there is
> no synchronization between suspend/resume operations and
> CPU hotplug, allowing concurrent access to
> policy->governor_data while it is being freed or initialized.
>
> Detailed race condition scenario:
>
> 1. Thread A (cpufreq_suspend) starts execution:
>    - Iterates through active policies
>    - Calls cpufreq_stop_governor(policy) for each policy
>    - Sets cpufreq_suspended = true
>
> 2. Thread B (CPU hotplug) executes concurrently:
>    - Calls cpu_down(cpu)
>    - Calls cpuhp_cpufreq_offline(cpu)
>    - Calls cpufreq_offline(cpu)
>    - Inside cpufreq_offline():
>      * Stops governor: policy->governor->stop(policy)
>      * Exits governor: policy->governor->exit(policy)
>      * Frees governor_data: kfree(policy->governor_data)
>      * Sets policy->governor_data = NULL
>
> 3. Race window between step 1 and step 2:
>    - Thread A is iterating policies and stopping governors
>    - Thread B is concurrently executing CPU offline
>    - Both threads may access the same policy->governor_data
>    - Thread B frees governor_data while Thread A is still using it
>    - Thread A accesses freed governor_data → null pointer dereference
>
> Similarly, cpufreq_resume() can race with CPU hotplug where governor_data
> is being initialized while hotplug is trying to access it, leading to
> accessing uninitialized data.
>
> Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 1f794524a1d9..8b03785764fa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
>         if (!cpufreq_driver)
>                 return;
>
> +       cpus_read_lock();
>         if (!has_target() && !cpufreq_driver->suspend)
>                 goto suspend;
>
> @@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)
>
>  suspend:
>         cpufreq_suspended = true;
> +       cpus_read_unlock();
>  }
>
>  /**
> @@ -2017,10 +2019,11 @@ void cpufreq_resume(void)
>         if (unlikely(!cpufreq_suspended))
>                 return;
>
> +       cpus_read_lock();
>         cpufreq_suspended = false;
>
>         if (!has_target() && !cpufreq_driver->resume)
> -               return;
> +               goto out;
>
>         pr_debug("%s: Resuming Governors\n", __func__);
>
> @@ -2038,6 +2041,9 @@ void cpufreq_resume(void)
>                                        __func__, policy->cpu);
>                 }
>         }
> +
> +out:
> +       cpus_read_unlock();
>  }
>
>  /**
> --
> 2.34.1
>
> #/****** This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!******/#
>

^ permalink raw reply

* [PATCH] MAINTAINERS, mailmap: Change Ulf Hansson's email
From: Ulf Hansson @ 2026-04-07 12:14 UTC (permalink / raw)
  To: linux-pm, linux-kernel, linux-arm-kernel, linux-mmc; +Cc: Ulf Hansson

Change my email in MAINTAINERS and add a few entries in mailmap to start
using ulfh@kernel.org.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 .mailmap    |  2 ++
 MAINTAINERS | 14 +++++++-------
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/.mailmap b/.mailmap
index 2d04aeba68b4..22c5ab1c5d55 100644
--- a/.mailmap
+++ b/.mailmap
@@ -849,6 +849,8 @@ Tvrtko Ursulin <tursulin@ursulin.net> <tvrtko.ursulin@onelan.co.uk>
 Tvrtko Ursulin <tursulin@ursulin.net> <tvrtko@ursulin.net>
 Tycho Andersen <tycho@tycho.pizza> <tycho@tycho.ws>
 Tzung-Bi Shih <tzungbi@kernel.org> <tzungbi@google.com>
+Ulf Hansson <ulfh@kernel.org> <ulf.hansson@linaro.org>
+Ulf Hansson <ulfh@kernel.org> <ulf.hansson@stericsson.com>
 Umang Jain <uajain@igalia.com> <umang.jain@ideasonboard.com>
 Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
 Uwe Kleine-König <u.kleine-koenig@baylibre.com> <ukleinek@baylibre.com>
diff --git a/MAINTAINERS b/MAINTAINERS
index c3fe46d7c4bc..7167dcea737d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6716,7 +6716,7 @@ F:	include/linux/platform_data/cpuidle-exynos.h
 CPUIDLE DRIVER - ARM PSCI
 M:	Lorenzo Pieralisi <lpieralisi@kernel.org>
 M:	Sudeep Holla <sudeep.holla@kernel.org>
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-pm@vger.kernel.org
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Supported
@@ -6724,7 +6724,7 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm.git
 F:	drivers/cpuidle/cpuidle-psci.c
 
 CPUIDLE DRIVER - ARM PSCI PM DOMAIN
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-pm@vger.kernel.org
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Supported
@@ -6733,7 +6733,7 @@ F:	drivers/cpuidle/cpuidle-psci-domain.c
 F:	drivers/cpuidle/cpuidle-psci.h
 
 CPUIDLE DRIVER - DT IDLE PM DOMAIN
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm.git
@@ -10729,7 +10729,7 @@ F:	Documentation/devicetree/bindings/i2c/i2c-demux-pinctrl.yaml
 F:	drivers/i2c/muxes/i2c-demux-pinctrl.c
 
 GENERIC PM DOMAINS
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/devicetree/bindings/power/power?domain*
@@ -18089,7 +18089,7 @@ F:	drivers/mmc/host/mmc_spi.c
 F:	include/linux/spi/mmc_spi.h
 
 MULTIMEDIA CARD (MMC), SECURE DIGITAL (SD) AND SDIO SUBSYSTEM
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-mmc@vger.kernel.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git
@@ -24696,7 +24696,7 @@ F:	drivers/media/i2c/imx415.c
 SONY MEMORYSTICK SUBSYSTEM
 M:	Maxim Levitsky <maximlevitsky@gmail.com>
 M:	Alex Dubov <oakad@yahoo.com>
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-mmc@vger.kernel.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git
@@ -27615,7 +27615,7 @@ F:	Documentation/fb/uvesafb.rst
 F:	drivers/video/fbdev/uvesafb.*
 
 Ux500 CLOCK DRIVERS
-M:	Ulf Hansson <ulf.hansson@linaro.org>
+M:	Ulf Hansson <ulfh@kernel.org>
 L:	linux-clk@vger.kernel.org
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH v10 6/6] usb: typec: tcpm/tcpci_maxim: deprecate WAR for setting charger mode
From: Heikki Krogerus @ 2026-04-07 12:24 UTC (permalink / raw)
  To: Amit Sunil Dhamne
  Cc: André Draszik, Lee Jones, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Greg Kroah-Hartman, Jagan Sridharan, Mark Brown,
	Matti Vaittinen, Andrew Morton, Sebastian Reichel, Peter Griffin,
	Tudor Ambarus, Alim Akhtar, linux-kernel, devicetree, linux-usb,
	linux-pm, linux-arm-kernel, linux-samsung-soc, RD Babiera,
	Kyle Tso
In-Reply-To: <017b8552-87e2-4409-ae34-9a3ab7365a68@google.com>

Hi Amit,

On Thu, Apr 02, 2026 at 11:47:30AM -0700, Amit Sunil Dhamne wrote:
> Hi Heikki,
> 
> On 4/2/26 7:33 AM, Heikki Krogerus wrote:
> > Hi Amit,
> > 
> > > +static int get_vbus_regulator_handle(struct max_tcpci_chip *chip)
> > > +{
> > > +	if (IS_ERR_OR_NULL(chip->vbus_reg)) {
> > > +		chip->vbus_reg = devm_regulator_get_exclusive(chip->dev,
> > > +							      "vbus");
> > Sorry to go back to this, but why can't you just get the regulator in
> > max_tcpci_probe()?
> 
> Thanks for calling this out. This was an intentional design decision to
> break a circular dependency.
> 
> The charger driver is guaranteed to probe after the TCPC driver due to a
> power supply dependency (the TCPC is a supplier of power for the Battery
> Charger). However, the charger driver is also the regulator provider for
> VBUS out (when Type-C goes into source mode).
> 
> Because of this, the regulator handle will not be available during the TCPC
> driver's probe. If we tried to fetch it in max_tcpci_probe() and returned
> -EPROBE_DEFER, it would create a probe deadlock, as the charger would then
> never probe. Therefore, I made the decision to get the regulator handle
> lazily and on-demand.

Got it. Thanks for the explanation!

-- 
heikki


^ permalink raw reply

* [PATCH v2] power: supply: max17042: fix OF node reference imbalance
From: Johan Hovold @ 2026-04-07 12:33 UTC (permalink / raw)
  To: Sebastian Reichel
  Cc: Hans de Goede, Krzysztof Kozlowski, Marek Szyprowski,
	Sebastian Krzyszkowiak, Purism Kernel Team, linux-pm,
	linux-kernel, Johan Hovold, stable, Dzmitry Sankouski

The driver reuses the OF node of the parent multi-function device but
fails to take another reference to balance the one dropped by the
platform bus code when unbinding the MFD and deregistering the child
devices.

Fix this by using the intended helper for reusing OF nodes.

Fixes: 0cd4f1f77ad4 ("power: supply: max17042: add platform driver variant")
Cc: stable@vger.kernel.org	# 6.14
Cc: Dzmitry Sankouski <dsankouski@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
---

Changes in v2:
 - add missing driver name to patch summary prefix


 drivers/power/supply/max17042_battery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/power/supply/max17042_battery.c b/drivers/power/supply/max17042_battery.c
index acea176101fa..914f18ce79b3 100644
--- a/drivers/power/supply/max17042_battery.c
+++ b/drivers/power/supply/max17042_battery.c
@@ -1165,7 +1165,8 @@ static int max17042_platform_probe(struct platform_device *pdev)
 	if (!i2c)
 		return -EINVAL;
 
-	dev->of_node = dev->parent->of_node;
+	device_set_of_node_from_dev(dev, dev->parent);
+
 	id = platform_get_device_id(pdev);
 	irq = platform_get_irq(pdev, 0);
 
-- 
2.52.0
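
The reference imbalance described in the commit message can be
illustrated with a toy refcount model. This is a sketch, not driver
code: struct model_node, struct model_dev and the helpers are
hypothetical, and reuse_with_ref() assumes that
device_set_of_node_from_dev() drops the child's old node reference and
takes a new one on the parent's node.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device_node and struct device. */
struct model_node {
	int refcount;
};

struct model_dev {
	struct model_node *of_node;
};

static struct model_node *model_node_get(struct model_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void model_node_put(struct model_node *n)
{
	if (n)
		n->refcount--;
}

/* Pre-patch pattern: plain pointer copy, no reference taken. */
static void reuse_plain(struct model_dev *child, struct model_dev *parent)
{
	child->of_node = parent->of_node;
}

/* Post-patch pattern, modelled on the helper's assumed behaviour. */
static void reuse_with_ref(struct model_dev *child, struct model_dev *parent)
{
	model_node_put(child->of_node);
	child->of_node = model_node_get(parent->of_node);
}

/* Models the bus code dropping the child's node reference on removal. */
static void model_unregister(struct model_dev *child)
{
	model_node_put(child->of_node);
	child->of_node = NULL;
}
```

Starting from a node whose only reference belongs to the parent, the
plain copy followed by model_unregister() leaves the count at 0 even
though the parent still points at the node. With reuse_with_ref() the
count goes to 2 and returns to 1.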


^ permalink raw reply related
