linux-rtc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
@ 2025-03-30 11:32 Borislav Petkov
  2025-03-30 12:23 ` Mateusz Jończyk
  2025-04-03 13:12 ` Thomas Gleixner
  0 siblings, 2 replies; 19+ messages in thread
From: Borislav Petkov @ 2025-03-30 11:32 UTC (permalink / raw)
  To: linux-rtc, Alexandre Belloni, Mateusz Jończyk
  Cc: lkml, Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner

So,

while playing with suspend to RAM, I got this lockdep splat below.

Poking around I found:

ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")

which is doing this funky taking and dropping the rtc_lock and I guess that's
inherited from ye olde times.

I "fixed" it so lockdeup doesn't warn by converting rtc_lock to a raw spinlock
but this is definitely not the right fix so let me bounce it off to the folks
on Cc who might have a better idea perhaps...

Thx.

[   95.916425] 
[   95.917942] =============================
[   95.921970] [ BUG: Invalid wait context ]
[   95.925993] 6.14.0-rc7+ #1 Not tainted
[   95.929762] -----------------------------
[   95.933784] swapper/7/0 is trying to lock:
[   95.937898] ffffffff8284ad18 (rtc_lock){....}-{3:3}, at: mc146818_avoid_UIP+0x36/0x120
[   95.945854] other info that might help us debug this:
[   95.950925] context-{5:5}
[   95.953559] 1 lock held by swapper/7/0:
[   95.957409]  #0: ffffffff8299d2f8 (tick_freeze_lock){....}-{2:2}, at: tick_freeze+0x12/0x110
[   95.965884] stack backtrace:
[   95.968779] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Not tainted 6.14.0-rc7+ #1 PREEMPT(voluntary) 
[   95.968785] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
[   95.968787] Call Trace:
[   95.968790]  <TASK>
[   95.968793]  dump_stack_lvl+0x68/0x90
[   95.968802]  __lock_acquire+0xe96/0x2210
[   95.968811]  ? srso_alias_return_thunk+0x5/0xfbef5
[   95.968816]  ? __lock_acquire+0x45d/0x2210
[   95.968825]  lock_acquire+0xd8/0x300
[   95.968829]  ? mc146818_avoid_UIP+0x36/0x120
[   95.968836]  ? lock_acquire+0xd8/0x300
[   95.968842]  ? srso_alias_return_thunk+0x5/0xfbef5
[   95.968850]  ? __pfx_mc146818_get_time_callback+0x10/0x10
[   95.968854]  _raw_spin_lock_irqsave+0x47/0x70
[   95.968860]  ? mc146818_avoid_UIP+0x36/0x120
[   95.968865]  mc146818_avoid_UIP+0x36/0x120
[   95.968874]  mc146818_get_time+0x39/0x110
[   95.968883]  mach_get_cmos_time+0x2d/0x70
[   95.968890]  ? srso_alias_return_thunk+0x5/0xfbef5
[   95.968894]  ? find_held_lock+0x2b/0x80
[   95.968900]  ? ktime_get+0x1a/0xe0
[   95.968908]  timekeeping_suspend+0x28/0x420
[   95.968918]  tick_freeze+0x7e/0x110
[   95.968923]  enter_s2idle_proper+0x21/0x120
[   95.968930]  cpuidle_enter_s2idle+0x65/0x90
[   95.968937]  do_idle+0x237/0x260
[   95.968947]  cpu_startup_entry+0x25/0x30
[   95.968952]  start_secondary+0x11c/0x140
[   95.968960]  common_startup_64+0x13e/0x141
[   95.968980]  </TASK>



-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-03-30 11:32 [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Borislav Petkov
@ 2025-03-30 12:23 ` Mateusz Jończyk
  2025-03-30 12:32   ` Mateusz Jończyk
  2025-04-03 13:12 ` Thomas Gleixner
  1 sibling, 1 reply; 19+ messages in thread
From: Mateusz Jończyk @ 2025-03-30 12:23 UTC (permalink / raw)
  To: Borislav Petkov, linux-rtc, Alexandre Belloni
  Cc: lkml, Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner

W dniu 30.03.2025 o 13:32, Borislav Petkov pisze:
> So,
>
> while playing with suspend to RAM, I got this lockdep splat below.
>
> Poking around I found:
>
> ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
>
> which is doing this funky taking and dropping the rtc_lock and I guess that's
> inherited from ye olde times.
>
> I "fixed" it so lockdeup doesn't warn by converting rtc_lock to a raw spinlock
> but this is definitely not the right fix so let me bounce it off to the folks
> on Cc who might have a better idea perhaps...
>
> Thx.

Hello,

This problem has been reported before, see

https://lore.kernel.org/all/463fbc29-b41f-4d2d-a869-108114000cdb@o2.pl/

I started work on converting rtc_lock to a raw spinlock, but got stuck 
mostly to a lack
of time etc. Only did some MIPS patches (unpublished).

The problem is that timekeeping_suspend() takes a raw spinlock called 
"tk_core.lock".
With this lock taken, this function indirectly calls 
mc146818_avoid_UIP(), which takes
a normal spinlock called "rtc_lock". It is necessary to take the 
rtc_lock while accessing
the RTC: the RTC access cycle consists of writing to an index register 
and then accessing
the data register. If something else touches the index register in the 
middle, we get garbage.

During a RTC tick, the RTC date/time registers are in an unspecified 
state - roughly during this
time the UIP (Update In Progress) bit in an RTC register A is set. This 
is handled by
mc146818_avoid_UIP(). This function takes and releases the rtc_lock 
multiple times,
in order not to hold it for too long a time (while sleeping).
But the rtc_lock does need to be taken anyway.

So the solutions would be:

1. Change the rtc_lock to a raw spinlock.

2. Change the tick_freeze_lock to a normal spinlock (if possible).

3. Possibly rewrite mc146818_avoid_UIP to avoid taking the rtc_lock if 
the tick_freeze_lock is held (likely ugly).

4. Maybe something else I haven't thought about.

Thanks,

Mateusz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-03-30 12:23 ` Mateusz Jończyk
@ 2025-03-30 12:32   ` Mateusz Jończyk
  0 siblings, 0 replies; 19+ messages in thread
From: Mateusz Jończyk @ 2025-03-30 12:32 UTC (permalink / raw)
  To: Borislav Petkov, linux-rtc, Alexandre Belloni
  Cc: lkml, Anna-Maria Behnsen, Frederic Weisbecker, Thomas Gleixner

W dniu 30.03.2025 o 14:23, Mateusz Jończyk pisze:
> W dniu 30.03.2025 o 13:32, Borislav Petkov pisze:
>> So,
>>
>> while playing with suspend to RAM, I got this lockdep splat below.
>>
>> Poking around I found:
>>
>> ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
>>
>> which is doing this funky taking and dropping the rtc_lock and I 
>> guess that's
>> inherited from ye olde times.
>>
>> I "fixed" it so lockdeup doesn't warn by converting rtc_lock to a raw 
>> spinlock
>> but this is definitely not the right fix so let me bounce it off to 
>> the folks
>> on Cc who might have a better idea perhaps...
>>
>> Thx.
>
> Hello,
>
> This problem has been reported before, see
>
> https://lore.kernel.org/all/463fbc29-b41f-4d2d-a869-108114000cdb@o2.pl/
>
> I started work on converting rtc_lock to a raw spinlock, but got stuck 
> mostly to a lack
> of time etc. Only did some MIPS patches (unpublished).
>
> The problem is that timekeeping_suspend() takes a raw spinlock called 
> "tk_core.lock".
> With this lock taken, this function indirectly calls 
> mc146818_avoid_UIP(), which takes
> a normal spinlock called "rtc_lock". It is necessary to take the 
> rtc_lock while accessing
> the RTC: the RTC access cycle consists of writing to an index register 
> and then accessing
> the data register. If something else touches the index register in the 
> middle, we get garbage.
>
> During a RTC tick, the RTC date/time registers are in an unspecified 
> state - roughly during this
> time the UIP (Update In Progress) bit in an RTC register A is set. 
> This is handled by
> mc146818_avoid_UIP(). This function takes and releases the rtc_lock 
> multiple times,
> in order not to hold it for too long a time (while sleeping).

I meant "while waiting / while calling udelay()". mc146818_avoid_UIP() 
does not sleep.

> But the rtc_lock does need to be taken anyway.
>
> So the solutions would be:
>
> 1. Change the rtc_lock to a raw spinlock.
>
> 2. Change the tick_freeze_lock to a normal spinlock (if possible).
>
> 3. Possibly rewrite mc146818_avoid_UIP to avoid taking the rtc_lock if 
> the tick_freeze_lock is held (likely ugly).
>
> 4. Maybe something else I haven't thought about.
>
> Thanks,
>
> Mateusz

Greetings,

Mateusz


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-03-30 11:32 [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Borislav Petkov
  2025-03-30 12:23 ` Mateusz Jończyk
@ 2025-04-03 13:12 ` Thomas Gleixner
  2025-04-03 13:50   ` Sebastian Andrzej Siewior
  2025-04-03 23:46   ` [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Mateusz Jończyk
  1 sibling, 2 replies; 19+ messages in thread
From: Thomas Gleixner @ 2025-04-03 13:12 UTC (permalink / raw)
  To: Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk
  Cc: lkml, Anna-Maria Behnsen, Frederic Weisbecker, Peter Zijlstra,
	Sebastian Andrzej Siewior

On Sun, Mar 30 2025 at 13:32, Borislav Petkov wrote:
> while playing with suspend to RAM, I got this lockdep splat below.
>
> Poking around I found:
>
> ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
>
> which is doing this funky taking and dropping the rtc_lock and I guess that's
> inherited from ye olde times.
>
> I "fixed" it so lockdeup doesn't warn by converting rtc_lock to a raw spinlock
> but this is definitely not the right fix so let me bounce it off to the folks
> on Cc who might have a better idea perhaps...

Converting it to a raw lock "fixes" the problem, but RT people will hunt
you down with a big latency bat.

But this is not related to the commit above and not new.

timekeeping_suspend() has always invoked mach_get_cmos_time() with the
freeze lock held and mc146818_get_time() has always locked rtc_lock.

I wonder, why this splat hasn't popped before. On RT lockdep should have
complained forever. Sebastian???

Suspend, whether it's suspend to idle or regular suspend, are weird
contexts as there is no concurrency possible because interrupts are
disabled and it is guaranteed that there is only a single CPU which is
operational. The other CPUs are either in a deep idle state or when they
are on the way back, they serialize on tick_freeze_lock.

So taking the non-raw spinlock in that context is safe, but obviously
lockdep does not know about that.

Peter, any ideas?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-04-03 13:12 ` Thomas Gleixner
@ 2025-04-03 13:50   ` Sebastian Andrzej Siewior
  2025-04-03 19:36     ` Sebastian Andrzej Siewior
  2025-04-03 23:46   ` [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Mateusz Jończyk
  1 sibling, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-03 13:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

On 2025-04-03 15:12:08 [+0200], Thomas Gleixner wrote:
> Converting it to a raw lock "fixes" the problem, but RT people will hunt
> you down with a big latency bat.
> 
> But this is not related to the commit above and not new.
> 
> timekeeping_suspend() has always invoked mach_get_cmos_time() with the
> freeze lock held and mc146818_get_time() has always locked rtc_lock.
> 
> I wonder, why this splat hasn't popped before. On RT lockdep should have
> complained forever. Sebastian???

I sure haven't seen it. But it has to.

> Suspend, whether it's suspend to idle or regular suspend, are weird
> contexts as there is no concurrency possible because interrupts are
> disabled and it is guaranteed that there is only a single CPU which is
> operational. The other CPUs are either in a deep idle state or when they
> are on the way back, they serialize on tick_freeze_lock.
> 
> So taking the non-raw spinlock in that context is safe, but obviously
> lockdep does not know about that.
> 
> Peter, any ideas?

We can tell lockdep that. I will look into this.

> Thanks,
> 
>         tglx

Sebastian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-04-03 13:50   ` Sebastian Andrzej Siewior
@ 2025-04-03 19:36     ` Sebastian Andrzej Siewior
  2025-04-03 20:26       ` Thomas Gleixner
  0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-03 19:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

On 2025-04-03 15:50:32 [+0200], To Thomas Gleixner wrote:
> On 2025-04-03 15:12:08 [+0200], Thomas Gleixner wrote:
> > Converting it to a raw lock "fixes" the problem, but RT people will hunt
> > you down with a big latency bat.
> > 
> > But this is not related to the commit above and not new.
> > 
> > timekeeping_suspend() has always invoked mach_get_cmos_time() with the
> > freeze lock held and mc146818_get_time() has always locked rtc_lock.
> > 
> > I wonder, why this splat hasn't popped before. On RT lockdep should have
> > complained forever. Sebastian???
> 
> I sure haven't seen it. But it has to.

might_sleep() is quite because system_states is SYSTEM_SUSPEND.
lockdep seems to be quiet because I don't have an outer lock while
testing via
	echo mem > /sys/power/state

it is just
| WARNING: CPU: 0 PID: 1007 at kernel/time/timekeeping.c:1858 timekeeping_suspend+0x3b/0x330
| RIP: 0010:timekeeping_suspend+0x3b/0x330
| Call Trace:
|  <TASK>
|  syscore_suspend+0x80/0x2f0
|  suspend_devices_and_enter+0x21c/0xad0
|  pm_suspend+0x25c/0x6b0
|  state_store+0x68/0xd0
|  kernfs_fop_write_iter+0x15e/0x210
|  vfs_write+0x2bb/0x560
|  ksys_write+0x73/0xf0
|  do_syscall_64+0x91/0x1a0

> > Thanks,
> > 
> >         tglx

Sebastian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-04-03 19:36     ` Sebastian Andrzej Siewior
@ 2025-04-03 20:26       ` Thomas Gleixner
  2025-04-04  6:16         ` Sebastian Andrzej Siewior
  2025-04-04 13:34         ` [PATCH] timekeeping: Add a lockdep override in tick_freeze() Sebastian Andrzej Siewior
  0 siblings, 2 replies; 19+ messages in thread
From: Thomas Gleixner @ 2025-04-03 20:26 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

On Thu, Apr 03 2025 at 21:36, Sebastian Andrzej Siewior wrote:
> On 2025-04-03 15:50:32 [+0200], To Thomas Gleixner wrote:
>> > I wonder, why this splat hasn't popped before. On RT lockdep should have
>> > complained forever. Sebastian???
>> 
>> I sure haven't seen it. But it has to.
>
> might_sleep() is quite because system_states is SYSTEM_SUSPEND.
> lockdep seems to be quiet because I don't have an outer lock while
> testing via
> 	echo mem > /sys/power/state

Right. freeze triggers the suspend to idle path

> it is just
> | WARNING: CPU: 0 PID: 1007 at kernel/time/timekeeping.c:1858 timekeeping_suspend+0x3b/0x330

Which kernel version is that? I don't see a warning in timekeeping_suspend()

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-04-03 13:12 ` Thomas Gleixner
  2025-04-03 13:50   ` Sebastian Andrzej Siewior
@ 2025-04-03 23:46   ` Mateusz Jończyk
  2025-04-04  7:39     ` Thomas Gleixner
  1 sibling, 1 reply; 19+ messages in thread
From: Mateusz Jończyk @ 2025-04-03 23:46 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov, linux-rtc, Alexandre Belloni
  Cc: lkml, Frederic Weisbecker, Peter Zijlstra,
	Sebastian Andrzej Siewior, Chris Bainbridge

W dniu 3.04.2025 o 15:12, Thomas Gleixner pisze:
> On Sun, Mar 30 2025 at 13:32, Borislav Petkov wrote:
>> while playing with suspend to RAM, I got this lockdep splat below.
>>
>> Poking around I found:
>>
>> ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
>>
>> which is doing this funky taking and dropping the rtc_lock and I guess that's
>> inherited from ye olde times.
>>
>> I "fixed" it so lockdeup doesn't warn by converting rtc_lock to a raw spinlock
>> but this is definitely not the right fix so let me bounce it off to the folks
>> on Cc who might have a better idea perhaps...
> Converting it to a raw lock "fixes" the problem, but RT people will hunt
> you down with a big latency bat.
>
> But this is not related to the commit above and not new.
>
> timekeeping_suspend() has always invoked mach_get_cmos_time() with the
> freeze lock held and mc146818_get_time() has always locked rtc_lock.
>
> I wonder, why this splat hasn't popped before. On RT lockdep should have
> complained forever. Sebastian???

Hello,

I was able to trigger the bug on Linux 6.1.0 with
CONFIG_PROVE_RAW_LOCK_NESTING enabled and I suspect it can be
triggered much earlier.

It is likely that only after

commit 560af5dc839eef ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.")

people are seeing this simply because no one previously did
the test with CONFIG_PROVE_RAW_LOCK_NESTING=y.

See https://lore.kernel.org/all/CAP-bSRZ0CWyZZsMtx046YV8L28LhY0fson2g4EqcwRAVN1Jk+Q@mail.gmail.com/ :

> This splat happens on suspend/resume on a HP laptop. It doesn't appear
> to be a recent regression, as a bisect only leads to 560af5dc839e
> ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.") -

Ccing Chris Bainbridge <chris.bainbridge@gmail.com>, author of the 
previous bug report.

Greetings,

Mateusz


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-04-03 20:26       ` Thomas Gleixner
@ 2025-04-04  6:16         ` Sebastian Andrzej Siewior
  2025-04-04 13:34         ` [PATCH] timekeeping: Add a lockdep override in tick_freeze() Sebastian Andrzej Siewior
  1 sibling, 0 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-04  6:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

On 2025-04-03 22:26:24 [+0200], Thomas Gleixner wrote:
> > it is just
> > | WARNING: CPU: 0 PID: 1007 at kernel/time/timekeeping.c:1858 timekeeping_suspend+0x3b/0x330
> 
> Which kernel version is that? I don't see a warning in timekeeping_suspend()

I added a WARN_ON(1) to the top of timekeeping_suspend() to compare my
call chain with Borislav's.

> Thanks,
> 
>         tglx

Sebastian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP
  2025-04-03 23:46   ` [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Mateusz Jończyk
@ 2025-04-04  7:39     ` Thomas Gleixner
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Gleixner @ 2025-04-04  7:39 UTC (permalink / raw)
  To: Mateusz Jończyk, Borislav Petkov, linux-rtc,
	Alexandre Belloni
  Cc: lkml, Frederic Weisbecker, Peter Zijlstra,
	Sebastian Andrzej Siewior, Chris Bainbridge

On Fri, Apr 04 2025 at 01:46, Mateusz Jończyk wrote:
> W dniu 3.04.2025 o 15:12, Thomas Gleixner pisze:
>> Converting it to a raw lock "fixes" the problem, but RT people will hunt
>> you down with a big latency bat.
>>
>> But this is not related to the commit above and not new.
>>
>> timekeeping_suspend() has always invoked mach_get_cmos_time() with the
>> freeze lock held and mc146818_get_time() has always locked rtc_lock.
>>
>> I wonder, why this splat hasn't popped before. On RT lockdep should have
>> complained forever. Sebastian???
>
> I was able to trigger the bug on Linux 6.1.0 with
> CONFIG_PROVE_RAW_LOCK_NESTING enabled and I suspect it can be
> triggered much earlier.
>
> It is likely that only after
>
> commit 560af5dc839eef ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.")
>
> people are seeing this simply because no one previously did
> the test with CONFIG_PROVE_RAW_LOCK_NESTING=y.

Well. On RT enabled kernels the raw lock nesting was always enabled. So
I just was curious why this didn't show up earlier. Though RT folks
might not regularly test suspend :)

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] timekeeping: Add a lockdep override in tick_freeze().
  2025-04-03 20:26       ` Thomas Gleixner
  2025-04-04  6:16         ` Sebastian Andrzej Siewior
@ 2025-04-04 13:34         ` Sebastian Andrzej Siewior
  2025-04-04 18:47           ` Mateusz Jończyk
                             ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-04 13:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra, Chris Bainbridge

tick_freeze() acquires a raw_spinlock_t (tick_freeze_lock). Later in the
callchain (timekeeping_suspend() -> mc146818_avoid_UIP()) the RTC driver
can acquire a spinlock_t which becomes a sleeping lock on PREEMPT_RT.
Lockdep complains about this lock nesting.

Add a lockdep override for this special case and a comment explaining
why it is okay.

Reported-by: Borislav Petkov <bp@alien8.de>
Closes: https://lore.kernel.org/all/20250330113202.GAZ-krsjAnurOlTcp-@fat_crate.local/
Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Closes: https://lore.kernel.org/all/CAP-bSRZ0CWyZZsMtx046YV8L28LhY0fson2g4EqcwRAVN1Jk+Q@mail.gmail.com/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/time/tick-common.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index a47bcf71defcf..8fd8e2ee09fa1 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -509,6 +509,7 @@ void tick_resume(void)
 
 #ifdef CONFIG_SUSPEND
 static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
+static DEFINE_WAIT_OVERRIDE_MAP(tick_freeze_map, LD_WAIT_SLEEP);
 static unsigned int tick_freeze_depth;
 
 /**
@@ -528,9 +529,20 @@ void tick_freeze(void)
 	if (tick_freeze_depth == num_online_cpus()) {
 		trace_suspend_resume(TPS("timekeeping_freeze"),
 				     smp_processor_id(), true);
+		/*
+		 * All other CPUs have their interrupts disabled and are
+		 * suspended to idle. Other tasks have been frozen so there is
+		 * no scheduling happening. This means that there is no
+		 * concurrency in the system at this point. Therefore it is okay
+		 * to acquire a sleeping lock on PREEMPT_RT, such as spinlock_t,
+		 * because the lock can not be acquired and can not block.
+		 * Inform lockdep about the situation.
+		 */
+		lock_map_acquire_try(&tick_freeze_map);
 		system_state = SYSTEM_SUSPEND;
 		sched_clock_suspend();
 		timekeeping_suspend();
+		lock_map_release(&tick_freeze_map);
 	} else {
 		tick_suspend_local();
 	}
@@ -552,8 +564,16 @@ void tick_unfreeze(void)
 	raw_spin_lock(&tick_freeze_lock);
 
 	if (tick_freeze_depth == num_online_cpus()) {
+		/*
+		 * Similar to tick_freeze(). On resumption the first CPU may
+		 * acquire uncontended sleeping locks while other CPUs block on
+		 * tick_freeze_lock.
+		 */
+		lock_map_acquire_try(&tick_freeze_map);
 		timekeeping_resume();
 		sched_clock_resume();
+		lock_map_release(&tick_freeze_map);
+
 		system_state = SYSTEM_RUNNING;
 		trace_suspend_resume(TPS("timekeeping_freeze"),
 				     smp_processor_id(), false);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] timekeeping: Add a lockdep override in tick_freeze().
  2025-04-04 13:34         ` [PATCH] timekeeping: Add a lockdep override in tick_freeze() Sebastian Andrzej Siewior
@ 2025-04-04 18:47           ` Mateusz Jończyk
  2025-04-09 19:13           ` Peter Zijlstra
  2025-05-31 18:27           ` Chris Bainbridge
  2 siblings, 0 replies; 19+ messages in thread
From: Mateusz Jończyk @ 2025-04-04 18:47 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Thomas Gleixner
  Cc: Borislav Petkov, linux-rtc, Alexandre Belloni, lkml,
	Anna-Maria Behnsen, Frederic Weisbecker, Peter Zijlstra,
	Chris Bainbridge

W dniu 4.04.2025 o 15:34, Sebastian Andrzej Siewior pisze:
> tick_freeze() acquires a raw_spinlock_t (tick_freeze_lock). Later in the
> callchain (timekeeping_suspend() -> mc146818_avoid_UIP()) the RTC driver
> can acquire a spinlock_t which becomes a sleeping lock on PREEMPT_RT.
> Lockdep complains about this lock nesting.
>
> Add a lockdep override for this special case and a comment explaining
> why it is okay.
>
> Reported-by: Borislav Petkov <bp@alien8.de>
> Closes: https://lore.kernel.org/all/20250330113202.GAZ-krsjAnurOlTcp-@fat_crate.local/
> Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
> Closes: https://lore.kernel.org/all/CAP-bSRZ0CWyZZsMtx046YV8L28LhY0fson2g4EqcwRAVN1Jk+Q@mail.gmail.com/
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>   kernel/time/tick-common.c | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
>
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index a47bcf71defcf..8fd8e2ee09fa1 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -509,6 +509,7 @@ void tick_resume(void)
>   
>   #ifdef CONFIG_SUSPEND
>   static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
> +static DEFINE_WAIT_OVERRIDE_MAP(tick_freeze_map, LD_WAIT_SLEEP);
>   static unsigned int tick_freeze_depth;
>   
>   /**
> @@ -528,9 +529,20 @@ void tick_freeze(void)
>   	if (tick_freeze_depth == num_online_cpus()) {
>   		trace_suspend_resume(TPS("timekeeping_freeze"),
>   				     smp_processor_id(), true);
> +		/*
> +		 * All other CPUs have their interrupts disabled and are
> +		 * suspended to idle. Other tasks have been frozen so there is
> +		 * no scheduling happening. This means that there is no
> +		 * concurrency in the system at this point. Therefore it is okay
> +		 * to acquire a sleeping lock on PREEMPT_RT, such as spinlock_t,
> +		 * because the lock can not be acquired and can not block.

because the lock cannot be held by other CPUs / threads and
acquiring it cannot block.

> +		 * Inform lockdep about the situation.
> +		 */

Greetings,

Mateusz


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] timekeeping: Add a lockdep override in tick_freeze().
  2025-04-04 13:34         ` [PATCH] timekeeping: Add a lockdep override in tick_freeze() Sebastian Andrzej Siewior
  2025-04-04 18:47           ` Mateusz Jończyk
@ 2025-04-09 19:13           ` Peter Zijlstra
  2025-05-31 18:27           ` Chris Bainbridge
  2 siblings, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2025-04-09 19:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Chris Bainbridge

On Fri, Apr 04, 2025 at 03:34:29PM +0200, Sebastian Andrzej Siewior wrote:
> tick_freeze() acquires a raw_spinlock_t (tick_freeze_lock). Later in the
> callchain (timekeeping_suspend() -> mc146818_avoid_UIP()) the RTC driver
> can acquire a spinlock_t which becomes a sleeping lock on PREEMPT_RT.
> Lockdep complains about this lock nesting.
> 
> Add a lockdep override for this special case and a comment explaining
> why it is okay.
> 
> Reported-by: Borislav Petkov <bp@alien8.de>
> Closes: https://lore.kernel.org/all/20250330113202.GAZ-krsjAnurOlTcp-@fat_crate.local/
> Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
> Closes: https://lore.kernel.org/all/CAP-bSRZ0CWyZZsMtx046YV8L28LhY0fson2g4EqcwRAVN1Jk+Q@mail.gmail.com/
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

This is of course horrible :-) But yes, probably the best one can do
given how things are.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  kernel/time/tick-common.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index a47bcf71defcf..8fd8e2ee09fa1 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -509,6 +509,7 @@ void tick_resume(void)
>  
>  #ifdef CONFIG_SUSPEND
>  static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
> +static DEFINE_WAIT_OVERRIDE_MAP(tick_freeze_map, LD_WAIT_SLEEP);
>  static unsigned int tick_freeze_depth;
>  
>  /**
> @@ -528,9 +529,20 @@ void tick_freeze(void)
>  	if (tick_freeze_depth == num_online_cpus()) {
>  		trace_suspend_resume(TPS("timekeeping_freeze"),
>  				     smp_processor_id(), true);
> +		/*
> +		 * All other CPUs have their interrupts disabled and are
> +		 * suspended to idle. Other tasks have been frozen so there is
> +		 * no scheduling happening. This means that there is no
> +		 * concurrency in the system at this point. Therefore it is okay
> +		 * to acquire a sleeping lock on PREEMPT_RT, such as spinlock_t,
> +		 * because the lock can not be acquired and can not block.
> +		 * Inform lockdep about the situation.
> +		 */
> +		lock_map_acquire_try(&tick_freeze_map);
>  		system_state = SYSTEM_SUSPEND;
>  		sched_clock_suspend();
>  		timekeeping_suspend();
> +		lock_map_release(&tick_freeze_map);
>  	} else {
>  		tick_suspend_local();
>  	}
> @@ -552,8 +564,16 @@ void tick_unfreeze(void)
>  	raw_spin_lock(&tick_freeze_lock);
>  
>  	if (tick_freeze_depth == num_online_cpus()) {
> +		/*
> +		 * Similar to tick_freeze(). On resumption the first CPU may
> +		 * acquire uncontended sleeping locks while other CPUs block on
> +		 * tick_freeze_lock.
> +		 */
> +		lock_map_acquire_try(&tick_freeze_map);
>  		timekeeping_resume();
>  		sched_clock_resume();
> +		lock_map_release(&tick_freeze_map);
> +
>  		system_state = SYSTEM_RUNNING;
>  		trace_suspend_resume(TPS("timekeeping_freeze"),
>  				     smp_processor_id(), false);
> -- 
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] timekeeping: Add a lockdep override in tick_freeze().
  2025-04-04 13:34         ` [PATCH] timekeeping: Add a lockdep override in tick_freeze() Sebastian Andrzej Siewior
  2025-04-04 18:47           ` Mateusz Jończyk
  2025-04-09 19:13           ` Peter Zijlstra
@ 2025-05-31 18:27           ` Chris Bainbridge
  2025-05-31 19:16             ` Chris Bainbridge
  2 siblings, 1 reply; 19+ messages in thread
From: Chris Bainbridge @ 2025-05-31 18:27 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

Hi,

I'm getting "WARNING: inconsistent lock state" on resume with this
commit (92e250c624ea37fde64bfd624fd2556f0d846f18):

[   43.747069] ACPI: \_SB_.PEP_: Successfully transitioned to state screen off
[   43.753611] ACPI: \_SB_.PEP_: Successfully transitioned to state lps0 ms entry
[   43.753777] ACPI: \_SB_.PEP_: Successfully transitioned to state lps0 entry
[   43.838479] amd_pmc AMDI0005:00: failed to set RTC: -22
[   43.838489] PM: suspend-to-idle
[   43.838542] amd_pmc: SMU idlemask s0i3: 0xc00e0eb5
[   46.246882] Timekeeping suspended for 2.528 seconds
[   46.249132] PM: Triggering wakeup from IRQ 9
[   46.249292] ACPI: PM: ACPI fixed event wakeup
[   46.249297] PM: resume from suspend-to-idle
[   46.250801] amd_pmc AMDI0005:00: Last suspend didn't reach deepest state
[   46.251096] ACPI: \_SB_.PEP_: Successfully transitioned to state lps0 exit
[   46.257167] ACPI: \_SB_.PEP_: Successfully transitioned to state lps0 ms exit
[   46.258450] ACPI: \_SB_.PEP_: Successfully transitioned to state screen on
[   46.259652] ACPI: EC: interrupt unblocked
[   46.291758] PM: noirq resume of devices complete after 32.885 msecs

[   46.291907] ================================
[   46.291909] WARNING: inconsistent lock state
[   46.291911] 6.15.0-rc1-00002-g92e250c624ea #379 Not tainted
[   46.291914] --------------------------------
[   46.291915] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[   46.291917] irq/9-acpi/142 [HC0[0]:SC0[0]:HE1:SE1] takes:
[   46.291921] ffffffff88e48d18 (rtc_lock){?...}-{3:3}, at: cmos_interrupt+0x21/0x110
[   46.291939] {IN-HARDIRQ-W} state was registered at:
[   46.291941]   lock_acquire+0xc9/0x2d0
[   46.291946]   _raw_spin_lock+0x30/0x40
[   46.291951]   cmos_interrupt+0x21/0x110
[   46.291956]   __handle_irq_event_percpu+0x81/0x290
[   46.291962]   handle_irq_event+0x38/0x70
[   46.291966]   handle_edge_irq+0x85/0x210
[   46.291970]   __common_interrupt+0x45/0xd0
[   46.291973]   common_interrupt+0x80/0xa0
[   46.291977]   asm_common_interrupt+0x26/0x40
[   46.291981]   cpuidle_enter_state+0x113/0x540
[   46.291985]   cpuidle_enter+0x2d/0x40
[   46.291989]   do_idle+0x1f5/0x250
[   46.291992]   cpu_startup_entry+0x29/0x30
[   46.291994]   start_secondary+0x10c/0x130
[   46.291998]   common_startup_64+0x13e/0x141
[   46.292002] irq event stamp: 877
[   46.292004] hardirqs last  enabled at (877): [<ffffffff88778a90>] _raw_spin_unlock_irqrestore+0x40/0x50
[   46.292010] hardirqs last disabled at (876): [<ffffffff8877884c>] _raw_spin_lock_irqsave+0x5c/0x60
[   46.292014] softirqs last  enabled at (362): [<ffffffff87b0dc3c>] __irq_exit_rcu+0xcc/0xf0
[   46.292019] softirqs last disabled at (353): [<ffffffff87b0dc3c>] __irq_exit_rcu+0xcc/0xf0
[   46.292022] 
               other info that might help us debug this:
[   46.292024]  Possible unsafe locking scenario:

[   46.292025]        CPU0
[   46.292026]        ----
[   46.292027]   lock(rtc_lock);
[   46.292029]   <Interrupt>
[   46.292030]     lock(rtc_lock);
[   46.292032] 
                *** DEADLOCK ***

[   46.292033] no locks held by irq/9-acpi/142.
[   46.292035] 
               stack backtrace:
[   46.292040] CPU: 1 UID: 0 PID: 142 Comm: irq/9-acpi Not tainted 6.15.0-rc1-00002-g92e250c624ea #379 PREEMPT(voluntary) 
[   46.292046] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
[   46.292049] Call Trace:
[   46.292052]  <TASK>
[   46.292058]  dump_stack_lvl+0x6e/0x90
[   46.292063]  print_usage_bug.part.0+0x22c/0x2c0
[   46.292067]  mark_lock+0x6f7/0x890
[   46.292072]  ? __lock_acquire+0x449/0x21c0
[   46.292077]  __lock_acquire+0x7e5/0x21c0
[   46.292080]  ? psi_task_switch+0x10a/0x330
[   46.292086]  lock_acquire+0xc9/0x2d0
[   46.292088]  ? cmos_interrupt+0x21/0x110
[   46.292094]  ? acpi_write_bit_register+0x79/0xf0
[   46.292102]  _raw_spin_lock+0x30/0x40
[   46.292106]  ? cmos_interrupt+0x21/0x110
[   46.292110]  cmos_interrupt+0x21/0x110
[   46.292115]  rtc_handler+0x28/0xc0
[   46.292120]  acpi_ev_fixed_event_detect+0xc8/0x140
[   46.292127]  ? irq_thread+0xa4/0x340
[   46.292129]  acpi_ev_sci_xrupt_handler+0x13/0x40
[   46.292132]  acpi_irq+0x16/0x30
[   46.292136]  irq_thread_fn+0x1d/0x50
[   46.292139]  irq_thread+0x1d5/0x340
[   46.292140]  ? irq_copy_pending.isra.0+0x70/0x70
[   46.292144]  ? irq_finalize_oneshot.part.0+0xc0/0xc0
[   46.292147]  ? irq_forced_thread_fn+0x70/0x70
[   46.292149]  kthread+0x10a/0x250
[   46.292154]  ? kthreads_online_cpu+0x130/0x130
[   46.292158]  ret_from_fork+0x31/0x50
[   46.292162]  ? kthreads_online_cpu+0x130/0x130
[   46.292166]  ret_from_fork_asm+0x11/0x20
[   46.292174]  </TASK>
[   46.292918] PM: early resume of devices complete after 0.829 msecs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] timekeeping: Add a lockdep override in tick_freeze().
  2025-05-31 18:27           ` Chris Bainbridge
@ 2025-05-31 19:16             ` Chris Bainbridge
  2025-06-02 19:52               ` Mateusz Jończyk
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Bainbridge @ 2025-05-31 19:16 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Borislav Petkov, linux-rtc, Alexandre Belloni,
	Mateusz Jończyk, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

On Sat, May 31, 2025 at 07:27:03PM +0100, Chris Bainbridge wrote:
> Hi,
> 
> I'm getting "WARNING: inconsistent lock state" on resume with this
> commit (92e250c624ea37fde64bfd624fd2556f0d846f18):
> 

Further testing shows there are some required conditions for this
warning to be shown. The suspend must be of a short enough duration that
it does "not reach hardware sleep state" (according to amd_s2idle.py).

Also the warning is only shown once, I don't know if this is because the
conditions for the warning only occur once, or if there is log limit
somewhere that prevents it from being logged more than once.

I can reliably reproduce the warning by running amd_s2idle.py and
waiting for the automatic resume:

# ./amd_s2idle.py --log log --duration 5 --wait 4 --count 1
Debugging script for s2idle on AMD systems
💻 HP HP Pavilion Aero Laptop 13-be0xxx (103C_5335KV HP Pavilion) running BIOS 15.17 (F.17) released 12/18/2024 and EC 79.31
🐧 Debian GNU/Linux trixie/sid
🐧 Kernel 6.15.0-rc1-00002-g92e250c624ea
🔋 Battery BAT0 (313-27-3C-A PC03043XL) is operating at 100.00% of design
Checking prerequisites for s2idle
✅ Logs are provided via systemd
✅ AMD Ryzen 7 5800U with Radeon Graphics (family 19 model 50)
✅ SMT enabled
✅ LPS0 _DSM enabled
✅ ACPI FADT supports Low-power S0 idle
✅ HSMP driver `amd_hsmp` not detected (blocked: False)
✅ PMC driver `amd_pmc` loaded (Program 0 Firmware 64.73.0)
✅ GPU driver `amdgpu` bound to 0000:03:00.0
✅ System is configured for s2idle
✅ NVME Intel Corporation SSD 670p Series [Keystone Harbor] is configured for s2idle in BIOS
✅ GPIO driver `pinctrl_amd` available
🚦 Device firmware checks unavailable without fwupd gobject introspection
Started at 2025-05-31 19:46:33.911590 (cycle finish expected @ 2025-05-31 19:46:42.911616)
Results from last s2idle cycle
○ Suspend count: 1
○ Hardware sleep cycle count: 1
○ Wakeup triggered from IRQ 9: ACPI SCI
○ Woke up from IRQ 9: ACPI SCI
○ gpe03 increased from 140 to 148
✅ Userspace suspended for 0:00:08.256333
❌ Did not reach hardware sleep state

If the duration arg is 6 or higher, then amd_s2idle.py reports that the
hardware sleep state was entered, and the "inconsistent lock state"
warning does not appear. If the duration is too low (e.g. 1 second),
then the laptop does not wake up automatically, and upon pressing a
keyboard key, the amdgpu driver will report an error resuming the GPU,
and the GPU will not be working. (I don't think the amdgpu problem is
related to the lock state warning, I'm just mentioning it for
completeness). It is the state between these two cases, where the laptop
does suspend and resume correctly, but the suspend is too short to enter
a hardware sleep state, where the problem occurs.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] timekeeping: Add a lockdep override in tick_freeze().
  2025-05-31 19:16             ` Chris Bainbridge
@ 2025-06-02 19:52               ` Mateusz Jończyk
  2025-06-02 20:20                 ` [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt Mateusz Jończyk
  0 siblings, 1 reply; 19+ messages in thread
From: Mateusz Jończyk @ 2025-06-02 19:52 UTC (permalink / raw)
  To: Chris Bainbridge, Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Borislav Petkov, linux-rtc, Alexandre Belloni,
	lkml, Anna-Maria Behnsen, Frederic Weisbecker, Peter Zijlstra

W dniu 31.05.2025 o 21:16, Chris Bainbridge pisze:
> On Sat, May 31, 2025 at 07:27:03PM +0100, Chris Bainbridge wrote:
>> Hi,
>>
>> I'm getting "WARNING: inconsistent lock state" on resume with this
>> commit (92e250c624ea37fde64bfd624fd2556f0d846f18):
>>
> Further testing shows there are some required conditions for this
> warning to be shown. The suspend must be of a short enough duration that
> it does "not reach hardware sleep state" (according to amd_s2idle.py).
>
> Also the warning is only shown once, I don't know if this is because the
> conditions for the warning only occur once, or if there is log limit
> somewhere that prevents it from being logged more than once.
>
> I can reliably reproduce the warning by running amd_s2idle.py and
> waiting for the automatic resume:
>
> # ./amd_s2idle.py --log log --duration 5 --wait 4 --count 1
> Debugging script for s2idle on AMD systems
> 💻 HP HP Pavilion Aero Laptop 13-be0xxx (103C_5335KV HP Pavilion) running BIOS 15.17 (F.17) released 12/18/2024 and EC 79.31
> 🐧 Debian GNU/Linux trixie/sid
> 🐧 Kernel 6.15.0-rc1-00002-g92e250c624ea
> 🔋 Battery BAT0 (313-27-3C-A PC03043XL) is operating at 100.00% of design
> Checking prerequisites for s2idle
> ✅ Logs are provided via systemd
> ✅ AMD Ryzen 7 5800U with Radeon Graphics (family 19 model 50)
> ✅ SMT enabled
> ✅ LPS0 _DSM enabled
> ✅ ACPI FADT supports Low-power S0 idle
> ✅ HSMP driver `amd_hsmp` not detected (blocked: False)
> ✅ PMC driver `amd_pmc` loaded (Program 0 Firmware 64.73.0)
> ✅ GPU driver `amdgpu` bound to 0000:03:00.0
> ✅ System is configured for s2idle
> ✅ NVME Intel Corporation SSD 670p Series [Keystone Harbor] is configured for s2idle in BIOS
> ✅ GPIO driver `pinctrl_amd` available
> 🚦 Device firmware checks unavailable without fwupd gobject introspection
> Started at 2025-05-31 19:46:33.911590 (cycle finish expected @ 2025-05-31 19:46:42.911616)
> Results from last s2idle cycle
> ○ Suspend count: 1
> ○ Hardware sleep cycle count: 1
> ○ Wakeup triggered from IRQ 9: ACPI SCI
> ○ Woke up from IRQ 9: ACPI SCI
> ○ gpe03 increased from 140 to 148
> ✅ Userspace suspended for 0:00:08.256333
> ❌ Did not reach hardware sleep state
>
> If the duration arg is 6 or higher, then amd_s2idle.py reports that the
> hardware sleep state was entered, and the "inconsistent lock state"
> warning does not appear. If the duration is too low (e.g. 1 second),
> then the laptop does not wake up automatically, and upon pressing a
> keyboard key, the amdgpu driver will report an error resuming the GPU,
> and the GPU will not be working. (I don't think the amdgpu problem is
> related to the lock state warning, I'm just mentioning it for
> completeness). It is the state between these two cases, where the laptop
> does suspend and resume correctly, but the suspend is too short to enter
> a hardware sleep state, where the problem occurs.

Hello,

Thank you for this bug report.

amd_s2idle apparently uses an RTC alarm to wake the system up
(which on newer systems is handled by ACPI SCI instead).
When the delay before the alarm is very low (like 1 second),
the alarm fires before the system is fully
suspended and the system does not wake thereafter - you have
to wake it up manually. The ACPI SPI interrupt is queued, however,
and fires just thereafter.

It appears, however, that both the RTC interrupt and ACPI SPI
interrupts fired (one after the other or at the same time).

I have noticed that cmos_interrupt() in drivers/rtc/rtc-cmos.c
uses spin_lock(), not spin_lock_irqsave() etc., even though it
can be called from a non-interrupt context - indirectly by
cmos_resume() during system resume and also by rtc_handler().

This can lead to a deadlock and is likely while lockdep is
complaining - see "Single-lock state rules:" in
Documentation/locking/lockdep-design.rst .

It is possible that
commit 92e250c624ea ("timekeeping: Add a lockdep override in tick_freeze()")
is masking the current problem because only the first issue is shown.

I'll send you a debug patch shortly.

Greetings,
Mateusz


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt
  2025-06-02 19:52               ` Mateusz Jończyk
@ 2025-06-02 20:20                 ` Mateusz Jończyk
  2025-06-02 21:19                   ` Chris Bainbridge
  2025-06-03  6:47                   ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 19+ messages in thread
From: Mateusz Jończyk @ 2025-06-02 20:20 UTC (permalink / raw)
  To: Chris Bainbridge, Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Borislav Petkov, linux-rtc, Alexandre Belloni,
	lkml, Anna-Maria Behnsen, Frederic Weisbecker, Peter Zijlstra,
	Mateusz Jończyk

cmos_interrupt() can also be called also in non-interrupt contexts, such
as in ACPI handlers via rtc_handler(). Therefore, usage of
spin_lock(&rtc_lock) is insecure. Use spin_lock_irqsave() / etc.
instead.

Remove the local_irq_disable() hacks in cmos_check_wkalrm() and add a
comment so that these _irqsave / _irqrestore will not be disabled again,
as in

commit 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ")

Untested yet.

Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Fixes: 13be2efc390a ("rtc: cmos: Disable irq around direct invocation of cmos_interrupt()")
---
 drivers/rtc/rtc-cmos.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 8172869bd3d7..399bb82e6153 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -692,8 +692,12 @@ static irqreturn_t cmos_interrupt(int irq, void *p)
 {
 	u8		irqstat;
 	u8		rtc_control;
+	unsigned long	flags;
 
-	spin_lock(&rtc_lock);
+	/* We cannot use spin_lock() here, as cmos_interrupt() is also called
+	 * in non-irq context.
+	 */
+	spin_lock_irqsave(&rtc_lock, flags);
 
 	/* When the HPET interrupt handler calls us, the interrupt
 	 * status is passed as arg1 instead of the irq number.  But
@@ -727,7 +731,7 @@ static irqreturn_t cmos_interrupt(int irq, void *p)
 			hpet_mask_rtc_irq_bit(RTC_AIE);
 		CMOS_READ(RTC_INTR_FLAGS);
 	}
-	spin_unlock(&rtc_lock);
+	spin_unlock_irqrestore(&rtc_lock, flags);
 
 	if (is_intr(irqstat)) {
 		rtc_update_irq(p, 1, irqstat);
@@ -1295,9 +1299,7 @@ static void cmos_check_wkalrm(struct device *dev)
 	 * ACK the rtc irq here
 	 */
 	if (t_now >= cmos->alarm_expires && cmos_use_acpi_alarm()) {
-		local_irq_disable();
 		cmos_interrupt(0, (void *)cmos->rtc);
-		local_irq_enable();
 		return;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt
  2025-06-02 20:20                 ` [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt Mateusz Jończyk
@ 2025-06-02 21:19                   ` Chris Bainbridge
  2025-06-03  6:47                   ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 19+ messages in thread
From: Chris Bainbridge @ 2025-06-02 21:19 UTC (permalink / raw)
  To: Mateusz Jończyk
  Cc: Sebastian Andrzej Siewior, Thomas Gleixner, Borislav Petkov,
	linux-rtc, Alexandre Belloni, lkml, Anna-Maria Behnsen,
	Frederic Weisbecker, Peter Zijlstra

On Mon, Jun 02, 2025 at 10:20:19PM +0200, Mateusz Jończyk wrote:
> cmos_interrupt() can also be called also in non-interrupt contexts, such
> as in ACPI handlers via rtc_handler(). Therefore, usage of
> spin_lock(&rtc_lock) is insecure. Use spin_lock_irqsave() / etc.
> instead.
> 
> Remove the local_irq_disable() hacks in cmos_check_wkalrm() and add a
> comment so that these _irqsave / _irqrestore will not be disabled again,
> as in
> 
> commit 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ")
> 
> Untested yet.
> 
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Fixes: 13be2efc390a ("rtc: cmos: Disable irq around direct invocation of cmos_interrupt()")

I tested this, and can confirm it does fix the "inconsistent lock state"
error in my test case.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt
  2025-06-02 20:20                 ` [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt Mateusz Jończyk
  2025-06-02 21:19                   ` Chris Bainbridge
@ 2025-06-03  6:47                   ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-06-03  6:47 UTC (permalink / raw)
  To: Mateusz Jończyk
  Cc: Chris Bainbridge, Thomas Gleixner, Borislav Petkov, linux-rtc,
	Alexandre Belloni, lkml, Anna-Maria Behnsen, Frederic Weisbecker,
	Peter Zijlstra

On 2025-06-02 22:20:19 [+0200], Mateusz Jończyk wrote:
> cmos_interrupt() can also be called also in non-interrupt contexts, such
> as in ACPI handlers via rtc_handler(). Therefore, usage of
> spin_lock(&rtc_lock) is insecure. Use spin_lock_irqsave() / etc.
> instead.
> 
> Remove the local_irq_disable() hacks in cmos_check_wkalrm() and add a
> comment so that these _irqsave / _irqrestore will not be disabled again,
> as in

This also broke PREEMPT_RT due to disable-IRQ -> spinlock_t.

> commit 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ")
> 
> Untested yet.
> 
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Fixes: 13be2efc390a ("rtc: cmos: Disable irq around direct invocation of cmos_interrupt()")

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2025-06-03  6:47 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-30 11:32 [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Borislav Petkov
2025-03-30 12:23 ` Mateusz Jończyk
2025-03-30 12:32   ` Mateusz Jończyk
2025-04-03 13:12 ` Thomas Gleixner
2025-04-03 13:50   ` Sebastian Andrzej Siewior
2025-04-03 19:36     ` Sebastian Andrzej Siewior
2025-04-03 20:26       ` Thomas Gleixner
2025-04-04  6:16         ` Sebastian Andrzej Siewior
2025-04-04 13:34         ` [PATCH] timekeeping: Add a lockdep override in tick_freeze() Sebastian Andrzej Siewior
2025-04-04 18:47           ` Mateusz Jończyk
2025-04-09 19:13           ` Peter Zijlstra
2025-05-31 18:27           ` Chris Bainbridge
2025-05-31 19:16             ` Chris Bainbridge
2025-06-02 19:52               ` Mateusz Jończyk
2025-06-02 20:20                 ` [DRAFT PATCH] rtc-cmos: use spin_lock_irqsave in cmos_interrupt Mateusz Jończyk
2025-06-02 21:19                   ` Chris Bainbridge
2025-06-03  6:47                   ` Sebastian Andrzej Siewior
2025-04-03 23:46   ` [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP Mateusz Jończyk
2025-04-04  7:39     ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).