public inbox for linux-kernel@vger.kernel.org
* queue-5.10: Panic on shutdown at platform_shutdown+0x9
@ 2025-02-06 18:31 Chuck Lever
  2025-02-07 15:10 ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2025-02-06 18:31 UTC (permalink / raw)
  To: Greg KH, rafael; +Cc: stable@vger.kernel.org, linux-kernel

Hi -

For the past 3-4 days, NFSD CI runs on queue-5.10.y have been failing. I
looked into it today: the test guest fails to reboot because it panics
during shutdown:

[  146.793087] BUG: unable to handle page fault for address: ffffffffffffffe8
[  146.793918] #PF: supervisor read access in kernel mode
[  146.794544] #PF: error_code(0x0000) - not-present page
[  146.795172] PGD 3d5c14067 P4D 3d5c15067 PUD 3d5c17067 PMD 0
[  146.795865] Oops: 0000 [#1] SMP NOPTI
[  146.796326] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted 5.10.234-g99349f441fe1 #1
[  146.797256] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
[  146.798267] RIP: 0010:platform_shutdown+0x9/0x20
[  146.798838] Code: b7 46 08 c3 cc cc cc cc 31 c0 83 bf a8 02 00 00 ff 75 ec c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 68 <48> 8b 40 e8 48 85 c0 74 09 48 83 ef 10 ff e0 0f 1f 00 c3 cc cc cc
[  146.801012] RSP: 0018:ff7f86f440013de0 EFLAGS: 00010246
[  146.801651] RAX: 0000000000000000 RBX: ff4f0637469df418 RCX: 0000000000000000
[  146.802500] RDX: 0000000000000001 RSI: ff4f0637469df418 RDI: ff4f0637469df410
[  146.803350] RBP: ffffffffb2e79220 R08: ff4f0637469dd808 R09: ffffffffb2c5c698
[  146.804203] R10: 0000000000000000 R11: 0000000000000000 R12: ff4f0637469df410
[  146.805059] R13: ff4f0637469df490 R14: 00000000fee1dead R15: 0000000000000000
[  146.805909] FS:  00007f4e7ecc6b80(0000) GS:ff4f063aafd80000(0000) knlGS:0000000000000000
[  146.806866] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  146.807558] CR2: ffffffffffffffe8 CR3: 000000010ecb2001 CR4: 0000000000771ee0
[  146.808412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  146.809262] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  146.810109] PKRU: 55555554
[  146.810460] Call Trace:
[  146.810791]  ? __die_body.cold+0x1a/0x1f
[  146.811282]  ? no_context.constprop.0+0xf8/0x2f0
[  146.811854]  ? exc_page_fault+0xc5/0x150
[  146.812342]  ? asm_exc_page_fault+0x1e/0x30
[  146.812862]  ? platform_shutdown+0x9/0x20
[  146.813362]  device_shutdown+0x158/0x1c0
[  146.813853]  __do_sys_reboot.cold+0x2f/0x5b
[  146.814370]  ? vfs_writev+0x9b/0x110
[  146.814824]  ? do_writev+0x57/0xf0
[  146.815254]  do_syscall_64+0x30/0x40
[  146.815708]  entry_SYSCALL_64_after_hwframe+0x67/0xd1

Let me know how to further assist.
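
The faulting address in the trace above can be cross-checked by hand
against the Code: bytes. What follows is only a minimal Python sketch,
not kernel code. It assumes the marked bytes "<48> 8b 40 e8" decode as
"mov -0x18(%rax),%rax" (a load with an 8-bit displacement of 0xe8), and
it takes RAX and CR2 from the register dump. The helper names are
hypothetical.

```python
MASK64 = (1 << 64) - 1

def sign_extend8(byte):
    """Interpret one displacement byte as a signed 8-bit value."""
    return byte - 0x100 if byte & 0x80 else byte

def effective_address(base, disp8):
    """Base register plus signed disp8, wrapped to 64 bits."""
    return (base + sign_extend8(disp8)) & MASK64

rax = 0x0000000000000000   # RAX from the dump: base register of the load
cr2 = 0xffffffffffffffe8   # CR2 from the dump: the faulting address

fault = effective_address(rax, 0xe8)
print(hex(fault), fault == cr2)   # prints "0xffffffffffffffe8 True"
```

A zero base register combined with a small negative displacement is the
pattern one would expect when a container_of()-style conversion is
applied to a NULL pointer before a member is read, although the trace
alone does not prove which pointer was NULL.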


-- 
Chuck Lever



* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-02-06 18:31 queue-5.10: Panic on shutdown at platform_shutdown+0x9 Chuck Lever
@ 2025-02-07 15:10 ` Greg KH
  2025-02-09 15:57   ` Chuck Lever
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2025-02-07 15:10 UTC (permalink / raw)
  To: Chuck Lever; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On Thu, Feb 06, 2025 at 01:31:42PM -0500, Chuck Lever wrote:
> Let me know how to further assist.

Bisect?


* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-02-07 15:10 ` Greg KH
@ 2025-02-09 15:57   ` Chuck Lever
  2025-02-10  4:32     ` Harshit Mogalapalli
  2025-03-07 13:55     ` Chuck Lever
  0 siblings, 2 replies; 9+ messages in thread
From: Chuck Lever @ 2025-02-09 15:57 UTC (permalink / raw)
  To: Greg KH; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On 2/7/25 10:10 AM, Greg KH wrote:
> Bisect?

First bad commit:

commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
AuthorDate: Thu Nov 19 13:46:11 2020 +0100
Commit:     Sasha Levin <sashal@kernel.org>
CommitDate: Tue Feb 4 13:04:31 2025 -0500

    driver core: platform: use bus_type functions

    [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]

    This works towards the goal mentioned in 2006 in commit 594c8281f905
    ("[PATCH] Add bus_type probe, remove, shutdown methods.").

    The functions are moved to where the other bus_type functions are
    defined and renamed to match the already established naming scheme.

    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF node reference leak")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

-- 
Chuck Lever


* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-02-09 15:57   ` Chuck Lever
@ 2025-02-10  4:32     ` Harshit Mogalapalli
  2025-02-10 14:34       ` Chuck Lever
  2025-03-07 13:55     ` Chuck Lever
  1 sibling, 1 reply; 9+ messages in thread
From: Harshit Mogalapalli @ 2025-02-10  4:32 UTC (permalink / raw)
  To: Chuck Lever, Greg KH; +Cc: rafael, stable@vger.kernel.org, linux-kernel

Hello,

On 09/02/25 21:27, Chuck Lever wrote:
> On 2/7/25 10:10 AM, Greg KH wrote:
>> Bisect?
> 
> First bad commit:
> 
> commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
> Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> AuthorDate: Thu Nov 19 13:46:11 2020 +0100
> Commit:     Sasha Levin <sashal@kernel.org>
> CommitDate: Tue Feb 4 13:04:31 2025 -0500
> 
>      driver core: platform: use bus_type functions
> 
>      [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
> 
>      This works towards the goal mentioned in 2006 in commit 594c8281f905
>      ("[PATCH] Add bus_type probe, remove, shutdown methods.").
> 
>      The functions are moved to where the other bus_type functions are
>      defined and renamed to match the already established naming scheme.
> 
>      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>      Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
>      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>      Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF node reference leak")
>      Signed-off-by: Sasha Levin <sashal@kernel.org>
> 

One option is to drop this commit. But since it was pulled in as a
Stable-dep-of for another commit, maybe we should keep it and also apply
the fix below?

commit 46e85af0cc53f35584e00bb5db7db6893d0e16e5
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Date:   Sun Dec 13 02:55:33 2020 +0300

     driver core: platform: don't oops in platform_shutdown() on unbound devices

     On shutdown the driver core calls the bus' shutdown callback also for
     unbound devices. A driver's shutdown callback however is only called for
     devices bound to this driver. Commit 9c30921fe799 ("driver core:
     platform: use bus_type functions") changed the platform bus from driver
     callbacks to bus callbacks, so the shutdown function must be prepared to
     be called without a driver. Add the corresponding check in the shutdown
     function.

     Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
     Tested-by: Guenter Roeck <linux@roeck-us.net>
     Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
     Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
     Link: https://lore.kernel.org/r/20201212235533.247537-1-dmitry.baryshkov@linaro.org
     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

This commit talks about fixing an oops in platform_shutdown()
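
The behaviour that commit message describes can be modelled outside the
kernel. The following is a rough Python model, not the actual C change:
the class and function names are hypothetical, and a driver of None
stands in for an unbound device. It only illustrates why a bus-level
shutdown callback needs a "no driver bound" check when the core invokes
it for every device.

```python
class Driver:
    def __init__(self, shutdown=None):
        self.shutdown = shutdown   # optional per-driver shutdown callback

class Device:
    def __init__(self, name, driver=None):
        self.name = name
        self.driver = driver       # None models an unbound device

def platform_shutdown_unchecked(dev):
    # Models a bus callback that dereferences dev.driver unconditionally.
    if dev.driver.shutdown:
        dev.driver.shutdown(dev)

def platform_shutdown_checked(dev):
    # Models the guard the fix describes: return early when no driver is bound.
    if dev.driver is None:
        return
    if dev.driver.shutdown:
        dev.driver.shutdown(dev)

def device_shutdown(devices, bus_shutdown):
    # The bus callback runs for every device, bound or not.
    for dev in devices:
        bus_shutdown(dev)

log = []
devices = [Device("bound", Driver(lambda d: log.append(d.name))),
           Device("unbound")]

device_shutdown(devices, platform_shutdown_checked)
print(log)   # only the bound device's callback ran

try:
    device_shutdown(devices, platform_shutdown_unchecked)
except AttributeError as exc:
    print("unchecked variant failed on the unbound device:", exc)
```

In the model the unchecked variant fails with an AttributeError; in the
kernel the analogous access is a read through a NULL driver pointer,
which matches the kind of fault reported at the top of this thread.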

Thanks,
Harshit



* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-02-10  4:32     ` Harshit Mogalapalli
@ 2025-02-10 14:34       ` Chuck Lever
  0 siblings, 0 replies; 9+ messages in thread
From: Chuck Lever @ 2025-02-10 14:34 UTC (permalink / raw)
  To: Harshit Mogalapalli, Greg KH; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On 2/9/25 11:32 PM, Harshit Mogalapalli wrote:
> Hello,
> 
> On 09/02/25 21:27, Chuck Lever wrote:
>> On 2/7/25 10:10 AM, Greg KH wrote:
>>> Bisect?
>>
>> First bad commit:
>>
>> commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
>> Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>> AuthorDate: Thu Nov 19 13:46:11 2020 +0100
>> Commit:     Sasha Levin <sashal@kernel.org>
>> CommitDate: Tue Feb 4 13:04:31 2025 -0500
>>
>>      driver core: platform: use bus_type functions
>>
>>      [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
>>
>>      This works towards the goal mentioned in 2006 in commit 594c8281f905
>>      ("[PATCH] Add bus_type probe, remove, shutdown methods.").
>>
>>      The functions are moved to where the other bus_type functions are
>>      defined and renamed to match the already established naming scheme.
>>
>>      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>>      Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
>>      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>      Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF node reference leak")
>>      Signed-off-by: Sasha Levin <sashal@kernel.org>
>>
> 
> One option is to drop this commit. But since it was pulled in as a
> Stable-dep-of for another commit, maybe we should keep it and also apply
> the fix below?
> 
> commit 46e85af0cc53f35584e00bb5db7db6893d0e16e5
> Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Date:   Sun Dec 13 02:55:33 2020 +0300
> 
>     driver core: platform: don't oops in platform_shutdown() on unbound devices
> 
>     On shutdown the driver core calls the bus' shutdown callback also for
>     unbound devices. A driver's shutdown callback however is only called for
>     devices bound to this driver. Commit 9c30921fe799 ("driver core:
>     platform: use bus_type functions") changed the platform bus from driver
>     callbacks to bus callbacks, so the shutdown function must be prepared to
>     be called without a driver. Add the corresponding check in the shutdown
>     function.
> 
>     Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
>     Tested-by: Guenter Roeck <linux@roeck-us.net>
>     Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>     Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>     Link: https://lore.kernel.org/r/20201212235533.247537-1-dmitry.baryshkov@linaro.org
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> This commit talks about fixing an oops in platform_shutdown()
> 
> Thanks,
> Harshit
> 

I was about to test this idea, but 46e85af0cc53 does not apply cleanly
to origin/linux-5.10.y. Someone with more local expertise will need to
have a look.


-- 
Chuck Lever


* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-02-09 15:57   ` Chuck Lever
  2025-02-10  4:32     ` Harshit Mogalapalli
@ 2025-03-07 13:55     ` Chuck Lever
  2025-03-07 14:29       ` Greg KH
  1 sibling, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2025-03-07 13:55 UTC (permalink / raw)
  To: Greg KH; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On 2/9/25 10:57 AM, Chuck Lever wrote:
> On 2/7/25 10:10 AM, Greg KH wrote:
>> Bisect?
> 
> First bad commit:
> 
> commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
> Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> AuthorDate: Thu Nov 19 13:46:11 2020 +0100
> Commit:     Sasha Levin <sashal@kernel.org>
> CommitDate: Tue Feb 4 13:04:31 2025 -0500
> 
>     driver core: platform: use bus_type functions
> 
>     [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
> 
>     This works towards the goal mentioned in 2006 in commit 594c8281f905
>     ("[PATCH] Add bus_type probe, remove, shutdown methods.").
> 
>     The functions are moved to where the other bus_type functions are
>     defined and renamed to match the already established naming scheme.
> 
>     Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>     Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>     Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF node reference leak")
>     Signed-off-by: Sasha Levin <sashal@kernel.org>
> 

Hi Greg, I still see crashes on shutdown 100% of the time on queue/5.10
kernels. Is there a plan to revert this commit?


-- 
Chuck Lever


* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-03-07 13:55     ` Chuck Lever
@ 2025-03-07 14:29       ` Greg KH
  2025-03-07 14:30         ` Chuck Lever
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2025-03-07 14:29 UTC (permalink / raw)
  To: Chuck Lever; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On Fri, Mar 07, 2025 at 08:55:55AM -0500, Chuck Lever wrote:
> On 2/9/25 10:57 AM, Chuck Lever wrote:
> > On 2/7/25 10:10 AM, Greg KH wrote:
> >> Bisect?
> > 
> > First bad commit:
> > 
> > commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
> > Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > AuthorDate: Thu Nov 19 13:46:11 2020 +0100
> > Commit:     Sasha Levin <sashal@kernel.org>
> > CommitDate: Tue Feb 4 13:04:31 2025 -0500
> > 
> >     driver core: platform: use bus_type functions
> > 
> >     [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
> > 
> >     This works towards the goal mentioned in 2006 in commit 594c8281f905
> >     ("[PATCH] Add bus_type probe, remove, shutdown methods.").
> > 
> >     The functions are moved to where the other bus_type functions are
> >     defined and renamed to match the already established naming scheme.
> > 
> >     Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >     Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
> >     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >     Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF node reference leak")
> >     Signed-off-by: Sasha Levin <sashal@kernel.org>
> > 
> 
> Hi Greg, I still see crashes on shutdown 100% of the time on queue/5.10
> kernels. Is there a plan to revert this commit?

Yes, I haven't had the cycles to get to looking at the 5.10 queue in a
while, which is why I haven't pushed out new 5.10-rc kernels.

I'll get to it "soon".  Hopefully.  Ugh.

greg k-h


* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-03-07 14:29       ` Greg KH
@ 2025-03-07 14:30         ` Chuck Lever
  2025-03-11 14:15           ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2025-03-07 14:30 UTC (permalink / raw)
  To: Greg KH; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On 3/7/25 9:29 AM, Greg KH wrote:
> On Fri, Mar 07, 2025 at 08:55:55AM -0500, Chuck Lever wrote:
>> On 2/9/25 10:57 AM, Chuck Lever wrote:
>>> On 2/7/25 10:10 AM, Greg KH wrote:
>>>> On Thu, Feb 06, 2025 at 01:31:42PM -0500, Chuck Lever wrote:
>>>>> Hi -
>>>>>
>>>>> For the past 3-4 days, NFSD CI runs on queue-5.10.y have been failing. I
>>>>> looked into it today, and the test guest fails to reboot because it
>>>>> panics during a reboot shutdown:
>>>>>
>>>>> [  146.793087] BUG: unable to handle page fault for address:
>>>>> ffffffffffffffe8
>>>>> [  146.793918] #PF: supervisor read access in kernel mode
>>>>> [  146.794544] #PF: error_code(0x0000) - not-present page
>>>>> [  146.795172] PGD 3d5c14067 P4D 3d5c15067 PUD 3d5c17067 PMD 0
>>>>> [  146.795865] Oops: 0000 [#1] SMP NOPTI
>>>>> [  146.796326] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted
>>>>> 5.10.234-g99349f441fe1 #1
>>>>> [  146.797256] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>>>>> 1.16.3-2.fc40 04/01/2014
>>>>> [  146.798267] RIP: 0010:platform_shutdown+0x9/0x20
>>>>> [  146.798838] Code: b7 46 08 c3 cc cc cc cc 31 c0 83 bf a8 02 00 00 ff
>>>>> 75 ec c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47
>>>>> 68 <48> 8b 40 e8 48 85 c0 74 09 48 83 ef 10 ff e0 0f 1f 00 c3 cc cc cc
>>>>> [  146.801012] RSP: 0018:ff7f86f440013de0 EFLAGS: 00010246
>>>>> [  146.801651] RAX: 0000000000000000 RBX: ff4f0637469df418 RCX:
>>>>> 0000000000000000
>>>>> [  146.802500] RDX: 0000000000000001 RSI: ff4f0637469df418 RDI:
>>>>> ff4f0637469df410
>>>>> [  146.803350] RBP: ffffffffb2e79220 R08: ff4f0637469dd808 R09:
>>>>> ffffffffb2c5c698
>>>>> [  146.804203] R10: 0000000000000000 R11: 0000000000000000 R12:
>>>>> ff4f0637469df410
>>>>> [  146.805059] R13: ff4f0637469df490 R14: 00000000fee1dead R15:
>>>>> 0000000000000000
>>>>> [  146.805909] FS:  00007f4e7ecc6b80(0000) GS:ff4f063aafd80000(0000)
>>>>> knlGS:0000000000000000
>>>>> [  146.806866] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [  146.807558] CR2: ffffffffffffffe8 CR3: 000000010ecb2001 CR4:
>>>>> 0000000000771ee0
>>>>> [  146.808412] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>> 0000000000000000
>>>>> [  146.809262] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>>> 0000000000000400
>>>>> [  146.810109] PKRU: 55555554
>>>>> [  146.810460] Call Trace:
>>>>> [  146.810791]  ? __die_body.cold+0x1a/0x1f
>>>>> [  146.811282]  ? no_context.constprop.0+0xf8/0x2f0
>>>>> [  146.811854]  ? exc_page_fault+0xc5/0x150
>>>>> [  146.812342]  ? asm_exc_page_fault+0x1e/0x30
>>>>> [  146.812862]  ? platform_shutdown+0x9/0x20
>>>>> [  146.813362]  device_shutdown+0x158/0x1c0
>>>>> [  146.813853]  __do_sys_reboot.cold+0x2f/0x5b
>>>>> [  146.814370]  ? vfs_writev+0x9b/0x110
>>>>> [  146.814824]  ? do_writev+0x57/0xf0
>>>>> [  146.815254]  do_syscall_64+0x30/0x40
>>>>> [  146.815708]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
>>>>>
>>>>> Let me know how to further assist.
>>>>
>>>> Bisect?
>>>
>>> First bad commit:
>>>
>>> commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
>>> Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>>> AuthorDate: Thu Nov 19 13:46:11 2020 +0100
>>> Commit:     Sasha Levin <sashal@kernel.org>
>>> CommitDate: Tue Feb 4 13:04:31 2025 -0500
>>>
>>>     driver core: platform: use bus_type functions
>>>
>>>     [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
>>>
>>>     This works towards the goal mentioned in 2006 in commit 594c8281f905
>>>     ("[PATCH] Add bus_type probe, remove, shutdown methods.").
>>>
>>>     The functions are moved to where the other bus_type functions are
>>>     defined and renamed to match the already established naming scheme.
>>>
>>>     Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>>>     Link:
>>> https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
>>>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>     Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF
>>> node reference leak")
>>>     Signed-off-by: Sasha Levin <sashal@kernel.org>
>>>
>>
>> Hi Greg, I still see crashes on shutdown 100% of the time on queue/5.10
>> kernels. Is there a plan to revert this commit?
> 
> Yes, I haven't had the cycles to get to looking at the 5.10 queue in a
> while, which is why I haven't pushed out new 5.10-rc kernels.
> 
> I'll get to it "soon".  Hopefully.  Ugh.

Understood. Thanks!


-- 
Chuck Lever


* Re: queue-5.10: Panic on shutdown at platform_shutdown+0x9
  2025-03-07 14:30         ` Chuck Lever
@ 2025-03-11 14:15           ` Greg KH
  0 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2025-03-11 14:15 UTC (permalink / raw)
  To: Chuck Lever; +Cc: rafael, stable@vger.kernel.org, linux-kernel

On Fri, Mar 07, 2025 at 09:30:31AM -0500, Chuck Lever wrote:
> On 3/7/25 9:29 AM, Greg KH wrote:
> > On Fri, Mar 07, 2025 at 08:55:55AM -0500, Chuck Lever wrote:
> >> On 2/9/25 10:57 AM, Chuck Lever wrote:
> >>> On 2/7/25 10:10 AM, Greg KH wrote:
> >>>> On Thu, Feb 06, 2025 at 01:31:42PM -0500, Chuck Lever wrote:
> >>>>> Hi -
> >>>>>
> >>>>> For the past 3-4 days, NFSD CI runs on queue-5.10.y have been failing. I
> >>>>> looked into it today, and the test guest fails to reboot because it
> >>>>> panics during a reboot shutdown:
> >>>>>
> >>>>> [  146.793087] BUG: unable to handle page fault for address:
> >>>>> ffffffffffffffe8
> >>>>> [  146.793918] #PF: supervisor read access in kernel mode
> >>>>> [  146.794544] #PF: error_code(0x0000) - not-present page
> >>>>> [  146.795172] PGD 3d5c14067 P4D 3d5c15067 PUD 3d5c17067 PMD 0
> >>>>> [  146.795865] Oops: 0000 [#1] SMP NOPTI
> >>>>> [  146.796326] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted
> >>>>> 5.10.234-g99349f441fe1 #1
> >>>>> [  146.797256] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> >>>>> 1.16.3-2.fc40 04/01/2014
> >>>>> [  146.798267] RIP: 0010:platform_shutdown+0x9/0x20
> >>>>> [  146.798838] Code: b7 46 08 c3 cc cc cc cc 31 c0 83 bf a8 02 00 00 ff
> >>>>> 75 ec c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47
> >>>>> 68 <48> 8b 40 e8 48 85 c0 74 09 48 83 ef 10 ff e0 0f 1f 00 c3 cc cc cc
> >>>>> [  146.801012] RSP: 0018:ff7f86f440013de0 EFLAGS: 00010246
> >>>>> [  146.801651] RAX: 0000000000000000 RBX: ff4f0637469df418 RCX:
> >>>>> 0000000000000000
> >>>>> [  146.802500] RDX: 0000000000000001 RSI: ff4f0637469df418 RDI:
> >>>>> ff4f0637469df410
> >>>>> [  146.803350] RBP: ffffffffb2e79220 R08: ff4f0637469dd808 R09:
> >>>>> ffffffffb2c5c698
> >>>>> [  146.804203] R10: 0000000000000000 R11: 0000000000000000 R12:
> >>>>> ff4f0637469df410
> >>>>> [  146.805059] R13: ff4f0637469df490 R14: 00000000fee1dead R15:
> >>>>> 0000000000000000
> >>>>> [  146.805909] FS:  00007f4e7ecc6b80(0000) GS:ff4f063aafd80000(0000)
> >>>>> knlGS:0000000000000000
> >>>>> [  146.806866] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>> [  146.807558] CR2: ffffffffffffffe8 CR3: 000000010ecb2001 CR4:
> >>>>> 0000000000771ee0
> >>>>> [  146.808412] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> >>>>> 0000000000000000
> >>>>> [  146.809262] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> >>>>> 0000000000000400
> >>>>> [  146.810109] PKRU: 55555554
> >>>>> [  146.810460] Call Trace:
> >>>>> [  146.810791]  ? __die_body.cold+0x1a/0x1f
> >>>>> [  146.811282]  ? no_context.constprop.0+0xf8/0x2f0
> >>>>> [  146.811854]  ? exc_page_fault+0xc5/0x150
> >>>>> [  146.812342]  ? asm_exc_page_fault+0x1e/0x30
> >>>>> [  146.812862]  ? platform_shutdown+0x9/0x20
> >>>>> [  146.813362]  device_shutdown+0x158/0x1c0
> >>>>> [  146.813853]  __do_sys_reboot.cold+0x2f/0x5b
> >>>>> [  146.814370]  ? vfs_writev+0x9b/0x110
> >>>>> [  146.814824]  ? do_writev+0x57/0xf0
> >>>>> [  146.815254]  do_syscall_64+0x30/0x40
> >>>>> [  146.815708]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> >>>>>
> >>>>> Let me know how to further assist.
> >>>>
> >>>> Bisect?
> >>>
> >>> First bad commit:
> >>>
> >>> commit a06b4817f3d20721ae729d8b353457ff9fe6ff9c
> >>> Author:     Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >>> AuthorDate: Thu Nov 19 13:46:11 2020 +0100
> >>> Commit:     Sasha Levin <sashal@kernel.org>
> >>> CommitDate: Tue Feb 4 13:04:31 2025 -0500
> >>>
> >>>     driver core: platform: use bus_type functions
> >>>
> >>>     [ Upstream commit 9c30921fe7994907e0b3e0637b2c8c0fc4b5171f ]
> >>>
> >>>     This works towards the goal mentioned in 2006 in commit 594c8281f905
> >>>     ("[PATCH] Add bus_type probe, remove, shutdown methods.").
> >>>
> >>>     The functions are moved to where the other bus_type functions are
> >>>     defined and renamed to match the already established naming scheme.
> >>>
> >>>     Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >>>     Link:
> >>> https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de
> >>>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >>>     Stable-dep-of: bf5821909eb9 ("mtd: hyperbus: hbmc-am654: fix an OF
> >>> node reference leak")
> >>>     Signed-off-by: Sasha Levin <sashal@kernel.org>
> >>>
> >>
> >> Hi Greg, I still see crashes on shutdown 100% of the time on queue/5.10
> >> kernels. Is there a plan to revert this commit?
> > 
> > Yes, I haven't had the cycles to get to looking at the 5.10 queue in a
> > while, which is why I haven't pushed out new 5.10-rc kernels.
> > 
> > I'll get to it "soon".  Hopefully.  Ugh.
> 
> Understood. Thanks!

Ok, all now dropped and cleaned up, thanks!

greg k-h
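
A note on reading the oops quoted above: the faulting instruction in the
Code: line is "48 8b 40 e8", i.e. mov -0x18(%rax),%rax, with RAX = 0 and
CR2 = ffffffffffffffe8. That pattern is what a container_of() on a NULL
pointer followed by a member load looks like, which would fit a device
with no driver bound reaching platform_shutdown(). The sketch below only
reproduces the address arithmetic in userspace. The mock_* names are
hypothetical stand-ins, the member order is an assumption modelled on the
5.10 struct platform_driver (five callbacks ahead of the embedded driver
member), and an LP64 target is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not kernel types; only member order matters. */
struct mock_device;
struct mock_device_driver { const char *name; };

/* Assumed layout: five callbacks, then the embedded driver struct. */
struct mock_platform_driver {
	int  (*probe)(struct mock_device *);
	int  (*remove)(struct mock_device *);
	void (*shutdown)(struct mock_device *);
	int  (*suspend)(struct mock_device *);
	int  (*resume)(struct mock_device *);
	struct mock_device_driver driver;
};

/*
 * Displacement from the embedded driver member to the shutdown member,
 * i.e. what container_of() plus a ->shutdown load folds into one offset.
 */
static intptr_t shutdown_disp(void)
{
	assert(sizeof(void *) == 8);	/* LP64 assumption */
	return (intptr_t)offsetof(struct mock_platform_driver, shutdown) -
	       (intptr_t)offsetof(struct mock_platform_driver, driver);
}

/*
 * Address the load would touch for a given driver pointer value.
 * Integer arithmetic only; nothing is dereferenced.
 */
static uintptr_t shutdown_load_addr(uintptr_t drv)
{
	return drv + (uintptr_t)shutdown_disp();
}
```

With a driver pointer of 0, shutdown_load_addr() wraps to the same value
the oops reports in CR2, and shutdown_disp() matches the -0x18
displacement in the faulting instruction. This is only a reading of the
register dump, not a statement about how the problem was resolved in the
stable queue.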


Thread overview: 9+ messages
2025-02-06 18:31 queue-5.10: Panic on shutdown at platform_shutdown+0x9 Chuck Lever
2025-02-07 15:10 ` Greg KH
2025-02-09 15:57   ` Chuck Lever
2025-02-10  4:32     ` Harshit Mogalapalli
2025-02-10 14:34       ` Chuck Lever
2025-03-07 13:55     ` Chuck Lever
2025-03-07 14:29       ` Greg KH
2025-03-07 14:30         ` Chuck Lever
2025-03-11 14:15           ` Greg KH
