* Question on xfs related kernel panic
@ 2023-11-10 8:14 Jianan Wang
2023-11-10 8:48 ` Jianan Wang
2023-11-10 19:34 ` Darrick J. Wong
0 siblings, 2 replies; 6+ messages in thread
From: Jianan Wang @ 2023-11-10 8:14 UTC
To: linux-xfs
Hi all,
I have a question about a kernel panic that caused our server to reboot. The stack trace is as follows (copied from /var/lib/systemd/pstore/*):
<4>[888969.888666] general protection fault, probably for non-canonical address 0xbf5bc9c369fd38ba: 0000 [#1] SMP PTI
<4>[888969.891355] CPU: 47 PID: 2662145 Comm: find Tainted: P OE 5.15.0-46-generic #49~20.04.1-Ubuntu
<4>[888969.894004] Hardware name: Supermicro SYS-4029GP-TRT2/X11DPG-OT-CPU, BIOS 3.8b 01/17/2023
<4>[888969.896608] RIP: 0010:__kmalloc+0xfc/0x4b0
<4>[888969.899170] Code: ca 2b ad 56 49 8b 50 08 49 83 78 10 00 4d 8b 30 0f 84 67 03 00 00 4d 85 f6 0f 84 5e 03 00 00 41 8b 45 28 49 8b 7d 00 4c 01 f0 <48> 8b 18 48 89 c1 49 33 9d b8 00 00 00 4c 89 f0 48 0f c9 48 31 cb
<4>[888969.904329] RSP: 0018:ffffba69b18a78c0 EFLAGS: 00010282
<4>[888969.906872] RAX: bf5bc9c369fd38ba RBX: 0000000000002c40 RCX: ffffffffc4d3ea92
<4>[888969.909420] RDX: 0000000004d3b836 RSI: 0000000000002c40 RDI: 00000000000350a0
<4>[888969.911952] RBP: ffffba69b18a7900 R08: ffff979effef50a0 R09: 000000000000002c
<4>[888969.914471] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
<4>[888969.916976] R13: ffff976080042500 R14: bf5bc9c369fd389a R15: ffffffffc4d80b0e
<4>[888969.919594] FS: 00007fdbf10dd800(0000) GS:ffff979effec0000(0000) knlGS:0000000000000000
<4>[888969.922109] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[888969.924601] CR2: 00007f236f3419f0 CR3: 00000050e6e62001 CR4: 00000000007706e0
<4>[888969.927099] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[888969.929579] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[888969.932029] PKRU: 55555554
<4>[888969.934445] Call Trace:
<4>[888969.936827] <TASK>
<4>[888969.939269] kmem_alloc+0x6e/0x110 [xfs]
<4>[888969.941882] xfs_init_local_fork+0x72/0xf0 [xfs]
<4>[888969.944418] xfs_iformat_local+0xac/0x180 [xfs]
<4>[888969.946921] xfs_iformat_data_fork+0x105/0x130 [xfs]
<4>[888969.949405] xfs_inode_from_disk+0x2be/0x470 [xfs]
<4>[888969.951869] xfs_iget+0x334/0xbd0 [xfs]
<4>[888969.954319] ? kvfree+0x2c/0x40
<4>[888969.956529] xfs_lookup+0xd2/0x100 [xfs]
<4>[888969.958930] xfs_vn_lookup+0x76/0xb0 [xfs]
<4>[888969.961310] __lookup_slow+0x85/0x150
<4>[888969.963443] walk_component+0x145/0x1c0
<4>[888969.965637] ? __fdget_raw+0x10/0x20
<4>[888969.967747] ? path_init+0x1e5/0x390
<4>[888969.969888] path_lookupat.isra.0+0x6e/0x150
<4>[888969.971927] filename_lookup+0xcf/0x1a0
<4>[888969.973943] ? __check_object_size+0x14f/0x160
<4>[888969.975937] ? strncpy_from_user+0x44/0x160
<4>[888969.977879] ? getname_flags+0x6f/0x1f0
<4>[888969.979769] user_path_at_empty+0x3f/0x60
<4>[888969.981604] vfs_statx+0x73/0x110
<4>[888969.983390] __do_sys_newfstatat+0x36/0x70
<4>[888969.985125] ? alloc_fd+0x58/0x190
<4>[888969.986806] ? f_dupfd+0x4b/0x70
<4>[888969.988513] ? do_fcntl+0x3af/0x5b0
<4>[888969.990090] __x64_sys_newfstatat+0x1e/0x30
<4>[888969.991649] do_syscall_64+0x59/0xc0
<4>[888969.993146] ? syscall_exit_to_user_mode+0x27/0x50
<4>[888969.994611] ? do_syscall_64+0x69/0xc0
<4>[888969.996020] ? exit_to_user_mode_prepare+0x3d/0x1c0
<4>[888969.997404] ? filp_close+0x60/0x70
<4>[888969.998752] ? syscall_exit_to_user_mode+0x27/0x50
<4>[888970.000084] ? __x64_sys_close+0x12/0x50
<4>[888970.001371] ? do_syscall_64+0x69/0xc0
<4>[888970.002605] ? do_syscall_64+0x69/0xc0
<4>[888970.003793] entry_SYSCALL_64_after_hwframe+0x61/0xcb
Our XFS version, filesystem configuration, OS, and kernel version are as follows:
Linux$ xfs_info -V /data/
xfs_info version 5.9.0
Linux$ xfs_info /data
meta-data=/dev/md127p1 isize=512 agcount=32, agsize=117206400 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1
data = bsize=4096 blocks=3750604800, imaxpct=5
= sunit=128 swidth=512 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=521728, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Linux$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu-Server 20.04.6 2023.05.30 (Cubic 2023-05-30 13:13)"
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu-Server 20.04.6 2023.05.30 (Cubic 2023-05-30 13:13)"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Linux$ uname -a
Linux abc-server-001 5.15.0-46-generic #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
It would be great if any insight could be provided on whether this is a known issue or how we could troubleshoot further.
Best Regards.
Jianan
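When reading a trace like the one above, it can help to separate the frames the unwinder considers reliable from the ones prefixed with "?", which are addresses found on the stack that may not belong to the actual call chain. The following is a minimal sketch that does this for lines in the pstore format shown here; the function name, the regular expression, and the "kernel" label for frames without a module tag are assumptions made for illustration.

```python
import re

# Matches lines such as "<4>[888969.939269] kmem_alloc+0x6e/0x110 [xfs]".
# Group 1 is the optional "? " marker, group 2 the symbol, group 3 the module.
FRAME = re.compile(
    r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

def reliable_frames(trace_lines):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    frames = []
    for line in trace_lines:
        m = FRAME.search(line)
        if m and not m.group(1):
            # "kernel" is a label chosen here for frames without a [module] tag.
            frames.append((m.group(2), m.group(3) or "kernel"))
    return frames

# Three lines copied from the trace above.
sample = [
    "<4>[888969.939269] kmem_alloc+0x6e/0x110 [xfs]",
    "<4>[888969.954319] ? kvfree+0x2c/0x40",
    "<4>[888969.961310] __lookup_slow+0x85/0x150",
]
print(reliable_frames(sample))
```

Applied to the full trace, this leaves the path from the newfstatat syscall through the VFS lookup into xfs_iget and down to kmem_alloc, with the faulting instruction itself in __kmalloc (the RIP line).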
* Re: Question on xfs related kernel panic
2023-11-10 8:14 Question on xfs related kernel panic Jianan Wang
@ 2023-11-10 8:48 ` Jianan Wang
2023-11-10 19:34 ` Darrick J. Wong
1 sibling, 0 replies; 6+ messages in thread
From: Jianan Wang @ 2023-11-10 8:48 UTC
To: linux-xfs
One more note: at the time of the kernel panic and reboot, the system still had plenty of free memory. More detailed memory stats collected from our monitoring system are as follows:
Active_anon 413437952
Active_bytes 71427657728
Active_file 71014219776
AnonHugePages_bytes 0
AnonPages_bytes 8740204544
Bounce_bytes 0
Buffers_bytes 72273920
Cached_bytes 181501706240
CommitLimit_bytes 270333632512
Committed_AS 32973795328
DirectMap1G_bytes 9663676416
DirectMap2M_bytes 41498443776
DirectMap4k_bytes 500373483520
Dirty_bytes 32768
FileHugePages_bytes 0
FilePmdMapped_bytes 0
HardwareCorrupted_bytes 0
HugePages_Free 0
HugePages_Rsvd 0
HugePages_Surp 0
HugePages_Total 0
Hugepagesize_bytes 2097152
Hugetlb_bytes 0
Inactive_anon 9190891520
Inactive_bytes 119080124416
Inactive_file 109889232896
KReclaimable_bytes 35071422464
KernelStack_bytes 66338816
Mapped_bytes 3630841856
MemAvailable_bytes 348949958656
MemFree_bytes 136491732992
MemTotal_bytes 540667269120
Mlocked_bytes 19243008
NFS_Unstable 0
PageTables_bytes 72445952
Percpu_bytes 253624320
SReclaimable_bytes 35071422464
SUnreclaim_bytes 9349984256
ShmemHugePages_bytes 0
ShmemPmdMapped_bytes 0
Shmem_bytes 741265408
Slab_bytes 44421406720
SwapCached_bytes 0
SwapFree_bytes 0
SwapTotal_bytes 0
Unevictable_bytes 19243008
VmallocChunk_bytes 0
VmallocTotal_bytes 35184372087808
VmallocUsed_bytes 739246080
WritebackTmp_bytes 0
Writeback_bytes 0
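To put the byte counts above into perspective, here is a small sketch that converts a few of them to GiB and to a share of MemTotal. The values are copied from the list above; the helper names are made up for illustration.

```python
# Values copied from the stats above (all in bytes).
stats = {
    "MemTotal_bytes": 540667269120,
    "MemFree_bytes": 136491732992,
    "MemAvailable_bytes": 348949958656,
    "Slab_bytes": 44421406720,
}

GIB = 1024 ** 3

def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB

def share_of_total(name):
    """Fraction of MemTotal taken by the named counter."""
    return stats[name] / stats["MemTotal_bytes"]

for name in ("MemFree_bytes", "MemAvailable_bytes", "Slab_bytes"):
    print(f"{name}: {to_gib(stats[name]):.1f} GiB "
          f"({share_of_total(name):.0%} of MemTotal)")
```

By this arithmetic roughly a quarter of memory was completely free and well over half was available, which is consistent with the panic not being a plain out-of-memory condition.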
* Re: Question on xfs related kernel panic
2023-11-10 8:14 Question on xfs related kernel panic Jianan Wang
2023-11-10 8:48 ` Jianan Wang
@ 2023-11-10 19:34 ` Darrick J. Wong
2023-11-10 22:23 ` Jianan Wang
1 sibling, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2023-11-10 19:34 UTC
To: Jianan Wang; +Cc: linux-xfs
On Fri, Nov 10, 2023 at 12:14:45AM -0800, Jianan Wang wrote:
> Hi all,
>
> I have a question about a kernel panic that caused our server to reboot. The stack trace is as follows (copied from /var/lib/systemd/pstore/*):
>
> <4>[888969.888666] general protection fault, probably for non-canonical address 0xbf5bc9c369fd38ba: 0000 [#1] SMP PTI
> <4>[888969.891355] CPU: 47 PID: 2662145 Comm: find Tainted: P OE 5.15.0-46-generic #49~20.04.1-Ubuntu
Please open a support case with your vendor for this issue with their
kernel.
--D
* Re: Question on xfs related kernel panic
2023-11-10 19:34 ` Darrick J. Wong
@ 2023-11-10 22:23 ` Jianan Wang
2023-11-21 11:20 ` Carlos Maiolino
2023-11-21 16:08 ` Christoph Hellwig
0 siblings, 2 replies; 6+ messages in thread
From: Jianan Wang @ 2023-11-10 22:23 UTC
To: Darrick J. Wong; +Cc: linux-xfs
Hi Darrick,
Thanks for your response. I will open a case with Ubuntu for this issue. However, could you give me a hint about what might be wrong? A failure inside kmalloc seems pretty severe. Could it be caused by kernel memory corruption from some kernel module?
Thanks.
Jianan.
* Re: Question on xfs related kernel panic
2023-11-10 22:23 ` Jianan Wang
@ 2023-11-21 11:20 ` Carlos Maiolino
2023-11-21 16:08 ` Christoph Hellwig
1 sibling, 0 replies; 6+ messages in thread
From: Carlos Maiolino @ 2023-11-21 11:20 UTC
To: Jianan Wang; +Cc: Darrick J. Wong, linux-xfs
On Fri, Nov 10, 2023 at 02:23:53PM -0800, Jianan Wang wrote:
> Hi Darrick,
>
> Thanks for your response. I will open a case with Ubuntu for this issue. However, could you give me a hint about what might be wrong? A failure inside kmalloc seems pretty severe. Could it be caused by kernel memory corruption from some kernel module?
It seems that kmalloc's internal mechanisms are trying to access invalid memory.
>
> Thanks.
> Jianan.
>
> On 11/10/23 11:34, Darrick J. Wong wrote:
> > On Fri, Nov 10, 2023 at 12:14:45AM -0800, Jianan Wang wrote:
> >> Hi all,
> >>
> >> I have a question about a kernel panic that caused our server to reboot. The stack trace is as follows (copied from /var/lib/systemd/pstore/*):
> >>
> >> <4>[888969.888666] general protection fault, probably for non-canonical address 0xbf5bc9c369fd38ba: 0000 [#1] SMP PTI
^^^ Smells like memory corruption
> >> <4>[888969.891355] CPU: 47 PID: 2662145 Comm: find Tainted: P OE 5.15.0-46-generic #49~20.04.1-Ubuntu
^^^ Loaded proprietary modules are a big red flag when
you see this kind of weird memory corruption.
This doesn't seem to be XFS-related at all. I'd point my finger at whatever you
have loaded in your kernel, but as Darrick said, this is beyond the scope of
this list.
Carlos
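To illustrate what "non-canonical address" means in the fault message, here is a minimal sketch that checks the register values from the oops. It assumes 48-bit virtual addresses (4-level paging), where bits 63 down to 47 must all be equal; the function name is made up for illustration and the register values are copied from the report.

```python
def is_canonical(addr, va_bits=48):
    """Return True if a 64-bit address is canonical for the given VA width."""
    top = addr >> (va_bits - 1)               # bits 63..(va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1  # same number of bits, all set
    return top == 0 or top == all_ones

RAX = 0xbf5bc9c369fd38ba  # faulting address reported in the oops
R14 = 0xbf5bc9c369fd389a
R13 = 0xffff976080042500  # an ordinary-looking kernel pointer from the same dump

print(is_canonical(RAX))  # RAX is not a valid pointer under this assumption
print(is_canonical(R13))
print(hex(RAX - R14))     # RAX is R14 plus a small offset
```

The check only shows that the value the allocator tried to dereference is not a valid pointer at all. It says nothing about what wrote that value, which is why a corrupted allocator structure is a guess from the register dump rather than a diagnosis.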
* Re: Question on xfs related kernel panic
2023-11-10 22:23 ` Jianan Wang
2023-11-21 11:20 ` Carlos Maiolino
@ 2023-11-21 16:08 ` Christoph Hellwig
1 sibling, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2023-11-21 16:08 UTC
To: Jianan Wang; +Cc: Darrick J. Wong, linux-xfs
On Fri, Nov 10, 2023 at 02:23:53PM -0800, Jianan Wang wrote:
> Hi Darrick,
>
> Thanks for your response. I will open a case with Ubuntu for this issue. However, could you give me a hint about what might be wrong? A failure inside kmalloc seems pretty severe. Could it be caused by kernel memory corruption from some kernel module?
The P taint suggests you have a proprietary module loaded. Nothing
is even remotely supportable in that case.
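For reference, the taint letters can be read straight off the "Tainted:" field of the oops line. The sketch below decodes only the three letters that appear in this report (P, O, E), with meanings taken from the kernel's documented taint flags; the function name and the regular expression are assumptions made for illustration.

```python
import re

# Meanings of the three taint letters seen in this report.
TAINT_LETTERS = {
    "P": "proprietary module was loaded",
    "O": "out-of-tree (externally built) module was loaded",
    "E": "unsigned module was loaded",
}

def decode_taint(oops_line):
    """Return (letter, meaning) pairs from the 'Tainted:' field of an oops line."""
    # The taint field sits between "Tainted:" and the kernel release string.
    m = re.search(r"Tainted:\s+(.*?)\s+\d+\.\d+", oops_line)
    if not m:
        return []
    letters = [c for c in m.group(1) if c.isalpha()]
    return [(c, TAINT_LETTERS.get(c, "not covered by this sketch")) for c in letters]

line = ("<4>[888969.891355] CPU: 47 PID: 2662145 Comm: find "
        "Tainted: P OE 5.15.0-46-generic #49~20.04.1-Ubuntu")
for letter, meaning in decode_taint(line):
    print(letter, "-", meaning)
```

All three flags in this report point at modules from outside the vendor kernel.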