* [6.15-rc2 regression] xfs: assertion failed in inode allocation
From: Dave Chinner @ 2025-04-16 4:49 UTC
To: linux-xfs
Hi folks,
After upgrading to a current TOT kernel from 6.15-rc1, I'm now
seeing these assert failures during inode allocation when running
check-parallel:
[ 355.630225] XFS: Assertion failed: freecount == to_perag(cur->bc_group)->pagi_freecount, file: fs/xfs/libxfs/xfs_ialloc.c, line: 280
[ 355.630301] ------------[ cut here ]------------
[ 355.630302] kernel BUG at fs/xfs/xfs_message.c:102!
[ 355.630310] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 355.630315] CPU: 16 UID: 0 PID: 1167750 Comm: touch Not tainted 6.15.0-rc2-dgc+ #311 PREEMPT(full)
[ 355.630318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 355.630320] RIP: 0010:assfail+0x3a/0x40
[ 355.630330] Code: 89 f1 48 89 fe 48 c7 c7 bc bf ed 82 48 c7 c2 91 4e e8 82 e8 c8 fc ff ff 80 3d 01 03 51 03 01 74 09 0f 0b 5d c3 cc cc cc cc cc <0f> 0b 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
[ 355.630332] RSP: 0018:ffffc9001c75b930 EFLAGS: 00010246
[ 355.630335] RAX: 5e6224bd563a9c00 RBX: ffff888853518f30 RCX: 5e6224bd563a9c00
[ 355.630336] RDX: ffffc9001c75b7f8 RSI: 000000000000000a RDI: ffffffff82edbfbc
[ 355.630337] RBP: ffffc9001c75b930 R08: 0000000000000000 R09: 000000000000000a
[ 355.630338] R10: 0000000000000000 R11: 0000000000000021 R12: 0000000000000000
[ 355.630339] R13: 0000000000000000 R14: ffffc9001c75b948 R15: ffffc9001c75b944
[ 355.630341] FS: 00007f07d8216740(0000) GS:ffff88909a489000(0000) knlGS:0000000000000000
[ 355.630343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 355.630345] CR2: 00007f07d82fbca0 CR3: 0000000905200000 CR4: 0000000000350ef0
[ 355.630348] Call Trace:
[ 355.630351] <TASK>
[ 355.630353] xfs_check_agi_freecount+0xf1/0x100
[ 355.630358] xfs_dialloc_ag_inobt+0xd5/0x8a0
[ 355.630360] ? xfs_ialloc_read_agi+0x43/0x1b0
[ 355.630362] xfs_dialloc+0x362/0x8e0
[ 355.630363] ? xfs_trans_alloc+0x13c/0x240
[ 355.630368] ? xfs_trans_alloc_icreate+0xa0/0x150
[ 355.630370] xfs_create+0x1d4/0x430
[ 355.630374] ? __get_acl+0x29/0x1d0
[ 355.630379] xfs_generic_create+0x141/0x3e0
[ 355.630381] xfs_vn_create+0x14/0x20
[ 355.630382] path_openat+0x50e/0xe30
[ 355.630386] do_filp_open+0xbc/0x170
[ 355.630388] ? kmem_cache_alloc_noprof+0x188/0x320
[ 355.630393] ? getname_flags+0x47/0x1e0
[ 355.630395] ? _raw_spin_unlock+0xe/0x30
[ 355.630400] ? alloc_fd+0x165/0x190
[ 355.630404] do_sys_openat2+0x75/0xd0
[ 355.630407] __x64_sys_openat+0x7d/0xa0
[ 355.630408] x64_sys_call+0x1b2/0x2f60
[ 355.630413] do_syscall_64+0x68/0x130
[ 355.630415] ? exc_page_fault+0x62/0xc0
[ 355.630419] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 355.630420] RIP: 0033:0x7f07d831bc7c
[ 355.630423] Code: 83 e2 40 75 51 89 f0 f7 d0 a9 00 00 41 00 74 46 80 3d f7 c3 0e 00 00 74 6a 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 90 00 00 00 48 8b 54 24 28 64 48 2b 14 25
[ 355.630424] RSP: 002b:00007fffaddfbdc0 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
[ 355.630426] RAX: ffffffffffffffda RBX: 0000000000000941 RCX: 00007f07d831bc7c
[ 355.630427] RDX: 0000000000000941 RSI: 00007fffaddfe3c3 RDI: 00000000ffffff9c
[ 355.630427] RBP: 00007fffaddfe3c3 R08: 0000000000000000 R09: 0000000000000000
[ 355.630428] R10: 00000000000001b6 R11: 0000000000000202 R12: 000055ed247bfb50
[ 355.630429] R13: 000055ed247be11f R14: 00007fffaddfe3c3 R15: 0000000000000000
[ 355.630430] </TASK>
I'm running on x86-64, using mkfs defaults (4k block size) and no
mount options on the test/scratch devices, so this is as normal a
configuration as you can get. I see this at least every second high
concurrency run, but I haven't been able to isolate which test(s)
are causing it because the failure either does not occur or some
other block device related weirdness crops up when I run the tests
sequentially in a single task context.
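For readers unfamiliar with the assertion in the trace above: as far as I can tell from fs/xfs/libxfs/xfs_ialloc.c, xfs_check_agi_freecount() is a debug-build check that walks the inode btree records of an AG, sums the per-record free inode counts, and asserts that the total matches the cached per-AG counter pagi_freecount. The Python sketch below is a toy model of that invariant only, not the kernel code. The names InobtRecord, PerAG, sum_record_freecount and check_agi_freecount are hypothetical stand-ins, and the numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InobtRecord:
    """Hypothetical stand-in for one inode btree record (one inode chunk)."""
    start_ino: int   # first inode number covered by the chunk
    freecount: int   # number of free inodes in the chunk


@dataclass
class PerAG:
    """Hypothetical stand-in for the in-memory per-AG state."""
    pagi_freecount: int  # cached count of free inodes in the whole AG


def sum_record_freecount(records: List[InobtRecord]) -> int:
    """Walk every record and total the free inodes they report."""
    return sum(rec.freecount for rec in records)


def check_agi_freecount(records: List[InobtRecord], pag: PerAG) -> bool:
    """Return True when the btree total agrees with the cached counter."""
    return sum_record_freecount(records) == pag.pagi_freecount


if __name__ == "__main__":
    records = [InobtRecord(start_ino=0, freecount=3),
               InobtRecord(start_ino=64, freecount=10)]
    # Consistent state: the cached counter matches the records.
    print(check_agi_freecount(records, PerAG(pagi_freecount=13)))
    # Inconsistent state: this is the condition the assert would trip on.
    print(check_agi_freecount(records, PerAG(pagi_freecount=12)))
```

A mismatch only says that the two counts disagree. It does not say whether the btree records or the cached counter is the stale side.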
-Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [6.15-rc2 regression] xfs: assertion failed in inode allocation
From: Darrick J. Wong @ 2025-04-16 5:30 UTC
To: Dave Chinner; +Cc: linux-xfs
On Wed, Apr 16, 2025 at 02:49:56PM +1000, Dave Chinner wrote:
> Hi folks,
>
> After upgrading to a current TOT kernel from 6.15-rc1, I'm now
> seeing these assert failures during inode allocation when running
> check-parallel:
>
> [ 355.630225] XFS: Assertion failed: freecount == to_perag(cur->bc_group)->pagi_freecount, file: fs/xfs/libxfs/xfs_ialloc.c, line: 280
I haven't seen this assertion tripping any more than it has in the past.
But I will say that I've seen a number of other problems, like page
state corruption, null pointer derefs from the block layer, and weird
behavior from the rest of the kernel. Turning off LBS support fixes a
lot of it. -rc2 doesn't seem to have fixed anything over -rc1.
--D
* Re: [6.15-rc2 regression] xfs: assertion failed in inode allocation
From: Dave Chinner @ 2025-04-16 6:39 UTC
To: Darrick J. Wong; +Cc: linux-xfs
On Tue, Apr 15, 2025 at 10:30:06PM -0700, Darrick J. Wong wrote:
> On Wed, Apr 16, 2025 at 02:49:56PM +1000, Dave Chinner wrote:
> > Hi folks,
> >
> > After upgrading to a current TOT kernel from 6.15-rc1, I'm now
> > seeing these assert failures during inode allocation when running
> > check-parallel:
> >
> > [ 355.630225] XFS: Assertion failed: freecount == to_perag(cur->bc_group)->pagi_freecount, file: fs/xfs/libxfs/xfs_ialloc.c, line: 280
>
> I haven't seen this assertion tripping any more than it has in the past.
>
> But I will say that I've seen a number of other problems, like page
> state corruption, null pointer derefs from the block layer, and weird
> behavior from the rest of the kernel. Turning off LBS support fixes a
> lot of it. -rc2 doesn't seem to have fixed anything over -rc1.
Yeah, that's pretty much my experience with the current TOT kernel -
it seems to be randomly corrupting block devices, failing to detect
filesystem features correctly (e.g. generic/577 seems to think that
XFS supports fsverity, then it fork bombs the machine with thousands
of child processes that never get reaped), stuff hanging in weird
places (e.g. udevadm getting stuck waiting on things that have
already happened) and so on.
I would not be running 6.15-rcX on anything important right now...
-Dave.
--
Dave Chinner
david@fromorbit.com