* flakey assert failures in xfs/538 in for-next
@ 2025-07-16 12:13 Christoph Hellwig
2025-07-16 15:38 ` Darrick J. Wong
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2025-07-16 12:13 UTC (permalink / raw)
To: Carlos Maiolino; +Cc: linux-xfs, Fedor Pchelkin
Hi all,
I'm seeing assert failures in xfs/538 in for-next when using 1k file
systems. Unfortunately the errors are a bit flaky: two days ago I had
a streak where I could reproduce them pretty easily, and the bisection
landed at:

"xfs: refactor xfs_btree_diff_two_ptrs() to take advantage of cmp_int()"

but trying to reproduce it again yesterday mostly failed, with just
a single occurrence of the failure in many runs. Below is the
assert output, which suggests that xfs_bmapi_write gets something
wrong in the accounting, in case it rings a bell for someone:

[ 6062.095597] XFS (vdc): Injecting error (false) at file fs/xfs/libxfs/xfs_bmap.c, line 3665, on filesystem "vdc"
[ 6062.355716] XFS: Assertion failed: pathlen == 0, file: fs/xfs/libxfs/xfs_symlink_remote.c, line: 383
[ 6062.356258] ------------[ cut here ]------------
[ 6062.356502] kernel BUG at fs/xfs/xfs_message.c:102!
[ 6062.356761] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 6062.357027] CPU: 1 UID: 0 PID: 1002774 Comm: fsstress Not tainted 6.16.0-rc2+ #1286 PREEMPT(full)
[ 6062.357481] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 6062.358024] RIP: 0010:assfail+0x2c/0x35
[ 6062.358229] Code: 1f 00 49 89 d0 41 89 c9 48 c7 c2 f0 2a 1a 83 48 89 f1 48 89 fe 48 c7 c7 8f 47 24 83 e8 fd fd ff ff 80 3d 1e 57 a4c
[ 6062.361574] RSP: 0018:ffff8881d6a53c80 EFLAGS: 00010202
[ 6062.361951] RAX: 0000000000000000 RBX: ffff88813bb6ee80 RCX: 000000007fffffff
[ 6062.362701] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8324478f
[ 6062.363427] RBP: ffff8881026ee000 R08: 0000000000000000 R09: 000000000000000a
[ 6062.363756] R10: 000000000000000a R11: 0fffffffffffffff R12: 000000000000001f
[ 6062.364254] R13: 0000000000000001 R14: 00000000000003c8 R15: 00000000000003c8
[ 6062.364718] FS: 00007f6c9b5e1040(0000) GS:ffff8882b3418000(0000) knlGS:0000000000000000
[ 6062.365347] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6062.365906] CR2: 00007f6c9b7df000 CR3: 00000001f456d005 CR4: 0000000000770ef0
[ 6062.366424] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6062.366909] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 6062.367395] PKRU: 55555554
[ 6062.367593] Call Trace:
[ 6062.367777] <TASK>
[ 6062.367938] xfs_symlink_write_target+0x2c5/0x2d0
[ 6062.368282] ? xfs_diflags_to_iflags+0x14/0x100
[ 6062.368626] ? preempt_count_add+0x73/0xb0
[ 6062.368898] xfs_symlink+0x41d/0x520
[ 6062.369181] xfs_vn_symlink+0x8a/0x1b0
[ 6062.369446] vfs_symlink+0x10a/0x180
[ 6062.369765] do_symlinkat+0x104/0x130
[ 6062.370061] __x64_sys_symlink+0x32/0x40
[ 6062.370399] do_syscall_64+0x50/0x1d0
[ 6062.370659] entry_SYSCALL_64_after_hwframe+0x76/0x7e
* Re: flakey assert failures in xfs/538 in for-next
From: Darrick J. Wong @ 2025-07-16 15:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Carlos Maiolino, linux-xfs, Fedor Pchelkin
On Wed, Jul 16, 2025 at 02:13:39PM +0200, Christoph Hellwig wrote:
> Hi all,
>
> I'm seeing assert failures in xfs/538 in for-next when using 1k file
> systems. Unfortunately the errors are a bit flaky, two days ago I had
> a streak where I could reproduce them pretty easily and the bisection
> landed at:
>
> "xfs: refactor xfs_btree_diff_two_ptrs() to take advantage of cmp_int()"
O^o
> but trying to reproduce it again yesterday mostly failed, with just
> a single occurrence of the failure in many runs. Below is the
> assert output, which suggests that xfs_bmapi_write gets something
> wrong in the accounting in case it rings a bell for someone:
>
> [ 6062.095597] XFS (vdc): Injecting error (false) at file fs/xfs/libxfs/xfs_bmap.c, line 3665, on filesystem "vdc"
> [ 6062.355716] XFS: Assertion failed: pathlen == 0, file: fs/xfs/libxfs/xfs_symlink_remote.c, line: 383

I've seen this happen maybe once or twice, I think the problem is that
the symlink xfs_bmapi_write fails to allocate enough blocks to store the
symlink target, doesn't notice, and then the actual target write runs
out of blocks before it runs out of pathlen and kaboom.

Probably the right answer is to ENOSPC if we can't allocate blocks, but
I guess we did reserve free space so perhaps we just keep bmapi'ing
until we get all the space we need?

The weird part is that XFS_SYMLINK_MAPS should be large enough to fit
all the target we need, so ... I don't know if bmapi_write is returning
fewer than 3 nmaps because it hit ENOSPC or what?

(and because I can't reproduce it reliably, I have not investigated
further :()
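
A toy user-space model of the failure mode described above may make the accounting easier to see. This is only a sketch, not the kernel code: `struct toy_map`, `toy_write_target()`, and the block and header sizes passed in are hypothetical names and values chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for one mapping handed back by the allocator. */
struct toy_map {
	size_t blockcount;	/* number of filesystem blocks in this extent */
};

/*
 * Walk the mappings and "write" the symlink target into them, one block at
 * a time.  Returns the number of target bytes that did not fit; a nonzero
 * result corresponds to the "pathlen == 0" assertion firing.
 *
 * blocksize and hdrsize are illustrative parameters, not kernel constants.
 */
static size_t toy_write_target(size_t pathlen, const struct toy_map *maps,
			       int nmaps, size_t blocksize, size_t hdrsize)
{
	size_t payload = blocksize - hdrsize;	/* usable bytes per block */

	for (int n = 0; n < nmaps; n++) {
		for (size_t b = 0; b < maps[n].blockcount; b++) {
			size_t chunk = pathlen < payload ? pathlen : payload;

			pathlen -= chunk;
			if (pathlen == 0)
				return 0;
		}
	}
	return pathlen;	/* ran out of blocks before running out of path */
}
```

Assuming a 1024-byte block with a 56-byte header, a 1000-byte target needs two blocks. If the caller only got a single one-block mapping and did not notice, `toy_write_target(1000, maps, 1, 1024, 56)` leaves 32 bytes unwritten, which is the shape of the failure in the trace.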
--D
> [ 6062.356258] ------------[ cut here ]------------
> [ 6062.356502] kernel BUG at fs/xfs/xfs_message.c:102!
> [ 6062.356761] Oops: invalid opcode: 0000 [#1] SMP NOPTI
> [ 6062.357027] CPU: 1 UID: 0 PID: 1002774 Comm: fsstress Not tainted 6.16.0-rc2+ #1286 PREEMPT(full)
> [ 6062.357481] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 6062.358024] RIP: 0010:assfail+0x2c/0x35
> [ 6062.358229] Code: 1f 00 49 89 d0 41 89 c9 48 c7 c2 f0 2a 1a 83 48 89 f1 48 89 fe 48 c7 c7 8f 47 24 83 e8 fd fd ff ff 80 3d 1e 57 a4c
> [ 6062.361574] RSP: 0018:ffff8881d6a53c80 EFLAGS: 00010202
> [ 6062.361951] RAX: 0000000000000000 RBX: ffff88813bb6ee80 RCX: 000000007fffffff
> [ 6062.362701] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8324478f
> [ 6062.363427] RBP: ffff8881026ee000 R08: 0000000000000000 R09: 000000000000000a
> [ 6062.363756] R10: 000000000000000a R11: 0fffffffffffffff R12: 000000000000001f
> [ 6062.364254] R13: 0000000000000001 R14: 00000000000003c8 R15: 00000000000003c8
> [ 6062.364718] FS: 00007f6c9b5e1040(0000) GS:ffff8882b3418000(0000) knlGS:0000000000000000
> [ 6062.365347] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 6062.365906] CR2: 00007f6c9b7df000 CR3: 00000001f456d005 CR4: 0000000000770ef0
> [ 6062.366424] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 6062.366909] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> [ 6062.367395] PKRU: 55555554
> [ 6062.367593] Call Trace:
> [ 6062.367777] <TASK>
> [ 6062.367938] xfs_symlink_write_target+0x2c5/0x2d0
> [ 6062.368282] ? xfs_diflags_to_iflags+0x14/0x100
> [ 6062.368626] ? preempt_count_add+0x73/0xb0
> [ 6062.368898] xfs_symlink+0x41d/0x520
> [ 6062.369181] xfs_vn_symlink+0x8a/0x1b0
> [ 6062.369446] vfs_symlink+0x10a/0x180
> [ 6062.369765] do_symlinkat+0x104/0x130
> [ 6062.370061] __x64_sys_symlink+0x32/0x40
> [ 6062.370399] do_syscall_64+0x50/0x1d0
> [ 6062.370659] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
* Re: flakey assert failures in xfs/538 in for-next
From: Christoph Hellwig @ 2025-07-16 16:02 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Christoph Hellwig, Carlos Maiolino, linux-xfs, Fedor Pchelkin,
Chandan Babu R
On Wed, Jul 16, 2025 at 08:38:12AM -0700, Darrick J. Wong wrote:
> I've seen this happen maybe once or twice, I think the problem is that
> the symlink xfs_bmapi_write fails to allocate enough blocks to store the
> symlink target, doesn't notice, and then the actual target write runs
> out of blocks before it runs out of pathlen and kaboom.
>
> Probably the right answer is to ENOSPC if we can't allocate blocks, but
> I guess we did reserve free space so perhaps we just keep bmapi'ing
> until we get all the space we need?
>
> The weird part is that XFS_SYMLINK_MAPS should be large enough to fit
> all the target we need, so ... I don't know if bmapi_write is returning
> fewer than 3 nmaps because it hit ENOSPC or what?
>
> (and because I can't reproduce it reliably, I have not investigated
> further :()

I guess the recent cleanups are not to blame then, or just slightly
changed the timing enough to give me a streak of frequent hits.

xfs/538 is the alloc minlen test that injects getting back the minlen
or failing allocations if minlen > 1. I guess that interacts badly
somehow with the rather uncommon multi-map allocations. The only
other one is xfs_da_grow_inode_int, and that only for directories
with a larger directory block size, and as a fallback when the contig
allocation fails. It might be worth crafting a test doing a lot
of symlinking while doing that error injection to trigger it more
reliably.
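
To make the suspected interaction concrete, here is a toy user-space model of a multi-map allocation under a "return only minlen" injection, plus the "keep allocating until we have everything" loop floated earlier in the thread. It makes no claim about how xfs_bmapi_write actually behaves; `struct toy_alloc`, `toy_alloc_extent()`, `toy_alloc_once()`, and `toy_alloc_all()` are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator state for the model. */
struct toy_alloc {
	size_t free_blocks;	/* blocks still available */
	int inject_minlen;	/* nonzero: hand back only minlen per call */
};

/* Allocate up to 'want' blocks; returns how many were obtained (0 = none). */
static size_t toy_alloc_extent(struct toy_alloc *a, size_t want, size_t minlen)
{
	size_t got = a->inject_minlen ? minlen : want;

	if (got > a->free_blocks)
		got = a->free_blocks;
	if (got < minlen)
		return 0;
	a->free_blocks -= got;
	return got;
}

/* One pass limited to max_maps mappings; the result may be short of 'need'. */
static size_t toy_alloc_once(struct toy_alloc *a, size_t need, int max_maps)
{
	size_t total = 0;

	for (int n = 0; n < max_maps && total < need; n++) {
		size_t got = toy_alloc_extent(a, need - total, 1);

		if (!got)
			break;
		total += got;
	}
	return total;
}

/* Repeat passes until 'need' is met; report -ENOSPC if a pass gets nothing. */
static int toy_alloc_all(struct toy_alloc *a, size_t need, int max_maps)
{
	size_t total = 0;

	while (total < need) {
		size_t got = toy_alloc_once(a, need - total, max_maps);

		if (!got)
			return -ENOSPC;
		total += got;
	}
	return 0;
}
```

In this model, a single pass with the injection on and three maps returns three blocks even when five are needed, so a caller that does not compare the result against its request silently ends up short. The looping variant either gets the full amount or fails cleanly with -ENOSPC.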
* Re: flakey assert failures in xfs/538 in for-next
From: Chandan Babu R @ 2025-07-18 12:19 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Darrick J. Wong, Carlos Maiolino, linux-xfs, Fedor Pchelkin
On Wed, Jul 16, 2025 at 06:02:34 PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 16, 2025 at 08:38:12AM -0700, Darrick J. Wong wrote:
>> I've seen this happen maybe once or twice, I think the problem is that
>> the symlink xfs_bmapi_write fails to allocate enough blocks to store the
>> symlink target, doesn't notice, and then the actual target write runs
>> out of blocks before it runs out of pathlen and kaboom.
>>
>> Probably the right answer is to ENOSPC if we can't allocate blocks, but
>> I guess we did reserve free space so perhaps we just keep bmapi'ing
>> until we get all the space we need?
>>
>> The weird part is that XFS_SYMLINK_MAPS should be large enough to fit
>> all the target we need, so ... I don't know if bmapi_write is returning
>> fewer than 3 nmaps because it hit ENOSPC or what?
>>
>> (and because I can't reproduce it reliably, I have not investigated
>> further :()

I think you are right. Most likely we were able to successfully allocate fewer
than XFS_SYMLINK_MAPS (i.e. 3) extents and the next allocation only found free
extents whose lengths were larger than 1 FSB.

The test fills 90% of the filesystem and then punches holes at every other
block used by each of the "filler" files. So the filesystem could have some
"free extents" whose size is larger than 1 FSB. These larger free extents
allowed the block reservation to succeed.

During the test run, we could have consumed all the 1 FSB sized free extents
and hence a later allocation attempt can fail since we were trying to allocate
only a 1 FSB sized extent.
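
The gap between "the reservation succeeded" and "the allocation failed" can be sketched with a toy free-extent list. This is an assumption-laden illustration, not the XFS allocator: `toy_reserve_ok()` and `toy_alloc_exact()` are hypothetical, and the model assumes the injected allocation only accepts a free extent of exactly the requested length.

```c
#include <stddef.h>

/* Reservation only looks at the total number of free blocks. */
static int toy_reserve_ok(const size_t *ext, size_t n, size_t need)
{
	size_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += ext[i];
	return sum >= need;
}

/*
 * Exact-length allocation (model assumption): succeed only if some free
 * extent is exactly 'len' blocks long.  Returns its index, or -1 if no
 * such extent exists.  A consumed extent is marked with length 0.
 */
static int toy_alloc_exact(size_t *ext, size_t n, size_t len)
{
	for (size_t i = 0; i < n; i++) {
		if (ext[i] == len) {
			ext[i] = 0;
			return (int)i;
		}
	}
	return -1;
}
```

With free extents of 1, 2, and 3 blocks, a two-block reservation passes. The first 1-block exact allocation consumes the only 1-block extent; the second one fails even though five free blocks remain and the reservation check would still pass.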
>
> I guess the recent cleanups are not to blame then, or just slightly
> changed the timing for me to have a streak to frequently hit it.
>
> xfs/538 is the alloc minlen test that injects getting back the minlen
> or failing allocations if minlen > 1. I guess that interacts badly
> somehow with the rather uncommon multi-map allocations. The only
> other one is xfs_da_grow_inode_int, and that only for directories
> with a larger directory block size, and as a fallback when the contig
> allocation fails. It might be worth crafting a test doing a lot
> of symlinking while doing that error injection to trigger it more
> reliably.

I have modified xfs/538 to perform only write* and symlink
operations. Unfortunately, the test hasn't failed yet despite running for 27
iterations. I will let it run during the weekend.
--
Chandan