[PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"
@ 2025-05-23  4:37 Alistair Popple
  2025-05-26 14:43 ` Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Alistair Popple @ 2025-05-23  4:37 UTC (permalink / raw)
  To: akpm, linux-fsdevel, nvdimm, dan.j.williams, willy
  Cc: linux-kernel, Alistair Popple, Alison Schofield, Balbir Singh,
	Darrick J. Wong, Dave Chinner, David Hildenbrand, Jan Kara,
	John Hubbard, Ted Ts'o, Alexander Viro, Christian Brauner

Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
entries") introduced a new function, wait_entry_unlocked_exclusive(),
which waits for the current entry to become unlocked without advancing
the XArray iterator state.

Waiting for the entry to become unlocked requires dropping the XArray
lock. This requires calling xas_pause() prior to dropping the lock
which leaves the xas in a suitable state for the next iteration. However
this has the side-effect of advancing the xas state to the next index.
Normally this isn't an issue because xas_for_each() contains code to
detect this state and thus avoid advancing the index a second time on
the next loop iteration.

However both callers of and wait_entry_unlocked_exclusive() itself
subsequently use the xas state to reload the entry. As xas_pause()
updated the state to the next index this will cause the current entry
which is being waited on to be skipped. This caused the following
warning to fire intermittently when running xftest generic/068 on an XFS
filesystem with FS DAX enabled:

[   35.067397] ------------[ cut here ]------------
[   35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0
[   35.069717] Modules linked in: nd_pmem dax_pmem nd_btt nd_e820 libnvdimm
[   35.071006] CPU: 21 UID: 0 PID: 1640 Comm: fstest Not tainted 6.15.0-rc7+ #77 PREEMPT(voluntary)
[   35.072613] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/204
[   35.074845] RIP: 0010:truncate_folio_batch_exceptionals+0xd8/0x1e0
[   35.075962] Code: a1 00 00 00 f6 47 0d 20 0f 84 97 00 00 00 4c 63 e8 41 39 c4 7f 0b eb 61 49 83 c5 01 45 39 ec 7e 58 42 f68
[   35.079522] RSP: 0018:ffffb04e426c7850 EFLAGS: 00010202
[   35.080359] RAX: 0000000000000000 RBX: ffff9d21e3481908 RCX: ffffb04e426c77f4
[   35.081477] RDX: ffffb04e426c79e8 RSI: ffffb04e426c79e0 RDI: ffff9d21e34816e8
[   35.082590] RBP: ffffb04e426c79e0 R08: 0000000000000001 R09: 0000000000000003
[   35.083733] R10: 0000000000000000 R11: 822b53c0f7a49868 R12: 000000000000001f
[   35.084850] R13: 0000000000000000 R14: ffffb04e426c78e8 R15: fffffffffffffffe
[   35.085953] FS:  00007f9134c87740(0000) GS:ffff9d22abba0000(0000) knlGS:0000000000000000
[   35.087346] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   35.088244] CR2: 00007f9134c86000 CR3: 000000040afff000 CR4: 00000000000006f0
[   35.089354] Call Trace:
[   35.089749]  <TASK>
[   35.090168]  truncate_inode_pages_range+0xfc/0x4d0
[   35.091078]  truncate_pagecache+0x47/0x60
[   35.091735]  xfs_setattr_size+0xc7/0x3e0
[   35.092648]  xfs_vn_setattr+0x1ea/0x270
[   35.093437]  notify_change+0x1f4/0x510
[   35.094219]  ? do_truncate+0x97/0xe0
[   35.094879]  do_truncate+0x97/0xe0
[   35.095640]  path_openat+0xabd/0xca0
[   35.096278]  do_filp_open+0xd7/0x190
[   35.096860]  do_sys_openat2+0x8a/0xe0
[   35.097459]  __x64_sys_openat+0x6d/0xa0
[   35.098076]  do_syscall_64+0xbb/0x1d0
[   35.098647]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   35.099444] RIP: 0033:0x7f9134d81fc1
[   35.100033] Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 2a 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff5
[   35.102993] RSP: 002b:00007ffcd41e0d10 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
[   35.104263] RAX: ffffffffffffffda RBX: 0000000000000242 RCX: 00007f9134d81fc1
[   35.105452] RDX: 0000000000000242 RSI: 00007ffcd41e1200 RDI: 00000000ffffff9c
[   35.106663] RBP: 00007ffcd41e1200 R08: 0000000000000000 R09: 0000000000000064
[   35.107923] R10: 00000000000001a4 R11: 0000000000000202 R12: 0000000000000066
[   35.109112] R13: 0000000000100000 R14: 0000000000100000 R15: 0000000000000400
[   35.110357]  </TASK>
[   35.110769] irq event stamp: 8415587
[   35.111486] hardirqs last  enabled at (8415599): [<ffffffff8d74b562>] __up_console_sem+0x52/0x60
[   35.113067] hardirqs last disabled at (8415610): [<ffffffff8d74b547>] __up_console_sem+0x37/0x60
[   35.114575] softirqs last  enabled at (8415300): [<ffffffff8d6ac625>] handle_softirqs+0x315/0x3f0
[   35.115933] softirqs last disabled at (8415291): [<ffffffff8d6ac811>] __irq_exit_rcu+0xa1/0xc0
[   35.117316] ---[ end trace 0000000000000000 ]---

Fix this by using xas_reset() instead, which is equivalent in
implementation to xas_pause() but does not advance the XArray state.

Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>

---

Hi Andrew,

Apologies for finding this so late in the cycle. This is a very
intermittent issue for me, and it seems it was only exposed by a recent
upgrade to my test machine/setup. The user visible impact is the same
as for the original commit this fixes. That is possible file data
corruption if a device has a FS DAX page pinned for DMA.

So in other words it means my original fix was not 100% effective.
The issue that commit fixed has existed for a long time without being
reported, so not sure if this is worth trying to get into v6.15 or not.

Either way I figured it would be best to send this ASAP, which means I
am still waiting for a complete xfstest run to complete (although the
failing test does now pass cleanly).
---
 fs/dax.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 676303419e9e..f8d8b1afd232 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -257,7 +257,7 @@ static void *wait_entry_unlocked_exclusive(struct xa_state *xas, void *entry)
 		wq = dax_entry_waitqueue(xas, entry, &ewait.key);
 		prepare_to_wait_exclusive(wq, &ewait.wait,
 					TASK_UNINTERRUPTIBLE);
-		xas_pause(xas);
+		xas_reset(xas);
 		xas_unlock_irq(xas);
 		schedule();
 		finish_wait(wq, &ewait.wait);
-- 
2.47.2

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"
  2025-05-23  4:37 [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries" Alistair Popple
@ 2025-05-26 14:43 ` Jan Kara
  2025-05-27 21:46 ` Dan Williams
  2025-05-30  5:10 ` Christian Brauner
  2 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2025-05-26 14:43 UTC (permalink / raw)
  To: Alistair Popple
  Cc: akpm, linux-fsdevel, nvdimm, dan.j.williams, willy, linux-kernel,
	Alison Schofield, Balbir Singh, Darrick J. Wong, Dave Chinner,
	David Hildenbrand, Jan Kara, John Hubbard, Ted Ts'o,
	Alexander Viro, Christian Brauner

On Fri 23-05-25 14:37:49, Alistair Popple wrote:
> Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
> entries") introduced a new function, wait_entry_unlocked_exclusive(),
> which waits for the current entry to become unlocked without advancing
> the XArray iterator state.
> 
> Waiting for the entry to become unlocked requires dropping the XArray
> lock. This requires calling xas_pause() prior to dropping the lock
> which leaves the xas in a suitable state for the next iteration. However
> this has the side-effect of advancing the xas state to the next index.
> Normally this isn't an issue because xas_for_each() contains code to
> detect this state and thus avoid advancing the index a second time on
> the next loop iteration.
> 
> However both callers of and wait_entry_unlocked_exclusive() itself
> subsequently use the xas state to reload the entry. As xas_pause()
> updated the state to the next index this will cause the current entry
> which is being waited on to be skipped. This caused the following
> warning to fire intermittently when running xftest generic/068 on an XFS
> filesystem with FS DAX enabled:
> 
> [   35.067397] ------------[ cut here ]------------
> [   35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0
> [   35.069717] Modules linked in: nd_pmem dax_pmem nd_btt nd_e820 libnvdimm
> [   35.071006] CPU: 21 UID: 0 PID: 1640 Comm: fstest Not tainted 6.15.0-rc7+ #77 PREEMPT(voluntary)
> [   35.072613] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/204
> [   35.074845] RIP: 0010:truncate_folio_batch_exceptionals+0xd8/0x1e0
> [   35.075962] Code: a1 00 00 00 f6 47 0d 20 0f 84 97 00 00 00 4c 63 e8 41 39 c4 7f 0b eb 61 49 83 c5 01 45 39 ec 7e 58 42 f68
> [   35.079522] RSP: 0018:ffffb04e426c7850 EFLAGS: 00010202
> [   35.080359] RAX: 0000000000000000 RBX: ffff9d21e3481908 RCX: ffffb04e426c77f4
> [   35.081477] RDX: ffffb04e426c79e8 RSI: ffffb04e426c79e0 RDI: ffff9d21e34816e8
> [   35.082590] RBP: ffffb04e426c79e0 R08: 0000000000000001 R09: 0000000000000003
> [   35.083733] R10: 0000000000000000 R11: 822b53c0f7a49868 R12: 000000000000001f
> [   35.084850] R13: 0000000000000000 R14: ffffb04e426c78e8 R15: fffffffffffffffe
> [   35.085953] FS:  00007f9134c87740(0000) GS:ffff9d22abba0000(0000) knlGS:0000000000000000
> [   35.087346] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   35.088244] CR2: 00007f9134c86000 CR3: 000000040afff000 CR4: 00000000000006f0
> [   35.089354] Call Trace:
> [   35.089749]  <TASK>
> [   35.090168]  truncate_inode_pages_range+0xfc/0x4d0
> [   35.091078]  truncate_pagecache+0x47/0x60
> [   35.091735]  xfs_setattr_size+0xc7/0x3e0
> [   35.092648]  xfs_vn_setattr+0x1ea/0x270
> [   35.093437]  notify_change+0x1f4/0x510
> [   35.094219]  ? do_truncate+0x97/0xe0
> [   35.094879]  do_truncate+0x97/0xe0
> [   35.095640]  path_openat+0xabd/0xca0
> [   35.096278]  do_filp_open+0xd7/0x190
> [   35.096860]  do_sys_openat2+0x8a/0xe0
> [   35.097459]  __x64_sys_openat+0x6d/0xa0
> [   35.098076]  do_syscall_64+0xbb/0x1d0
> [   35.098647]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [   35.099444] RIP: 0033:0x7f9134d81fc1
> [   35.100033] Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 2a 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff5
> [   35.102993] RSP: 002b:00007ffcd41e0d10 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
> [   35.104263] RAX: ffffffffffffffda RBX: 0000000000000242 RCX: 00007f9134d81fc1
> [   35.105452] RDX: 0000000000000242 RSI: 00007ffcd41e1200 RDI: 00000000ffffff9c
> [   35.106663] RBP: 00007ffcd41e1200 R08: 0000000000000000 R09: 0000000000000064
> [   35.107923] R10: 00000000000001a4 R11: 0000000000000202 R12: 0000000000000066
> [   35.109112] R13: 0000000000100000 R14: 0000000000100000 R15: 0000000000000400
> [   35.110357]  </TASK>
> [   35.110769] irq event stamp: 8415587
> [   35.111486] hardirqs last  enabled at (8415599): [<ffffffff8d74b562>] __up_console_sem+0x52/0x60
> [   35.113067] hardirqs last disabled at (8415610): [<ffffffff8d74b547>] __up_console_sem+0x37/0x60
> [   35.114575] softirqs last  enabled at (8415300): [<ffffffff8d6ac625>] handle_softirqs+0x315/0x3f0
> [   35.115933] softirqs last disabled at (8415291): [<ffffffff8d6ac811>] __irq_exit_rcu+0xa1/0xc0
> [   35.117316] ---[ end trace 0000000000000000 ]---
> 
> Fix this by using xas_reset() instead, which is equivalent in
> implementation to xas_pause() but does not advance the XArray state.
> 
> Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: "Darrick J. Wong" <djwong@kernel.org>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Ted Ts'o <tytso@mit.edu>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Christian Brauner <brauner@kernel.org>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
> 
> Hi Andrew,
> 
> Apologies for finding this so late in the cycle. This is a very
> intermittent issue for me, and it seems it was only exposed by a recent
> upgrade to my test machine/setup. The user visible impact is the same
> as for the original commit this fixes. That is possible file data
> corruption if a device has a FS DAX page pinned for DMA.
> 
> So in other words it means my original fix was not 100% effective.
> The issue that commit fixed has existed for a long time without being
> reported, so not sure if this is worth trying to get into v6.15 or not.
> 
> Either way I figured it would be best to send this ASAP, which means I
> am still waiting for a complete xfstest run to complete (although the
> failing test does now pass cleanly).
> ---
>  fs/dax.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 676303419e9e..f8d8b1afd232 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -257,7 +257,7 @@ static void *wait_entry_unlocked_exclusive(struct xa_state *xas, void *entry)
>  		wq = dax_entry_waitqueue(xas, entry, &ewait.key);
>  		prepare_to_wait_exclusive(wq, &ewait.wait,
>  					TASK_UNINTERRUPTIBLE);
> -		xas_pause(xas);
> +		xas_reset(xas);
>  		xas_unlock_irq(xas);
>  		schedule();
>  		finish_wait(wq, &ewait.wait);
> -- 
> 2.47.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"
  2025-05-23  4:37 [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries" Alistair Popple
  2025-05-26 14:43 ` Jan Kara
@ 2025-05-27 21:46 ` Dan Williams
  2025-05-28  7:17   ` Alistair Popple
  2025-05-30  5:10 ` Christian Brauner
  2 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2025-05-27 21:46 UTC (permalink / raw)
  To: Alistair Popple, akpm, linux-fsdevel, nvdimm, dan.j.williams,
	willy
  Cc: linux-kernel, Alistair Popple, Alison Schofield, Balbir Singh,
	Darrick J. Wong, Dave Chinner, David Hildenbrand, Jan Kara,
	John Hubbard, Ted Ts'o, Alexander Viro, Christian Brauner

Alistair Popple wrote:
> Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
> entries") introduced a new function, wait_entry_unlocked_exclusive(),
> which waits for the current entry to become unlocked without advancing
> the XArray iterator state.
> 
> Waiting for the entry to become unlocked requires dropping the XArray
> lock. This requires calling xas_pause() prior to dropping the lock
> which leaves the xas in a suitable state for the next iteration. However
> this has the side-effect of advancing the xas state to the next index.
> Normally this isn't an issue because xas_for_each() contains code to
> detect this state and thus avoid advancing the index a second time on
> the next loop iteration.
> 
> However both callers of and wait_entry_unlocked_exclusive() itself
> subsequently use the xas state to reload the entry. As xas_pause()
> updated the state to the next index this will cause the current entry
> which is being waited on to be skipped. This caused the following
> warning to fire intermittently when running xftest generic/068 on an XFS
> filesystem with FS DAX enabled:
> 
> [   35.067397] ------------[ cut here ]------------
> [   35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0
[..]
> 
> Fix this by using xas_reset() instead, which is equivalent in
> implementation to xas_pause() but does not advance the XArray state.
> 
> Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
[..]
> 
> Hi Andrew,
> 
> Apologies for finding this so late in the cycle. This is a very
> intermittent issue for me, and it seems it was only exposed by a recent
> upgrade to my test machine/setup. The user visible impact is the same
> as for the original commit this fixes. That is possible file data
> corruption if a device has a FS DAX page pinned for DMA.
> 
> So in other words it means my original fix was not 100% effective.
> The issue that commit fixed has existed for a long time without being
> reported, so not sure if this is worth trying to get into v6.15 or not.
> 
> Either way I figured it would be best to send this ASAP, which means I
> am still waiting for a complete xfstest run to complete (although the
> failing test does now pass cleanly).
> ---
>  fs/dax.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 676303419e9e..f8d8b1afd232 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -257,7 +257,7 @@ static void *wait_entry_unlocked_exclusive(struct xa_state *xas, void *entry)
>  		wq = dax_entry_waitqueue(xas, entry, &ewait.key);
>  		prepare_to_wait_exclusive(wq, &ewait.wait,
>  					TASK_UNINTERRUPTIBLE);
> -		xas_pause(xas);
> +		xas_reset(xas);
>  		xas_unlock_irq(xas);
>  		schedule();
>  		finish_wait(wq, &ewait.wait);

This looks super-subtle, but so did the original fix commit 6be3e21d25ca
("fs/dax: don't skip locked entries when scanning entries"). The
resolution is the same to make sure the xarray state does not mistakenly
advance when the lock is dropped.

You can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"
  2025-05-27 21:46 ` Dan Williams
@ 2025-05-28  7:17   ` Alistair Popple
  0 siblings, 0 replies; 5+ messages in thread
From: Alistair Popple @ 2025-05-28  7:17 UTC (permalink / raw)
  To: Dan Williams
  Cc: akpm, linux-fsdevel, nvdimm, willy, linux-kernel,
	Alison Schofield, Balbir Singh, Darrick J. Wong, Dave Chinner,
	David Hildenbrand, Jan Kara, John Hubbard, Ted Ts'o,
	Alexander Viro, Christian Brauner

On Tue, May 27, 2025 at 02:46:28PM -0700, Dan Williams wrote:
> Alistair Popple wrote:
> > Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
> > entries") introduced a new function, wait_entry_unlocked_exclusive(),
> > which waits for the current entry to become unlocked without advancing
> > the XArray iterator state.
> > 
> > Waiting for the entry to become unlocked requires dropping the XArray
> > lock. This requires calling xas_pause() prior to dropping the lock
> > which leaves the xas in a suitable state for the next iteration. However
> > this has the side-effect of advancing the xas state to the next index.
> > Normally this isn't an issue because xas_for_each() contains code to
> > detect this state and thus avoid advancing the index a second time on
> > the next loop iteration.
> > 
> > However both callers of and wait_entry_unlocked_exclusive() itself
> > subsequently use the xas state to reload the entry. As xas_pause()
> > updated the state to the next index this will cause the current entry
> > which is being waited on to be skipped. This caused the following
> > warning to fire intermittently when running xftest generic/068 on an XFS
> > filesystem with FS DAX enabled:
> > 
> > [   35.067397] ------------[ cut here ]------------
> > [   35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0
> [..]
> > 
> > Fix this by using xas_reset() instead, which is equivalent in
> > implementation to xas_pause() but does not advance the XArray state.
> > 
> > Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
> > Signed-off-by: Alistair Popple <apopple@nvidia.com>
> [..]
> > 
> > Hi Andrew,
> > 
> > Apologies for finding this so late in the cycle. This is a very
> > intermittent issue for me, and it seems it was only exposed by a recent
> > upgrade to my test machine/setup. The user visible impact is the same
> > as for the original commit this fixes. That is possible file data
> > corruption if a device has a FS DAX page pinned for DMA.
> > 
> > So in other words it means my original fix was not 100% effective.
> > The issue that commit fixed has existed for a long time without being
> > reported, so not sure if this is worth trying to get into v6.15 or not.
> > 
> > Either way I figured it would be best to send this ASAP, which means I
> > am still waiting for a complete xfstest run to complete (although the
> > failing test does now pass cleanly).
> > ---
> >  fs/dax.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 676303419e9e..f8d8b1afd232 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -257,7 +257,7 @@ static void *wait_entry_unlocked_exclusive(struct xa_state *xas, void *entry)
> >  		wq = dax_entry_waitqueue(xas, entry, &ewait.key);
> >  		prepare_to_wait_exclusive(wq, &ewait.wait,
> >  					TASK_UNINTERRUPTIBLE);
> > -		xas_pause(xas);
> > +		xas_reset(xas);
> >  		xas_unlock_irq(xas);
> >  		schedule();
> >  		finish_wait(wq, &ewait.wait);
> 
> This looks super-subtle, but so did the original fix commit 6be3e21d25ca
> ("fs/dax: don't skip locked entries when scanning entries"). The
> resolution is the same to make sure the xarray state does not mistakenly
> advance when the lock is dropped.

Yes, the XArray iterator code is super subtle. The fact it took at least two
engineers three attempts which were also reviewed to get this (hopefully!) right
suggests it's certainly not obvious.

That said I don't have any good suggestions for making it better other than
renaming functions which will create a lot of churn.

> You can add:
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries"
  2025-05-23  4:37 [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries" Alistair Popple
  2025-05-26 14:43 ` Jan Kara
  2025-05-27 21:46 ` Dan Williams
@ 2025-05-30  5:10 ` Christian Brauner
  2 siblings, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2025-05-30  5:10 UTC (permalink / raw)
  To: Alistair Popple
  Cc: Christian Brauner, linux-kernel, Alison Schofield, Balbir Singh,
	Darrick J. Wong, Dave Chinner, David Hildenbrand, Jan Kara,
	John Hubbard, Ted Ts'o, Alexander Viro, akpm, linux-fsdevel,
	nvdimm, dan.j.williams, willy

On Fri, 23 May 2025 14:37:49 +1000, Alistair Popple wrote:
> Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
> entries") introduced a new function, wait_entry_unlocked_exclusive(),
> which waits for the current entry to become unlocked without advancing
> the XArray iterator state.
> 
> Waiting for the entry to become unlocked requires dropping the XArray
> lock. This requires calling xas_pause() prior to dropping the lock
> which leaves the xas in a suitable state for the next iteration. However
> this has the side-effect of advancing the xas state to the next index.
> Normally this isn't an issue because xas_for_each() contains code to
> detect this state and thus avoid advancing the index a second time on
> the next loop iteration.
> 
> [...]

Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/1] fs/dax: Fix "don't skip locked entries when scanning entries"
      https://git.kernel.org/vfs/vfs/c/dd59137bfe70

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-05-30  5:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-23  4:37 [PATCH] fs/dax: Fix "don't skip locked entries when scanning entries" Alistair Popple
2025-05-26 14:43 ` Jan Kara
2025-05-27 21:46 ` Dan Williams
2025-05-28  7:17   ` Alistair Popple
2025-05-30  5:10 ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).