public inbox for linux-xfs@vger.kernel.org
* clean up I/O path inode locking helpers and the page fault handler
@ 2024-06-19 11:53 Christoph Hellwig
  2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

Hi all,

a while ago I started wondering about some of the I/O path locking,
and ended up wading through the various helpers.  This series refactors
them and improves the comments to aid understanding.  To get there I also
took a hard look at the page fault handler and ended up refactoring it
as well.

Diffstat:
 xfs_file.c  |  139 ++++++++++++++++++++++++++++++++----------------------------
 xfs_iomap.c |   71 ++++++++++++++----------------
 2 files changed, 109 insertions(+), 101 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap
  2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
  2024-06-20 18:40   ` Darrick J. Wong
  2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

About half of xfs_ilock_for_iomap deals with a special case for direct
I/O writes to COW files that need to take the ilock exclusively.  Move
this code into the one caller that cares and simplify
xfs_ilock_for_iomap.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_iomap.c | 71 ++++++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 37 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3783426739258c..b0085e5972393a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -717,53 +717,30 @@ imap_needs_cow(
 	return true;
 }
 
+/*
+ * Extents not yet cached require exclusive access; don't block for
+ * IOMAP_NOWAIT.
+ *
+ * This is basically an opencoded xfs_ilock_data_map_shared() call, but with
+ * support for IOMAP_NOWAIT.
+ */
 static int
 xfs_ilock_for_iomap(
 	struct xfs_inode	*ip,
 	unsigned		flags,
 	unsigned		*lockmode)
 {
-	unsigned int		mode = *lockmode;
-	bool			is_write = flags & (IOMAP_WRITE | IOMAP_ZERO);
-
-	/*
-	 * COW writes may allocate delalloc space or convert unwritten COW
-	 * extents, so we need to make sure to take the lock exclusively here.
-	 */
-	if (xfs_is_cow_inode(ip) && is_write)
-		mode = XFS_ILOCK_EXCL;
-
-	/*
-	 * Extents not yet cached requires exclusive access, don't block.  This
-	 * is an opencoded xfs_ilock_data_map_shared() call but with
-	 * non-blocking behaviour.
-	 */
-	if (xfs_need_iread_extents(&ip->i_df)) {
-		if (flags & IOMAP_NOWAIT)
-			return -EAGAIN;
-		mode = XFS_ILOCK_EXCL;
-	}
-
-relock:
 	if (flags & IOMAP_NOWAIT) {
-		if (!xfs_ilock_nowait(ip, mode))
+		if (xfs_need_iread_extents(&ip->i_df))
+			return -EAGAIN;
+		if (!xfs_ilock_nowait(ip, *lockmode))
 			return -EAGAIN;
 	} else {
-		xfs_ilock(ip, mode);
+		if (xfs_need_iread_extents(&ip->i_df))
+			*lockmode = XFS_ILOCK_EXCL;
+		xfs_ilock(ip, *lockmode);
 	}
 
-	/*
-	 * The reflink iflag could have changed since the earlier unlocked
-	 * check, so if we got ILOCK_SHARED for a write and but we're now a
-	 * reflink inode we have to switch to ILOCK_EXCL and relock.
-	 */
-	if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
-		xfs_iunlock(ip, mode);
-		mode = XFS_ILOCK_EXCL;
-		goto relock;
-	}
-
-	*lockmode = mode;
 	return 0;
 }
 
@@ -801,7 +778,7 @@ xfs_direct_write_iomap_begin(
 	int			nimaps = 1, error = 0;
 	bool			shared = false;
 	u16			iomap_flags = 0;
-	unsigned int		lockmode = XFS_ILOCK_SHARED;
+	unsigned int		lockmode;
 	u64			seq;
 
 	ASSERT(flags & (IOMAP_WRITE | IOMAP_ZERO));
@@ -817,10 +794,30 @@ xfs_direct_write_iomap_begin(
 	if (offset + length > i_size_read(inode))
 		iomap_flags |= IOMAP_F_DIRTY;
 
+	/*
+	 * COW writes may allocate delalloc space or convert unwritten COW
+	 * extents, so we need to make sure to take the lock exclusively here.
+	 */
+	if (xfs_is_cow_inode(ip))
+		lockmode = XFS_ILOCK_EXCL;
+	else
+		lockmode = XFS_ILOCK_SHARED;
+
+relock:
 	error = xfs_ilock_for_iomap(ip, flags, &lockmode);
 	if (error)
 		return error;
 
+	/*
+	 * The reflink iflag could have changed since the earlier unlocked
+	 * check, so check it again and relock if needed.
+	 */
+	if (xfs_is_cow_inode(ip) && lockmode == XFS_ILOCK_SHARED) {
+		xfs_iunlock(ip, lockmode);
+		lockmode = XFS_ILOCK_EXCL;
+		goto relock;
+	}
+
 	error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap,
 			       &nimaps, 0);
 	if (error)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write
  2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
  2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
  2024-06-20 18:41   ` Darrick J. Wong
  2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

Move the relock path out of the straight-line code and add a comment
explaining why it exists.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index b240ea5241dc9d..74c2c8d253e69b 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -213,14 +213,18 @@ xfs_ilock_iocb_for_write(
 	if (ret)
 		return ret;
 
-	if (*lock_mode == XFS_IOLOCK_EXCL)
-		return 0;
-	if (!xfs_iflags_test(ip, XFS_IREMAPPING))
-		return 0;
+	/*
+	 * If a reflink remap is in progress we always need to take the iolock
+	 * exclusively to wait for it to finish.
+	 */
+	if (*lock_mode == XFS_IOLOCK_SHARED &&
+	    xfs_iflags_test(ip, XFS_IREMAPPING)) {
+		xfs_iunlock(ip, *lock_mode);
+		*lock_mode = XFS_IOLOCK_EXCL;
+		return xfs_ilock_iocb(iocb, *lock_mode);
+	}
 
-	xfs_iunlock(ip, *lock_mode);
-	*lock_mode = XFS_IOLOCK_EXCL;
-	return xfs_ilock_iocb(iocb, *lock_mode);
+	return 0;
 }
 
 static unsigned int
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/6] xfs: simplify xfs_dax_fault
  2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
  2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
  2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
  2024-06-20 18:50   ` Darrick J. Wong
  2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

Replace the separate stub with an IS_ENABLED check, and move the call to
dax_finish_sync_fault into xfs_dax_fault instead of leaving it in the
caller.
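
The IS_ENABLED() idiom can be illustrated with a minimal userspace model
(CONFIG_FS_DAX and the helper names below are redefined locally for the
sketch; this is not kernel code):

```c
#include <assert.h>

/*
 * Model of the IS_ENABLED() pattern: unlike an #ifdef'ed stub, both
 * branches are parsed and type-checked in every configuration, and the
 * compiler discards the dead branch because the condition is a
 * compile-time constant.
 */
#define CONFIG_FS_DAX 0		/* flip to 1 to model a DAX-enabled build */
#define IS_ENABLED(option) (option)

#define FAULT_SIGBUS (-1)	/* stands in for VM_FAULT_SIGBUS */

static int dax_iomap_fault_model(void)	/* stands in for dax_iomap_fault() */
{
	return 0;
}

static int dax_fault(void)
{
	if (!IS_ENABLED(CONFIG_FS_DAX))
		return FAULT_SIGBUS;
	return dax_iomap_fault_model();	/* dead code when DAX is off */
}
```

The pattern relies on the optimizer eliminating the disabled branch; the
review below raises exactly the question of what the disabled call
resolves against at link time.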

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c | 34 +++++++++++++---------------------
 1 file changed, 13 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 74c2c8d253e69b..8aab2f66fe016f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1251,31 +1251,27 @@ xfs_file_llseek(
 	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
 }
 
-#ifdef CONFIG_FS_DAX
 static inline vm_fault_t
 xfs_dax_fault(
 	struct vm_fault		*vmf,
 	unsigned int		order,
-	bool			write_fault,
-	pfn_t			*pfn)
+	bool			write_fault)
 {
-	return dax_iomap_fault(vmf, order, pfn, NULL,
+	vm_fault_t		ret;
+	pfn_t			pfn;
+
+	if (!IS_ENABLED(CONFIG_FS_DAX)) {
+		ASSERT(0);
+		return VM_FAULT_SIGBUS;
+	}
+	ret = dax_iomap_fault(vmf, order, &pfn, NULL,
 			(write_fault && !vmf->cow_page) ?
 				&xfs_dax_write_iomap_ops :
 				&xfs_read_iomap_ops);
+	if (ret & VM_FAULT_NEEDDSYNC)
+		ret = dax_finish_sync_fault(vmf, order, pfn);
+	return ret;
 }
-#else
-static inline vm_fault_t
-xfs_dax_fault(
-	struct vm_fault		*vmf,
-	unsigned int		order,
-	bool			write_fault,
-	pfn_t			*pfn)
-{
-	ASSERT(0);
-	return VM_FAULT_SIGBUS;
-}
-#endif
 
 /*
  * Locking for serialisation of IO during page faults. This results in a lock
@@ -1309,11 +1305,7 @@ __xfs_filemap_fault(
 		lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
 
 	if (IS_DAX(inode)) {
-		pfn_t pfn;
-
-		ret = xfs_dax_fault(vmf, order, write_fault, &pfn);
-		if (ret & VM_FAULT_NEEDDSYNC)
-			ret = dax_finish_sync_fault(vmf, order, pfn);
+		ret = xfs_dax_fault(vmf, order, write_fault);
 	} else if (write_fault) {
 		ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
 	} else {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/6] xfs: refactor __xfs_filemap_fault
  2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
                   ` (2 preceding siblings ...)
  2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
  2024-06-20 18:54   ` Darrick J. Wong
  2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
  2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
  5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

Split the write fault and DAX fault handling into separate helpers
so that the main fault handler is easier to follow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c | 71 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 45 insertions(+), 26 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 8aab2f66fe016f..51e50afd935895 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1252,7 +1252,7 @@ xfs_file_llseek(
 }
 
 static inline vm_fault_t
-xfs_dax_fault(
+xfs_dax_fault_locked(
 	struct vm_fault		*vmf,
 	unsigned int		order,
 	bool			write_fault)
@@ -1273,6 +1273,45 @@ xfs_dax_fault(
 	return ret;
 }
 
+static vm_fault_t
+xfs_dax_fault(
+	struct vm_fault		*vmf,
+	unsigned int		order)
+{
+	struct xfs_inode	*ip = XFS_I(file_inode(vmf->vma->vm_file));
+	unsigned int		lock_mode;
+	vm_fault_t		ret;
+
+	lock_mode = xfs_ilock_for_write_fault(ip);
+	ret = xfs_dax_fault_locked(vmf, order, false);
+	xfs_iunlock(ip, lock_mode);
+
+	return ret;
+}
+
+static vm_fault_t
+xfs_write_fault(
+	struct vm_fault		*vmf,
+	unsigned int		order)
+{
+	struct inode		*inode = file_inode(vmf->vma->vm_file);
+	unsigned int		lock_mode;
+	vm_fault_t		ret;
+
+	sb_start_pagefault(inode->i_sb);
+	file_update_time(vmf->vma->vm_file);
+
+	lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
+	if (IS_DAX(inode))
+		ret = xfs_dax_fault_locked(vmf, order, true);
+	else
+		ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
+	xfs_iunlock(XFS_I(inode), lock_mode);
+
+	sb_end_pagefault(inode->i_sb);
+	return ret;
+}
+
 /*
  * Locking for serialisation of IO during page faults. This results in a lock
  * ordering of:
@@ -1290,34 +1329,14 @@ __xfs_filemap_fault(
 	bool			write_fault)
 {
 	struct inode		*inode = file_inode(vmf->vma->vm_file);
-	struct xfs_inode	*ip = XFS_I(inode);
-	vm_fault_t		ret;
-	unsigned int		lock_mode = 0;
 
-	trace_xfs_filemap_fault(ip, order, write_fault);
-
-	if (write_fault) {
-		sb_start_pagefault(inode->i_sb);
-		file_update_time(vmf->vma->vm_file);
-	}
-
-	if (IS_DAX(inode) || write_fault)
-		lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
-
-	if (IS_DAX(inode)) {
-		ret = xfs_dax_fault(vmf, order, write_fault);
-	} else if (write_fault) {
-		ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
-	} else {
-		ret = filemap_fault(vmf);
-	}
-
-	if (lock_mode)
-		xfs_iunlock(XFS_I(inode), lock_mode);
+	trace_xfs_filemap_fault(XFS_I(inode), order, write_fault);
 
 	if (write_fault)
-		sb_end_pagefault(inode->i_sb);
-	return ret;
+		return xfs_write_fault(vmf, order);
+	if (IS_DAX(inode))
+		return xfs_dax_fault(vmf, order);
+	return filemap_fault(vmf);
 }
 
 static inline bool
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault
  2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
                   ` (3 preceding siblings ...)
  2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
  2024-06-20 18:56   ` Darrick J. Wong
  2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
  5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

After the previous refactoring, xfs_dax_fault is now never used for write
faults, so don't bother with the xfs_ilock_for_write_fault logic to
protect against writes when remapping is in progress.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 51e50afd935895..62a69ed796f2fd 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1279,12 +1279,11 @@ xfs_dax_fault(
 	unsigned int		order)
 {
 	struct xfs_inode	*ip = XFS_I(file_inode(vmf->vma->vm_file));
-	unsigned int		lock_mode;
 	vm_fault_t		ret;
 
-	lock_mode = xfs_ilock_for_write_fault(ip);
+	xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
 	ret = xfs_dax_fault_locked(vmf, order, false);
-	xfs_iunlock(ip, lock_mode);
+	xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
 
 	return ret;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault
  2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
                   ` (4 preceding siblings ...)
  2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
  2024-06-20 18:57   ` Darrick J. Wong
  5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

Now that the page fault handler has been refactored, the only caller
of xfs_ilock_for_write_fault is simple enough and calls it
unconditionally.  Fold the logic and expand the comments explaining it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 62a69ed796f2fd..05ea96661c475f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -227,21 +227,6 @@ xfs_ilock_iocb_for_write(
 	return 0;
 }
 
-static unsigned int
-xfs_ilock_for_write_fault(
-	struct xfs_inode	*ip)
-{
-	/* get a shared lock if no remapping in progress */
-	xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
-	if (!xfs_iflags_test(ip, XFS_IREMAPPING))
-		return XFS_MMAPLOCK_SHARED;
-
-	/* wait for remapping to complete */
-	xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
-	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
-	return XFS_MMAPLOCK_EXCL;
-}
-
 STATIC ssize_t
 xfs_file_dio_read(
 	struct kiocb		*iocb,
@@ -1294,18 +1279,30 @@ xfs_write_fault(
 	unsigned int		order)
 {
 	struct inode		*inode = file_inode(vmf->vma->vm_file);
-	unsigned int		lock_mode;
+	struct xfs_inode	*ip = XFS_I(inode);
+	unsigned int		lock_mode = XFS_MMAPLOCK_SHARED;
 	vm_fault_t		ret;
 
 	sb_start_pagefault(inode->i_sb);
 	file_update_time(vmf->vma->vm_file);
 
-	lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
+	/*
+	 * Normally we only need the shared mmaplock, but if a reflink remap is
+	 * in progress we take the exclusive lock to wait for the remap to
+	 * finish before taking a write fault.
+	 */
+	xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
+	if (xfs_iflags_test(ip, XFS_IREMAPPING)) {
+		xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
+		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
+		lock_mode = XFS_MMAPLOCK_EXCL;
+	}
+
 	if (IS_DAX(inode))
 		ret = xfs_dax_fault_locked(vmf, order, true);
 	else
 		ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
-	xfs_iunlock(XFS_I(inode), lock_mode);
+	xfs_iunlock(ip, lock_mode);
 
 	sb_end_pagefault(inode->i_sb);
 	return ret;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap
  2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
@ 2024-06-20 18:40   ` Darrick J. Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Wed, Jun 19, 2024 at 01:53:51PM +0200, Christoph Hellwig wrote:
> About half of xfs_ilock_for_iomap deals with a special case for direct
> I/O writes to COW files that need to take the ilock exclusively.  Move
> this code into the one caller that cares and simplify
> xfs_ilock_for_iomap.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

It took me a few minutes to sort this out, but yeah, directio writes are
the only place where we do this shared->cow->excl promotion dance.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write
  2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
@ 2024-06-20 18:41   ` Darrick J. Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:41 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Wed, Jun 19, 2024 at 01:53:52PM +0200, Christoph Hellwig wrote:
> Move the relock path out of the straight-line code and add a comment
> explaining why it exists.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/6] xfs: simplify xfs_dax_fault
  2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
@ 2024-06-20 18:50   ` Darrick J. Wong
  2024-06-21  5:05     ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Wed, Jun 19, 2024 at 01:53:53PM +0200, Christoph Hellwig wrote:
> Replace the separate stub with an IS_ENABLED check, and move the call to
> dax_finish_sync_fault into xfs_dax_fault instead of leaving it in the
> caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_file.c | 34 +++++++++++++---------------------
>  1 file changed, 13 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 74c2c8d253e69b..8aab2f66fe016f 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1251,31 +1251,27 @@ xfs_file_llseek(
>  	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>  }
>  
> -#ifdef CONFIG_FS_DAX
>  static inline vm_fault_t
>  xfs_dax_fault(
>  	struct vm_fault		*vmf,
>  	unsigned int		order,
> -	bool			write_fault,
> -	pfn_t			*pfn)
> +	bool			write_fault)
>  {
> -	return dax_iomap_fault(vmf, order, pfn, NULL,
> +	vm_fault_t		ret;
> +	pfn_t			pfn;
> +
> +	if (!IS_ENABLED(CONFIG_FS_DAX)) {
> +		ASSERT(0);
> +		return VM_FAULT_SIGBUS;

Does this actually work if FS_DAX=n?  AFAICT there's no !DAX stub for
dax_iomap_fault, so won't that cause a linker error?

> +	}
> +	ret = dax_iomap_fault(vmf, order, &pfn, NULL,
>  			(write_fault && !vmf->cow_page) ?
>  				&xfs_dax_write_iomap_ops :
>  				&xfs_read_iomap_ops);
> +	if (ret & VM_FAULT_NEEDDSYNC)
> +		ret = dax_finish_sync_fault(vmf, order, pfn);
> +	return ret;

I /almost/ wondered if these ought to be separate helpers for read and
write faults, but then I realized the (write && cow_page) case is a
"read" and ... lol.  So the only question I have is about linker errors.

--D


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] xfs: refactor __xfs_filemap_fault
  2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
@ 2024-06-20 18:54   ` Darrick J. Wong
  2024-06-21  5:08     ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:54 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Wed, Jun 19, 2024 at 01:53:54PM +0200, Christoph Hellwig wrote:
> Split the write fault and DAX fault handling into separate helpers
> so that the main fault handler is easier to follow.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_file.c | 71 ++++++++++++++++++++++++++++++-----------------
>  1 file changed, 45 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 8aab2f66fe016f..51e50afd935895 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1252,7 +1252,7 @@ xfs_file_llseek(
>  }
>  
>  static inline vm_fault_t
> -xfs_dax_fault(
> +xfs_dax_fault_locked(
>  	struct vm_fault		*vmf,
>  	unsigned int		order,
>  	bool			write_fault)
> @@ -1273,6 +1273,45 @@ xfs_dax_fault(
>  	return ret;
>  }
>  
> +static vm_fault_t
> +xfs_dax_fault(

Oh, hey, you /did/ split the dax handling into two helpers.

Would you mind renaming this xfs_dax_read_fault since this doesn't
handle write faults?

With that changed,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault
  2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
@ 2024-06-20 18:56   ` Darrick J. Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Wed, Jun 19, 2024 at 01:53:55PM +0200, Christoph Hellwig wrote:
> After the previous refactoring, xfs_dax_fault is now never used for write
> faults, so don't bother with the xfs_ilock_for_write_fault logic to
> protect against writes when remapping is in progress.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Makes sense to me, all we need to do here is make sure that nobody can
invalidate the mappings -- there's no need to take MMAPLOCK_EXCL if a
reflink is cloning from this file's data.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault
  2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
@ 2024-06-20 18:57   ` Darrick J. Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Wed, Jun 19, 2024 at 01:53:56PM +0200, Christoph Hellwig wrote:
> Now that the page fault handler has been refactored, the only caller
> of xfs_ilock_for_write_fault calls it unconditionally and is simple
> enough to absorb it.  Fold the logic in and expand the comments
> explaining it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Makes sense to me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_file.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 62a69ed796f2fd..05ea96661c475f 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -227,21 +227,6 @@ xfs_ilock_iocb_for_write(
>  	return 0;
>  }
>  
> -static unsigned int
> -xfs_ilock_for_write_fault(
> -	struct xfs_inode	*ip)
> -{
> -	/* get a shared lock if no remapping in progress */
> -	xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
> -	if (!xfs_iflags_test(ip, XFS_IREMAPPING))
> -		return XFS_MMAPLOCK_SHARED;
> -
> -	/* wait for remapping to complete */
> -	xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
> -	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> -	return XFS_MMAPLOCK_EXCL;
> -}
> -
>  STATIC ssize_t
>  xfs_file_dio_read(
>  	struct kiocb		*iocb,
> @@ -1294,18 +1279,30 @@ xfs_write_fault(
>  	unsigned int		order)
>  {
>  	struct inode		*inode = file_inode(vmf->vma->vm_file);
> -	unsigned int		lock_mode;
> +	struct xfs_inode	*ip = XFS_I(inode);
> +	unsigned int		lock_mode = XFS_MMAPLOCK_SHARED;
>  	vm_fault_t		ret;
>  
>  	sb_start_pagefault(inode->i_sb);
>  	file_update_time(vmf->vma->vm_file);
>  
> -	lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
> +	/*
> +	 * Normally we only need the shared mmaplock, but if a reflink remap is
> +	 * in progress we take the exclusive lock to wait for the remap to
> +	 * finish before taking a write fault.
> +	 */
> +	xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
> +	if (xfs_iflags_test(ip, XFS_IREMAPPING)) {
> +		xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
> +		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> +		lock_mode = XFS_MMAPLOCK_EXCL;
> +	}
> +
>  	if (IS_DAX(inode))
>  		ret = xfs_dax_fault_locked(vmf, order, true);
>  	else
>  		ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
> -	xfs_iunlock(XFS_I(inode), lock_mode);
> +	xfs_iunlock(ip, lock_mode);
>  
>  	sb_end_pagefault(inode->i_sb);
>  	return ret;
> -- 
> 2.43.0
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/6] xfs: simplify xfs_dax_fault
  2024-06-20 18:50   ` Darrick J. Wong
@ 2024-06-21  5:05     ` Christoph Hellwig
  2024-06-21 17:43       ` Darrick J. Wong
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-21  5:05 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Thu, Jun 20, 2024 at 11:50:26AM -0700, Darrick J. Wong wrote:
> > +	if (!IS_ENABLED(CONFIG_FS_DAX)) {
> > +		ASSERT(0);
> > +		return VM_FAULT_SIGBUS;
> 
> Does this actually work if FS_DAX=n?  AFAICT there's no !DAX stub for
> dax_iomap_fault, so won't that cause a linker error?

IS_ENABLED() expands to a compile-time constant, so the compiler eliminates
the call and no stub is needed.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/6] xfs: refactor __xfs_filemap_fault
  2024-06-20 18:54   ` Darrick J. Wong
@ 2024-06-21  5:08     ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-21  5:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Thu, Jun 20, 2024 at 11:54:06AM -0700, Darrick J. Wong wrote:
> Oh, hey, you /did/ split the dax handling into two helpers.
> 
> Would you mind renaming this xfs_dax_read_fault since this doesn't
> handle write faults?

Sure.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/6] xfs: simplify xfs_dax_fault
  2024-06-21  5:05     ` Christoph Hellwig
@ 2024-06-21 17:43       ` Darrick J. Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-21 17:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Fri, Jun 21, 2024 at 07:05:58AM +0200, Christoph Hellwig wrote:
> On Thu, Jun 20, 2024 at 11:50:26AM -0700, Darrick J. Wong wrote:
> > > +	if (!IS_ENABLED(CONFIG_FS_DAX)) {
> > > +		ASSERT(0);
> > > +		return VM_FAULT_SIGBUS;
> > 
> > Does this actually work if FS_DAX=n?  AFAICT there's no !DAX stub for
> > dax_iomap_fault, so won't that cause a linker error?
> 
> IS_ENABLED expands to a compile time constant, so the compiler eliminates
> the call and no stub is needed.

Huh, that actually links.  I am astonished.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap
  2024-06-23  5:44 clean up I/O path inode locking helpers and the page fault handler v2 Christoph Hellwig
@ 2024-06-23  5:44 ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-23  5:44 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs

About half of xfs_ilock_for_iomap deals with a special case for direct
I/O writes to COW files that need to take the ilock exclusively.  Move
this code into the one caller that cares and simplify
xfs_ilock_for_iomap.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_iomap.c | 71 ++++++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 37 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3783426739258c..b0085e5972393a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -717,53 +717,30 @@ imap_needs_cow(
 	return true;
 }
 
+/*
+ * Extents not yet cached require exclusive access, so don't block for
+ * IOMAP_NOWAIT.
+ *
+ * This is basically an opencoded xfs_ilock_data_map_shared() call, but with
+ * support for IOMAP_NOWAIT.
+ */
 static int
 xfs_ilock_for_iomap(
 	struct xfs_inode	*ip,
 	unsigned		flags,
 	unsigned		*lockmode)
 {
-	unsigned int		mode = *lockmode;
-	bool			is_write = flags & (IOMAP_WRITE | IOMAP_ZERO);
-
-	/*
-	 * COW writes may allocate delalloc space or convert unwritten COW
-	 * extents, so we need to make sure to take the lock exclusively here.
-	 */
-	if (xfs_is_cow_inode(ip) && is_write)
-		mode = XFS_ILOCK_EXCL;
-
-	/*
-	 * Extents not yet cached requires exclusive access, don't block.  This
-	 * is an opencoded xfs_ilock_data_map_shared() call but with
-	 * non-blocking behaviour.
-	 */
-	if (xfs_need_iread_extents(&ip->i_df)) {
-		if (flags & IOMAP_NOWAIT)
-			return -EAGAIN;
-		mode = XFS_ILOCK_EXCL;
-	}
-
-relock:
 	if (flags & IOMAP_NOWAIT) {
-		if (!xfs_ilock_nowait(ip, mode))
+		if (xfs_need_iread_extents(&ip->i_df))
+			return -EAGAIN;
+		if (!xfs_ilock_nowait(ip, *lockmode))
 			return -EAGAIN;
 	} else {
-		xfs_ilock(ip, mode);
+		if (xfs_need_iread_extents(&ip->i_df))
+			*lockmode = XFS_ILOCK_EXCL;
+		xfs_ilock(ip, *lockmode);
 	}
 
-	/*
-	 * The reflink iflag could have changed since the earlier unlocked
-	 * check, so if we got ILOCK_SHARED for a write and but we're now a
-	 * reflink inode we have to switch to ILOCK_EXCL and relock.
-	 */
-	if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
-		xfs_iunlock(ip, mode);
-		mode = XFS_ILOCK_EXCL;
-		goto relock;
-	}
-
-	*lockmode = mode;
 	return 0;
 }
 
@@ -801,7 +778,7 @@ xfs_direct_write_iomap_begin(
 	int			nimaps = 1, error = 0;
 	bool			shared = false;
 	u16			iomap_flags = 0;
-	unsigned int		lockmode = XFS_ILOCK_SHARED;
+	unsigned int		lockmode;
 	u64			seq;
 
 	ASSERT(flags & (IOMAP_WRITE | IOMAP_ZERO));
@@ -817,10 +794,30 @@ xfs_direct_write_iomap_begin(
 	if (offset + length > i_size_read(inode))
 		iomap_flags |= IOMAP_F_DIRTY;
 
+	/*
+	 * COW writes may allocate delalloc space or convert unwritten COW
+	 * extents, so we need to make sure to take the lock exclusively here.
+	 */
+	if (xfs_is_cow_inode(ip))
+		lockmode = XFS_ILOCK_EXCL;
+	else
+		lockmode = XFS_ILOCK_SHARED;
+
+relock:
 	error = xfs_ilock_for_iomap(ip, flags, &lockmode);
 	if (error)
 		return error;
 
+	/*
+	 * The reflink iflag could have changed since the earlier unlocked
+	 * check, so check it again and relock if needed.
+	 */
+	if (xfs_is_cow_inode(ip) && lockmode == XFS_ILOCK_SHARED) {
+		xfs_iunlock(ip, lockmode);
+		lockmode = XFS_ILOCK_EXCL;
+		goto relock;
+	}
+
 	error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap,
 			       &nimaps, 0);
 	if (error)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-06-23  5:45 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
2024-06-20 18:40   ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
2024-06-20 18:41   ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
2024-06-20 18:50   ` Darrick J. Wong
2024-06-21  5:05     ` Christoph Hellwig
2024-06-21 17:43       ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
2024-06-20 18:54   ` Darrick J. Wong
2024-06-21  5:08     ` Christoph Hellwig
2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
2024-06-20 18:56   ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
2024-06-20 18:57   ` Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2024-06-23  5:44 clean up I/O path inode locking helpers and the page fault handler v2 Christoph Hellwig
2024-06-23  5:44 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
