* clean up I/O path inode locking helpers and the page fault handler
@ 2024-06-19 11:53 Christoph Hellwig
2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
` (5 more replies)
0 siblings, 6 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
Hi all,
a while ago I started wondering about some of the I/O path locking,
and ended up wading through the various helpers. This series refactors
them and improves the comments to aid understanding. To get there I also
took a hard look at the page fault handler and ended up refactoring it
as well.
Diffstat:
xfs_file.c | 139 ++++++++++++++++++++++++++++++++----------------------------
xfs_iomap.c | 71 ++++++++++++++----------------
2 files changed, 109 insertions(+), 101 deletions(-)
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
2024-06-20 18:40 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
` (4 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
About half of xfs_ilock_for_iomap deals with a special case for direct
I/O writes to COW files that need to take the ilock exclusively. Move
this code into the one caller that cares and simplify
xfs_ilock_for_iomap.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_iomap.c | 71 ++++++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 37 deletions(-)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3783426739258c..b0085e5972393a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -717,53 +717,30 @@ imap_needs_cow(
return true;
}
+/*
+ * Extents not yet cached require exclusive access, don't block for
+ * IOMAP_NOWAIT.
+ *
+ * This is basically an opencoded xfs_ilock_data_map_shared() call, but with
+ * support for IOMAP_NOWAIT.
+ */
static int
xfs_ilock_for_iomap(
struct xfs_inode *ip,
unsigned flags,
unsigned *lockmode)
{
- unsigned int mode = *lockmode;
- bool is_write = flags & (IOMAP_WRITE | IOMAP_ZERO);
-
- /*
- * COW writes may allocate delalloc space or convert unwritten COW
- * extents, so we need to make sure to take the lock exclusively here.
- */
- if (xfs_is_cow_inode(ip) && is_write)
- mode = XFS_ILOCK_EXCL;
-
- /*
- * Extents not yet cached requires exclusive access, don't block. This
- * is an opencoded xfs_ilock_data_map_shared() call but with
- * non-blocking behaviour.
- */
- if (xfs_need_iread_extents(&ip->i_df)) {
- if (flags & IOMAP_NOWAIT)
- return -EAGAIN;
- mode = XFS_ILOCK_EXCL;
- }
-
-relock:
if (flags & IOMAP_NOWAIT) {
- if (!xfs_ilock_nowait(ip, mode))
+ if (xfs_need_iread_extents(&ip->i_df))
+ return -EAGAIN;
+ if (!xfs_ilock_nowait(ip, *lockmode))
return -EAGAIN;
} else {
- xfs_ilock(ip, mode);
+ if (xfs_need_iread_extents(&ip->i_df))
+ *lockmode = XFS_ILOCK_EXCL;
+ xfs_ilock(ip, *lockmode);
}
- /*
- * The reflink iflag could have changed since the earlier unlocked
- * check, so if we got ILOCK_SHARED for a write and but we're now a
- * reflink inode we have to switch to ILOCK_EXCL and relock.
- */
- if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
- xfs_iunlock(ip, mode);
- mode = XFS_ILOCK_EXCL;
- goto relock;
- }
-
- *lockmode = mode;
return 0;
}
@@ -801,7 +778,7 @@ xfs_direct_write_iomap_begin(
int nimaps = 1, error = 0;
bool shared = false;
u16 iomap_flags = 0;
- unsigned int lockmode = XFS_ILOCK_SHARED;
+ unsigned int lockmode;
u64 seq;
ASSERT(flags & (IOMAP_WRITE | IOMAP_ZERO));
@@ -817,10 +794,30 @@ xfs_direct_write_iomap_begin(
if (offset + length > i_size_read(inode))
iomap_flags |= IOMAP_F_DIRTY;
+ /*
+ * COW writes may allocate delalloc space or convert unwritten COW
+ * extents, so we need to make sure to take the lock exclusively here.
+ */
+ if (xfs_is_cow_inode(ip))
+ lockmode = XFS_ILOCK_EXCL;
+ else
+ lockmode = XFS_ILOCK_SHARED;
+
+relock:
error = xfs_ilock_for_iomap(ip, flags, &lockmode);
if (error)
return error;
+ /*
+ * The reflink iflag could have changed since the earlier unlocked
+ * check, so check it again and relock if needed.
+ */
+ if (xfs_is_cow_inode(ip) && lockmode == XFS_ILOCK_SHARED) {
+ xfs_iunlock(ip, lockmode);
+ lockmode = XFS_ILOCK_EXCL;
+ goto relock;
+ }
+
error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap,
&nimaps, 0);
if (error)
--
2.43.0
* [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
2024-06-20 18:41 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
` (3 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
Move the relock path out of the straight line and add a comment
explaining why it exists.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index b240ea5241dc9d..74c2c8d253e69b 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -213,14 +213,18 @@ xfs_ilock_iocb_for_write(
if (ret)
return ret;
- if (*lock_mode == XFS_IOLOCK_EXCL)
- return 0;
- if (!xfs_iflags_test(ip, XFS_IREMAPPING))
- return 0;
+ /*
+ * If a reflink remap is in progress we always need to take the iolock
+ * exclusively to wait for it to finish.
+ */
+ if (*lock_mode == XFS_IOLOCK_SHARED &&
+ xfs_iflags_test(ip, XFS_IREMAPPING)) {
+ xfs_iunlock(ip, *lock_mode);
+ *lock_mode = XFS_IOLOCK_EXCL;
+ return xfs_ilock_iocb(iocb, *lock_mode);
+ }
- xfs_iunlock(ip, *lock_mode);
- *lock_mode = XFS_IOLOCK_EXCL;
- return xfs_ilock_iocb(iocb, *lock_mode);
+ return 0;
}
static unsigned int
--
2.43.0
* [PATCH 3/6] xfs: simplify xfs_dax_fault
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
2024-06-20 18:50 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
` (2 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
Replace the separate stub with an IS_ENABLED check, and move the call to
dax_finish_sync_fault into xfs_dax_fault instead of leaving it in the
caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 34 +++++++++++++---------------------
1 file changed, 13 insertions(+), 21 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 74c2c8d253e69b..8aab2f66fe016f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1251,31 +1251,27 @@ xfs_file_llseek(
return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
}
-#ifdef CONFIG_FS_DAX
static inline vm_fault_t
xfs_dax_fault(
struct vm_fault *vmf,
unsigned int order,
- bool write_fault,
- pfn_t *pfn)
+ bool write_fault)
{
- return dax_iomap_fault(vmf, order, pfn, NULL,
+ vm_fault_t ret;
+ pfn_t pfn;
+
+ if (!IS_ENABLED(CONFIG_FS_DAX)) {
+ ASSERT(0);
+ return VM_FAULT_SIGBUS;
+ }
+ ret = dax_iomap_fault(vmf, order, &pfn, NULL,
(write_fault && !vmf->cow_page) ?
&xfs_dax_write_iomap_ops :
&xfs_read_iomap_ops);
+ if (ret & VM_FAULT_NEEDDSYNC)
+ ret = dax_finish_sync_fault(vmf, order, pfn);
+ return ret;
}
-#else
-static inline vm_fault_t
-xfs_dax_fault(
- struct vm_fault *vmf,
- unsigned int order,
- bool write_fault,
- pfn_t *pfn)
-{
- ASSERT(0);
- return VM_FAULT_SIGBUS;
-}
-#endif
/*
* Locking for serialisation of IO during page faults. This results in a lock
@@ -1309,11 +1305,7 @@ __xfs_filemap_fault(
lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
if (IS_DAX(inode)) {
- pfn_t pfn;
-
- ret = xfs_dax_fault(vmf, order, write_fault, &pfn);
- if (ret & VM_FAULT_NEEDDSYNC)
- ret = dax_finish_sync_fault(vmf, order, pfn);
+ ret = xfs_dax_fault(vmf, order, write_fault);
} else if (write_fault) {
ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
} else {
--
2.43.0
* [PATCH 4/6] xfs: refactor __xfs_filemap_fault
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
` (2 preceding siblings ...)
2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
2024-06-20 18:54 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
Split the write fault and DAX fault handling into separate helpers
so that the main fault handler is easier to follow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 71 ++++++++++++++++++++++++++++++-----------------
1 file changed, 45 insertions(+), 26 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 8aab2f66fe016f..51e50afd935895 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1252,7 +1252,7 @@ xfs_file_llseek(
}
static inline vm_fault_t
-xfs_dax_fault(
+xfs_dax_fault_locked(
struct vm_fault *vmf,
unsigned int order,
bool write_fault)
@@ -1273,6 +1273,45 @@ xfs_dax_fault(
return ret;
}
+static vm_fault_t
+xfs_dax_fault(
+ struct vm_fault *vmf,
+ unsigned int order)
+{
+ struct xfs_inode *ip = XFS_I(file_inode(vmf->vma->vm_file));
+ unsigned int lock_mode;
+ vm_fault_t ret;
+
+ lock_mode = xfs_ilock_for_write_fault(ip);
+ ret = xfs_dax_fault_locked(vmf, order, false);
+ xfs_iunlock(ip, lock_mode);
+
+ return ret;
+}
+
+static vm_fault_t
+xfs_write_fault(
+ struct vm_fault *vmf,
+ unsigned int order)
+{
+ struct inode *inode = file_inode(vmf->vma->vm_file);
+ unsigned int lock_mode;
+ vm_fault_t ret;
+
+ sb_start_pagefault(inode->i_sb);
+ file_update_time(vmf->vma->vm_file);
+
+ lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
+ if (IS_DAX(inode))
+ ret = xfs_dax_fault_locked(vmf, order, true);
+ else
+ ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
+ xfs_iunlock(XFS_I(inode), lock_mode);
+
+ sb_end_pagefault(inode->i_sb);
+ return ret;
+}
+
/*
* Locking for serialisation of IO during page faults. This results in a lock
* ordering of:
@@ -1290,34 +1329,14 @@ __xfs_filemap_fault(
bool write_fault)
{
struct inode *inode = file_inode(vmf->vma->vm_file);
- struct xfs_inode *ip = XFS_I(inode);
- vm_fault_t ret;
- unsigned int lock_mode = 0;
- trace_xfs_filemap_fault(ip, order, write_fault);
-
- if (write_fault) {
- sb_start_pagefault(inode->i_sb);
- file_update_time(vmf->vma->vm_file);
- }
-
- if (IS_DAX(inode) || write_fault)
- lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
-
- if (IS_DAX(inode)) {
- ret = xfs_dax_fault(vmf, order, write_fault);
- } else if (write_fault) {
- ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
- } else {
- ret = filemap_fault(vmf);
- }
-
- if (lock_mode)
- xfs_iunlock(XFS_I(inode), lock_mode);
+ trace_xfs_filemap_fault(XFS_I(inode), order, write_fault);
if (write_fault)
- sb_end_pagefault(inode->i_sb);
- return ret;
+ return xfs_write_fault(vmf, order);
+ if (IS_DAX(inode))
+ return xfs_dax_fault(vmf, order);
+ return filemap_fault(vmf);
}
static inline bool
--
2.43.0
* [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
` (3 preceding siblings ...)
2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
2024-06-20 18:56 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
After the previous refactoring, xfs_dax_fault is now never used for write
faults, so don't bother with the xfs_ilock_for_write_fault logic to
protect against writes when remapping is in progress.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 51e50afd935895..62a69ed796f2fd 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1279,12 +1279,11 @@ xfs_dax_fault(
unsigned int order)
{
struct xfs_inode *ip = XFS_I(file_inode(vmf->vma->vm_file));
- unsigned int lock_mode;
vm_fault_t ret;
- lock_mode = xfs_ilock_for_write_fault(ip);
+ xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
ret = xfs_dax_fault_locked(vmf, order, false);
- xfs_iunlock(ip, lock_mode);
+ xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
return ret;
}
--
2.43.0
* [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
` (4 preceding siblings ...)
2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
@ 2024-06-19 11:53 ` Christoph Hellwig
2024-06-20 18:57 ` Darrick J. Wong
5 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-19 11:53 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
Now that the page fault handler has been refactored, the only caller
of xfs_ilock_for_write_fault is simple enough and calls it
unconditionally. Fold the logic and expand the comments explaining it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 33 +++++++++++++++------------------
1 file changed, 15 insertions(+), 18 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 62a69ed796f2fd..05ea96661c475f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -227,21 +227,6 @@ xfs_ilock_iocb_for_write(
return 0;
}
-static unsigned int
-xfs_ilock_for_write_fault(
- struct xfs_inode *ip)
-{
- /* get a shared lock if no remapping in progress */
- xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
- if (!xfs_iflags_test(ip, XFS_IREMAPPING))
- return XFS_MMAPLOCK_SHARED;
-
- /* wait for remapping to complete */
- xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
- xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
- return XFS_MMAPLOCK_EXCL;
-}
-
STATIC ssize_t
xfs_file_dio_read(
struct kiocb *iocb,
@@ -1294,18 +1279,30 @@ xfs_write_fault(
unsigned int order)
{
struct inode *inode = file_inode(vmf->vma->vm_file);
- unsigned int lock_mode;
+ struct xfs_inode *ip = XFS_I(inode);
+ unsigned int lock_mode = XFS_MMAPLOCK_SHARED;
vm_fault_t ret;
sb_start_pagefault(inode->i_sb);
file_update_time(vmf->vma->vm_file);
- lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
+ /*
+ * Normally we only need the shared mmaplock, but if a reflink remap is
+ * in progress we take the exclusive lock to wait for the remap to
+ * finish before taking a write fault.
+ */
+ xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
+ if (xfs_iflags_test(ip, XFS_IREMAPPING)) {
+ xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
+ xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
+ lock_mode = XFS_MMAPLOCK_EXCL;
+ }
+
if (IS_DAX(inode))
ret = xfs_dax_fault_locked(vmf, order, true);
else
ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
- xfs_iunlock(XFS_I(inode), lock_mode);
+ xfs_iunlock(ip, lock_mode);
sb_end_pagefault(inode->i_sb);
return ret;
--
2.43.0
* Re: [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap
2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
@ 2024-06-20 18:40 ` Darrick J. Wong
0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:40 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Wed, Jun 19, 2024 at 01:53:51PM +0200, Christoph Hellwig wrote:
> About half of xfs_ilock_for_iomap deals with a special case for direct
> I/O writes to COW files that need to take the ilock exclusively. Move
> this code into the one caller that cares and simplify
> xfs_ilock_for_iomap.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
It took me a few minutes to sort this out, but yeah, directio writes are
the only place where we do this shared->cow->excl promotion dance.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_iomap.c | 71 ++++++++++++++++++++++------------------------
> 1 file changed, 34 insertions(+), 37 deletions(-)
>
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 3783426739258c..b0085e5972393a 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -717,53 +717,30 @@ imap_needs_cow(
> return true;
> }
>
> +/*
> + * Extents not yet cached require exclusive access, don't block for
> + * IOMAP_NOWAIT.
> + *
> + * This is basically an opencoded xfs_ilock_data_map_shared() call, but with
> + * support for IOMAP_NOWAIT.
> + */
> static int
> xfs_ilock_for_iomap(
> struct xfs_inode *ip,
> unsigned flags,
> unsigned *lockmode)
> {
> - unsigned int mode = *lockmode;
> - bool is_write = flags & (IOMAP_WRITE | IOMAP_ZERO);
> -
> - /*
> - * COW writes may allocate delalloc space or convert unwritten COW
> - * extents, so we need to make sure to take the lock exclusively here.
> - */
> - if (xfs_is_cow_inode(ip) && is_write)
> - mode = XFS_ILOCK_EXCL;
> -
> - /*
> - * Extents not yet cached requires exclusive access, don't block. This
> - * is an opencoded xfs_ilock_data_map_shared() call but with
> - * non-blocking behaviour.
> - */
> - if (xfs_need_iread_extents(&ip->i_df)) {
> - if (flags & IOMAP_NOWAIT)
> - return -EAGAIN;
> - mode = XFS_ILOCK_EXCL;
> - }
> -
> -relock:
> if (flags & IOMAP_NOWAIT) {
> - if (!xfs_ilock_nowait(ip, mode))
> + if (xfs_need_iread_extents(&ip->i_df))
> + return -EAGAIN;
> + if (!xfs_ilock_nowait(ip, *lockmode))
> return -EAGAIN;
> } else {
> - xfs_ilock(ip, mode);
> + if (xfs_need_iread_extents(&ip->i_df))
> + *lockmode = XFS_ILOCK_EXCL;
> + xfs_ilock(ip, *lockmode);
> }
>
> - /*
> - * The reflink iflag could have changed since the earlier unlocked
> - * check, so if we got ILOCK_SHARED for a write and but we're now a
> - * reflink inode we have to switch to ILOCK_EXCL and relock.
> - */
> - if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
> - xfs_iunlock(ip, mode);
> - mode = XFS_ILOCK_EXCL;
> - goto relock;
> - }
> -
> - *lockmode = mode;
> return 0;
> }
>
> @@ -801,7 +778,7 @@ xfs_direct_write_iomap_begin(
> int nimaps = 1, error = 0;
> bool shared = false;
> u16 iomap_flags = 0;
> - unsigned int lockmode = XFS_ILOCK_SHARED;
> + unsigned int lockmode;
> u64 seq;
>
> ASSERT(flags & (IOMAP_WRITE | IOMAP_ZERO));
> @@ -817,10 +794,30 @@ xfs_direct_write_iomap_begin(
> if (offset + length > i_size_read(inode))
> iomap_flags |= IOMAP_F_DIRTY;
>
> + /*
> + * COW writes may allocate delalloc space or convert unwritten COW
> + * extents, so we need to make sure to take the lock exclusively here.
> + */
> + if (xfs_is_cow_inode(ip))
> + lockmode = XFS_ILOCK_EXCL;
> + else
> + lockmode = XFS_ILOCK_SHARED;
> +
> +relock:
> error = xfs_ilock_for_iomap(ip, flags, &lockmode);
> if (error)
> return error;
>
> + /*
> + * The reflink iflag could have changed since the earlier unlocked
> + * check, so check it again and relock if needed.
> + */
> + if (xfs_is_cow_inode(ip) && lockmode == XFS_ILOCK_SHARED) {
> + xfs_iunlock(ip, lockmode);
> + lockmode = XFS_ILOCK_EXCL;
> + goto relock;
> + }
> +
> error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap,
> &nimaps, 0);
> if (error)
> --
> 2.43.0
>
>
* Re: [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write
2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
@ 2024-06-20 18:41 ` Darrick J. Wong
0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:41 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Wed, Jun 19, 2024 at 01:53:52PM +0200, Christoph Hellwig wrote:
> Move the relock path out of the straight line and add a comment
> explaining why it exists.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_file.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index b240ea5241dc9d..74c2c8d253e69b 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -213,14 +213,18 @@ xfs_ilock_iocb_for_write(
> if (ret)
> return ret;
>
> - if (*lock_mode == XFS_IOLOCK_EXCL)
> - return 0;
> - if (!xfs_iflags_test(ip, XFS_IREMAPPING))
> - return 0;
> + /*
> + * If a reflink remap is in progress we always need to take the iolock
> + * exclusively to wait for it to finish.
> + */
> + if (*lock_mode == XFS_IOLOCK_SHARED &&
> + xfs_iflags_test(ip, XFS_IREMAPPING)) {
> + xfs_iunlock(ip, *lock_mode);
> + *lock_mode = XFS_IOLOCK_EXCL;
> + return xfs_ilock_iocb(iocb, *lock_mode);
> + }
>
> - xfs_iunlock(ip, *lock_mode);
> - *lock_mode = XFS_IOLOCK_EXCL;
> - return xfs_ilock_iocb(iocb, *lock_mode);
> + return 0;
> }
>
> static unsigned int
> --
> 2.43.0
>
>
* Re: [PATCH 3/6] xfs: simplify xfs_dax_fault
2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
@ 2024-06-20 18:50 ` Darrick J. Wong
2024-06-21 5:05 ` Christoph Hellwig
0 siblings, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:50 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Wed, Jun 19, 2024 at 01:53:53PM +0200, Christoph Hellwig wrote:
> Replace the separate stub with an IS_ENABLED check, and move the call to
> dax_finish_sync_fault into xfs_dax_fault instead of leaving it in the
> caller.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_file.c | 34 +++++++++++++---------------------
> 1 file changed, 13 insertions(+), 21 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 74c2c8d253e69b..8aab2f66fe016f 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1251,31 +1251,27 @@ xfs_file_llseek(
> return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
> }
>
> -#ifdef CONFIG_FS_DAX
> static inline vm_fault_t
> xfs_dax_fault(
> struct vm_fault *vmf,
> unsigned int order,
> - bool write_fault,
> - pfn_t *pfn)
> + bool write_fault)
> {
> - return dax_iomap_fault(vmf, order, pfn, NULL,
> + vm_fault_t ret;
> + pfn_t pfn;
> +
> + if (!IS_ENABLED(CONFIG_FS_DAX)) {
> + ASSERT(0);
> + return VM_FAULT_SIGBUS;
Does this actually work if FS_DAX=n? AFAICT there's no !DAX stub for
dax_iomap_fault, so won't that cause a linker error?
> + }
> + ret = dax_iomap_fault(vmf, order, &pfn, NULL,
> (write_fault && !vmf->cow_page) ?
> &xfs_dax_write_iomap_ops :
> &xfs_read_iomap_ops);
> + if (ret & VM_FAULT_NEEDDSYNC)
> + ret = dax_finish_sync_fault(vmf, order, pfn);
> + return ret;
I /almost/ wondered if these ought to be separate helpers for read and
write faults, but then I realized the (write && cow_page) case is a
"read" and ... lol. So the only question I have is about linker errors.
--D
> }
> -#else
> -static inline vm_fault_t
> -xfs_dax_fault(
> - struct vm_fault *vmf,
> - unsigned int order,
> - bool write_fault,
> - pfn_t *pfn)
> -{
> - ASSERT(0);
> - return VM_FAULT_SIGBUS;
> -}
> -#endif
>
> /*
> * Locking for serialisation of IO during page faults. This results in a lock
> @@ -1309,11 +1305,7 @@ __xfs_filemap_fault(
> lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
>
> if (IS_DAX(inode)) {
> - pfn_t pfn;
> -
> - ret = xfs_dax_fault(vmf, order, write_fault, &pfn);
> - if (ret & VM_FAULT_NEEDDSYNC)
> - ret = dax_finish_sync_fault(vmf, order, pfn);
> + ret = xfs_dax_fault(vmf, order, write_fault);
> } else if (write_fault) {
> ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
> } else {
> --
> 2.43.0
>
>
* Re: [PATCH 4/6] xfs: refactor __xfs_filemap_fault
2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
@ 2024-06-20 18:54 ` Darrick J. Wong
2024-06-21 5:08 ` Christoph Hellwig
0 siblings, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:54 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Wed, Jun 19, 2024 at 01:53:54PM +0200, Christoph Hellwig wrote:
> Split the write fault and DAX fault handling into separate helpers
> so that the main fault handler is easier to follow.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_file.c | 71 ++++++++++++++++++++++++++++++-----------------
> 1 file changed, 45 insertions(+), 26 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 8aab2f66fe016f..51e50afd935895 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1252,7 +1252,7 @@ xfs_file_llseek(
> }
>
> static inline vm_fault_t
> -xfs_dax_fault(
> +xfs_dax_fault_locked(
> struct vm_fault *vmf,
> unsigned int order,
> bool write_fault)
> @@ -1273,6 +1273,45 @@ xfs_dax_fault(
> return ret;
> }
>
> +static vm_fault_t
> +xfs_dax_fault(
Oh, hey, you /did/ split the dax handling into two helpers.
Would you mind renaming this xfs_dax_read_fault since this doesn't
handle write faults?
With that changed,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> + struct vm_fault *vmf,
> + unsigned int order)
> +{
> + struct xfs_inode *ip = XFS_I(file_inode(vmf->vma->vm_file));
> + unsigned int lock_mode;
> + vm_fault_t ret;
> +
> + lock_mode = xfs_ilock_for_write_fault(ip);
> + ret = xfs_dax_fault_locked(vmf, order, false);
> + xfs_iunlock(ip, lock_mode);
> +
> + return ret;
> +}
> +
> +static vm_fault_t
> +xfs_write_fault(
> + struct vm_fault *vmf,
> + unsigned int order)
> +{
> + struct inode *inode = file_inode(vmf->vma->vm_file);
> + unsigned int lock_mode;
> + vm_fault_t ret;
> +
> + sb_start_pagefault(inode->i_sb);
> + file_update_time(vmf->vma->vm_file);
> +
> + lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
> + if (IS_DAX(inode))
> + ret = xfs_dax_fault_locked(vmf, order, true);
> + else
> + ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
> + xfs_iunlock(XFS_I(inode), lock_mode);
> +
> + sb_end_pagefault(inode->i_sb);
> + return ret;
> +}
> +
> /*
> * Locking for serialisation of IO during page faults. This results in a lock
> * ordering of:
> @@ -1290,34 +1329,14 @@ __xfs_filemap_fault(
> bool write_fault)
> {
> struct inode *inode = file_inode(vmf->vma->vm_file);
> - struct xfs_inode *ip = XFS_I(inode);
> - vm_fault_t ret;
> - unsigned int lock_mode = 0;
>
> - trace_xfs_filemap_fault(ip, order, write_fault);
> -
> - if (write_fault) {
> - sb_start_pagefault(inode->i_sb);
> - file_update_time(vmf->vma->vm_file);
> - }
> -
> - if (IS_DAX(inode) || write_fault)
> - lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
> -
> - if (IS_DAX(inode)) {
> - ret = xfs_dax_fault(vmf, order, write_fault);
> - } else if (write_fault) {
> - ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
> - } else {
> - ret = filemap_fault(vmf);
> - }
> -
> - if (lock_mode)
> - xfs_iunlock(XFS_I(inode), lock_mode);
> + trace_xfs_filemap_fault(XFS_I(inode), order, write_fault);
>
> if (write_fault)
> - sb_end_pagefault(inode->i_sb);
> - return ret;
> + return xfs_write_fault(vmf, order);
> + if (IS_DAX(inode))
> + return xfs_dax_fault(vmf, order);
> + return filemap_fault(vmf);
> }
>
> static inline bool
> --
> 2.43.0
>
>
* Re: [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault
2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
@ 2024-06-20 18:56 ` Darrick J. Wong
0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:56 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Wed, Jun 19, 2024 at 01:53:55PM +0200, Christoph Hellwig wrote:
> After the previous refactoring, xfs_dax_fault is now never used for write
> faults, so don't bother with the xfs_ilock_for_write_fault logic to
> protect against writes when remapping is in progress.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Makes sense to me, all we need to do here is make sure that nobody can
invalidate the mappings -- there's no need to take MMAPLOCK_EXCL if a
reflink is cloning from this file's data.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_file.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 51e50afd935895..62a69ed796f2fd 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1279,12 +1279,11 @@ xfs_dax_fault(
> unsigned int order)
> {
> struct xfs_inode *ip = XFS_I(file_inode(vmf->vma->vm_file));
> - unsigned int lock_mode;
> vm_fault_t ret;
>
> - lock_mode = xfs_ilock_for_write_fault(ip);
> + xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
> ret = xfs_dax_fault_locked(vmf, order, false);
> - xfs_iunlock(ip, lock_mode);
> + xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
>
> return ret;
> }
> --
> 2.43.0
>
>
* Re: [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault
2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
@ 2024-06-20 18:57 ` Darrick J. Wong
0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-20 18:57 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Wed, Jun 19, 2024 at 01:53:56PM +0200, Christoph Hellwig wrote:
> Now that the page fault handler has been refactored, the only caller
> of xfs_ilock_for_write_fault is simple enough and calls it
> unconditionally. Fold the logic and expand the comments explaining it.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Makes sense to me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_file.c | 33 +++++++++++++++------------------
> 1 file changed, 15 insertions(+), 18 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 62a69ed796f2fd..05ea96661c475f 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -227,21 +227,6 @@ xfs_ilock_iocb_for_write(
> return 0;
> }
>
> -static unsigned int
> -xfs_ilock_for_write_fault(
> - struct xfs_inode *ip)
> -{
> - /* get a shared lock if no remapping in progress */
> - xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
> - if (!xfs_iflags_test(ip, XFS_IREMAPPING))
> - return XFS_MMAPLOCK_SHARED;
> -
> - /* wait for remapping to complete */
> - xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
> - xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> - return XFS_MMAPLOCK_EXCL;
> -}
> -
> STATIC ssize_t
> xfs_file_dio_read(
> struct kiocb *iocb,
> @@ -1294,18 +1279,30 @@ xfs_write_fault(
> unsigned int order)
> {
> struct inode *inode = file_inode(vmf->vma->vm_file);
> - unsigned int lock_mode;
> + struct xfs_inode *ip = XFS_I(inode);
> + unsigned int lock_mode = XFS_MMAPLOCK_SHARED;
> vm_fault_t ret;
>
> sb_start_pagefault(inode->i_sb);
> file_update_time(vmf->vma->vm_file);
>
> - lock_mode = xfs_ilock_for_write_fault(XFS_I(inode));
> + /*
> + * Normally we only need the shared mmaplock, but if a reflink remap is
> + * in progress we take the exclusive lock to wait for the remap to
> + * finish before taking a write fault.
> + */
> + xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
> + if (xfs_iflags_test(ip, XFS_IREMAPPING)) {
> + xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
> + xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> + lock_mode = XFS_MMAPLOCK_EXCL;
> + }
> +
> if (IS_DAX(inode))
> ret = xfs_dax_fault_locked(vmf, order, true);
> else
> ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops);
> - xfs_iunlock(XFS_I(inode), lock_mode);
> + xfs_iunlock(ip, lock_mode);
>
> sb_end_pagefault(inode->i_sb);
> return ret;
> --
> 2.43.0
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/6] xfs: simplify xfs_dax_fault
2024-06-20 18:50 ` Darrick J. Wong
@ 2024-06-21 5:05 ` Christoph Hellwig
2024-06-21 17:43 ` Darrick J. Wong
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-21 5:05 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs
On Thu, Jun 20, 2024 at 11:50:26AM -0700, Darrick J. Wong wrote:
> > + if (!IS_ENABLED(CONFIG_FS_DAX)) {
> > + ASSERT(0);
> > + return VM_FAULT_SIGBUS;
>
> Does this actually work if FS_DAX=n? AFAICT there's no !DAX stub for
> dax_iomap_fault, so won't that cause a linker error?
IS_ENABLED expands to a compile time constant, so the compiler eliminates
the call and no stub is needed.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/6] xfs: refactor __xfs_filemap_fault
2024-06-20 18:54 ` Darrick J. Wong
@ 2024-06-21 5:08 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-21 5:08 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs
On Thu, Jun 20, 2024 at 11:54:06AM -0700, Darrick J. Wong wrote:
> Oh, hey, you /did/ split the dax handling into two helpers.
>
> Would you mind renaming this xfs_dax_read_fault since this doesn't
> handle write faults?
Sure.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/6] xfs: simplify xfs_dax_fault
2024-06-21 5:05 ` Christoph Hellwig
@ 2024-06-21 17:43 ` Darrick J. Wong
0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-06-21 17:43 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs
On Fri, Jun 21, 2024 at 07:05:58AM +0200, Christoph Hellwig wrote:
> On Thu, Jun 20, 2024 at 11:50:26AM -0700, Darrick J. Wong wrote:
> > > + if (!IS_ENABLED(CONFIG_FS_DAX)) {
> > > + ASSERT(0);
> > > + return VM_FAULT_SIGBUS;
> >
> > Does this actually work if FS_DAX=n? AFAICT there's no !DAX stub for
> > dax_iomap_fault, so won't that cause a linker error?
>
> IS_ENABLED expands to a compile time constant, so the compiler eliminates
> the call and no stub is needed.
Huh, that actually links. I am astonished.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap
2024-06-23 5:44 clean up I/O path inode locking helpers and the page fault handler v2 Christoph Hellwig
@ 2024-06-23 5:44 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-06-23 5:44 UTC (permalink / raw)
To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs
About half of xfs_ilock_for_iomap deals with a special case for direct
I/O writes to COW files that need to take the ilock exclusively. Move
this code into the one caller that cares and simplify
xfs_ilock_for_iomap.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_iomap.c | 71 ++++++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 37 deletions(-)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3783426739258c..b0085e5972393a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -717,53 +717,30 @@ imap_needs_cow(
return true;
}
+/*
+ * Extents not yet cached require exclusive access; don't block for
+ * IOMAP_NOWAIT.
+ *
+ * This is basically an opencoded xfs_ilock_data_map_shared() call, but with
+ * support for IOMAP_NOWAIT.
+ */
static int
xfs_ilock_for_iomap(
struct xfs_inode *ip,
unsigned flags,
unsigned *lockmode)
{
- unsigned int mode = *lockmode;
- bool is_write = flags & (IOMAP_WRITE | IOMAP_ZERO);
-
- /*
- * COW writes may allocate delalloc space or convert unwritten COW
- * extents, so we need to make sure to take the lock exclusively here.
- */
- if (xfs_is_cow_inode(ip) && is_write)
- mode = XFS_ILOCK_EXCL;
-
- /*
- * Extents not yet cached requires exclusive access, don't block. This
- * is an opencoded xfs_ilock_data_map_shared() call but with
- * non-blocking behaviour.
- */
- if (xfs_need_iread_extents(&ip->i_df)) {
- if (flags & IOMAP_NOWAIT)
- return -EAGAIN;
- mode = XFS_ILOCK_EXCL;
- }
-
-relock:
if (flags & IOMAP_NOWAIT) {
- if (!xfs_ilock_nowait(ip, mode))
+ if (xfs_need_iread_extents(&ip->i_df))
+ return -EAGAIN;
+ if (!xfs_ilock_nowait(ip, *lockmode))
return -EAGAIN;
} else {
- xfs_ilock(ip, mode);
+ if (xfs_need_iread_extents(&ip->i_df))
+ *lockmode = XFS_ILOCK_EXCL;
+ xfs_ilock(ip, *lockmode);
}
- /*
- * The reflink iflag could have changed since the earlier unlocked
- * check, so if we got ILOCK_SHARED for a write and but we're now a
- * reflink inode we have to switch to ILOCK_EXCL and relock.
- */
- if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
- xfs_iunlock(ip, mode);
- mode = XFS_ILOCK_EXCL;
- goto relock;
- }
-
- *lockmode = mode;
return 0;
}
@@ -801,7 +778,7 @@ xfs_direct_write_iomap_begin(
int nimaps = 1, error = 0;
bool shared = false;
u16 iomap_flags = 0;
- unsigned int lockmode = XFS_ILOCK_SHARED;
+ unsigned int lockmode;
u64 seq;
ASSERT(flags & (IOMAP_WRITE | IOMAP_ZERO));
@@ -817,10 +794,30 @@ xfs_direct_write_iomap_begin(
if (offset + length > i_size_read(inode))
iomap_flags |= IOMAP_F_DIRTY;
+ /*
+ * COW writes may allocate delalloc space or convert unwritten COW
+ * extents, so we need to make sure to take the lock exclusively here.
+ */
+ if (xfs_is_cow_inode(ip))
+ lockmode = XFS_ILOCK_EXCL;
+ else
+ lockmode = XFS_ILOCK_SHARED;
+
+relock:
error = xfs_ilock_for_iomap(ip, flags, &lockmode);
if (error)
return error;
+ /*
+ * The reflink iflag could have changed since the earlier unlocked
+ * check, so check again and relock if needed.
+ */
+ if (xfs_is_cow_inode(ip) && lockmode == XFS_ILOCK_SHARED) {
+ xfs_iunlock(ip, lockmode);
+ lockmode = XFS_ILOCK_EXCL;
+ goto relock;
+ }
+
error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap,
&nimaps, 0);
if (error)
--
2.43.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-06-23 5:45 UTC | newest]
Thread overview: 17+ messages
2024-06-19 11:53 clean up I/O path inode locking helpers and the page fault handler Christoph Hellwig
2024-06-19 11:53 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig
2024-06-20 18:40 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 2/6] xfs: cleanup xfs_ilock_iocb_for_write Christoph Hellwig
2024-06-20 18:41 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 3/6] xfs: simplify xfs_dax_fault Christoph Hellwig
2024-06-20 18:50 ` Darrick J. Wong
2024-06-21 5:05 ` Christoph Hellwig
2024-06-21 17:43 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 4/6] xfs: refactor __xfs_filemap_fault Christoph Hellwig
2024-06-20 18:54 ` Darrick J. Wong
2024-06-21 5:08 ` Christoph Hellwig
2024-06-19 11:53 ` [PATCH 5/6] xfs: always take XFS_MMAPLOCK shared in xfs_dax_fault Christoph Hellwig
2024-06-20 18:56 ` Darrick J. Wong
2024-06-19 11:53 ` [PATCH 6/6] xfs: fold xfs_ilock_for_write_fault into xfs_write_fault Christoph Hellwig
2024-06-20 18:57 ` Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2024-06-23 5:44 clean up I/O path inode locking helpers and the page fault handler v2 Christoph Hellwig
2024-06-23 5:44 ` [PATCH 1/6] xfs: move the dio write relocking out of xfs_ilock_for_iomap Christoph Hellwig