From: Christoph Hellwig <hch@lst.de>
To: Chandan Babu R <chandan.babu@oracle.com>,
Christian Brauner <brauner@kernel.org>,
"Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 02/10] iomap: remove iomap_file_buffered_write_punch_delalloc
Date: Mon, 23 Sep 2024 17:28:16 +0200 [thread overview]
Message-ID: <20240923152904.1747117-3-hch@lst.de> (raw)
In-Reply-To: <20240923152904.1747117-1-hch@lst.de>
Currently iomap_file_buffered_write_punch_delalloc can be called from
XFS either with the invalidate lock held or not. To fix this while
keeping the locking in the file system and not the iomap library
code we'll need to life the locking up into the file system.
To prepare for that, open code iomap_file_buffered_write_punch_delalloc
in the only caller, and instead export iomap_write_delalloc_release.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
.../filesystems/iomap/operations.rst | 2 +-
fs/iomap/buffered-io.c | 85 ++++++-------------
fs/xfs/xfs_iomap.c | 16 +++-
include/linux/iomap.h | 6 +-
4 files changed, 46 insertions(+), 63 deletions(-)
diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
index 8e6c721d233010..b93115ab8748ae 100644
--- a/Documentation/filesystems/iomap/operations.rst
+++ b/Documentation/filesystems/iomap/operations.rst
@@ -208,7 +208,7 @@ The filesystem must arrange to `cancel
such `reservations
<https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_
because writeback will not consume the reservation.
-The ``iomap_file_buffered_write_punch_delalloc`` can be called from a
+The ``iomap_write_delalloc_release`` can be called from a
``->iomap_end`` function to find all the clean areas of the folios
caching a fresh (``IOMAP_F_NEW``) delalloc mapping.
It takes the ``invalidate_lock``.
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 884891ac7a226c..237aeb883166df 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1149,6 +1149,32 @@ static void iomap_write_delalloc_scan(struct inode *inode,
* have dirty data still pending in the page cache - those are going to be
* written and so must still retain the delalloc backing for writeback.
*
+ * When a short write occurs, the filesystem may need to remove reserved space
+ * that was allocated in ->iomap_begin from it's ->iomap_end method. For
+ * filesystems that use delayed allocation, we need to punch out delalloc
+ * extents from the range that are not dirty in the page cache. As the write can
+ * race with page faults, there can be dirty pages over the delalloc extent
+ * outside the range of a short write but still within the delalloc extent
+ * allocated for this iomap.
+ *
+ * The punch() callback *must* only punch delalloc extents in the range passed
+ * to it. It must skip over all other types of extents in the range and leave
+ * them completely unchanged. It must do this punch atomically with respect to
+ * other extent modifications.
+ *
+ * The punch() callback may be called with a folio locked to prevent writeback
+ * extent allocation racing at the edge of the range we are currently punching.
+ * The locked folio may or may not cover the range being punched, so it is not
+ * safe for the punch() callback to lock folios itself.
+ *
+ * Lock order is:
+ *
+ * inode->i_rwsem (shared or exclusive)
+ * inode->i_mapping->invalidate_lock (exclusive)
+ * folio_lock()
+ * ->punch
+ * internal filesystem allocation lock
+ *
* As we are scanning the page cache for data, we don't need to reimplement the
* wheel - mapping_seek_hole_data() does exactly what we need to identify the
* start and end of data ranges correctly even for sub-folio block sizes. This
@@ -1177,7 +1203,7 @@ static void iomap_write_delalloc_scan(struct inode *inode,
* require sprinkling this code with magic "+ 1" and "- 1" arithmetic and expose
* the code to subtle off-by-one bugs....
*/
-static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
+void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
loff_t end_byte, unsigned flags, struct iomap *iomap,
iomap_punch_t punch)
{
@@ -1243,62 +1269,7 @@ static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
out_unlock:
filemap_invalidate_unlock(inode->i_mapping);
}
-
-/*
- * When a short write occurs, the filesystem may need to remove reserved space
- * that was allocated in ->iomap_begin from it's ->iomap_end method. For
- * filesystems that use delayed allocation, we need to punch out delalloc
- * extents from the range that are not dirty in the page cache. As the write can
- * race with page faults, there can be dirty pages over the delalloc extent
- * outside the range of a short write but still within the delalloc extent
- * allocated for this iomap.
- *
- * This function uses [start_byte, end_byte) intervals (i.e. open ended) to
- * simplify range iterations.
- *
- * The punch() callback *must* only punch delalloc extents in the range passed
- * to it. It must skip over all other types of extents in the range and leave
- * them completely unchanged. It must do this punch atomically with respect to
- * other extent modifications.
- *
- * The punch() callback may be called with a folio locked to prevent writeback
- * extent allocation racing at the edge of the range we are currently punching.
- * The locked folio may or may not cover the range being punched, so it is not
- * safe for the punch() callback to lock folios itself.
- *
- * Lock order is:
- *
- * inode->i_rwsem (shared or exclusive)
- * inode->i_mapping->invalidate_lock (exclusive)
- * folio_lock()
- * ->punch
- * internal filesystem allocation lock
- */
-void iomap_file_buffered_write_punch_delalloc(struct inode *inode,
- loff_t pos, loff_t length, ssize_t written, unsigned flags,
- struct iomap *iomap, iomap_punch_t punch)
-{
- loff_t start_byte;
- loff_t end_byte;
-
- if (iomap->type != IOMAP_DELALLOC)
- return;
-
- /* If we didn't reserve the blocks, we're not allowed to punch them. */
- if (!(iomap->flags & IOMAP_F_NEW))
- return;
-
- start_byte = iomap_last_written_block(inode, pos, written);
- end_byte = round_up(pos + length, i_blocksize(inode));
-
- /* Nothing to do if we've written the entire delalloc extent */
- if (start_byte >= end_byte)
- return;
-
- iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
- punch);
-}
-EXPORT_SYMBOL_GPL(iomap_file_buffered_write_punch_delalloc);
+EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
static loff_t iomap_unshare_iter(struct iomap_iter *iter)
{
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 1e11f48814c0d0..30f2530b6d5461 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1227,8 +1227,20 @@ xfs_buffered_write_iomap_end(
unsigned flags,
struct iomap *iomap)
{
- iomap_file_buffered_write_punch_delalloc(inode, offset, length, written,
- flags, iomap, &xfs_buffered_write_delalloc_punch);
+ loff_t start_byte, end_byte;
+
+ /* If we didn't reserve the blocks, we're not allowed to punch them. */
+ if (iomap->type != IOMAP_DELALLOC || !(iomap->flags & IOMAP_F_NEW))
+ return 0;
+
+ /* Nothing to do if we've written the entire delalloc extent */
+ start_byte = iomap_last_written_block(inode, offset, written);
+ end_byte = round_up(offset + length, i_blocksize(inode));
+ if (start_byte >= end_byte)
+ return 0;
+
+ iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap,
+ xfs_buffered_write_delalloc_punch);
return 0;
}
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e62df5d93f04de..137e0783faa224 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -290,9 +290,9 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf,
typedef void (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length,
struct iomap *iomap);
-void iomap_file_buffered_write_punch_delalloc(struct inode *inode, loff_t pos,
- loff_t length, ssize_t written, unsigned flag,
- struct iomap *iomap, iomap_punch_t punch);
+void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
+ loff_t end_byte, unsigned flags, struct iomap *iomap,
+ iomap_punch_t punch);
int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 len, const struct iomap_ops *ops);
--
2.45.2
next prev parent reply other threads:[~2024-09-23 15:29 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-23 15:28 fix stale delalloc punching for COW I/O v3 Christoph Hellwig
2024-09-23 15:28 ` [PATCH 01/10] iomap: factor out a iomap_last_written_block helper Christoph Hellwig
2024-09-23 15:53 ` Darrick J. Wong
2024-09-24 5:45 ` Christoph Hellwig
2024-09-23 15:28 ` Christoph Hellwig [this message]
2024-09-23 16:18 ` [PATCH 02/10] iomap: remove iomap_file_buffered_write_punch_delalloc Darrick J. Wong
2024-09-23 22:43 ` Dave Chinner
2024-09-24 5:55 ` Christoph Hellwig
2024-09-24 6:05 ` Darrick J. Wong
2024-09-24 6:10 ` Christoph Hellwig
2024-09-23 15:28 ` [PATCH 03/10] iomap: move locking out of iomap_write_delalloc_release Christoph Hellwig
2024-09-23 16:19 ` Darrick J. Wong
2024-09-23 15:28 ` [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper Christoph Hellwig
2024-09-23 16:20 ` Darrick J. Wong
2024-09-23 15:28 ` [PATCH 05/10] xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof Christoph Hellwig
2024-09-23 15:28 ` [PATCH 06/10] xfs: zeroing already holds invalidate_lock Christoph Hellwig
2024-09-23 16:22 ` Darrick J. Wong
2024-09-24 5:44 ` Christoph Hellwig
2024-09-23 15:28 ` [PATCH 07/10] xfs: support the COW fork in xfs_bmap_punch_delalloc_range Christoph Hellwig
2024-09-23 15:28 ` [PATCH 08/10] xfs: share more code in xfs_buffered_write_iomap_begin Christoph Hellwig
2024-09-23 15:28 ` [PATCH 09/10] xfs: set IOMAP_F_SHARED for all COW fork allocations Christoph Hellwig
2024-09-23 15:28 ` [PATCH 10/10] xfs: punch delalloc extents from the COW fork for COW writes Christoph Hellwig
2024-09-25 9:19 ` fix stale delalloc punching for COW I/O v3 Christian Brauner
2024-09-25 9:24 ` fix stale delalloc punching for COW I/O v4 Christian Brauner
-- strict thread matches above, loose matches on Subject: below --
2024-09-24 7:40 Christoph Hellwig
2024-09-24 7:40 ` [PATCH 01/10] iomap: factor out a iomap_last_written_block helper Christoph Hellwig
2024-09-24 14:58 ` Darrick J. Wong
2024-09-24 7:40 ` [PATCH 02/10] iomap: remove iomap_file_buffered_write_punch_delalloc Christoph Hellwig
2024-09-24 7:40 ` [PATCH 03/10] iomap: move locking out of iomap_write_delalloc_release Christoph Hellwig
2024-09-24 7:40 ` [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper Christoph Hellwig
2024-09-24 7:40 ` [PATCH 05/10] xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof Christoph Hellwig
2024-09-24 7:40 ` [PATCH 06/10] xfs: zeroing already holds invalidate_lock Christoph Hellwig
2024-09-24 7:40 ` [PATCH 07/10] xfs: support the COW fork in xfs_bmap_punch_delalloc_range Christoph Hellwig
2024-09-24 7:40 ` [PATCH 08/10] xfs: share more code in xfs_buffered_write_iomap_begin Christoph Hellwig
2024-09-24 7:40 ` [PATCH 09/10] xfs: set IOMAP_F_SHARED for all COW fork allocations Christoph Hellwig
2024-09-24 7:40 ` [PATCH 10/10] xfs: punch delalloc extents from the COW fork for COW writes Christoph Hellwig
2024-10-05 15:53 ` fix stale delalloc punching for COW I/O v4 Darrick J. Wong
2024-10-07 5:41 ` Christoph Hellwig
2024-10-07 6:28 ` Darrick J. Wong
2024-10-07 6:46 ` Christoph Hellwig
2024-10-07 15:20 ` Darrick J. Wong
2024-10-08 8:59 fix stale delalloc punching for COW I/O v5 Christoph Hellwig
2024-10-08 8:59 ` [PATCH 02/10] iomap: remove iomap_file_buffered_write_punch_delalloc Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240923152904.1747117-3-hch@lst.de \
--to=hch@lst.de \
--cc=brauner@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=djwong@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).