* [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2
@ 2025-02-19 17:50 Brian Foster
2025-02-19 17:50 ` [PATCH v2 01/12] iomap: advance the iter directly on buffered read Brian Foster
` (12 more replies)
0 siblings, 13 replies; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Hi all,
Here's phase 2 of the incremental iter advance conversions. This updates
all remaining iomap operations to advance the iter within the operation
and thus removes the need to advance from the core iomap iterator. Once
all operations are switched over, the core advance code is removed and
the processed field is renamed to reflect that it is now a pure status
code.
For context, this was first introduced in a previous series [1] that
focused mainly on the core mechanism and iomap buffered write. This is
because original impetus was to facilitate a folio batch mechanism where
a filesystem can optionally provide a batch of folios to process for a
given mapping (i.e. zero range of an unwritten mapping with dirty folios
in pagecache). That is still WIP, but the broader point is that this was
originally intended as an optional mode until consensus that fell out of
discussion was that it would be preferable to convert over everything.
This presumably facilitates some other future work and simplifies
semantics in the core iteration code.
Patches 1-3 convert over iomap buffered read, direct I/O and various
other remaining ops (swap, etc.). Patches 4-9 convert over the various
DAX iomap operations. Finally, patches 10-12 introduce some cleanups now
that all iomap operations have updated iteration semantics.
Thoughts, reviews, flames appreciated.
Brian
[1] https://lore.kernel.org/linux-fsdevel/20250207143253.314068-1-bfoster@redhat.com/
v2:
- Push dax_iomap_iter() advance changes down into type specific helpers
(new patch).
- Added patch for iomap_iter_advance_full() helper.
- Various minor cleanups.
v1: https://lore.kernel.org/linux-fsdevel/20250212135712.506987-1-bfoster@redhat.com/
Brian Foster (12):
iomap: advance the iter directly on buffered read
iomap: advance the iter on direct I/O
iomap: convert misc simple ops to incremental advance
dax: advance the iomap_iter in the read/write path
dax: push advance down into dax_iomap_iter() for read and write
dax: advance the iomap_iter on zero range
dax: advance the iomap_iter on unshare range
dax: advance the iomap_iter on dedupe range
dax: advance the iomap_iter on pte and pmd faults
iomap: remove unnecessary advance from iomap_iter()
iomap: rename iomap_iter processed field to status
iomap: introduce a full map advance helper
fs/dax.c | 111 ++++++++++++++++++++++-------------------
fs/iomap/buffered-io.c | 80 ++++++++++++++---------------
fs/iomap/direct-io.c | 24 ++++-----
fs/iomap/fiemap.c | 21 ++++----
fs/iomap/iter.c | 43 +++++++---------
fs/iomap/seek.c | 16 +++---
fs/iomap/swapfile.c | 7 +--
fs/iomap/trace.h | 8 +--
include/linux/iomap.h | 16 ++++--
9 files changed, 163 insertions(+), 163 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 01/12] iomap: advance the iter directly on buffered read
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:22 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 02/12] iomap: advance the iter on direct I/O Brian Foster
` (11 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
iomap buffered read advances the iter via iter.processed. To
continue separating iter advance from return status, update
iomap_readpage_iter() to advance the iter instead of returning the
number of bytes processed. In turn, drop the offset parameter and
sample the updated iter->pos at the start of the function. Update
the callers to loop based on remaining length in the current
iteration instead of number of bytes processed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/iomap/buffered-io.c | 45 +++++++++++++++++++-----------------------
1 file changed, 20 insertions(+), 25 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ec227b45f3aa..215866ba264d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -366,15 +366,14 @@ static inline bool iomap_block_needs_zeroing(const struct iomap_iter *iter,
pos >= i_size_read(iter->inode);
}
-static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
- struct iomap_readpage_ctx *ctx, loff_t offset)
+static loff_t iomap_readpage_iter(struct iomap_iter *iter,
+ struct iomap_readpage_ctx *ctx)
{
const struct iomap *iomap = &iter->iomap;
- loff_t pos = iter->pos + offset;
- loff_t length = iomap_length(iter) - offset;
+ loff_t pos = iter->pos;
+ loff_t length = iomap_length(iter);
struct folio *folio = ctx->cur_folio;
struct iomap_folio_state *ifs;
- loff_t orig_pos = pos;
size_t poff, plen;
sector_t sector;
@@ -438,25 +437,22 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
* we can skip trailing ones as they will be handled in the next
* iteration.
*/
- return pos - orig_pos + plen;
+ length = pos - iter->pos + plen;
+ return iomap_iter_advance(iter, &length);
}
-static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
+static loff_t iomap_read_folio_iter(struct iomap_iter *iter,
struct iomap_readpage_ctx *ctx)
{
- struct folio *folio = ctx->cur_folio;
- size_t offset = offset_in_folio(folio, iter->pos);
- loff_t length = min_t(loff_t, folio_size(folio) - offset,
- iomap_length(iter));
- loff_t done, ret;
-
- for (done = 0; done < length; done += ret) {
- ret = iomap_readpage_iter(iter, ctx, done);
- if (ret <= 0)
+ loff_t ret;
+
+ while (iomap_length(iter)) {
+ ret = iomap_readpage_iter(iter, ctx);
+ if (ret)
return ret;
}
- return done;
+ return 0;
}
int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
@@ -493,15 +489,14 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
}
EXPORT_SYMBOL_GPL(iomap_read_folio);
-static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
+static loff_t iomap_readahead_iter(struct iomap_iter *iter,
struct iomap_readpage_ctx *ctx)
{
- loff_t length = iomap_length(iter);
- loff_t done, ret;
+ loff_t ret;
- for (done = 0; done < length; done += ret) {
+ while (iomap_length(iter)) {
if (ctx->cur_folio &&
- offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
+ offset_in_folio(ctx->cur_folio, iter->pos) == 0) {
if (!ctx->cur_folio_in_bio)
folio_unlock(ctx->cur_folio);
ctx->cur_folio = NULL;
@@ -510,12 +505,12 @@ static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
ctx->cur_folio = readahead_folio(ctx->rac);
ctx->cur_folio_in_bio = false;
}
- ret = iomap_readpage_iter(iter, ctx, done);
- if (ret <= 0)
+ ret = iomap_readpage_iter(iter, ctx);
+ if (ret)
return ret;
}
- return done;
+ return 0;
}
/**
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 02/12] iomap: advance the iter on direct I/O
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
2025-02-19 17:50 ` [PATCH v2 01/12] iomap: advance the iter directly on buffered read Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:22 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance Brian Foster
` (10 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Update iomap direct I/O to advance the iter directly rather than via
iter.processed. Update each mapping type helper to advance based on
the amount of data processed and return success or failure.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/iomap/direct-io.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index b521eb15759e..b3599f8d12ac 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -289,8 +289,7 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
return opflags;
}
-static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
- struct iomap_dio *dio)
+static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
{
const struct iomap *iomap = &iter->iomap;
struct inode *inode = iter->inode;
@@ -303,7 +302,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
bool need_zeroout = false;
bool use_fua = false;
int nr_pages, ret = 0;
- size_t copied = 0;
+ u64 copied = 0;
size_t orig_count;
if (atomic && length != fs_block_size)
@@ -467,30 +466,28 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
/* Undo iter limitation to current extent */
iov_iter_reexpand(dio->submit.iter, orig_count - copied);
if (copied)
- return copied;
+ return iomap_iter_advance(iter, &copied);
return ret;
}
-static loff_t iomap_dio_hole_iter(const struct iomap_iter *iter,
- struct iomap_dio *dio)
+static int iomap_dio_hole_iter(struct iomap_iter *iter, struct iomap_dio *dio)
{
loff_t length = iov_iter_zero(iomap_length(iter), dio->submit.iter);
dio->size += length;
if (!length)
return -EFAULT;
- return length;
+ return iomap_iter_advance(iter, &length);
}
-static loff_t iomap_dio_inline_iter(const struct iomap_iter *iomi,
- struct iomap_dio *dio)
+static int iomap_dio_inline_iter(struct iomap_iter *iomi, struct iomap_dio *dio)
{
const struct iomap *iomap = &iomi->iomap;
struct iov_iter *iter = dio->submit.iter;
void *inline_data = iomap_inline_data(iomap, iomi->pos);
loff_t length = iomap_length(iomi);
loff_t pos = iomi->pos;
- size_t copied;
+ u64 copied;
if (WARN_ON_ONCE(!iomap_inline_data_valid(iomap)))
return -EIO;
@@ -512,11 +509,10 @@ static loff_t iomap_dio_inline_iter(const struct iomap_iter *iomi,
dio->size += copied;
if (!copied)
return -EFAULT;
- return copied;
+ return iomap_iter_advance(iomi, &copied);
}
-static loff_t iomap_dio_iter(const struct iomap_iter *iter,
- struct iomap_dio *dio)
+static int iomap_dio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
{
switch (iter->iomap.type) {
case IOMAP_HOLE:
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
2025-02-19 17:50 ` [PATCH v2 01/12] iomap: advance the iter directly on buffered read Brian Foster
2025-02-19 17:50 ` [PATCH v2 02/12] iomap: advance the iter on direct I/O Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:24 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path Brian Foster
` (9 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Update several of the remaining iomap operations to advance the iter
directly rather than via return value. This includes page faults,
fiemap, seek data/hole and swapfile activation.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/iomap/buffered-io.c | 2 +-
fs/iomap/fiemap.c | 18 +++++++++---------
fs/iomap/seek.c | 12 ++++++------
fs/iomap/swapfile.c | 7 +++++--
4 files changed, 21 insertions(+), 18 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 215866ba264d..ddc82dab6bb5 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1480,7 +1480,7 @@ static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
folio_mark_dirty(folio);
}
- return length;
+ return iomap_iter_advance(iter, &length);
}
vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index 610ca6f1ec9b..8a0d8b034218 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -39,24 +39,24 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
iomap->length, flags);
}
-static loff_t iomap_fiemap_iter(const struct iomap_iter *iter,
+static loff_t iomap_fiemap_iter(struct iomap_iter *iter,
struct fiemap_extent_info *fi, struct iomap *prev)
{
+ u64 length = iomap_length(iter);
int ret;
if (iter->iomap.type == IOMAP_HOLE)
- return iomap_length(iter);
+ goto advance;
ret = iomap_to_fiemap(fi, prev, 0);
*prev = iter->iomap;
- switch (ret) {
- case 0: /* success */
- return iomap_length(iter);
- case 1: /* extent array full */
- return 0;
- default: /* error */
+ if (ret < 0)
return ret;
- }
+ if (ret == 1) /* extent array full */
+ return 0;
+
+advance:
+ return iomap_iter_advance(iter, &length);
}
int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index a845c012b50c..83c687d6ccc0 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -10,7 +10,7 @@
#include <linux/pagemap.h>
#include <linux/pagevec.h>
-static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
+static loff_t iomap_seek_hole_iter(struct iomap_iter *iter,
loff_t *hole_pos)
{
loff_t length = iomap_length(iter);
@@ -20,13 +20,13 @@ static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
*hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
iter->pos, iter->pos + length, SEEK_HOLE);
if (*hole_pos == iter->pos + length)
- return length;
+ return iomap_iter_advance(iter, &length);
return 0;
case IOMAP_HOLE:
*hole_pos = iter->pos;
return 0;
default:
- return length;
+ return iomap_iter_advance(iter, &length);
}
}
@@ -56,19 +56,19 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
}
EXPORT_SYMBOL_GPL(iomap_seek_hole);
-static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
+static loff_t iomap_seek_data_iter(struct iomap_iter *iter,
loff_t *hole_pos)
{
loff_t length = iomap_length(iter);
switch (iter->iomap.type) {
case IOMAP_HOLE:
- return length;
+ return iomap_iter_advance(iter, &length);
case IOMAP_UNWRITTEN:
*hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
iter->pos, iter->pos + length, SEEK_DATA);
if (*hole_pos < 0)
- return length;
+ return iomap_iter_advance(iter, &length);
return 0;
default:
*hole_pos = iter->pos;
diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index b90d0eda9e51..4395e46a4dc7 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -94,9 +94,11 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
* swap only cares about contiguous page-aligned physical extents and makes no
* distinction between written and unwritten extents.
*/
-static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
+static loff_t iomap_swapfile_iter(struct iomap_iter *iter,
struct iomap *iomap, struct iomap_swapfile_info *isi)
{
+ u64 length = iomap_length(iter);
+
switch (iomap->type) {
case IOMAP_MAPPED:
case IOMAP_UNWRITTEN:
@@ -132,7 +134,8 @@ static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
return error;
memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
}
- return iomap_length(iter);
+
+ return iomap_iter_advance(iter, &length);
}
/*
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (2 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:33 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 05/12] dax: push advance down into dax_iomap_iter() for read and write Brian Foster
` (8 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
DAX reads and writes flow through dax_iomap_iter(), which has one or
more subtleties in terms of how it processes a range vs. what is
specified in the iomap_iter. To keep things simple and remove the
dependency on iomap_iter() advances, convert a positive return from
dax_iomap_iter() to the new advance and status return semantics. The
advance can be pushed further down in future patches.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/dax.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/dax.c b/fs/dax.c
index 21b47402b3dc..296f5aa18640 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1585,8 +1585,12 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
if (iocb->ki_flags & IOCB_NOWAIT)
iomi.flags |= IOMAP_NOWAIT;
- while ((ret = iomap_iter(&iomi, ops)) > 0)
+ while ((ret = iomap_iter(&iomi, ops)) > 0) {
iomi.processed = dax_iomap_iter(&iomi, iter);
+ if (iomi.processed > 0)
+ iomi.processed = iomap_iter_advance(&iomi,
+ &iomi.processed);
+ }
done = iomi.pos - iocb->ki_pos;
iocb->ki_pos = iomi.pos;
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 05/12] dax: push advance down into dax_iomap_iter() for read and write
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (3 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:34 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 06/12] dax: advance the iomap_iter on zero range Brian Foster
` (7 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
DAX read and write currently advances the iter after the
dax_iomap_iter() returns the number of bytes processed rather than
internally within the iter handler itself, as most other iomap
operations do. Push the advance down into dax_iomap_iter() and
update the function to return op status instead of bytes processed.
dax_iomap_iter() shortcuts reads from a hole or unwritten mapping by
directly zeroing the iov_iter, so advance the iomap_iter similarly
in that case.
The DAX processing loop can operate on a range slightly different
than defined by the iomap_iter depending on circumstances. For
example, a read may be truncated by inode size, a read or write
range can be increased due to page alignment, etc. Therefore, this
patch aims to retain as much of the existing logic as possible.
The loop control logic remains pos based, but is sampled from the
iomap_iter on each iteration after the advance instead of being
updated manually. Similarly, length is updated based on the output
of the advance instead of being updated manually. The advance itself
is based on the number of bytes transferred, which was previously
used to update the local copies of pos and length.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/dax.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 296f5aa18640..139e299e53e6 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1431,8 +1431,7 @@ int dax_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
}
EXPORT_SYMBOL_GPL(dax_truncate_page);
-static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
- struct iov_iter *iter)
+static int dax_iomap_iter(struct iomap_iter *iomi, struct iov_iter *iter)
{
const struct iomap *iomap = &iomi->iomap;
const struct iomap *srcmap = iomap_iter_srcmap(iomi);
@@ -1451,8 +1450,10 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
if (pos >= end)
return 0;
- if (iomap->type == IOMAP_HOLE || iomap->type == IOMAP_UNWRITTEN)
- return iov_iter_zero(min(length, end - pos), iter);
+ if (iomap->type == IOMAP_HOLE || iomap->type == IOMAP_UNWRITTEN) {
+ done = iov_iter_zero(min(length, end - pos), iter);
+ return iomap_iter_advance(iomi, &done);
+ }
}
/*
@@ -1485,7 +1486,7 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
}
id = dax_read_lock();
- while (pos < end) {
+ while ((pos = iomi->pos) < end) {
unsigned offset = pos & (PAGE_SIZE - 1);
const size_t size = ALIGN(length + offset, PAGE_SIZE);
pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
@@ -1535,18 +1536,16 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
xfer = dax_copy_to_iter(dax_dev, pgoff, kaddr,
map_len, iter);
- pos += xfer;
- length -= xfer;
- done += xfer;
-
- if (xfer == 0)
+ length = xfer;
+ ret = iomap_iter_advance(iomi, &length);
+ if (!ret && xfer == 0)
ret = -EFAULT;
if (xfer < map_len)
break;
}
dax_read_unlock(id);
- return done ? done : ret;
+ return ret;
}
/**
@@ -1585,12 +1584,8 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
if (iocb->ki_flags & IOCB_NOWAIT)
iomi.flags |= IOMAP_NOWAIT;
- while ((ret = iomap_iter(&iomi, ops)) > 0) {
+ while ((ret = iomap_iter(&iomi, ops)) > 0)
iomi.processed = dax_iomap_iter(&iomi, iter);
- if (iomi.processed > 0)
- iomi.processed = iomap_iter_advance(&iomi,
- &iomi.processed);
- }
done = iomi.pos - iocb->ki_pos;
iocb->ki_pos = iomi.pos;
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 06/12] dax: advance the iomap_iter on zero range
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (4 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 05/12] dax: push advance down into dax_iomap_iter() for read and write Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:34 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 07/12] dax: advance the iomap_iter on unshare range Brian Foster
` (6 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Update the DAX zero range iomap iter handler to advance the iter
directly. Advance by the full length in the hole/unwritten case, or
otherwise advance incrementally in the zeroing loop. In either case,
return 0 or an error code for success or failure.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/dax.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 139e299e53e6..f4d8c8c10086 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1358,13 +1358,12 @@ static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
{
const struct iomap *iomap = &iter->iomap;
const struct iomap *srcmap = iomap_iter_srcmap(iter);
- loff_t pos = iter->pos;
u64 length = iomap_length(iter);
- s64 written = 0;
+ s64 ret;
/* already zeroed? we're done. */
if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
- return length;
+ return iomap_iter_advance(iter, &length);
/*
* invalidate the pages whose sharing state is to be changed
@@ -1372,33 +1371,35 @@ static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
*/
if (iomap->flags & IOMAP_F_SHARED)
invalidate_inode_pages2_range(iter->inode->i_mapping,
- pos >> PAGE_SHIFT,
- (pos + length - 1) >> PAGE_SHIFT);
+ iter->pos >> PAGE_SHIFT,
+ (iter->pos + length - 1) >> PAGE_SHIFT);
do {
+ loff_t pos = iter->pos;
unsigned offset = offset_in_page(pos);
- unsigned size = min_t(u64, PAGE_SIZE - offset, length);
pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
- long rc;
int id;
+ length = min_t(u64, PAGE_SIZE - offset, length);
+
id = dax_read_lock();
- if (IS_ALIGNED(pos, PAGE_SIZE) && size == PAGE_SIZE)
- rc = dax_zero_page_range(iomap->dax_dev, pgoff, 1);
+ if (IS_ALIGNED(pos, PAGE_SIZE) && length == PAGE_SIZE)
+ ret = dax_zero_page_range(iomap->dax_dev, pgoff, 1);
else
- rc = dax_memzero(iter, pos, size);
+ ret = dax_memzero(iter, pos, length);
dax_read_unlock(id);
- if (rc < 0)
- return rc;
- pos += size;
- length -= size;
- written += size;
+ if (ret < 0)
+ return ret;
+
+ ret = iomap_iter_advance(iter, &length);
+ if (ret)
+ return ret;
} while (length > 0);
if (did_zero)
*did_zero = true;
- return written;
+ return ret;
}
int dax_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 07/12] dax: advance the iomap_iter on unshare range
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (5 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 06/12] dax: advance the iomap_iter on zero range Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:35 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 08/12] dax: advance the iomap_iter on dedupe range Brian Foster
` (5 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Advance the iter and return 0 or an error code for success or
failure.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/dax.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index f4d8c8c10086..c0fbab8c66f7 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1266,11 +1266,11 @@ static s64 dax_unshare_iter(struct iomap_iter *iter)
u64 copy_len = iomap_length(iter);
u32 mod;
int id = 0;
- s64 ret = 0;
+ s64 ret = iomap_length(iter);
void *daddr = NULL, *saddr = NULL;
if (!iomap_want_unshare_iter(iter))
- return iomap_length(iter);
+ return iomap_iter_advance(iter, &ret);
/*
* Extend the file range to be aligned to fsblock/pagesize, because
@@ -1307,7 +1307,9 @@ static s64 dax_unshare_iter(struct iomap_iter *iter)
out_unlock:
dax_read_unlock(id);
- return dax_mem2blk_err(ret);
+ if (ret < 0)
+ return dax_mem2blk_err(ret);
+ return iomap_iter_advance(iter, &ret);
}
int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len,
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 08/12] dax: advance the iomap_iter on dedupe range
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (6 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 07/12] dax: advance the iomap_iter on unshare range Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:35 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 09/12] dax: advance the iomap_iter on pte and pmd faults Brian Foster
` (4 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Advance the iter on successful dedupe. Dedupe range uses two iters
and iterates so long as both have outstanding work, so
correspondingly this needs to advance both on each iteration. Since
dax_range_compare_iter() now returns status instead of a byte count,
update the variable name in the caller as well.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/dax.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index c0fbab8c66f7..c8c0d81122ab 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -2001,12 +2001,13 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order,
}
EXPORT_SYMBOL_GPL(dax_finish_sync_fault);
-static loff_t dax_range_compare_iter(struct iomap_iter *it_src,
+static int dax_range_compare_iter(struct iomap_iter *it_src,
struct iomap_iter *it_dest, u64 len, bool *same)
{
const struct iomap *smap = &it_src->iomap;
const struct iomap *dmap = &it_dest->iomap;
loff_t pos1 = it_src->pos, pos2 = it_dest->pos;
+ u64 dest_len;
void *saddr, *daddr;
int id, ret;
@@ -2014,7 +2015,7 @@ static loff_t dax_range_compare_iter(struct iomap_iter *it_src,
if (smap->type == IOMAP_HOLE && dmap->type == IOMAP_HOLE) {
*same = true;
- return len;
+ goto advance;
}
if (smap->type == IOMAP_HOLE || dmap->type == IOMAP_HOLE) {
@@ -2037,7 +2038,13 @@ static loff_t dax_range_compare_iter(struct iomap_iter *it_src,
if (!*same)
len = 0;
dax_read_unlock(id);
- return len;
+
+advance:
+ dest_len = len;
+ ret = iomap_iter_advance(it_src, &len);
+ if (!ret)
+ ret = iomap_iter_advance(it_dest, &dest_len);
+ return ret;
out_unlock:
dax_read_unlock(id);
@@ -2060,15 +2067,15 @@ int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
.len = len,
.flags = IOMAP_DAX,
};
- int ret, compared = 0;
+ int ret, status;
while ((ret = iomap_iter(&src_iter, ops)) > 0 &&
(ret = iomap_iter(&dst_iter, ops)) > 0) {
- compared = dax_range_compare_iter(&src_iter, &dst_iter,
+ status = dax_range_compare_iter(&src_iter, &dst_iter,
min(src_iter.len, dst_iter.len), same);
- if (compared < 0)
+ if (status < 0)
return ret;
- src_iter.processed = dst_iter.processed = compared;
+ src_iter.processed = dst_iter.processed = status;
}
return ret;
}
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 09/12] dax: advance the iomap_iter on pte and pmd faults
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (7 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 08/12] dax: advance the iomap_iter on dedupe range Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:36 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 10/12] iomap: remove unnecessary advance from iomap_iter() Brian Foster
` (3 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Advance the iomap_iter on PTE and PMD faults. Each of these
operations assign a hardcoded size to iter.processed. Replace those
with an advance and status return.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/dax.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index c8c0d81122ab..44701865ca94 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1771,8 +1771,10 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
ret |= VM_FAULT_MAJOR;
}
- if (!(ret & VM_FAULT_ERROR))
- iter.processed = PAGE_SIZE;
+ if (!(ret & VM_FAULT_ERROR)) {
+ u64 length = PAGE_SIZE;
+ iter.processed = iomap_iter_advance(&iter, &length);
+ }
}
if (iomap_errp)
@@ -1885,8 +1887,10 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
continue; /* actually breaks out of the loop */
ret = dax_fault_iter(vmf, &iter, pfnp, &xas, &entry, true);
- if (ret != VM_FAULT_FALLBACK)
- iter.processed = PMD_SIZE;
+ if (ret != VM_FAULT_FALLBACK) {
+ u64 length = PMD_SIZE;
+ iter.processed = iomap_iter_advance(&iter, &length);
+ }
}
unlock_entry:
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 10/12] iomap: remove unnecessary advance from iomap_iter()
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (8 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 09/12] dax: advance the iomap_iter on pte and pmd faults Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:30 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 11/12] iomap: rename iomap_iter processed field to status Brian Foster
` (2 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
At this point, all iomap operations have been updated to advance the
iomap_iter directly before returning to iomap_iter(). Therefore, the
complexity of handling both the old and new semantics is no longer
required and can be removed from iomap_iter().
Update iomap_iter() to expect success or failure status in
iter.processed. As a precaution and developer hint to prevent
inadvertent use of old semantics, warn on a positive return code and
fail the operation. Remove the unnecessary advance and simplify the
termination logic.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/iomap/iter.c | 39 +++++++++++++++------------------------
1 file changed, 15 insertions(+), 24 deletions(-)
diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
index 0ebcabc7df52..e4dfe64029cc 100644
--- a/fs/iomap/iter.c
+++ b/fs/iomap/iter.c
@@ -60,9 +60,8 @@ static inline void iomap_iter_done(struct iomap_iter *iter)
int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
{
bool stale = iter->iomap.flags & IOMAP_F_STALE;
- ssize_t advanced = iter->processed > 0 ? iter->processed : 0;
- u64 olen = iter->len;
- s64 processed;
+ ssize_t advanced;
+ u64 olen;
int ret;
trace_iomap_iter(iter, ops, _RET_IP_);
@@ -71,14 +70,11 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
goto begin;
/*
- * If iter.processed is zero, the op may still have advanced the iter
- * itself. Calculate the advanced and original length bytes based on how
- * far pos has advanced for ->iomap_end().
+ * Calculate how far the iter was advanced and the original length bytes
+ * for ->iomap_end().
*/
- if (!advanced) {
- advanced = iter->pos - iter->iter_start_pos;
- olen += advanced;
- }
+ advanced = iter->pos - iter->iter_start_pos;
+ olen = iter->len + advanced;
if (ops->iomap_end) {
ret = ops->iomap_end(iter->inode, iter->iter_start_pos,
@@ -89,27 +85,22 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
return ret;
}
- processed = iter->processed;
- if (processed < 0) {
- iomap_iter_reset_iomap(iter);
- return processed;
- }
+ /* detect old return semantics where this would advance */
+ if (WARN_ON_ONCE(iter->processed > 0))
+ iter->processed = -EIO;
/*
- * Advance the iter and clear state from the previous iteration. This
- * passes iter->processed because that reflects the bytes processed but
- * not yet advanced by the iter handler.
- *
* Use iter->len to determine whether to continue onto the next mapping.
- * Explicitly terminate in the case where the current iter has not
+ * Explicitly terminate on error status or if the current iter has not
* advanced at all (i.e. no work was done for some reason) unless the
* mapping has been marked stale and needs to be reprocessed.
*/
- ret = iomap_iter_advance(iter, &processed);
- if (!ret && iter->len > 0)
- ret = 1;
- if (ret > 0 && !advanced && !stale)
+ if (iter->processed < 0)
+ ret = iter->processed;
+ else if (iter->len == 0 || (!advanced && !stale))
ret = 0;
+ else
+ ret = 1;
iomap_iter_reset_iomap(iter);
if (ret <= 0)
return ret;
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 11/12] iomap: rename iomap_iter processed field to status
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (9 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 10/12] iomap: remove unnecessary advance from iomap_iter() Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:30 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 12/12] iomap: introduce a full map advance helper Brian Foster
2025-02-20 9:10 ` [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Christian Brauner
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
The iter.processed field name is no longer appropriate now that
iomap operations do not return the number of bytes processed. Rename
the field to iter.status to reflect that a success or error code is
expected.
Also change the type to int as there is no longer a need for an s64.
This reduces the size of iomap_iter by 8 bytes due to a combination
of smaller type and reduction in structure padding. While here, fix
up the return types of various _iter() helpers to reflect the type
change.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/dax.c | 20 ++++++++++----------
fs/iomap/buffered-io.c | 42 +++++++++++++++++++++---------------------
fs/iomap/direct-io.c | 2 +-
fs/iomap/fiemap.c | 6 +++---
fs/iomap/iter.c | 12 ++++++------
fs/iomap/seek.c | 8 ++++----
fs/iomap/swapfile.c | 4 ++--
fs/iomap/trace.h | 8 ++++----
include/linux/iomap.h | 7 +++----
9 files changed, 54 insertions(+), 55 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 44701865ca94..cab3c5abe5cb 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1258,7 +1258,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf,
}
#endif /* CONFIG_FS_DAX_PMD */
-static s64 dax_unshare_iter(struct iomap_iter *iter)
+static int dax_unshare_iter(struct iomap_iter *iter)
{
struct iomap *iomap = &iter->iomap;
const struct iomap *srcmap = iomap_iter_srcmap(iter);
@@ -1328,7 +1328,7 @@ int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len,
iter.len = min(len, size - pos);
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = dax_unshare_iter(&iter);
+ iter.status = dax_unshare_iter(&iter);
return ret;
}
EXPORT_SYMBOL_GPL(dax_file_unshare);
@@ -1356,12 +1356,12 @@ static int dax_memzero(struct iomap_iter *iter, loff_t pos, size_t size)
return ret;
}
-static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
+static int dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
{
const struct iomap *iomap = &iter->iomap;
const struct iomap *srcmap = iomap_iter_srcmap(iter);
u64 length = iomap_length(iter);
- s64 ret;
+ int ret;
/* already zeroed? we're done. */
if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
@@ -1416,7 +1416,7 @@ int dax_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
int ret;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = dax_zero_iter(&iter, did_zero);
+ iter.status = dax_zero_iter(&iter, did_zero);
return ret;
}
EXPORT_SYMBOL_GPL(dax_zero_range);
@@ -1588,7 +1588,7 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
iomi.flags |= IOMAP_NOWAIT;
while ((ret = iomap_iter(&iomi, ops)) > 0)
- iomi.processed = dax_iomap_iter(&iomi, iter);
+ iomi.status = dax_iomap_iter(&iomi, iter);
done = iomi.pos - iocb->ki_pos;
iocb->ki_pos = iomi.pos;
@@ -1759,7 +1759,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
while ((error = iomap_iter(&iter, ops)) > 0) {
if (WARN_ON_ONCE(iomap_length(&iter) < PAGE_SIZE)) {
- iter.processed = -EIO; /* fs corruption? */
+ iter.status = -EIO; /* fs corruption? */
continue;
}
@@ -1773,7 +1773,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
if (!(ret & VM_FAULT_ERROR)) {
u64 length = PAGE_SIZE;
- iter.processed = iomap_iter_advance(&iter, &length);
+ iter.status = iomap_iter_advance(&iter, &length);
}
}
@@ -1889,7 +1889,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
ret = dax_fault_iter(vmf, &iter, pfnp, &xas, &entry, true);
if (ret != VM_FAULT_FALLBACK) {
u64 length = PMD_SIZE;
- iter.processed = iomap_iter_advance(&iter, &length);
+ iter.status = iomap_iter_advance(&iter, &length);
}
}
@@ -2079,7 +2079,7 @@ int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
min(src_iter.len, dst_iter.len), same);
if (status < 0)
return ret;
- src_iter.processed = dst_iter.processed = status;
+ src_iter.status = dst_iter.status = status;
}
return ret;
}
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ddc82dab6bb5..2b86978010bb 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -366,7 +366,7 @@ static inline bool iomap_block_needs_zeroing(const struct iomap_iter *iter,
pos >= i_size_read(iter->inode);
}
-static loff_t iomap_readpage_iter(struct iomap_iter *iter,
+static int iomap_readpage_iter(struct iomap_iter *iter,
struct iomap_readpage_ctx *ctx)
{
const struct iomap *iomap = &iter->iomap;
@@ -441,10 +441,10 @@ static loff_t iomap_readpage_iter(struct iomap_iter *iter,
return iomap_iter_advance(iter, &length);
}
-static loff_t iomap_read_folio_iter(struct iomap_iter *iter,
+static int iomap_read_folio_iter(struct iomap_iter *iter,
struct iomap_readpage_ctx *ctx)
{
- loff_t ret;
+ int ret;
while (iomap_length(iter)) {
ret = iomap_readpage_iter(iter, ctx);
@@ -470,7 +470,7 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
trace_iomap_readpage(iter.inode, 1);
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_read_folio_iter(&iter, &ctx);
+ iter.status = iomap_read_folio_iter(&iter, &ctx);
if (ctx.bio) {
submit_bio(ctx.bio);
@@ -489,10 +489,10 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
}
EXPORT_SYMBOL_GPL(iomap_read_folio);
-static loff_t iomap_readahead_iter(struct iomap_iter *iter,
+static int iomap_readahead_iter(struct iomap_iter *iter,
struct iomap_readpage_ctx *ctx)
{
- loff_t ret;
+ int ret;
while (iomap_length(iter)) {
if (ctx->cur_folio &&
@@ -542,7 +542,7 @@ void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
trace_iomap_readahead(rac->mapping->host, readahead_count(rac));
while (iomap_iter(&iter, ops) > 0)
- iter.processed = iomap_readahead_iter(&iter, &ctx);
+ iter.status = iomap_readahead_iter(&iter, &ctx);
if (ctx.bio)
submit_bio(ctx.bio);
@@ -902,10 +902,10 @@ static bool iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
return __iomap_write_end(iter->inode, pos, len, copied, folio);
}
-static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
+static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
{
ssize_t total_written = 0;
- long status = 0;
+ int status = 0;
struct address_space *mapping = iter->inode->i_mapping;
size_t chunk = mapping_max_folio_size(mapping);
unsigned int bdp_flags = (iter->flags & IOMAP_NOWAIT) ? BDP_ASYNC : 0;
@@ -1025,7 +1025,7 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
iter.flags |= IOMAP_NOWAIT;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_write_iter(&iter, i);
+ iter.status = iomap_write_iter(&iter, i);
if (unlikely(iter.pos == iocb->ki_pos))
return ret;
@@ -1259,7 +1259,7 @@ void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
}
EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
-static loff_t iomap_unshare_iter(struct iomap_iter *iter)
+static int iomap_unshare_iter(struct iomap_iter *iter)
{
struct iomap *iomap = &iter->iomap;
u64 bytes = iomap_length(iter);
@@ -1319,7 +1319,7 @@ iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
iter.len = min(len, size - pos);
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_unshare_iter(&iter);
+ iter.status = iomap_unshare_iter(&iter);
return ret;
}
EXPORT_SYMBOL_GPL(iomap_file_unshare);
@@ -1338,7 +1338,7 @@ static inline int iomap_zero_iter_flush_and_stale(struct iomap_iter *i)
return filemap_write_and_wait_range(mapping, i->pos, end);
}
-static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
+static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
{
u64 bytes = iomap_length(iter);
int status;
@@ -1411,7 +1411,7 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
iter.len = plen;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_zero_iter(&iter, did_zero);
+ iter.status = iomap_zero_iter(&iter, did_zero);
iter.len = len - (iter.pos - pos);
if (ret || !iter.len)
@@ -1430,20 +1430,20 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
if (srcmap->type == IOMAP_HOLE ||
srcmap->type == IOMAP_UNWRITTEN) {
- s64 proc;
+ s64 status;
if (range_dirty) {
range_dirty = false;
- proc = iomap_zero_iter_flush_and_stale(&iter);
+ status = iomap_zero_iter_flush_and_stale(&iter);
} else {
u64 length = iomap_length(&iter);
- proc = iomap_iter_advance(&iter, &length);
+ status = iomap_iter_advance(&iter, &length);
}
- iter.processed = proc;
+ iter.status = status;
continue;
}
- iter.processed = iomap_zero_iter(&iter, did_zero);
+ iter.status = iomap_zero_iter(&iter, did_zero);
}
return ret;
}
@@ -1463,7 +1463,7 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
}
EXPORT_SYMBOL_GPL(iomap_truncate_page);
-static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
+static int iomap_folio_mkwrite_iter(struct iomap_iter *iter,
struct folio *folio)
{
loff_t length = iomap_length(iter);
@@ -1499,7 +1499,7 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
iter.pos = folio_pos(folio);
iter.len = ret;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_folio_mkwrite_iter(&iter, folio);
+ iter.status = iomap_folio_mkwrite_iter(&iter, folio);
if (ret < 0)
goto out_unlock;
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index b3599f8d12ac..a84bba651e14 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -696,7 +696,7 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
blk_start_plug(&plug);
while ((ret = iomap_iter(&iomi, ops)) > 0) {
- iomi.processed = iomap_dio_iter(&iomi, dio);
+ iomi.status = iomap_dio_iter(&iomi, dio);
/*
* We can only poll for single bio I/Os.
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index 8a0d8b034218..6776b800bde7 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -39,7 +39,7 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
iomap->length, flags);
}
-static loff_t iomap_fiemap_iter(struct iomap_iter *iter,
+static int iomap_fiemap_iter(struct iomap_iter *iter,
struct fiemap_extent_info *fi, struct iomap *prev)
{
u64 length = iomap_length(iter);
@@ -78,7 +78,7 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
return ret;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_fiemap_iter(&iter, fi, &prev);
+ iter.status = iomap_fiemap_iter(&iter, fi, &prev);
if (prev.type != IOMAP_HOLE) {
ret = iomap_to_fiemap(fi, &prev, FIEMAP_EXTENT_LAST);
@@ -114,7 +114,7 @@ iomap_bmap(struct address_space *mapping, sector_t bno,
while ((ret = iomap_iter(&iter, ops)) > 0) {
if (iter.iomap.type == IOMAP_MAPPED)
bno = iomap_sector(&iter.iomap, iter.pos) >> blkshift;
- /* leave iter.processed unset to abort loop */
+ /* leave iter.status unset to abort loop */
}
if (ret)
return 0;
diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
index e4dfe64029cc..6ffc6a7b9ba5 100644
--- a/fs/iomap/iter.c
+++ b/fs/iomap/iter.c
@@ -9,7 +9,7 @@
static inline void iomap_iter_reset_iomap(struct iomap_iter *iter)
{
- iter->processed = 0;
+ iter->status = 0;
memset(&iter->iomap, 0, sizeof(iter->iomap));
memset(&iter->srcmap, 0, sizeof(iter->srcmap));
}
@@ -54,7 +54,7 @@ static inline void iomap_iter_done(struct iomap_iter *iter)
* function must be called in a loop that continues as long it returns a
* positive value. If 0 or a negative value is returned, the caller must not
* return to the loop body. Within a loop body, there are two ways to break out
- * of the loop body: leave @iter.processed unchanged, or set it to a negative
+ * of the loop body: leave @iter.status unchanged, or set it to a negative
* errno.
*/
int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
@@ -86,8 +86,8 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
}
/* detect old return semantics where this would advance */
- if (WARN_ON_ONCE(iter->processed > 0))
- iter->processed = -EIO;
+ if (WARN_ON_ONCE(iter->status > 0))
+ iter->status = -EIO;
/*
* Use iter->len to determine whether to continue onto the next mapping.
@@ -95,8 +95,8 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
* advanced at all (i.e. no work was done for some reason) unless the
* mapping has been marked stale and needs to be reprocessed.
*/
- if (iter->processed < 0)
- ret = iter->processed;
+ if (iter->status < 0)
+ ret = iter->status;
else if (iter->len == 0 || (!advanced && !stale))
ret = 0;
else
diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index 83c687d6ccc0..04d7919636c1 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -10,7 +10,7 @@
#include <linux/pagemap.h>
#include <linux/pagevec.h>
-static loff_t iomap_seek_hole_iter(struct iomap_iter *iter,
+static int iomap_seek_hole_iter(struct iomap_iter *iter,
loff_t *hole_pos)
{
loff_t length = iomap_length(iter);
@@ -47,7 +47,7 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
iter.len = size - pos;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_seek_hole_iter(&iter, &pos);
+ iter.status = iomap_seek_hole_iter(&iter, &pos);
if (ret < 0)
return ret;
if (iter.len) /* found hole before EOF */
@@ -56,7 +56,7 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
}
EXPORT_SYMBOL_GPL(iomap_seek_hole);
-static loff_t iomap_seek_data_iter(struct iomap_iter *iter,
+static int iomap_seek_data_iter(struct iomap_iter *iter,
loff_t *hole_pos)
{
loff_t length = iomap_length(iter);
@@ -93,7 +93,7 @@ iomap_seek_data(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
iter.len = size - pos;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_seek_data_iter(&iter, &pos);
+ iter.status = iomap_seek_data_iter(&iter, &pos);
if (ret < 0)
return ret;
if (iter.len) /* found data before EOF */
diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index 4395e46a4dc7..9ea185e58ca7 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -94,7 +94,7 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
* swap only cares about contiguous page-aligned physical extents and makes no
* distinction between written and unwritten extents.
*/
-static loff_t iomap_swapfile_iter(struct iomap_iter *iter,
+static int iomap_swapfile_iter(struct iomap_iter *iter,
struct iomap *iomap, struct iomap_swapfile_info *isi)
{
u64 length = iomap_length(iter);
@@ -169,7 +169,7 @@ int iomap_swapfile_activate(struct swap_info_struct *sis,
return ret;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_swapfile_iter(&iter, &iter.iomap, &isi);
+ iter.status = iomap_swapfile_iter(&iter, &iter.iomap, &isi);
if (ret < 0)
return ret;
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index 4118a42cdab0..9eab2c8ac3c5 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -207,7 +207,7 @@ TRACE_EVENT(iomap_iter,
__field(u64, ino)
__field(loff_t, pos)
__field(u64, length)
- __field(s64, processed)
+ __field(int, status)
__field(unsigned int, flags)
__field(const void *, ops)
__field(unsigned long, caller)
@@ -217,17 +217,17 @@ TRACE_EVENT(iomap_iter,
__entry->ino = iter->inode->i_ino;
__entry->pos = iter->pos;
__entry->length = iomap_length(iter);
- __entry->processed = iter->processed;
+ __entry->status = iter->status;
__entry->flags = iter->flags;
__entry->ops = ops;
__entry->caller = caller;
),
- TP_printk("dev %d:%d ino 0x%llx pos 0x%llx length 0x%llx processed %lld flags %s (0x%x) ops %ps caller %pS",
+ TP_printk("dev %d:%d ino 0x%llx pos 0x%llx length 0x%llx status %d flags %s (0x%x) ops %ps caller %pS",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->ino,
__entry->pos,
__entry->length,
- __entry->processed,
+ __entry->status,
__print_flags(__entry->flags, "|", IOMAP_FLAGS_STRINGS),
__entry->flags,
__entry->ops,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index d832a540cc72..29b72a671104 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -213,9 +213,8 @@ struct iomap_ops {
* It is updated at the same time as @pos.
* @iter_start_pos: The original start pos for the current iomap. Used for
* incremental iter advance.
- * @processed: The number of bytes the most recent iteration needs iomap_iter()
- * to advance the iter, zero if the iter was already advanced, or a
- * negative errno for an error during the operation.
+ * @status: Status of the most recent iteration. Zero on success or a negative
+ * errno on error.
* @flags: Zero or more of the iomap_begin flags above.
* @iomap: Map describing the I/O iteration
* @srcmap: Source map for COW operations
@@ -225,7 +224,7 @@ struct iomap_iter {
loff_t pos;
u64 len;
loff_t iter_start_pos;
- s64 processed;
+ int status;
unsigned flags;
struct iomap iomap;
struct iomap srcmap;
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 12/12] iomap: introduce a full map advance helper
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (10 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 11/12] iomap: rename iomap_iter processed field to status Brian Foster
@ 2025-02-19 17:50 ` Brian Foster
2025-02-19 22:31 ` Darrick J. Wong
2025-02-20 9:10 ` [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Christian Brauner
12 siblings, 1 reply; 30+ messages in thread
From: Brian Foster @ 2025-02-19 17:50 UTC (permalink / raw)
To: linux-fsdevel; +Cc: linux-xfs, Christoph Hellwig, Darrick J . Wong
Various iomap_iter_advance() calls advance by the full mapping
length and thus have no need for the current length input or
post-advance remaining length output from the standard advance
function. Add an iomap_iter_advance_full() helper to clean up these
cases.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/dax.c | 10 ++++------
fs/iomap/buffered-io.c | 3 +--
fs/iomap/fiemap.c | 3 +--
fs/iomap/swapfile.c | 4 +---
include/linux/iomap.h | 9 +++++++++
5 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index cab3c5abe5cb..7fd4cd9a51f2 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1266,11 +1266,11 @@ static int dax_unshare_iter(struct iomap_iter *iter)
u64 copy_len = iomap_length(iter);
u32 mod;
int id = 0;
- s64 ret = iomap_length(iter);
+ s64 ret;
void *daddr = NULL, *saddr = NULL;
if (!iomap_want_unshare_iter(iter))
- return iomap_iter_advance(iter, &ret);
+ return iomap_iter_advance_full(iter);
/*
* Extend the file range to be aligned to fsblock/pagesize, because
@@ -1300,16 +1300,14 @@ static int dax_unshare_iter(struct iomap_iter *iter)
if (ret < 0)
goto out_unlock;
- if (copy_mc_to_kernel(daddr, saddr, copy_len) == 0)
- ret = iomap_length(iter);
- else
+ if (copy_mc_to_kernel(daddr, saddr, copy_len) != 0)
ret = -EIO;
out_unlock:
dax_read_unlock(id);
if (ret < 0)
return dax_mem2blk_err(ret);
- return iomap_iter_advance(iter, &ret);
+ return iomap_iter_advance_full(iter);
}
int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len,
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 2b86978010bb..e53ac635e47c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1436,8 +1436,7 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
range_dirty = false;
status = iomap_zero_iter_flush_and_stale(&iter);
} else {
- u64 length = iomap_length(&iter);
- status = iomap_iter_advance(&iter, &length);
+ status = iomap_iter_advance_full(&iter);
}
iter.status = status;
continue;
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index 6776b800bde7..80675c42e94e 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -42,7 +42,6 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
static int iomap_fiemap_iter(struct iomap_iter *iter,
struct fiemap_extent_info *fi, struct iomap *prev)
{
- u64 length = iomap_length(iter);
int ret;
if (iter->iomap.type == IOMAP_HOLE)
@@ -56,7 +55,7 @@ static int iomap_fiemap_iter(struct iomap_iter *iter,
return 0;
advance:
- return iomap_iter_advance(iter, &length);
+ return iomap_iter_advance_full(iter);
}
int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index 9ea185e58ca7..c1a762c10ce4 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -97,8 +97,6 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
static int iomap_swapfile_iter(struct iomap_iter *iter,
struct iomap *iomap, struct iomap_swapfile_info *isi)
{
- u64 length = iomap_length(iter);
-
switch (iomap->type) {
case IOMAP_MAPPED:
case IOMAP_UNWRITTEN:
@@ -135,7 +133,7 @@ static int iomap_swapfile_iter(struct iomap_iter *iter,
memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
}
- return iomap_iter_advance(iter, &length);
+ return iomap_iter_advance_full(iter);
}
/*
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 29b72a671104..4c7e9fe32117 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -264,6 +264,15 @@ static inline u64 iomap_length(const struct iomap_iter *iter)
return iomap_length_trim(iter, iter->pos, iter->len);
}
+/**
+ * iomap_iter_advance_full - advance by the full length of current map
+ */
+static inline int iomap_iter_advance_full(struct iomap_iter *iter)
+{
+ u64 length = iomap_length(iter);
+ return iomap_iter_advance(iter, &length);
+}
+
/**
* iomap_iter_srcmap - return the source map for the current iomap iteration
* @i: iteration structure
--
2.48.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v2 01/12] iomap: advance the iter directly on buffered read
2025-02-19 17:50 ` [PATCH v2 01/12] iomap: advance the iter directly on buffered read Brian Foster
@ 2025-02-19 22:22 ` Darrick J. Wong
2025-02-19 22:31 ` Darrick J. Wong
0 siblings, 1 reply; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:22 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:39PM -0500, Brian Foster wrote:
> iomap buffered read advances the iter via iter.processed. To
> continue separating iter advance from return status, update
> iomap_readpage_iter() to advance the iter instead of returning the
> number of bytes processed. In turn, drop the offset parameter and
> sample the updated iter->pos at the start of the function. Update
> the callers to loop based on remaining length in the current
> iteration instead of number of bytes processed.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/iomap/buffered-io.c | 45 +++++++++++++++++++-----------------------
> 1 file changed, 20 insertions(+), 25 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index ec227b45f3aa..215866ba264d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -366,15 +366,14 @@ static inline bool iomap_block_needs_zeroing(const struct iomap_iter *iter,
> pos >= i_size_read(iter->inode);
> }
>
> -static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
> - struct iomap_readpage_ctx *ctx, loff_t offset)
> +static loff_t iomap_readpage_iter(struct iomap_iter *iter,
I wonder, do we really need to return loff_t from some of these
functions now? I thought the only return codes were the -EIO/0 from
iomap_iter_advance?
--D
> + struct iomap_readpage_ctx *ctx)
> {
> const struct iomap *iomap = &iter->iomap;
> - loff_t pos = iter->pos + offset;
> - loff_t length = iomap_length(iter) - offset;
> + loff_t pos = iter->pos;
> + loff_t length = iomap_length(iter);
> struct folio *folio = ctx->cur_folio;
> struct iomap_folio_state *ifs;
> - loff_t orig_pos = pos;
> size_t poff, plen;
> sector_t sector;
>
> @@ -438,25 +437,22 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
> * we can skip trailing ones as they will be handled in the next
> * iteration.
> */
> - return pos - orig_pos + plen;
> + length = pos - iter->pos + plen;
> + return iomap_iter_advance(iter, &length);
> }
>
> -static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
> +static loff_t iomap_read_folio_iter(struct iomap_iter *iter,
> struct iomap_readpage_ctx *ctx)
> {
> - struct folio *folio = ctx->cur_folio;
> - size_t offset = offset_in_folio(folio, iter->pos);
> - loff_t length = min_t(loff_t, folio_size(folio) - offset,
> - iomap_length(iter));
> - loff_t done, ret;
> -
> - for (done = 0; done < length; done += ret) {
> - ret = iomap_readpage_iter(iter, ctx, done);
> - if (ret <= 0)
> + loff_t ret;
> +
> + while (iomap_length(iter)) {
> + ret = iomap_readpage_iter(iter, ctx);
> + if (ret)
> return ret;
> }
>
> - return done;
> + return 0;
> }
>
> int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
> @@ -493,15 +489,14 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
> }
> EXPORT_SYMBOL_GPL(iomap_read_folio);
>
> -static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
> +static loff_t iomap_readahead_iter(struct iomap_iter *iter,
> struct iomap_readpage_ctx *ctx)
> {
> - loff_t length = iomap_length(iter);
> - loff_t done, ret;
> + loff_t ret;
>
> - for (done = 0; done < length; done += ret) {
> + while (iomap_length(iter)) {
> if (ctx->cur_folio &&
> - offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
> + offset_in_folio(ctx->cur_folio, iter->pos) == 0) {
> if (!ctx->cur_folio_in_bio)
> folio_unlock(ctx->cur_folio);
> ctx->cur_folio = NULL;
> @@ -510,12 +505,12 @@ static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
> ctx->cur_folio = readahead_folio(ctx->rac);
> ctx->cur_folio_in_bio = false;
> }
> - ret = iomap_readpage_iter(iter, ctx, done);
> - if (ret <= 0)
> + ret = iomap_readpage_iter(iter, ctx);
> + if (ret)
> return ret;
> }
>
> - return done;
> + return 0;
> }
>
> /**
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 02/12] iomap: advance the iter on direct I/O
2025-02-19 17:50 ` [PATCH v2 02/12] iomap: advance the iter on direct I/O Brian Foster
@ 2025-02-19 22:22 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:22 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:40PM -0500, Brian Foster wrote:
> Update iomap direct I/O to advance the iter directly rather than via
> iter.processed. Update each mapping type helper to advance based on
> the amount of data processed and return success or failure.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks pretty straightforward,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/iomap/direct-io.c | 22 +++++++++-------------
> 1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index b521eb15759e..b3599f8d12ac 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -289,8 +289,7 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
> return opflags;
> }
>
> -static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> - struct iomap_dio *dio)
> +static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
> {
> const struct iomap *iomap = &iter->iomap;
> struct inode *inode = iter->inode;
> @@ -303,7 +302,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> bool need_zeroout = false;
> bool use_fua = false;
> int nr_pages, ret = 0;
> - size_t copied = 0;
> + u64 copied = 0;
> size_t orig_count;
>
> if (atomic && length != fs_block_size)
> @@ -467,30 +466,28 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> /* Undo iter limitation to current extent */
> iov_iter_reexpand(dio->submit.iter, orig_count - copied);
> if (copied)
> - return copied;
> + return iomap_iter_advance(iter, &copied);
> return ret;
> }
>
> -static loff_t iomap_dio_hole_iter(const struct iomap_iter *iter,
> - struct iomap_dio *dio)
> +static int iomap_dio_hole_iter(struct iomap_iter *iter, struct iomap_dio *dio)
> {
> loff_t length = iov_iter_zero(iomap_length(iter), dio->submit.iter);
>
> dio->size += length;
> if (!length)
> return -EFAULT;
> - return length;
> + return iomap_iter_advance(iter, &length);
> }
>
> -static loff_t iomap_dio_inline_iter(const struct iomap_iter *iomi,
> - struct iomap_dio *dio)
> +static int iomap_dio_inline_iter(struct iomap_iter *iomi, struct iomap_dio *dio)
> {
> const struct iomap *iomap = &iomi->iomap;
> struct iov_iter *iter = dio->submit.iter;
> void *inline_data = iomap_inline_data(iomap, iomi->pos);
> loff_t length = iomap_length(iomi);
> loff_t pos = iomi->pos;
> - size_t copied;
> + u64 copied;
>
> if (WARN_ON_ONCE(!iomap_inline_data_valid(iomap)))
> return -EIO;
> @@ -512,11 +509,10 @@ static loff_t iomap_dio_inline_iter(const struct iomap_iter *iomi,
> dio->size += copied;
> if (!copied)
> return -EFAULT;
> - return copied;
> + return iomap_iter_advance(iomi, &copied);
> }
>
> -static loff_t iomap_dio_iter(const struct iomap_iter *iter,
> - struct iomap_dio *dio)
> +static int iomap_dio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
> {
> switch (iter->iomap.type) {
> case IOMAP_HOLE:
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance
2025-02-19 17:50 ` [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance Brian Foster
@ 2025-02-19 22:24 ` Darrick J. Wong
2025-02-19 22:31 ` Darrick J. Wong
0 siblings, 1 reply; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:24 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:41PM -0500, Brian Foster wrote:
> Update several of the remaining iomap operations to advance the iter
> directly rather than via return value. This includes page faults,
> fiemap, seek data/hole and swapfile activation.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> fs/iomap/buffered-io.c | 2 +-
> fs/iomap/fiemap.c | 18 +++++++++---------
> fs/iomap/seek.c | 12 ++++++------
> fs/iomap/swapfile.c | 7 +++++--
> 4 files changed, 21 insertions(+), 18 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 215866ba264d..ddc82dab6bb5 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1480,7 +1480,7 @@ static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
> folio_mark_dirty(folio);
> }
>
> - return length;
> + return iomap_iter_advance(iter, &length);
Same dorky question here -- doesn't iomap_iter_advance return int, so
all these functions can now return int instead of loff_t?
--D
> }
>
> vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> index 610ca6f1ec9b..8a0d8b034218 100644
> --- a/fs/iomap/fiemap.c
> +++ b/fs/iomap/fiemap.c
> @@ -39,24 +39,24 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
> iomap->length, flags);
> }
>
> -static loff_t iomap_fiemap_iter(const struct iomap_iter *iter,
> +static loff_t iomap_fiemap_iter(struct iomap_iter *iter,
> struct fiemap_extent_info *fi, struct iomap *prev)
> {
> + u64 length = iomap_length(iter);
> int ret;
>
> if (iter->iomap.type == IOMAP_HOLE)
> - return iomap_length(iter);
> + goto advance;
>
> ret = iomap_to_fiemap(fi, prev, 0);
> *prev = iter->iomap;
> - switch (ret) {
> - case 0: /* success */
> - return iomap_length(iter);
> - case 1: /* extent array full */
> - return 0;
> - default: /* error */
> + if (ret < 0)
> return ret;
> - }
> + if (ret == 1) /* extent array full */
> + return 0;
> +
> +advance:
> + return iomap_iter_advance(iter, &length);
> }
>
> int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
> diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
> index a845c012b50c..83c687d6ccc0 100644
> --- a/fs/iomap/seek.c
> +++ b/fs/iomap/seek.c
> @@ -10,7 +10,7 @@
> #include <linux/pagemap.h>
> #include <linux/pagevec.h>
>
> -static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
> +static loff_t iomap_seek_hole_iter(struct iomap_iter *iter,
> loff_t *hole_pos)
> {
> loff_t length = iomap_length(iter);
> @@ -20,13 +20,13 @@ static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
> *hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
> iter->pos, iter->pos + length, SEEK_HOLE);
> if (*hole_pos == iter->pos + length)
> - return length;
> + return iomap_iter_advance(iter, &length);
> return 0;
> case IOMAP_HOLE:
> *hole_pos = iter->pos;
> return 0;
> default:
> - return length;
> + return iomap_iter_advance(iter, &length);
> }
> }
>
> @@ -56,19 +56,19 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
> }
> EXPORT_SYMBOL_GPL(iomap_seek_hole);
>
> -static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
> +static loff_t iomap_seek_data_iter(struct iomap_iter *iter,
> loff_t *hole_pos)
> {
> loff_t length = iomap_length(iter);
>
> switch (iter->iomap.type) {
> case IOMAP_HOLE:
> - return length;
> + return iomap_iter_advance(iter, &length);
> case IOMAP_UNWRITTEN:
> *hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
> iter->pos, iter->pos + length, SEEK_DATA);
> if (*hole_pos < 0)
> - return length;
> + return iomap_iter_advance(iter, &length);
> return 0;
> default:
> *hole_pos = iter->pos;
> diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
> index b90d0eda9e51..4395e46a4dc7 100644
> --- a/fs/iomap/swapfile.c
> +++ b/fs/iomap/swapfile.c
> @@ -94,9 +94,11 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
> * swap only cares about contiguous page-aligned physical extents and makes no
> * distinction between written and unwritten extents.
> */
> -static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
> +static loff_t iomap_swapfile_iter(struct iomap_iter *iter,
> struct iomap *iomap, struct iomap_swapfile_info *isi)
> {
> + u64 length = iomap_length(iter);
> +
> switch (iomap->type) {
> case IOMAP_MAPPED:
> case IOMAP_UNWRITTEN:
> @@ -132,7 +134,8 @@ static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
> return error;
> memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
> }
> - return iomap_length(iter);
> +
> + return iomap_iter_advance(iter, &length);
> }
>
> /*
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 10/12] iomap: remove unnecessary advance from iomap_iter()
2025-02-19 17:50 ` [PATCH v2 10/12] iomap: remove unnecessary advance from iomap_iter() Brian Foster
@ 2025-02-19 22:30 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:30 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:48PM -0500, Brian Foster wrote:
> At this point, all iomap operations have been updated to advance the
> iomap_iter directly before returning to iomap_iter(). Therefore, the
> complexity of handling both the old and new semantics is no longer
> required and can be removed from iomap_iter().
>
> Update iomap_iter() to expect success or failure status in
> iter.processed. As a precaution and developer hint to prevent
> inadvertent use of old semantics, warn on a positive return code and
> fail the operation. Remove the unnecessary advance and simplify the
> termination logic.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks reasonable,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/iomap/iter.c | 39 +++++++++++++++------------------------
> 1 file changed, 15 insertions(+), 24 deletions(-)
>
> diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
> index 0ebcabc7df52..e4dfe64029cc 100644
> --- a/fs/iomap/iter.c
> +++ b/fs/iomap/iter.c
> @@ -60,9 +60,8 @@ static inline void iomap_iter_done(struct iomap_iter *iter)
> int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> {
> bool stale = iter->iomap.flags & IOMAP_F_STALE;
> - ssize_t advanced = iter->processed > 0 ? iter->processed : 0;
> - u64 olen = iter->len;
> - s64 processed;
> + ssize_t advanced;
> + u64 olen;
> int ret;
>
> trace_iomap_iter(iter, ops, _RET_IP_);
> @@ -71,14 +70,11 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> goto begin;
>
> /*
> - * If iter.processed is zero, the op may still have advanced the iter
> - * itself. Calculate the advanced and original length bytes based on how
> - * far pos has advanced for ->iomap_end().
> + * Calculate how far the iter was advanced and the original length bytes
> + * for ->iomap_end().
> */
> - if (!advanced) {
> - advanced = iter->pos - iter->iter_start_pos;
> - olen += advanced;
> - }
> + advanced = iter->pos - iter->iter_start_pos;
> + olen = iter->len + advanced;
>
> if (ops->iomap_end) {
> ret = ops->iomap_end(iter->inode, iter->iter_start_pos,
> @@ -89,27 +85,22 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> return ret;
> }
>
> - processed = iter->processed;
> - if (processed < 0) {
> - iomap_iter_reset_iomap(iter);
> - return processed;
> - }
> + /* detect old return semantics where this would advance */
> + if (WARN_ON_ONCE(iter->processed > 0))
> + iter->processed = -EIO;
>
> /*
> - * Advance the iter and clear state from the previous iteration. This
> - * passes iter->processed because that reflects the bytes processed but
> - * not yet advanced by the iter handler.
> - *
> * Use iter->len to determine whether to continue onto the next mapping.
> - * Explicitly terminate in the case where the current iter has not
> + * Explicitly terminate on error status or if the current iter has not
> * advanced at all (i.e. no work was done for some reason) unless the
> * mapping has been marked stale and needs to be reprocessed.
> */
> - ret = iomap_iter_advance(iter, &processed);
> - if (!ret && iter->len > 0)
> - ret = 1;
> - if (ret > 0 && !advanced && !stale)
> + if (iter->processed < 0)
> + ret = iter->processed;
> + else if (iter->len == 0 || (!advanced && !stale))
> ret = 0;
> + else
> + ret = 1;
> iomap_iter_reset_iomap(iter);
> if (ret <= 0)
> return ret;
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 11/12] iomap: rename iomap_iter processed field to status
2025-02-19 17:50 ` [PATCH v2 11/12] iomap: rename iomap_iter processed field to status Brian Foster
@ 2025-02-19 22:30 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:30 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:49PM -0500, Brian Foster wrote:
> The iter.processed field name is no longer appropriate now that
> iomap operations do not return the number of bytes processed. Rename
> the field to iter.status to reflect that a success or error code is
> expected.
>
> Also change the type to int as there is no longer a need for an s64.
> This reduces the size of iomap_iter by 8 bytes due to a combination
> of smaller type and reduction in structure padding. While here, fix
> up the return types of various _iter() helpers to reflect the type
> change.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
Aha, this is the answer to my question from the first patch.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 20 ++++++++++----------
> fs/iomap/buffered-io.c | 42 +++++++++++++++++++++---------------------
> fs/iomap/direct-io.c | 2 +-
> fs/iomap/fiemap.c | 6 +++---
> fs/iomap/iter.c | 12 ++++++------
> fs/iomap/seek.c | 8 ++++----
> fs/iomap/swapfile.c | 4 ++--
> fs/iomap/trace.h | 8 ++++----
> include/linux/iomap.h | 7 +++----
> 9 files changed, 54 insertions(+), 55 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 44701865ca94..cab3c5abe5cb 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1258,7 +1258,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf,
> }
> #endif /* CONFIG_FS_DAX_PMD */
>
> -static s64 dax_unshare_iter(struct iomap_iter *iter)
> +static int dax_unshare_iter(struct iomap_iter *iter)
> {
> struct iomap *iomap = &iter->iomap;
> const struct iomap *srcmap = iomap_iter_srcmap(iter);
> @@ -1328,7 +1328,7 @@ int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len,
>
> iter.len = min(len, size - pos);
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = dax_unshare_iter(&iter);
> + iter.status = dax_unshare_iter(&iter);
> return ret;
> }
> EXPORT_SYMBOL_GPL(dax_file_unshare);
> @@ -1356,12 +1356,12 @@ static int dax_memzero(struct iomap_iter *iter, loff_t pos, size_t size)
> return ret;
> }
>
> -static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
> +static int dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
> {
> const struct iomap *iomap = &iter->iomap;
> const struct iomap *srcmap = iomap_iter_srcmap(iter);
> u64 length = iomap_length(iter);
> - s64 ret;
> + int ret;
>
> /* already zeroed? we're done. */
> if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
> @@ -1416,7 +1416,7 @@ int dax_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> int ret;
>
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = dax_zero_iter(&iter, did_zero);
> + iter.status = dax_zero_iter(&iter, did_zero);
> return ret;
> }
> EXPORT_SYMBOL_GPL(dax_zero_range);
> @@ -1588,7 +1588,7 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> iomi.flags |= IOMAP_NOWAIT;
>
> while ((ret = iomap_iter(&iomi, ops)) > 0)
> - iomi.processed = dax_iomap_iter(&iomi, iter);
> + iomi.status = dax_iomap_iter(&iomi, iter);
>
> done = iomi.pos - iocb->ki_pos;
> iocb->ki_pos = iomi.pos;
> @@ -1759,7 +1759,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
>
> while ((error = iomap_iter(&iter, ops)) > 0) {
> if (WARN_ON_ONCE(iomap_length(&iter) < PAGE_SIZE)) {
> - iter.processed = -EIO; /* fs corruption? */
> + iter.status = -EIO; /* fs corruption? */
> continue;
> }
>
> @@ -1773,7 +1773,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
>
> if (!(ret & VM_FAULT_ERROR)) {
> u64 length = PAGE_SIZE;
> - iter.processed = iomap_iter_advance(&iter, &length);
> + iter.status = iomap_iter_advance(&iter, &length);
> }
> }
>
> @@ -1889,7 +1889,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
> ret = dax_fault_iter(vmf, &iter, pfnp, &xas, &entry, true);
> if (ret != VM_FAULT_FALLBACK) {
> u64 length = PMD_SIZE;
> - iter.processed = iomap_iter_advance(&iter, &length);
> + iter.status = iomap_iter_advance(&iter, &length);
> }
> }
>
> @@ -2079,7 +2079,7 @@ int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
> min(src_iter.len, dst_iter.len), same);
> if (status < 0)
> return ret;
> - src_iter.processed = dst_iter.processed = status;
> + src_iter.status = dst_iter.status = status;
> }
> return ret;
> }
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index ddc82dab6bb5..2b86978010bb 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -366,7 +366,7 @@ static inline bool iomap_block_needs_zeroing(const struct iomap_iter *iter,
> pos >= i_size_read(iter->inode);
> }
>
> -static loff_t iomap_readpage_iter(struct iomap_iter *iter,
> +static int iomap_readpage_iter(struct iomap_iter *iter,
> struct iomap_readpage_ctx *ctx)
> {
> const struct iomap *iomap = &iter->iomap;
> @@ -441,10 +441,10 @@ static loff_t iomap_readpage_iter(struct iomap_iter *iter,
> return iomap_iter_advance(iter, &length);
> }
>
> -static loff_t iomap_read_folio_iter(struct iomap_iter *iter,
> +static int iomap_read_folio_iter(struct iomap_iter *iter,
> struct iomap_readpage_ctx *ctx)
> {
> - loff_t ret;
> + int ret;
>
> while (iomap_length(iter)) {
> ret = iomap_readpage_iter(iter, ctx);
> @@ -470,7 +470,7 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
> trace_iomap_readpage(iter.inode, 1);
>
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_read_folio_iter(&iter, &ctx);
> + iter.status = iomap_read_folio_iter(&iter, &ctx);
>
> if (ctx.bio) {
> submit_bio(ctx.bio);
> @@ -489,10 +489,10 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
> }
> EXPORT_SYMBOL_GPL(iomap_read_folio);
>
> -static loff_t iomap_readahead_iter(struct iomap_iter *iter,
> +static int iomap_readahead_iter(struct iomap_iter *iter,
> struct iomap_readpage_ctx *ctx)
> {
> - loff_t ret;
> + int ret;
>
> while (iomap_length(iter)) {
> if (ctx->cur_folio &&
> @@ -542,7 +542,7 @@ void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
> trace_iomap_readahead(rac->mapping->host, readahead_count(rac));
>
> while (iomap_iter(&iter, ops) > 0)
> - iter.processed = iomap_readahead_iter(&iter, &ctx);
> + iter.status = iomap_readahead_iter(&iter, &ctx);
>
> if (ctx.bio)
> submit_bio(ctx.bio);
> @@ -902,10 +902,10 @@ static bool iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
> return __iomap_write_end(iter->inode, pos, len, copied, folio);
> }
>
> -static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> +static int iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> {
> ssize_t total_written = 0;
> - long status = 0;
> + int status = 0;
> struct address_space *mapping = iter->inode->i_mapping;
> size_t chunk = mapping_max_folio_size(mapping);
> unsigned int bdp_flags = (iter->flags & IOMAP_NOWAIT) ? BDP_ASYNC : 0;
> @@ -1025,7 +1025,7 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
> iter.flags |= IOMAP_NOWAIT;
>
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_write_iter(&iter, i);
> + iter.status = iomap_write_iter(&iter, i);
>
> if (unlikely(iter.pos == iocb->ki_pos))
> return ret;
> @@ -1259,7 +1259,7 @@ void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte,
> }
> EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
>
> -static loff_t iomap_unshare_iter(struct iomap_iter *iter)
> +static int iomap_unshare_iter(struct iomap_iter *iter)
> {
> struct iomap *iomap = &iter->iomap;
> u64 bytes = iomap_length(iter);
> @@ -1319,7 +1319,7 @@ iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
>
> iter.len = min(len, size - pos);
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_unshare_iter(&iter);
> + iter.status = iomap_unshare_iter(&iter);
> return ret;
> }
> EXPORT_SYMBOL_GPL(iomap_file_unshare);
> @@ -1338,7 +1338,7 @@ static inline int iomap_zero_iter_flush_and_stale(struct iomap_iter *i)
> return filemap_write_and_wait_range(mapping, i->pos, end);
> }
>
> -static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> +static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> {
> u64 bytes = iomap_length(iter);
> int status;
> @@ -1411,7 +1411,7 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
> iter.len = plen;
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_zero_iter(&iter, did_zero);
> + iter.status = iomap_zero_iter(&iter, did_zero);
>
> iter.len = len - (iter.pos - pos);
> if (ret || !iter.len)
> @@ -1430,20 +1430,20 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
>
> if (srcmap->type == IOMAP_HOLE ||
> srcmap->type == IOMAP_UNWRITTEN) {
> - s64 proc;
> + s64 status;
>
> if (range_dirty) {
> range_dirty = false;
> - proc = iomap_zero_iter_flush_and_stale(&iter);
> + status = iomap_zero_iter_flush_and_stale(&iter);
> } else {
> u64 length = iomap_length(&iter);
> - proc = iomap_iter_advance(&iter, &length);
> + status = iomap_iter_advance(&iter, &length);
> }
> - iter.processed = proc;
> + iter.status = status;
> continue;
> }
>
> - iter.processed = iomap_zero_iter(&iter, did_zero);
> + iter.status = iomap_zero_iter(&iter, did_zero);
> }
> return ret;
> }
> @@ -1463,7 +1463,7 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
> }
> EXPORT_SYMBOL_GPL(iomap_truncate_page);
>
> -static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
> +static int iomap_folio_mkwrite_iter(struct iomap_iter *iter,
> struct folio *folio)
> {
> loff_t length = iomap_length(iter);
> @@ -1499,7 +1499,7 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
> iter.pos = folio_pos(folio);
> iter.len = ret;
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_folio_mkwrite_iter(&iter, folio);
> + iter.status = iomap_folio_mkwrite_iter(&iter, folio);
>
> if (ret < 0)
> goto out_unlock;
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index b3599f8d12ac..a84bba651e14 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -696,7 +696,7 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>
> blk_start_plug(&plug);
> while ((ret = iomap_iter(&iomi, ops)) > 0) {
> - iomi.processed = iomap_dio_iter(&iomi, dio);
> + iomi.status = iomap_dio_iter(&iomi, dio);
>
> /*
> * We can only poll for single bio I/Os.
> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> index 8a0d8b034218..6776b800bde7 100644
> --- a/fs/iomap/fiemap.c
> +++ b/fs/iomap/fiemap.c
> @@ -39,7 +39,7 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
> iomap->length, flags);
> }
>
> -static loff_t iomap_fiemap_iter(struct iomap_iter *iter,
> +static int iomap_fiemap_iter(struct iomap_iter *iter,
> struct fiemap_extent_info *fi, struct iomap *prev)
> {
> u64 length = iomap_length(iter);
> @@ -78,7 +78,7 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
> return ret;
>
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_fiemap_iter(&iter, fi, &prev);
> + iter.status = iomap_fiemap_iter(&iter, fi, &prev);
>
> if (prev.type != IOMAP_HOLE) {
> ret = iomap_to_fiemap(fi, &prev, FIEMAP_EXTENT_LAST);
> @@ -114,7 +114,7 @@ iomap_bmap(struct address_space *mapping, sector_t bno,
> while ((ret = iomap_iter(&iter, ops)) > 0) {
> if (iter.iomap.type == IOMAP_MAPPED)
> bno = iomap_sector(&iter.iomap, iter.pos) >> blkshift;
> - /* leave iter.processed unset to abort loop */
> + /* leave iter.status unset to abort loop */
> }
> if (ret)
> return 0;
> diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
> index e4dfe64029cc..6ffc6a7b9ba5 100644
> --- a/fs/iomap/iter.c
> +++ b/fs/iomap/iter.c
> @@ -9,7 +9,7 @@
>
> static inline void iomap_iter_reset_iomap(struct iomap_iter *iter)
> {
> - iter->processed = 0;
> + iter->status = 0;
> memset(&iter->iomap, 0, sizeof(iter->iomap));
> memset(&iter->srcmap, 0, sizeof(iter->srcmap));
> }
> @@ -54,7 +54,7 @@ static inline void iomap_iter_done(struct iomap_iter *iter)
> * function must be called in a loop that continues as long it returns a
> * positive value. If 0 or a negative value is returned, the caller must not
> * return to the loop body. Within a loop body, there are two ways to break out
> - * of the loop body: leave @iter.processed unchanged, or set it to a negative
> + * of the loop body: leave @iter.status unchanged, or set it to a negative
> * errno.
> */
> int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> @@ -86,8 +86,8 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> }
>
> /* detect old return semantics where this would advance */
> - if (WARN_ON_ONCE(iter->processed > 0))
> - iter->processed = -EIO;
> + if (WARN_ON_ONCE(iter->status > 0))
> + iter->status = -EIO;
>
> /*
> * Use iter->len to determine whether to continue onto the next mapping.
> @@ -95,8 +95,8 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> * advanced at all (i.e. no work was done for some reason) unless the
> * mapping has been marked stale and needs to be reprocessed.
> */
> - if (iter->processed < 0)
> - ret = iter->processed;
> + if (iter->status < 0)
> + ret = iter->status;
> else if (iter->len == 0 || (!advanced && !stale))
> ret = 0;
> else
> diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
> index 83c687d6ccc0..04d7919636c1 100644
> --- a/fs/iomap/seek.c
> +++ b/fs/iomap/seek.c
> @@ -10,7 +10,7 @@
> #include <linux/pagemap.h>
> #include <linux/pagevec.h>
>
> -static loff_t iomap_seek_hole_iter(struct iomap_iter *iter,
> +static int iomap_seek_hole_iter(struct iomap_iter *iter,
> loff_t *hole_pos)
> {
> loff_t length = iomap_length(iter);
> @@ -47,7 +47,7 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
>
> iter.len = size - pos;
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_seek_hole_iter(&iter, &pos);
> + iter.status = iomap_seek_hole_iter(&iter, &pos);
> if (ret < 0)
> return ret;
> if (iter.len) /* found hole before EOF */
> @@ -56,7 +56,7 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
> }
> EXPORT_SYMBOL_GPL(iomap_seek_hole);
>
> -static loff_t iomap_seek_data_iter(struct iomap_iter *iter,
> +static int iomap_seek_data_iter(struct iomap_iter *iter,
> loff_t *hole_pos)
> {
> loff_t length = iomap_length(iter);
> @@ -93,7 +93,7 @@ iomap_seek_data(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
>
> iter.len = size - pos;
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_seek_data_iter(&iter, &pos);
> + iter.status = iomap_seek_data_iter(&iter, &pos);
> if (ret < 0)
> return ret;
> if (iter.len) /* found data before EOF */
> diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
> index 4395e46a4dc7..9ea185e58ca7 100644
> --- a/fs/iomap/swapfile.c
> +++ b/fs/iomap/swapfile.c
> @@ -94,7 +94,7 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
> * swap only cares about contiguous page-aligned physical extents and makes no
> * distinction between written and unwritten extents.
> */
> -static loff_t iomap_swapfile_iter(struct iomap_iter *iter,
> +static int iomap_swapfile_iter(struct iomap_iter *iter,
> struct iomap *iomap, struct iomap_swapfile_info *isi)
> {
> u64 length = iomap_length(iter);
> @@ -169,7 +169,7 @@ int iomap_swapfile_activate(struct swap_info_struct *sis,
> return ret;
>
> while ((ret = iomap_iter(&iter, ops)) > 0)
> - iter.processed = iomap_swapfile_iter(&iter, &iter.iomap, &isi);
> + iter.status = iomap_swapfile_iter(&iter, &iter.iomap, &isi);
> if (ret < 0)
> return ret;
>
> diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
> index 4118a42cdab0..9eab2c8ac3c5 100644
> --- a/fs/iomap/trace.h
> +++ b/fs/iomap/trace.h
> @@ -207,7 +207,7 @@ TRACE_EVENT(iomap_iter,
> __field(u64, ino)
> __field(loff_t, pos)
> __field(u64, length)
> - __field(s64, processed)
> + __field(int, status)
> __field(unsigned int, flags)
> __field(const void *, ops)
> __field(unsigned long, caller)
> @@ -217,17 +217,17 @@ TRACE_EVENT(iomap_iter,
> __entry->ino = iter->inode->i_ino;
> __entry->pos = iter->pos;
> __entry->length = iomap_length(iter);
> - __entry->processed = iter->processed;
> + __entry->status = iter->status;
> __entry->flags = iter->flags;
> __entry->ops = ops;
> __entry->caller = caller;
> ),
> - TP_printk("dev %d:%d ino 0x%llx pos 0x%llx length 0x%llx processed %lld flags %s (0x%x) ops %ps caller %pS",
> + TP_printk("dev %d:%d ino 0x%llx pos 0x%llx length 0x%llx status %d flags %s (0x%x) ops %ps caller %pS",
> MAJOR(__entry->dev), MINOR(__entry->dev),
> __entry->ino,
> __entry->pos,
> __entry->length,
> - __entry->processed,
> + __entry->status,
> __print_flags(__entry->flags, "|", IOMAP_FLAGS_STRINGS),
> __entry->flags,
> __entry->ops,
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index d832a540cc72..29b72a671104 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -213,9 +213,8 @@ struct iomap_ops {
> * It is updated at the same time as @pos.
> * @iter_start_pos: The original start pos for the current iomap. Used for
> * incremental iter advance.
> - * @processed: The number of bytes the most recent iteration needs iomap_iter()
> - * to advance the iter, zero if the iter was already advanced, or a
> - * negative errno for an error during the operation.
> + * @status: Status of the most recent iteration. Zero on success or a negative
> + * errno on error.
> * @flags: Zero or more of the iomap_begin flags above.
> * @iomap: Map describing the I/O iteration
> * @srcmap: Source map for COW operations
> @@ -225,7 +224,7 @@ struct iomap_iter {
> loff_t pos;
> u64 len;
> loff_t iter_start_pos;
> - s64 processed;
> + int status;
> unsigned flags;
> struct iomap iomap;
> struct iomap srcmap;
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 12/12] iomap: introduce a full map advance helper
2025-02-19 17:50 ` [PATCH v2 12/12] iomap: introduce a full map advance helper Brian Foster
@ 2025-02-19 22:31 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:31 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:50PM -0500, Brian Foster wrote:
> Various iomap_iter_advance() calls advance by the full mapping
> length and thus have no need for the current length input or
> post-advance remaining length output from the standard advance
> function. Add an iomap_iter_advance_full() helper to clean up these
> cases.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> fs/dax.c | 10 ++++------
> fs/iomap/buffered-io.c | 3 +--
> fs/iomap/fiemap.c | 3 +--
> fs/iomap/swapfile.c | 4 +---
> include/linux/iomap.h | 9 +++++++++
> 5 files changed, 16 insertions(+), 13 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index cab3c5abe5cb..7fd4cd9a51f2 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1266,11 +1266,11 @@ static int dax_unshare_iter(struct iomap_iter *iter)
> u64 copy_len = iomap_length(iter);
> u32 mod;
> int id = 0;
> - s64 ret = iomap_length(iter);
> + s64 ret;
> void *daddr = NULL, *saddr = NULL;
>
> if (!iomap_want_unshare_iter(iter))
> - return iomap_iter_advance(iter, &ret);
> + return iomap_iter_advance_full(iter);
>
> /*
> * Extend the file range to be aligned to fsblock/pagesize, because
> @@ -1300,16 +1300,14 @@ static int dax_unshare_iter(struct iomap_iter *iter)
> if (ret < 0)
> goto out_unlock;
>
> - if (copy_mc_to_kernel(daddr, saddr, copy_len) == 0)
> - ret = iomap_length(iter);
> - else
> + if (copy_mc_to_kernel(daddr, saddr, copy_len) != 0)
> ret = -EIO;
>
> out_unlock:
> dax_read_unlock(id);
> if (ret < 0)
> return dax_mem2blk_err(ret);
> - return iomap_iter_advance(iter, &ret);
> + return iomap_iter_advance_full(iter);
> }
>
> int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len,
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 2b86978010bb..e53ac635e47c 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1436,8 +1436,7 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> range_dirty = false;
> status = iomap_zero_iter_flush_and_stale(&iter);
> } else {
> - u64 length = iomap_length(&iter);
> - status = iomap_iter_advance(&iter, &length);
> + status = iomap_iter_advance_full(&iter);
> }
> iter.status = status;
> continue;
> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> index 6776b800bde7..80675c42e94e 100644
> --- a/fs/iomap/fiemap.c
> +++ b/fs/iomap/fiemap.c
> @@ -42,7 +42,6 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
> static int iomap_fiemap_iter(struct iomap_iter *iter,
> struct fiemap_extent_info *fi, struct iomap *prev)
> {
> - u64 length = iomap_length(iter);
> int ret;
>
> if (iter->iomap.type == IOMAP_HOLE)
> @@ -56,7 +55,7 @@ static int iomap_fiemap_iter(struct iomap_iter *iter,
> return 0;
>
> advance:
> - return iomap_iter_advance(iter, &length);
> + return iomap_iter_advance_full(iter);
> }
>
> int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
> diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
> index 9ea185e58ca7..c1a762c10ce4 100644
> --- a/fs/iomap/swapfile.c
> +++ b/fs/iomap/swapfile.c
> @@ -97,8 +97,6 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
> static int iomap_swapfile_iter(struct iomap_iter *iter,
> struct iomap *iomap, struct iomap_swapfile_info *isi)
> {
> - u64 length = iomap_length(iter);
> -
> switch (iomap->type) {
> case IOMAP_MAPPED:
> case IOMAP_UNWRITTEN:
> @@ -135,7 +133,7 @@ static int iomap_swapfile_iter(struct iomap_iter *iter,
> memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
> }
>
> - return iomap_iter_advance(iter, &length);
> + return iomap_iter_advance_full(iter);
> }
>
> /*
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 29b72a671104..4c7e9fe32117 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -264,6 +264,15 @@ static inline u64 iomap_length(const struct iomap_iter *iter)
> return iomap_length_trim(iter, iter->pos, iter->len);
> }
>
> +/**
> + * iomap_iter_advance_full - advance by the full length of current map
> + */
> +static inline int iomap_iter_advance_full(struct iomap_iter *iter)
> +{
> + u64 length = iomap_length(iter);
> + return iomap_iter_advance(iter, &length);
Dumb nit: blank line between the variable definition and the return
statement.
With that fixed
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> +}
> +
> /**
> * iomap_iter_srcmap - return the source map for the current iomap iteration
> * @i: iteration structure
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 01/12] iomap: advance the iter directly on buffered read
2025-02-19 22:22 ` Darrick J. Wong
@ 2025-02-19 22:31 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:31 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 02:22:35PM -0800, Darrick J. Wong wrote:
> On Wed, Feb 19, 2025 at 12:50:39PM -0500, Brian Foster wrote:
> > iomap buffered read advances the iter via iter.processed. To
> > continue separating iter advance from return status, update
> > iomap_readpage_iter() to advance the iter instead of returning the
> > number of bytes processed. In turn, drop the offset parameter and
> > sample the updated iter->pos at the start of the function. Update
> > the callers to loop based on remaining length in the current
> > iteration instead of number of bytes processed.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> > fs/iomap/buffered-io.c | 45 +++++++++++++++++++-----------------------
> > 1 file changed, 20 insertions(+), 25 deletions(-)
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index ec227b45f3aa..215866ba264d 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -366,15 +366,14 @@ static inline bool iomap_block_needs_zeroing(const struct iomap_iter *iter,
> > pos >= i_size_read(iter->inode);
> > }
> >
> > -static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
> > - struct iomap_readpage_ctx *ctx, loff_t offset)
> > +static loff_t iomap_readpage_iter(struct iomap_iter *iter,
>
> I wonder, do we really need to return loff_t from some of these
> functions now? I thought the only return codes were the -EIO/0 from
> iomap_iter_advance?
And that's provided in the last patch so
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
>
> --D
>
> > + struct iomap_readpage_ctx *ctx)
> > {
> > const struct iomap *iomap = &iter->iomap;
> > - loff_t pos = iter->pos + offset;
> > - loff_t length = iomap_length(iter) - offset;
> > + loff_t pos = iter->pos;
> > + loff_t length = iomap_length(iter);
> > struct folio *folio = ctx->cur_folio;
> > struct iomap_folio_state *ifs;
> > - loff_t orig_pos = pos;
> > size_t poff, plen;
> > sector_t sector;
> >
> > @@ -438,25 +437,22 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
> > * we can skip trailing ones as they will be handled in the next
> > * iteration.
> > */
> > - return pos - orig_pos + plen;
> > + length = pos - iter->pos + plen;
> > + return iomap_iter_advance(iter, &length);
> > }
> >
> > -static loff_t iomap_read_folio_iter(const struct iomap_iter *iter,
> > +static loff_t iomap_read_folio_iter(struct iomap_iter *iter,
> > struct iomap_readpage_ctx *ctx)
> > {
> > - struct folio *folio = ctx->cur_folio;
> > - size_t offset = offset_in_folio(folio, iter->pos);
> > - loff_t length = min_t(loff_t, folio_size(folio) - offset,
> > - iomap_length(iter));
> > - loff_t done, ret;
> > -
> > - for (done = 0; done < length; done += ret) {
> > - ret = iomap_readpage_iter(iter, ctx, done);
> > - if (ret <= 0)
> > + loff_t ret;
> > +
> > + while (iomap_length(iter)) {
> > + ret = iomap_readpage_iter(iter, ctx);
> > + if (ret)
> > return ret;
> > }
> >
> > - return done;
> > + return 0;
> > }
> >
> > int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
> > @@ -493,15 +489,14 @@ int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops)
> > }
> > EXPORT_SYMBOL_GPL(iomap_read_folio);
> >
> > -static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
> > +static loff_t iomap_readahead_iter(struct iomap_iter *iter,
> > struct iomap_readpage_ctx *ctx)
> > {
> > - loff_t length = iomap_length(iter);
> > - loff_t done, ret;
> > + loff_t ret;
> >
> > - for (done = 0; done < length; done += ret) {
> > + while (iomap_length(iter)) {
> > if (ctx->cur_folio &&
> > - offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
> > + offset_in_folio(ctx->cur_folio, iter->pos) == 0) {
> > if (!ctx->cur_folio_in_bio)
> > folio_unlock(ctx->cur_folio);
> > ctx->cur_folio = NULL;
> > @@ -510,12 +505,12 @@ static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
> > ctx->cur_folio = readahead_folio(ctx->rac);
> > ctx->cur_folio_in_bio = false;
> > }
> > - ret = iomap_readpage_iter(iter, ctx, done);
> > - if (ret <= 0)
> > + ret = iomap_readpage_iter(iter, ctx);
> > + if (ret)
> > return ret;
> > }
> >
> > - return done;
> > + return 0;
> > }
> >
> > /**
> > --
> > 2.48.1
> >
> >
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance
2025-02-19 22:24 ` Darrick J. Wong
@ 2025-02-19 22:31 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:31 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 02:24:28PM -0800, Darrick J. Wong wrote:
> On Wed, Feb 19, 2025 at 12:50:41PM -0500, Brian Foster wrote:
> > Update several of the remaining iomap operations to advance the iter
> > directly rather than via return value. This includes page faults,
> > fiemap, seek data/hole and swapfile activation.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > fs/iomap/buffered-io.c | 2 +-
> > fs/iomap/fiemap.c | 18 +++++++++---------
> > fs/iomap/seek.c | 12 ++++++------
> > fs/iomap/swapfile.c | 7 +++++--
> > 4 files changed, 21 insertions(+), 18 deletions(-)
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 215866ba264d..ddc82dab6bb5 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1480,7 +1480,7 @@ static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
> > folio_mark_dirty(folio);
> > }
> >
> > - return length;
> > + return iomap_iter_advance(iter, &length);
>
> Same dorky question here -- doesn't iomap_iter_advance return int, so
> all these functions can now return int instead of loff_t?
Same dorky answer too!
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
>
> --D
>
> > }
> >
> > vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
> > diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> > index 610ca6f1ec9b..8a0d8b034218 100644
> > --- a/fs/iomap/fiemap.c
> > +++ b/fs/iomap/fiemap.c
> > @@ -39,24 +39,24 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
> > iomap->length, flags);
> > }
> >
> > -static loff_t iomap_fiemap_iter(const struct iomap_iter *iter,
> > +static loff_t iomap_fiemap_iter(struct iomap_iter *iter,
> > struct fiemap_extent_info *fi, struct iomap *prev)
> > {
> > + u64 length = iomap_length(iter);
> > int ret;
> >
> > if (iter->iomap.type == IOMAP_HOLE)
> > - return iomap_length(iter);
> > + goto advance;
> >
> > ret = iomap_to_fiemap(fi, prev, 0);
> > *prev = iter->iomap;
> > - switch (ret) {
> > - case 0: /* success */
> > - return iomap_length(iter);
> > - case 1: /* extent array full */
> > - return 0;
> > - default: /* error */
> > + if (ret < 0)
> > return ret;
> > - }
> > + if (ret == 1) /* extent array full */
> > + return 0;
> > +
> > +advance:
> > + return iomap_iter_advance(iter, &length);
> > }
> >
> > int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
> > diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
> > index a845c012b50c..83c687d6ccc0 100644
> > --- a/fs/iomap/seek.c
> > +++ b/fs/iomap/seek.c
> > @@ -10,7 +10,7 @@
> > #include <linux/pagemap.h>
> > #include <linux/pagevec.h>
> >
> > -static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
> > +static loff_t iomap_seek_hole_iter(struct iomap_iter *iter,
> > loff_t *hole_pos)
> > {
> > loff_t length = iomap_length(iter);
> > @@ -20,13 +20,13 @@ static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
> > *hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
> > iter->pos, iter->pos + length, SEEK_HOLE);
> > if (*hole_pos == iter->pos + length)
> > - return length;
> > + return iomap_iter_advance(iter, &length);
> > return 0;
> > case IOMAP_HOLE:
> > *hole_pos = iter->pos;
> > return 0;
> > default:
> > - return length;
> > + return iomap_iter_advance(iter, &length);
> > }
> > }
> >
> > @@ -56,19 +56,19 @@ iomap_seek_hole(struct inode *inode, loff_t pos, const struct iomap_ops *ops)
> > }
> > EXPORT_SYMBOL_GPL(iomap_seek_hole);
> >
> > -static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
> > +static loff_t iomap_seek_data_iter(struct iomap_iter *iter,
> > loff_t *hole_pos)
> > {
> > loff_t length = iomap_length(iter);
> >
> > switch (iter->iomap.type) {
> > case IOMAP_HOLE:
> > - return length;
> > + return iomap_iter_advance(iter, &length);
> > case IOMAP_UNWRITTEN:
> > *hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
> > iter->pos, iter->pos + length, SEEK_DATA);
> > if (*hole_pos < 0)
> > - return length;
> > + return iomap_iter_advance(iter, &length);
> > return 0;
> > default:
> > *hole_pos = iter->pos;
> > diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
> > index b90d0eda9e51..4395e46a4dc7 100644
> > --- a/fs/iomap/swapfile.c
> > +++ b/fs/iomap/swapfile.c
> > @@ -94,9 +94,11 @@ static int iomap_swapfile_fail(struct iomap_swapfile_info *isi, const char *str)
> > * swap only cares about contiguous page-aligned physical extents and makes no
> > * distinction between written and unwritten extents.
> > */
> > -static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
> > +static loff_t iomap_swapfile_iter(struct iomap_iter *iter,
> > struct iomap *iomap, struct iomap_swapfile_info *isi)
> > {
> > + u64 length = iomap_length(iter);
> > +
> > switch (iomap->type) {
> > case IOMAP_MAPPED:
> > case IOMAP_UNWRITTEN:
> > @@ -132,7 +134,8 @@ static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
> > return error;
> > memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
> > }
> > - return iomap_length(iter);
> > +
> > + return iomap_iter_advance(iter, &length);
> > }
> >
> > /*
> > --
> > 2.48.1
> >
> >
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path
2025-02-19 17:50 ` [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path Brian Foster
@ 2025-02-19 22:33 ` Darrick J. Wong
2025-02-20 14:58 ` Brian Foster
0 siblings, 1 reply; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:33 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:42PM -0500, Brian Foster wrote:
> DAX reads and writes flow through dax_iomap_iter(), which has one or
> more subtleties in terms of how it processes a range vs. what is
> specified in the iomap_iter. To keep things simple and remove the
> dependency on iomap_iter() advances, convert a positive return from
> dax_iomap_iter() to the new advance and status return semantics. The
> advance can be pushed further down in future patches.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
Not sure why this and the next patch are split up but it's fsdax so meh.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 21b47402b3dc..296f5aa18640 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1585,8 +1585,12 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> if (iocb->ki_flags & IOCB_NOWAIT)
> iomi.flags |= IOMAP_NOWAIT;
>
> - while ((ret = iomap_iter(&iomi, ops)) > 0)
> + while ((ret = iomap_iter(&iomi, ops)) > 0) {
> iomi.processed = dax_iomap_iter(&iomi, iter);
> + if (iomi.processed > 0)
> + iomi.processed = iomap_iter_advance(&iomi,
> + &iomi.processed);
> + }
>
> done = iomi.pos - iocb->ki_pos;
> iocb->ki_pos = iomi.pos;
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 05/12] dax: push advance down into dax_iomap_iter() for read and write
2025-02-19 17:50 ` [PATCH v2 05/12] dax: push advance down into dax_iomap_iter() for read and write Brian Foster
@ 2025-02-19 22:34 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:34 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:43PM -0500, Brian Foster wrote:
> DAX read and write currently advances the iter after the
> dax_iomap_iter() returns the number of bytes processed rather than
> internally within the iter handler itself, as most other iomap
> operations do. Push the advance down into dax_iomap_iter() and
> update the function to return op status instead of bytes processed.
>
> dax_iomap_iter() shortcuts reads from a hole or unwritten mapping by
> directly zeroing the iov_iter, so advance the iomap_iter similarly
> in that case.
>
> The DAX processing loop can operate on a range slightly different
> than defined by the iomap_iter depending on circumstances. For
> example, a read may be truncated by inode size, a read or write
> range can be increased due to page alignment, etc. Therefore, this
> patch aims to retain as much of the existing logic as possible.
>
> The loop control logic remains pos based, but is sampled from the
> iomap_iter on each iteration after the advance instead of being
> updated manually. Similarly, length is updated based on the output
> of the advance instead of being updated manually. The advance itself
> is based on the number of bytes transferred, which was previously
> used to update the local copies of pos and length.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks good,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 27 +++++++++++----------------
> 1 file changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 296f5aa18640..139e299e53e6 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1431,8 +1431,7 @@ int dax_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
> }
> EXPORT_SYMBOL_GPL(dax_truncate_page);
>
> -static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
> - struct iov_iter *iter)
> +static int dax_iomap_iter(struct iomap_iter *iomi, struct iov_iter *iter)
> {
> const struct iomap *iomap = &iomi->iomap;
> const struct iomap *srcmap = iomap_iter_srcmap(iomi);
> @@ -1451,8 +1450,10 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
> if (pos >= end)
> return 0;
>
> - if (iomap->type == IOMAP_HOLE || iomap->type == IOMAP_UNWRITTEN)
> - return iov_iter_zero(min(length, end - pos), iter);
> + if (iomap->type == IOMAP_HOLE || iomap->type == IOMAP_UNWRITTEN) {
> + done = iov_iter_zero(min(length, end - pos), iter);
> + return iomap_iter_advance(iomi, &done);
> + }
> }
>
> /*
> @@ -1485,7 +1486,7 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
> }
>
> id = dax_read_lock();
> - while (pos < end) {
> + while ((pos = iomi->pos) < end) {
> unsigned offset = pos & (PAGE_SIZE - 1);
> const size_t size = ALIGN(length + offset, PAGE_SIZE);
> pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
> @@ -1535,18 +1536,16 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
> xfer = dax_copy_to_iter(dax_dev, pgoff, kaddr,
> map_len, iter);
>
> - pos += xfer;
> - length -= xfer;
> - done += xfer;
> -
> - if (xfer == 0)
> + length = xfer;
> + ret = iomap_iter_advance(iomi, &length);
> + if (!ret && xfer == 0)
> ret = -EFAULT;
> if (xfer < map_len)
> break;
> }
> dax_read_unlock(id);
>
> - return done ? done : ret;
> + return ret;
> }
>
> /**
> @@ -1585,12 +1584,8 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> if (iocb->ki_flags & IOCB_NOWAIT)
> iomi.flags |= IOMAP_NOWAIT;
>
> - while ((ret = iomap_iter(&iomi, ops)) > 0) {
> + while ((ret = iomap_iter(&iomi, ops)) > 0)
> iomi.processed = dax_iomap_iter(&iomi, iter);
> - if (iomi.processed > 0)
> - iomi.processed = iomap_iter_advance(&iomi,
> - &iomi.processed);
> - }
>
> done = iomi.pos - iocb->ki_pos;
> iocb->ki_pos = iomi.pos;
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 06/12] dax: advance the iomap_iter on zero range
2025-02-19 17:50 ` [PATCH v2 06/12] dax: advance the iomap_iter on zero range Brian Foster
@ 2025-02-19 22:34 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:34 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:44PM -0500, Brian Foster wrote:
> Update the DAX zero range iomap iter handler to advance the iter
> directly. Advance by the full length in the hole/unwritten case, or
> otherwise advance incrementally in the zeroing loop. In either case,
> return 0 or an error code for success or failure.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Pretty straightforward
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 33 +++++++++++++++++----------------
> 1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 139e299e53e6..f4d8c8c10086 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1358,13 +1358,12 @@ static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
> {
> const struct iomap *iomap = &iter->iomap;
> const struct iomap *srcmap = iomap_iter_srcmap(iter);
> - loff_t pos = iter->pos;
> u64 length = iomap_length(iter);
> - s64 written = 0;
> + s64 ret;
>
> /* already zeroed? we're done. */
> if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
> - return length;
> + return iomap_iter_advance(iter, &length);
>
> /*
> * invalidate the pages whose sharing state is to be changed
> @@ -1372,33 +1371,35 @@ static s64 dax_zero_iter(struct iomap_iter *iter, bool *did_zero)
> */
> if (iomap->flags & IOMAP_F_SHARED)
> invalidate_inode_pages2_range(iter->inode->i_mapping,
> - pos >> PAGE_SHIFT,
> - (pos + length - 1) >> PAGE_SHIFT);
> + iter->pos >> PAGE_SHIFT,
> + (iter->pos + length - 1) >> PAGE_SHIFT);
>
> do {
> + loff_t pos = iter->pos;
> unsigned offset = offset_in_page(pos);
> - unsigned size = min_t(u64, PAGE_SIZE - offset, length);
> pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
> - long rc;
> int id;
>
> + length = min_t(u64, PAGE_SIZE - offset, length);
> +
> id = dax_read_lock();
> - if (IS_ALIGNED(pos, PAGE_SIZE) && size == PAGE_SIZE)
> - rc = dax_zero_page_range(iomap->dax_dev, pgoff, 1);
> + if (IS_ALIGNED(pos, PAGE_SIZE) && length == PAGE_SIZE)
> + ret = dax_zero_page_range(iomap->dax_dev, pgoff, 1);
> else
> - rc = dax_memzero(iter, pos, size);
> + ret = dax_memzero(iter, pos, length);
> dax_read_unlock(id);
>
> - if (rc < 0)
> - return rc;
> - pos += size;
> - length -= size;
> - written += size;
> + if (ret < 0)
> + return ret;
> +
> + ret = iomap_iter_advance(iter, &length);
> + if (ret)
> + return ret;
> } while (length > 0);
>
> if (did_zero)
> *did_zero = true;
> - return written;
> + return ret;
> }
>
> int dax_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 07/12] dax: advance the iomap_iter on unshare range
2025-02-19 17:50 ` [PATCH v2 07/12] dax: advance the iomap_iter on unshare range Brian Foster
@ 2025-02-19 22:35 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:35 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:45PM -0500, Brian Foster wrote:
> Advance the iter and return 0 or an error code for success or
> failure.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks good,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index f4d8c8c10086..c0fbab8c66f7 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1266,11 +1266,11 @@ static s64 dax_unshare_iter(struct iomap_iter *iter)
> u64 copy_len = iomap_length(iter);
> u32 mod;
> int id = 0;
> - s64 ret = 0;
> + s64 ret = iomap_length(iter);
> void *daddr = NULL, *saddr = NULL;
>
> if (!iomap_want_unshare_iter(iter))
> - return iomap_length(iter);
> + return iomap_iter_advance(iter, &ret);
>
> /*
> * Extend the file range to be aligned to fsblock/pagesize, because
> @@ -1307,7 +1307,9 @@ static s64 dax_unshare_iter(struct iomap_iter *iter)
>
> out_unlock:
> dax_read_unlock(id);
> - return dax_mem2blk_err(ret);
> + if (ret < 0)
> + return dax_mem2blk_err(ret);
> + return iomap_iter_advance(iter, &ret);
> }
>
> int dax_file_unshare(struct inode *inode, loff_t pos, loff_t len,
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 08/12] dax: advance the iomap_iter on dedupe range
2025-02-19 17:50 ` [PATCH v2 08/12] dax: advance the iomap_iter on dedupe range Brian Foster
@ 2025-02-19 22:35 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:35 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:46PM -0500, Brian Foster wrote:
> Advance the iter on successful dedupe. Dedupe range uses two iters
> and iterates so long as both have outstanding work, so
> correspondingly this needs to advance both on each iteration. Since
> dax_range_compare_iter() now returns status instead of a byte count,
> update the variable name in the caller as well.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Double iterators, hrhghghggh
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 21 ++++++++++++++-------
> 1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index c0fbab8c66f7..c8c0d81122ab 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -2001,12 +2001,13 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order,
> }
> EXPORT_SYMBOL_GPL(dax_finish_sync_fault);
>
> -static loff_t dax_range_compare_iter(struct iomap_iter *it_src,
> +static int dax_range_compare_iter(struct iomap_iter *it_src,
> struct iomap_iter *it_dest, u64 len, bool *same)
> {
> const struct iomap *smap = &it_src->iomap;
> const struct iomap *dmap = &it_dest->iomap;
> loff_t pos1 = it_src->pos, pos2 = it_dest->pos;
> + u64 dest_len;
> void *saddr, *daddr;
> int id, ret;
>
> @@ -2014,7 +2015,7 @@ static loff_t dax_range_compare_iter(struct iomap_iter *it_src,
>
> if (smap->type == IOMAP_HOLE && dmap->type == IOMAP_HOLE) {
> *same = true;
> - return len;
> + goto advance;
> }
>
> if (smap->type == IOMAP_HOLE || dmap->type == IOMAP_HOLE) {
> @@ -2037,7 +2038,13 @@ static loff_t dax_range_compare_iter(struct iomap_iter *it_src,
> if (!*same)
> len = 0;
> dax_read_unlock(id);
> - return len;
> +
> +advance:
> + dest_len = len;
> + ret = iomap_iter_advance(it_src, &len);
> + if (!ret)
> + ret = iomap_iter_advance(it_dest, &dest_len);
> + return ret;
>
> out_unlock:
> dax_read_unlock(id);
> @@ -2060,15 +2067,15 @@ int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
> .len = len,
> .flags = IOMAP_DAX,
> };
> - int ret, compared = 0;
> + int ret, status;
>
> while ((ret = iomap_iter(&src_iter, ops)) > 0 &&
> (ret = iomap_iter(&dst_iter, ops)) > 0) {
> - compared = dax_range_compare_iter(&src_iter, &dst_iter,
> + status = dax_range_compare_iter(&src_iter, &dst_iter,
> min(src_iter.len, dst_iter.len), same);
> - if (compared < 0)
> + if (status < 0)
> return ret;
> - src_iter.processed = dst_iter.processed = compared;
> + src_iter.processed = dst_iter.processed = status;
> }
> return ret;
> }
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 09/12] dax: advance the iomap_iter on pte and pmd faults
2025-02-19 17:50 ` [PATCH v2 09/12] dax: advance the iomap_iter on pte and pmd faults Brian Foster
@ 2025-02-19 22:36 ` Darrick J. Wong
0 siblings, 0 replies; 30+ messages in thread
From: Darrick J. Wong @ 2025-02-19 22:36 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 12:50:47PM -0500, Brian Foster wrote:
> Advance the iomap_iter on PTE and PMD faults. Each of these
> operations assign a hardcoded size to iter.processed. Replace those
> with an advance and status return.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks ok,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/dax.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index c8c0d81122ab..44701865ca94 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -1771,8 +1771,10 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
> ret |= VM_FAULT_MAJOR;
> }
>
> - if (!(ret & VM_FAULT_ERROR))
> - iter.processed = PAGE_SIZE;
> + if (!(ret & VM_FAULT_ERROR)) {
> + u64 length = PAGE_SIZE;
> + iter.processed = iomap_iter_advance(&iter, &length);
> + }
> }
>
> if (iomap_errp)
> @@ -1885,8 +1887,10 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
> continue; /* actually breaks out of the loop */
>
> ret = dax_fault_iter(vmf, &iter, pfnp, &xas, &entry, true);
> - if (ret != VM_FAULT_FALLBACK)
> - iter.processed = PMD_SIZE;
> + if (ret != VM_FAULT_FALLBACK) {
> + u64 length = PMD_SIZE;
> + iter.processed = iomap_iter_advance(&iter, &length);
> + }
> }
>
> unlock_entry:
> --
> 2.48.1
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
` (11 preceding siblings ...)
2025-02-19 17:50 ` [PATCH v2 12/12] iomap: introduce a full map advance helper Brian Foster
@ 2025-02-20 9:10 ` Christian Brauner
2025-02-20 14:59 ` Brian Foster
12 siblings, 1 reply; 30+ messages in thread
From: Christian Brauner @ 2025-02-20 9:10 UTC (permalink / raw)
To: Brian Foster
Cc: linux-fsdevel, linux-xfs, Christoph Hellwig, Darrick J . Wong
On Wed, Feb 19, 2025 at 12:50:38PM -0500, Brian Foster wrote:
> Hi all,
>
> Here's phase 2 of the incremental iter advance conversions. This updates
Hm, what's this based on? Can you base it on vfs-6.15.iomap, please?
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path
2025-02-19 22:33 ` Darrick J. Wong
@ 2025-02-20 14:58 ` Brian Foster
0 siblings, 0 replies; 30+ messages in thread
From: Brian Foster @ 2025-02-20 14:58 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-fsdevel, linux-xfs, Christoph Hellwig
On Wed, Feb 19, 2025 at 02:33:44PM -0800, Darrick J. Wong wrote:
> On Wed, Feb 19, 2025 at 12:50:42PM -0500, Brian Foster wrote:
> > DAX reads and writes flow through dax_iomap_iter(), which has one or
> > more subtleties in terms of how it processes a range vs. what is
> > specified in the iomap_iter. To keep things simple and remove the
> > dependency on iomap_iter() advances, convert a positive return from
> > dax_iomap_iter() to the new advance and status return semantics. The
> > advance can be pushed further down in future patches.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> Not sure why this and the next patch are split up but it's fsdax so meh.
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
>
Heh.. mainly just out of caution and bisectability. The subsequent patch
took a bit of fighting for me to get right and this patch served as a
nice baseline to isolate changes in the DAX I/O path from the functional
iomap change, since this patch is relatively trivial.
Thanks for the reviews..
Brian
> --D
>
> > ---
> > fs/dax.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 21b47402b3dc..296f5aa18640 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -1585,8 +1585,12 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
> > if (iocb->ki_flags & IOCB_NOWAIT)
> > iomi.flags |= IOMAP_NOWAIT;
> >
> > - while ((ret = iomap_iter(&iomi, ops)) > 0)
> > + while ((ret = iomap_iter(&iomi, ops)) > 0) {
> > iomi.processed = dax_iomap_iter(&iomi, iter);
> > + if (iomi.processed > 0)
> > + iomi.processed = iomap_iter_advance(&iomi,
> > + &iomi.processed);
> > + }
> >
> > done = iomi.pos - iocb->ki_pos;
> > iocb->ki_pos = iomi.pos;
> > --
> > 2.48.1
> >
> >
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2
2025-02-20 9:10 ` [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Christian Brauner
@ 2025-02-20 14:59 ` Brian Foster
0 siblings, 0 replies; 30+ messages in thread
From: Brian Foster @ 2025-02-20 14:59 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-xfs, Christoph Hellwig, Darrick J . Wong
On Thu, Feb 20, 2025 at 10:10:41AM +0100, Christian Brauner wrote:
> On Wed, Feb 19, 2025 at 12:50:38PM -0500, Brian Foster wrote:
> > Hi all,
> >
> > Here's phase 2 of the incremental iter advance conversions. This updates
>
> Hm, what's this based on? Can you base it on vfs-6.15.iomap, please?
>
Yup.. this was based on -rc3 plus the previous series that introduced
the core iomap change. I'll give hch a couple days or so for any further
comments and post a v3 with Darrick's nit fixed and rebased on the 6.15
iomap branch. Thanks!
Brian
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2025-02-20 14:57 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-19 17:50 [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Brian Foster
2025-02-19 17:50 ` [PATCH v2 01/12] iomap: advance the iter directly on buffered read Brian Foster
2025-02-19 22:22 ` Darrick J. Wong
2025-02-19 22:31 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 02/12] iomap: advance the iter on direct I/O Brian Foster
2025-02-19 22:22 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 03/12] iomap: convert misc simple ops to incremental advance Brian Foster
2025-02-19 22:24 ` Darrick J. Wong
2025-02-19 22:31 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 04/12] dax: advance the iomap_iter in the read/write path Brian Foster
2025-02-19 22:33 ` Darrick J. Wong
2025-02-20 14:58 ` Brian Foster
2025-02-19 17:50 ` [PATCH v2 05/12] dax: push advance down into dax_iomap_iter() for read and write Brian Foster
2025-02-19 22:34 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 06/12] dax: advance the iomap_iter on zero range Brian Foster
2025-02-19 22:34 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 07/12] dax: advance the iomap_iter on unshare range Brian Foster
2025-02-19 22:35 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 08/12] dax: advance the iomap_iter on dedupe range Brian Foster
2025-02-19 22:35 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 09/12] dax: advance the iomap_iter on pte and pmd faults Brian Foster
2025-02-19 22:36 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 10/12] iomap: remove unnecessary advance from iomap_iter() Brian Foster
2025-02-19 22:30 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 11/12] iomap: rename iomap_iter processed field to status Brian Foster
2025-02-19 22:30 ` Darrick J. Wong
2025-02-19 17:50 ` [PATCH v2 12/12] iomap: introduce a full map advance helper Brian Foster
2025-02-19 22:31 ` Darrick J. Wong
2025-02-20 9:10 ` [PATCH v2 00/12] iomap: incremental advance conversion -- phase 2 Christian Brauner
2025-02-20 14:59 ` Brian Foster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).