linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joanne Koong <joannelkoong@gmail.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: brauner@kernel.org, miklos@szeredi.hu, hch@infradead.org,
	 linux-block@vger.kernel.org, gfs2@lists.linux.dev,
	 linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	 linux-doc@vger.kernel.org, hsiangkao@linux.alibaba.com,
	kernel-team@meta.com
Subject: Re: [PATCH v4 09/15] iomap: add caller-provided callbacks for read and readahead
Date: Wed, 24 Sep 2025 11:18:14 -0700	[thread overview]
Message-ID: <CAJnrk1bYAaJofNBpYYKB2fWGVw-9BPrOUBy_ivmfnjR=49BmNQ@mail.gmail.com> (raw)
In-Reply-To: <20250924002654.GM1587915@frogsfrogsfrogs>

On Tue, Sep 23, 2025 at 5:26 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Mon, Sep 22, 2025 at 05:23:47PM -0700, Joanne Koong wrote:
> > Add caller-provided callbacks for read and readahead so that it can be
> > used generically, especially by filesystems that are not block-based.
> >
> > In particular, this:
> > * Modifies the read and readahead interface to take in a
> >   struct iomap_read_folio_ctx that is publicly defined as:
> >
> >   struct iomap_read_folio_ctx {
> >       const struct iomap_read_ops *ops;
> >       struct folio *cur_folio;
> >       struct readahead_control *rac;
> >       void *read_ctx;
> >   };
>
> I'm starting to wonder if struct iomap_read_ops should contain a struct
> iomap_ops object, but that might result in more churn through this
> patchset.
>
> <shrug> What do you think?

Lol I thought the same thing a while back for "struct iomap_write_ops"
but I don't think Christoph liked the idea [1]

[1] https://lore.kernel.org/linux-fsdevel/20250618044344.GE28041@lst.de/

>
> >
> >   where struct iomap_read_ops is defined as:
> >
> >   struct iomap_read_ops {
> >       int (*read_folio_range)(const struct iomap_iter *iter,
> >                              struct iomap_read_folio_ctx *ctx,
> >                              size_t len);
> >       void (*read_submit)(struct iomap_read_folio_ctx *ctx);
> >   };
> >
> >   read_folio_range() reads in the folio range and is required by the
> >   caller to provide. read_submit() is optional and is used for
> >   submitting any pending read requests.
> >
> > * Modifies existing filesystems that use iomap for read and readahead to
> >   use the new API, through the new statically inlined helpers
> >   iomap_bio_read_folio() and iomap_bio_readahead(). There is no change
> >   in functinality for those filesystems.
>
> Nit: functionality

Thanks, will fix this!
>
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> >  .../filesystems/iomap/operations.rst          | 45 ++++++++++++
> >  block/fops.c                                  |  5 +-
> >  fs/erofs/data.c                               |  5 +-
> >  fs/gfs2/aops.c                                |  6 +-
> >  fs/iomap/buffered-io.c                        | 68 +++++++++++--------
> >  fs/xfs/xfs_aops.c                             |  5 +-
> >  fs/zonefs/file.c                              |  5 +-
> >  include/linux/iomap.h                         | 62 ++++++++++++++++-
> >  8 files changed, 158 insertions(+), 43 deletions(-)
> >
> > diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst
> > index 067ed8e14ef3..dbb193415c0e 100644
> > --- a/Documentation/filesystems/iomap/operations.rst
> > +++ b/Documentation/filesystems/iomap/operations.rst
> > @@ -135,6 +135,29 @@ These ``struct kiocb`` flags are significant for buffered I/O with iomap:
> >
> >   * ``IOCB_DONTCACHE``: Turns on ``IOMAP_DONTCACHE``.
> >
> > +``struct iomap_read_ops``
> > +--------------------------
> > +
> > +.. code-block:: c
> > +
> > + struct iomap_read_ops {
> > +     int (*read_folio_range)(const struct iomap_iter *iter,
> > +                             struct iomap_read_folio_ctx *ctx, size_t len);
> > +     void (*submit_read)(struct iomap_read_folio_ctx *ctx);
> > + };
> > +
> > +iomap calls these functions:
> > +
> > +  - ``read_folio_range``: Called to read in the range. This must be provided
> > +    by the caller. The caller is responsible for calling
> > +    iomap_start_folio_read() and iomap_finish_folio_read() before and after
> > +    reading in the folio range. This should be done even if an error is
> > +    encountered during the read. This returns 0 on success or a negative error
> > +    on failure.
> > +
> > +  - ``submit_read``: Submit any pending read requests. This function is
> > +    optional.
> > +
> >  Internal per-Folio State
> >  ------------------------
> >
> > @@ -182,6 +205,28 @@ The ``flags`` argument to ``->iomap_begin`` will be set to zero.
> >  The pagecache takes whatever locks it needs before calling the
> >  filesystem.
> >
> > +Both ``iomap_readahead`` and ``iomap_read_folio`` pass in a ``struct
> > +iomap_read_folio_ctx``:
> > +
> > +.. code-block:: c
> > +
> > + struct iomap_read_folio_ctx {
> > +    const struct iomap_read_ops *ops;
> > +    struct folio *cur_folio;
> > +    struct readahead_control *rac;
> > +    void *read_ctx;
> > + };
> > +
> > +``iomap_readahead`` must set:
> > + * ``ops->read_folio_range()`` and ``rac``
> > +
> > +``iomap_read_folio`` must set:
> > + * ``ops->read_folio_range()`` and ``cur_folio``
>
> Hrmm, so we're multiplexing read and readahead through the same
> iomap_read_folio_ctx.  Is there ever a case where cur_folio and rac can
> both be used by the underlying machinery?  I think the answer to that
> question is "no" but I don't think the struct definition makes that
> obvious.

In the ->read_folio_range() callback, both rac and cur_folio are used
for readahead, but in passing in the "struct iomap_read_folio_ctx" to
the main iomap_read_folio()/iomap_readahead() entrypoint, no both rac
and cur_folio do not get set at the same time.

We could change the signature back to something like:
int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops,
const struct iomap_read_ops *ops, void *read_ctx);
void iomap_readahead(struct readahead_control *rac, const struct
iomap_ops *ops, const struct iomap_read_ops *ops, void *read_ctx);

but I think it might get a bit much if/when "void *private" needs to
get added too for iomap iter metadata, though maybe that's okay now
that the private read data has been renamed to read_ctx.


>
> > +
> >  static int iomap_read_folio_iter(struct iomap_iter *iter,
> >               struct iomap_read_folio_ctx *ctx, bool *folio_owned)
> >  {
> > @@ -436,7 +438,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> >       loff_t length = iomap_length(iter);
> >       struct folio *folio = ctx->cur_folio;
> >       size_t poff, plen;
> > -     loff_t count;
> > +     loff_t pos_diff;
> >       int ret;
> >
> >       if (iomap->type == IOMAP_INLINE) {
> > @@ -454,12 +456,16 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> >               iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff,
> >                               &plen);
> >
> > -             count = pos - iter->pos + plen;
> > -             if (WARN_ON_ONCE(count > length))
> > +             pos_diff = pos - iter->pos;
> > +             if (WARN_ON_ONCE(pos_diff + plen > length))
> >                       return -EIO;
>
> Er, can these changes get their own patch describing why the count ->
> pos_diff change was made?

I will separate this out into its own patch. The reasoning behind this
is so that the ->read_folio_range() callback doesn't need to take in a
pos arg but instead can get it from iter->pos [1]

[1] https://lore.kernel.org/linux-fsdevel/aMKt52YxKi1Wrw4y@infradead.org/

Thanks for looking at this patchset!

Thanks,
Joanne
>
> --D
>

  reply	other threads:[~2025-09-24 18:18 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-23  0:23 [PATCH v4 00/15] fuse: use iomap for buffered reads + readahead Joanne Koong
2025-09-23  0:23 ` [PATCH v4 01/15] iomap: move bio read logic into helper function Joanne Koong
2025-09-23  0:23 ` [PATCH v4 02/15] iomap: move read/readahead bio submission " Joanne Koong
2025-09-23  0:23 ` [PATCH v4 03/15] iomap: store read/readahead bio generically Joanne Koong
2025-09-23  0:23 ` [PATCH v4 04/15] iomap: iterate over folio mapping in iomap_readpage_iter() Joanne Koong
2025-09-23  0:23 ` [PATCH v4 05/15] iomap: rename iomap_readpage_iter() to iomap_read_folio_iter() Joanne Koong
2025-09-23  0:23 ` [PATCH v4 06/15] iomap: rename iomap_readpage_ctx struct to iomap_read_folio_ctx Joanne Koong
2025-09-23  0:23 ` [PATCH v4 07/15] iomap: track read/readahead folio ownership internally Joanne Koong
2025-09-24  0:13   ` Darrick J. Wong
2025-09-23  0:23 ` [PATCH v4 08/15] iomap: add public start/finish folio read helpers Joanne Koong
2025-09-23  0:23 ` [PATCH v4 09/15] iomap: add caller-provided callbacks for read and readahead Joanne Koong
2025-09-24  0:26   ` Darrick J. Wong
2025-09-24 18:18     ` Joanne Koong [this message]
2025-09-23  0:23 ` [PATCH v4 10/15] iomap: add bias for async read requests Joanne Koong
2025-09-24  0:28   ` Darrick J. Wong
2025-09-24 18:23     ` Joanne Koong
2025-09-23  0:23 ` [PATCH v4 11/15] iomap: move buffered io bio logic into new file Joanne Koong
2025-09-23  0:23 ` [PATCH v4 12/15] iomap: make iomap_read_folio() a void return Joanne Koong
2025-09-23  0:23 ` [PATCH v4 13/15] fuse: use iomap for read_folio Joanne Koong
2025-09-23 15:39   ` Miklos Szeredi
2025-09-23 17:21     ` Darrick J. Wong
2025-09-23  0:23 ` [PATCH v4 14/15] fuse: use iomap for readahead Joanne Koong
2025-09-23  0:23 ` [PATCH v4 15/15] fuse: remove fc->blkbits workaround for partial writes Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJnrk1bYAaJofNBpYYKB2fWGVw-9BPrOUBy_ivmfnjR=49BmNQ@mail.gmail.com' \
    --to=joannelkoong@gmail.com \
    --cc=brauner@kernel.org \
    --cc=djwong@kernel.org \
    --cc=gfs2@lists.linux.dev \
    --cc=hch@infradead.org \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=kernel-team@meta.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).