public inbox for linux-doc@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Joanne Koong <joannelkoong@gmail.com>,
	brauner@kernel.org, miklos@szeredi.hu, hch@infradead.org,
	linux-fsdevel@vger.kernel.org, kernel-team@meta.com,
	linux-xfs@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH v1 13/16] iomap: add a private arg for read and readahead
Date: Fri, 5 Sep 2025 08:21:18 -0700	[thread overview]
Message-ID: <20250905152118.GE1587915@frogsfrogsfrogs> (raw)
In-Reply-To: <d631c71f-9d0d-405f-862d-b881767b1945@linux.alibaba.com>

On Fri, Sep 05, 2025 at 10:21:19AM +0800, Gao Xiang wrote:
> 
> 
> On 2025/9/5 07:29, Joanne Koong wrote:
> > On Tue, Sep 2, 2025 at 6:55 PM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
> > > 
> 
> ...
> 
> 
> > > > > 
> > > > > >    int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops,
> > > > > > -             const struct iomap_read_ops *read_ops)
> > > > > > +             const struct iomap_read_ops *read_ops, void *private)
> > > > > >    {
> > > > > >         struct iomap_iter iter = {
> > > > > >                 .inode          = folio->mapping->host,
> > > > > >                 .pos            = folio_pos(folio),
> > > > > >                 .len            = folio_size(folio),
> > > > > > +             .private        = private,
> > > > > >         };
> > > > > 
> > > > > Will this whole work be landed for v6.18?
> > > > > 
> > > > > If not, may I ask if this patch can be shifted advance in this
> > > > > patchset for applying separately (I tried but no luck).
> > > > > 
> > > > > Because I also need some similar approach for EROFS iomap page
> > > > > cache sharing feature since EROFS uncompressed I/Os go through
> > > > > iomap and extra information needs a proper way to pass down to
> > > > > iomap_{begin,end} with extra pointer `.private` too.
> > > > 
> > > > Hi Gao,
> > > > 
> > > > I'm not sure whether this will be landed for v6.18 but I'm happy to
> > > > shift this patch to the beginning of the patchset for applying
> > > > separately.
> > > 
> > > Yeah, thanks.  At least this common patch can be potentially applied
> > > easily (e.g. form a common commit id for both features if really
> > > needed) since other iomap/FUSE patches are not dependency of our new
> > > feature and shouldn't be coupled with our development branch later.
> > > 
> > 
> > Hi Gao,
> > 
> > I'll be dropping this patch in v2 since all the iomap read stuff is
> > going to go through a struct ctx arg instead of through iter->private.
> > Sorry this won't help your use case, but looking forward to seeing your patches.
> 
> Hi Joanne,
> 
> Thanks for your reminder.  Okay, I will check your v2 to know how
> you change then.
> 
> Also, one thing I really think it's helpful for our use cases is
> converting .iomap_begin() at least to pass struct iomap_iter *
> directly rather than (inode, pos, len, flags, iomap, srcmap)
> since:
>   - .iomap_{begin,end}() are introduced before iomap_iter()
>     and struct iomap_iter but those callbacks are basically
>     now passed down some fields of `struct iomap_iter` now;
> 
>   - struct iomap_iter->private then can contain a per-request
>     context so that .iomap_begin() can leverage too;
> 
>   - There are already too many arguments for .iomap_begin(),
>     pass down struct iomap_iter directly could avoid adding
>     another `private` argument to .iomap_begin()..
> 
> Therefore, I do wonder if this change (.iomap_begin() passes
> struct iomap_iter *) is a good idea for the iomap folks, in
> addition that filesystems can specify `struct iomap_iter->private`
> as in this patch.  Since this change is necessary to make our
> page cache sharing feature efficient, I will continue working on
> this soon.

From a source code perspective, I like the idea of cleaning up the
function signature to pass fewer things to ->iomap_begin.  I suspect
that we could simplify it to:

	int (*iomap_begin)(const struct iomap_iter *iter,
			   struct iomap *iomap,
			   struct iomap *srcmap);

That way we preserve the notion that the ->iomap_begin functions aren't
allowed to change the iterator contents except for the two iomaps.

That said, the nice thing about passing so many parameters is that it
probably leads to less pointer chasing in the implementation functions.
I wonder if that makes any difference because extent mapping lookups
likely involve a lot more pointer chasing anyway.  Another benefit is
that since the parameters aren't const, each implementation can (re)use
those variables if they need to.

I think you could simplify iomap_end too:

	int (*iomap_end)(const struct iomap_iter *iter,
			 loff_t pos, u64 length,
			 size_t written);

and make ->iomap_end implementations extract iter->flags and iter->iomap
themselves if they want to.  I don't like how xfs_iomap.c abuses
container_of to extract the iter from the iomap pointer.

(But not enough to have written patches fixing any of this. :P)

> Another thing I want to discuss (but it's less important for our
> recent features) is the whole callback hook model of iomap.
> 
> Basically the current model does mean if any filesystem doesn't
> fulfill the iomap standard flow, it has to add some customized
> callback hook somewhere to modify the code flow then (or introduce
> a new special flag and move their specific logic into iomap/
> itself even other fses may not need this), but the hook way will
> cause increased indirect calls for them, currently we have
> `struct iomap_ops`, `struct iomap_writeback_ops` and
> `struct iomap_dio_ops`, if some another filesystem (when converting
> buffer I/Os for example or adding {pre,post}-processing ) have
> specified timing, it needs to add new hooks then.
> 
> I do wonder if it's possible to convert iomap to get rid of the
> indirect-call model by just providing helper kAPIs instead,
> take .read_folio / .fiemap for example e.g.
> 
>    xxxfs_read_folio:
>       loop iomap_iter
>         xxxfs_iomap_begin();
> 	iomap_readpage_bio_advance(); [ or if a fs is non-bio
>              based, spliting more low-level helpers for them. ]
>         xxxfs_iomap_end();
> 
>    xxxfs_fiemap():
>       iomap_fiemap_begin
>       loop iomap_iter
>         xxxfs_iomap_begin();
>         iomap_readpage_fiemap_advance()
>         xxxfs_iomap_end();
>       iomap_fiemap_end
> So that each fs can use those helpers flexibly instead of diging
> into adding various new indirect call hooks or moving customized
> logic into iomap/ itself.

Yes, it's quite possible to push the iomap iteration control down into
the filesystems to avoid the indirect calls.  That might make things
faster, though I have no idea what sort of performance impact that will
have.

> I don't have a specific example  because currently we don't have
> direct issue against standard iomap flow on our uncompressed
> path, but after a quick glance of other potential users who try
> to convert their buffer I/Os to iomap, I had such impression in
> my head for a while.

OTOH making it easier for non-disk filesystems to use iomap but supply
their own IO mechanism (transformed bios, memcpy, etc) makes a far more
compelling argument for doing this painful(?) treewide change IMO.

--D

> Thanks,
> Gao Xiang
> 
> > 
> > 
> > Thanks,
> > Joanne
> > 
> > > Thanks,
> > > Gao Xiang
> > > 
> > > > 
> > > > Thanks,
> > > > Joanne
> > > > > 
> > > > > Thanks,
> > > > > Gao Xiang
> > > 
> 
> 

  reply	other threads:[~2025-09-05 15:21 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-29 23:56 [PATCH v1 00/16] fuse: use iomap for buffered reads + readahead Joanne Koong
2025-08-29 23:56 ` [PATCH v1 01/16] iomap: move async bio read logic into helper function Joanne Koong
2025-09-03 20:16   ` Darrick J. Wong
2025-09-04  6:00     ` Christoph Hellwig
2025-09-04 21:44       ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 02/16] iomap: rename cur_folio_in_bio to folio_unlocked Joanne Koong
2025-09-03 20:26   ` [PATCH v1 02/16] iomap: rename cur_folio_in_bio to folio_unlockedOM Darrick J. Wong
2025-09-04  6:03     ` Christoph Hellwig
2025-09-04 22:06       ` Joanne Koong
2025-09-05 15:03         ` Darrick J. Wong
2025-08-29 23:56 ` [PATCH v1 03/16] iomap: refactor read/readahead completion Joanne Koong
2025-09-04  6:05   ` Christoph Hellwig
2025-09-04 23:16     ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 04/16] iomap: use iomap_iter->private for stashing read/readahead bio Joanne Koong
2025-09-03 20:30   ` Darrick J. Wong
2025-09-04  6:07     ` Christoph Hellwig
2025-09-04 22:20       ` Joanne Koong
2025-09-04 23:15         ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 05/16] iomap: propagate iomap_read_folio() error to caller Joanne Koong
2025-09-03 20:32   ` Darrick J. Wong
2025-09-04  6:09   ` Christoph Hellwig
2025-09-04 21:13     ` Matthew Wilcox
2025-09-22 16:49       ` Joanne Koong
2025-09-23 18:34         ` Darrick J. Wong
2025-09-24 23:12           ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 06/16] iomap: move read/readahead logic out of CONFIG_BLOCK guard Joanne Koong
2025-08-29 23:56 ` [PATCH v1 07/16] iomap: iterate through entire folio in iomap_readpage_iter() Joanne Koong
2025-09-03 20:43   ` Darrick J. Wong
2025-09-04 22:37     ` Joanne Koong
2025-09-04  6:14   ` Christoph Hellwig
2025-09-04 22:45     ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 08/16] iomap: rename iomap_readpage_iter() to iomap_readfolio_iter() Joanne Koong
2025-09-04  6:15   ` Christoph Hellwig
2025-09-04 22:47     ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 09/16] iomap: rename iomap_readpage_ctx struct to iomap_readfolio_ctx Joanne Koong
2025-09-03 20:44   ` Darrick J. Wong
2025-08-29 23:56 ` [PATCH v1 10/16] iomap: add iomap_start_folio_read() helper Joanne Koong
2025-09-03 20:52   ` Darrick J. Wong
2025-08-29 23:56 ` [PATCH v1 11/16] iomap: make start folio read and finish folio read public APIs Joanne Koong
2025-09-03 20:53   ` Darrick J. Wong
2025-09-04  6:15   ` Christoph Hellwig
2025-08-29 23:56 ` [PATCH v1 12/16] iomap: add iomap_read_ops for read and readahead Joanne Koong
2025-09-03 21:08   ` Darrick J. Wong
2025-09-04 20:58     ` Joanne Koong
2025-09-04  6:21   ` Christoph Hellwig
2025-09-04 21:38     ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 13/16] iomap: add a private arg " Joanne Koong
2025-08-30  1:54   ` Gao Xiang
2025-09-02 21:24     ` Joanne Koong
2025-09-03  1:55       ` Gao Xiang
2025-09-04 23:29         ` Joanne Koong
2025-09-05  2:21           ` Gao Xiang
2025-09-05 15:21             ` Darrick J. Wong [this message]
2025-09-08  6:56               ` Gao Xiang
2025-09-03 21:11   ` Darrick J. Wong
2025-08-29 23:56 ` [PATCH v1 14/16] fuse: use iomap for read_folio Joanne Koong
2025-09-03 21:13   ` Darrick J. Wong
2025-08-29 23:56 ` [PATCH v1 15/16] fuse: use iomap for readahead Joanne Koong
2025-09-03 21:17   ` Darrick J. Wong
2025-09-04 19:40     ` Joanne Koong
2025-09-04 19:46   ` Joanne Koong
2025-08-29 23:56 ` [PATCH v1 16/16] fuse: remove fuse_readpages_end() null mapping check Joanne Koong
2025-09-02  9:21   ` Miklos Szeredi
2025-09-02 21:19     ` Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250905152118.GE1587915@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=brauner@kernel.org \
    --cc=hch@infradead.org \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=joannelkoong@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox