public inbox for v9fs@lists.linux.dev
 help / color / mirror / Atom feed
From: Chris Arges <carges@cloudflare.com>
To: asmadeus@codewreck.org
Cc: Christian Schoenebeck <linux_oss@crudebyte.com>,
	Christoph Hellwig <hch@infradead.org>,
	Eric Van Hensbergen <ericvh@kernel.org>,
	Latchesar Ionkov <lucho@ionkov.net>,
	v9fs@lists.linux.dev, linux-kernel@vger.kernel.org,
	David Howells <dhowells@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH RFC v2] 9p/virtio: convert to extract_iter_to_sg()
Date: Mon, 15 Dec 2025 10:46:34 -0600	[thread overview]
Message-ID: <aUA7aqpEBWIN1lKQ@861G6M3> (raw)
In-Reply-To: <aT4PZUtvirfq3Lov@codewreck.org>

On 2025-12-14 10:14:13, asmadeus@codewreck.org wrote:
> 
> (Chris, sorry, I had dropped you from Ccs as you were Cc'd from the
> patch itself and not the header... See [1] for start of thread if you're
> interested)
> [1] https://lore.kernel.org/r/20251214-virtio_trans_iter-v2-1-f7f7072e8c15@codewreck.org
> 
>
No problem. Would it help if I tested this v2 patch? I still can easily
reproduce the problem.
 
> Christian Schoenebeck wrote on Sat, Dec 13, 2025 at 07:02:00PM +0100:
> > On Saturday, 13 December 2025 16:07:40 CET Dominique Martinet via B4 Relay wrote:
> > > This simplifies the code quite a bit and should fix issues with
> > > blowing up when iov_iter points at kmalloc data
> > > 
> > > RFC - Not really tested yet!!
> > 
> > TBH, that bothers me.
> 
> Well, that just means "don't bother spending time testing much (or
> debugging if you do test and run into bugs)" and it likely won't be
> merged as is -- this is enough to start discussions without wasting more
> time if this gets refused.
> 
> > Also considering the huge amount of changes; again, what
> > was actually wrong with the previously suggested simple patch v1 [1]? All I
> > can see is a discussion about the term "folio" being misused in the commit log
> > message, but nothing about the patch being wrong per se.
> > [1] https://lore.kernel.org/all/20251210-virtio_trans_iter-v1-1-92eee6d8b6db@codewreck.org/
> 
> I agree we're close to a "perfect is the enemy of good" case here, but
> while it didn't show up in discussions I did bring it up in the patch
> comments themselves.
> My main problem is I'm pretty sure this will break any non-user non-kvec
> iov_iter; at the very least if we go that route we should guard the else
> with is_kvec(), figure out what type of iov Chris gets and handle that
> properly -- likely bvec? I still couldn't figure how to reproduce :/ --
> and error out cleanly in other cases.
> 

If it helps I can work on a more isolated reproducer. Essentially I found
this when running some of our internal testing tools which spin up qemu VMs
and run kernel tests. I may be able to whittle that down to something
externally consumable.

> That's enough work that I figured switching boat wouldn't be much
> different, and if nothing else I've learned -a lot- about the kernel
> scatterlist, iov_iter and virtio APIs, so we can always say that time
> wasn't wasted even if this patch ends up dropped.
> 
> The second problem that I'm reading between the lines of the replies is
> that iov_iter_get_pages_alloc2() is more or less broken/not supported
> for what we're trying to use it for, and the faster we stop using it the
> less bugs we'll get.
> 
> 
> (It's also really not such a huge patch, the bulk of the change is
> removed stuff we no longer use and massaging the cleanup path, but
> extract_iter_to_sg() is doing exactly what we were doing (lookup pages
> and sg_set_page() from the iov_iter) in better (for some reason when I
> added traces iov_iter_get_pages_alloc2() always stopped at one page for
> me?! I thought it was a cache thing but also happens with cache=none, I
> didn't spend time looking as this code will likely go away one way or
> another)
> 
> 
> > > This brings two major changes to how we've always done things with
> > > virtio 9p though:
> > > - We no longer fill in "chan->sg" with user data, but instead allocate a
> > >   scatterlist; this should not be a problem nor a slowdown as previous
> > >   code would allocate a page list instead, the main difference is that
> > >   this might eventually lead to lifting the 512KB msize limit if
> > >   compatible with virtio?
> > 
> > Remember that I already had a patch set for lifting the msize limit [2], which
> > was *heavily* tested and will of course break with these changes BTW, and the
> > reason why I used a custom struct virtqueue_sg instead of the shared sg_table
> > API was that the latter could not be resized (see commit log of patch 3).
> > 
> > [2] https://lore.kernel.org/all/cover.1657920926.git.linux_oss@crudebyte.com/
> 
> Ugh, I'm sorry, it had completely slipped out of my mind...
> And it was waiting for me to debug the RDMA issues wasn't it :/
> 
> FWIW I never heard back from former employer, and my lack of hardware
> situation hasn't changed, so we can probably mark RDMA as deprecated and
> see who complains and get them to look into it if they care...
> (I'm really sorry about having forgotten, but while I never have much
> time for 9p at any given moment if you don't hear back from me on some
> subject you want to push please do send a reminder after a month or
> so... It doesn't excuse my behavior and if we had any other maintainer
> active that might improve the situation, but at least it might prevent
> such useful patches from being forgotten while waiting on me)
> 
> (... actually now I'm done re-reading the patches we've already applied
> patch 10/11 that could impact RDMA, and the rest is specific to virtio,
> so there's nothing else that could go wrong with it as far as this is
> concerned?...)
> 
> 
> 
> OTOH I've learnt a lot about the scatterlist API meanwhile,
> and I don't necessarily agree about why you've had to basically
> reimplement sg_table just to chain them (what you described in
> patchset 3:
> > A custom struct virtqueue_sg is defined instead of using
> > shared API struct sg_table, because the latter would not
> > allow to resize the table after allocation. sg_append_table
> > API OTOH would not fit either, because it requires a list
> > of pages beforehand upon allocation. And both APIs only
> > support all-or-nothing allocation.
> )
> Having looked at sg_append_table I agree it doesn't look appropriate,
> but I think sg_table would work just fine -- you can call sg_chain on
> the last element of the table.
> It'll still require some of the helpers you introduced, but the
> virtqueue_sg type can go away, and we'd just need to loop over
> extract_iter_to_sg() for each of the sg_table "segment".
> 
> If I'm not mistaken here, then the patches aren't that incompatible, and
> it's something worth doing in one order or the other.
> 
> 
> 
> Anyway, what now?
> I'm happy to delay the switch to extract_iter_to_sg() to after your
> virtio msize rework if you want to send it again, but I don't think it's
> as simple as picking up the v1.
> 
> I've honestly spent too long on this for this weekend already, so
> I'll log off for now but if you have any suggestion I'm all ears.
> 
> Thanks,
> -- 
> Dominique Martinet | Asmadeus

      reply	other threads:[~2025-12-15 16:46 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-13 15:07 [PATCH RFC v2] 9p/virtio: convert to extract_iter_to_sg() Dominique Martinet via B4 Relay
2025-12-13 18:02 ` Christian Schoenebeck
2025-12-14  1:14   ` asmadeus
2025-12-15 16:46     ` Chris Arges [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aUA7aqpEBWIN1lKQ@861G6M3 \
    --to=carges@cloudflare.com \
    --cc=asmadeus@codewreck.org \
    --cc=dhowells@redhat.com \
    --cc=ericvh@kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux_oss@crudebyte.com \
    --cc=lucho@ionkov.net \
    --cc=v9fs@lists.linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox