public inbox for linux-kernel@vger.kernel.org
From: Dominique Martinet <asmadeus@codewreck.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Chris Arges <carges@cloudflare.com>,
	David Howells <dhowells@redhat.com>,
	ericvh@kernel.org, lucho@ionkov.net, linux_oss@crudebyte.com,
	v9fs@lists.linux.dev, linux-kernel@vger.kernel.org,
	kernel-team@cloudflare.com
Subject: Re: kernel BUG when mounting large block xfs backed by 9p (folio ref count bug)
Date: Tue, 25 Nov 2025 18:03:12 +0900
Message-ID: <aSVw0M8f3vTXdQxH@codewreck.org>
In-Reply-To: <aSTwj8LfyFvAXEqc@casper.infradead.org>

Matthew Wilcox wrote on Mon, Nov 24, 2025 at 11:55:59PM +0000:
> > > [   31.395976][   T62] page_type: f8(unknown)
> 
>         PGTY_large_kmalloc      = 0xf8,
> 
> So somebody called kmalloc(2 * 1024 * 1024).  Not sure if that's helpful
> in tracking this down?

This is a "zero-copy rpc", so the pages come from wherever the iov_iter
we were passed came from, and we don't really check...
In particular, that zero-copy code in net/9p/trans_virtio.c hasn't
changed much since Al Viro rewrote the 9p code to use iov_iter in 2015
(commit 4f3b35c157e4 ("net/9p: switch the guts of
p9_client_{read,write}() to iov_iter")), and I'm not sure anyone
ever checked whether it is anywhere close to friendly with folios...

So I guess it turned out not to be:
> > > [   31.398075][   T62]  ? kvm_sched_clock_read+0x11/0x20
> > > [   31.398131][   T62]  ? sched_clock+0x10/0x30
> > > [   31.398179][   T62]  ? sched_clock_cpu+0xf/0x1d0
> > > [   31.398234][   T62]  iov_iter_get_pages_alloc2+0x20/0x50
> > > [   31.398277][   T62]  p9_get_mapped_pages.part.0.constprop.0+0x6f/0x280 [9pnet_virtio]
> 
> Oh, hang on.  You're passing a kmalloc'ed page to
> iov_iter_get_pages_alloc().  That's not allowed ...

Thanks for finding this, I wouldn't have noticed.


> see https://lore.kernel.org/all/20250310142750.1209192-1-willy@infradead.org/

I'm sorry but I'm not sure I see what I should do from this -- your
patch looks to me like it should now work with this?
Oh, it's not merged?... I don't see where the discussion stalled
either...


For context, in this case virtio needs the pages to be pinned because
the host will write directly into them, and the API we're using is
virtqueue_add_sgs() (drivers/virtio/virtio_ring.c), which expects a
scatterlist, which I guess must point to pages (can't say I'm very
familiar with this particular API either, but the word `folio` doesn't
show up in drivers/virtio).
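
To make the "scatterlist of pages" idea concrete, here is a minimal
userspace sketch of how a byte range breaks down into per-page
(page, offset, length) entries, which is roughly the shape of what a
scatterlist entry describes. This is not kernel code: struct model_seg,
model_split_into_pages() and the 4 KiB MODEL_PAGE_SIZE are hypothetical
names and assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one scatterlist entry. */
struct model_seg {
	uintptr_t page_index;	/* addr / MODEL_PAGE_SIZE */
	unsigned int offset;	/* where the data starts within that page */
	unsigned int len;	/* bytes of data within that page */
};

/*
 * Split [addr, addr + len) into per-page segments.
 * Returns the number of segments written, or -1 if more than
 * max_segs entries would be needed.
 */
static int model_split_into_pages(uintptr_t addr, size_t len,
				  struct model_seg *segs, int max_segs)
{
	int n = 0;

	assert(max_segs >= 0);

	while (len > 0) {
		unsigned int off = (unsigned int)(addr % MODEL_PAGE_SIZE);
		size_t chunk = MODEL_PAGE_SIZE - off;

		if (chunk > len)
			chunk = len;
		if (n == max_segs)
			return -1;

		segs[n].page_index = addr / MODEL_PAGE_SIZE;
		segs[n].offset = off;
		segs[n].len = (unsigned int)chunk;
		n++;

		addr += chunk;
		len -= chunk;
	}
	return n;
}
```

With these assumptions, a buffer that starts 100 bytes into a page and
is 8192 bytes long needs three entries: a partial first page, one full
page, and a 100-byte tail.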



Since we don't know where the iov comes from, we can't have any
expectations about it, but we can check things and try to act
appropriately (or error out and/or somehow fall back to non-zc if
there's a reason we can't do it).
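
As a rough illustration of the "check, then fall back to non-zc" idea,
here is a userspace model of the decision only. Every name in it
(model_mem_kind, model_seg_desc, model_choose_path) is hypothetical, and
the single rule it encodes is the one from this thread: kmalloc'ed
memory must not go down the page-grabbing path. How the kernel would
actually classify the memory behind an iov_iter is exactly the open
question, so this is a sketch under those assumptions, not a proposed
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of where a segment's memory came from. */
enum model_mem_kind {
	MODEL_MEM_USER,		/* user pages */
	MODEL_MEM_PAGECACHE,	/* page cache memory */
	MODEL_MEM_KMALLOC,	/* slab / large kmalloc memory */
};

enum model_path {
	MODEL_PATH_ZEROCOPY,	/* hand the pages to the transport directly */
	MODEL_PATH_COPY,	/* copy through a bounce buffer (non-zc) */
};

struct model_seg_desc {
	enum model_mem_kind kind;
	size_t len;
};

/* Assumption: only kmalloc-backed memory is off limits here. */
static bool model_seg_pinnable(const struct model_seg_desc *seg)
{
	return seg->kind != MODEL_MEM_KMALLOC;
}

/* Use zero-copy only if every segment passes the check. */
static enum model_path
model_choose_path(const struct model_seg_desc *segs, size_t nsegs)
{
	assert(segs != NULL || nsegs == 0);

	for (size_t i = 0; i < nsegs; i++) {
		if (!model_seg_pinnable(&segs[i]))
			return MODEL_PATH_COPY;
	}
	return MODEL_PATH_ZEROCOPY;
}
```

The decision is all-or-nothing per request to keep the model simple: one
unsuitable segment sends the whole request down the copy path.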

What would one need to go from an iov_iter to something this could use?

Out of curiosity I looked at other "big" virtqueue users (e.g. vhost
scsi must be shuffling similar data around), but I don't quite see how
the buffers are passed; I'd need to spend more time on it than I can
afford right now...


Thanks (and sorry for pulling the whole arm when you give a hand),
-- 
Dominique

