From: "Pierre Barre" <pierre@barre.sh>
To: "Christian Schoenebeck" <linux_oss@crudebyte.com>,
	asmadeus <asmadeus@codewreck.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	v9fs@lists.linux.dev, ericvh@kernel.org, lucho@ionkov.net,
	linux-kernel@vger.kernel.org
Subject: [PATCH v2] 9p: Use kvmalloc for message buffers on supported transports
Date: Thu, 16 Oct 2025 09:01:56 +0200	[thread overview]
Message-ID: <7005d8d9-d42d-409f-b8e3-cd7207059eee@app.fastmail.com> (raw)
In-Reply-To: <8602724.2ttRNpPraX@silver>

While developing a 9P server (https://github.com/Barre/ZeroFS) and
testing it under high load, I was running into allocation failures.
The failures occur even with plenty of free memory available, because
kmalloc requires physically contiguous memory.

This results in errors like:
ls: page allocation failure: order:7, mode:0x40c40(GFP_NOFS|__GFP_COMP)

This patch introduces a transport capability flag (supports_vmalloc)
that indicates whether a transport can work with vmalloc'd buffers
(non-physically contiguous memory). Transports requiring DMA should
leave this flag as false.

The fd-based transports (tcp, unix, fd) set this flag to true, and
p9_fcall_init will use kvmalloc instead of kmalloc for these
transports. This allows the allocator to fall back to vmalloc when
contiguous physical memory is not available.

Additionally, if kmem_cache_alloc fails, the code falls back to
kvmalloc for transports that support it.

Signed-off-by: Pierre Barre <pierre@barre.sh>
---

 include/net/9p/transport.h |  4 ++++
 net/9p/client.c            | 11 +++++++++--
 net/9p/trans_fd.c          |  3 +++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/net/9p/transport.h b/include/net/9p/transport.h
index 766ec07c9599..f0981515148d 100644
--- a/include/net/9p/transport.h
+++ b/include/net/9p/transport.h
@@ -24,6 +24,9 @@
  *                   we're less flexible when choosing the response message
  *                   size in this case
  * @def: set if this transport should be considered the default
+ * @supports_vmalloc: set if this transport can work with vmalloc'd buffers
+ *                    (non-physically contiguous memory). Transports requiring
+ *                    DMA should leave this as false.
  * @create: member function to create a new connection on this transport
  * @close: member function to discard a connection on this transport
  * @request: member function to issue a request to the transport
@@ -44,6 +47,7 @@ struct p9_trans_module {
 	int maxsize;		/* max message size of transport */
 	bool pooled_rbuffers;
 	int def;		/* this transport should be default */
+	bool supports_vmalloc;	/* can work with vmalloc'd buffers */
 	struct module *owner;
 	int (*create)(struct p9_client *client,
 		      const char *devname, char *args);
diff --git a/net/9p/client.c b/net/9p/client.c
index 5c1ca57ccd28..2a4884c880c1 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -229,8 +229,15 @@ static int p9_fcall_init(struct p9_client *c, struct p9_fcall *fc,
 	if (likely(c->fcall_cache) && alloc_msize == c->msize) {
 		fc->sdata = kmem_cache_alloc(c->fcall_cache, GFP_NOFS);
 		fc->cache = c->fcall_cache;
+		if (!fc->sdata && c->trans_mod->supports_vmalloc) {
+			fc->sdata = kvmalloc(alloc_msize, GFP_NOFS);
+			fc->cache = NULL;
+		}
 	} else {
-		fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
+		if (c->trans_mod->supports_vmalloc)
+			fc->sdata = kvmalloc(alloc_msize, GFP_NOFS);
+		else
+			fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
 		fc->cache = NULL;
 	}
 	if (!fc->sdata)
@@ -252,7 +259,7 @@ void p9_fcall_fini(struct p9_fcall *fc)
 	if (fc->cache)
 		kmem_cache_free(fc->cache, fc->sdata);
 	else
-		kfree(fc->sdata);
+		kvfree(fc->sdata);
 }
 EXPORT_SYMBOL(p9_fcall_fini);
 
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index a516745f732f..e7334033eba5 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -1101,6 +1101,7 @@ static struct p9_trans_module p9_tcp_trans = {
 	.maxsize = MAX_SOCK_BUF,
 	.pooled_rbuffers = false,
 	.def = 0,
+	.supports_vmalloc = true,
 	.create = p9_fd_create_tcp,
 	.close = p9_fd_close,
 	.request = p9_fd_request,
@@ -1115,6 +1116,7 @@ static struct p9_trans_module p9_unix_trans = {
 	.name = "unix",
 	.maxsize = MAX_SOCK_BUF,
 	.def = 0,
+	.supports_vmalloc = true,
 	.create = p9_fd_create_unix,
 	.close = p9_fd_close,
 	.request = p9_fd_request,
@@ -1129,6 +1131,7 @@ static struct p9_trans_module p9_fd_trans = {
 	.name = "fd",
 	.maxsize = MAX_SOCK_BUF,
 	.def = 0,
+	.supports_vmalloc = true,
 	.create = p9_fd_create,
 	.close = p9_fd_close,
 	.request = p9_fd_request,
-- 
2.39.5 (Apple Git-154)

On Fri, Aug 8, 2025, at 13:12, Christian Schoenebeck wrote:
> On Wednesday, August 6, 2025 11:44:34 PM CEST Dominique Martinet wrote:
>> 
>> Pierre Barre wrote on Wed, Aug 06, 2025 at 05:50:42PM +0200:
>> > If I submit a patch based on what has been discussed above, is it
>> > likely to be accepted? Unfortunately, in my current setup, I am
>> > encountering this issue quite frequently, and users of my servers are
>> > having a hard time making sense of the error.
>> 
>> Yes, sorry it wasn't clear.
>> 
>> I still have no idea what's the "best" allocation method that we'll be
>> able to use as either a vmalloc buffer or split into a scatterlist, but
>> there's little point in worrying too much about it, so please go ahead.
>> 
>> If it's restricted to trans_fd and there's a chance we can make use of
>> it with (at least) virtio later I think everyone will be happy :)
>
> Yes, sounds like a viable plan.
>
> Pierre, one more thing to note: kmem_cache_alloc() might still fail though. So
> maybe it would make sense to add a separate patch that checks the result
> of kmem_cache_alloc() and, if it fails, falls back to your kvmalloc() call
> (if enabled by the discussed transport mechanism, of course).
>
> /Christian
