From: Ira Weiny <iweiny@fastmail.com>
To: Gregory Price <gourry@gourry.net>, Dave Jiang <dave.jiang@intel.com>
Cc: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev,
djbw@kernel.org, iweiny@kernel.org, pasha.tatashin@soleen.com,
mclapinski@google.com, rppt@kernel.org,
joao.m.martins@oracle.com, jic23@kernel.org, john@groves.net,
rick.p.edgecombe@intel.com
Subject: Re: [RFC PATCH 00/12] dax: Add DAX to guest memfd support for KVM
Date: Wed, 29 Apr 2026 08:21:00 -0500
Message-ID: <69f205bc6402_3a7a81004d@xwing.notmuch>
In-Reply-To: <aerm4yDVYpOhxXEF@gourry-fedora-PF4VCD3F>
Gregory Price wrote:
> On Thu, Apr 23, 2026 at 10:02:07AM -0700, Dave Jiang wrote:
> > This RFC series is a proof of concept that connects device DAX to guest
> > memory by riding on top of guest memfd, in order to prove out that device DAX
> > can be used as guest memory. The series seeks to jump-start a discussion on
> > whether there is interest in creating a DAX bridge to utilize CXL memory for guest
> > memory until the N_PRIVATE implementation by Gregory [1] is available upstream
> > and DAX users are ready to move to the new scheme. Once there's an established
> > consensus of interest, we can move the discussion to the best way to implement
> > the DAX bridge and the future of device DAX as guest memory.
> >
> > I did the bare minimum to get the PoC to pass a modified version of the KVM gmem
> > selftest (guest_memfd_test), in order to prove out that DAX can go through the gmem
> > path. A DAX char dev is created and its fd is passed from user space with
> > vm_set_user_memory_region2(). Unlike memfd, where any size can be passed in to be
> > allocated, the DAX region is passed in as a whole.
> >
> > The folks on the cc line are people Dan Williams mentioned may be
> > interested in this.
>
> I see these as *mildly* orthogonal, but I think maybe you should propose
> a discussion at LSF to talk about this.
Sorry, I was a bit delayed on this thread due to some email issues.
Yes, this should be discussed at LSF if possible. But I also think this
is still a ways off, based on the responses we have seen here.
>
> guest_memfd in particular wants the host to never map the memory - and
> guests *generally* want 1GB huge page support (TLB go brrrrr).
That is _not_ going to be true forever. There is ongoing work to create
shared gmem for various reasons. For secure guests this is useful at
least for initial population of memory before handing it to the guest.
>
> There's a real argument for just handing a physical memory region over
> to guest_memfd and making it manage the region manually, rather than
> doing a bunch of nonsense just so you can call alloc_pages_node()
Agreed.
>
> So I see an extension like this as genuinely useful regardless of
> whether private nodes actually end up merged. It's a matter of
> flexibility and use cases.
Yep, the initial talks we had with Dan were about trying to make DAX FDs
more mainstream. Given some of the other work, it may be better to
deprecate DAX FDs. But deprecations can take a long time, so what Dave
came up with here tries to modernize those fds to be more useful for
guest computing.
Also, depending on existing use cases, this may be easier for folks to
adopt? But it may have more rough edges than it is worth?
>
> With this plumbing, you get less flexible use of the memory (you're tied
> to dax abstractions), whereas with private nodes you can build slightly
> more flexible general-system support.
>
> IN THEORY you could add something like an NP_OP_NOMAP to private nodes
> to make the buddy manage pages that don't have a direct map - BUT - in
> practice that's likely to be more of a bodge rather than a good design.
>
> So I will say - to the detriment of private nodes ;] - I like this idea.
I've investigated using private nodes as a mechanism for guest_memfd to
draw from. I think this is along the lines of what Frank mentioned
elsewhere in this thread.
>
> The question is ultimately how much flexibility you need to shuffle this
> capacity from one guest to another.
Yep. And how much control one needs over which exact CXL/DAX devices the
memory comes from. As you know from our community calls, that is one
thing I'm not sure the private node idea is great at. But it could be
that this is not really required, or is best handled as a carve-out.
Ira
Thread overview: 28+ messages
2026-04-23 17:02 [RFC PATCH 00/12] dax: Add DAX to guest memfd support for KVM Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 01/12] dax: rate limit dev_dax_huge_fault() output Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 02/12] dax: Save the kva from memremap Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 03/12] dax: Add fallocate support to device dax Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 04/12] dax: Move dax_pgoff_to_phys() to dax bus to be used by dev dax Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 05/12] dax: Add dax_operations and supporting functions to device dax Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 06/12] dax: Add helper to determine if a 'struct file' supports dax Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 07/12] KVM: guest_memfd: Add setup of daxfd when binding gmem Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 08/12] fs: allow char dev to go through fallocate Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 09/12] dax: Add dax_get_dev_dax() helper function Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 10/12] kvm: Implement dax support for KVM faulting Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 11/12] kvm: Add daxfd support for supported flags Dave Jiang
2026-04-23 17:02 ` [RFC PATCH 12/12] selftest/kvm: Add daxfd support for gmem selftest Dave Jiang
2026-04-23 17:27 ` [RFC PATCH 00/12] dax: Add DAX to guest memfd support for KVM Pasha Tatashin
2026-04-23 18:08 ` Dave Jiang
2026-04-23 18:21 ` Dave Jiang
2026-04-24 3:43 ` Gregory Price
2026-04-24 17:38 ` Frank van der Linden
2026-04-29 13:21 ` Ira Weiny [this message]
2026-04-29 23:58 ` Gregory Price
2026-04-24 17:13 ` Frank van der Linden
2026-04-24 18:23 ` Dave Jiang
2026-04-24 20:01 ` Frank van der Linden
2026-04-24 20:59 ` Dave Jiang
2026-05-06 20:23 ` Ackerley Tng
2026-05-06 20:37 ` Dave Jiang
2026-05-08 1:09 ` Ira Weiny
2026-05-10 14:40 ` Gregory Price