From: Michael Roth <michael.roth@amd.com>
To: <kvm@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
	<linux-fsdevel@vger.kernel.org>
Cc: <david@kernel.org>, <fvdl@google.com>, <ira.weiny@intel.com>,
	<jthoughton@google.com>, <pankaj.gupta@amd.com>,
	<rick.p.edgecombe@intel.com>, <seanjc@google.com>,
	<vannapurve@google.com>, <yan.y.zhao@intel.com>,
	<kalyazin@amazon.co.uk>
Subject: Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd
Date: Mon, 16 Feb 2026 17:07:33 -0600	[thread overview]
Message-ID: <20260216230733.ejxtppfrbjaarftb@amd.com> (raw)

I'm not sure I'm hitting the same issue you were, but in order to fix
the race I was hitting I needed to grab the range lock outside of the
kvm_gmem_get_folio() path so that it could provide mutual exclusion on
the allocation as well as the subsequent splitting of newly-allocated
hugepages.

Here's the patch I needed on top:

  https://github.com/mdroth/linux/commit/240e09e68fe61bb0dfad6a8e054a6aa9316a3660

I think this same issue exists for the THP implementation[1]. A range
lock built around filemap indices instead of physical addresses could
maybe address both, but I'm not sure it's worthwhile since THP has been
deemed non-upstreamable until general memory migration support is added
to gmem.

I'll dump the code below for reference since I know some folks on Cc
have been asking about it. It isn't yet in a state where it's worth
posting separately, but it is at least relevant to this particular
discussion. For now, I've just piggy-backed off the filemap invalidate
write lock to serialize all allocations. I've only hit the race
condition once for 2MB; it's a lot easier to hit with 1GB using hugetlb.

[1]

The THP patches are currently on top of a snapshot of Ackerley's hugetlb dev
tree. I'd originally planned to rebase on top of just the common
dependencies and post upstream, but based on the latest guest_memfd/PUCK
calls, there is no chance of THP going upstream without first implementing
memory migration support for guest_memfd to deal with system-wide/cumulative
fragmentation. So I'm tabling that work; it's just these 3 patches on top for
now:

  2ae099ef6977 KVM: guest_memfd: Serialize allocations when THP is enabled
  733f7a111699 [WIP] KVM: guest_memfd: Enable/fix hugepages for in-place conversion
  349aa261ac65 KVM: Add hugepage support for dedicated guest memory

The initial patch adds THP support for legacy/non-inplace; the remaining 2
enable it for inplace. There are various warnings/TODOs/debugs. I'm only
posting it for reference since I don't know when I'll get to a cleaned-up
version, as it's not clear it'll be useful in the near-term.

  Kernel:
    https://github.com/mdroth/linux/commits/snp-thp-rfc2-wip0

  QEMU:
    https://github.com/mdroth/qemu/commits/snp-hugetlb-v3wip0b

  To run QEMU with in-place conversion enabled you need the following
  option (SNP will default to legacy/non-inplace conversion otherwise):
    qemu ... -object sev-snp-guest,...,convert-in-place=true

  To enable hugepages when using either convert-in-place=false/true, a
  kvm module parameter turns it on for now (flipping it on/off rapidly
  may help with simulating/testing low-memory situations):

    echo 1 >/sys/module/kvm/gmem_2m_enabled

  This tree also supports SNP+hugetlbfs with the following, in case you
  need it for comparison:

  For 2MB hugetlb:
    qemu ... \
      -object sev-snp-guest,...,convert-in-place=true,gmem-allocator=hugetlb,gmem-page-size=2097152

  For 1GB hugetlb:
    qemu ... \
      -object sev-snp-guest,...,convert-in-place=true,gmem-allocator=hugetlb,gmem-page-size=1073741824

