From: "Gowans, James" <jgowans@amazon.com>
To: "seanjc@google.com" <seanjc@google.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
	"Kalyazin, Nikita" <kalyazin@amazon.co.uk>,
	 "rppt@kernel.org" <rppt@kernel.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"Roy, Patrick" <roypat@amazon.co.uk>,
	"somlo@cmu.edu" <somlo@cmu.edu>,
	"vbabka@suse.cz" <vbabka@suse.cz>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
	"Woodhouse, David" <dwmw@amazon.co.uk>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>, "Graf (AWS),
	Alexander" <graf@amazon.de>,
	"Manwaring, Derek" <derekmn@amazon.com>,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"lstoakes@gmail.com" <lstoakes@gmail.com>,
	"mst@redhat.com" <mst@redhat.com>
Subject: Re: Unmapping KVM Guest Memory from Host Kernel
Date: Mon, 13 May 2024 19:43:01 +0000
Message-ID: <f880d0187e2d482bc8a8095cf5b7404ea9d6fb03.camel@amazon.com>
In-Reply-To: <ZkJFIpEHIQvfuzx1@google.com>

On Mon, 2024-05-13 at 10:09 -0700, Sean Christopherson wrote:
> On Mon, May 13, 2024, James Gowans wrote:
> > On Mon, 2024-05-13 at 08:39 -0700, Sean Christopherson wrote:
> > > > Sean, you mentioned that you envision guest_memfd also supporting non-CoCo VMs.
> > > > Do you have some thoughts about how to make the above cases work in the
> > > > guest_memfd context?
> > > 
> > > Yes.  The hand-wavy plan is to allow selectively mmap()ing guest_memfd().  There
> > > is a long thread[*] discussing how exactly we want to do that.  The TL;DR is that
> > > the basic functionality is also straightforward; the bulk of the discussion is
> > > around gup(), reclaim, page migration, etc.
> > 
> > I still need to read this long thread, but just a thought on the word
> > "restricted" here: for MMIO the instruction can be anywhere, and
> > similarly the load/store MMIO data can be anywhere. Does this mean
> > that for running unmodified non-CoCo VMs with a guest_memfd backend
> > we'll always need to have the whole of guest memory mmapped?
> 
> Not necessarily, e.g. KVM could re-establish the direct map or mremap() on-demand.
> There are variations on that, e.g. if ASI[*] were to ever make its way upstream,
> which is a huge if, then we could have guest_memfd mapped into a KVM-only CR3.

Yes, mapping guest RAM pages in on demand is definitely an option. But it
sounds quite challenging to always have to go via interfaces which
map/fault the memory in on demand, and potentially quite slow too, given
the need to unmap and flush afterwards.
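
To make the cost concrete, here is a rough pseudo-kernel sketch of what
such a just-in-time access path could look like. The gmem_copy_to_guest()
helper, its signature, and the idea of handling one page per access are
all made up for illustration; only the direct-map primitives
(set_direct_map_default_noflush(), set_direct_map_invalid_noflush(),
flush_tlb_kernel_range()) are real, and they are the part that every
access would have to pay for:

#include <linux/mm.h>
#include <linux/set_memory.h>
#include <linux/string.h>
#include <asm/tlbflush.h>

/*
 * Hypothetical helper: copy emulated MMIO data (or anything else KVM
 * needs to write) into a guest_memfd page that is normally absent from
 * the kernel direct map.
 */
static int gmem_copy_to_guest(struct page *page, unsigned int offset,
			      const void *src, unsigned int len)
{
	unsigned long vaddr;
	int ret;

	/* Temporarily put the page back into the kernel direct map. */
	ret = set_direct_map_default_noflush(page);
	if (ret)
		return ret;

	vaddr = (unsigned long)page_address(page);
	memcpy((void *)(vaddr + offset), src, len);

	/* Remove it again, and eat a kernel TLB flush on every access. */
	set_direct_map_invalid_noflush(page);
	flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);

	return 0;
}

Doing that dance for every MMIO emulation, nested page table walk, etc.
is the part that worries me performance-wise.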

I'm not sure what you have in mind with "guest_memfd mapped into a
KVM-only CR3" - could you expand on that?

> > I guess the idea is that this use case will still be subject to the
> > normal restriction rules, but for a non-CoCo non-pKVM VM there will be
> > no restriction in practice, and userspace will need to mmap everything
> > always?
> > 
> > It really seems yucky to need to have all of guest RAM mmapped all the
> > time just for MMIO to work... But I suppose there is no way around that
> > for Intel x86.
> 
> It's not just MMIO.  Nested virtualization, and more specifically shadowing nested
> TDP, is also problematic (probably more so than MMIO).  And there are more cases,
> i.e. we'll need a generic solution for this.  As above, there are a variety of
> options, it's largely just a matter of doing the work.  I'm not saying it's a
> trivial amount of work/effort, but it's far from an unsolvable problem.

I hadn't even thought of nested virt, but that will be an even bigger
problem. MMIO was just the first roadblock that illustrated the issue.

Overall, what I'm trying to figure out is whether there is any sane path
here other than mmapping all guest RAM all the time. Trying to get
nested virt, MMIO, and whatever else needs access to guest RAM working
by doing just-in-time (i.e. on-demand) mappings and unmappings of guest
RAM sounds like a painful game of whack-a-mole, and potentially really
bad for performance too.

Do you think we should look at doing this on-demand mapping, or, for
now, simply require that all guest RAM is mmapped all the time and that
KVM is given a valid virtual address for the memslots?

Note that I'm specifically referring to regular non-CoCo, non-enlightened
VMs here. For CoCo we definitely need all the cooperative MMIO and
sharing. What we're trying to do here is get guest RAM out of the direct
map using guest_memfd, and we're now tackling the knock-on problem of
whether or not to mmap all of guest RAM all the time in userspace.
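
For reference, below is a very rough userspace sketch of the "mmap all of
guest RAM all the time" option. The KVM_CREATE_GUEST_MEMFD and
KVM_SET_USER_MEMORY_REGION2 ioctls are uAPI that exists upstream, but
today guest_memfd is only wired up for protected VM types and cannot be
mmap()ed at all - using it like this for a plain non-CoCo VM, including
the mmap() call, is exactly the hypothetical part under discussion:

#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define GUEST_RAM_SIZE (1ULL << 30)	/* 1 GiB of guest RAM */

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, 0);

	/* Guest RAM lives in a guest_memfd, out of the direct map. */
	struct kvm_create_guest_memfd gmem = { .size = GUEST_RAM_SIZE };
	int gmem_fd = ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem);

	/*
	 * "mmap everything all the time": give userspace (and thereby MMIO
	 * emulation, nested TDP shadowing, etc.) a permanent mapping of all
	 * guest RAM. mmap() on a guest_memfd is the part that does not
	 * exist today.
	 */
	void *hva = mmap(NULL, GUEST_RAM_SIZE, PROT_READ | PROT_WRITE,
			 MAP_SHARED, gmem_fd, 0);

	/* Back the memslot with the guest_memfd and hand KVM a valid HVA. */
	struct kvm_userspace_memory_region2 slot = {
		.slot = 0,
		.flags = KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr = 0,
		.memory_size = GUEST_RAM_SIZE,
		.userspace_addr = (uint64_t)hva,
		.guest_memfd = gmem_fd,
		.guest_memfd_offset = 0,
	};
	if (ioctl(vm, KVM_SET_USER_MEMORY_REGION2, &slot) < 0)
		perror("KVM_SET_USER_MEMORY_REGION2");

	close(gmem_fd);
	close(vm);
	close(kvm);
	return 0;
}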

JG

Thread overview: 21+ messages
     [not found] <AQHacXBJeX10YUH0O0SiQBg1zQLaEw==>
2024-03-08 15:50 ` Unmapping KVM Guest Memory from Host Kernel Gowans, James
2024-03-08 16:25   ` Brendan Jackman
2024-03-08 17:35     ` David Matlack
2024-03-08 17:45       ` David Woodhouse
2024-03-08 22:47         ` Sean Christopherson
2024-03-09  2:45       ` Manwaring, Derek
2024-03-18 14:11         ` Brendan Jackman
2024-03-08 23:22   ` Sean Christopherson
2024-03-09 11:14     ` Mike Rapoport
2024-05-13 10:31       ` Patrick Roy
2024-05-13 15:39         ` Sean Christopherson
2024-05-13 16:01           ` Gowans, James
2024-05-13 17:09             ` Sean Christopherson
2024-05-13 19:43               ` Gowans, James [this message]
2024-05-13 20:36                 ` Sean Christopherson
2024-05-13 22:01                   ` Manwaring, Derek
2024-03-14 21:45     ` Manwaring, Derek
2024-03-09  5:01   ` Matthew Wilcox
2024-03-08 21:05 Manwaring, Derek
2024-03-11  9:26 ` Fuad Tabba
2024-03-11  9:29   ` Fuad Tabba
