All of lore.kernel.org
 help / color / mirror / Atom feed
From: Quentin Perret <qperret@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Steven Price <steven.price@arm.com>,
	Chao Peng <chao.p.peng@linux.intel.com>,
	kvm list <kvm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Linux API <linux-api@vger.kernel.org>,
	qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	the arch/x86 maintainers <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Hugh Dickins <hughd@google.com>,
	Jeff Layton <jlayton@kernel.org>,
	"J . Bruce Fields" <bfields@fieldses.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	Vlastimil Babka <vbabka@suse.cz>,
	Vishal Annapurve <vannapurve@google.com>,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>
Subject: Re: [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory
Date: Wed, 6 Apr 2022 10:34:15 +0000	[thread overview]
Message-ID: <Yk1spw4zIxR73VX8@google.com> (raw)
In-Reply-To: <YkyEaYiL0BrDYcZv@google.com>

On Tuesday 05 Apr 2022 at 18:03:21 (+0000), Sean Christopherson wrote:
> On Tue, Apr 05, 2022, Quentin Perret wrote:
> > On Monday 04 Apr 2022 at 15:04:17 (-0700), Andy Lutomirski wrote:
> > > >>  - it can be very useful for protected VMs to do shared=>private
> > > >>    conversions. Think of a VM receiving some data from the host in a
> > > >>    shared buffer, and then it wants to operate on that buffer without
> > > >>    risking to leak confidential informations in a transient state. In
> > > >>    that case the most logical thing to do is to convert the buffer back
> > > >>    to private, do whatever needs to be done on that buffer (decrypting a
> > > >>    frame, ...), and then share it back with the host to consume it;
> > > >
> > > > If performance is a motivation, why would the guest want to do two
> > > > conversions instead of just doing internal memcpy() to/from a private
> > > > page?  I would be quite surprised if multiple exits and TLB shootdowns is
> > > > actually faster, especially at any kind of scale where zapping stage-2
> > > > PTEs will cause lock contention and IPIs.
> > > 
> > > I don't know the numbers or all the details, but this is arm64, which is a
> > > rather better architecture than x86 in this regard.  So maybe it's not so
> > > bad, at least in very simple cases, ignoring all implementation details.
> > > (But see below.)  Also the systems in question tend to have fewer CPUs than
> > > some of the massive x86 systems out there.
> > 
> > Yep. I can try and do some measurements if that's really necessary, but
> > I'm really convinced the cost of the TLBI for the shared->private
> > conversion is going to be significantly smaller than the cost of memcpy
> > the buffer twice in the guest for us.
> 
> It's not just the TLB shootdown, the VM-Exits aren't free.

Ack, but we can at least work on the rest (number of exits, locking, ...).
The cost of the memcpy and the TLBI are really incompressible.

> And barring non-trivial
> improvements to KVM's MMU, e.g. sharding of mmu_lock, modifying the page tables will
> block all other updates and MMU operations.  Taking mmu_lock for read, should arm64
> ever convert to a rwlock, is not an option because KVM needs to block other
> conversions to avoid races.

FWIW the host mmu_lock isn't all that useful for pKVM. The host doesn't
have _any_ control over guest page-tables, and the hypervisor can't
safely rely on the host for locking, so we have hypervisor-level
synchronization.

> Hmm, though batching multiple pages into a single request would mitigate most of
> the overhead.

Yep, there are a few tricks we can play to make this fairly efficient in
the most common cases. And fine-grain locking at EL2 is really high up
on the todo list :-)

Thanks,
Quentin

  reply	other threads:[~2022-04-06 14:23 UTC|newest]

Thread overview: 183+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-10 14:08 [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory Chao Peng
2022-03-10 14:08 ` Chao Peng
2022-03-10 14:08 ` [PATCH v5 01/13] mm/memfd: Introduce MFD_INACCESSIBLE flag Chao Peng
2022-03-10 14:08   ` Chao Peng
2022-04-11 15:10   ` Kirill A. Shutemov
2022-04-11 15:10     ` Kirill A. Shutemov
2022-04-12 13:11     ` Chao Peng
2022-04-12 13:11       ` Chao Peng
2022-04-23  5:43   ` Vishal Annapurve
2022-04-24  8:15     ` Chao Peng
2022-04-24  8:15       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 02/13] mm: Introduce memfile_notifier Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-29 18:45   ` Sean Christopherson
2022-04-08 12:54     ` Chao Peng
2022-04-08 12:54       ` Chao Peng
2022-04-12 14:36   ` Hillf Danton
2022-04-13  6:47     ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 03/13] mm/shmem: Support memfile_notifier Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-10 23:08   ` Dave Chinner
2022-03-10 23:08     ` Dave Chinner
2022-03-11  8:42     ` Chao Peng
2022-03-11  8:42       ` Chao Peng
2022-04-11 15:26   ` Kirill A. Shutemov
2022-04-11 15:26     ` Kirill A. Shutemov
2022-04-12 13:12     ` Chao Peng
2022-04-12 13:12       ` Chao Peng
2022-04-19 22:40   ` Vishal Annapurve
2022-04-20  3:24     ` Chao Peng
2022-04-20  3:24       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 04/13] mm/shmem: Restrict MFD_INACCESSIBLE memory against RLIMIT_MEMLOCK Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-04-07 16:05   ` Sean Christopherson
2022-04-07 17:09     ` Andy Lutomirski
2022-04-07 17:09       ` Andy Lutomirski
2022-04-08 17:56       ` Sean Christopherson
2022-04-08 18:54         ` David Hildenbrand
2022-04-08 18:54           ` David Hildenbrand
2022-04-12 14:36           ` Jason Gunthorpe
2022-04-12 14:36             ` Jason Gunthorpe
2022-04-12 21:27             ` Andy Lutomirski
2022-04-12 21:27               ` Andy Lutomirski
2022-04-13 16:30               ` David Hildenbrand
2022-04-13 16:30                 ` David Hildenbrand
2022-04-13 16:24             ` David Hildenbrand
2022-04-13 16:24               ` David Hildenbrand
2022-04-13 17:52               ` Jason Gunthorpe
2022-04-13 17:52                 ` Jason Gunthorpe
2022-04-25 14:07                 ` David Hildenbrand
2022-04-25 14:07                   ` David Hildenbrand
2022-04-08 13:02     ` Chao Peng
2022-04-08 13:02       ` Chao Peng
2022-04-11 15:34       ` Kirill A. Shutemov
2022-04-11 15:34         ` Kirill A. Shutemov
2022-04-12  5:14         ` Hugh Dickins
2022-04-11 15:32     ` Kirill A. Shutemov
2022-04-11 15:32       ` Kirill A. Shutemov
2022-04-12 13:39       ` Chao Peng
2022-04-12 13:39         ` Chao Peng
2022-04-12 19:28         ` Kirill A. Shutemov
2022-04-12 19:28           ` Kirill A. Shutemov
2022-04-13  9:15           ` Chao Peng
2022-04-13  9:15             ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 05/13] KVM: Extend the memslot to support fd-based private memory Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-28 21:27   ` Sean Christopherson
2022-04-08 13:21     ` Chao Peng
2022-04-08 13:21       ` Chao Peng
2022-03-28 21:56   ` Sean Christopherson
2022-04-08 13:46     ` Chao Peng
2022-04-08 13:46       ` Chao Peng
2022-04-08 17:45       ` Sean Christopherson
2022-03-10 14:09 ` [PATCH v5 06/13] KVM: Use kvm_userspace_memory_region_ext Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-28 22:26   ` Sean Christopherson
2022-04-08 13:58     ` Chao Peng
2022-04-08 13:58       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 07/13] KVM: Add KVM_EXIT_MEMORY_ERROR exit Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-28 22:33   ` Sean Christopherson
2022-04-08 13:59     ` Chao Peng
2022-04-08 13:59       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 08/13] KVM: Use memfile_pfn_ops to obtain pfn for private pages Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-28 23:56   ` Sean Christopherson
2022-04-08 14:07     ` Chao Peng
2022-04-08 14:07       ` Chao Peng
2022-04-28 12:37     ` Chao Peng
2022-04-28 12:37       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 09/13] KVM: Handle page fault for private memory Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-29  1:07   ` Sean Christopherson
2022-04-12 12:10     ` Chao Peng
2022-04-12 12:10       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 10/13] KVM: Register private memslot to memory backing store Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-29 19:01   ` Sean Christopherson
2022-04-12 12:40     ` Chao Peng
2022-04-12 12:40       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 11/13] KVM: Zap existing KVM mappings when pages changed in the private fd Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-29 19:23   ` Sean Christopherson
2022-04-12 12:43     ` Chao Peng
2022-04-12 12:43       ` Chao Peng
2022-04-05 23:45   ` Michael Roth
2022-04-08  3:06     ` Sean Christopherson
2022-04-19 22:43   ` Vishal Annapurve
2022-04-20  3:17     ` Chao Peng
2022-04-20  3:17       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 12/13] KVM: Expose KVM_MEM_PRIVATE Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-29 19:13   ` Sean Christopherson
2022-04-12 12:56     ` Chao Peng
2022-04-12 12:56       ` Chao Peng
2022-03-10 14:09 ` [PATCH v5 13/13] memfd_create.2: Describe MFD_INACCESSIBLE flag Chao Peng
2022-03-10 14:09   ` Chao Peng
2022-03-24 15:51 ` [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory Quentin Perret
2022-03-28 17:13   ` Sean Christopherson
2022-03-28 18:00     ` Quentin Perret
2022-03-28 18:58       ` Sean Christopherson
2022-03-29 17:01         ` Quentin Perret
2022-03-30  8:58           ` Steven Price
2022-03-30  8:58             ` Steven Price
2022-03-30 10:39             ` Quentin Perret
2022-03-30 17:58               ` Sean Christopherson
2022-03-31 16:04                 ` Andy Lutomirski
2022-03-31 16:04                   ` Andy Lutomirski
2022-04-01 14:59                   ` Quentin Perret
2022-04-01 17:14                     ` Sean Christopherson
2022-04-01 18:03                       ` Quentin Perret
2022-04-01 18:24                         ` Sean Christopherson
2022-04-01 19:56                     ` Andy Lutomirski
2022-04-01 19:56                       ` Andy Lutomirski
2022-04-04 15:01                       ` Quentin Perret
2022-04-04 17:06                         ` Sean Christopherson
2022-04-04 22:04                           ` Andy Lutomirski
2022-04-04 22:04                             ` Andy Lutomirski
2022-04-05 10:36                             ` Quentin Perret
2022-04-05 17:51                               ` Andy Lutomirski
2022-04-05 17:51                                 ` Andy Lutomirski
2022-04-05 18:30                                 ` Sean Christopherson
2022-04-06 18:42                                   ` Andy Lutomirski
2022-04-06 18:42                                     ` Andy Lutomirski
2022-04-06 13:05                                 ` Quentin Perret
2022-04-05 18:03                               ` Sean Christopherson
2022-04-06 10:34                                 ` Quentin Perret [this message]
2022-04-22 10:56                                 ` Chao Peng
2022-04-22 10:56                                   ` Chao Peng
2022-04-22 11:06                                   ` Paolo Bonzini
2022-04-22 11:06                                     ` Paolo Bonzini
2022-04-24  8:07                                     ` Chao Peng
2022-04-24  8:07                                       ` Chao Peng
2022-04-24 16:59                                   ` Andy Lutomirski
2022-04-24 16:59                                     ` Andy Lutomirski
2022-04-25 13:40                                     ` Chao Peng
2022-04-25 13:40                                       ` Chao Peng
2022-04-25 14:52                                       ` Andy Lutomirski
2022-04-25 14:52                                         ` Andy Lutomirski
2022-04-25 20:30                                         ` Sean Christopherson
2022-06-10 19:18                                           ` Andy Lutomirski
2022-06-10 19:27                                             ` Sean Christopherson
2022-04-28 12:29                                         ` Chao Peng
2022-04-28 12:29                                           ` Chao Peng
2022-05-03 11:12                                           ` Quentin Perret
2022-05-09 22:30                                   ` Michael Roth
2022-05-09 23:29                                     ` Sean Christopherson
2022-07-21 20:05                                       ` Gupta, Pankaj
2022-07-21 21:19                                         ` Sean Christopherson
2022-07-21 21:36                                           ` Gupta, Pankaj
2022-07-23  3:09                                           ` Andy Lutomirski
2022-07-25  9:19                                             ` Gupta, Pankaj
2022-03-30 16:18             ` Sean Christopherson
2022-03-28 20:16 ` Andy Lutomirski
2022-03-28 20:16   ` Andy Lutomirski
2022-03-28 22:48   ` Nakajima, Jun
2022-03-28 22:48     ` Nakajima, Jun
2022-03-29  0:04     ` Sean Christopherson
2022-04-08 21:35   ` Vishal Annapurve
2022-04-12 13:00     ` Chao Peng
2022-04-12 13:00       ` Chao Peng
2022-04-12 19:58   ` Kirill A. Shutemov
2022-04-12 19:58     ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yk1spw4zIxR73VX8@google.com \
    --to=qperret@google.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bfields@fieldses.org \
    --cc=bp@alien8.de \
    --cc=chao.p.peng@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=jlayton@kernel.org \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=jun.nakajima@intel.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rppt@kernel.org \
    --cc=seanjc@google.com \
    --cc=steven.price@arm.com \
    --cc=tglx@linutronix.de \
    --cc=vannapurve@google.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.