From: Oliver Upton <oupton@google.com>
To: Quentin Perret <qperret@google.com>
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	Peter Shier <pshier@google.com>, Ben Gardon <bgardon@google.com>,
	David Matlack <dmatlack@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH 09/17] KVM: arm64: Tear down unlinked page tables in parallel walk
Date: Wed, 4 May 2022 06:03:56 +0000
Message-ID: <YnIXTMDpucMxnpFg@google.com>
In-Reply-To: <YnE5dfaC3HpXli26@google.com>

On Tue, May 03, 2022 at 02:17:25PM +0000, Quentin Perret wrote:
> On Friday 22 Apr 2022 at 20:41:47 (+0000), Oliver Upton wrote:
> > On Fri, Apr 22, 2022 at 04:00:45PM +0000, Quentin Perret wrote:
> > > On Thursday 21 Apr 2022 at 16:40:56 (+0000), Oliver Upton wrote:
> > > > The other option would be to not touch the subtree at all until the rcu
> > > > callback, as at that point software will not tweak the tables any more.
> > > > No need for atomics/spinning and can just do a boring traversal.
> > > 
> > > Right that is sort of what I had in mind. Note that I'm still trying to
> > > make up my mind about the overall approach -- I can see how RCU protection
> > > provides a rather elegant solution to this problem, but this makes the
> > > whole thing inaccessible to e.g. pKVM where RCU is a non-starter.
> > 
> > Heh, figuring out how to do this for pKVM seemed hard hence my lazy
> > attempt :)
> > 
> > > A
> > > possible alternative that comes to mind would be to have all walkers
> > > take references on the pages as they walk down, and release them on
> > > their way back, but I'm still not sure how to make this race-safe. I'll
> > > have a think ...
> > 
> > Does pKVM ever collapse tables into blocks? That is the only reason any
> > of this mess ever gets roped in. If not I think it is possible to get
> > away with a rwlock with unmap on the write side and everything else on
> > the read side, right?
> > 
> > As far as regular KVM goes we get in this business when disabling dirty
> > logging on a memslot. Guest faults will lazily collapse the tables back
> > into blocks. An equally valid implementation would be just to unmap the
> > whole memslot and have the guest build out the tables again, which could
> > work with the aforementioned rwlock.
> 
> Apologies for the delay on this one, I was away for a while.
> 
> Yup, that all makes sense. FWIW the pKVM use-case I have in mind is
> slightly different. Specifically, in the pKVM world the hypervisor
> maintains a stage-2 for the host, that is all identity mapped. So we use
> nice big block mappings as much as we can. But when a protected guest
> starts, the hypervisor needs to break down the host stage-2 blocks to
> unmap the 4K guest pages from the host (which is where the protection
> comes from in pKVM). And when the guest is torn down, the host can
> reclaim its pages, hence putting us in a position to coalesce its
> stage-2 into nice big blocks again. Note that none of this coalescing
> is currently implemented even in our pKVM prototype, so it's a bit
> unfair to ask you to deal with this stuff now, but clearly it'd be cool
> if there was a way we could make these things coexist and even ideally
> share some code...

Oh, it certainly isn't unfair to make sure we've got good constructs
landing for everyone to use :-)

I'll need to chew on this a bit more to have a better answer. The reason
I hesitate to do the giant unmap for non-pKVM is that I believe we'd be
leaving some performance on the table for newer implementations of the
architecture. Having said that, avoiding a tlbi vmalls12e1is on every
collapsed table is highly desirable.

FEAT_BBM=2 semantics in the MMU is also on the todo list. In this case
we'd do a direct table->block transformation on the PTE and elide that
nasty tlbi.
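
To make the difference between the two sequences concrete, here is a rough
userspace sketch, not code from this series. It assumes a toy PTE held in a
C11 atomic; the descriptor encodings, the "locked" marker and every toy_*
name are hypothetical, and toy_tlbi() merely counts where the invalidate
would be issued. It models only the ordering of the PTE updates, not what
the architecture requires around them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy descriptor encodings (hypothetical, not the real arm64 layout). */
#define TOY_PTE_TYPE_BLOCK	0x1ULL
#define TOY_PTE_TYPE_TABLE	0x3ULL
#define TOY_PTE_LOCKED		0x4ULL	/* invalid to "hardware", owned by one walker */
#define TOY_PTE_ADDR_MASK	(~0xfffULL)

typedef _Atomic(uint64_t) toy_pte_t;

/* Stand-in for the TLB invalidate; it only counts how often it is issued. */
static atomic_uint toy_tlbi_count;

static void toy_tlbi(void)
{
	atomic_fetch_add(&toy_tlbi_count, 1);
}

static uint64_t toy_mk_pte(uint64_t pa, uint64_t type)
{
	assert((pa & ~TOY_PTE_ADDR_MASK) == 0);	/* pa must be 4K aligned */
	return pa | type;
}

/*
 * Break-before-make: swap the table entry for an invalid "locked" value,
 * invalidate, then install the block. Only the walker that wins the
 * cmpxchg proceeds; the others see false and are expected to retry.
 */
static bool toy_collapse_bbm(toy_pte_t *ptep, uint64_t old_table,
			     uint64_t new_block)
{
	uint64_t expected = old_table;

	if (!atomic_compare_exchange_strong(ptep, &expected, TOY_PTE_LOCKED))
		return false;

	toy_tlbi();	/* the invalidate between break and make */

	atomic_store_explicit(ptep, new_block, memory_order_release);
	return true;
}

/*
 * Direct table->block replacement, as a FEAT_BBM level 2 style update
 * would allow: a single cmpxchg and no intermediate invalid entry.
 */
static bool toy_collapse_direct(toy_pte_t *ptep, uint64_t old_table,
				uint64_t new_block)
{
	uint64_t expected = old_table;

	return atomic_compare_exchange_strong(ptep, &expected, new_block);
}
```

In this model, each successful toy_collapse_bbm() call bumps toy_tlbi_count
once, while toy_collapse_direct() leaves it untouched. A caller that loses
the race gets false from either helper and leaves the PTE as it found it.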

Unless there are objections, I'll probably hobble this series along as-is
for the time being. My hope is that other table walkers can join in on
the parallel party later down the road.

Thanks for getting back to me.

--
Best,
Oliver

