* [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
@ 2017-06-23 15:21 Alexander Graf
2017-07-03 8:03 ` Christoffer Dall
0 siblings, 1 reply; 11+ messages in thread
From: Alexander Graf @ 2017-06-23 15:21 UTC (permalink / raw)
To: kvmarm; +Cc: kvm, linux-kernel
If we want to age an HVA while the VM is getting destroyed, we have a
tiny race window during which we may end up dereferencing an invalid
kvm->arch.pgd value.
CPU0 CPU1
kvm_age_hva()
kvm_mmu_notifier_release()
kvm_arch_flush_shadow_all()
kvm_free_stage2_pgd()
<grab mmu_lock>
stage2_get_pmd()
<wait for mmu_lock>
set kvm->arch.pgd = 0
<free mmu_lock>
<grab mmu_lock>
stage2_get_pud()
<access kvm->arch.pgd>
<use incorrect value>
This patch adds a check for that case.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
virt/kvm/arm/mmu.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index f2d5b6c..227931f 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -861,6 +861,10 @@ static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
pgd_t *pgd;
pud_t *pud;
+ /* Do we clash with kvm_free_stage2_pgd()? */
+ if (!kvm->arch.pgd)
+ return NULL;
+
pgd = kvm->arch.pgd + stage2_pgd_index(addr);
if (WARN_ON(stage2_pgd_none(*pgd))) {
if (!cache)
--
1.8.5.6
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
2017-06-23 15:21 [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm Alexander Graf
@ 2017-07-03 8:03 ` Christoffer Dall
0 siblings, 0 replies; 11+ messages in thread
From: Christoffer Dall @ 2017-07-03 8:03 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvmarm, kvm, linux-kernel
Hi Alex,
On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
> If we want to age an HVA while the VM is getting destroyed, we have a
> tiny race window during which we may end up dereferencing an invalid
> kvm->arch.pgd value.
>
> CPU0 CPU1
>
> kvm_age_hva()
> kvm_mmu_notifier_release()
> kvm_arch_flush_shadow_all()
> kvm_free_stage2_pgd()
> <grab mmu_lock>
> stage2_get_pmd()
> <wait for mmu_lock>
> set kvm->arch.pgd = 0
> <free mmu_lock>
> <grab mmu_lock>
> stage2_get_pud()
> <access kvm->arch.pgd>
> <use incorrect value>
I don't think this sequence, can happen, but I think kvm_age_hva() can
be called with the mmu_lock held and kvm->pgd already being NULL.
Is that possible for the mmu notifiers to be calling clear(_flush)_young
while also calling notifier_release?
If so, the patch below looks good to me.
Thanks,
-Christoffer
>
> This patch adds a check for that case.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
> virt/kvm/arm/mmu.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index f2d5b6c..227931f 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -861,6 +861,10 @@ static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
> pgd_t *pgd;
> pud_t *pud;
>
> + /* Do we clash with kvm_free_stage2_pgd()? */
> + if (!kvm->arch.pgd)
> + return NULL;
> +
> pgd = kvm->arch.pgd + stage2_pgd_index(addr);
> if (WARN_ON(stage2_pgd_none(*pgd))) {
> if (!cache)
> --
> 1.8.5.6
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
@ 2017-07-03 8:03 ` Christoffer Dall
0 siblings, 0 replies; 11+ messages in thread
From: Christoffer Dall @ 2017-07-03 8:03 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvmarm, linux-kernel, kvm
Hi Alex,
On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
> If we want to age an HVA while the VM is getting destroyed, we have a
> tiny race window during which we may end up dereferencing an invalid
> kvm->arch.pgd value.
>
> CPU0 CPU1
>
> kvm_age_hva()
> kvm_mmu_notifier_release()
> kvm_arch_flush_shadow_all()
> kvm_free_stage2_pgd()
> <grab mmu_lock>
> stage2_get_pmd()
> <wait for mmu_lock>
> set kvm->arch.pgd = 0
> <free mmu_lock>
> <grab mmu_lock>
> stage2_get_pud()
> <access kvm->arch.pgd>
> <use incorrect value>
I don't think this sequence, can happen, but I think kvm_age_hva() can
be called with the mmu_lock held and kvm->pgd already being NULL.
Is that possible for the mmu notifiers to be calling clear(_flush)_young
while also calling notifier_release?
If so, the patch below looks good to me.
Thanks,
-Christoffer
>
> This patch adds a check for that case.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
> virt/kvm/arm/mmu.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index f2d5b6c..227931f 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -861,6 +861,10 @@ static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
> pgd_t *pgd;
> pud_t *pud;
>
> + /* Do we clash with kvm_free_stage2_pgd()? */
> + if (!kvm->arch.pgd)
> + return NULL;
> +
> pgd = kvm->arch.pgd + stage2_pgd_index(addr);
> if (WARN_ON(stage2_pgd_none(*pgd))) {
> if (!cache)
> --
> 1.8.5.6
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
2017-07-03 8:03 ` Christoffer Dall
@ 2017-07-03 8:48 ` Alexander Graf
-1 siblings, 0 replies; 11+ messages in thread
From: Alexander Graf @ 2017-07-03 8:48 UTC (permalink / raw)
To: Christoffer Dall; +Cc: kvmarm, kvm, linux-kernel
On 07/03/2017 10:03 AM, Christoffer Dall wrote:
> Hi Alex,
>
> On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
>> If we want to age an HVA while the VM is getting destroyed, we have a
>> tiny race window during which we may end up dereferencing an invalid
>> kvm->arch.pgd value.
>>
>> CPU0 CPU1
>>
>> kvm_age_hva()
>> kvm_mmu_notifier_release()
>> kvm_arch_flush_shadow_all()
>> kvm_free_stage2_pgd()
>> <grab mmu_lock>
>> stage2_get_pmd()
>> <wait for mmu_lock>
>> set kvm->arch.pgd = 0
>> <free mmu_lock>
>> <grab mmu_lock>
>> stage2_get_pud()
>> <access kvm->arch.pgd>
>> <use incorrect value>
> I don't think this sequence, can happen, but I think kvm_age_hva() can
> be called with the mmu_lock held and kvm->pgd already being NULL.
>
> Is that possible for the mmu notifiers to be calling clear(_flush)_young
> while also calling notifier_release?
I *think* the aging happens completely orthogonally to release. But
let's ask Andrea - I'm sure he knows :).
Alex
>
> If so, the patch below looks good to me.
>
> Thanks,
> -Christoffer
>
>
>> This patch adds a check for that case.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> virt/kvm/arm/mmu.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index f2d5b6c..227931f 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -861,6 +861,10 @@ static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
>> pgd_t *pgd;
>> pud_t *pud;
>>
>> + /* Do we clash with kvm_free_stage2_pgd()? */
>> + if (!kvm->arch.pgd)
>> + return NULL;
>> +
>> pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>> if (WARN_ON(stage2_pgd_none(*pgd))) {
>> if (!cache)
>> --
>> 1.8.5.6
>>
>> _______________________________________________
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
@ 2017-07-03 8:48 ` Alexander Graf
0 siblings, 0 replies; 11+ messages in thread
From: Alexander Graf @ 2017-07-03 8:48 UTC (permalink / raw)
To: Christoffer Dall; +Cc: kvmarm, linux-kernel, kvm, Andrea Arcangeli
On 07/03/2017 10:03 AM, Christoffer Dall wrote:
> Hi Alex,
>
> On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
>> If we want to age an HVA while the VM is getting destroyed, we have a
>> tiny race window during which we may end up dereferencing an invalid
>> kvm->arch.pgd value.
>>
>> CPU0 CPU1
>>
>> kvm_age_hva()
>> kvm_mmu_notifier_release()
>> kvm_arch_flush_shadow_all()
>> kvm_free_stage2_pgd()
>> <grab mmu_lock>
>> stage2_get_pmd()
>> <wait for mmu_lock>
>> set kvm->arch.pgd = 0
>> <free mmu_lock>
>> <grab mmu_lock>
>> stage2_get_pud()
>> <access kvm->arch.pgd>
>> <use incorrect value>
> I don't think this sequence, can happen, but I think kvm_age_hva() can
> be called with the mmu_lock held and kvm->pgd already being NULL.
>
> Is that possible for the mmu notifiers to be calling clear(_flush)_young
> while also calling notifier_release?
I *think* the aging happens completely orthogonally to release. But
let's ask Andrea - I'm sure he knows :).
Alex
>
> If so, the patch below looks good to me.
>
> Thanks,
> -Christoffer
>
>
>> This patch adds a check for that case.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> virt/kvm/arm/mmu.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index f2d5b6c..227931f 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -861,6 +861,10 @@ static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
>> pgd_t *pgd;
>> pud_t *pud;
>>
>> + /* Do we clash with kvm_free_stage2_pgd()? */
>> + if (!kvm->arch.pgd)
>> + return NULL;
>> +
>> pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>> if (WARN_ON(stage2_pgd_none(*pgd))) {
>> if (!cache)
>> --
>> 1.8.5.6
>>
>> _______________________________________________
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
2017-07-03 8:48 ` Alexander Graf
@ 2017-07-03 23:41 ` Andrea Arcangeli
-1 siblings, 0 replies; 11+ messages in thread
From: Andrea Arcangeli @ 2017-07-03 23:41 UTC (permalink / raw)
To: Alexander Graf; +Cc: Christoffer Dall, kvmarm, kvm, linux-kernel
Hello,
On Mon, Jul 03, 2017 at 10:48:03AM +0200, Alexander Graf wrote:
> On 07/03/2017 10:03 AM, Christoffer Dall wrote:
> > Hi Alex,
> >
> > On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
> >> If we want to age an HVA while the VM is getting destroyed, we have a
> >> tiny race window during which we may end up dereferencing an invalid
> >> kvm->arch.pgd value.
> >>
> >> CPU0 CPU1
> >>
> >> kvm_age_hva()
> >> kvm_mmu_notifier_release()
> >> kvm_arch_flush_shadow_all()
> >> kvm_free_stage2_pgd()
> >> <grab mmu_lock>
> >> stage2_get_pmd()
> >> <wait for mmu_lock>
> >> set kvm->arch.pgd = 0
> >> <free mmu_lock>
> >> <grab mmu_lock>
> >> stage2_get_pud()
> >> <access kvm->arch.pgd>
> >> <use incorrect value>
> > I don't think this sequence, can happen, but I think kvm_age_hva() can
> > be called with the mmu_lock held and kvm->pgd already being NULL.
> >
> > Is that possible for the mmu notifiers to be calling clear(_flush)_young
> > while also calling notifier_release?
>
> I *think* the aging happens completely orthogonally to release. But
> let's ask Andrea - I'm sure he knows :).
I think the sequence can happen. All mmu notifier methods are flushed
out of CPUs only through synchronize_srcu() which is called as the
last step in __mmu_notifier_release/unregister. Only after _unregister
returns you're sure kvm_age_hva cannot run anymore, until that point
it can still run. Even during exit_mmap->mmu_notifier_release it can
still run if invoked through rmap walks.
So while the ->release method runs, all other mmu notifier methods
could be still invoked concurrently.
mmu notifier methods are only protected by srcu to prevent the mmu
notifier structure to be freed from under them, but there's no
additional locking to serialize them (except for the synchronize_srcu
that happens as the last step of mmu_notifier_release/unregister, well
after they may have called the ->release method).
There's also a comment about it in __mmu_notifier_release:
* runs with mm_users == 0. Other tasks may still invoke mmu notifiers
* in parallel despite there being no task using this mm any more,
* through the vmas outside of the exit_mmap context, such as with
* vmtruncate. This serializes against mmu_notifier_unregister with
And in the mmu_notifier_unregister too:
* calling mmu_notifier_unregister. ->release or any other notifier
* method may be invoked concurrently with mmu_notifier_unregister,
* and only after mmu_notifier_unregister returned we're guaranteed
* that ->release or any other method can't run anymore.
Even ->release could in theory run concurrently against itself if
mmu_notifier_unregister runs concurrently with mmu_notifier_release
but that's purely theoretical possibility.
Thanks,
Andrea
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
@ 2017-07-03 23:41 ` Andrea Arcangeli
0 siblings, 0 replies; 11+ messages in thread
From: Andrea Arcangeli @ 2017-07-03 23:41 UTC (permalink / raw)
To: Alexander Graf; +Cc: Christoffer Dall, kvmarm, linux-kernel, kvm
Hello,
On Mon, Jul 03, 2017 at 10:48:03AM +0200, Alexander Graf wrote:
> On 07/03/2017 10:03 AM, Christoffer Dall wrote:
> > Hi Alex,
> >
> > On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
> >> If we want to age an HVA while the VM is getting destroyed, we have a
> >> tiny race window during which we may end up dereferencing an invalid
> >> kvm->arch.pgd value.
> >>
> >> CPU0 CPU1
> >>
> >> kvm_age_hva()
> >> kvm_mmu_notifier_release()
> >> kvm_arch_flush_shadow_all()
> >> kvm_free_stage2_pgd()
> >> <grab mmu_lock>
> >> stage2_get_pmd()
> >> <wait for mmu_lock>
> >> set kvm->arch.pgd = 0
> >> <free mmu_lock>
> >> <grab mmu_lock>
> >> stage2_get_pud()
> >> <access kvm->arch.pgd>
> >> <use incorrect value>
> > I don't think this sequence, can happen, but I think kvm_age_hva() can
> > be called with the mmu_lock held and kvm->pgd already being NULL.
> >
> > Is that possible for the mmu notifiers to be calling clear(_flush)_young
> > while also calling notifier_release?
>
> I *think* the aging happens completely orthogonally to release. But
> let's ask Andrea - I'm sure he knows :).
I think the sequence can happen. All mmu notifier methods are flushed
out of CPUs only through synchronize_srcu() which is called as the
last step in __mmu_notifier_release/unregister. Only after _unregister
returns you're sure kvm_age_hva cannot run anymore, until that point
it can still run. Even during exit_mmap->mmu_notifier_release it can
still run if invoked through rmap walks.
So while the ->release method runs, all other mmu notifier methods
could be still invoked concurrently.
mmu notifier methods are only protected by srcu to prevent the mmu
notifier structure to be freed from under them, but there's no
additional locking to serialize them (except for the synchronize_srcu
that happens as the last step of mmu_notifier_release/unregister, well
after they may have called the ->release method).
There's also a comment about it in __mmu_notifier_release:
* runs with mm_users == 0. Other tasks may still invoke mmu notifiers
* in parallel despite there being no task using this mm any more,
* through the vmas outside of the exit_mmap context, such as with
* vmtruncate. This serializes against mmu_notifier_unregister with
And in the mmu_notifier_unregister too:
* calling mmu_notifier_unregister. ->release or any other notifier
* method may be invoked concurrently with mmu_notifier_unregister,
* and only after mmu_notifier_unregister returned we're guaranteed
* that ->release or any other method can't run anymore.
Even ->release could in theory run concurrently against itself if
mmu_notifier_unregister runs concurrently with mmu_notifier_release
but that's purely theoretical possibility.
Thanks,
Andrea
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
2017-07-03 23:41 ` Andrea Arcangeli
@ 2017-07-04 8:35 ` Christoffer Dall
-1 siblings, 0 replies; 11+ messages in thread
From: Christoffer Dall @ 2017-07-04 8:35 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel, kvm, kvmarm
On Tue, Jul 04, 2017 at 01:41:49AM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Mon, Jul 03, 2017 at 10:48:03AM +0200, Alexander Graf wrote:
> > On 07/03/2017 10:03 AM, Christoffer Dall wrote:
> > > Hi Alex,
> > >
> > > On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
> > >> If we want to age an HVA while the VM is getting destroyed, we have a
> > >> tiny race window during which we may end up dereferencing an invalid
> > >> kvm->arch.pgd value.
> > >>
> > >> CPU0 CPU1
> > >>
> > >> kvm_age_hva()
> > >> kvm_mmu_notifier_release()
> > >> kvm_arch_flush_shadow_all()
> > >> kvm_free_stage2_pgd()
> > >> <grab mmu_lock>
> > >> stage2_get_pmd()
> > >> <wait for mmu_lock>
> > >> set kvm->arch.pgd = 0
> > >> <free mmu_lock>
> > >> <grab mmu_lock>
> > >> stage2_get_pud()
> > >> <access kvm->arch.pgd>
> > >> <use incorrect value>
> > > I don't think this sequence, can happen, but I think kvm_age_hva() can
> > > be called with the mmu_lock held and kvm->pgd already being NULL.
> > >
> > > Is that possible for the mmu notifiers to be calling clear(_flush)_young
> > > while also calling notifier_release?
> >
> > I *think* the aging happens completely orthogonally to release. But
> > let's ask Andrea - I'm sure he knows :).
>
> I think the sequence can happen. All mmu notifier methods are flushed
> out of CPUs only through synchronize_srcu() which is called as the
> last step in __mmu_notifier_release/unregister. Only after _unregister
> returns you're sure kvm_age_hva cannot run anymore, until that point
> it can still run. Even during exit_mmap->mmu_notifier_release it can
> still run if invoked through rmap walks.
>
> So while the ->release method runs, all other mmu notifier methods
> could be still invoked concurrently.
>
> mmu notifier methods are only protected by srcu to prevent the mmu
> notifier structure to be freed from under them, but there's no
> additional locking to serialize them (except for the synchronize_srcu
> that happens as the last step of mmu_notifier_release/unregister, well
> after they may have called the ->release method).
>
> There's also a comment about it in __mmu_notifier_release:
>
> * runs with mm_users == 0. Other tasks may still invoke mmu notifiers
> * in parallel despite there being no task using this mm any more,
> * through the vmas outside of the exit_mmap context, such as with
> * vmtruncate. This serializes against mmu_notifier_unregister with
>
> And in the mmu_notifier_unregister too:
>
> * calling mmu_notifier_unregister. ->release or any other notifier
> * method may be invoked concurrently with mmu_notifier_unregister,
> * and only after mmu_notifier_unregister returned we're guaranteed
> * that ->release or any other method can't run anymore.
>
> Even ->release could in theory run concurrently against itself if
> mmu_notifier_unregister runs concurrently with mmu_notifier_release
> but that's purely theoretical possibility.
>
Thanks for the clarification.
So Alex, I think you just need to fix the commit message of this patch.
Thanks,
-Christoffer
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
@ 2017-07-04 8:35 ` Christoffer Dall
0 siblings, 0 replies; 11+ messages in thread
From: Christoffer Dall @ 2017-07-04 8:35 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Alexander Graf, kvmarm, linux-kernel, kvm
On Tue, Jul 04, 2017 at 01:41:49AM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Mon, Jul 03, 2017 at 10:48:03AM +0200, Alexander Graf wrote:
> > On 07/03/2017 10:03 AM, Christoffer Dall wrote:
> > > Hi Alex,
> > >
> > > On Fri, Jun 23, 2017 at 05:21:59PM +0200, Alexander Graf wrote:
> > >> If we want to age an HVA while the VM is getting destroyed, we have a
> > >> tiny race window during which we may end up dereferencing an invalid
> > >> kvm->arch.pgd value.
> > >>
> > >> CPU0 CPU1
> > >>
> > >> kvm_age_hva()
> > >> kvm_mmu_notifier_release()
> > >> kvm_arch_flush_shadow_all()
> > >> kvm_free_stage2_pgd()
> > >> <grab mmu_lock>
> > >> stage2_get_pmd()
> > >> <wait for mmu_lock>
> > >> set kvm->arch.pgd = 0
> > >> <free mmu_lock>
> > >> <grab mmu_lock>
> > >> stage2_get_pud()
> > >> <access kvm->arch.pgd>
> > >> <use incorrect value>
> > > I don't think this sequence, can happen, but I think kvm_age_hva() can
> > > be called with the mmu_lock held and kvm->pgd already being NULL.
> > >
> > > Is that possible for the mmu notifiers to be calling clear(_flush)_young
> > > while also calling notifier_release?
> >
> > I *think* the aging happens completely orthogonally to release. But
> > let's ask Andrea - I'm sure he knows :).
>
> I think the sequence can happen. All mmu notifier methods are flushed
> out of CPUs only through synchronize_srcu() which is called as the
> last step in __mmu_notifier_release/unregister. Only after _unregister
> returns you're sure kvm_age_hva cannot run anymore, until that point
> it can still run. Even during exit_mmap->mmu_notifier_release it can
> still run if invoked through rmap walks.
>
> So while the ->release method runs, all other mmu notifier methods
> could be still invoked concurrently.
>
> mmu notifier methods are only protected by srcu to prevent the mmu
> notifier structure to be freed from under them, but there's no
> additional locking to serialize them (except for the synchronize_srcu
> that happens as the last step of mmu_notifier_release/unregister, well
> after they may have called the ->release method).
>
> There's also a comment about it in __mmu_notifier_release:
>
> * runs with mm_users == 0. Other tasks may still invoke mmu notifiers
> * in parallel despite there being no task using this mm any more,
> * through the vmas outside of the exit_mmap context, such as with
> * vmtruncate. This serializes against mmu_notifier_unregister with
>
> And in the mmu_notifier_unregister too:
>
> * calling mmu_notifier_unregister. ->release or any other notifier
> * method may be invoked concurrently with mmu_notifier_unregister,
> * and only after mmu_notifier_unregister returned we're guaranteed
> * that ->release or any other method can't run anymore.
>
> Even ->release could in theory run concurrently against itself if
> mmu_notifier_unregister runs concurrently with mmu_notifier_release
> but that's purely theoretical possibility.
>
Thanks for the clarification.
So Alex, I think you just need to fix the commit message of this patch.
Thanks,
-Christoffer
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
@ 2017-08-10 8:54 Suzuki K Poulose
2017-08-11 21:04 ` Greg KH
0 siblings, 1 reply; 11+ messages in thread
From: Suzuki K Poulose @ 2017-08-10 8:54 UTC (permalink / raw)
To: stable; +Cc: agraf, cdall, marc.zyngier, gregkh, suzuki.poulose
commit 7e5a672289c9754d07e1c3b33649786d3d70f5e4 upstream.
The mmu_notifier_release() callback of KVM triggers cleaning up
the stage2 page table on kvm-arm. However there could be other
notifier callbacks in parallel with the mmu_notifier_release(),
which could cause the call backs ending up in an empty stage2
page table. Make sure we check it for all the notifier callbacks.
Fixes: commit 293f29363 ("kvm-arm: Unmap shadow pagetables properly")
Cc: stable@vger.kernel.org # v4.9-
Reported-by: Alex Graf <agraf@suse.de>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
This is a back port of the patch for stable tree for version v4.9 and
earlier.
---
arch/arm/kvm/mmu.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 332ce3b..710511c 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1664,12 +1664,16 @@ static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, void *data)
int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
{
+ if (!kvm->arch.pgd)
+ return 0;
trace_kvm_age_hva(start, end);
return handle_hva_to_gpa(kvm, start, end, kvm_age_hva_handler, NULL);
}
int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
{
+ if (!kvm->arch.pgd)
+ return 0;
trace_kvm_test_age_hva(hva);
return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL);
}
--
2.7.5
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm
2017-08-10 8:54 Suzuki K Poulose
@ 2017-08-11 21:04 ` Greg KH
0 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2017-08-11 21:04 UTC (permalink / raw)
To: Suzuki K Poulose; +Cc: stable, agraf, cdall, marc.zyngier
On Thu, Aug 10, 2017 at 09:54:01AM +0100, Suzuki K Poulose wrote:
> commit 7e5a672289c9754d07e1c3b33649786d3d70f5e4 upstream.
>
> The mmu_notifier_release() callback of KVM triggers cleaning up
> the stage2 page table on kvm-arm. However there could be other
> notifier callbacks in parallel with the mmu_notifier_release(),
> which could cause the call backs ending up in an empty stage2
> page table. Make sure we check it for all the notifier callbacks.
>
> Fixes: commit 293f29363 ("kvm-arm: Unmap shadow pagetables properly")
> Cc: stable@vger.kernel.org # v4.9-
> Reported-by: Alex Graf <agraf@suse.de>
> Reviewed-by: Christoffer Dall <cdall@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>
> ---
>
> This is a back port of the patch for stable tree for version v4.9 and
> earlier.
now applied, thanks.
greg k-h
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2017-08-11 21:04 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-23 15:21 [PATCH] KVM: arm/arm64: Handle hva aging while destroying the vm Alexander Graf
2017-07-03 8:03 ` Christoffer Dall
2017-07-03 8:03 ` Christoffer Dall
2017-07-03 8:48 ` Alexander Graf
2017-07-03 8:48 ` Alexander Graf
2017-07-03 23:41 ` Andrea Arcangeli
2017-07-03 23:41 ` Andrea Arcangeli
2017-07-04 8:35 ` Christoffer Dall
2017-07-04 8:35 ` Christoffer Dall
-- strict thread matches above, loose matches on Subject: below --
2017-08-10 8:54 Suzuki K Poulose
2017-08-11 21:04 ` Greg KH
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.