All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoffer Dall <cdall@linaro.org>
To: Dongjiu Geng <gengdongjiu@huawei.com>
Cc: wuquanming@huawei.com, marc.zyngier@arm.com,
	catalin.marinas@arm.com, will.deacon@arm.com,
	linuxarm@huawei.com, linux-arm-kernel@lists.infradead.org,
	huangshaoyu@huawei.com, kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH] KVM: arm64: handle the translation table walk RAS error
Date: Wed, 29 Nov 2017 14:22:12 +0100	[thread overview]
Message-ID: <20171129132212.GA10563@lvm> (raw)
In-Reply-To: <1511988524-30240-1-git-send-email-gengdongjiu@huawei.com>

On Thu, Nov 30, 2017 at 04:48:44AM +0800, Dongjiu Geng wrote:
> For the RAS Synchronous External Abort, there are two types.
> One is memory access, it will be handled by host APEI driver.
> Another is translation table walk, in essence, it is hardware
> memory error on stage1 or stage2 page table.
> 
> For the guest stage1 translation table error, if host APEI
> driver handles it, APEI driver will unmap this page for the
> stage1 page table, then switch to guest, guest reused this
> page table and generate stage2 data abort, KVM deliver SIGBUS
> to user space. User space inject this error to guest, when
> guest handle this abort, it may also use this stage1 page
> table, but it already unmap by host APEI driver, then
> generate stage2 data abort again, so this will lead to dead
> loop.

Why does it lead to a loop? If the host has marked a page as unusable,
shouldn't the guest stage 1 page table be backed by a different page
when the fault happens on stage 2?

> 
> For the guest stage2 translation table error, if host APEI
> driver handles it, it will do nothing.
> 
> So for above reasons, we directly inject this Synchronous
> External Abort to guest and let guest handle it, for example,
> kill the guest application or panic guest OS.

I don't see why we need to distinguish between what caused a memory
access error, a direct access or a page table walk, in terms of how the
host/guest interaction works here.

What is the fundamental difference?

Thanks,
-Christoffer

> 
> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> ---
>  arch/arm64/include/asm/kvm_arm.h |  2 ++
>  virt/kvm/arm/mmu.c               | 14 ++++++++++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 1188272..b8cb67a 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -217,6 +217,8 @@
>  #define FSC_SECC_TTW2	(0x1e)
>  #define FSC_SECC_TTW3	(0x1f)
>  
> +#define FSC_SEA_TTW    FSC_SEA_TTW0
> +
>  /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
>  #define HPFAR_MASK	(~UL(0xf))
>  
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index b36945d..6eab82d 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1484,8 +1484,18 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	/* Synchronous External Abort? */
>  	if (kvm_vcpu_dabt_isextabt(vcpu)) {
>  		/*
> -		 * For RAS the host kernel may handle this abort.
> -		 * There is no need to pass the error into the guest.
> +		 * For RAS translation table walk abort, pass the error
> +		 * into the guest.
> +		 */
> +		if (fault_status == FSC_SEA_TTW) {
> +			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +			return 1;
> +		}
> +
> +		/*
> +		 * For RAS normal memory access abort, the host kernel may
> +		 * handle this abort. There is no need to pass the error into
> +		 * the guest.
>  		 */
>  		if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
>  			return 1;
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: cdall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] KVM: arm64: handle the translation table walk RAS error
Date: Wed, 29 Nov 2017 14:22:12 +0100	[thread overview]
Message-ID: <20171129132212.GA10563@lvm> (raw)
In-Reply-To: <1511988524-30240-1-git-send-email-gengdongjiu@huawei.com>

On Thu, Nov 30, 2017 at 04:48:44AM +0800, Dongjiu Geng wrote:
> For the RAS Synchronous External Abort, there are two types.
> One is memory access, it will be handled by host APEI driver.
> Another is translation table walk, in essence, it is hardware
> memory error on stage1 or stage2 page table.
> 
> For the guest stage1 translation table error, if host APEI
> driver handles it, APEI driver will unmap this page for the
> stage1 page table, then switch to guest, guest reused this
> page table and generate stage2 data abort, KVM deliver SIGBUS
> to user space. User space inject this error to guest, when
> guest handle this abort, it may also use this stage1 page
> table, but it already unmap by host APEI driver, then
> generate stage2 data abort again, so this will lead to dead
> loop.

Why does it lead to a loop? If the host has marked a page as unusable,
shouldn't the guest stage 1 page table be backed by a different page
when the fault happens on stage 2?

> 
> For the guest stage2 translation table error, if host APEI
> driver handles it, it will do nothing.
> 
> So for above reasons, we directly inject this Synchronous
> External Abort to guest and let guest handle it, for example,
> kill the guest application or panic guest OS.

I don't see why we need to distinguish between what caused a memory
access error, a direct access or a page table walk, in terms of how the
host/guest interaction works here.

What is the fundamental difference?

Thanks,
-Christoffer

> 
> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> ---
>  arch/arm64/include/asm/kvm_arm.h |  2 ++
>  virt/kvm/arm/mmu.c               | 14 ++++++++++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 1188272..b8cb67a 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -217,6 +217,8 @@
>  #define FSC_SECC_TTW2	(0x1e)
>  #define FSC_SECC_TTW3	(0x1f)
>  
> +#define FSC_SEA_TTW    FSC_SEA_TTW0
> +
>  /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
>  #define HPFAR_MASK	(~UL(0xf))
>  
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index b36945d..6eab82d 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1484,8 +1484,18 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	/* Synchronous External Abort? */
>  	if (kvm_vcpu_dabt_isextabt(vcpu)) {
>  		/*
> -		 * For RAS the host kernel may handle this abort.
> -		 * There is no need to pass the error into the guest.
> +		 * For RAS translation table walk abort, pass the error
> +		 * into the guest.
> +		 */
> +		if (fault_status == FSC_SEA_TTW) {
> +			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +			return 1;
> +		}
> +
> +		/*
> +		 * For RAS normal memory access abort, the host kernel may
> +		 * handle this abort. There is no need to pass the error into
> +		 * the guest.
>  		 */
>  		if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
>  			return 1;
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2017-11-29 13:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-29 20:48 [PATCH] KVM: arm64: handle the translation table walk RAS error Dongjiu Geng
2017-11-29 20:48 ` Dongjiu Geng
2017-11-29 13:22 ` Christoffer Dall [this message]
2017-11-29 13:22   ` Christoffer Dall
2017-11-30 11:32   ` gengdongjiu
2017-11-30 11:32     ` gengdongjiu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171129132212.GA10563@lvm \
    --to=cdall@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=gengdongjiu@huawei.com \
    --cc=huangshaoyu@huawei.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linuxarm@huawei.com \
    --cc=marc.zyngier@arm.com \
    --cc=will.deacon@arm.com \
    --cc=wuquanming@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.