From: "Kalra, Ashish" <ashish.kalra@amd.com>
To: Tianyu Lan <ltykernel@gmail.com>,
luto@kernel.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com,
jgross@suse.com, tiala@microsoft.com, kirill@shutemov.name,
jiangshan.ljs@antgroup.com, peterz@infradead.org,
srutherford@google.com, akpm@linux-foundation.org,
anshuman.khandual@arm.com, pawan.kumar.gupta@linux.intel.com,
adrian.hunter@intel.com, daniel.sneddon@linux.intel.com,
alexander.shishkin@linux.intel.com, sandipan.das@amd.com,
ray.huang@amd.com, brijesh.singh@amd.com, michael.roth@amd.com,
thomas.lendacky@amd.com, venu.busireddy@oracle.com,
sterritt@google.com, tony.luck@intel.com,
samitolvanen@google.com, fenghua.yu@intel.com
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
linux-hyperv@vger.kernel.org, linux-arch@vger.kernel.org
Subject: Re: [RFC PATCH 16/17] x86/sev: Add a #HV exception handler
Date: Thu, 10 Nov 2022 14:38:38 -0600 [thread overview]
Message-ID: <aed619fe-3e08-e99d-67c3-3a91da47eefb@amd.com> (raw)
In-Reply-To: <20221109205353.984745-17-ltykernel@gmail.com>
Hello Tianyu,
On 11/9/2022 2:53 PM, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
>
> Add a #HV exception handler that uses IST stack.
>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> arch/x86/entry/entry_64.S | 58 ++++++++++++++++++++++++++
> arch/x86/include/asm/cpu_entry_area.h | 6 +++
> arch/x86/include/asm/idtentry.h | 39 +++++++++++++++++-
> arch/x86/include/asm/page_64_types.h | 1 +
> arch/x86/include/asm/trapnr.h | 1 +
> arch/x86/include/asm/traps.h | 1 +
> arch/x86/kernel/cpu/common.c | 1 +
> arch/x86/kernel/dumpstack_64.c | 9 +++-
> arch/x86/kernel/idt.c | 1 +
> arch/x86/kernel/sev.c | 59 +++++++++++++++++++++++++++
> arch/x86/mm/cpu_entry_area.c | 2 +
> 11 files changed, 175 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 9953d966d124..b2059df43c57 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -560,6 +560,64 @@ SYM_CODE_START(\asmsym)
> .Lfrom_usermode_switch_stack_\@:
> idtentry_body user_\cfunc, has_error_code=1
>
> +_ASM_NOKPROBE(\asmsym)
> +SYM_CODE_END(\asmsym)
> +.endm
> +/*
> + * idtentry_hv - Macro to generate entry stub for #HV
> + * @vector: Vector number
> + * @asmsym: ASM symbol for the entry point
> + * @cfunc: C function to be called
> + *
> + * The macro emits code to set up the kernel context for #HV. The #HV handler
> + * runs on an IST stack and needs to be able to support nested #HV exceptions.
> + *
> + * To make this work the #HV entry code tries its best to pretend it doesn't use
> + * an IST stack by switching to the task stack if coming from user-space (which
> + * includes early SYSCALL entry path) or back to the stack in the IRET frame if
> + * entered from kernel-mode.
> + *
> + * If entered from kernel-mode the return stack is validated first, and if it is
> + * not safe to use (e.g. because it points to the entry stack) the #HV handler
> + * will switch to a fall-back stack (HV2) and call a special handler function.
> + *
> + * The macro is only used for one vector, but it is planned to be extended in
> + * the future for the #HV exception.
> + */
> +.macro idtentry_hv vector asmsym cfunc
> +SYM_CODE_START(\asmsym)
> + UNWIND_HINT_IRET_REGS
> + ASM_CLAC
> + pushq $-1 /* ORIG_RAX: no syscall to restart */
> +
> + testb $3, CS-ORIG_RAX(%rsp)
> + jnz .Lfrom_usermode_switch_stack_\@
> +
> + call paranoid_entry
> +
> + UNWIND_HINT_REGS
> +
> + /*
> + * Switch off the IST stack to make it free for nested exceptions.
> + */
> + movq %rsp, %rdi /* pt_regs pointer */
> + call hv_switch_off_ist
> + movq %rax, %rsp /* Switch to new stack */
> +
> + UNWIND_HINT_REGS
> +
> + /* Update pt_regs */
> + movq ORIG_RAX(%rsp), %rsi /* get error code into 2nd argument*/
> + movq $-1, ORIG_RAX(%rsp) /* no syscall to restart */
> +
> + movq %rsp, %rdi /* pt_regs pointer */
> + call kernel_\cfunc
> +
> + jmp paranoid_exit
> +
> +.Lfrom_usermode_switch_stack_\@:
> + idtentry_body user_\cfunc, has_error_code=1
#HV exception handler cannot/does not inject an error code, so shouldn't
has_error_code == 0?
> +
> _ASM_NOKPROBE(\asmsym)
> SYM_CODE_END(\asmsym)
> .endm
> diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
> index 75efc4c6f076..f173a16cfc59 100644
> --- a/arch/x86/include/asm/cpu_entry_area.h
> +++ b/arch/x86/include/asm/cpu_entry_area.h
> @@ -30,6 +30,10 @@
> char VC_stack[optional_stack_size]; \
> char VC2_stack_guard[guardsize]; \
> char VC2_stack[optional_stack_size]; \
> + char HV_stack_guard[guardsize]; \
> + char HV_stack[optional_stack_size]; \
> + char HV2_stack_guard[guardsize]; \
> + char HV2_stack[optional_stack_size]; \
> char IST_top_guard[guardsize]; \
>
> /* The exception stacks' physical storage. No guard pages required */
> @@ -52,6 +56,8 @@ enum exception_stack_ordering {
> ESTACK_MCE,
> ESTACK_VC,
> ESTACK_VC2,
> + ESTACK_HV,
> + ESTACK_HV2,
> N_EXCEPTION_STACKS
> };
>
> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
> index 72184b0b2219..ed68acd6f723 100644
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -317,6 +317,19 @@ static __always_inline void __##func(struct pt_regs *regs)
> __visible noinstr void kernel_##func(struct pt_regs *regs, unsigned long error_code); \
> __visible noinstr void user_##func(struct pt_regs *regs, unsigned long error_code)
>
> +
> +/**
> + * DECLARE_IDTENTRY_HV - Declare functions for the HV entry point
> + * @vector: Vector number (ignored for C)
> + * @func: Function name of the entry point
> + *
> + * Maps to DECLARE_IDTENTRY_RAW, but declares also the user C handler.
> + */
> +#define DECLARE_IDTENTRY_HV(vector, func) \
> + DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func); \
> + __visible noinstr void kernel_##func(struct pt_regs *regs, unsigned long error_code); \
> + __visible noinstr void user_##func(struct pt_regs *regs, unsigned long error_code)
> +
> /**
> * DEFINE_IDTENTRY_IST - Emit code for IST entry points
> * @func: Function name of the entry point
> @@ -376,6 +389,26 @@ static __always_inline void __##func(struct pt_regs *regs)
> #define DEFINE_IDTENTRY_VC_USER(func) \
> DEFINE_IDTENTRY_RAW_ERRORCODE(user_##func)
>
> +/**
> + * DEFINE_IDTENTRY_HV_KERNEL - Emit code for HV injection handler
> + * when raised from kernel mode
> + * @func: Function name of the entry point
> + *
> + * Maps to DEFINE_IDTENTRY_RAW
> + */
> +#define DEFINE_IDTENTRY_HV_KERNEL(func) \
> + DEFINE_IDTENTRY_RAW_ERRORCODE(kernel_##func)
> +
> +/**
> + * DEFINE_IDTENTRY_HV_USER - Emit code for HV injection handler
> + * when raised from user mode
> + * @func: Function name of the entry point
> + *
> + * Maps to DEFINE_IDTENTRY_RAW
> + */
> +#define DEFINE_IDTENTRY_HV_USER(func) \
> + DEFINE_IDTENTRY_RAW_ERRORCODE(user_##func)
> +
> #else /* CONFIG_X86_64 */
>
> /**
> @@ -465,6 +498,9 @@ __visible noinstr void func(struct pt_regs *regs, \
> # define DECLARE_IDTENTRY_VC(vector, func) \
> idtentry_vc vector asm_##func func
>
> +# define DECLARE_IDTENTRY_HV(vector, func) \
> + idtentry_hv vector asm_##func func
> +
> #else
> # define DECLARE_IDTENTRY_MCE(vector, func) \
> DECLARE_IDTENTRY(vector, func)
> @@ -622,9 +658,10 @@ DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_DF, xenpv_exc_double_fault);
> DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP, exc_control_protection);
> #endif
>
> -/* #VC */
> +/* #VC & #HV */
> #ifdef CONFIG_AMD_MEM_ENCRYPT
> DECLARE_IDTENTRY_VC(X86_TRAP_VC, exc_vmm_communication);
> +DECLARE_IDTENTRY_HV(X86_TRAP_HV, exc_hv_injection);
> #endif
>
> #ifdef CONFIG_XEN_PV
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index e9e2c3ba5923..0bd7dab676c5 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -29,6 +29,7 @@
> #define IST_INDEX_DB 2
> #define IST_INDEX_MCE 3
> #define IST_INDEX_VC 4
> +#define IST_INDEX_HV 5
>
> /*
> * Set __PAGE_OFFSET to the most negative possible address +
> diff --git a/arch/x86/include/asm/trapnr.h b/arch/x86/include/asm/trapnr.h
> index f5d2325aa0b7..c6583631cecb 100644
> --- a/arch/x86/include/asm/trapnr.h
> +++ b/arch/x86/include/asm/trapnr.h
> @@ -26,6 +26,7 @@
> #define X86_TRAP_XF 19 /* SIMD Floating-Point Exception */
> #define X86_TRAP_VE 20 /* Virtualization Exception */
> #define X86_TRAP_CP 21 /* Control Protection Exception */
> +#define X86_TRAP_HV 28 /* HV injected exception in SNP restricted mode */
> #define X86_TRAP_VC 29 /* VMM Communication Exception */
> #define X86_TRAP_IRET 32 /* IRET Exception */
>
> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> index 47ecfff2c83d..6795d3e517d6 100644
> --- a/arch/x86/include/asm/traps.h
> +++ b/arch/x86/include/asm/traps.h
> @@ -16,6 +16,7 @@ asmlinkage __visible notrace
> struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
> void __init trap_init(void);
> asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
> +asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *eregs);
> #endif
>
> extern bool ibt_selftest(void);
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 3e508f239098..87afa3a4c8b1 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2165,6 +2165,7 @@ static inline void tss_setup_ist(struct tss_struct *tss)
> tss->x86_tss.ist[IST_INDEX_MCE] = __this_cpu_ist_top_va(MCE);
> /* Only mapped when SEV-ES is active */
> tss->x86_tss.ist[IST_INDEX_VC] = __this_cpu_ist_top_va(VC);
> + tss->x86_tss.ist[IST_INDEX_HV] = __this_cpu_ist_top_va(HV);
> }
>
> #else /* CONFIG_X86_64 */
> diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
> index 6c5defd6569a..23aa5912c87a 100644
> --- a/arch/x86/kernel/dumpstack_64.c
> +++ b/arch/x86/kernel/dumpstack_64.c
> @@ -26,11 +26,14 @@ static const char * const exception_stack_names[] = {
> [ ESTACK_MCE ] = "#MC",
> [ ESTACK_VC ] = "#VC",
> [ ESTACK_VC2 ] = "#VC2",
> + [ ESTACK_HV ] = "#HV",
> + [ ESTACK_HV2 ] = "#HV2",
> +
> };
>
> const char *stack_type_name(enum stack_type type)
> {
> - BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> + BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
>
> if (type == STACK_TYPE_TASK)
> return "TASK";
> @@ -89,6 +92,8 @@ struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
> EPAGERANGE(MCE),
> EPAGERANGE(VC),
> EPAGERANGE(VC2),
> + EPAGERANGE(HV),
> + EPAGERANGE(HV2),
> };
>
> static __always_inline bool in_exception_stack(unsigned long *stack, struct stack_info *info)
> @@ -98,7 +103,7 @@ static __always_inline bool in_exception_stack(unsigned long *stack, struct stac
> struct pt_regs *regs;
> unsigned int k;
>
> - BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> + BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
>
> begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
> /*
> diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
> index a58c6bc1cd68..48c0a7e1dbcb 100644
> --- a/arch/x86/kernel/idt.c
> +++ b/arch/x86/kernel/idt.c
> @@ -113,6 +113,7 @@ static const __initconst struct idt_data def_idts[] = {
>
> #ifdef CONFIG_AMD_MEM_ENCRYPT
> ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC),
> + ISTG(X86_TRAP_HV, asm_exc_hv_injection, IST_INDEX_HV),
> #endif
>
> SYSG(X86_TRAP_OF, asm_exc_overflow),
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index a428c62330d3..63ddb043d16d 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -2004,6 +2004,65 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
> irqentry_exit_to_user_mode(regs);
> }
>
> +static bool hv_raw_handle_exception(struct pt_regs *regs)
> +{
> + return false;
> +}
> +
> +static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
> +{
> + unsigned long sp = (unsigned long)regs;
> +
> + return (sp >= __this_cpu_ist_bottom_va(HV2) && sp < __this_cpu_ist_top_va(HV2));
> +}
> +
> +DEFINE_IDTENTRY_HV_USER(exc_hv_injection)
> +{
> + irqentry_enter_from_user_mode(regs);
> + instrumentation_begin();
> +
> + if (!hv_raw_handle_exception(regs)) {
> + /*
> + * Do not kill the machine if user-space triggered the
> + * exception. Send SIGBUS instead and let user-space deal
> + * with it.
> + */
> + force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0);
> + }
> +
> + instrumentation_end();
> + irqentry_exit_to_user_mode(regs);
> +}
> +
> +DEFINE_IDTENTRY_HV_KERNEL(exc_hv_injection)
> +{
> + irqentry_state_t irq_state;
> +
> + if (unlikely(on_hv_fallback_stack(regs))) {
> + instrumentation_begin();
> + panic("Can't handle #HV exception from unsupported context\n");
> + instrumentation_end();
> + }
HV fallback stack exists and is used if we couldn't switch to HV stack.
If we have to issue a panic() here, why don't we simply issue the
panic() in hv_switch_off_ist(), when we couldn't switch to HV stack ?
Thanks,
Ashish
> +
> + irq_state = irqentry_nmi_enter(regs);
> + instrumentation_begin();
> +
> + if (!hv_raw_handle_exception(regs)) {
> + pr_emerg("PANIC: Unhandled #HV exception in kernel space\n");
> +
> + /* Show some debug info */
> + show_regs(regs);
> +
> + /* Ask hypervisor to sev_es_terminate */
> + sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
> +
> + panic("Returned from Terminate-Request to Hypervisor\n");
> + }
> +
> + instrumentation_end();
> + irqentry_nmi_exit(regs, irq_state);
> +}
> +
> bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
> {
> unsigned long exit_code = regs->orig_ax;
> diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
> index 6c2f1b76a0b6..608905dc6704 100644
> --- a/arch/x86/mm/cpu_entry_area.c
> +++ b/arch/x86/mm/cpu_entry_area.c
> @@ -115,6 +115,8 @@ static void __init percpu_setup_exception_stacks(unsigned int cpu)
> if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
> cea_map_stack(VC);
> cea_map_stack(VC2);
> + cea_map_stack(HV);
> + cea_map_stack(HV2);
> }
> }
> }
>
next prev parent reply other threads:[~2022-11-10 20:38 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-09 20:53 [RFC PATCH 00/17] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 01/17] x86/boot: Check boot param's cc_blob_address for direct boot mode Tianyu Lan
2022-11-09 23:39 ` Michael Roth
2022-11-10 15:01 ` Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 02/17] x86/sev: Pvalidate memory gab for decompressing kernel Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 03/17] x86/hyperv: Add sev-snp enlightened guest specific config Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 04/17] x86/hyperv: apic change for sev-snp enlightened guest Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 05/17] x86/hyperv: Decrypt hv vp assist page in " Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 06/17] x86/hyperv: Get Virtual Trust Level via hvcall Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 07/17] x86/hyperv: Use vmmcall to implement hvcall in sev-snp enlightened guest Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 08/17] clocksource: hyper-v: decrypt hyperv tsc page " Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 09/17] x86/hyperv: decrypt vmbus pages for " Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 10/17] x86/hyperv: set target vtl in the vmbus init message Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 11/17] drivers: hv: Decrypt percpu hvcall input arg page in sev-snp enlightened guest Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 12/17] Drivers: hv: vmbus: Decrypt vmbus ring buffer Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 13/17] x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 14/17] x86/hyperv: Add smp support for sev-snp guest Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 15/17] x86/hyperv: Add hyperv-specific hadling for VMMCALL under SEV-ES Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 16/17] x86/sev: Add a #HV exception handler Tianyu Lan
2022-11-10 20:38 ` Kalra, Ashish [this message]
2022-11-14 1:28 ` Tianyu Lan
2022-11-09 20:53 ` [RFC PATCH 17/17] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
2022-11-10 21:36 ` Kalra, Ashish
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aed619fe-3e08-e99d-67c3-3a91da47eefb@amd.com \
--to=ashish.kalra@amd.com \
--cc=adrian.hunter@intel.com \
--cc=akpm@linux-foundation.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=anshuman.khandual@arm.com \
--cc=bp@alien8.de \
--cc=brijesh.singh@amd.com \
--cc=daniel.sneddon@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=fenghua.yu@intel.com \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=jiangshan.ljs@antgroup.com \
--cc=kirill@shutemov.name \
--cc=kvm@vger.kernel.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ltykernel@gmail.com \
--cc=luto@kernel.org \
--cc=michael.roth@amd.com \
--cc=mingo@redhat.com \
--cc=pawan.kumar.gupta@linux.intel.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=ray.huang@amd.com \
--cc=samitolvanen@google.com \
--cc=sandipan.das@amd.com \
--cc=seanjc@google.com \
--cc=srutherford@google.com \
--cc=sterritt@google.com \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=tiala@microsoft.com \
--cc=tony.luck@intel.com \
--cc=venu.busireddy@oracle.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox