From: Ingo Molnar <mingo@kernel.org>
To: Alexey Kardashevskiy <aik@amd.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [PATCH kernel v2] x86/compressed/64: reduce #VC nesting for intercepted CPUID for SEV-SNP guest
Date: Wed, 27 Sep 2023 11:48:33 +0200
Message-ID: <ZRP6cd6rEymUaiL+@gmail.com>
In-Reply-To: <20230926040526.957240-1-aik@amd.com>


* Alexey Kardashevskiy <aik@amd.com> wrote:

>  arch/x86/include/asm/svm.h   | 14 ++++++++++++++
>  arch/x86/kernel/sev-shared.c |  5 +++--

Doesn't build on x86-64 allmodconfig:

  arch/x86/kernel/sev-shared.c:442:75: error: ‘ghcb’ undeclared (first use in this function)
  arch/x86/kernel/sev-shared.c:442:81: error: ‘ctxt’ undeclared (first use in this function)

Not sure how this was supposed to work - there's no 'ghcb' passed in to
snp_cpuid_postprocess(). Does this patch have a dependency perhaps, that
I missed?

For the next version please also pick up the edited changelog I've done,
see below.

Thanks,

	Ingo

====================>
From: Alexey Kardashevskiy <aik@amd.com>
Subject: [PATCH] x86/sev: Reduce #VC nesting for intercepted CPUID for SEV-SNP guest, to fix nesting crash

For certain intercepts an SNP guest uses the GHCB protocol to talk to
the hypervisor from the #VC handler. The protocol requires a shared page, so
there is one per vCPU. If an NMI arrives in the middle of a #VC, or the NMI
handler triggers a #VC, there is another "backup" GHCB page which stores
the content of the first one while SVM_VMGEXIT_NMI_COMPLETE is sent.
The vc_raw_handle_exception() handler manages the main and backup GHCB pages
via __sev_get_ghcb()/__sev_put_ghcb().
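
The main/backup scheme described above can be sketched as a small userspace
model. This is only an illustration, not the kernel's implementation: every
toy_* name and the page size are hypothetical, and where the real code panics
the model simply returns -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_GHCB_SIZE 16	/* hypothetical; a real GHCB is a full page */

/* Hypothetical per-vCPU state: one shared page plus one backup copy. */
struct toy_cpu {
	unsigned char ghcb[TOY_GHCB_SIZE];
	unsigned char backup[TOY_GHCB_SIZE];
	bool ghcb_active;
	bool backup_active;
};

/* Records whether this acquisition saved the main page into the backup. */
struct toy_state {
	bool used_backup;
};

/* Returns 0 on success, -1 when both pages are already in use. */
static int toy_get_ghcb(struct toy_cpu *cpu, struct toy_state *st)
{
	st->used_backup = false;

	if (cpu->ghcb_active) {
		if (cpu->backup_active)
			return -1;	/* the situation behind the panic */

		/* Nested user: preserve the interrupted user's content. */
		memcpy(cpu->backup, cpu->ghcb, TOY_GHCB_SIZE);
		cpu->backup_active = true;
		st->used_backup = true;
	} else {
		cpu->ghcb_active = true;
	}

	return 0;
}

/* Releases in reverse order of acquisition. */
static void toy_put_ghcb(struct toy_cpu *cpu, const struct toy_state *st)
{
	assert(cpu->ghcb_active);

	if (st->used_backup) {
		/* Restore the interrupted user's content. */
		memcpy(cpu->ghcb, cpu->backup, TOY_GHCB_SIZE);
		cpu->backup_active = false;
	} else {
		cpu->ghcb_active = false;
	}
}
```

In this model two nested acquisitions succeed and a third one fails until one
of the earlier users releases its page.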

This works fine for #VC and occasional NMIs. It does not work if the #VC
handler itself causes an intercept and thus another #VC: if an NMI arrives
during the second #VC, there are no more pages left for
SVM_VMGEXIT_NMI_COMPLETE. The problematic place is the #VC CPUID handler.
Running perf in the SNP guest crashes with:

  Kernel panic - not syncing: Unable to handle #VC exception! GHCB and Backup GHCB are already in use

  vc_raw_handle_exception #1: exit_code 72 (CPUID) eax d ecx 1

We lock the main GHCB and, while it is locked, we get to
snp_cpuid_postprocess(), which executes "rdmsr" of MSR_IA32_XSS (0xda0),
which triggers:

  vc_raw_handle_exception #2: exit_code 7c (MSR) ecx da0

Here we lock the backup GHCB.

Then a PMC NMI arrives, which cannot complete because there is no GHCB page
left to use:

  CPU: 5 PID: 566 Comm: touch Not tainted 6.5.0-rc2-aik-ad9c-g7413e71d3dcf-dirty #27
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown unknown
  Call Trace:
   <NMI>
   dump_stack_lvl+0x44/0x60
   panic+0x222/0x310
   ____sev_get_ghcb+0x21e/0x220
   __sev_es_nmi_complete+0x28/0xf0
   exc_nmi+0x1ac/0x1c0
   end_repeat_nmi+0x16/0x67
  ...
   </NMI>
   <TASK>
   vc_raw_handle_exception+0x9e/0x2c0
   kernel_exc_vmm_communication+0x4d/0xa0
   asm_exc_vmm_communication+0x31/0x60
  RIP: 0010:snp_cpuid+0x2ad/0x420

Add a helper similar to rdmsr_safe() for making a direct hypercall in the SEV-ES
environment. Use the new helper instead of the raw "rdmsr" to avoid the extra #VC event.
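
A rough userspace sketch of why this helps, under stated assumptions: all
toy_* names are hypothetical, the hypervisor's answer is a stand-in, and the
direct variant is assumed to reuse the GHCB page its caller already holds. It
only counts how many GHCB pages are held at the deepest point of CPUID
handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_GHCB_SLOTS	2		/* main page + backup page */
#define TOY_MSR_XSS	0xda0u

struct toy_vcpu {
	int depth;	/* GHCB pages currently held */
	int peak;	/* deepest nesting seen so far */
};

/* Models #VC entry taking a GHCB page; fails when none is left. */
static int toy_enter(struct toy_vcpu *v)
{
	if (v->depth >= TOY_GHCB_SLOTS)
		return -EBUSY;
	v->depth++;
	if (v->depth > v->peak)
		v->peak = v->depth;
	return 0;
}

static void toy_exit(struct toy_vcpu *v)
{
	assert(v->depth > 0);
	v->depth--;
}

/* Stand-in for the hypervisor's reply to an MSR read request. */
static int toy_hypercall_msr_read(uint32_t msr, uint64_t *val)
{
	if (msr != TOY_MSR_XSS)
		return -EIO;
	*val = 0;
	return 0;
}

/* Raw "rdmsr": intercepted, so it takes a nested #VC and a second page. */
static int toy_rdmsr_trapping(struct toy_vcpu *v, uint32_t msr, uint64_t *val)
{
	int ret = toy_enter(v);

	if (ret)
		return ret;
	ret = toy_hypercall_msr_read(msr, val);
	toy_exit(v);
	return ret;
}

/* rdmsr_safe()-style: reports an error code, raises no extra exception. */
static int toy_rdmsr_direct(struct toy_vcpu *v, uint32_t msr, uint64_t *val)
{
	(void)v;	/* reuses the page the caller already holds */
	return toy_hypercall_msr_read(msr, val);
}

typedef int (*toy_rdmsr_fn)(struct toy_vcpu *, uint32_t, uint64_t *);

/* Models the #VC CPUID handler reading XSS during post-processing. */
static int toy_handle_cpuid(struct toy_vcpu *v, toy_rdmsr_fn rdmsr)
{
	uint64_t xss;
	int ret = toy_enter(v);

	if (ret)
		return ret;
	ret = rdmsr(v, TOY_MSR_XSS, &xss);
	toy_exit(v);
	return ret;
}
```

With toy_rdmsr_trapping the handler peaks at two held pages, so an NMI at that
point finds none left; with toy_rdmsr_direct it peaks at one, which leaves the
backup page available for the NMI.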

Fixes: ee0bfa08a345 ("x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers")
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230926040526.957240-1-aik@amd.com
