* [PATCH v2] x86/sev: Update ghcb_version only once
@ 2023-11-29 10:40 Ashwin Dayanand Kamat
2023-11-29 10:42 ` kernel test robot
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Ashwin Dayanand Kamat @ 2023-11-29 10:40 UTC
To: linux-kernel, thomas.lendacky, bp, brijesh.singh
Cc: kashwindayan, tglx, mingo, dave.hansen, x86, hpa, jroedel, stable,
ganb, tkundu, vsirnapalli, akaher, amakhalov, namit
From: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
A kernel crash caused by a page fault was observed while running
the cpuhotplug LTP testcases on SEV-ES enabled systems. The crash
occurred during hotplug, after a CPU was offlined and the process
was migrated to a different CPU. setup_ghcb() is then called again and
tries to update ghcb_version in sev_es_negotiate_protocol(). ghcb_version
is meant to be a read-only variable that is initialised during boot,
so the write results in a page fault.
From the logs:
[ 256.447466] BUG: unable to handle page fault for address: ffffffffba556e70
[ 256.447476] #PF: supervisor write access in kernel mode
[ 256.447478] #PF: error_code(0x0003) - permissions violation
[ 256.447479] PGD 8000667c0f067 P4D 8000667c0f067 PUD 8000667c10063 PMD 80080006674001e1
[ 256.447483] Oops: 0003 [#1] PREEMPT SMP NOPTI
[ 256.447487] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.1.45-8.ph5 #1-photon
[ ...]
[ 256.447511] CR2: ffffffffba556e70 CR3: 0008000667c0a004 CR4: 0000000000770ee0
[ 256.447514] PKRU: 55555554
[ 256.447515] Call Trace:
[ 256.447516] <TASK>
[ 256.447519] ? __die_body.cold+0x1a/0x1f
[ 256.447526] ? __die+0x2a/0x35
[ 256.447528] ? page_fault_oops+0x10c/0x270
[ 256.447531] ? setup_ghcb+0x71/0x100
[ 256.447533] ? __x86_return_thunk+0x5/0x6
[ 256.447537] ? search_exception_tables+0x60/0x70
[ 256.447541] ? __x86_return_thunk+0x5/0x6
[ 256.447543] ? fixup_exception+0x27/0x320
[ 256.447546] ? kernelmode_fixup_or_oops+0xa2/0x120
[ 256.447549] ? __bad_area_nosemaphore+0x16a/0x1b0
[ 256.447551] ? kernel_exc_vmm_communication+0x60/0xb0
[ 256.447556] ? bad_area_nosemaphore+0x16/0x20
[ 256.447558] ? do_kern_addr_fault+0x7a/0x90
[ 256.447560] ? exc_page_fault+0xbd/0x160
[ 256.447563] ? asm_exc_page_fault+0x27/0x30
[ 256.447570] ? setup_ghcb+0x71/0x100
[ 256.447572] ? setup_ghcb+0xe/0x100
[ 256.447574] cpu_init_exception_handling+0x1b9/0x1f0
The fix is to call sev_es_negotiate_protocol() only in the BSP boot phase;
it only needs to be done once.
Fixes: 95d33bfaa3e1 ("x86/sev: Register GHCB memory when SEV-SNP is active")
Co-developed-by: Bo Gan <bo.gan@broadcom.com>
Signed-off-by: Bo Gan <bo.gan@broadcom.com>
Signed-off-by: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
---
v2:
Changes made in v2, per the review comments from Tom Lendacky:
- Moved sev_es_negotiate_protocol() after the initial_vc_handler if-check in setup_ghcb()
- Added the co-developer's Signed-off-by
---
arch/x86/kernel/sev.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 70472eebe719..c67285824e82 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1234,10 +1234,6 @@ void setup_ghcb(void)
 	if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
 		return;

-	/* First make sure the hypervisor talks a supported protocol. */
-	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
 	/*
 	 * Check whether the runtime #VC exception handler is active. It uses
 	 * the per-CPU GHCB page which is set up by sev_es_init_vc_handling().
@@ -1254,6 +1250,13 @@ void setup_ghcb(void)
 		return;
 	}

+	/*
+	 * Make sure the hypervisor talks a supported protocol.
+	 * This gets called only in the BSP boot phase.
+	 */
+	if (!sev_es_negotiate_protocol())
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
 	/*
 	 * Clear the boot_ghcb. The first exception comes in before the bss
 	 * section is cleared.
--
2.39.0
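The reordering in the patch above can be modelled in user space. The following is only a sketch under stated assumptions: every name in it (model_setup_ghcb, model_negotiate_protocol, runtime_vc_handler_active and so on) is hypothetical and does not exist in arch/x86/kernel/sev.c, and the "sealed" flag merely stands in for the variable becoming read-only after boot. It shows that once the negotiation sits after the runtime-handler early return, later CPU bring-ups never reach the write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these exist in sev.c. */
static bool runtime_vc_handler_active; /* models the initial_vc_handler check */
static bool model_version_sealed;      /* models "read-only after boot" */
static int model_ghcb_version;         /* models the write-once ghcb_version */
static int negotiate_calls;            /* how often negotiation ran */

/* Models sev_es_negotiate_protocol(): it writes the version variable. */
static bool model_negotiate_protocol(void)
{
	/* A write after sealing is the analogue of the reported page fault. */
	assert(!model_version_sealed);
	model_ghcb_version = 2; /* arbitrary illustrative value */
	negotiate_calls++;
	return true;
}

/* Models the patched setup_ghcb(): the runtime path returns first. */
static void model_setup_ghcb(void)
{
	if (runtime_vc_handler_active) {
		/* CPU brought up later: only per-CPU setup would happen here. */
		return;
	}

	/* BSP boot path: the only place where negotiation is reached. */
	if (!model_negotiate_protocol())
		return;
}

/* Boot the BSP, seal the variable, then bring up `cpus` further CPUs. */
static int run_hotplug_scenario(int cpus)
{
	runtime_vc_handler_active = false;
	model_version_sealed = false;
	negotiate_calls = 0;

	model_setup_ghcb();               /* BSP early boot */
	runtime_vc_handler_active = true; /* runtime #VC handler now active */
	model_version_sealed = true;      /* variable now treated as read-only */

	for (int i = 0; i < cpus; i++)
		model_setup_ghcb();       /* CPU coming online after hotplug */

	assert(model_ghcb_version == 2);  /* value from boot is still in place */
	return negotiate_calls;
}
```

Calling run_hotplug_scenario() with any number of additional CPUs returns 1 in this model. Moving the negotiation back above the early return would trip the assertion in model_negotiate_protocol() on the first hotplugged CPU, which mirrors the crash described in the changelog.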
* Re: [PATCH v2] x86/sev: Update ghcb_version only once
2023-11-29 10:40 [PATCH v2] x86/sev: Update ghcb_version only once Ashwin Dayanand Kamat
@ 2023-11-29 10:42 ` kernel test robot
2023-11-30 9:30 ` [PATCH] x86/sev: Fix kernel crash due to late update to read-only ghcb_version Ingo Molnar
2023-11-30 9:39 ` [tip: x86/urgent] " tip-bot2 for Ashwin Dayanand Kamat
2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-11-29 10:42 UTC
To: Ashwin Dayanand Kamat; +Cc: stable, oe-kbuild-all
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree.
Subject: [PATCH v2] x86/sev: Update ghcb_version only once
Link: https://lore.kernel.org/stable/1701254429-18250-1-git-send-email-kashwindayan%40vmware.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* [PATCH] x86/sev: Fix kernel crash due to late update to read-only ghcb_version
2023-11-29 10:40 [PATCH v2] x86/sev: Update ghcb_version only once Ashwin Dayanand Kamat
2023-11-29 10:42 ` kernel test robot
@ 2023-11-30 9:30 ` Ingo Molnar
2023-11-30 16:07 ` Tom Lendacky
2023-11-30 9:39 ` [tip: x86/urgent] " tip-bot2 for Ashwin Dayanand Kamat
2 siblings, 1 reply; 5+ messages in thread
From: Ingo Molnar @ 2023-11-30 9:30 UTC
To: Ashwin Dayanand Kamat
Cc: linux-kernel, thomas.lendacky, bp, brijesh.singh, tglx, mingo,
dave.hansen, x86, hpa, jroedel, stable, ganb, tkundu, vsirnapalli,
akaher, amakhalov, namit
* Ashwin Dayanand Kamat <kashwindayan@vmware.com> wrote:
> From: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
>
> kernel crash was observed because of page fault, while running
> cpuhotplug ltp testcases on SEV-ES enabled systems. The crash was
> observed during hotplug after the CPU was offlined and the process
> was migrated to different cpu. setup_ghcb() is called again which
> tries to update ghcb_version in sev_es_negotiate_protocol(). Ideally this
> is a read_only variable which is initialised during booting.
> This results in pagefault.
Applied to tip:x86/urgent, thanks.
Tom: I've added your Suggested-by and Acked-by, which appeared to be the
case given the v1 discussion, let me know if that's not accurate.
I've also tidied up the changelog - final version attached below.
Thanks,
Ingo
============>
From: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
Date: Wed, 29 Nov 2023 16:10:29 +0530
Subject: [PATCH] x86/sev: Fix kernel crash due to late update to read-only ghcb_version
A write-access violation page fault kernel crash was observed while running
cpuhotplug LTP testcases on SEV-ES enabled systems. The crash was
observed during hotplug, after the CPU was offlined and the process
was migrated to a different CPU. setup_ghcb() is called again, which
tries to update ghcb_version in sev_es_negotiate_protocol(). Ideally this
is a read-only variable which is initialised during booting.
Trying to write it results in a page fault:
BUG: unable to handle page fault for address: ffffffffba556e70
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
[ ...]
Call Trace:
<TASK>
? __die_body.cold+0x1a/0x1f
? __die+0x2a/0x35
? page_fault_oops+0x10c/0x270
? setup_ghcb+0x71/0x100
? __x86_return_thunk+0x5/0x6
? search_exception_tables+0x60/0x70
? __x86_return_thunk+0x5/0x6
? fixup_exception+0x27/0x320
? kernelmode_fixup_or_oops+0xa2/0x120
? __bad_area_nosemaphore+0x16a/0x1b0
? kernel_exc_vmm_communication+0x60/0xb0
? bad_area_nosemaphore+0x16/0x20
? do_kern_addr_fault+0x7a/0x90
? exc_page_fault+0xbd/0x160
? asm_exc_page_fault+0x27/0x30
? setup_ghcb+0x71/0x100
? setup_ghcb+0xe/0x100
cpu_init_exception_handling+0x1b9/0x1f0
The fix is to call sev_es_negotiate_protocol() only in the BSP boot phase,
and it only needs to be done once in any case.
[ mingo: Refined the changelog. ]
Fixes: 95d33bfaa3e1 ("x86/sev: Register GHCB memory when SEV-SNP is active")
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Bo Gan <bo.gan@broadcom.com>
Signed-off-by: Bo Gan <bo.gan@broadcom.com>
Signed-off-by: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/1701254429-18250-1-git-send-email-kashwindayan@vmware.com
---
arch/x86/kernel/sev.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 70472eebe719..c67285824e82 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1234,10 +1234,6 @@ void setup_ghcb(void)
 	if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
 		return;

-	/* First make sure the hypervisor talks a supported protocol. */
-	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
 	/*
 	 * Check whether the runtime #VC exception handler is active. It uses
 	 * the per-CPU GHCB page which is set up by sev_es_init_vc_handling().
@@ -1254,6 +1250,13 @@ void setup_ghcb(void)
 		return;
 	}

+	/*
+	 * Make sure the hypervisor talks a supported protocol.
+	 * This gets called only in the BSP boot phase.
+	 */
+	if (!sev_es_negotiate_protocol())
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
 	/*
 	 * Clear the boot_ghcb. The first exception comes in before the bss
 	 * section is cleared.
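The oops quoted in the changelog above reports error_code(0x0003) together with "supervisor write access" and "permissions violation". As a hedged illustration of how those three statements relate, the sketch below decodes the low five architectural bits of an x86 page-fault error code. The macro, struct, and function names are local to this example and are assumptions, not the kernel's own definitions; higher bits of the error code are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the x86 page-fault error code (names local to this sketch). */
#define PF_PROT  (1u << 0) /* 1: protection violation, 0: page not present */
#define PF_WRITE (1u << 1) /* 1: write access, 0: read access */
#define PF_USER  (1u << 2) /* 1: user-mode access, 0: supervisor access */
#define PF_RSVD  (1u << 3) /* reserved bit set in a paging-structure entry */
#define PF_INSTR (1u << 4) /* fault was caused by an instruction fetch */

struct pf_info {
	bool protection;
	bool write;
	bool user;
	bool reserved;
	bool instr_fetch;
};

/* Split an error code into its low five flag bits. */
static struct pf_info decode_pf_error_code(uint32_t code)
{
	struct pf_info info = {
		.protection  = (code & PF_PROT) != 0,
		.write       = (code & PF_WRITE) != 0,
		.user        = (code & PF_USER) != 0,
		.reserved    = (code & PF_RSVD) != 0,
		.instr_fetch = (code & PF_INSTR) != 0,
	};
	return info;
}

/* The value from the oops: a supervisor write to a present, protected page. */
static void check_oops_example(void)
{
	struct pf_info info = decode_pf_error_code(0x0003);

	assert(info.protection);  /* "permissions violation" */
	assert(info.write);       /* "write access" */
	assert(!info.user);       /* "supervisor ... in kernel mode" */
}
```

Read this way, 0x0003 says the page was mapped but not writable, which is consistent with a write to a variable that has been made read-only after boot rather than with a missing mapping.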
* [tip: x86/urgent] x86/sev: Fix kernel crash due to late update to read-only ghcb_version
2023-11-29 10:40 [PATCH v2] x86/sev: Update ghcb_version only once Ashwin Dayanand Kamat
2023-11-29 10:42 ` kernel test robot
2023-11-30 9:30 ` [PATCH] x86/sev: Fix kernel crash due to late update to read-only ghcb_version Ingo Molnar
@ 2023-11-30 9:39 ` tip-bot2 for Ashwin Dayanand Kamat
2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Ashwin Dayanand Kamat @ 2023-11-30 9:39 UTC
To: linux-tip-commits
Cc: Tom Lendacky, Bo Gan, Ashwin Dayanand Kamat, Ingo Molnar, x86,
linux-kernel
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 27d25348d42161837be08fc63b04a2559d2e781c
Gitweb: https://git.kernel.org/tip/27d25348d42161837be08fc63b04a2559d2e781c
Author: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
AuthorDate: Wed, 29 Nov 2023 16:10:29 +05:30
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 30 Nov 2023 10:23:12 +01:00
x86/sev: Fix kernel crash due to late update to read-only ghcb_version
A write-access violation page fault kernel crash was observed while running
cpuhotplug LTP testcases on SEV-ES enabled systems. The crash was
observed during hotplug, after the CPU was offlined and the process
was migrated to a different CPU. setup_ghcb() is called again, which
tries to update ghcb_version in sev_es_negotiate_protocol(). Ideally this
is a read-only variable which is initialised during booting.
Trying to write it results in a page fault:
BUG: unable to handle page fault for address: ffffffffba556e70
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
[ ...]
Call Trace:
<TASK>
? __die_body.cold+0x1a/0x1f
? __die+0x2a/0x35
? page_fault_oops+0x10c/0x270
? setup_ghcb+0x71/0x100
? __x86_return_thunk+0x5/0x6
? search_exception_tables+0x60/0x70
? __x86_return_thunk+0x5/0x6
? fixup_exception+0x27/0x320
? kernelmode_fixup_or_oops+0xa2/0x120
? __bad_area_nosemaphore+0x16a/0x1b0
? kernel_exc_vmm_communication+0x60/0xb0
? bad_area_nosemaphore+0x16/0x20
? do_kern_addr_fault+0x7a/0x90
? exc_page_fault+0xbd/0x160
? asm_exc_page_fault+0x27/0x30
? setup_ghcb+0x71/0x100
? setup_ghcb+0xe/0x100
cpu_init_exception_handling+0x1b9/0x1f0
The fix is to call sev_es_negotiate_protocol() only in the BSP boot phase,
and it only needs to be done once in any case.
[ mingo: Refined the changelog. ]
Fixes: 95d33bfaa3e1 ("x86/sev: Register GHCB memory when SEV-SNP is active")
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Bo Gan <bo.gan@broadcom.com>
Signed-off-by: Bo Gan <bo.gan@broadcom.com>
Signed-off-by: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/1701254429-18250-1-git-send-email-kashwindayan@vmware.com
---
arch/x86/kernel/sev.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 70472ee..c672858 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1234,10 +1234,6 @@ void setup_ghcb(void)
 	if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
 		return;

-	/* First make sure the hypervisor talks a supported protocol. */
-	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
 	/*
 	 * Check whether the runtime #VC exception handler is active. It uses
 	 * the per-CPU GHCB page which is set up by sev_es_init_vc_handling().
@@ -1255,6 +1251,13 @@ void setup_ghcb(void)
 	}

 	/*
+	 * Make sure the hypervisor talks a supported protocol.
+	 * This gets called only in the BSP boot phase.
+	 */
+	if (!sev_es_negotiate_protocol())
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
+	/*
 	 * Clear the boot_ghcb. The first exception comes in before the bss
 	 * section is cleared.
 	 */
* Re: [PATCH] x86/sev: Fix kernel crash due to late update to read-only ghcb_version
2023-11-30 9:30 ` [PATCH] x86/sev: Fix kernel crash due to late update to read-only ghcb_version Ingo Molnar
@ 2023-11-30 16:07 ` Tom Lendacky
0 siblings, 0 replies; 5+ messages in thread
From: Tom Lendacky @ 2023-11-30 16:07 UTC
To: Ingo Molnar, Ashwin Dayanand Kamat
Cc: linux-kernel, bp, brijesh.singh, tglx, mingo, dave.hansen, x86,
hpa, jroedel, stable, ganb, tkundu, vsirnapalli, akaher,
amakhalov, namit
On 11/30/23 03:30, Ingo Molnar wrote:
>
> * Ashwin Dayanand Kamat <kashwindayan@vmware.com> wrote:
>
>> From: Ashwin Dayanand Kamat <ashwin.kamat@broadcom.com>
>>
>> kernel crash was observed because of page fault, while running
>> cpuhotplug ltp testcases on SEV-ES enabled systems. The crash was
>> observed during hotplug after the CPU was offlined and the process
>> was migrated to different cpu. setup_ghcb() is called again which
>> tries to update ghcb_version in sev_es_negotiate_protocol(). Ideally this
>> is a read_only variable which is initialised during booting.
>> This results in pagefault.
>
> Applied to tip:x86/urgent, thanks.
>
> Tom: I've added your Suggested-by and Acked-by, which appeared to be the
> case given the v1 discussion, let me know if that's not accurate.
All good.
Thanks,
Tom
>
> I've also tidied up the changelog - final version attached below.
>
> Thanks,
>
> Ingo
>