Date: Thu, 27 May 2021 17:22:24 +0000
From: Sean Christopherson
To: Tom Lendacky
Cc: Peter Gonda, kvm list, linux-kernel@vger.kernel.org, x86@kernel.org,
	Paolo Bonzini, Jim Mattson, Joerg Roedel, Vitaly Kuznetsov, Wanpeng Li,
	Borislav Petkov, Ingo Molnar, Thomas Gleixner, Brijesh Singh
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure
References: <324d9228-03e9-0fe2-59c0-5e41e449211b@amd.com> <468cee77-aa0a-cf4a-39cf-71b5bfb3575e@amd.com>
In-Reply-To: <468cee77-aa0a-cf4a-39cf-71b5bfb3575e@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 20, 2021, Tom Lendacky wrote:
> On 5/20/21 2:16 PM, Sean Christopherson wrote:
> > On Mon, May 17, 2021, Tom Lendacky wrote:
> >> On 5/14/21 6:06 PM, Peter Gonda wrote:
> >>> On Fri, May 14, 2021 at 1:22 PM Tom Lendacky wrote:
> >>>>
> >>>> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> >>>> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> >>>> from userspace, even though userspace (likely) can't update the GHCB,
> >>>> don't allow userspace to be able to kill the guest.
> >>>>
> >>>> Return a #GP request through the GHCB when validation fails, rather than
> >>>> terminating the guest.
> >>>
> >>> Is this a gap in the spec? I don't see anything that details what
> >>> should happen if the correct fields for NAE are not set in the first
> >>> couple paragraphs of section 4 'GHCB Protocol'.
> >>
> >> No, I don't think the spec needs to spell out everything like this. The
> >> hypervisor is free to determine its course of action in this case.
> >
> > The hypervisor can decide whether to inject/return an error or kill the guest,
> > but what errors can be returned and how they're returned absolutely needs to be
> > ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> > is the logical place to define said ABI.
>
> For now, that is all we have for versions 1 and 2 of the spec. We can
> certainly extend it in future versions if that is desired.
>
> I would suggest starting a thread on what we would like to see in the next
> version of the GHCB spec on the amd-sev-snp mailing list:
>
> amd-sev-snp@lists.suse.com

Will do, but in the meantime, I don't think we should merge a fix of any kind
until there is consensus on what the VMM behavior will be. IMO, fixing this in
upstream is not urgent; I highly doubt anyone is deploying SEV-ES in production
using a bleeding edge KVM.

> > For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
> > completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
> > which means if something does go awry KVM has just made debugging the guest that
> > much harder, e.g. imagine the confusion that will ensue if the end result is a
> > SIGBUS to userspace on CPUID.
>
> I see the point you're making, but I would also say that we probably
> wouldn't even boot successfully if the kernel can't handle, e.g., a CPUID
> #VC properly.

I agree that GHCB bugs in the guest will be fatal, but that doesn't give the VMM
carte blanche to do whatever it wants given bad input.

> A lot of what could go wrong with required inputs, not the values, but the
> required state being communicated, should have already been ironed out during
> development of whichever OS is providing the SEV-ES support.

Yes, but betting on the kernel never having a regression is a losing proposition.
And it doesn't even necessarily require a regression, e.g.
an existing memory corruption bug elsewhere in the guest kernel (that escaped
qualification) could corrupt the GHCB. If the GHCB is corrupted at runtime, the
guest needs well-defined semantics from the VMM so that the guest at least has a
chance of sanely handling the error. Handling in this case would mean an
oops/panic, but that's far, far better than a random pseudo-#GP that might not
even be immediately logged as a failure.
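
Purely as an illustration of what "well-defined semantics" could look like from
the guest side, here is a minimal userspace sketch. Everything in it is an
assumption made for the example: the status codes, the enum names and
guest_handle_vmgexit_result() are hypothetical and come from neither the GHCB
spec nor KVM. The only point is that an explicit, agreed-upon error status lets
the guest fail loudly at the offending request instead of forwarding a
pseudo-#GP.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical status a VMM might return for a VMGEXIT request. */
enum vmm_status {
	VMM_OK = 0,
	VMM_ERR_MALFORMED_INPUT = 1,	/* required GHCB state missing or invalid */
};

/* Hypothetical outcome the guest chooses after inspecting the status. */
enum guest_action {
	GUEST_CONTINUE,
	GUEST_PANIC,
};

/*
 * Hypothetical guest-side check: on any non-OK status, log the exit code
 * and status and ask the caller to oops/panic right away, rather than
 * turning the failure into an exception that gets forwarded elsewhere.
 */
static enum guest_action guest_handle_vmgexit_result(uint64_t exit_code,
						     enum vmm_status status)
{
	if (status == VMM_OK)
		return GUEST_CONTINUE;

	fprintf(stderr, "VMM rejected VMGEXIT 0x%llx: status %d\n",
		(unsigned long long)exit_code, (int)status);
	return GUEST_PANIC;
}
```

Whether such a status is carried in an existing GHCB field or a new one, and
which values it can take, is exactly the kind of thing that would need to be
agreed on as guest/host ABI.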