From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joerg Roedel
Subject: Re: [RFC PATCH 00/62] Linux as SEV-ES Guest Support
Date: Wed, 12 Feb 2020 14:59:34 +0100
Message-ID: <20200212135934.GL20066@8bytes.org>
References: <20200211135256.24617-1-joro@8bytes.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Andy Lutomirski
Cc: x86@kernel.org, hpa@zytor.com, Andy Lutomirski, Dave Hansen,
 Peter Zijlstra, Thomas Hellstrom, Jiri Slaby, Dan Williams,
 Tom Lendacky, Juergen Gross, Kees Cook, linux-kernel@vger.kernel.org,
 kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
 Joerg Roedel
List-Id: virtualization@lists.linuxfoundation.org

On Tue, Feb 11, 2020 at 07:48:12PM -0800, Andy Lutomirski wrote:
>
>
> > On Feb 11, 2020, at 5:53 AM, Joerg Roedel wrote:
> >
> > >
> > * Putting some NMI-load on the guest will make it crash usually
> > within a minute
>
> Suppose you do CPUID or some MMIO and get #VC. You fill in the GHCB to
> ask for help. Some time between when you start filling it out and when
> you do VMGEXIT, you get NMI. If the NMI does its own GHCB access [0],
> it will clobber the outer #VC’s state, resulting in a failure when
> VMGEXIT happens. There’s a related failure mode if the NMI is after
> the VMGEXIT but before the result is read.
>
> I suspect you can fix this by saving the GHCB at the beginning of
> do_nmi and restoring it at the end. This has the major caveat that it
> will not work if do_nmi comes from user mode and schedules, but I
> don’t believe this can happen.
>
> [0] Due to the NMI_COMPLETE catastrophe, there is a 100% chance that
> this happens.

Very true, thank you! You probably saved me a few hours of debugging
this further :)

I will implement better handling for nested #VC exceptions, which
hopefully solves the NMI crashes.

Thanks again,

	Joerg
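
For illustration only, here is a minimal user-space sketch of the save/restore idea quoted above. It is not kernel code and not the fix that ended up in the series. The struct ghcb_sketch layout, the SKETCH_* constants and every function name are made up for this example; the real GHCB is a shared page with many more fields, and the real do_nmi path has constraints this model ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the per-CPU GHCB page. */
struct ghcb_sketch {
	uint64_t exit_code;
	uint64_t exit_info_1;
	uint64_t rax;
};

/* Placeholder values, chosen only so the two requests differ. */
#define SKETCH_EXIT_CPUID        0x72ULL
#define SKETCH_EXIT_NMI_COMPLETE 0x80000003ULL

static struct ghcb_sketch cpu_ghcb;

/* Models the outer #VC handler filling in its request before VMGEXIT. */
static void vc_fill_request(uint64_t exit_code, uint64_t rax)
{
	cpu_ghcb.exit_code = exit_code;
	cpu_ghcb.exit_info_1 = 0;
	cpu_ghcb.rax = rax;
}

/* Models the GHCB access done from NMI context; it overwrites the page. */
static void nmi_ghcb_access(void)
{
	cpu_ghcb.exit_code = SKETCH_EXIT_NMI_COMPLETE;
	cpu_ghcb.exit_info_1 = 0;
	cpu_ghcb.rax = 0;
}

/* NMI path without protection: the outer request is clobbered. */
static void nmi_unprotected(void)
{
	nmi_ghcb_access();
}

/* NMI path that saves the GHCB on entry and restores it on exit. */
static void nmi_save_restore(void)
{
	struct ghcb_sketch saved;

	memcpy(&saved, &cpu_ghcb, sizeof(saved));
	nmi_ghcb_access();
	memcpy(&cpu_ghcb, &saved, sizeof(saved));
}

/* Returns 1 if the outer #VC request is still what the handler wrote. */
static int outer_request_intact(uint64_t exit_code, uint64_t rax)
{
	return cpu_ghcb.exit_code == exit_code && cpu_ghcb.rax == rax;
}

/* Runs both scenarios: NMI arrives between filling the GHCB and VMGEXIT. */
static int ghcb_sketch_demo(void)
{
	vc_fill_request(SKETCH_EXIT_CPUID, 0x1);
	nmi_unprotected();
	assert(!outer_request_intact(SKETCH_EXIT_CPUID, 0x1));

	vc_fill_request(SKETCH_EXIT_CPUID, 0x1);
	nmi_save_restore();
	assert(outer_request_intact(SKETCH_EXIT_CPUID, 0x1));

	return 0;
}
```

Calling ghcb_sketch_demo() from a test harness exercises both paths. Because the sketch copies the whole buffer, it also models the second window mentioned in the quote (NMI after VMGEXIT but before the result is read), as long as the NMI path does not schedule.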