From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCH V2] x86, amd_ucode: Safeguard against #GP
Date: Mon, 02 Jun 2014 10:13:07 -0400
Message-ID: <538C8673.60900@oracle.com>
References: <1401215048-17154-1-git-send-email-aravind.gopalakrishnan@amd.com>
 <53852405.9010704@citrix.com> <5385FDD3.8020307@amd.com>
 <53862348.1060400@oracle.com> <5388AB6A.7010803@amd.com>
 <5388B005.2070108@citrix.com> <538C44560200007800016B08@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <538C44560200007800016B08@mail.emea.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Andrew Cooper, keir@xen.org, Aravind Gopalakrishnan, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 06/02/2014 03:31 AM, Jan Beulich wrote:
>>>> On 30.05.14 at 18:21, wrote:
>> The unhandled #GP fault certainly should be wrapped with wrmsr_safe(),
>> and an error/warning presented to the user. In the case that a bad
>> ucode is discovered, it should be discarded and the server allowed to
>> boot. It is substantially more useful for the server to come up and say
>> "I couldn't load that bit of microcode you wanted me to", than to sit in
>> a reboot loop because you made a typo in the bootloader config, and have
>> to get someone in the datacenter to poke the physical server.
> But this isn't due to a typo somewhere, but due to a corrupted
> microcode blob.

Right, but the argument that we don't want to be stuck in the reboot
loop still holds.

> Besides that no matter which BKDG I look at, I can't seem to find any
> indication of there being room for a #GP here if the MSR itself is
> implemented. While I don't question its presence in reality, I'd prefer
> if this was documented properly for a patch to recover from it to go
> in.

Unfortunately the whole microcode patching procedure is, to put it
mildly, not well documented, particularly the #GP part. We had an email
exchange with an AMD HW architect and he confirmed that a corrupted
patch results in #GP.

-boris
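
For illustration, the behaviour argued for in this thread (do the patch-loader write through a fault-tolerant accessor, warn, drop the bad blob, and keep booting) could look roughly like the sketch below. This is not the patch under review. fake_wrmsr_safe(), apply_ucode_patch(), struct ucode_patch and its "corrupted" flag are hypothetical stand-ins so the example can run in user space; in the hypervisor, wrmsr_safe() would return an error when the write raises #GP.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MSR_AMD_PATCHLOADER 0xc0010020u  /* AMD microcode patch loader MSR */

/* Hypothetical patch descriptor, for illustration only. */
struct ucode_patch {
    uint32_t patch_id;
    bool corrupted;  /* simulation knob: stands in for a blob that would #GP */
};

/*
 * Hypothetical user-space stand-in for wrmsr_safe(): returns 0 on
 * success and -EFAULT where the real MSR write would have faulted.
 */
static int fake_wrmsr_safe(uint32_t msr, const struct ucode_patch *patch)
{
    (void)msr;
    return patch->corrupted ? -EFAULT : 0;
}

/*
 * Try to apply a patch. On failure, warn and return the error so the
 * caller can discard the blob and carry on booting without it.
 */
static int apply_ucode_patch(const struct ucode_patch *patch)
{
    int rc = fake_wrmsr_safe(MSR_AMD_PATCHLOADER, patch);

    if (rc) {
        fprintf(stderr,
                "microcode: patch %#x rejected (rc=%d), continuing without it\n",
                (unsigned int)patch->patch_id, rc);
        return rc;
    }

    return 0;
}
```

The point of the design is that the caller gets an error code instead of an unhandled fault, so a corrupted blob costs a warning rather than a reboot loop.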