Re: [PATCH V2] x86, amd_ucode: Safeguard against #GP

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>,
	boris ostrovsky <boris.ostrovsky@oracle.com>
Cc: keir@xen.org, JBeulich@suse.com, xen-devel@lists.xen.org
Subject: Re: [PATCH V2] x86, amd_ucode: Safeguard against #GP
Date: Fri, 30 May 2014 11:46:01 -0500	[thread overview]
Message-ID: <5388B5C9.20100@amd.com> (raw)
In-Reply-To: <5388B005.2070108@citrix.com>

On 5/30/2014 11:21 AM, Andrew Cooper wrote:
>
> On 30/05/2014 17:01, Aravind Gopalakrishnan wrote:
>> On 5/28/2014 12:56 PM, boris ostrovsky wrote:
>>>
>>> On 5/28/2014 11:16 AM, Aravind Gopalakrishnan wrote:
>>>> On 5/27/2014 6:47 PM, Andrew Cooper wrote:
>>>>> On 27/05/2014 19:24, Aravind Gopalakrishnan wrote:
>>>>>> When HW tries to load a corrupted patch, it generates #GP
>>>>>> and hangs the system. Use wrmsr_safe instead so that we
>>>>>> fail to load microcode gracefully.
>>>>>>
>>>>>> Also, massage error handling around apply_microcode to keep
>>>>>> in tune with error handling style of other parts of the code.
>>>>>>
>>>>>> Example on a Fam15h system-
>>>>>> (XEN) microcode: CPU0 collect_cpu_info: patch_id=0x6000626
>>>>>> (XEN) microcode: CPU0 size 7870, block size 2586 offset 76 equivID
>>>>>> 0x6012 rev 0x6000637
>>>>>> (XEN) microcode: CPU0 found a matching microcode update with version
>>>>>> 0x6000637 (current=0x6000626)
>>>>>> (XEN) traps.c:3073: GPF (0000): ffff82d08016f682 -> ffff82d08022d9f8
>>>>>> (XEN) microcode: CPU0 update from revision 0x6000637 to 0x6000626 
>>>>>> failed
>>>>>> ^^^^^^^^^^^^^^^^^^^^^^
>>>>>> As shown, the log message above has the two revisions reversed. 
>>>>>> Fix this
>>>>>>
>>>>>> Changes in V2:
>>>>>>      - Do not ignore return value from wrmsr_safe
>>>>>>      - Flip revision numbers as shown above
>>>>>>
>>>>>> Signed-off-by: Aravind Gopalakrishnan 
>>>>>> <aravind.gopalakrishnan@amd.com>
>>>>>> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>> I thought we had identified that the hangs were to do with your 
>>>>> use of
>>>>> 'noreboot' on the Xen command line.
>>>>>
>>>>
>>>> Hmm. Yeah.. I figured using wrmsr_safe allows user to just boot 
>>>> into dom0 without
>>>> having to run through reboot loops. (lazy alternative I guess)
>>>>
>>>> Nevermind then. Thanks for the comments (Jan and Andrew). Will keep 
>>>> in mind for the future.
>>>
>>> I don't understand --- the fact that you had noreboot option meant 
>>> that your system wouldn't reboot (duh!) when a patch is corrupted 
>>> (aka "it will hang"). But I'd think we don't want a reboot neither 
>>> --- we want to safely skip the patch (and possibly backlist it).
>>>
>>>
>>
>> So, allowing the reboot to happen in turn allows entry to grub.
>> Where we can simply remove 'ucode=' option and this is same as
>> 'skipping' the patch using wrmsr_safe right?
>>
>> Except this is now explicitly done by the sysadmin
>> while wrmsr_safe just works without anyone doing some extra work;
>> and printing a log message informs user that update went wrong
>> for whatever reason.
>>
>> My understanding from earlier comments is that Andrew (and Jan)
>> would rather not see a change from wrmsrl to wrmsr_safe if it
>> is needless because there is already a way someone can circumvent the
>> corrupt patch: just don't provide it on the grub menu.
>>
>> Thanks,
>> -Aravind.
>
> Sorry - I think my comment confused the issue.  Let me retry.
>
> Originally, the bug was described approximately as "A corrupt ucode 
> will cause a GP fault, causing the server to hang".
>
> The unhandled #GP fault certainly should be wrapped with wrmsr_safe(), 
> and an error/warning presented to the user.  In the case that a bad 
> ucode is discovered, it should be discarded and the server allowed to 
> boot.  It is substantially more useful for the server to come up and 
> say "I couldn't load that bit of microcode you wanted me to", than to 
> sit in a reboot loop because you made a typo in the bootloader config, 
> and have to get someone in the datacenter to poke the physical server.
>
>
> My objection was to the wording of the comment alone.  Unhandled #GP 
> faults do not "hang" Xen unless you ask for them to behave in that 
> way, given "noreboot" on the command line.
>

Ah. Okay, let me look into this again then and fix issues with V2.

-Aravind.

next prev parent reply	other threads:[~2014-05-30 16:46 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-27 18:24 [PATCH V2] x86, amd_ucode: Safeguard against #GP Aravind Gopalakrishnan
2014-05-27 23:47 ` Andrew Cooper
2014-05-28 15:16   ` Aravind Gopalakrishnan
2014-05-28 17:56     ` boris ostrovsky
2014-05-30 16:01       ` Aravind Gopalakrishnan
2014-05-30 16:21         ` Andrew Cooper
2014-05-30 16:46           ` Aravind Gopalakrishnan [this message]
2014-06-02  7:31           ` Jan Beulich
2014-06-02 14:13             ` Boris Ostrovsky
2014-05-28  7:22 ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5388B5C9.20100@amd.com \
    --to=aravind.gopalakrishnan@amd.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).