From mboxrd@z Thu Jan 1 00:00:00 1970 From: Aravind Gopalakrishnan Subject: Re: [PATCH V2] x86, amd_ucode: Skip microcode updates for final levels Date: Mon, 3 Aug 2015 13:42:39 -0500 Message-ID: <55BFB61F.5090805@amd.com> References: <1438619749-1625-1-git-send-email-aravind.gopalakrishnan@amd.com> <55BFA354.4040009@oracle.com> <55BFAA6A.2040108@amd.com> <55BFB06F.7070101@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <55BFB06F.7070101@oracle.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Boris Ostrovsky , jbeulich@suse.com, andrew.cooper3@citrix.com Cc: keir@xen.org, Suravee.Suthikulpanit@amd.com, xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 8/3/2015 1:18 PM, Boris Ostrovsky wrote: > On 08/03/2015 01:52 PM, Aravind Gopalakrishnan wrote: >> On 8/3/2015 12:22 PM, Boris Ostrovsky wrote: >>> On 08/03/2015 12:35 PM, Aravind Gopalakrishnan wrote: >>>> Some of older[Fam10h] systems require that certain number of >>>> applied microcode patch levels should not be overwritten by >>>> the microcode loader. Otherwise, system hangs are known to occur. >>>> >>>> The 'final_levels' of patch ids have been obtained empirically. >>>> Refer bug https://bugzilla.suse.com/show_bug.cgi?id=913996 >>>> for details of the issue. >>>> >>>> The short version is that people have predominantly noticed >>>> system hang issues when trying to update microcode levels >>>> beyond the patch IDs below. >>>> [0x01000098, 0x0100009f, 0x010000af] >>>> >>>> >>>> >>>> +/* >>>> + * The 'final_levels' of patch ids have been obtained empirically. >>>> + * Refer bug https://bugzilla.suse.com/show_bug.cgi?id=913996 >>>> + * for details of the issue. The short version is that people >>>> + * have predominantly noticed system hang issues when trying to >>>> + * update microcode levels beyond the patch IDs below. >>>> + * From internal discussions, we gathered that OS/hypervisor >>>> + * cannot reliably perform microcode updates beyond these levels >>>> + * due to hardware issues. Therefore, we need to abort microcode >>>> + * update process if we hit any of these levels. >>>> + */ >>>> +static unsigned int final_levels[] = { >>>> + 0x01000098, >>>> + 0x0100009f, >>>> + 0x010000af >>>> +}; >>>> + >>>> +static bool_t check_final_patch_levels(int cpu) >>>> +{ >>>> + /* >>>> + * Check the current patch levels on the cpu. If they are >>>> equal to >>>> + * any of the 'final_levels', then we should not update the >>>> microcode >>>> + * patch on the cpu as system will hang otherwise. >>>> + */ >>>> + struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu); >>>> + unsigned int i; >>>> + >>>> + for ( i = 0; i < ARRAY_SIZE(final_levels); i++ ) >>>> + if ( uci->cpu_sig.rev == final_levels[i] ) >>>> + return 1; >>> >>> (I should have asked this when you posted v1, sorry) >>> >>> Are these final_levels[] fam10h-specific? Are we guaranteed that >>> other families won't have them? I think family check would be good >>> here. Or at least a comment stating that these levels can only be >>> observed on family 10h (the only mention of family right now is in >>> the commit message) >>> >>> >> >> >> Yep. These are Fam10h specific microcode patches. And yeah, the HW >> problem seems to exist only on these Fam10h specific processor >> variants. So, it's pretty isolated. >> >> And why need a family check? The microcode patches are tied to a >> specific family/model anyway right? > > Patches --- yes. I wasn't sure about levels (I hope we will never see > the same level for different families but AFAIK it's not strictly > prohibited since level is just a cookie). > Ah. I see what you mean. I can think of two ways around this- a. I can move the check_final_patch_levels() call to apply_microcode(). That way, our initial checks in microcode_fits() would have already checked equivalent cpu IDs so we know we are on the right family. b. Introduce a family check in this patch as you suggested (It's a simple change, should work fine too and I think can retain Andrew's 'Reviewed-by' as the logic doesn't change too much..) Do let me know if you have a preference. >> It's only ever going to apply successfully if the equivalent id's >> match.. >> >> And sure, I can massage the comment like so- >> " ... The short version is that people who are using certain Fam10h >> systems are noticing system hang issues when tryingupdate microcode..." >> >> Would that be ok? > > Better than nothing (although TBH I'd rather see both the comment and > the check). > And sure, I'll include a comment as well. Thanks, -Aravind.