From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH V2] x86, amd_ucode: Skip microcode updates for final levels Date: Mon, 03 Aug 2015 13:22:28 -0400 Message-ID: <55BFA354.4040009@oracle.com> References: <1438619749-1625-1-git-send-email-aravind.gopalakrishnan@amd.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1438619749-1625-1-git-send-email-aravind.gopalakrishnan@amd.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Aravind Gopalakrishnan , jbeulich@suse.com, andrew.cooper3@citrix.com Cc: keir@xen.org, Suravee.Suthikulpanit@amd.com, xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 08/03/2015 12:35 PM, Aravind Gopalakrishnan wrote: > Some of older[Fam10h] systems require that certain number of > applied microcode patch levels should not be overwritten by > the microcode loader. Otherwise, system hangs are known to occur. > > The 'final_levels' of patch ids have been obtained empirically. > Refer bug https://bugzilla.suse.com/show_bug.cgi?id=913996 > for details of the issue. > > The short version is that people have predominantly noticed > system hang issues when trying to update microcode levels > beyond the patch IDs below. > [0x01000098, 0x0100009f, 0x010000af] > > From internal discussions, we gathered that OS/hypervisor > cannot reliably perform microcode updates beyond these levels > due to hardware issues. Therefore, we need to abort microcode > update process if we hit any of these levels. > > In this patch, we check for those microcode versions and abort > if the current core has one of those final patch levels applied > by the BIOS > > A linux version of the patch has already made it into tip- > http://marc.info/?l=linux-kernel&m=143703405627170 > > Signed-off-by: Aravind Gopalakrishnan > --- > Changes from V1 (per Andrew) > - use commit text from linux patch > - include details about how 'final_levels' are obtaines in comments > (I have also copied it into commit message. But shall remove if you > feel it's redundant) > - use ARRAY_SIZE() and kill NULL terminator > - use XENLOG_INFO in place of pr_debug() > - correct unsigned int usage > > xen/arch/x86/microcode_amd.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 42 insertions(+) > > diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c > index f79b397..a9761dc 100644 > --- a/xen/arch/x86/microcode_amd.c > +++ b/xen/arch/x86/microcode_amd.c > @@ -347,6 +347,40 @@ static int container_fast_forward(const void *data, size_t size_left, size_t *of > return 0; > } > > +/* > + * The 'final_levels' of patch ids have been obtained empirically. > + * Refer bug https://bugzilla.suse.com/show_bug.cgi?id=913996 > + * for details of the issue. The short version is that people > + * have predominantly noticed system hang issues when trying to > + * update microcode levels beyond the patch IDs below. > + * From internal discussions, we gathered that OS/hypervisor > + * cannot reliably perform microcode updates beyond these levels > + * due to hardware issues. Therefore, we need to abort microcode > + * update process if we hit any of these levels. > + */ > +static unsigned int final_levels[] = { > + 0x01000098, > + 0x0100009f, > + 0x010000af > +}; > + > +static bool_t check_final_patch_levels(int cpu) > +{ > + /* > + * Check the current patch levels on the cpu. If they are equal to > + * any of the 'final_levels', then we should not update the microcode > + * patch on the cpu as system will hang otherwise. > + */ > + struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu); > + unsigned int i; > + > + for ( i = 0; i < ARRAY_SIZE(final_levels); i++ ) > + if ( uci->cpu_sig.rev == final_levels[i] ) > + return 1; (I should have asked this when you posted v1, sorry) Are these final_levels[] fam10h-specific? Are we guaranteed that other families won't have them? I think family check would be good here. Or at least a comment stating that these levels can only be observed on family 10h (the only mention of family right now is in the commit message) -borsi > + > + return 0; > +} > + > static int cpu_request_microcode(int cpu, const void *buf, size_t bufsize) > { > struct microcode_amd *mc_amd, *mc_old; > @@ -369,6 +403,14 @@ static int cpu_request_microcode(int cpu, const void *buf, size_t bufsize) > goto out; > } > > + if ( check_final_patch_levels(cpu) ) > + { > + printk(XENLOG_INFO > + "microcode: Cannot update microcode patch on the cpu as we hit a final level\n"); > + error = -EPERM; > + goto out; > + } > + > mc_amd = xmalloc(struct microcode_amd); > if ( !mc_amd ) > {