From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1423339AbdAITYi (ORCPT ); Mon, 9 Jan 2017 14:24:38 -0500
Received: from mga09.intel.com ([134.134.136.24]:28266 "EHLO mga09.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1161727AbdAITYc (ORCPT ); Mon, 9 Jan 2017 14:24:32 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.33,341,1477983600"; d="scan'208";a="1081028219"
Date: Mon, 9 Jan 2017 11:23:36 -0800
From: "Raj, Ashok"
To: Paul Menzel
Cc: Borislav Petkov, Linux Kernel Mailing List, Thorsten Leemhuis,
	Len Brown, Tony Luck, ashok.raj@intel.com
Subject: Re: Dell XPS13: MCE (Hardware Error) reported
Message-ID: <20170109192336.GA42856@otc-nc-03>
References: <20170104225546.wy36fu5t2jbow2dq@pd.tnic>
	<20170105011236.GA80100@otc-brkl-03>
	<662102c9-94da-3193-08c4-9fe75411cadb@molgen.mpg.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <662102c9-94da-3193-08c4-9fe75411cadb@molgen.mpg.de>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Paul

On Mon, Jan 09, 2017 at 12:53:33PM +0100, Paul Menzel wrote:
>
>
> On 01/05/17 02:12, Raj, Ashok wrote:
>
> >>>CPUID Vendor Intel Family 6 Model 142
> >This is Kabylake Mobile
> >
> >>>Hardware event. This is not a software error.
> >>>MCE 1
> >>>CPU 0 BANK 7
> >>>MISC 7880018086 ADDR fef1ce40
> >>>TIME 1483543069 Wed Jan 4 16:17:49 2017
> >>>STATUS ee0000000040110a MCGSTATUS 0
> >
> >Decoding the bits further from MCi_STATUS above:
> >Val=1, OVER=1, UC=1, but EN=0 indicates this isn't a MCE, hence should have
> >been signaled by a CMCI.
> >
> >PCC=1, but should be ignored when EN=0.
> >MCACOD: 110a MSCOD: 0040

This MSCOD indicates that it's a write-back access to MMIO space. It's
possible that the BIOS is scanning a certain memory region during boot.
During that time the BIOS disables generation of MCEs, which is why EN=0
in the log above.

It's a BIOS bug: one would expect the BIOS to clear these before handing
off to the OS. During OS boot we also scan all MC banks and log/clear them.

If you aren't observing them during normal operation, you can safely
ignore these preboot logs, or pass them along to your OEM.

Cheers,
Ashok
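
The MCi_STATUS decode quoted in this message can be reproduced with a short sketch. It assumes the IA32_MCi_STATUS layout described in the Machine-Check Architecture chapter of the Intel SDM (VAL=bit 63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57, MSCOD=bits 31:16, MCACOD=bits 15:0). The name decode_mci_status is hypothetical and is not part of mcelog or the kernel.

```python
# Minimal sketch: split an IA32_MCi_STATUS value into the fields discussed
# in the thread. Bit positions are an assumption taken from the Intel SDM
# layout; decode_mci_status is an illustrative name, not an existing API.
FLAG_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was uncorrected
    "EN": 60,     # error reporting was enabled for this bank
    "MISCV": 59,  # MCi_MISC holds valid data
    "ADDRV": 58,  # MCi_ADDR holds valid data
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status):
    """Return a dict of single-bit flags plus the MSCOD and MCACOD fields."""
    fields = {name: (status >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["MSCOD"] = (status >> 16) & 0xFFFF   # model-specific error code
    fields["MCACOD"] = status & 0xFFFF          # architectural error code
    return fields


if __name__ == "__main__":
    # STATUS value copied from the mcelog output quoted in this message.
    for name, value in decode_mci_status(0xEE0000000040110A).items():
        print(f"{name:6} = {value:#x}")
```

Run on the STATUS value from the log, this gives VAL=1, OVER=1, UC=1, EN=0 and PCC=1, with MCACOD 0x110a and MSCOD 0x40, which agrees with the decode quoted in this message.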