From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [ 3009.778974] mcelog:16842 map pfn expected mapping type write-back for [mem 0x0009f000-0x000a0fff], got uncached-minus
Date: Fri, 16 Nov 2012 11:58:34 -0500
Message-ID: <20121116165834.GA18725@phenom.dumpdata.com>
References: <791265057.20121116134056@eikelenboom.it> <20121116160733.GO22320@phenom.dumpdata.com> <1422434855.20121116174754@eikelenboom.it>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <1422434855.20121116174754@eikelenboom.it>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Sander Eikelenboom, jinsong.liu@intel.com
Cc: "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On Fri, Nov 16, 2012 at 05:47:54PM +0100, Sander Eikelenboom wrote:
>
> Friday, November 16, 2012, 5:07:33 PM, you wrote:
>
> > On Fri, Nov 16, 2012 at 01:40:56PM +0100, Sander Eikelenboom wrote:
> >> Hi Konrad,
> >>
> >> Sometime ago i reported this one at boot up:
> >>
> >> [ 3009.778974] mcelog:16842 map pfn expected mapping type write-back for [mem 0x0009f000-0x000a0fff], got uncached-minus
> >> [ 3009.788570] ------------[ cut here ]------------
> >> [ 3009.798175] WARNING: at arch/x86/mm/pat.c:774 untrack_pfn+0xa1/0xb0()
> >> [ 3009.807966] Hardware name: MS-7640
> >> [ 3009.817677] Modules linked in:
> >> [ 3009.827524] Pid: 16842, comm: mcelog Tainted: G W 3.7.0-rc5-20121116-reverted-persistent-warn-patwarn #1
> >> [ 3009.837415] Call Trace:
> >> [ 3009.847110] [] warn_slowpath_common+0x7a/0xb0
> >> [ 3009.856857] [] warn_slowpath_null+0x15/0x20
> >> [ 3009.866562] [] untrack_pfn+0xa1/0xb0
> >> [ 3009.876201] [] unmap_single_vma+0x86b/0x8e0
> >> [ 3009.885895] [] ? release_pages+0x196/0x1f0
> >> [ 3009.895488] [] unmap_vmas+0x4c/0xa0
> >> [ 3009.905134] [] exit_mmap+0x9a/0x180
> >> [ 3009.914706] [] mmput+0x52/0xd0
> >> [ 3009.924252] [] dup_mm+0x3c7/0x510
> >> [ 3009.933839] [] copy_process+0xac5/0x14a0
> >> [ 3009.943430] [] do_fork+0x53/0x360
> >> [ 3009.952843] [] ? lock_release+0x117/0x250
> >> [ 3009.962283] [] ? _raw_spin_unlock+0x30/0x60
> >> [ 3009.971532] [] ? sysret_check+0x22/0x5d
> >> [ 3009.980820] [] sys_clone+0x23/0x30
> >> [ 3009.990046] [] stub_clone+0x13/0x20
> >> [ 3009.999335] [] ? system_call_fastpath+0x16/0x1b
> >> [ 3010.008667] ---[ end trace 2d9694c2c0a24da8 ]---
> >>
> >>
> >> It seems to be due to the "mcelog" userspace tool provided with Debian Squeeze (mcelog 1.0~pre3-3 x86-64 Machine Check Exceptions collector and decoder).
> >> I can trigger this warning easily by restarting the mcelog tool with /etc/init.d/mcelog restart
> >>
> >> Should that one also function with the xen mcelog driver, or is a newer version required ?
>
> > The reason we get is b/c I had to disable the PAT functionality in the Linux kernel for Xen.
> > This is b/c it only worked one way - meaning you could convert a page from
> > WriteBack to WriteCombine or WriteBack to Uncached. But you could not
> > do WriteCombine back to WriteBack - due to one of the functions that
> > changes the bits was using an "unfiltered" way to identify the bits on the
> > page.
>
> > Anyhow, we had a disaster b/c some of these pages that used to WriteBack (WB)
> > got converted to WriteCombine (WC) and then were returned back as such
> > to the page pool. And if they were re-used by filesystem invariably we got
> > corruptions.
>
> > So until the PAT table lookup thing that Peter H. Anvin suggested
> > gets implemented this splat gotta show up :-(
>
> Not a big problem for me, i was just wondering :-)
> I'm more interested in the netfront troubles, since it's already rc5.
>
> > Does mcelog still work even with this warning?
>
> Not the daemon:
>
> serveerstertje:~# sh /etc/init.d/mcelog start
> Starting Machine Check Exceptions decoder: daemon: Cannot allocate memory
>

Ugh. CC-ing Liu here.

>
> >>
> >> --
> >> Sander
>
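
For readers following the quoted explanation of why PAT was disabled under Xen, here is a rough toy model of the one-way conversion problem it describes. It is not kernel code: `Page`, `set_memtype`, `free_page` and the `ALLOWED` table are hypothetical names made up for illustration, and the assumption that a failed conversion goes unnoticed is mine, not something the thread states.

```python
# Toy model only, not kernel code. All names here are hypothetical.
WB, WC, UC = "write-back", "write-combine", "uncached"

# Conversions that worked, per the description: WB -> WC and WB -> UC.
ALLOWED = {(WB, WC), (WB, UC)}


class Page:
    def __init__(self):
        self.memtype = WB  # pages start out as write-back


def set_memtype(page, new_type):
    """Change the cache attribute; return True if the change took effect."""
    if page.memtype == new_type:
        return True
    if (page.memtype, new_type) in ALLOWED:
        page.memtype = new_type
        return True
    # e.g. WC -> WB: the page keeps its old type.
    # Assumption: the caller does not notice the failure.
    return False


def free_page(pool, page):
    """Try to restore WB, then return the page to the pool regardless."""
    set_memtype(page, WB)
    pool.append(page)


pool = []
page = Page()
set_memtype(page, WC)   # a driver asks for write-combine
free_page(pool, page)   # restoring WB fails, so the page goes back as WC
reused = pool.pop()     # a later user (say, a filesystem) expects WB
print(reused.memtype)   # prints "write-combine"
```

The point of the sketch is only that a page whose attribute cannot be restored ends up back in the pool with the wrong cache type, which matches the corruption scenario described in the quoted text.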