From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefan Bader Subject: Re: PATastic fun Date: Tue, 26 Feb 2013 18:11:42 +0100 Message-ID: <512CECCE.3070305@canonical.com> References: <51277888.50908@canonical.com> <20130222145316.GB8017@phenom.dumpdata.com> <512B2A7C.1050906@canonical.com> <512CD595.2000502@canonical.com> <20130226170330.GB22422@phenom.dumpdata.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============2439896911099820513==" Return-path: In-Reply-To: <20130226170330.GB22422@phenom.dumpdata.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Konrad Rzeszutek Wilk Cc: "Liu, Jinsong" , Colin Ian King , "xen-devel@lists.xensource.com" , Sander Eikelenboom List-Id: xen-devel@lists.xenproject.org This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --===============2439896911099820513== Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="------------enig351F714F7D60569A30F41ADA" This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig351F714F7D60569A30F41ADA Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 26.02.2013 18:03, Konrad Rzeszutek Wilk wrote: > On Tue, Feb 26, 2013 at 04:32:37PM +0100, Stefan Bader wrote: >> On 25.02.2013 10:10, Stefan Bader wrote: >>> On 25.02.2013 04:15, Liu, Jinsong wrote: >>>> Konrad Rzeszutek Wilk wrote: >>>>> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote: >>>>>> Hi Konrad, >>>>> >>>>> Hey Stefan, >>>>>> >>>>>> here is another one from the hm-what? department: >>>>> >>>>> Heh - the really good-bug-hunting one. Lets also include Jinsong as= >>>>> he has been tracking a similar one with mcelog. >>>>>> >>>>>> Colin discovered that running the attached program with the fork >>>>>> active (e.g. "./mmap-example -f 0x10000", the address can be that = or >>>>>> iomem) this triggers the following weird messages:=20 >>>>>> >>>>>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type >>>>>> write-back for [mem 0x00010000-0x00010fff], got uncached-minus >>>>>> [ 6824.453776] ------------[ cut here ]------------ >>>>>> [ 6824.453796] WARNING: at >>>>>> /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 >>>>>> untrack_pfn+0xb8/0xd0() ... [ 6824.453920] Pid: 3481, comm: >>>>>> mmap-example Tainted: GF=20 >>>>>> 3.8.0-6-generic #13-Ubuntu >>>>>> [ 6824.453926] Call Trace: >>>>>> [ 6824.453944] [] warn_slowpath_common+0x7f/0xc= 0 >>>>>> [ 6824.453954] [] warn_slowpath_null+0x1a/0x20 >>>>>> [ 6824.453963] [] untrack_pfn+0xb8/0xd0 >>>>>> [ 6824.453975] [] unmap_single_vma+0xac/0x100 >>>>>> [ 6824.453985] [] unmap_vmas+0x49/0x90 >>>>>> [ 6824.453995] [] exit_mmap+0x98/0x170 >>>>>> [ 6824.454007] [] mmput+0x64/0x100 >>>>>> [ 6824.454017] [] dup_mm+0x445/0x660 >>>>>> [ 6824.454027] [] >>>>>> copy_process.part.22+0xa5f/0x1510 [ 6824.454038]=20 >>>>>> [] do_fork+0x91/0x350 [ 6824.454048]=20 >>>>>> [] sys_clone+0x16/0x20 [ 6824.454060]=20 >>>>>> [] stub_clone+0x69/0x90 [ 6824.454069]=20 >>>>>> [] ? system_call_fastpath+0x1a/0x1f [ 6824.45407= 6] >>>>>> ---[ end trace 4918cdd0a4c9fea4 ]---=20 >>>>>> >>>>>> I found that this is related to your bandaid patch >>>>>> >>>>>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1 >>>>>> Author: Konrad Rzeszutek Wilk >>>>>> Date: Fri Feb 10 09:16:27 2012 -0500 >>>>>> >>>>>> xen/pat: Disable PAT support for now. >>>>>> >>>>>> I just do not understand how this happens. From the trace it seems= >>>>>> the fork=20 >>>>>> fails when duplicating the VMAs (dup_mm calls mmput on failure). S= o >>>>>> maybe the=20 >>>>>> warning is just related to this. So primarily the question is how = on >>>>>> fork the _PAGE_PCD bit can become set? That and _PAGE_PWT are >>>>>> cleared from the supported=20 >>>>>> mask by the patch, so somehow I would think nothing should be able= >>>>>> to set it...=20 >>>>>> But apparently not so. >>>>>> Not sure it is a big deal since I never saw this in normal operati= on >>>>>> and it=20 >>>>>> seems to be ok when unapping before doing the fork. It is just pla= in >>>>>> odd.=20 >>>>> >>>>> Jinsong mentioned that there is some oddity with the MTRR. Somehow = the >>>>> ranges are swapped or not correct. Jinsong, could you shed some lig= ht >>>>> on what you have found so far? >>>>> >>>> >>>> Yes, Sander once also reported a similar weird warning when start mc= elog daemon, as attached. >>>> >>>> Basically, it occurs when mcelog user daemon start,=20 >>>> do_fork >>>> --> copy_process >>>> --> dup_mm >>>> --> dup_mmap >>>> --> copy_page_range >>>> --> track_pfn_copy >>>> --> reserve_pfn_range >> >> So that makes it clearer as this will do >> >> reserve_memtype(...) >> --> pat_x_mtrr_type >> --> mtrr_type_lookup >> --> __mtrr_type_lookup >> >> And that can return -1/0xff in case of mtrr not being enabled/initiali= zed. Which >> is not the case (given there are no messages for it in dmesg). This is= not equal >> to MTRR_TYPE_WRBACK and thus becomes _PAGE_CACHE_UC_MINUS. >> >> It looks like the problem starts early in reserve_memtype: >> >> if (!pat_enabled) { >> /* This is identical to page table setting without PAT= */ >> if (new_type) { >> if (req_type =3D=3D _PAGE_CACHE_WC) >> *new_type =3D _PAGE_CACHE_UC_MINUS; >> else >> *new_type =3D req_type & _PAGE_CACHE_M= ASK; >> } >> return 0; >> } >> >> This would be what we want, but only clearing the PWT and PCD flags fr= om the >> supported flags is not changing pat_enabled (which is 1 when PAT suppo= rt is >> compiled into the kernel). Unfortunately the variable is local and sin= ce there >> are not any messages about PAT in dmesg I would say pat_init() is not = called >> either. Which might be used to disable PAT support by clearing the CPU= feature >> flag. >=20 > Hm, so: >=20 > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index 39928d1..9ac70c5 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -379,6 +379,7 @@ static void __init xen_init_cpuid_mask(void) > =20 > cpuid_leaf1_edx_mask =3D > ~((1 << X86_FEATURE_MTRR) | /* disable MTRR */ > + (1 << X86_FEATURE_PAT) | /* disable PAT */ > (1 << X86_FEATURE_ACC)); /* thermal monitoring */ > =20 > if (!xen_initial_domain()) >=20 >=20 >=20 > should do it right? Let me check on the troublesome machine I saw > it. >=20 I could try it as well but somehow my reading of pat_init() is that if th= at would have been called at all, there should be some message about PAT in = dmesg. And normal dom0 boots do not show anything. It looks like pat_init only gets called from two places in mtrr code, and= that probably is not done in dom0 either. So clearing the feature is one step,= but I would think that also there needs to be a call to pat_init... -Stefan --------------enig351F714F7D60569A30F41ADA Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJRLOzOAAoJEOhnXe7L7s6jwmoQAIGRjO0yPCkqAZ1TvZBGMAoG t3L7kymQruKXs9oE2sDUokBJDUdwaU9T/V20LXhwimHdix4ax96dew+p6NEyZXYS zydb2/TjcgAWD1rvgon1CpwKVTwXJdDG1xuo1WWr2B5KALwfiBHJhXnUqH5vi6Ar qRCigw5HwfNyx6pxG3WYR7Lb9fFBfzufbJ/+UzFKDwGbmrNAscwrzxhs3HUNhoR2 wWuST+Fjkilc02ZyVzPuJh0mMScKvq3z+czBJxsQzUyGgbChaTswvRSkIYN3x6Mm NuQp6b39TEbyQ1k2ZXq0ijpM6XczEObUEToQ6yPoVd2UU3nsDQiOiXSIPmDpmfF2 teJF0pQj9Vw2Q89rZuiUBFR6yJtyUlGlCR+HiuUBnC04tuoeZeK68dOYSfk44oHn lGoWgcqOowzuQfTCiTh0ZxvM9CwaeSAJY3kQxHRzUocGAVIQLi9NJinR9CmytbPv G1BA6+ClOOQhWawhoCfYw/iEkzdmRByhQvgO5NcG2DjKwJSejmiu+E2Ka5USv0AO N1Ra1YyFwxwGvyhDkLmkzj3xQWUHIHTmZasj/vd5CQPE2UPMS+7+1hTXQdvJXF/w mxqhU56jEr/cOsWMYSt+vqz9Mucj4ka1o7H2nnWDRw2qMn3DoNQ35+o52XDjBXQU mFc6hxyfo+lyWYLOQBRo =OAQR -----END PGP SIGNATURE----- --------------enig351F714F7D60569A30F41ADA-- --===============2439896911099820513== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============2439896911099820513==--