From: Stefan Bader
Subject: bad page flags booting 32bit dom0 on 64bit hypervisor using dom0_mem (kernel >=4.2)
Date: Mon, 2 May 2016 12:47:44 +0200
Message-ID: <57273050.6060300@canonical.com>
To: xen-devel
List-Id: xen-devel@lists.xenproject.org

I recently tried to boot a 32-bit dom0 on a 64-bit Xen host that I had
configured to run with a limited, fixed amount of memory for dom0. It seems
that somewhere between kernel versions 3.19 and 4.2 (sorry, that is still a
wide range) the Linux kernel started to report bad page flags for a range of
pages (which seem to be around the end of the guest pfn range). For a 4.2
kernel that was easily missed, as the boot finished OK and dom0 was accessible.
However starting with 4.4 (tested 4.= 5 and a 4.6-rc) the serial console output freezes after some of those bad page fl= ag messages and then (unfortunately without any further helpful output) the = host reboots (I assume there is a panic that triggers a reset). I suspect the problem is more a kernel side one. It is just possible to influence things by variation of dom0_mem=3D#,max:#. 512M seems ok, 1024M= , 2048M, and 3072M cause bad page flags starting around kernel 4.2 and reboots aro= und 4.4. Then 4096M and not clamping dom0 memory seem to be ok again (though = not limiting dom0 memory seems to cause trouble on 32bit dom0 later when a do= mU tries to balloon memory, but I think that is a different problem). I have not seen this on a 64bit dom0. Below is an example of those bad pa= ge errors. Somehow it looks to be a page marked as reserved. Initially I won= dered whether this could be a problem of not clearing page flags when moving ma= ppings to match the e820. But I never looked into i386 memory setup in that deta= il. So I am posting this, hoping that someone may have an idea from the detail a= bout where to look next. PAE is enabled there. Usually its bpf init that gets = hit but that likely is just because that is doing the first vmallocs. -Stefan [ 4.748815] BUG: Bad page state in process swapper/0 pfn:3fc1e [ 4.748861] page:f675a4b0 count:0 mapcount:0 mapping: (null) index:0x= 0 [ 4.748908] flags: 0x3000400(reserved) [ 4.748984] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set [ 4.749030] bad because of flags: [ 4.749069] flags: 0x400(reserved) [ 4.749143] Modules linked in: [ 4.749201] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 4.2.0-10-generic #12-Ubuntu [ 4.749303] Hardware name: Intel Corporation ... 
[    4.749379] 00000000 00000000 f0cffcfc c1730710 f675a4b0 f0cffd20 c115be27 c194692c
[    4.749584] f0d503ec 0003fc1e 007fffff c1946eb8 c194a70e 00000001 f0cffd8c c115f4c3
[    4.749790] 00000002 00000141 00009069 c1ac5384 00000000 f11d7ce4 f11d7ce4 c1ac4dc0
[    4.749993] Call Trace:
[    4.750034] [] dump_stack+0x41/0x52
[    4.750078] [] bad_page+0xb7/0x110
[    4.750121] [] get_page_from_freelist+0x2d3/0x610
[    4.750168] [] __alloc_pages_nodemask+0x146/0x8f0
[    4.750215] [] ? find_entry.isra.13+0x52/0x90
[    4.750260] [] ? kmem_cache_alloc_trace+0x175/0x1e0
[    4.750308] [] ? __raw_callee_save___pv_queued_spin_unlock+0x6/0x10
[    4.750373] [] ? __kmalloc+0x21d/0x240
[    4.750417] [] __vmalloc_node_range+0x10e/0x210
[    4.750464] [] ? bpf_prog_alloc+0x37/0xa0
[    4.750509] [] __vmalloc_node+0x66/0x70
[    4.750553] [] ? bpf_prog_alloc+0x37/0xa0
[    4.750598] [] __vmalloc+0x34/0x40
[    4.750642] [] ? bpf_prog_alloc+0x37/0xa0
[    4.750687] [] bpf_prog_alloc+0x37/0xa0
[    4.750732] [] bpf_prog_create+0x2c/0x90
[    4.750776] [] ? bsp_pm_check_init+0x11/0x11
[    4.750821] [] ptp_classifier_init+0x20/0x28
[    4.750866] [] ?
[    4.750933] [] sock_init+0x7c/0x83
[    4.750977] [] do_one_initcall+0xaa/0x200
[    4.751021] [] ? bsp_pm_check_init+0x11/0x11
[    4.751067] [] ? parse_args+0x2ad/0x540
[    4.751112] [] kernel_init_freeable+0x13a/0x1bc
[    4.751158] [] kernel_init+0x10/0xe0
[    4.751203] [] ? schedule_tail+0x11/0x50
[    4.751251] [] ret_from_kernel_thread+0x21/0x30
[    4.751297] [] ? rest_init+0x70/0x70

For reference, here is some memory info from an Intel box with 8G of physical
memory, booted with 1024M of dom0 memory on a 4.2 kernel. The range of bad page
pfns from syslog was 3fc1e to 3fc3b. The reported pages (f675a4b0 to f675a938)
all came with the same line about no mappings or references.

(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 000000000009a400 (usable)
(XEN) 000000000009a400 - 00000000000a0000 (reserved)
(XEN) 00000000000e0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 0000000030a48000 (usable)
(XEN) 0000000030a48000 - 0000000030a49000 (reserved)
(XEN) 0000000030a49000 - 00000000a27f4000 (usable)
(XEN) 00000000a27f4000 - 00000000a2ab4000 (reserved)
(XEN) 00000000a2ab4000 - 00000000a2fb4000 (ACPI NVS)
(XEN) 00000000a2fb4000 - 00000000a2feb000 (ACPI data)
(XEN) 00000000a2feb000 - 00000000a3000000 (usable)
(XEN) 00000000a3000000 - 00000000afa00000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fec00000 - 00000000fec01000 (reserved)
(XEN) 00000000fed00000 - 00000000fed04000 (reserved)
(XEN) 00000000fed10000 - 00000000fed1a000 (reserved)
(XEN) 00000000fed1c000 - 00000000fed20000 (reserved)
(XEN) 00000000fed84000 - 00000000fed85000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000ffc00000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 000000024e600000 (usable)
...
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 32-bit, PAE, lsb, paddr 0x1000000 -> 0x1ecc000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000240000000->0000000244000000 (238639 pages to be allocated)
(XEN) Init. ramdisk: 000000024ca2f000->000000024e5ffbd4
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: 00000000c1000000->00000000c1ecc000
(XEN) Init. ramdisk: 0000000000000000->0000000000000000
(XEN) Phys-Mach map: 00000000c1ecc000->00000000c1fcc000
(XEN) Start info: 00000000c1fcc000->00000000c1fcc4b4
(XEN) Page tables: 00000000c1fcd000->00000000c1fe4000
(XEN) Boot stack: 00000000c1fe4000->00000000c1fe5000
(XEN) TOTAL: 00000000c0000000->00000000c2400000
(XEN) ENTRY ADDRESS: 00000000c1ae8254
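
The figures quoted in this report can be cross-checked with a little
arithmetic. The Python sketch below is an illustration only: it assumes 4 KiB
pages and that a 1024M limit puts the end of the guest pfn range at pfn
0x40000. The names used (PAGE_SIZE, pfn_to_mib, RESERVED_BIT and so on) are
hypothetical and are not taken from the kernel or Xen sources; the reserved
bit value is simply the one printed in the log.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages (x86 PAE)
MIB = 1 << 20

# Values copied from the report.
first_bad_pfn = 0x3FC1E
last_bad_pfn = 0x3FC3B
first_page = 0xF675A4B0   # "page:" address reported for the first bad pfn
last_page = 0xF675A938    # "page:" address reported for the last bad pfn
page_flags = 0x3000400
RESERVED_BIT = 0x400      # as printed in the log: "flags: 0x400(reserved)"
dom0_mem_mib = 1024


def pfn_to_mib(pfn: int) -> float:
    """Offset of a page frame from pfn 0, in MiB."""
    return pfn * PAGE_SIZE / MIB


# Assumed end of the guest pfn range for dom0_mem=1024M.
end_pfn = dom0_mem_mib * MIB // PAGE_SIZE

# Number of pfns in the reported bad range (inclusive).
bad_count = last_bad_pfn - first_bad_pfn + 1

# Distance between the reported page addresses, per pfn step.
stride = (last_page - first_page) // (last_bad_pfn - first_bad_pfn)

print(f"end of guest range: pfn {end_pfn:#x}")
print(f"bad range: {bad_count} pages, "
      f"{pfn_to_mib(first_bad_pfn):.2f} to {pfn_to_mib(last_bad_pfn + 1):.2f} MiB")
print(f"pages between last bad pfn and end: {end_pfn - last_bad_pfn - 1}")
print(f"bytes per page entry: {stride}")
print(f"reserved bit set: {bool(page_flags & RESERVED_BIT)}")
```

Under those assumptions the bad range covers 30 pages starting roughly
1020 MiB into the guest, a few MiB short of the 1024M limit, and the two
reported page addresses are 40 bytes apart per pfn, which would be consistent
with them being consecutive page descriptors.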