Date: Fri, 15 Jul 2016 17:18:29 +0200
From: Greg Kurz
Message-ID: <20160715171829.0a9dfd16@bahia.lan>
In-Reply-To: <5a9731af-6636-7031-4d50-20815cdfb5e0@redhat.com>
References: <1468570225-14101-1-git-send-email-thuth@redhat.com> <20160715083530.GA14615@voom.fritz.box> <5a9731af-6636-7031-4d50-20815cdfb5e0@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism
To: Thomas Huth
Cc: David Gibson, qemu-ppc@nongnu.org, qemu-devel@nongnu.org

On Fri, 15 Jul 2016 14:28:44 +0200
Thomas Huth wrote:

> On 15.07.2016 10:35, David Gibson wrote:
> > On Fri, Jul 15, 2016 at 10:10:25AM +0200, Thomas Huth wrote:
> >> Commit 86b50f2e1bef ("Disable huge page support if it is not available
> >> for main RAM") already made sure that huge page support is not announced
> >> to the guest if the normal RAM of non-NUMA configurations is not backed
> >> by a huge page filesystem. However, there is one more case that can go
> >> wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured
> >> with huge page support (and only the memory of a DIMM is configured with
> >> it). When QEMU is started with the following command line for example,
> >> the Linux guest currently crashes because it is trying to use huge pages
> >> on a memory region that does not support huge pages:
> >>
> >>  qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
> >>    memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \
> >>    -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
> >>    -numa node,nodeid=0 -numa node,nodeid=1
> >>
> >> To fix this issue, we've got to make sure to disable huge page support,
> >> too, when there is a NUMA node that is not using a memory backend with
> >> huge page support.
> >>
> >> Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
> >> Signed-off-by: Thomas Huth
> >> ---
> >>  target-ppc/kvm.c | 10 +++++++---
> >>  1 file changed, 7 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> >> index 884d564..7a8f555 100644
> >> --- a/target-ppc/kvm.c
> >> +++ b/target-ppc/kvm.c
> >> @@ -389,12 +389,16 @@ static long getrampagesize(void)
> >>
> >>      object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
> >>
> >> -    if (hpsize == LONG_MAX) {
> >> +    if (hpsize == LONG_MAX || hpsize == getpagesize()) {
> >>          return getpagesize();
> >>      }
> >>
> >> -    if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
> >> -        /* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! */
> >> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
> >> +     * memory-backend, then there is at least one node using "normal"
> >> +     * RAM. And since normal RAM has not been configured with "-mem-path"
> >> +     * (what we've checked earlier here already), we can not use huge pages!
> >> +     */
> >> +    if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {
> >
> > Is that second clause sufficient, or do you need to loop through and
> > check the memdev of every node?
>
> Checking the first entry should be sufficient. QEMU forces you to
> specify either a memory backend for all NUMA nodes (which we should have
> looked at during the object_child_foreach() some lines earlier), or you
> must not specify a memory backend for any NUMA node at all. You can not
> mix the settings, so checking numa_info[0] is enough.
>
>  Thomas
>
>

And what happens if we specify a hugepage memdev backend to one of the
nodes and a regular RAM memdev backend to the other?

I actually wanted to try that but I hit an assertion, which isn't related
to this patch I think:

qemu-system-ppc64: memory.c:1934: memory_region_add_subregion_common:
 Assertion `!subregion->container' failed.

So I tried to trick the logic you are trying to fix the other way round:

-mem-path /dev/hugepages \
-m 1G,slots=4,maxmem=32G \
-object memory-backend-ram,policy=default,size=1G,id=mem-mem1 \
-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 \
-smp 2 \
-numa node,nodeid=0 -numa node,nodeid=1

The guest fails the same way as before your patch: the hugepage size is
advertised to the guest, but the numa node is associated to regular ram.

--
Greg
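
For illustration only: both failing setups in this thread suggest the same
rule, namely that no page size should be advertised that is larger than the
smallest page size backing any RAM the guest can touch. A minimal sketch of
that rule in C follows. It is a standalone model and not QEMU code: struct
ram_region and max_safe_page_size() are hypothetical names, and the caller is
assumed to have collected the backing page size of every region (main RAM and
each memory backend) beforehand.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for illustration; not a QEMU data structure. */
struct ram_region {
    const char *name;   /* label only, e.g. "main RAM" or "mem-mem1" */
    long page_size;     /* page size of the memory backing this region */
};

/*
 * Return the largest page size that every region can support, i.e. the
 * minimum backing page size across all regions. With no regions at all,
 * fall back to the base page size supplied by the caller.
 */
long max_safe_page_size(const struct ram_region *regions, size_t n,
                        long base_page_size)
{
    long min_size = LONG_MAX;

    for (size_t i = 0; i < n; i++) {
        if (regions[i].page_size < min_size) {
            min_size = regions[i].page_size;
        }
    }
    return (min_size == LONG_MAX) ? base_page_size : min_size;
}
```

Modelling the second command line above as two regions, main RAM from
-mem-path at an assumed 16 MiB and mem-mem1 on memory-backend-ram at an
assumed 64 KiB, the function returns the 64 KiB value, so no huge page size
would be advertised to the guest.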