From: Dario Faggioli
Subject: Re: [libvirt] [PATCH 1/4] libxl: implement NUMA capabilities reporting
Date: Thu, 4 Jul 2013 18:53:12 +0200
Message-ID: <1372956792.10336.93.camel@Solace>
In-Reply-To: <51D206FB.8000908@suse.com>
To: Jim Fehlig
Cc: Jan Beulich, xen-devel

[Moving the conversation to xen-devel and adding Jan, as that seems
more appropriate]

[Jan, this came up as I'm implementing some NUMA bits in libvirt but,
as you see, the core of Jim's question is purely about Xen]

On Mon, 2013-07-01 at 16:47 -0600, Jim Fehlig wrote:
> On my non-NUMA test machine I have the cell memory reported as
>
> 9175040
>
Which is 8960, if divided by 1024, so at least it's consistent.
However...

> The machine has 8G of memory, running xen 4.3 rc6, with dom0_mem=1024M. 'xl
> info --numa' says
>
> total_memory           : 8190
> ...
> numa_info              :
> node:    memsize    memfree    distances
>    0:      8960       7116      10
>
> Why is the node memsize > total_memory?

Mmm... Interesting question. I really never paid attention to this...
Jan (or anyone else), is that something known and/or expected?

I went checking this down in Xen, and here's what I found.

total_memory is:

    info.total_pages/((1 << 20) / vinfo->pagesize)

where 'info' is what libxl_get_physinfo() provides. In turn,
libxl_get_physinfo() wraps xc_physinfo(), which issues
XEN_SYSCTL_physinfo. That sysctl reports total_pages, which is set,
down in __start_xen(), to the number of pages found while parsing the
E820 map (looking for RAM blocks).

OTOH, memsize comes from libxl_get_numainfo(), which wraps
xc_numainfo(), which issues XEN_SYSCTL_numainfo, which puts in memsize
what node_spanned_pages() says. That seems to come, on a NUMA box, from
the parsing of SRAT, and on a non-NUMA box, from just
(end_pfn - start_pfn) (in pages, of course).

Anyway, on my NUMA box, I see something similar to what Jim sees on a
non-NUMA one:

# xl info -n
...
total_memory           : 12285
...
numa_info              :
node:    memsize    memfree    distances
   0:      6144         23      10,20
   1:      6720        104      20,10

Where 6144+6720=12864 > 12285

Looking at what Xen says during boot, I see this (the [*], [+], [=]
and [|] are mine):

(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000096000 (usable)
(XEN)  00000000000f0000 - 0000000000100000 (reserved)   [*]
(XEN)  0000000000100000 - 00000000dbdf9c00 (usable)
(XEN)  00000000dbdf9c00 - 00000000dbe4bc00 (ACPI NVS)   [+]
(XEN)  00000000dbe4bc00 - 00000000dbe4dc00 (ACPI data)  [=]
(XEN)  00000000dbe4dc00 - 00000000dc000000 (reserved)   [|]
(XEN)  00000000f8000000 - 00000000fd000000 (reserved)
(XEN)  00000000fe000000 - 00000000fed00400 (reserved)
(XEN)  00000000fee00000 - 00000000fef00000 (reserved)
(XEN)  00000000ffb00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000324000000 (usable)
...
(XEN) System RAM: 12285MB (12580412kB)

And my math says that 12285MB is the sum of the areas marked as
(usable), i.e., I guess, what during parsing is recognised as
E820_RAM... which makes total sense.
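
To make that sum easy to re-check, here is a minimal Python sketch of the arithmetic. It is only an illustration, not how Xen computes it: the 4096-byte page size and the rounding of each region to whole pages (start rounded up, end rounded down) are my assumptions, and the names PAGE_SIZE, USABLE and usable_pages() are made up for the example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# (start, end) of the regions marked "(usable)" in the Xen-e820 RAM map
USABLE = [
    (0x0000000000000000, 0x0000000000096000),
    (0x0000000000100000, 0x00000000dbdf9c00),
    (0x0000000100000000, 0x0000000324000000),
]


def usable_pages(regions, page_size=PAGE_SIZE):
    """Count the whole pages that fit inside each region.

    Assumption: a partially covered page is not counted, so the start
    is rounded up and the end is rounded down to a page boundary.
    """
    total = 0
    for start, end in regions:
        first = -(-start // page_size)  # ceiling division
        last = end // page_size         # floor division
        total += max(0, last - first)
    return total


total_pages = usable_pages(USABLE)
total_kb = total_pages * PAGE_SIZE // 1024
# Same expression as the total_memory formula: pages / (pages per MB)
total_mb = total_pages // ((1 << 20) // PAGE_SIZE)

print(total_kb, total_mb)  # 12580412 12285
```

With those assumptions the result matches the "System RAM: 12285MB (12580412kB)" line. Without the page rounding the sum comes out 3kB larger, because 0xdbdf9c00 is not page aligned.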

A bit below that I have this:

(XEN) SRAT: Node 1 PXM 1 0-dc000000
(XEN) SRAT: Node 1 PXM 1 100000000-1a4000000
(XEN) SRAT: Node 0 PXM 0 1a4000000-324000000

Which, after the needed calculations, gives exactly the same results as
the memsize-s in `xl info -n'.

Now, if I add up the [*], [+], [=] and [|] regions above, and then
subtract that from node 1's PXMs, I see that node 1 has only ~6141MB of
usable RAM, instead of 6720MB. And in fact, 6720-6141=579, just as much
as 12864-12285=579.

So, if I haven't messed up the calculations, it looks like Xen, when
reporting to the upper layers the amount of memory it has available,
does filter out the non-RAM regions if this happens via
XEN_SYSCTL_physinfo (i.e., by parsing E820), while it does not do that
if this happens via XEN_SYSCTL_numainfo (i.e., by parsing ACPI SRAT).

What I'm not sure about is whether or not that was something
known/intended and whether or not it is something we should be
concerned about.

Thanks and Regards,
Dario

--
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
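
P.S. For anyone wanting to redo the SRAT arithmetic, a small Python sketch follows. It assumes that the reported memsize is the span from the lowest start to the highest end of each node's SRAT ranges, holes included, which is my reading of node_spanned_pages() and is consistent with the numbers above. The names SRAT, TOTAL_MEMORY_MB and spanned_mb() are made up for the example.

```python
MB = 1 << 20

# (node, start, end) taken from the "(XEN) SRAT:" boot lines
SRAT = [
    (1, 0x0, 0xdc000000),
    (1, 0x100000000, 0x1a4000000),
    (0, 0x1a4000000, 0x324000000),
]

TOTAL_MEMORY_MB = 12285  # total_memory from `xl info -n`


def spanned_mb(srat):
    """Per-node span in MB, from lowest start to highest end.

    Assumption: holes between a node's ranges are counted, as the
    memsize values reported by `xl info -n` suggest.
    """
    spans = {}
    for node, start, end in srat:
        lo, hi = spans.get(node, (start, end))
        spans[node] = (min(lo, start), max(hi, end))
    return {node: (hi - lo) // MB for node, (lo, hi) in spans.items()}


memsize = spanned_mb(SRAT)
excess = sum(memsize.values()) - TOTAL_MEMORY_MB

print(memsize[0], memsize[1], excess)  # 6144 6720 579
```

Node 1's two ranges add up to only 6144MB by themselves; the 6720MB figure appears once the 576MB hole between 0xdc000000 and 0x100000000 is included in the span.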