From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33371)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1UnIT1-0001Ky-8Y for qemu-devel@nongnu.org;
 Thu, 13 Jun 2013 21:06:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1UnIT0-0004e4-38 for qemu-devel@nongnu.org;
 Thu, 13 Jun 2013 21:06:15 -0400
Received: from e31.co.us.ibm.com ([32.97.110.149]:39899)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1UnISw-0004cm-VC for qemu-devel@nongnu.org;
 Thu, 13 Jun 2013 21:06:14 -0400
Received: from /spool/local by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway:
 Authorized Use Only! Violators will be prosecuted for from ;
 Thu, 13 Jun 2013 19:06:09 -0600
Received: from d03relay05.boulder.ibm.com (d03relay05.boulder.ibm.com [9.17.195.107])
 by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id EC1B81FF002B for ;
 Thu, 13 Jun 2013 19:00:34 -0600 (MDT)
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170])
 by d03relay05.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP
 id r5E15l2R148584 for ; Thu, 13 Jun 2013 19:05:47 -0600
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1])
 by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP
 id r5E15kVm006677 for ; Thu, 13 Jun 2013 19:05:46 -0600
From: Anthony Liguori
In-Reply-To: <51BA4891.6020108@redhat.com>
References: <1370404705-4620-1-git-send-email-gaowanlong@cn.fujitsu.com>
 <1370404705-4620-2-git-send-email-gaowanlong@cn.fujitsu.com>
 <20130605134505.GS2580@otherpad.lan.raisama.net>
 <51B6D025.3040606@cn.fujitsu.com>
 <20130611134017.GC2895@otherpad.lan.raisama.net>
 <51B922FE.8090109@cn.fujitsu.com>
 <20130613125019.GI2895@otherpad.lan.raisama.net>
 <51BA4891.6020108@redhat.com>
Date: Thu, 13 Jun 2013 20:05:34 -0500
Message-ID: <87mwqttsbl.fsf@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Re: [Qemu-devel] [PATCH 2/2] Add monitor command
 mem-nodes
To: Paolo Bonzini, Eduardo Habkost
Cc: andre.przywara@amd.com, qemu-devel@nongnu.org, Wanlong Gao

Paolo Bonzini writes:

> On 13/06/2013 08:50, Eduardo Habkost wrote:
>> I believe an interface based on guest physical memory addresses is more
>> flexible (and even simpler!) than one that only allows binding of whole
>> virtual NUMA nodes.
>
> And "-numa node" is already one, what about just adding "mem-path=/foo"
> or "host_node=NN" suboptions?  Then "-mem-path /foo" would be a shortcut
> for "-numa node,mem-path=/foo".
>
> I even had patches to convert -numa to QemuOpts, I can dig them out if
> you're interested.

Ack.  This is a very reasonable thing to add.

Regards,

Anthony Liguori

>
> Paolo
>
>> (And I still don't understand why you are exposing QEMU virtual memory
>> addresses in the new command, if they are useless).
>>
>>
>>>>
>>>>
>>>>>> * The correspondence between guest physical address ranges and ranges
>>>>>>   inside the mapped files (so external tools could set the policy on
>>>>>>   those files instead of requiring QEMU to set it directly)
>>>>>>
>>>>>> I understand that your use case may require additional information and
>>>>>> additional interfaces. But if we provide the information above we will
>>>>>> allow external components to set the policy on the hugetlbfs files before
>>>>>> we add new interfaces required for your use case.
>>>>>
>>>>> But file-backed memory is not good for a host which has many
>>>>> virtual machines; in this situation, we can't handle anon THP yet.
>>>>
>>>> I don't understand what you mean, here. What prevents someone from using
>>>> file-backed memory with multiple virtual machines?
>>>
>>> If we use hugetlbfs-backed memory, we have to know how many virtual machines
>>> there are and how much memory each VM will use, then reserve these pages for them.
>>> And we even have to reserve more pages for external tools (numactl) to set
>>> memory policies. Even the memory reservation has its own memory policies.
>>> It's very hard to control it to get what we want to set.
>>
>> Well, it's hard because we don't even have tools to help on that, yet.
>>
>> Anyway, I understand that you want to make it work with THP as well. But
>> if THP works with tmpfs (does it?), people then could use exactly the
>> same file-based mechanisms with tmpfs and keep THP working.
>>
>> (Right now I am doing some experiments to understand how the system
>> behaves when using numactl on hugetlbfs and tmpfs, before and after
>> getting the files mapped).
>>
>>
>>>>
>>>>>
>>>>> And as I mentioned, the cross-NUMA-node access performance regression
>>>>> is caused by pci-passthrough. It's a long-standing bug, and we should
>>>>> backport the host memory pinning patch to old QEMU to resolve this
>>>>> performance problem, too.
>>>>
>>>> If it's a regression, what's the last version of QEMU where the bug
>>>> wasn't present?
>>>>
>>>
>>> As QEMU doesn't support host memory binding, I think
>>> this has been present since we added guest NUMA support, and
>>> pci-passthrough made it even worse.
>>
>> If the problem was always present, it is not a regression, is it?
>>