From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39535)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1Vq4lY-0007nw-K2
	for qemu-devel@nongnu.org; Mon, 09 Dec 2013 12:37:14 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1Vq4lS-0003uh-Ho
	for qemu-devel@nongnu.org; Mon, 09 Dec 2013 12:37:08 -0500
Received: from mx1.redhat.com ([209.132.183.28]:7690)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1Vq4iV-0003Ex-Ev
	for qemu-devel@nongnu.org; Mon, 09 Dec 2013 12:33:59 -0500
Message-ID: <52A5FEF5.1010504@redhat.com>
Date: Mon, 09 Dec 2013 18:33:41 +0100
From: Paolo Bonzini <pbonzini@redhat.com>
MIME-Version: 1.0
References: <1386143939-19142-1-git-send-email-gaowanlong@cn.fujitsu.com>
	<52A1939D.1080709@redhat.com> <20131206184936.GA10903@amt.cnet>
In-Reply-To: <20131206184936.GA10903@amt.cnet>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH V17 00/11] Add support for binding guest
 numa nodes to host numa nodes
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: drjones@redhat.com, ehabkost@redhat.com, lersek@redhat.com, qemu-devel@nongnu.org, lcapitulino@redhat.com, bsd@redhat.com, anthony@codemonkey.ws, hutao@cn.fujitsu.com, y-goto@jp.fujitsu.com, peter.huangpeng@huawei.com, afaerber@suse.de, Wanlong Gao <gaowanlong@cn.fujitsu.com>

On 06/12/2013 19:49, Marcelo Tosatti wrote:
>> > You'll have with your patches (without them it's worse of course):
>> > 
>> >    RAM offset    physical address   node 0
>> >    0-3840M       0-3840M            host node 0
>> >    4096M-4352M   4096M-4352M        host node 0
>> >    4352M-8192M   4352M-8192M        host node 1
>> >    3840M-4096M   8192M-8448M        host node 1
>> > 
>> > So only 0-3G and 5-8G are aligned, 3G-5G and 8G-8.25G cannot use
>> > gigabyte pages because they are split across host nodes.
> AFAIK the TLB caches virt->phys translations, so why are the specifics
> of a given phys address a factor in TLB caching?

The problem is that "-numa mem" receives memory sizes and these do not
take into account the hole below 4G.

Thus, two adjacent host-physical addresses (two adjacent ram_addr_t-s)
can map to guest-physical addresses that are far apart, be assigned to
different guest nodes, and from there to different host nodes.  In the
above example this happens for 3G-5G.
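
To make the arithmetic concrete, here is a rough Python sketch that
reproduces the table above.  The layout is an assumption inferred from
that table (8192M of guest RAM, a 256M hole at 3840M-4096M, two 4096M
guest nodes bound to host nodes 0 and 1); RAM_SEGMENTS, split_by_node
and usable_gigabytes are hypothetical names, not QEMU code.

```python
# All values are in megabytes.
# Assumed layout, inferred from the table quoted above:
# (ram_offset_start, ram_offset_end, guest_phys_start)
RAM_SEGMENTS = [
    (0, 3840, 0),        # RAM below the hole
    (3840, 4096, 8192),  # RAM displaced by the hole, remapped above 8G
    (4096, 8192, 4096),  # RAM above 4G
]
# Assumption: guest node i has 4096M and is bound to host node i.
NODE_SIZES = [4096, 4096]
GB = 1024


def split_by_node(segments, node_sizes):
    """Walk RAM in guest-physical order and hand it out node by node.

    Returns rows (ram_start, ram_end, gpa_start, gpa_end, host_node).
    """
    rows = []
    node = 0
    left = node_sizes[0]  # memory still to be assigned to the current node
    for ram_start, ram_end, gpa_start in sorted(segments, key=lambda s: s[2]):
        while ram_start < ram_end:
            if left == 0:
                node += 1
                left = node_sizes[node]
            chunk = min(left, ram_end - ram_start)
            rows.append((ram_start, ram_start + chunk,
                         gpa_start, gpa_start + chunk, node))
            ram_start += chunk
            gpa_start += chunk
            left -= chunk
    return rows


def usable_gigabytes(rows):
    """Guest-physical gigabytes fully covered by one row (so one host
    node, contiguous RAM) whose RAM offset has the same 1G alignment."""
    usable = []
    top = max(gpa_end for _, _, _, gpa_end, _ in rows)
    for k in range((top + GB - 1) // GB):
        lo, hi = k * GB, (k + 1) * GB
        for ram_start, _, gpa_start, gpa_end, _ in rows:
            if (gpa_start <= lo and hi <= gpa_end
                    and (ram_start - gpa_start) % GB == 0):
                usable.append(k)
                break
    return usable


rows = split_by_node(RAM_SEGMENTS, NODE_SIZES)
for row in rows:
    print("RAM %5dM-%5dM  phys %5dM-%5dM  host node %d" % row)
print("gigabytes usable for 1G pages:", usable_gigabytes(rows))
```

Under these assumptions the only usable gigabytes are 0-3G and 5G-8G,
which matches the table.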

On second thought, this is not particularly important, or at least not
yet.  It's not really possible to control the NUMA policy for
hugetlbfs-allocated memory, right?

>> > So rather than your patches, it seems simpler to just widen the PCI hole
>> > to 1G for i440FX and 2G for q35.
>> > 
>> > What do you think?
> 
> Problem is it's a guest visible change. To get 1GB TLB entries with
> "legacy guest visible machine types" (which require new machine types
> at the host side, but invisible to guest), that won't work.
> Windows registration invalidation etc.

Yeah, that's a tradeoff to make.

Paolo

> But for q35, sure.
>