From: Gleb Natapov
Subject: Re: Partial huge page backing with KVM/qemu
Date: Wed, 9 Oct 2013 10:09:31 +0300
Message-ID: <20131009070931.GI3574@redhat.com>
References: <7F858974825DC945874D45551C161B941156CCC4@eusaamb105.ericsson.se> <20130825085210.GI8218@redhat.com> <7F858974825DC945874D45551C161B941156D883@eusaamb105.ericsson.se> <20130826080209.GX8218@redhat.com>
In-Reply-To: <20130826080209.GX8218@redhat.com>
To: Chris Leduc
Cc: "kvm@vger.kernel.org"

On Mon, Aug 26, 2013 at 11:02:09AM +0300, Gleb Natapov wrote:
> On Mon, Aug 26, 2013 at 02:09:57AM +0000, Chris Leduc wrote:
> >
> >
> > > -----Original Message-----
> > > From: Gleb Natapov [mailto:gleb@redhat.com]
> > > Sent: Sunday, August 25, 2013 1:52 AM
> > > To: Chris Leduc
> > > Cc: kvm@vger.kernel.org
> > > Subject: Re: Partial huge page backing with KVM/qemu
> > >
> > > On Sat, Aug 24, 2013 at 12:32:07AM +0000, Chris Leduc wrote:
> > > > Hi - In a KVM/qemu environment is it possible for the host to back
> > > > only a portion of the guest's memory with huge pages? In some
> > > > situations it may not be desirable to back the entirety of a guest's
> > > > memory with huge pages (as can be done via libvirt memoryBacking
> > > > option).
> > > What are those situations?
> > For example to limit a guest with 64GB of total memory to use 4GB of
> > huge pages for fast lookup memory. This takes advantage of the 4 TLB
> > entries for 1G pages on a Sandy/Ivy Bridge processor to ensure a page
> > walk is never necessary for this fast memory.
> > An example is a high performance data plane application. The remainder
> > of the less frequently accessed memory can be in normal pages.
> >
> When two level paging (EPT) is in use combined mappings are stored in
> TLB, not linear mappings (see 28.3.1). I am not sure those will ever
> use 1G TLB. Not with KVM anyway since KVM does not use 1G pages for EPT
> tables since the chance to get as much of contiguous memory on a running
> system is close to zero.
>
> > > > What would be very useful is to request huge pages in the guest,
> > > > either at boot time or dynamically, and have the host back them with
> > > > physical huge pages, but not back the rest of the normal page guest
> > > > memory with huge pages from the host.
> > > >
> > > > The equivalent in Xen is setting allowsuperpage=1 on the hypervisor
> > > > boot line.
> > > >
> > > As far as I can tell this disables/enables use of huge pages by XEN vm,
> > > not something you say you want.
> >
> > The Xen documentation is not clear on this, but in practice this flag
> > allows the host to back up guest huge page requests with physical huge
> > pages. So the guest could for example add hugepages=N to its boot line
> > and these pages would be backed in the host with corresponding physical
> > huge pages.
> Allow me to be sceptical on this :) With shadow paging sure, same is true
> for KVM: if guest maps memory with huge page and memory is contiguous on
> a host too KVM will create huge shadow page, but with two level paging
> hypervisor has no idea how guest's page tables look. The best it can do
> is to map entire guest physical memory using huge pages.
>
> >
> > From experimentation with KVM, requesting hugepages at guest boot time
> > (without memory backing enabled) will result in guest hugepages backed
> > by host normal pages.
> What do you mean by "requesting hugepages at guest boot time" and how
> have you checked that guest hugepages backed by host normal pages?
> Do you have THP enabled? Without THP you need to back guest's memory with
> huge pages using "-mem-path /hugepagesfs". But again only 2MB pages are
> supported.
>

Just wanted to correct myself here. KVM does support 1G pages with
"-mem-path /hugepagesfs", so to back only part of a guest's memory with
1G pages QEMU needs to be changed to allocate part of the guest's memory
from hugetlbfs and part as regular anonymous memory. The guest needs to
know what part of its physical memory is allocated from hugetlbfs and map
it using 1G pages too. I do not see anything in the Intel SDM that says
whether such a combined (EPT + guest's PT) mapping will be stored in 1G
TLB entries, but it probably should be. Of course the TLB is shared
between guests, so if you have two guests with 4 1G pages each the TLB
will not be able to hold translations for both guests.

--
			Gleb.
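
The arithmetic behind the correction above can be sketched as follows. This is
only an illustration in Python, not QEMU code: the helper names plan_backing
and fits_in_tlb are hypothetical, placing the hugetlbfs range at guest-physical
offset 0 is an assumption, and the default of 4 TLB entries for 1G pages is
the Sandy/Ivy Bridge figure quoted earlier in the thread. The TLB check is a
deliberate simplification that ignores any 1G mappings used by the host itself.

```python
GIB = 1 << 30  # bytes in 1 GiB


def plan_backing(total_bytes, huge_bytes, huge_page_bytes=GIB):
    """Split guest RAM into a hugetlbfs-backed range and an anonymous range.

    Hypothetical helper: the huge-page range is placed at guest-physical
    offset 0 purely for illustration.
    """
    if huge_bytes > total_bytes:
        raise ValueError("huge-page portion exceeds guest memory")
    if huge_bytes % huge_page_bytes:
        raise ValueError("huge-page portion must be a multiple of the page size")
    return {
        "hugetlbfs": (0, huge_bytes),                         # (offset, length)
        "anonymous": (huge_bytes, total_bytes - huge_bytes),  # (offset, length)
        "huge_pages": huge_bytes // huge_page_bytes,
    }


def fits_in_tlb(huge_pages_per_guest, tlb_entries=4):
    """True if all guests' 1G translations could be resident at the same time.

    Simplified model: one shared TLB with tlb_entries slots for 1G pages.
    """
    return sum(huge_pages_per_guest) <= tlb_entries


plan = plan_backing(64 * GIB, 4 * GIB)
print(plan["huge_pages"])   # 4
print(fits_in_tlb([4]))     # True: one guest with four 1G pages
print(fits_in_tlb([4, 4]))  # False: two such guests need eight entries
```

With the 64GB guest and 4GB of fast memory from the example in this thread,
the plan yields four 1G pages and 60GB of anonymous memory, and the second
check reproduces the point about two guests competing for the same entries.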