From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59645) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VgyLm-00021m-9v for qemu-devel@nongnu.org; Thu, 14 Nov 2013 09:57:01 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VgyLf-0008GS-01 for qemu-devel@nongnu.org; Thu, 14 Nov 2013 09:56:54 -0500 Received: from mail-we0-f176.google.com ([74.125.82.176]:58893) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VgyLe-0008GC-Qf for qemu-devel@nongnu.org; Thu, 14 Nov 2013 09:56:46 -0500 Received: by mail-we0-f176.google.com with SMTP id w62so2052634wes.21 for ; Thu, 14 Nov 2013 06:56:45 -0800 (PST) Message-ID: <5284E4AB.2050900@cloudius-systems.com> Date: Thu, 14 Nov 2013 16:56:43 +0200 From: Avi Kivity MIME-Version: 1.0 References: <1384370613-6433-1-git-send-email-mst@redhat.com> <1384370613-6433-6-git-send-email-mst@redhat.com> <20131114144037.GA6635@redhat.com> In-Reply-To: <20131114144037.GA6635@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 5/7] exec: memory radix tree page level?compression List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org On 11/14/2013 04:40 PM, Michael S. Tsirkin wrote: > On Thu, Nov 14, 2013 at 08:54:11AM +0000, Avi Kivity wrote: >> Michael S. Tsirkin redhat.com> writes: >> >>> At the moment, memory radix tree is already variable width, but it can >>> only skip the low bits of address. >>> >>> This is efficient if we have huge memory regions but inefficient if we >>> are only using a tiny portion of the address space. >>> >>> After we have built up the map, detect >>> configurations where a single L2 entry is valid. >>> >>> We then speed up the lookup by skipping one or more levels. >>> In case any levels were skipped, we might end up in a valid section >>> instead of erroring out. We handle this by checking that >>> the address is in range of the resulting section. >>> >> >> I think this is overkill. It can be done in a simpler way as follows: >> >> >> phys_page_find(RadixTree* tr, hwaddr index, ...) >> { >> if (index & rt->invalid_index_mask) { >> // not found >> } >> lp = rt->root; >> for (i = rt->nb_levels - 1; i >= 0 && !lp.is_leaf; --i) { >> ... >> >> This exploits the fact the lower portion of the address space is always >> filled, at least in the cases that matter to us. >> >> >> >> >> > Basically skip unused high bits? > Sure. > In fact I think both optimizations can be combined. Not much value in combining them -- the variable level tree check will be dominated by the level skip logic. IMO however skipping intermediate levels will be too rare to justify the complexity and the doubling of the page table size -- it can only happen in iommu setups that place memory in very high addresses. These ought to be rare.