From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [RFC PATCH 0/6] kvm: Growable memory slot array
Date: Tue, 04 Dec 2012 08:39:47 -0700
Message-ID: <1354635587.1809.422.camel@bling.home>
References: <20121203231912.3661.57179.stgit@bling.home>
	 <20121204114803.GH19514@redhat.com> <1354634515.1809.406.camel@bling.home>
	 <20121204153043.GA14176@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: mtosatti@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
To: Gleb Natapov <gleb@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:1191 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754106Ab2LDPjs (ORCPT <rfc822;kvm@vger.kernel.org>);
	Tue, 4 Dec 2012 10:39:48 -0500
In-Reply-To: <20121204153043.GA14176@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Tue, 2012-12-04 at 17:30 +0200, Gleb Natapov wrote:
> On Tue, Dec 04, 2012 at 08:21:55AM -0700, Alex Williamson wrote:
> > On Tue, 2012-12-04 at 13:48 +0200, Gleb Natapov wrote:
> > > On Mon, Dec 03, 2012 at 04:39:05PM -0700, Alex Williamson wrote:
> > > > Memory slots are currently a fixed resource with a relatively small
> > > > limit.  When using PCI device assignment in a qemu guest it's fairly
> > > > easy to exhaust the number of available slots.  I posted patches
> > > > exploring growing the number of memory slots a while ago, but it was
> > > > prior to caching memory slot array misses and therefore had potentially
> > > > poor performance.  Now that we do that, Avi seemed receptive to
> > > > increasing the memory slot array to arbitrary lengths.  I think we
> > > > still don't want to impose unnecessary kernel memory consumption on
> > > > guests not making use of this, so I present again a growable memory
> > > > slot array.
> > > > 
> > > > A couple notes/questions; in the previous version we had a
> > > > kvm_arch_flush_shadow() call when we increased the number of slots.
> > > > I'm not sure if this is still necessary.  I had also made the x86
> > > > specific slot_bitmap dynamically grow as well and switch between a
> > > > direct bitmap and indirect pointer to a bitmap.  That may have
> > > > contributed to needing the flush.  I haven't done that yet here
> > > > because it seems like an unnecessary complication if we have a max
> > > > on the order of 512 or 1024 entries.  A bit per slot isn't a lot of
> > > > overhead.  If we want to go more, maybe we should make it switch.
> > > > That leads to the final question: we need an upper bound since this
> > > > does allow consumption of extra kernel memory, so what should it be?  A
> > > This is the most important question :) Do we want to have 1000s of
> > > them, or is 100 enough?
> > 
> > We can certainly hit respectable numbers of assigned devices in the
> > hundreds.  Worst case is 8 slots per assigned device, typical case is 4
> > or less.  So 512 slots would more or less guarantee 64 devices (we do
> > need some slots for actual memory), and more typically allow at least
> > 128 devices.  Philosophically, supporting a full PCI bus, 256 functions,
> > 2048 slots, is an attractive target, but it's probably not practical.
> > 
> > I think on x86 a slot is 72 bytes w/ alignment padding, so a maximum of
> > 36k @512 slots.
> > 
> > >  Also what about changing kvm_memslots->memslots[]
> > > array to be "struct kvm_memory_slot *memslots[KVM_MEM_SLOTS_NUM]"? It
> > > will save us a good amount of memory for unused slots.
> > 
> > I'm not following where that results in memory savings.  Can you
> > clarify?  Thanks,
> > 
> We will waste sizeof(void*) for each unused slot instead of
> sizeof(struct kvm_memory_slot).

Ah, of course.  That means for 512 slots we're wasting a full page
(512 * 8 bytes) just for the pointers, whereas 56 slots at 72 bytes each
fit in the same space.  Given that most users get by just fine w/ 32
slots, I don't think that's a win
in the typical case.  Maybe if we want to support sparse arrays, but a
tree would probably be better at that point.  A drawback of the growable
array is that userspace can subvert any savings by using slot N-1 first,
but that's why we put a limit at a reasonable size.  Thanks,

Alex
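
For reference, the trade-off above can be put into rough numbers with a
back-of-the-envelope sketch.  Everything here is an assumption for
illustration: the 72-byte slot size is the estimate quoted earlier in
this thread, pointers are taken to be 8 bytes and pages 4096 bytes, and
the helper names are made up rather than taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed figures from this thread, not read from kernel headers. */
#define SLOT_SIZE  72u    /* approx. sizeof(struct kvm_memory_slot) on x86 */
#define PTR_SIZE   8u     /* sizeof(void *) on a 64-bit host */
#define PAGE_BYTES 4096u

/*
 * Growable inline array: memory is paid for every slot up to the
 * highest slot count in use, populated or not.
 */
size_t inline_array_bytes(size_t highest_used)
{
	return highest_used * SLOT_SIZE;
}

/*
 * Fixed array of pointers: memory is paid for max_slots pointers up
 * front, plus one slot structure per populated entry.
 */
size_t pointer_array_bytes(size_t max_slots, size_t populated)
{
	assert(populated <= max_slots);
	return max_slots * PTR_SIZE + populated * SLOT_SIZE;
}

/* Number of whole inline slots that fit in one page. */
size_t slots_per_page(void)
{
	return PAGE_BYTES / SLOT_SIZE;
}
```

Under these assumptions a typical 32-slot guest costs 2304 bytes with
the growable array versus 6400 bytes with a 512-entry pointer array.
The picture flips if userspace touches slot 511 first: 36864 bytes for
the growable array versus 4168 bytes for the pointer array.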