From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts
Date: Mon, 24 Jan 2011 15:16:17 +0100
Message-ID: <4D3D89B1.30300@siemens.com>
References: <20110121233040.22262.68117.stgit@s20.home>
 <20110124093241.GA28654@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Alex Williamson , "kvm@vger.kernel.org" , "ddutile@redhat.com" , "mst@redhat.com" , "avi@redhat.com" , "chrisw@redhat.com"
To: Marcelo Tosatti
Return-path:
Received: from goliath.siemens.de ([192.35.17.28]:18864 "EHLO
 goliath.siemens.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751810Ab1AXOQb (ORCPT );
 Mon, 24 Jan 2011 09:16:31 -0500
In-Reply-To: <20110124093241.GA28654@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 2011-01-24 10:32, Marcelo Tosatti wrote:
> On Fri, Jan 21, 2011 at 04:48:02PM -0700, Alex Williamson wrote:
>> When doing device assignment, we use cpu_register_physical_memory() to
>> directly map the qemu mmap of the device resource into the address
>> space of the guest. The unadvertised feature of the register physical
>> memory code path on kvm, at least for this type of mapping, is that it
>> needs to allocate an index from a small, fixed array of memory slots.
>> Even better, if it can't get an index, the code aborts deep in the
>> kvm specific bits, preventing the caller from having a chance to
>> recover.
>>
>> It's really easy to hit this by hot adding too many assigned devices
>> to a guest (pretty easy to hit with too many devices at instantiation
>> time too, but the abort is slightly more bearable there).
>>
>> I'm assuming it's pretty difficult to make the memory slot array
>> dynamically sized. If that's not the case, please let me know as
>> that would be a much better solution.
>
> Its not difficult to either increase the maximum number (defined as
> 32 now in both qemu and kernel) of static slots, or support dynamic
> increases, if it turns out to be a performance issue.

Static limits are waiting to be hit again, just a bit later. I would
start thinking about a tree search as well because iterating over all
slots won't get faster over time.

>
> But you'd probably want to fix the abort for currently supported kernels
> anyway.

Yep. Depending on how soon we have a smarter solution in the kernel,
this fix may vary in pragmatism.

>
>> I'm not terribly happy with the solution in this series, it doesn't
>> provide any guarantees whether a cpu_register_physical_memory() will
>> succeed, only slightly better educated guesses.
>>
>> Are there better ideas how we could solve this? Thanks,
>
> Why can't cpu_register_physical_memory() return an error so you can
> fallback to slow mode or cancel device insertion?
>

Doesn't that registration happen much later than the call to
pci_register_bar? In any case, this will require significantly more
invasive work (but it would be much nicer if possible, no question).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
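
For illustration, the two ideas in this thread (a free-slot count the
caller can query, and a registration path that reports failure instead
of aborting) could look roughly like the sketch below. This is not qemu
or kernel code: struct slot_table, slots_free() and slot_register() are
hypothetical names made up for the example, and only the static limit
of 32 comes from the discussion above. The linear scan is deliberate,
since it is the per-slot iteration that a tree search would replace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Static limit quoted in the thread; everything else is hypothetical. */
#define MAX_SLOTS 32

struct slot {
    bool used;
    unsigned long start;   /* guest physical start address */
    unsigned long size;    /* length of the mapping in bytes */
};

struct slot_table {
    struct slot slots[MAX_SLOTS];
};

/* Count unused slots with a linear scan over the fixed array. */
static int slots_free(const struct slot_table *t)
{
    int n = 0;

    assert(t != NULL);
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        if (!t->slots[i].used)
            n++;
    }
    return n;
}

/*
 * Claim the first unused slot. Returns the slot index on success or
 * -ENOSPC when the table is full, so the caller can fall back or
 * cancel instead of the process aborting.
 */
static int slot_register(struct slot_table *t,
                         unsigned long start, unsigned long size)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (!t->slots[i].used) {
            t->slots[i] = (struct slot){
                .used = true, .start = start, .size = size,
            };
            return i;
        }
    }
    return -ENOSPC;
}
```

Note that a count from slots_free() is only a hint, as the cover letter
says: nothing reserves the slots between the query and the later
registration, so checking the return value of slot_register() is what
actually lets the caller recover.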