From: "Michael S. Tsirkin"
Date: Sun, 1 May 2011 13:29:41 +0300
To: Jan Kiszka
Cc: Alex Williamson, "qemu-devel@nongnu.org"
Subject: Re: [Qemu-devel] [PATCH] Fix phys memory client - pass guest physical address not region offset
Message-ID: <20110501102941.GG27816@redhat.com>
In-Reply-To: <4DBAE7C7.8010203@siemens.com>

On Fri, Apr 29, 2011 at 06:31:03PM +0200, Jan Kiszka wrote:
> On 2011-04-29 18:20, Alex Williamson wrote:
> > On Fri, 2011-04-29 at 18:07 +0200, Jan Kiszka wrote:
> >> On 2011-04-29 17:55, Alex Williamson wrote:
> >>> On Fri, 2011-04-29 at 17:45 +0200, Jan Kiszka wrote:
> >>>> On 2011-04-29 17:38, Alex Williamson wrote:
> >>>>> On Fri, 2011-04-29 at 17:29 +0200, Jan Kiszka wrote:
> >>>>>> On 2011-04-29 17:06, Michael S. Tsirkin wrote:
> >>>>>>> On Thu, Apr 28, 2011 at 09:15:23PM -0600, Alex Williamson wrote:
> >>>>>>>> When we're trying to get a newly registered phys memory client updated
> >>>>>>>> with the current page mappings, we end up passing the region offset
> >>>>>>>> (a ram_addr_t) as the start address rather than the actual guest
> >>>>>>>> physical memory address (target_phys_addr_t). If your guest has less
> >>>>>>>> than 3.5G of memory, these are coincidentally the same thing. If
> >>>>>>
> >>>>>> I think this broke even with < 3.5G, as phys_offset also encodes the
> >>>>>> memory type while region_offset does not. So everything became RAM this
> >>>>>> way, and no MMIO was announced.
> >>>>>>
> >>>>>>>> there's more, the region offset for the memory above 4G starts over
> >>>>>>>> at 0, so the set_memory client will overwrite its lower memory entries.
> >>>>>>>>
> >>>>>>>> Instead, keep track of the guest physical address as we're walking the
> >>>>>>>> tables and pass that to the set_memory client.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alex Williamson
> >>>>>>>
> >>>>>>> Acked-by: Michael S. Tsirkin
> >>>>>>>
> >>>>>>> Given all this, can you tell how much time it takes to hotplug a
> >>>>>>> device with, say, a 40G RAM guest?
> >>>>>>
> >>>>>> Why not collect pages of identical types and report them as one chunk
> >>>>>> once the type changes?
> >>>>>
> >>>>> Good idea, I'll see if I can code that up. I don't have a terribly
> >>>>> large system to test with, but with an 8G guest, it's surprisingly not
> >>>>> very noticeable. For vfio, I intend to only have one memory client, so
> >>>>> adding additional devices won't have to rescan everything.
> >>>>> The memory overhead of keeping the list that the memory client
> >>>>> creates is probably also low enough that it isn't worthwhile to tear
> >>>>> it all down if all the devices are removed. Thanks,
> >>>>
> >>>> What other clients register late? Do they need to know the whole memory
> >>>> layout?
> >>>>
> >>>> This full page table walk is likely a latency killer, as it happens under
> >>>> the global lock. Ugly.
> >>>
> >>> vhost and kvm are the only current users. kvm registers its client
> >>> early enough that there's no memory registered, so it doesn't really
> >>> need this replay through the page table walk. I'm not sure how vhost
> >>> works currently. I'm also looking at using this for vfio to register
> >>> pages for the iommu.
> >>
> >> Hmm, it looks like vhost is basically recreating the condensed, slotted
> >> memory layout from the per-page reports now. A bit inefficient,
> >> specifically as this happens per vhost device, no? And if vfio preferred
> >> a slotted format as well, you would end up copying vhost logic.
> >>
> >> That sounds to me like the qemu core should start tracking slots and
> >> report slot changes, not memory region registrations.
> >
> > I was thinking the same thing, but I think Michael is concerned that
> > we'll each need slightly different lists. This is also where kvm maps
> > to a fixed array of slots, which is known to blow up with too many
> > assigned devices. That needs to be fixed on both the kernel and qemu
> > side. Runtime overhead of the phys memory client is pretty minimal;
> > it's just the startup that thrashes set_memory.
>
> I'm not just concerned about the runtime overhead. This is code
> duplication. Even if the format of the lists differs, their structure
> should not: one entry per contiguous memory region, and some lists may
> track sparsely based on their interests.
>
> I'm sure the core could be taught to help the clients create and
> maintain such lists. We already have two types of users in tree, you
> are about to create another one, and Xen should have some need for it as
> well.
>
> Jan

Absolutely. There should be some common code to deal with slots.

> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
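
For illustration only, here is a minimal sketch of the coalescing Jan
suggests: replay the page table walk to a late-registering client one
contiguous run of identically typed pages at a time, instead of one page
at a time. This is not the actual patch; the callback shape roughly
mirrors the set_memory() interface of CPUPhysMemoryClient at the time,
but visit_page(), flush_chunk(), mem_chunk and PHYS_TYPE_MASK are
invented names, and the type-bit encoding is an assumption.

/*
 * Sketch, not QEMU code: merge contiguous pages of identical type into
 * one set_memory()-style callback per run.
 */
#include <stdint.h>

typedef uint64_t target_phys_addr_t;   /* guest physical address            */
typedef uint64_t ram_addr_t;           /* phys_offset; low bits assumed to  */
                                        /* encode the memory type            */
#define TARGET_PAGE_SIZE 4096
#define PHYS_TYPE_MASK   ((ram_addr_t)TARGET_PAGE_SIZE - 1)

typedef void (*set_memory_fn)(void *opaque, target_phys_addr_t start_addr,
                              ram_addr_t size, ram_addr_t phys_offset);

struct mem_chunk {
    target_phys_addr_t start;   /* guest physical start of the run   */
    ram_addr_t size;            /* length of the run in bytes        */
    ram_addr_t phys_offset;     /* phys_offset of the first page     */
    int valid;
};

static void flush_chunk(struct mem_chunk *c, set_memory_fn cb, void *opaque)
{
    if (c->valid) {
        cb(opaque, c->start, c->size, c->phys_offset);
        c->valid = 0;
    }
}

/* Called once per mapped page during the page table walk. */
static void visit_page(struct mem_chunk *c, set_memory_fn cb, void *opaque,
                       target_phys_addr_t addr, ram_addr_t phys_offset)
{
    if (c->valid &&
        addr == c->start + c->size &&
        (phys_offset & PHYS_TYPE_MASK) == (c->phys_offset & PHYS_TYPE_MASK) &&
        (phys_offset & ~PHYS_TYPE_MASK) ==
            ((c->phys_offset & ~PHYS_TYPE_MASK) + c->size)) {
        /* Same type and contiguous on both sides: extend the run. */
        c->size += TARGET_PAGE_SIZE;
        return;
    }
    /* Type change or discontinuity: report the old run, start a new one. */
    flush_chunk(c, cb, opaque);
    c->start = addr;
    c->size = TARGET_PAGE_SIZE;
    c->phys_offset = phys_offset;
    c->valid = 1;
}

The caller would do one final flush_chunk() after the walk, so even a
40G guest should end up with one callback per contiguous region rather
than one per page, which is also the slotted shape vhost and vfio want.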