From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 May 2011 21:50:19 +0300
From: Gleb Natapov
Message-ID: <20110519185019.GK27310@redhat.com>
References: <4DD3C5B9.1080908@redhat.com> <4DD420A5.2020606@web.de>
 <4DD51CF3.5040306@codemonkey.ws> <4DD51D36.7040504@siemens.com>
 <20110519173923.GF27310@redhat.com> <4DD55D5B.30008@web.de>
 <20110519182203.GH27310@redhat.com> <4DD56126.20808@web.de>
 <20110519184056.GJ27310@redhat.com> <4DD56543.9020404@web.de>
In-Reply-To: <4DD56543.9020404@web.de>
Subject: Re: [Qemu-devel] [RFC] Memory API
To: Jan Kiszka
Cc: Avi Kivity, qemu-devel

On Thu, May 19, 2011 at 08:45:23PM +0200, Jan Kiszka wrote:
> On 2011-05-19 20:40, Gleb Natapov wrote:
> > On Thu, May 19, 2011 at 08:27:50PM +0200, Jan Kiszka wrote:
> >> On 2011-05-19 20:22, Gleb Natapov wrote:
> >>> On Thu, May 19, 2011 at 08:11:39PM +0200, Jan Kiszka wrote:
> >>>> On 2011-05-19 19:39, Gleb Natapov wrote:
> >>>>> On Thu, May 19, 2011 at 03:37:58PM +0200, Jan Kiszka wrote:
> >>>>>> On 2011-05-19 15:36, Anthony Liguori wrote:
> >>>>>>> On 05/18/2011 02:40 PM, Jan Kiszka wrote:
> >>>>>>>> On 2011-05-18 15:12, Avi Kivity wrote:
> >>>>>>>>> void cpu_register_memory_region(MemoryRegion *mr,
> >>>>>>>>>                                 target_phys_addr_t addr);
> >>>>>>>>
> >>>>>>>> OK, let's allow overlapping, but make it explicit:
> >>>>>>>>
> >>>>>>>> void cpu_register_memory_region_overlap(MemoryRegion *mr,
> >>>>>>>>                                         target_phys_addr_t addr,
> >>>>>>>>                                         int priority);
> >>>>>>>
> >>>>>>> The device doesn't actually know how overlapping is handled. This
> >>>>>>> is based on the bus hierarchy.
> >>>>>>
> >>>>>> Devices don't register their regions, buses do.
> >>>>>>
> >>>>> Today a PCI device may register a region that overlaps with any other
> >>>>> registered memory region without even knowing it. The guest can write
> >>>>> any RAM address into a PCI BAR and that RAM address will become an
> >>>>> MMIO area. More interesting is what happens when the guest reprograms
> >>>>> the PCI BAR to another address - the RAM that was at the previous
> >>>>> address just disappears. Obviously this is crazy behaviour, but the
> >>>>> question is how do we want to handle it? One option is to disallow
> >>>>> such overlapping registrations, another is to restore the RAM mapping
> >>>>> after the PCI BAR is reprogrammed. If we choose the second one, PCI
> >>>>> will not know that _overlap() should be called.
> >>>>
> >>>> BARs may overlap with other BARs or with RAM. That's well-known, so PCI
> >>>> bridges need to register their regions with the _overlap variant
> >>>> unconditionally. In contrast to the current PhysPageDesc mechanism, the
> >>> With what priority? If it needs to call _overlap unconditionally, why
> >>> not always call _overlap and drop the non-_overlap variant?
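(To make sure we are talking about the same thing, here is how I read the
proposal. The two prototypes are the ones quoted above; the callers below
are purely illustrative - the device structures, fields, addresses and
priority values are made up for the example, assuming 0 is the
default/lowest priority:)

/* The two registration variants under discussion, as proposed up-thread: */
void cpu_register_memory_region(MemoryRegion *mr, target_phys_addr_t addr);
void cpu_register_memory_region_overlap(MemoryRegion *mr,
                                        target_phys_addr_t addr,
                                        int priority);

/* Device with a hard-wired address: an overlap here would be a bug in the
 * device/machine model, so the plain variant lets the core report it.
 * 'dev' and the address are invented for the example. */
cpu_register_memory_region(&dev->mmio, 0xfed00000);

/* PCI bridge mapping a BAR: guest-programmed overlaps with RAM or other
 * BARs are legal, so the bridge would always use the _overlap variant with
 * the default (lowest) priority. 'd', 'bar' and 'bar_addr' are invented. */
cpu_register_memory_region_overlap(&d->bar_region[bar], bar_addr, 0);

/* Per-CPU code overriding the APIC page above whatever the host bridge
 * feeds in from below: default+1, applied at the CPU level. */
cpu_register_memory_region_overlap(&cpu->apic_region, 0xfee00000, 1);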
> >>
> >> Because we should catch accidental overlaps in all those non-PCI devices
> >> with hard-wired addressing. That's a bug in the device/machine model and
> >> should be reported as such by QEMU.
> > Why should we complicate the API to catch unlikely errors? If you want to
> > debug that, add the capability to dump the memory map from the monitor.
> 
> Because we need to switch tons of code that so far saw a fairly
> different reaction of the core to overlapping regions.
> 
How so? Today, if there is an accidental overlap, the device will not
function properly. With the new API it will be the same.
> > 
> >> 
> >>> 
> >>>> new region management will not cause any harm to overlapping regions so
> >>>> that they can "recover" when the overlap is gone.
> >>>>
> >>>>>
> >>>>> Another example may be the APIC region and PCI. They overlap, but
> >>>>> neither the CPU nor PCI knows about it.
> >>>>
> >>>> And they do not need to. The APIC regions will be managed by the per-CPU
> >>>> region management, reusing the toolbox we need for all bridges. It will
> >>>> register the APIC page with a priority higher than the default one, thus
> >>>> overriding everything that comes from the host bridge. I think that
> >>>> reflects pretty well real machine behaviour.
> >>>>
> >>> What is "higher"? How does it know that the priority is high enough?
> >>
> >> Because no one else manages priorities at a specific hierarchy level.
> >> There is only one.
> >>
> > PCI and CPU are on different hierarchy levels. PCI is under the PIIX and
> > the CPU is on the system bus.
> 
> The priority for the APIC mapping will be applied at CPU level, of
> course. So it will override everything, not just PCI.
> 
So you do not need an explicit priority, because the place in the hierarchy
implicitly provides you with one.
> > 
> >>> I thought, from reading other replies, that priorities are meaningful
> >>> only on the same hierarchy level (which kinda makes sense), but now you
> >>> are saying that you will override a PCI address from another part of
> >>> the topology?
> >>
> >> Everything from below in the hierarchy is fed in with the default
> >> priority, the lowest one. So to let some region created at this level
> >> override those regions, just pick default+1. If you want to create more
> >> overlay levels (can't imagine a good scenario, though), pick
> >> default+1..default+n. It's really that simple.
> >>
> > Except that PCI and CPU are not on the same level.
> 
> See above.
> 
> Jan
> 
-- 
			Gleb.