Message-ID: <4D35BA22.7060602@linux.vnet.ibm.com>
Date: Tue, 18 Jan 2011 10:04:50 -0600
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
References: <4D2B6CB5.9050602@codemonkey.ws> <4D2B74D8.4080309@web.de>
 <4D2B8662.9060909@web.de> <4D2C60FB.7030009@linux.vnet.ibm.com>
 <4D2D80ED.8030405@redhat.com> <4D2D82EE.20002@siemens.com>
 <4D35A39A.8000801@siemens.com> <4D35ABF8.9050700@linux.vnet.ibm.com>
 <4D35B521.3090601@siemens.com> <4D35B6DD.1020005@linux.vnet.ibm.com>
 <4D35B963.7000605@siemens.com>
In-Reply-To: <4D35B963.7000605@siemens.com>
List-Id: qemu-devel.nongnu.org
To:
Jan Kiszka
Cc: "kvm@vger.kernel.org", Glauber Costa, Marcelo Tosatti, "qemu-devel@nongnu.org", Markus Armbruster, Avi Kivity

On 01/18/2011 10:01 AM, Jan Kiszka wrote:
> On 2011-01-18 16:50, Anthony Liguori wrote:
>> On 01/18/2011 09:43 AM, Jan Kiszka wrote:
>>> On 2011-01-18 16:04, Anthony Liguori wrote:
>>>> On 01/18/2011 08:28 AM, Jan Kiszka wrote:
>>>>> On 2011-01-12 11:31, Jan Kiszka wrote:
>>>>>> On 12.01.2011 11:22, Avi Kivity wrote:
>>>>>>> On 01/11/2011 03:54 PM, Anthony Liguori wrote:
>>>>>>>> Right, we should introduce a KVMBus that KVM devices are created on.
>>>>>>>> The devices can get at KVMState through the BusState.
>>>>>>>>
>>>>>>> There is no kvm bus in a PC (I looked). We're bending the device model
>>>>>>> here because a device is implemented in the kernel and not in
>>>>>>> userspace. An implementation detail is magnified beyond all proportions.
>>>>>>>
>>>>>>> An ioapic that is implemented by kvm lives in exactly the same place
>>>>>>> that the qemu ioapic lives in. An assigned pci device lives on the PCI
>>>>>>> bus, not a KVMBus. If we need a pointer to KVMState, then we must find
>>>>>>> it elsewhere, not through creating imaginary buses that don't exist.
>>>>>>>
>>>>>> Exactly.
>>>>>>
>>>>>> So we can either "infect" the whole device tree with kvm (or maybe a
>>>>>> more generic accelerator structure that also deals with Xen) or we need
>>>>>> to pull the reference inside the device's init function from some global
>>>>>> service (kvm_get_state).
>>>>>>
>>>>> Note that this topic is still waiting for good suggestions, specifically
>>>>> from those who believe in kvm_state references :). This is not only
>>>>> blocking kvmstate merge but will affect KVM irqchips as well.
>>>>>
>>>>> It boils down to how we reasonably pass a kvm_state reference from
>>>>> machine init code to a sysbus device.
>>>>> I'm probably biased, but I don't
>>>>> see any way that does not work against the idea of confining access to
>>>>> kvm_state or breaks device instantiation from the command line or a
>>>>> config file.
>>>>>
>>>> A KVM device should sit on a KVM specific bus that hangs off of sysbus.
>>>> It can get to kvm_state through that bus.
>>>>
>>>> That bus doesn't get instantiated through qdev so requiring a pointer
>>>> argument should not be an issue.
>>>>
>>> This design is in conflict with the requirement to attach KVM-assisted
>>> devices also to their home bus, e.g. an assigned PCI device to the PCI
>>> bus. We don't support multi-homed qdev devices.
>>>
>> The bus topology reflects how I/O flows in and out of a device. We do
>> not model a perfect PC bus architecture and I don't think we ever intend
>> to. Instead, we model a functional architecture.
>>
>> I/O from an assigned device does not flow through the emulated PCI bus.
>> Therefore, it does not belong on the emulated PCI bus.
>>
>> Assigned devices need to interact with the emulated PCI bus, but they
>> shouldn't be children of it.
>>
> You should be able to find assigned devices on some PCI bus, so you
> either have to hack up the existing bus to host devices that are, on the
> other side, not part of it or branch off a pci-kvm sub-bus, just like
> you would have to create a sysbus-kvm.

Management tools should never traverse the device tree to find devices.
This is a recipe for disaster in the long term because the device tree
will not remain stable.

So yes, a management tool should be able to enumerate assigned devices
as it would enumerate any other PCI device, but that has almost nothing
to do with what the tree layout is.

Regards,

Anthony Liguori

> I guess, if at all, we want the
> latter.
>
> Is that acceptable for everyone?
>
> Jan