From mboxrd@z Thu Jan 1 00:00:00 1970
From: Blue Swirl
Subject: Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Date: Thu, 20 Jan 2011 20:02:01 +0000
References: <4D2B6CB5.9050602@codemonkey.ws> <4D2B74D8.4080309@web.de>
 <4D2B8662.9060909@web.de> <4D2C60FB.7030009@linux.vnet.ibm.com>
 <4D2D80ED.8030405@redhat.com> <4D2D82EE.20002@siemens.com>
 <4D35A39A.8000801@siemens.com> <4D35ABF8.9050700@linux.vnet.ibm.com>
 <4D35B521.3090601@siemens.com> <4D35B6DD.1020005@linux.vnet.ibm.com>
 <4D3717E7.3010105@linux.vnet.ibm.com> <4D38017D.2020401@siemens.com>
 <4D388EE8.3000004@linux.vnet.ibm.com>
In-Reply-To: <4D388EE8.3000004@linux.vnet.ibm.com>
To: Anthony Liguori
Cc: Jan Kiszka, Markus Armbruster, "kvm@vger.kernel.org", Glauber Costa,
 Marcelo Tosatti, "qemu-devel@nongnu.org", Avi Kivity

On Thu, Jan 20, 2011 at 7:37 PM, Anthony Liguori wrote:
> On 01/20/2011 03:33 AM, Jan Kiszka wrote:
>>
>> On 2011-01-19 20:32, Blue Swirl wrote:
>>>
>>> On Wed, Jan 19, 2011 at 4:57 PM, Anthony Liguori wrote:
>>>>
>>>> On 01/19/2011 07:15 AM, Markus Armbruster wrote:
>>>>>
>>>>> So they interact with KVM (need kvm_state), and they interact with the
>>>>> emulated PCI bus.  Could you elaborate on the fundamental difference
>>>>> between the two interactions that makes you choose the (hypothetical)
>>>>> KVM bus over the PCI bus as device parent?
>>>>>
>>>>
>>>> It's almost arbitrary, but I would say it's the direction that I/Os
>>>> flow.
>>>>
>>>> But if the underlying observation is that the device tree is not
>>>> really a tree, you're 100% correct.  This is part of why a factory
>>>> interface that just takes a parent bus is too simplistic.
>>>>
>>>> I think we ought to introduce a -pci-device option that is
>>>> specifically for creating PCI devices that doesn't require a parent
>>>> bus argument but provides a way to specify stable addressing (for
>>>> instancing, using a linear index).
>>>>
>>>
>>> I think kvm_state should not be a property of any device or bus. It
>>> should be split to more logical pieces.
>>>
>>> Some parts of it could remain in CPUState, because they are associated
>>> with a VCPU.
>>>
>>> Also, for example irqfd could be considered to be similar object to
>>> char or block devices provided by QEMU to devices. Would it make sense
>>> to introduce new host types for passing parts of kvm_state to devices?
>>>
>>> I'd also make coalesced MMIO stuff part of memory object. We are not
>>> passing any state references when using cpu_physical_memory_rw(), but
>>> that could be changed.
>>>
>>
>> There are currently no VCPU-specific bits remaining in kvm_state. It may
>> be a good idea to introduce an arch-specific kvm_state and move related
>> bits over. It may also once be feasible to carve out memory management
>> related fields if we have proper abstractions for that, but I'm not
>> completely sure here.
>>
>> Anyway, all these things are secondary. The primary topic here is how to
>> deal with kvm_state and its fields that have VM-global scope.
>>
>
> The debate is really:
>
> 1) should we remove all passing of kvm_state and just assume it's static
>
> 2) deal with a couple places in the code where we need to figure out how
> to get at kvm_state
>
> I think we've only identified 1 real instance of (2) and it's resulted in
> some good discussions about how to model KVM devices vs. emulated devices.
> Honestly, (1) just stinks.  I see absolutely no advantage to it at all.

Fully agree.

> In the very worst case scenario, the thing we need to do is just reference
> an extern variable in a few places.  That completely avoids all of the
> modelling discussions for now (while leaving for placeholder FIXMEs so the
> problem can be tackled later).

I think KVMState was designed to match the KVM ioctl interface: everything
that is needed for talking to KVM, or is received from KVM, is there. But
I think this shouldn't be a design driver.

If the only pieces of kvm_state that are needed by the devices are
irqchip_in_kernel, pit_in_kernel and many_ioeventfds, the problem of
passing kvm_state to devices becomes very different. Each of these is
just a single bit, affecting only a few devices. Perhaps they could be
device properties which the board level sets when KVM is used?
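
To make the device-property idea concrete, here is a rough sketch in plain C. It is not QEMU's qdev API: DemoAccelCaps, DemoDevice, demo_board_init_device() and demo_pit_uses_kernel() are hypothetical names invented for illustration. Only the three flag names (irqchip_in_kernel, pit_in_kernel, many_ioeventfds) come from the discussion above. The point is that the board code reads the accelerator state once and copies single bits into each device, so the device never sees kvm_state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what the accelerator reports to the board.
 * In real code this information would come from kvm_state. */
typedef struct DemoAccelCaps {
    bool kvm_enabled;
    bool irqchip_in_kernel;
    bool pit_in_kernel;
    bool many_ioeventfds;
} DemoAccelCaps;

/* Hypothetical per-device properties: just the single bits a device
 * might care about, with no reference back to the accelerator state. */
typedef struct DemoDeviceProps {
    bool irqchip_in_kernel;
    bool pit_in_kernel;
    bool many_ioeventfds;
} DemoDeviceProps;

typedef struct DemoDevice {
    const char *name;
    DemoDeviceProps props;
} DemoDevice;

/* Board-level setup: the only place that looks at the accelerator.
 * Every flag is forced to false when KVM is not in use. */
void demo_board_init_device(DemoDevice *dev, const char *name,
                            const DemoAccelCaps *caps)
{
    assert(dev != NULL && caps != NULL);
    dev->name = name;
    dev->props.irqchip_in_kernel =
        caps->kvm_enabled && caps->irqchip_in_kernel;
    dev->props.pit_in_kernel =
        caps->kvm_enabled && caps->pit_in_kernel;
    dev->props.many_ioeventfds =
        caps->kvm_enabled && caps->many_ioeventfds;
}

/* Device-side code consults only its own property. */
bool demo_pit_uses_kernel(const DemoDevice *dev)
{
    assert(dev != NULL);
    return dev->props.pit_in_kernel;
}
```

A board would call demo_board_init_device() once per device at machine creation. Whether these bits would really be plain booleans or proper qdev properties is left open here.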