From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <533B1CB4.6020600@de.ibm.com>
Date: Tue, 01 Apr 2014 22:08:20 +0200
From: Christian Borntraeger
MIME-Version: 1.0
References: <1396363663-50450-1-git-send-email-borntraeger@de.ibm.com>
 <1396363663-50450-2-git-send-email-borntraeger@de.ibm.com>
 <533AD42D.8060504@suse.de> <533AD593.5020804@de.ibm.com>
 <533AD752.5010605@suse.de> <533B1130.9030207@de.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH/RFC] KVM: s390: Add S390 configuration and control kvm device
To: Alexander Graf
Cc: linux-s390, Michael Mueller, KVM, Ekaterina Tumanova, qemu-devel, Jens Freimann, Cornelia Huck

On 01/04/14 21:36, Alexander Graf wrote:
[...]
>>> Speaking of which, why don't we just forward STSI to user space with
>>> an ENABLE_CAP and handle all of this there? It's not performance
>>> critical at all, right?
>>
>> No, performance is not critical.
>> The thing is that we definitely need the kernel to handle parts of
>> STSI, as we have to provide information from the upper hypervisor
>> (LPAR or z/VM). This information is only available in kernel space.
>> So in essence we could only forward a small subset of STSI, namely
>> stsi_3_2_2. But we still have to call stsi_3_2_2 in the kernel,
>> as 3_2_2 does contain the list of hypervisors underneath us (KVM
>> under z/VM).
>>
>> So the only thing that we could do is to forward STSI_3_2_2 to qemu
>> when a capability is set and after the kernel has filled in the upper
>> layers. QEMU then has to modify the page that the kernel touched and
>> go back. Would work, but needs a capability and preferably its own
>> exit. A new ioctl or a subcode of an ioctl (attr/group, whatever)
>> seems easier.
>
> I would consider these 2 orthogonal bits of information. User space
> wants to get information about its underlying hypervisors regardless
> of KVM, no? So we should have some interface to bubble STSI
> information of the current system to user space either way.

There is /proc/sysinfo and it provides most (but not all) of the STSI
information. But we don't want to go down that path, really. Providing
STSI as a binary blob is also complex and error-prone.

Think about it again: Instead of having a small qemu->kernel interface
for specific subsets of STSI, you suggest a kernel->qemu interface for
the complete set of STSI, so that qemu can then emulate it? (We are
talking about 20 pages in the Principles of Operation.)

> QEMU could use that and add a few bits of its own.
> That way we could handle all of STSI in QEMU and get out of the
> business of defining complicated interfaces.

This also misses my main point: The name is just the first user. I want
to have some interface that allows me to do other things: enable CMMA,
reset the CMMA memory usage state on reset, zap the page tables on
clear reset, set the CPU facilities (some parts need to be in the
kernel so that we can block specific operations), enable s390-specific
settings, whatever. The usual approach was to create new KVM ioctls,
each for one feature + capability. This sounded like overkill.

So please don't focus on STSI; I need some interface that I can use for
other pending patches. And if that interface is generic enough, we
might also use that for the name (or not, depending on the STSI
discussion). I am not married to the device idea; anything that works
out fine is OK with me.

>>>>> I think VM configuration is common enough to just make this a
>>>>> separate interface.
>>>> So you propose to define a new base ioctl (e.g. VM_REG) on the vm
>>>> fd, instead?
>>>> Seems like an easy enough change. Would you reuse the kvm_attr
>>>> structure for that?
>>>
>>> Yeah, reuse whatever we can. Basically just remove the device
>>> boilerplate - I don't think it's impressively useful for a
>>> non-device.
>>
>> See above, the name is just a simple first user.
>> The thing is that the ioctl either has to define a proper namespace
>> (unique groups and attrs) or be made s390-specific. The device
>> approach does help us here.
>
> If you like the device approach, make sure to create it on VM creation
> and only implement a specific ioctl to fetch its fd. We don't create
> the configuration information pseudo device after VM creation - it's
> always there :).
>
>> I personally don't mind which way to go, as long as Paolo is fine
>> with the approach, and nobody complains about the functions being
>> non-QOM.
>
> I think the most obvious and straightforward way would be to deal with
> all of STSI in user space.
> Make it a separate exit type similar to hypercalls and don't worry
> about QOM'ification of anything. This thing is on the same level as
> CPUID really - just VM wide :).

For 3_2_2 this might be a possible solution, but not for the other
codes. But as I said, the name is not my main problem here.
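
To make the group/attr idea from this thread a bit more concrete, here
is a minimal user-space sketch of one entry point that dispatches on a
(group, attr) pair instead of adding one ioctl per feature. It is only a
model, not the patch: struct vm_attr merely mirrors the
(flags, group, attr, addr) shape of struct kvm_device_attr, and every
name in it (vm_attr, vm_state, vm_set_attr, the GRP_* and attribute
constants, the 8-character name limit) is a hypothetical invented for
illustration. Returning -ENXIO for an unknown group or attribute is
likewise just a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute descriptor. The layout mirrors the
 * (flags, group, attr, addr) shape of struct kvm_device_attr,
 * but this is NOT the kernel type. */
struct vm_attr {
    uint32_t flags;
    uint32_t group;   /* selects a feature area */
    uint64_t attr;    /* selects an operation within the group */
    uint64_t addr;    /* pointer to the payload, if any */
};

/* Hypothetical groups and attributes, invented for this sketch. */
enum { GRP_CMMA = 0, GRP_NAME = 1 };
enum { CMMA_ENABLE = 0, CMMA_RESET_STATE = 1 };
enum { NAME_SET = 0 };

#define VM_NAME_LEN 8   /* assumed limit, for illustration only */

/* Stand-in for per-VM state that the kernel would keep. */
struct vm_state {
    int  cmma_enabled;
    int  cmma_resets;
    char name[VM_NAME_LEN + 1];
};

static int set_cmma(struct vm_state *vm, const struct vm_attr *a)
{
    switch (a->attr) {
    case CMMA_ENABLE:
        vm->cmma_enabled = 1;
        return 0;
    case CMMA_RESET_STATE:
        if (!vm->cmma_enabled)
            return -EINVAL;       /* nothing to reset yet */
        vm->cmma_resets++;
        return 0;
    default:
        return -ENXIO;            /* unknown attribute in this group */
    }
}

static int set_name(struct vm_state *vm, const struct vm_attr *a)
{
    const char *src = (const char *)(uintptr_t)a->addr;
    size_t i;

    if (a->attr != NAME_SET)
        return -ENXIO;
    if (src == NULL)
        return -EFAULT;

    /* Copy at most VM_NAME_LEN characters and keep the result
     * NUL-terminated; longer names are silently truncated. */
    memset(vm->name, 0, sizeof(vm->name));
    for (i = 0; i < VM_NAME_LEN && src[i] != '\0'; i++)
        vm->name[i] = src[i];
    return 0;
}

/* Single entry point: new features add a group or an attribute,
 * not a new ioctl number. */
int vm_set_attr(struct vm_state *vm, const struct vm_attr *a)
{
    assert(vm != NULL && a != NULL);

    switch (a->group) {
    case GRP_CMMA: return set_cmma(vm, a);
    case GRP_NAME: return set_name(vm, a);
    default:       return -ENXIO; /* unknown group */
    }
}
```

A caller would fill in a struct vm_attr with, say, group GRP_CMMA and
attr CMMA_ENABLE and pass it to vm_set_attr(). Whether such a dispatcher
sits behind a pseudo device fd or behind an ioctl on the vm fd is the
open question in the discussion above; the sketch takes no side on that.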