Date: Tue, 5 Nov 2019 10:15:36 +0100
From: Cornelia Huck
Subject: Re: [RFC 09/37] KVM: s390: protvirt: Implement on-demand pinning
Message-ID: <20191105101536.7df8f3bb.cohuck@redhat.com>
In-Reply-To: <2c36b668-e6a7-4497-62da-f2be09350896@redhat.com>
To: David Hildenbrand
Cc: Christian Borntraeger, Janosch Frank, kvm@vger.kernel.org,
    linux-s390@vger.kernel.org, thuth@redhat.com, imbrenda@linux.ibm.com,
    mihajlov@linux.ibm.com, mimu@linux.ibm.com, gor@linux.ibm.com

On Mon, 4 Nov 2019 19:38:27 +0100
David Hildenbrand wrote:

> On 04.11.19 18:17, Cornelia Huck wrote:
> > On Mon, 4 Nov 2019 15:42:11 +0100
> > David Hildenbrand wrote:
> >
> >> On 04.11.19 15:08, David Hildenbrand wrote:
> >>> On 04.11.19 14:58, Christian Borntraeger wrote:
> >>>>> How hard would it be to
> >>>>>
> >>>>> 1. Detect the error condition
> >>>>> 2. Try a read on the affected page from the CPU (which will
> >>>>>    automatically convert to encrypted/!secure)
> >>>>> 3. Restart the I/O
> >>>>>
> >>>>> I assume that this is a corner case where we don't really have to
> >>>>> care about performance in the first shot.
> >>>>
> >>>> We have looked into this. You would need to implement this in the low level
> >>>> handler for every I/O. DASD, FCP, PCI based NVME, iscsi. Where do you want
> >>>> to stop?
> >>>
> >>> If that's the real fix, we should do that. Maybe one can focus on the
> >>> real use cases first. But I am no I/O expert, so my judgment might be
> >>> completely wrong.
> >>>
> >>
> >> Oh, and by the way, as discussed you really only have to care about
> >> accesses via "real" I/O devices (IOW, not via the CPU). When accessing
> >> via the CPU, you should have automatic conversion back and forth. As I
> >> am no expert on I/O, I have no idea how iscsi fits into this picture
> >> here (especially on s390x).
> >>
> >
> > By "real" I/O devices, you mean things like channel devices, right? (So
> > everything where you basically hand off control to a different kind of
> > processor.)
> >
> > For classic channel I/O (as used by dasd), I'd expect something like
> > getting a check condition on a ccw if the CU or device cannot access
> > the memory. You will know how far the channel program has progressed,
> > and might be able to restart (from the beginning or from that point).
> > Probably has a chance of working for a subset of channel programs.

NB that there's more than simple reads/writes... could also be control
commands, some of which do reads/writes as well.

> >
> > For QDIO (as used by FCP), I have no idea how this could work, as we
> > have long-running channel programs there and any error basically kills
> > the queues, which you would have to re-setup from the beginning.
> >
> > For PCI devices, I have no idea how the instructions even act.
> >
> > From my point of view, that error/restart approach looks nice on paper,
> > but it seems hard to make it work in the general case (and I'm unsure
> > if it's possible at all.)
>
> One thought: If all we do during an I/O request is read or write (or
> even a mixture), can we simply restart the whole I/O again, although we
> did partial reads/writes? This would eliminate the "know how far the
> channel program has progressed". On error, one would have to touch each
> involved page (e.g., try to read first byte to trigger a conversion) and
> restart the I/O. I can understand that this might sound simpler than it
> is (if it is even possible)

Any control commands might have side effects, though. Problems there
should be uncommon; there's still the _general_ case, though :(

Also, there's stuff like rewriting the channel program w/o prefetch,
jumping with TIC, etc. Linux probably does not do the former, but at
least the dasd driver uses NOP/TIC for error recovery.

> and might still be problematic for QDIO as
> far as I understand. Just a thought.

Yes, given that for QDIO, establishing the queues is simply one
long-running channel program...
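
For illustration only, the "touch each involved page and restart the whole
I/O" idea can be modelled in a few lines of user-space Python. This is a toy
sketch, not kernel code and not anything from the patch series: Page,
DeviceAccessError, device_read_into, cpu_touch and read_with_restart are all
hypothetical names, and the model assumes a pure read request with no control
commands, so repeating an already partially completed transfer is harmless.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for the toy model


@dataclass
class Page:
    """Toy guest page; `secure` means a device cannot access it."""
    data: bytearray = field(default_factory=lambda: bytearray(PAGE_SIZE))
    secure: bool = True


class DeviceAccessError(Exception):
    """Raised when the modelled device reaches a page it cannot access."""


def device_read_into(pages, payload):
    """Model a device storing `payload` into `pages`, one page at a time.

    The transfer aborts at the first secure page, leaving the earlier
    pages already written (a partial transfer).
    """
    for i, page in enumerate(pages):
        if page.secure:
            raise DeviceAccessError(i)
        chunk = payload[i * PAGE_SIZE:(i + 1) * PAGE_SIZE]
        page.data[:len(chunk)] = chunk


def cpu_touch(page):
    """Model a CPU read of the first byte, which converts the page."""
    page.secure = False
    return page.data[0]


def read_with_restart(pages, payload, max_restarts=1):
    """Run the I/O; on an access error, touch every page and redo it all.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        try:
            device_read_into(pages, payload)
            return restarts
        except DeviceAccessError:
            if restarts >= max_restarts:
                raise
            for page in pages:
                cpu_touch(page)
            restarts += 1
```

The model restarts from the beginning instead of tracking how far the
transfer got, which is only safe because nothing in it has side effects. It
says nothing about control commands, TIC-based channel programs, or QDIO,
which are exactly the cases raised as problematic above.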