From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([209.51.188.92]:38033) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1goElm-0006Tg-Sh for qemu-devel@nongnu.org; Mon, 28 Jan 2019 16:48:44 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1goElj-0008GO-FA for qemu-devel@nongnu.org; Mon, 28 Jan 2019 16:48:42 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:49150 helo=mx0a-001b2d01.pphosted.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1goElg-00082R-77 for qemu-devel@nongnu.org; Mon, 28 Jan 2019 16:48:36 -0500 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x0SLiGnO079969 for ; Mon, 28 Jan 2019 16:48:18 -0500 Received: from e14.ny.us.ibm.com (e14.ny.us.ibm.com [129.33.205.204]) by mx0b-001b2d01.pphosted.com with ESMTP id 2qa6hca8v1-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 28 Jan 2019 16:48:17 -0500 Received: from localhost by e14.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 28 Jan 2019 21:48:17 -0000 References: <20190121110354.2247-1-cohuck@redhat.com> <20190121110354.2247-3-cohuck@redhat.com> <2dac6201-9e71-b188-0385-d09d05071a1c@linux.ibm.com> <5627cb78-22b3-0557-7972-256bc9560d86@linux.ibm.com> <20190125112437.2c06fac6.cohuck@redhat.com> <20190125135835.2d59b511.cohuck@redhat.com> <20190125150101.3b61f0a1@oc2783563651> <20190128180948.506a9695.cohuck@redhat.com> <20190128201548.1ecfb84f@oc2783563651> From: Eric Farman Date: Mon, 28 Jan 2019 16:48:10 -0500 MIME-Version: 1.0 In-Reply-To: <20190128201548.1ecfb84f@oc2783563651> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Message-Id: Subject: Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Halil Pasic , Cornelia Huck Cc: Farhan Ali , Pierre Morel , linux-s390@vger.kernel.org, kvm@vger.kernel.org, qemu-devel@nongnu.org, qemu-s390x@nongnu.org, Alex Williamson On 01/28/2019 02:15 PM, Halil Pasic wrote: > On Mon, 28 Jan 2019 18:09:48 +0100 > Cornelia Huck wrote: > >> On Fri, 25 Jan 2019 15:01:01 +0100 >> Halil Pasic wrote: >> >>> On Fri, 25 Jan 2019 13:58:35 +0100 >>> Cornelia Huck wrote: >> >>>> - The code should not be interrupted while we process the channel >>>> program, do the ssch etc. We want the caller to try again later (i.e. >>>> return -EAGAIN) >> >> (...) >> >>>> - With the async interface, we want user space to be able to submit a >>>> halt/clear while a start request is still in flight, but not while >>>> we're processing a start request with translation etc. We probably >>>> want to do -EAGAIN in that case. >>> >>> This reads very similar to your first point. >> >> Not quite. ssch() means that we have a cp around; for hsch()/csch() we >> don't have such a thing. So we want to protect the process of >> translating the cp etc., but we don't need such protection for the >> halt/clear processing. >> > > What does this don't 'need such protection' mean in terms of code, > moving the unlock of the io_mutex upward (in > vfio_ccw_async_region_write())? > > Here the function in question for reference: > > +static ssize_t vfio_ccw_async_region_write(struct vfio_ccw_private > *private, > + const char __user *buf, > size_t count, > + loff_t *ppos) > +{ > + unsigned int i = VFIO_CCW_OFFSET_TO_INDEX(*ppos) - > VFIO_CCW_NUM_REGIONS; > + loff_t pos = *ppos & VFIO_CCW_OFFSET_MASK; > + struct ccw_cmd_region *region; > + int ret; > + > + if (pos + count > sizeof(*region)) > + return -EINVAL; > + > + if (private->state == VFIO_CCW_STATE_NOT_OPER || > + private->state == VFIO_CCW_STATE_STANDBY) > + return -EACCES; > + if (!mutex_trylock(&private->io_mutex)) > + return -EAGAIN; > + > + region = private->region[i].data; > + if (copy_from_user((void *)region + pos, buf, count)) { > + ret = -EFAULT; > + goto out_unlock; > + } > + > + vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_ASYNC_REQ); > + > + ret = region->ret_code ? region->ret_code : count; > + > +out_unlock: > + mutex_unlock(&private->io_mutex); > + return ret; > +} > > That does not make much sense to me at the moment (so I guess I > misunderstood again). > >>> >>>> >>>> My idea would be: >>>> >>>> - The BUSY state denotes "I'm busy processing a request right now, try >>>> again". We hold it while processing the cp and doing the ssch and >>>> leave it afterwards (i.e., while the start request is processed by >>>> the hardware). I/O requests and async requests get -EAGAIN in that >>>> state. >>>> - A new state (CP_PENDING?) is entered after ssch returned with cc 0 >>>> (from the BUSY state). We stay in there as long as no final state for >>>> that request has been received and delivered. (This may be final >>>> interrupt for that request, a deferred cc, or successful halt/clear.) >>>> I/O requests get -EBUSY, async requests are processed. This state can >>>> be removed again once we are able to handle more than one outstanding >>>> cp. >>>> >>>> Does that make sense? >>>> >>> >>> AFAIU your idea is to split up the busy state into two states: CP_PENDING >>> and of busy without CP_PENDING called BUSY. I like the idea of having a >>> separate state for CP_PENDING but I don't like the new semantic of BUSY. >>> >>> Hm mashing a conceptual state machine and the jumptabe stuff ain't >>> making reasoning about this simpler either. I'm taking about the >>> conceptual state machine. It would be nice to have a picture of it and >>> then think about how to express that in code. >> >> Sorry, I'm having a hard time parsing your comments. Are you looking >> for something like the below? > > I had more something like this > https://en.wikipedia.org/wiki/UML_state_machine, > in mind but the lists of state transitions are also useful. > I think the picture Connie paints below is just as useful as any formalized UML diagram. >> >> IDLE --- IO_REQ --> BUSY ---> CP_PENDING --- IRQ ---> IDLE (if final > > There ain't no trigger/action list between BUSY and CP_PENDING. Right, because BUSY means "KVM started processing a SSCH" and CP_PENDING means "KVM finished processing the SSCH and issued it to the hardware, and got cc=0." > I'm also in the dark about where the issuing of the ssch() happen > here (is it an internal transition within CP_PENDING?). Connie said... >>>> - A new state (CP_PENDING?) is entered after ssch returned with cc 0 >>>> (from the BUSY state). ...and I agree with that. I guess if > the ssch() returns with non cc == 0 the CP_PENDING ---IRQ---> IDLE > transition > won't take place. And I guess the IRQ is a final one. Yes this is the one point I hadn't seen explicitly stated. We shouldn't remain in state=BUSY if the ssch got cc!=0, and probably return to IDLE when processing the failure. In Connie's response (Mon, 28 Jan 2019 18:24:24 +0100) to my note, she expressed some agreement to that. > > Sorry abstraction is not a concept unknown to me. But this is too much > abstraction for me in this context. The devil is in the details, and > AFAIU we are discussing these details right now. > > >> state for I/O) >> (normal ssch) >> >> BUSY --- IO_REQ ---> return -EAGAIN, stay in BUSY >> (user space is supposed to retry, as we'll eventually progress from >> BUSY) >> >> CP_PENDING --- IO_REQ ---> return -EBUSY, stay in CP_PENDING >> (user space is supposed to map this to the appropriate cc for the guest) > > From this it seems you don't intend to issue the second requested ssch() > any more (and don't want to do any translation). Is that right? (If it > is, that what I was asking for for a while, but then it's a pity for the > retries.) > >> >> IDLE --- ASYNC_REQ ---> IDLE >> (user space is welcome to do anything else right away) > > Your idea is to not issue a requested hsch() if we think we are IDLE > it seems. Do I understand this right? We would end up with a different > semantic for hsch()/and csch() (compared to PoP) in the guest with this > (AFAICT). > >> >> BUSY --- ASYNC_REQ ---> return -EAGAIN, stay in BUSY >> (user space is supposed to retry, as above) >> >> CP_PENDING --- ASYNC_REQ ---> return success, stay in CP_PENDING >> (the interrupt will get us out of CP_PENDING eventually) > > Issue (c|h)sch() is an action that is done on this internal > transition (within CP_PENDING). These three do read like CSCH/HSCH are subject to the same rules as SSCH, when in fact they would be (among other reasons) issued to clean up a lost interrupt from a previous SSCH. So maybe return -EAGAIN on state=BUSY (don't race ourselves with the start), but issue to hardware if CP_PENDING. If we get an async request when state=IDLE, then maybe just issue it for fun, as if it were an SSCH? > > Thank you very much for investing into this description of the state > machine. I'm afraid I'm acting like a not so nice person (self censored) > at the moment. I can't help myself, sorry. Maybe Farhan and Eric can take > this as a starting point and come up with something that we can integrate > into our documentation. Maybe not... > > Regards, > Halil >