From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Farman Subject: Re: [PATCH v2 2/5] vfio-ccw: concurrent I/O handling Date: Fri, 25 Jan 2019 15:21:55 -0500 Message-ID: References: <20190121110354.2247-1-cohuck@redhat.com> <20190121110354.2247-3-cohuck@redhat.com> <2dac6201-9e71-b188-0385-d09d05071a1c@linux.ibm.com> <20190125135807.17c59cdd@oc2783563651> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20190125135807.17c59cdd@oc2783563651> Content-Language: en-US List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+gceq-qemu-devel2=m.gmane.org@nongnu.org Sender: "Qemu-devel" List-Archive: List-Post: To: Halil Pasic Cc: linux-s390@vger.kernel.org, Cornelia Huck , Alex Williamson , Pierre Morel , kvm@vger.kernel.org, Farhan Ali , qemu-devel@nongnu.org, qemu-s390x@nongnu.org List-ID: On 01/25/2019 07:58 AM, Halil Pasic wrote: > On Thu, 24 Jan 2019 21:25:10 -0500 > Eric Farman wrote: > >>> private = dev_get_drvdata(mdev_parent_dev(mdev)); >>> - if (private->state != VFIO_CCW_STATE_IDLE) >>> + if (private->state == VFIO_CCW_STATE_NOT_OPER || >>> + private->state == VFIO_CCW_STATE_STANDBY) >>> return -EACCES; >>> + if (!mutex_trylock(&private->io_mutex)) >>> + return -EAGAIN; >> >> Ah, I see Halil's difficulty here. >> >> It is true there is a race condition today, and that this doesn't >> address it. That's fine, add it to the todo list. But even with that, >> I don't see what the mutex is enforcing? > > It is protecting the io regions. AFAIU the idea was that only one > thread is accessing the io region(s) at a time to prevent corruption and > reading half-morphed data. > >> Two simultaneous SSCHs will be >> serialized (one will get kicked out with a failed trylock() call), while >> still leaving the window open between cc=0 on the SSCH and the >> subsequent interrupt. In the latter case, a second SSCH will come >> through here, do the copy_from_user below, and then jump to fsm_io_busy >> to return EAGAIN. Do we really want to stomp on io_region in that case? > > I'm not sure I understood you correctly. The interrupt handler does not > take the lock before writing to the io_region. That is one race but it is > easy to fix. > > The bigger problem is that between the interrupt handler has written IRB > area and userspace has read it we may end up destroying it by stomping on > it (to use your words). The userspace reading a wrong (given todays qemu > zeroed out) IRB could lead to follow on problems. I wasn't thinking about a race between the start and interrupt handler, but rather between two near-simultaneous starts. Looking at it more closely, the orb and scsw structs as well as the ret_code field in ccw_io_region are only referenced under the protection of the new mutex (within fsm_io_request, for example), which I guess is the point. So that leaves us with just the irb fields, which you'd mentioned a couple days ago (and which I was trying to ignore since it'd seems to have been discussed enough at the time). So I withdraw my concerns on this point. For now. ;-) > >> Why can't we simply return EAGAIN if state==BUSY? >> > > Sure we can. That would essentially go back to the old way of things: > if not idle return with error. I think this happens both before and after this series. With this series, we just update the io_region with things that are never used because we're busy. Just the error code returned would change > form EACCESS to EAGAIN. Which Isn't necessarily a win, because > conceptually here should be never two interleaved io_requests/start > commands hitting the module. > > >>> >>> region = private->io_region; >>> - if (copy_from_user((void *)region + *ppos, buf, count)) >>> - return -EFAULT; >>> + if (copy_from_user((void *)region + *ppos, buf, count)) { >>> + ret = -EFAULT; >>> + goto out_unlock; >>> + } >