From: Farhan Ali <alifm@linux.ibm.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: Eric Farman <farman@linux.ibm.com>,
Cornelia Huck <cohuck@redhat.com>,
kvm@vger.kernel.org, linux-s390@vger.kernel.org,
pmorel@linux.ibm.com
Subject: Re: [PATCH v3 1/1] vfio-ccw: Prevent quiesce function going into an infinite loop
Date: Tue, 23 Apr 2019 15:41:34 -0400 [thread overview]
Message-ID: <a2618478-4b98-a2c2-5f14-a91b193ebf53@linux.ibm.com> (raw)
In-Reply-To: <20190423194251.093304c7.pasic@linux.ibm.com>
On 04/23/2019 01:42 PM, Halil Pasic wrote:
> One thing I'm confused about is, that we don't seem to prevent
> new I/O being submitted. That is we could still loop indefinitely
> if we get new IO after the 'kill I/O on the subchannel' is done but
> before the msch() with the disable is issued.
So the quiesce function will be called in the remove, release functions
and also in the mdev reset callback via an ioctl VFIO_DEVICE_RESET.
Now the release function is invoked in cases when we hot unplug the
device or the guest is gone (or anything that will close the vfio mdev
file descriptor, I believe). In such scenarios it's really the userspace
which is asking to release the device. Similar for remove, where the
user has to explicitly write to the remove file for the mdev to invoke
it. Under normal conditions no sane userspace should be doing
release/remove while there are still on going I/Os :)
Me and Conny had some discussion on this in v1 of this patch:
https://marc.info/?l=kvm&m=155437117823248&w=2
>
> The 'flush all I/O' parts in the commit message and in the code make
> this even more confusing.
Maybe...if it's too confusing it could be fixed, but IMHO I don't think
it's a dealbreaker. If anyone else thinks otherwise, I can go ahead and
change it.
>
> Another thing that I don't quite understand is injecting interrupts
> into QEMU for stuff that is actually not guest initiated.
As mentioned above under normal conditions we shouldn't be doing
quiesce. But wouldn't those interrupts just be unsolicited interrupts
for the guest?
>
> Furthermore I find how cio_cancel_halt_clear() quite confusing. We
> usually return EBUSY to say that the function was suppressed because,
> well, the resource is busy. For cio_cancel_halt_clear() EBUSY means:
> * either a xsch() was effectively suppressed because status pending
> * or an hsch() or csch() was actually successfully executed, and the
> client code is supposed to wait for the assync clear/halt function
> to complete. This gets even more confusing when one reads:
>
> /**
> * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt
> * and clear ordinally if subchannel is valid.
> * @sch: subchannel on which to perform the cancel_halt_clear operation
> * @iretry: the number of the times remained to retry the next operation
> *
> * This should be called repeatedly since halt/clear are asynchronous
>
> This is simply not true. Because halt/clear is async, we may not know
> if we managed to accomplish canceling the running I/O. But the next
> call of this function won't detect and say 'yep it worked, we are good
> now' and return 0. This is the responsibility of the caller.
>
> If it were like that, the current code would have been fine!
>
> * operations. We do one try with cio_cancel, three tries with cio_halt,
> * 255 tries with cio_clear. The caller should initialize @iretry with
> * the value 255 for its first call to this, and keep using the same
> * @iretry in the subsequent calls until it gets a non -EBUSY return.
> *
> * Returns 0 if device now idle, -ENODEV for device not operational,
> * -EBUSY if an interrupt is expected (either from halt/clear or from a
> * status pending), and -EIO if out of retries.
> */
> int cio_cancel_halt_clear(struct subchannel *sch, int *iretry
>
> TL;DR:
>
> I welcome this batch (with an r-b) but I would like the commit message
> and the comment changed so that the misleading 'flush all I/O in the
> workqueue'.
>
> I think 'vfio-ccw: fix cio_cancel_halt_clear() usage' would reflect the
> content of this patch better, because reasoning about the upper limit,
> and what happens if this upper limit is hit is not what this patch is
> about. It is about a client code bug that rendered iretry ineffective.
>
I politely disagree with the change in subject line. I think the current
subject line describe what we are trying to prevent with this patch. But
again if anyone else feels otherwise, I will go ahead and change :)
Thanks
Farhan
> Regards,
> Halil
>
>
next prev parent reply other threads:[~2019-04-23 19:41 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <cover.1555449329.git.alifm@linux.ibm.com>
2019-04-16 21:23 ` [PATCH v3 1/1] vfio-ccw: Prevent quiesce function going into an infinite loop Farhan Ali
2019-04-17 9:03 ` Cornelia Huck
2019-04-17 13:58 ` Eric Farman
2019-04-17 15:13 ` Halil Pasic
2019-04-17 15:18 ` Farhan Ali
2019-04-19 20:12 ` Halil Pasic
2019-04-22 14:01 ` Farhan Ali
2019-04-23 17:42 ` Halil Pasic
2019-04-23 19:41 ` Farhan Ali [this message]
2019-04-23 20:37 ` Eric Farman
2019-04-24 7:09 ` Cornelia Huck
2019-04-24 10:02 ` Halil Pasic
2019-04-24 10:21 ` Halil Pasic
2019-04-18 14:36 ` Cornelia Huck
2019-04-17 14:02 ` Farhan Ali
2019-04-24 16:35 ` Cornelia Huck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a2618478-4b98-a2c2-5f14-a91b193ebf53@linux.ibm.com \
--to=alifm@linux.ibm.com \
--cc=cohuck@redhat.com \
--cc=farman@linux.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=pasic@linux.ibm.com \
--cc=pmorel@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox