From: Pierre Morel <pmorel@linux.vnet.ibm.com>
To: Cornelia Huck <cohuck@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] [RFC] virtio: Limit the retries on a virtio device reset
Date: Fri, 25 Aug 2017 13:21:06 +0200 [thread overview]
Message-ID: <c95e82e6-a39f-ad9a-45cd-13e9f9870caf@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170825102612.39a5ca60.cohuck@redhat.com>
On 25/08/2017 10:26, Cornelia Huck wrote:
> On Fri, 25 Aug 2017 00:16:05 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
>> On Thu, Aug 24, 2017 at 07:07:42PM +0200, Pierre Morel wrote:
>>>> - we'll have to spread these tests all over the place.
>>>
>>> I counted 19 places where to check if the reset went OK.
>>>
>>> None of them touch the device anymore after reset and just free driver's
>>> resources.
>>
>> ... and then hypervisor uses the resources after free. Not good.
hum... yes, no good.
we can not assume anything about the host side at that time anymore.
>
> The only place where we can simply give up on the device and be sure
> that nothing bad happens is during initial setup. In the other places,
> it seems we have the choice between looping (as now) or panic.
Yes
Note also that the probability that an initial reset works but further
reset don't is very small.
I think we catch most of the problem during the probe.
IMHO, looping if not in initial setup let the administrator more freedom.
OTOH panic reset to a known state
Problem open: how to recognize an initial setup:
- just make reset return an error and let the driver take the decision
- find a way to let the reset function that this is an initial setup
- add a state in virtio_device?
- other?
If we add a state to the virtio_device we could also add other
information as the reset_retries_count.
>
>>
>>> So that if reset failed, nothing goes wrong, no device access, but the
>>> probability that the next probe fail is high. (If it ever succeed).
>>>
>>>> Allowing reset to fail would be better.
>>>
>>> May be I did not understand what you mean.
>>> Testing the flag or a return value is as expensive.
>>>
>>> Of course the implementation is a mater of taste.
>>
>> If a function can fail it should return an error, not just set a flag.
You are right in this case (at least), setting the flag is bad, since
the flag is set by iowrite8(), it is handled on the device side...
untrusted in the case we explore.
>
> ccw reset can (a) succeed, (b) fail (by returning an error status via
> standard channel subsystem mechanisms), or (c) run into a timeout
> (where the driver should recover via csch and friends). [1] In any
> case, we need to able to rely on the channel subsystem to make sure a
> broken device is dead.
>
> pci-modern reset can (a) succeed, or (b) be in an indeterminate state
> (it has not yet completed, but we don't know whether it will complete
> in the future). It may be reasonable to just give up if (b) happens
> during initial setup.
>
> I'm not sure that allowing to fail reset will help much, other than
> allowing to give up on a pci device early.
We have nowhere a sanity check to verify that the virtio transport is in
order.
The first successful reset can be seen as such a check, independently
from the virtio transport type.
Give up the device early if the transport is not in order is indeed the
only thing we can do.
Regards,
Pierre
>
>>
>>
>>> I notice two other things to do:
>>>
>>> - May be adding a warning would be fine too.
>>> - Virtio_ccw may add a fail flag when allocation of CCW failed.
>>> I did not find anything to do for virtio_mmio or legacy virtio_pci.
>>>
>>> Regards,
>>>
>>> Pierre
>
> [1] At that point, we either succeed with csch and can either retry or
> fail, or fail with device gone, in which case the device is, well,
> gone, which we also should be able to handle fine.
>
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
next prev parent reply other threads:[~2017-08-25 11:21 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-23 16:33 [PATCH] [RFC] virtio: Limit the retries on a virtio device reset Pierre Morel
2017-08-24 11:07 ` Cornelia Huck
2017-08-24 12:16 ` Pierre Morel
2017-08-24 14:12 ` Michael S. Tsirkin
2017-08-24 17:07 ` Pierre Morel
2017-08-24 21:16 ` Michael S. Tsirkin
2017-08-25 8:26 ` Cornelia Huck
2017-08-25 11:21 ` Pierre Morel [this message]
2017-08-25 16:43 ` Michael S. Tsirkin
2017-08-24 14:19 ` Michael S. Tsirkin
2017-08-24 17:42 ` Pierre Morel
2017-08-24 21:23 ` Michael S. Tsirkin
2017-08-25 8:33 ` Pierre Morel
2017-08-25 16:46 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c95e82e6-a39f-ad9a-45cd-13e9f9870caf@linux.vnet.ibm.com \
--to=pmorel@linux.vnet.ibm.com \
--cc=cohuck@redhat.com \
--cc=mst@redhat.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).