From: Paolo Bonzini <pbonzini@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Asias He <asias@redhat.com>, Hannes Reinecke <hare@suse.de>,
Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] virtio-scsi and error handling
Date: Wed, 12 Jun 2013 16:19:14 -0400
Message-ID: <51B8D7C2.3040907@redhat.com>
In-Reply-To: <20130612075620.GD946@stefanha-thinkpad.muc.redhat.com>
On 12/06/2013 03:56, Stefan Hajnoczi wrote:
> On Tue, Jun 11, 2013 at 01:41:38PM +0200, Hannes Reinecke wrote:
> >> I'm currently playing around with improving SCSI EH, optimizing
>> command aborts and the like.
>>
>> And, supposing it to be a nice testbed, tried to make things work
>> with virtio_scsi.
>>
>> However, looking at the code there I've found virtscsi_tmf() just
>> uses 'wait_for_completion', with no timeout specified. So in effect
>> any abort might stall forever.
>>
>> Wouldn't it be more sensible to use 'wait_for_completion_timeout'
>> here, to allow the error escalation to continue?
>> This would especially be useful when running with multipathing,
>> as the underlying device might stall, and aio_cancel() doesn't work
>> reliably, if at all.
>
> Hi,
> I agree that we need a timeout. bdrv_aio_cancel() is not guaranteed to
> complete in bounded time.
I also agree that we need a timeout, but note that a host reset might
also fail to complete in bounded time if I/O doesn't terminate on the host.
Last time I checked, the io_cancel system call was basically a no-op (for
aio=native), and for aio=threads the worker might stay in D state for an
unbounded time too.
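To make the bounded-wait idea concrete, here is a minimal userspace sketch
in Python. It is not the virtio_scsi driver code: threading.Event.wait(timeout)
merely stands in for wait_for_completion_timeout, and send_tmf, escalate and
the level names are hypothetical names chosen for illustration. Each
escalation level gets a bounded wait, and the last level may time out as well.

```python
import threading


def send_tmf(name, completes, done):
    """Hypothetical stand-in for queuing a task-management request.

    If `completes` is true, a helper thread signals `done` shortly
    afterwards; otherwise the request never completes, like I/O that
    is stuck in the host.
    """
    if completes:
        threading.Timer(0.01, done.set).start()


def escalate(levels, timeout=0.2):
    """Try each error-handling level in order with a bounded wait.

    `levels` is a list of (name, completes) pairs. Returns the name of
    the first level whose request completed within `timeout` seconds,
    or None if every level timed out.
    """
    for name, completes in levels:
        done = threading.Event()
        send_tmf(name, completes, done)
        # Event.wait returns False on timeout instead of blocking forever.
        if done.wait(timeout):
            return name
    return None
```

With a bounded wait the caller always regains control, but a None result
still has to be handled: if even the final level times out, the timeout
alone does not make the stuck I/O go away.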
Paolo
>> Also I've found that there is no host reset. Currently the virtio
> >> semantics seem to require reliable communication, i.e. for every
> >> command sent there _has_ to be a response.
>>
>> Long and painful experience with RAID HBAs has shown that this model
>> works okay for the lower-level escalations, but you absolutely need
>> a host reset to restore communication.
>> In the case of virtio I would think that a virtio-level reset for
>> host_reset would be a sensible idea.
>
> One thing to watch out for is that a virtio-scsi reset will likely hang
> too because it resets all pending requests.
>
> Paolo Bonzini has done the lion's share of virtio-scsi work over the
> past year (or two?). He might have some more thoughts.
>
> Stefan
>