qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC PATCH 0/4] Virtio command timeouts (qemu part)
@ 2017-05-12 10:20 Hannes Reinecke
  2017-05-12 10:20 ` [Qemu-devel] [PATCH 1/4] scsi: make default command timeout user-settable Hannes Reinecke
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Hannes Reinecke @ 2017-05-12 10:20 UTC
  To: Paolo Bonzini; +Cc: Alexander Graf, qemu-devel, Hannes Reinecke

Hi all,

we've run into a really awkward customer situation where the guest would
hang forever because an SG_IO ioctl on the host never returned.
Looking into it, we found that qemu submits direct I/O requests with
an _infinite_ timeout (well, actually UINT_MAX, which due to a kernel
bug gets translated into (ULONG)-2, resulting in a timeout of
4.2 years :-).
This particular I/O ran into a timeout on the wire due to a flaky
connection. As a result, the 'normal' block-level timeout on the
host was effectively disabled, and the SCSI stack never sent any aborts
because the block layer was still waiting for the I/O timeout to expire.

Unfortunately, I didn't find a way to create a stand-alone patch; the
fix I'm proposing relies on changes to both qemu running on the host
and the kernel running in the guest.

The proposed fix consists of several parts:
- Make the standard device timeout user-settable via a 'timeout'
  attribute to 'scsi-disk' and 'scsi-generic'.
- Add a kernel patch to implement an eh_timeout_handler() for
  virtio_scsi(); this patch just checks if the command is still pending
  and resets the timer if so.
- Add a request timeout to allow drivers to modify the timeout
  on a per-request basis.
- Implement a new VIRTIO_SCSI_F_TIMEOUT feature allowing virtio-scsi
  to pass in a timeout via the otherwise unused 'crn' field.
- Add a kernel patch to implement the VIRTIO_SCSI_F_TIMEOUT feature
  so that the timeout is added per virtio request.
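
As a rough illustration of how the parts above could fit together on
the qemu side, here is a minimal, self-contained sketch. It is not
taken from the patches: the feature bit number, the struct and function
names, and the assumption that 'crn' carries the timeout in seconds
(with 0 meaning "not set") are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit number; the real value would come from the patch/spec. */
#define SKETCH_VIRTIO_SCSI_F_TIMEOUT 4

/*
 * Simplified stand-in for the fixed part of a virtio-scsi command
 * request. Only the 'crn' field matters for this sketch.
 */
struct sketch_cmd_req {
    uint8_t  lun[8];
    uint64_t tag;
    uint8_t  task_attr;
    uint8_t  prio;
    uint8_t  crn;   /* assumed: guest timeout in seconds, 0 = not set */
};

/*
 * Pick the timeout to use for one request:
 * - if the feature was negotiated and the guest supplied a value,
 *   use the per-request timeout from 'crn';
 * - otherwise fall back to the device-level 'timeout' attribute.
 */
static uint32_t sketch_effective_timeout(uint64_t features,
                                         const struct sketch_cmd_req *req,
                                         uint32_t device_timeout)
{
    bool negotiated =
        (features & (1ULL << SKETCH_VIRTIO_SCSI_F_TIMEOUT)) != 0;

    if (negotiated && req->crn != 0) {
        return req->crn;
    }
    return device_timeout;
}
```

Since 'crn' is a single byte, a scheme like this can only express
timeouts up to 255 units, so the choice of unit matters; the sketch
simply assumes seconds.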

With that, virtio-scsi in the guest can pass the timeout it uses to
qemu on the host side, which can then use this timeout when issuing I/O
requests to the host.
The host can then properly abort a command if the timeout is hit, and
the aborted command will be returned to the guest.
The guest itself no longer needs to (and, in fact, in most cases can't)
abort any commands, so it just needs to reset the I/O timer until the
requests are returned.
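
The guest-side behaviour described above could be sketched roughly as
follows. This is an illustration only, not the actual kernel patch: the
enum, struct, and function names are hypothetical and merely modelled
on the idea of a timeout handler that either re-arms the timer or
declines to handle the timeout.

```c
#include <stdbool.h>

/* Hypothetical return codes for the sketch. */
enum sketch_eh_result {
    SKETCH_EH_RESET_TIMER,  /* command still in flight: re-arm the timer */
    SKETCH_EH_NOT_HANDLED   /* command is gone: let normal handling run  */
};

/* Hypothetical per-command state tracked by the guest driver. */
struct sketch_cmd {
    bool pending;  /* true while the host has not returned the request */
};

/*
 * Timeout handler: the guest never aborts on its own. While the
 * request is still pending on the host it just resets the I/O timer
 * and waits for the host to complete or abort the command.
 */
static enum sketch_eh_result sketch_timed_out(const struct sketch_cmd *cmd)
{
    return cmd->pending ? SKETCH_EH_RESET_TIMER : SKETCH_EH_NOT_HANDLED;
}
```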

However, as this is quite an elaborate construct, I'd like to get some
feedback on it.

Hannes Reinecke (4):
  scsi: make default command timeout user-settable
  scsi: use host default timeouts for SCSI commands
  scsi: per-request timeouts
  virtio: implement VIRTIO_SCSI_F_TIMEOUT feature

 hw/scsi/scsi-bus.c                           |  1 +
 hw/scsi/scsi-disk.c                          | 16 ++++++++++++----
 hw/scsi/scsi-generic.c                       | 11 +++++++++--
 hw/scsi/virtio-scsi.c                        | 16 ++++++++++++++++
 include/hw/scsi/scsi.h                       |  2 ++
 include/standard-headers/linux/virtio_scsi.h |  1 +
 6 files changed, 41 insertions(+), 6 deletions(-)

-- 
2.12.0
