* How to reissue stuck inflight i/o
@ 2013-11-26 17:54 Spelic
From: Spelic @ 2013-11-26 17:54 UTC
  To: device-mapper development

Hello all,
we just had a case in which some I/Os were apparently stuck in flight in
an LV, i.e. a DM device (visible in iostat as a nonzero avgqu-sz for a
dm-X device), for at least 20 minutes, while all layers below it (MD RAID
and then the disks) had zero in-flight I/O, so DM and the layers above it
were waiting endlessly.
This was with an old kernel, 2.6.24-something.
I wasn't able to debug further. After 30 minutes or so it resolved by
itself without leaving anything in dmesg or anywhere else.
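
For anyone wanting to check for the same symptom, here is a minimal sketch of the comparison described above: a dm-X device with in-flight I/O while everything directly below it reports zero. It assumes a kernel that exposes /sys/block/<dev>/stat with the in-flight count as the ninth field and a /sys/block/<dev>/slaves directory; older kernels may expose fewer fields for some devices, in which case the count is treated as unknown. The function names are hypothetical, not part of any existing tool.

```python
import os


def read_inflight(dev_dir):
    """Return the in-flight I/O count from <dev_dir>/stat, or None if unknown.

    Assumption: the ninth whitespace-separated field of the stat file is
    the number of I/Os currently in flight.
    """
    try:
        with open(os.path.join(dev_dir, "stat")) as f:
            fields = f.read().split()
    except OSError:
        return None
    if len(fields) < 9:
        return None  # stat layout without an in-flight field
    return int(fields[8])


def stuck_candidates(sysfs_root="/sys/block"):
    """List (device, inflight) for dm-* devices that have in-flight I/O
    while every device listed under their slaves/ directory reports zero."""
    if not os.path.isdir(sysfs_root):
        return []
    found = []
    for dev in sorted(os.listdir(sysfs_root)):
        if not dev.startswith("dm-"):
            continue
        dev_dir = os.path.join(sysfs_root, dev)
        top = read_inflight(dev_dir)
        slaves_dir = os.path.join(dev_dir, "slaves")
        slaves = sorted(os.listdir(slaves_dir)) if os.path.isdir(slaves_dir) else []
        below = [read_inflight(os.path.join(slaves_dir, s)) for s in slaves]
        # An unknown (None) count below never qualifies as "zero".
        if top and slaves and all(n == 0 for n in below):
            found.append((dev, top))
    return found


if __name__ == "__main__":
    for dev, count in stuck_candidates():
        print("%s: %d in flight, nothing in flight below" % (dev, count))
```

A single reading is only a snapshot, and I/O that has just been handed down can briefly look like this. The result only becomes interesting if the same device keeps showing up across repeated samples over several minutes.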

Is there a way for DM to reissue in-flight I/O to the lower layers
(similar to what happens transparently with the 5 SCSI retries after a
SCSI timeout), or at least to kill such I/O so that the layers above
receive an I/O error and move on?
I was thinking of some dmsetup command, but it was not clear to me which
one. What about a dmsetup suspend followed by a resume? I didn't think of
trying this at the time.

Thanks
S.


* Re: How to reissue stuck inflight i/o
@ 2013-11-27  1:23 ` Mikulas Patocka
From: Mikulas Patocka @ 2013-11-27  1:23 UTC
  To: device-mapper development, Spelic



On Tue, 26 Nov 2013, Spelic wrote:

> Hello all,
> we just had a case in which some I/Os were apparently stuck in flight in an
> LV, i.e. a DM device (visible in iostat as a nonzero avgqu-sz for a dm-X
> device), for at least 20 minutes, while all layers below it (MD RAID and
> then the disks) had zero in-flight I/O, so DM and the layers above it were
> waiting endlessly.
> This was with an old kernel, 2.6.24-something.
> I wasn't able to debug further. After 30 minutes or so it resolved by itself
> without leaving anything in dmesg or anywhere else.
> 
> Is there a way for DM to reissue in-flight I/O to the lower layers (similar
> to what happens transparently with the 5 SCSI retries after a SCSI timeout),
> or at least to kill such I/O so that the layers above receive an I/O error
> and move on?
> I was thinking of some dmsetup command, but it was not clear to me which one.
> What about a dmsetup suspend followed by a resume? I didn't think of trying
> this at the time.
> 
> Thanks
> S.

Hi

There is no way to reissue or cancel stuck I/O. The kernel architecture 
doesn't support this sort of operation.

Timeouts happen only in the low-level physical device driver.

The higher-level drivers (dm and md) have no timeouts. It is generally
assumed that if the lowest-level I/O finishes in finite time, dm or md
will finish the high-level I/O in finite time too. (Some dm targets, such
as dm-mirror or dm-multipath, need userspace helper daemons; if the
daemon is not running, I/Os can be stuck indefinitely in the dm driver.)
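
As a quick way to rule out the missing-daemon case, a rough sketch like the following can list which expected helper processes are not running. It assumes a kernel that provides /proc/<pid>/comm. The daemon names in the example (multipathd for dm-multipath, dmeventd for mirror event handling) are assumptions about a typical setup and should be adjusted to whatever the targets in use actually require. The function names are hypothetical.

```python
import os


def running_commands(proc_root="/proc"):
    """Return the set of command names found in <proc_root>/<pid>/comm."""
    names = set()
    if not os.path.isdir(proc_root):
        return names
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                names.add(f.read().strip())
        except OSError:
            continue  # process exited meanwhile, or comm is unavailable
    return names


def missing_daemons(expected, proc_root="/proc"):
    """Return the names from `expected` that no running process carries."""
    running = running_commands(proc_root)
    return [name for name in expected if name not in running]


if __name__ == "__main__":
    # Assumed daemon names; adjust for the dm targets actually in use.
    for name in missing_daemons(["multipathd", "dmeventd"]):
        print("not running: %s" % name)
```

A daemon that is running but hung would still pass this check, so an empty result does not prove the helper is healthy.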

Anyway, if you are seeing stuck I/Os and they are not caused by missing
userspace daemons, you should try to reproduce the problem and report it
as a bug.

Mikulas
