* [Qemu-devel] question:about introduce a new feature named “I/O hang”
@ 2019-07-04 15:16 wangjie (P)
  2019-07-05  7:50 ` Kevin Wolf
  0 siblings, 1 reply; 3+ messages in thread
From: wangjie (P) @ 2019-07-04 15:16 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: kwolf, Fangyi (C), armbru, mreitz, wangjie88, Paolo Bonzini

Hi, everybody:

I have developed a feature named "I/O hang"; my intention is to solve the
following problem:
If the backend storage of a VM disk is remote storage such as IP SAN or
FC SAN, the storage network link can occasionally disconnect, which makes
I/O requests return EIO to the guest and leaves the guest filesystem
read-only. Even after the link recovers a short while later, the guest
filesystem does not recover.

So I developed "I/O hang" to solve this problem. The solution works like
this: when an I/O request returns EIO from the backend, "I/O hang" catches
the request in the QEMU block layer and inserts it into a rehandle queue
instead of returning EIO to the guest. The I/O request in the guest will
hang, but the guest filesystem does not become read-only. "I/O hang" then
periodically (e.g. every 5 seconds) retries the queued requests until they
no longer return EIO, i.e. until the backend storage link has recovered.
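
For illustration, below is a minimal, self-contained sketch of the
rehandle-queue idea. The types and names are only illustrative; they are
not the actual QEMU block-layer implementation.

    /* Minimal sketch of the rehandle-queue idea; illustrative only, not
     * the actual QEMU block-layer implementation. */
    #include <errno.h>
    #include <stdbool.h>

    #define MAX_QUEUED         64
    #define RETRY_INTERVAL_SEC  5

    typedef struct IORequest {
        long sector;
        bool is_write;
    } IORequest;

    /* Requests that failed with EIO are parked here instead of completing. */
    static IORequest *rehandle_queue[MAX_QUEUED];
    static int rehandle_count;

    /* Stand-in for submitting a request to the backend storage:
     * returns -EIO while the link is down, 0 once it has recovered. */
    static int backend_submit(IORequest *req)
    {
        (void)req;
        return -EIO;
    }

    /* Stand-in for delivering the final result to the guest. */
    static void complete_to_guest(IORequest *req, int ret)
    {
        (void)req;
        (void)ret;
    }

    /* Completion path: park EIO failures instead of reporting them. */
    static void request_complete(IORequest *req, int ret)
    {
        if (ret == -EIO && rehandle_count < MAX_QUEUED) {
            rehandle_queue[rehandle_count++] = req; /* guest sees the I/O hang */
            return;
        }
        complete_to_guest(req, ret);
    }

    /* Timer callback, re-armed every RETRY_INTERVAL_SEC while the queue is
     * non-empty: resubmit everything that is parked. */
    static void rehandle_timer_cb(void)
    {
        int i, n = rehandle_count;

        rehandle_count = 0;
        for (i = 0; i < n; i++) {
            request_complete(rehandle_queue[i], backend_submit(rehandle_queue[i]));
        }
    }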

In addition to the behaviour described above, "I/O hang" can also send an
event to libvirt when the backend storage status changes.

Configuration methods:
1. The "I/O hang" ability can be configured per disk, as a disk attribute.
2. The "I/O hang" timeout value can also be configured per disk; if the
   storage link does not recover within the timeout, "I/O hang" stops
   rehandling the I/O requests and returns EIO to the guest. (A hypothetical
   command line is sketched below.)


Are you interested in this feature? I intend to submit it to the QEMU
project; what is your opinion?




* Re: [Qemu-devel]  question:about introduce a new feature named “I/O hang”
  2019-07-04 15:16 [Qemu-devel] question:about introduce a new feature named “I/O hang” wangjie (P)
@ 2019-07-05  7:50 ` Kevin Wolf
  2019-07-08 13:16   ` [Qemu-devel] [Qemu-block] " Maxim Levitsky
  0 siblings, 1 reply; 3+ messages in thread
From: Kevin Wolf @ 2019-07-05  7:50 UTC (permalink / raw)
  To: wangjie (P)
  Cc: qemu-block, Fangyi (C), armbru, qemu-devel, Paolo Bonzini, mreitz

On 04.07.2019 at 17:16, wangjie (P) wrote:
> Hi, everybody:
> 
> I have developed a feature named "I/O hang"; my intention is to solve the
> following problem:
> If the backend storage of a VM disk is remote storage such as IP SAN or
> FC SAN, the storage network link can occasionally disconnect, which makes
> I/O requests return EIO to the guest and leaves the guest filesystem
> read-only. Even after the link recovers a short while later, the guest
> filesystem does not recover.

The standard solution for this is configuring the guest device with
werror=stop,rerror=stop so that the error is not delivered to the guest,
but the VM is stopped. When you run 'cont', the request is then retried.
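
For example, with a virtio-blk disk this looks roughly like:

    qemu-system-x86_64 ... \
        -drive file=disk0.qcow2,format=qcow2,if=none,id=drive0 \
        -device virtio-blk-pci,drive=drive0,werror=stop,rerror=stop

When the backend reports an error, the VM pauses instead of the guest
seeing EIO; once the storage link has recovered, 'cont' in the monitor
resumes the VM and the failed request is retried.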

> So I developed "I/O hang" to solve this problem. The solution works like
> this: when an I/O request returns EIO from the backend, "I/O hang" catches
> the request in the QEMU block layer and inserts it into a rehandle queue
> instead of returning EIO to the guest. The I/O request in the guest will
> hang, but the guest filesystem does not become read-only. "I/O hang" then
> periodically (e.g. every 5 seconds) retries the queued requests until they
> no longer return EIO, i.e. until the backend storage link has recovered.

Letting requests hang without stopping the VM risks the guest running
into timeouts and deciding that its disk is broken.

As you say your "hang" and retry logic sits in the block layer, what do
you do when you encounter a bdrv_drain() request?
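
(To make the concern concrete, here is a toy model of the conflict, not
actual QEMU code: a drain has to wait for every in-flight request to
complete, so requests parked for retry keep it waiting as long as the link
is down.)

    /* Toy model of the drain conflict; not actual QEMU code. */
    static int in_flight;    /* requests submitted but not yet completed */

    static void run_event_loop_once(void);   /* hypothetical event-loop step */

    static void drain(void)
    {
        /* A drain has to wait until nothing is in flight ... */
        while (in_flight > 0) {
            run_event_loop_once();
        }
        /* ... but a request parked in a rehandle queue never completes while
         * the storage link is down, so in_flight stays non-zero and this
         * loop never terminates (or the request must be failed after all). */
    }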

> In addition to the behaviour described above, "I/O hang" can also send an
> event to libvirt when the backend storage status changes.
> 
> Configuration methods:
> 1. The "I/O hang" ability can be configured per disk, as a disk attribute.
> 2. The "I/O hang" timeout value can also be configured per disk; if the
>    storage link does not recover within the timeout, "I/O hang" stops
>    rehandling the I/O requests and returns EIO to the guest.
> 
> Are you interested in this feature? I intend to submit it to the QEMU
> project; what is your opinion?

Were you aware of werror/rerror? Before we add another mechanism, we
need to be sure how the features compare, that the new mechanism
provides a significant advantage and that we keep code duplication as
low as possible.

Kevin



* Re: [Qemu-devel]  [Qemu-block] question:about introduce a new feature named “I/O hang”
  2019-07-05  7:50 ` Kevin Wolf
@ 2019-07-08 13:16   ` Maxim Levitsky
  0 siblings, 0 replies; 3+ messages in thread
From: Maxim Levitsky @ 2019-07-08 13:16 UTC (permalink / raw)
  To: Kevin Wolf, wangjie (P)
  Cc: qemu-block, Fangyi (C), armbru, qemu-devel, mreitz, Paolo Bonzini

On Fri, 2019-07-05 at 09:50 +0200, Kevin Wolf wrote:
> On 04.07.2019 at 17:16, wangjie (P) wrote:
> > Hi, everybody:
> > 
> > I have developed a feature named "I/O hang"; my intention is to solve the
> > following problem:
> > If the backend storage of a VM disk is remote storage such as IP SAN or
> > FC SAN, the storage network link can occasionally disconnect, which makes
> > I/O requests return EIO to the guest and leaves the guest filesystem
> > read-only. Even after the link recovers a short while later, the guest
> > filesystem does not recover.
> 
> The standard solution for this is configuring the guest device with
> werror=stop,rerror=stop so that the error is not delivered to the guest,
> but the VM is stopped. When you run 'cont', the request is then retried.
> 
> > So I developed "I/O hang" to solve this problem. The solution works like
> > this: when an I/O request returns EIO from the backend, "I/O hang" catches
> > the request in the QEMU block layer and inserts it into a rehandle queue
> > instead of returning EIO to the guest. The I/O request in the guest will
> > hang, but the guest filesystem does not become read-only. "I/O hang" then
> > periodically (e.g. every 5 seconds) retries the queued requests until they
> > no longer return EIO, i.e. until the backend storage link has recovered.
> 
> Letting requests hang without stopping the VM risks the guest running
> into timeouts and deciding that its disk is broken.
I came to say exactly this.
While developing nvme-mdev I also ran into this problem: because of the
assumptions built into the block layer, you can't just let the guest wait
forever for a request.

Note that Linux's NVMe driver does know how to retry failed requests,
including those that timed out, if that helps in any way.

Best regards,
	Maxim Levitsky


> 
> As you say your "hang" and retry logic sits in the block layer, what do
> you do when you encounter a bdrv_drain() request?
> 
> > In addition to the behaviour described above, "I/O hang" can also send an
> > event to libvirt when the backend storage status changes.
> > 
> > Configuration methods:
> > 1. The "I/O hang" ability can be configured per disk, as a disk attribute.
> > 2. The "I/O hang" timeout value can also be configured per disk; if the
> >    storage link does not recover within the timeout, "I/O hang" stops
> >    rehandling the I/O requests and returns EIO to the guest.
> > 
> > Are you interested in this feature? I intend to submit it to the QEMU
> > project; what is your opinion?
> 
> Were you aware of werror/rerror? Before we add another mechanism, we
> need to be sure how the features compare, that the new mechanism
> provides a significant advantage and that we keep code duplication as
> low as possible.
> 
> Kevin
> 




