[Qemu-devel] qemu AIO worker threads change causes Guest OS hangup

* [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
@ 2016-03-01 18:45 Huaicheng Li
  2016-03-01 21:01 ` Paolo Bonzini
  2016-03-01 21:34 ` Stefan Hajnoczi
  0 siblings, 2 replies; 7+ messages in thread
From: Huaicheng Li @ 2016-03-01 18:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha

Hi all,

I’m trying to conditionally add some latency to I/O requests (qemu_paiocb, from **IDE** disk emulation, **raw** image file).
My idea is to add the following logic to the worker thread:

  * First, set a timer (e.g. 2ms) for each incoming qemu_paiocb structure.
  * When a worker thread handles this I/O, it first checks whether the timer has expired.
     If so, it proceeds with the normal r/w handling of the image file on the host. Otherwise, it inserts
     the I/O request back into `request_list` via `qemu_paio_submit`. Here, I just want to skip
     the I/O until the timer condition is satisfied.
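
A minimal sketch of the check described in the list above, written as a standalone C model rather than a QEMU patch. `DelayedReq`, `delayed_req_arm` and `delayed_req_ready` are hypothetical names, not QEMU APIs; in the real code the structure would be qemu_paiocb and the timestamps would come from a host clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued request; not QEMU's qemu_paiocb. */
typedef struct DelayedReq {
    int64_t deadline_ns;  /* earliest time the request may be serviced */
    int requeues;         /* how many times a worker has skipped it */
} DelayedReq;

/* Stamp the request on arrival: deadline = now + delay (e.g. 2 ms). */
static void delayed_req_arm(DelayedReq *r, int64_t now_ns, int64_t delay_ns)
{
    assert(delay_ns >= 0);
    r->deadline_ns = now_ns + delay_ns;
    r->requeues = 0;
}

/*
 * Worker-side check. Returns true when the real read/write should be
 * done now, false when the request should be skipped and put back on
 * the queue. Each skip is counted so the amount of spinning is visible.
 */
static bool delayed_req_ready(DelayedReq *r, int64_t now_ns)
{
    if (now_ns >= r->deadline_ns) {
        return true;
    }
    r->requeues++;
    return false;
}
```

A worker would call `delayed_req_ready()` with the current time and only touch the image file when it returns true. Nothing in this scheme sleeps until the deadline, so idle workers keep pulling and requeueing the same request until it becomes due.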

Logically, I think this should be right. 

But after I run some I/O tests inside the guest OS, the guest hangs (freezes) with “INFO: task xxx blocked for more than 120 seconds”. 
From the guest OS’s perspective, the disk seems to be very busy, so the kernel keeps waiting for I/O and stops responding to other tasks. So I guess the problem is still in the worker threads.

My questions are:

  * Is it safe to call `qemu_paio_submit` from a worker thread? Since all `request_list` accesses are protected by a lock, I think this is OK.

  * What are the possible reasons for the guest OS hanging? My understanding is that, although the worker threads will be busy skipping I/O many times, they will eventually finish the task (the guest OS freezes after my r/w test program runs successfully; only then does it become unresponsive).

  * Any thoughts on debugging? Currently I’m doing some checks (e.g. the request_list length, number of threads) via printf. It seems hard to use gdb here because the guest OS will hit a timeout if I stay at a breakpoint for “too long”. 

Any suggestions would be appreciated. 

Thanks.

Best,
Huaicheng


* Re: [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
  2016-03-01 18:45 [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup Huaicheng Li
@ 2016-03-01 21:01 ` Paolo Bonzini
  2016-03-06  2:42   ` Huaicheng Li (coperd)
  2016-03-01 21:34 ` Stefan Hajnoczi
  1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2016-03-01 21:01 UTC (permalink / raw)
  To: Huaicheng Li, qemu-devel; +Cc: stefanha



On 01/03/2016 19:45, Huaicheng Li wrote:
> 
> * Is it safe to call `qemu_paio_submit` from one worker thread? Since
> all request_list accesses are protected by lock, I think this is
> OK.

No, it's not possible.  The "all" list in thread-pool.c is protected
with the AioContext lock, not with the thread pool lock.  This is done
because the worker threads only care about the queued request list, not
about active or completed requests.
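
The locking rule described above can be modelled roughly as follows. This is a simplified sketch using POSIX mutexes, not the code in thread-pool.c: `Pool`, `Elem`, `ctx_lock` and `pool_lock` are stand-in names, and the queue order is not meant to match QEMU's.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical request element linked into two separate lists. */
typedef struct Elem {
    struct Elem *next_all;     /* link in the "all" list */
    struct Elem *next_queued;  /* link in the queued-request list */
} Elem;

typedef struct Pool {
    pthread_mutex_t ctx_lock;   /* stand-in for the AioContext lock */
    pthread_mutex_t pool_lock;  /* stand-in for the thread pool lock */
    Elem *all;                  /* guarded by ctx_lock */
    Elem *queued;               /* guarded by pool_lock */
    size_t n_all;
    size_t n_queued;
} Pool;

static void pool_init(Pool *p)
{
    pthread_mutex_init(&p->ctx_lock, NULL);
    pthread_mutex_init(&p->pool_lock, NULL);
    p->all = NULL;
    p->queued = NULL;
    p->n_all = 0;
    p->n_queued = 0;
}

/*
 * Submission path. The caller must already hold ctx_lock, because the
 * "all" list is modified here without taking any lock of its own.
 */
static void pool_submit(Pool *p, Elem *e)
{
    assert(e != NULL);
    e->next_all = p->all;
    p->all = e;
    p->n_all++;

    pthread_mutex_lock(&p->pool_lock);
    e->next_queued = p->queued;
    p->queued = e;
    p->n_queued++;
    pthread_mutex_unlock(&p->pool_lock);
}

/* Worker path: takes only pool_lock and never looks at the "all" list. */
static Elem *pool_take(Pool *p)
{
    Elem *e;

    pthread_mutex_lock(&p->pool_lock);
    e = p->queued;
    if (e != NULL) {
        p->queued = e->next_queued;
        e->next_queued = NULL;
        p->n_queued--;
    }
    pthread_mutex_unlock(&p->pool_lock);
    return e;
}
```

In this model a worker holds neither lock while it runs a request, so calling `pool_submit()` from a worker would modify `all` without `ctx_lock`, which is the unsafe case.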

Paolo


* Re: [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
  2016-03-01 18:45 [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup Huaicheng Li
  2016-03-01 21:01 ` Paolo Bonzini
@ 2016-03-01 21:34 ` Stefan Hajnoczi
  2016-03-06  2:33   ` Huaicheng Li (coperd)
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2016-03-01 21:34 UTC (permalink / raw)
  To: Huaicheng Li; +Cc: qemu-devel

----- Original Message -----
> From: "Huaicheng Li" <lhcwhu@gmail.com>
> I’m trying to add some latency conditionally to I/O requests (qemu_paiocb,
> from **IDE** disk emulation, **raw** image file).

Paolo already covered the technical issue with what you're doing.

Have you seen Linux Documentation/device-mapper/delay.txt?

You could set up a loopback block device and put the device-mapper delay target on top to simulate latency.

Stefan


* Re: [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
  2016-03-01 21:34 ` Stefan Hajnoczi
@ 2016-03-06  2:33   ` Huaicheng Li (coperd)
  0 siblings, 0 replies; 7+ messages in thread
From: Huaicheng Li (coperd) @ 2016-03-06  2:33 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel


> On Mar 1, 2016, at 3:34 PM, Stefan Hajnoczi <shajnocz@redhat.com> wrote:
> 
> Have you seen Linux Documentation/device-mapper/delay.txt?
> 
> You could set up a loopback block device and put the device-mapper delay target on top to simulate latency.


I’m working on one idea to emulate the latency of SSD read/write, 
which is *dynamically* changing according to the status of emulated 
flash media. Thanks for the suggestion.


* Re: [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
  2016-03-01 21:01 ` Paolo Bonzini
@ 2016-03-06  2:42   ` Huaicheng Li (coperd)
  2016-03-07 13:02     ` Paolo Bonzini
  2016-03-07 14:32     ` Huaicheng Li (coperd)
  0 siblings, 2 replies; 7+ messages in thread
From: Huaicheng Li (coperd) @ 2016-03-06  2:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha


> On Mar 1, 2016, at 3:01 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> This is done
> because the worker threads only care about the queued request list, not
> about active or completed requests.

Do you think it would be useful to add an API for inserting a request back 
into the queued list? For example, in case of a request failure, we could insert it 
back into the list for re-handling according to some rule before returning it directly 
to the guest OS. 

Best,
Huaicheng



* Re: [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
  2016-03-06  2:42   ` Huaicheng Li (coperd)
@ 2016-03-07 13:02     ` Paolo Bonzini
  2016-03-07 14:32     ` Huaicheng Li (coperd)
  1 sibling, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2016-03-07 13:02 UTC (permalink / raw)
  To: Huaicheng Li (coperd); +Cc: qemu-devel, stefanha



On 06/03/2016 03:42, Huaicheng Li (coperd) wrote:
> 
>> On Mar 1, 2016, at 3:01 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> This is done
>> because the worker threads only care about the queued request list, not
>> about active or completed requests.
> 
> Do you think it would be useful to add an API for inserting one request back
> to the queued list? For example, In case of request failure, we can insert it
> back to the list for re-handling according to some rule before returning it directly
> to guest os.

Hi,

this is usually handled at a higher level than the thread pool.  See for
example hw/block/virtio-blk.c's restart mechanism, which is enabled with
the rerror=stop and werror=stop options of -drive.
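
Retrying above the thread pool, as suggested here, can be sketched like this. It is an illustrative model only: `Dev`, `Req`, `dev_complete` and `dev_take_parked` are hypothetical names and do not mirror the actual code in hw/block/virtio-blk.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-drive error policy, loosely modelled on "stop". */
typedef enum { ON_ERROR_REPORT, ON_ERROR_STOP } ErrorPolicy;

typedef struct Req {
    struct Req *next;  /* link in the device's parked list */
    int error;         /* 0 on success, negative on failure */
    int attempts;      /* number of completions seen so far */
} Req;

typedef struct Dev {
    ErrorPolicy policy;
    Req *parked;       /* failed requests waiting to be retried */
    int stopped;       /* nonzero while the device waits for a resume */
} Dev;

/*
 * Completion callback, run by the device model and not by a pool worker.
 * Returns 1 if the result should be reported to the guest, 0 if the
 * request was parked for a later retry.
 */
static int dev_complete(Dev *d, Req *r, int ret)
{
    assert(r != NULL);
    r->attempts++;
    r->error = ret;
    if (ret < 0 && d->policy == ON_ERROR_STOP) {
        r->next = d->parked;
        d->parked = r;
        d->stopped = 1;
        return 0;
    }
    return 1;
}

/*
 * On resume, detach the whole parked list and hand it to the caller,
 * who resubmits each request through the normal submission path.
 */
static Req *dev_take_parked(Dev *d)
{
    Req *list = d->parked;

    d->parked = NULL;
    d->stopped = 0;
    return list;
}
```

The thread pool never sees the retry policy in this arrangement: a worker simply reports a result, and the decision to park or resubmit is made by the code that owns the request.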

Paolo


* Re: [Qemu-devel] qemu AIO worker threads change causes Guest OS hangup
  2016-03-06  2:42   ` Huaicheng Li (coperd)
  2016-03-07 13:02     ` Paolo Bonzini
@ 2016-03-07 14:32     ` Huaicheng Li (coperd)
  1 sibling, 0 replies; 7+ messages in thread
From: Huaicheng Li (coperd) @ 2016-03-07 14:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha


Thank you for the help. 

