* [Qemu-devel] Hanging VM: Fix block I/O hang patch could help?
@ 2008-12-10  5:51 Daniel Dickinson
  2008-12-10 15:27 ` Anthony Liguori
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Dickinson @ 2008-12-10  5:51 UTC (permalink / raw)
  To: qemu-devel@nongnu.org

Hello,
	
	I have been experiencing VM hangs for some time (anything after
0.8.2 on Debian) with a Windows 98 SE guest (where I can
consistently hang it at a certain part of the install procedure) as
well as various Ubuntu guests (where it will hang but is 'just random').

From Debian bug #474386, quoting myself:

I have done the bisect and narrowed it down to three commits.  Two of
the commits do not compile, so either could be the culprit; the third
is the first bad commit that does compile.

The last good revision is 2072 (git
d62ca2669be6d6af2c0cbda47abd7e51548060bf)

The first known bad commit is 2075 (git
83f640910acd7cd13ff8a603f29c46033c4fb00)

I have attached the diff between these two revisions (three commits
worth).

The first bad commit is large: it is the switch to asynchronous file
I/O for the disk images.  I suspect the problem is a race condition in
that code, which would be unpleasant.  What makes me hesitate is that,
if that were the cause, I would expect random VM hangs across the
board (though possibly rarely) and not just for me.  I do experience
hangs in Ubuntu (two versions) as well as Windows 98, so that much is
consistent, ...

I have noticed in the user forums some indication that this is
happening to other people too.

I have also seen the message series titled

Re: [Qemu-devel] [patch] Fix block I/O hang.

and was wondering if that could be the problem.  I include the last
message in that series below.

Re: [Qemu-devel] [patch] Fix block I/O hang.
From: 	Gerd Hoffmann
Subject: 	Re: [Qemu-devel] [patch] Fix block I/O hang.
Date: 	Thu, 13 Nov 2008 10:14:31 +0100
User-agent: 	Thunderbird 2.0.0.16 (X11/20080723)

Anthony Liguori wrote:
> Gerd Hoffmann wrote:
>>  
>>> Under what circumstances?  posix_aio_read() is only invoked from a
>>> select callback.  This means there should be data available to be read.
>>>     
>>
>> Well, there are *two* select loops:  main_loop_wait() and
>> qemu_aio_wait().  Calling sync block i/o functions from an i/o handler
>> causes the two select loops to run nested => boom.
> 
> Yeah, qemu_aio_wait needs to die.  Can you resubmit your patch with a
> better description, and change the read() loop in posix_aio_read() to
> consume as much data as possible before hitting EAGAIN?

I've fixed my problem by changing xen_disk to use a bottom half for
actual work, so the block read/write calls are moved out of the select
loop anyway.  Which turned out to be useful for aio support too.

So I'm fine again with the current state.  I can still create such a
patch, though.

cheers,
  Gerd
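
To make the read() loop change Anthony asks for concrete, here is a
minimal sketch of "consume as much data as possible before hitting
EAGAIN" on a non-blocking descriptor.  This is not the actual
posix_aio_read() code; drain_fd() is a hypothetical helper, and the
tiny buffer is only there to force several iterations.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not QEMU code: read from a non-blocking fd
 * until the kernel reports EAGAIN (nothing left to read) or EOF.
 * Returns the total number of bytes drained, or -1 on a real error. */
static long drain_fd(int fd)
{
    char buf[4];    /* deliberately small so the loop runs more than once */
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n > 0) {
            total += n;         /* got data, keep going */
            continue;
        }
        if (n == 0) {
            return total;       /* EOF: the writer closed its end */
        }
        if (errno == EINTR) {
            continue;           /* interrupted by a signal, retry */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;       /* fully drained for now */
        }
        return -1;              /* genuine error */
    }
}
```

The point of draining in one call is that the handler does not rely on
select() waking it once per pending notification.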


If this isn't xen-specific I'd like to try a build with this patch to
see if it works.
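
As I understand the "bottom half" approach Gerd describes, the i/o
handler only marks work as pending, and the work itself runs later
from the main loop, outside the select callback.  A rough sketch of
that idea follows; the sketch_bh_* names are made up for illustration
and are not QEMU's bottom-half API.

```c
#include <stddef.h>

typedef void sketch_bh_func(void *opaque);

/* Hypothetical deferred-work item (illustration only). */
struct sketch_bh {
    sketch_bh_func *cb;
    void *opaque;
    int scheduled;
    struct sketch_bh *next;
};

static struct sketch_bh *sketch_bh_list;

/* Register a work item; it does nothing until scheduled. */
static void sketch_bh_init(struct sketch_bh *bh, sketch_bh_func *cb,
                           void *opaque)
{
    bh->cb = cb;
    bh->opaque = opaque;
    bh->scheduled = 0;
    bh->next = sketch_bh_list;
    sketch_bh_list = bh;
}

/* Called from an i/o handler: cheap, performs no block i/o itself. */
static void sketch_bh_schedule(struct sketch_bh *bh)
{
    bh->scheduled = 1;
}

/* Called from the main loop after the select callbacks have returned.
 * Runs every scheduled item once and returns how many ran. */
static int sketch_bh_poll(void)
{
    int ran = 0;
    struct sketch_bh *bh;

    for (bh = sketch_bh_list; bh != NULL; bh = bh->next) {
        if (bh->scheduled) {
            bh->scheduled = 0;  /* clear first so cb may reschedule */
            bh->cb(bh->opaque);
            ran++;
        }
    }
    return ran;
}

/* Example callback standing in for the real block read/write work. */
static void count_work(void *opaque)
{
    ++*(int *)opaque;
}
```

With this shape, any synchronous block call made by the callback runs
from the main loop rather than nested inside a select handler.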

Regards,

Daniel

-- 
And that's my crabbing done for the day.  Got it out of the way early, 
now I have the rest of the afternoon to sniff fragrant tea-roses or 
strangle cute bunnies or something.   -- Michael Devore
GnuPG Key Fingerprint 86 F5 81 A5 D4 2E 1F 1C      http://gnupg.org
The C Shore: http://www.wightman.ca/~cshore
