* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] [not found] ` <51FC2903.3030802@cloudapt.com> @ 2013-08-04 13:36 ` Oliver Francke 2013-08-05 7:48 ` Stefan Hajnoczi 0 siblings, 1 reply; 12+ messages in thread From: Oliver Francke @ 2013-08-04 13:36 UTC (permalink / raw) To: Mike Dawson Cc: josh.durgin@inktank.com Durgin, ceph-users, qemu-devel@nongnu.org, Stefan Hajnoczi Hi Mike, you might be the guy StefanHa was referring to on the qemu-devel mailing-list. I just made some more tests, so… Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > Oliver, > > We've had a similar situation occur. For about three months, we've run several Windows 2008 R2 guests with virtio drivers that record video surveillance. We have long suffered an issue where the guest appears to hang indefinitely (or until we intervene). For the sake of this conversation, we call this state "wedged", because it appears something (rbd, qemu, virtio, etc) gets stuck on a deadlock. When a guest gets wedged, we see the following: > > - the guest will not respond to pings If showing up the hung_task - message, I can ping and establish new ssh-sessions, just the session with a while loop does not accept any keyboard-action. > - the qemu-system-x86_64 process drops to 0% cpu > - graphite graphs show the interface traffic dropping to 0bps > - the guest will stay wedged forever (or until we intervene) > - strace of qemu-system-x86_64 shows QEMU is making progress [1][2] > nothing special here: 5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=19, events=POLLIN}, {fd=15, events=POLLIN}, {fd=4, events=POLLIN}], 11, -1) = 1 ([{fd=12, revents=POLLIN}]) [pid 11793] read(5, 0x7fff16b61f00, 16) = -1 EAGAIN (Resource temporarily unavailable) [pid 11793] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\361p\0\252\340\374\373\373!gH\10\0E\0\0Yq\374"..., 69632) = 115 [pid 11793] read(12, 0x7f0c1737fcec, 69632) = -1 EAGAIN (Resource temporarily unavailable) [pid 11793] poll([{fd=27, events=POLLIN|POLLERR|POLLHUP}, {fd=26, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=3, events=POLLIN|POLLERR|POLLHUP}, {fd= and that for many, many threads. Inside the VM I see 75% wait, but I can restart the spew-test in a second session. All that tested with rbd_cache=false,cache=none. I also test every qemu-version with a 2 CPU 2GiB mem Windows 7 VM with some high load, encountering no problem ATM. Running smooth and fast. > We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: > > - No Windows error logs whatsoever while the guest is wedged > - A time sync typically occurs right after the guest gets un-wedged > - Scheduled tasks do not run while wedged > - Windows error logs do not show any evidence of suspend, sleep, etc > > We had so many issue with guests becoming wedged, we wrote a script to 'virsh screenshot' them via cron. Then we installed some updates and had a month or so of higher stability (wedging happened maybe 1/10th as often). Until today we couldn't figure out why. > > Yesterday, I realized qemu was starting the instances without specifying cache=writeback. We corrected that, and let them run overnight. 
With RBD writeback re-enabled, wedging came back as often as we had seen in the past. I've counted ~40 occurrences in the past 12-hour period. So I feel like writeback caching in RBD certainly makes the deadlock more likely to occur. > > Joshd asked us to gather RBD client logs: > > "joshd> it could very well be the writeback cache not doing a callback at some point - if you could gather logs of a vm getting stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be great" > > We'll do that over the weekend. If you could as well, we'd love the help! > > [1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt > [2] http://www.gammacode.com/kvm/not-wedged.txt > As I wrote above, no cache so far, so omitting the verbose debugging at the moment. But will do if requested. Thnx for your report, Oliver. > Thanks, > > Mike Dawson > Co-Founder & Director of Cloud Architecture > Cloudapt LLC > 6330 East 75th Street, Suite 170 > Indianapolis, IN 46250 > > On 8/2/2013 6:22 AM, Oliver Francke wrote: >> Well, >> >> I believe, I'm the winner of buzzwords-bingo for today. >> >> But seriously speaking... as I don't have this particular problem with >> qcow2 with kernel 3.2 nor qemu-1.2.2 nor newer kernels, I hope I'm not >> alone here? >> We have a raising number of tickets from people reinstalling from ISO's >> with 3.2-kernel. >> >> Fast fallback is to start all VM's with qemu-1.2.2, but we then lose >> some features ala latency-free-RBD-cache ;) >> >> I just opened a bug for qemu per: >> >> https://bugs.launchpad.net/qemu/+bug/1207686 >> >> with all dirty details. >> >> Installing a backport-kernel 3.9.x or upgrade Ubuntu-kernel to 3.8.x >> "fixes" it. So we have a bad combination for all distros with 3.2-kernel >> and rbd as storage-backend, I assume. >> >> Any similar findings? >> Any idea of tracing/debugging ( Josh? ;) ) very welcome, >> >> Oliver. >> ^ permalink raw reply [flat|nested] 12+ messages in thread
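The cron-driven 'virsh screenshot' workaround Mike describes above could look roughly like the sketch below on the hypervisor; the script path, screenshot location and one-minute interval are placeholders rather than anything taken from his setup:

    #!/bin/sh
    # unwedge.sh -- take a throwaway screenshot of every running guest; issuing
    # the screendump command is what kicks a stalled request loose.
    for dom in $(virsh list --name); do
        virsh screenshot "$dom" "/tmp/unwedge-$dom.ppm" >/dev/null 2>&1
    done

    # /etc/cron.d/unwedge-guests -- run it once a minute
    # * * * * *  root  /usr/local/sbin/unwedge.sh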
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-04 13:36 ` [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] Oliver Francke @ 2013-08-05 7:48 ` Stefan Hajnoczi 2013-08-05 20:08 ` Mike Dawson 2013-08-08 12:40 ` Oliver Francke 0 siblings, 2 replies; 12+ messages in thread From: Stefan Hajnoczi @ 2013-08-05 7:48 UTC (permalink / raw) To: Oliver Francke Cc: Josh Durgin, ceph-users, Mike Dawson, qemu-devel@nongnu.org On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: > Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > > We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: If virsh screenshot works then this confirms that QEMU itself is still responding. Its main loop cannot be blocked since it was able to process the screendump command. This supports Josh's theory that a callback is not being invoked. The virtio-blk I/O request would be left in a pending state. Now here is where the behavior varies between configurations: On a Windows guest with 1 vCPU, you may see the symptom that the guest no longer responds to ping. On a Linux guest with multiple vCPUs, you may see the hung task message from the guest kernel because other vCPUs are still making progress. Just the vCPU that issued the I/O request and whose task is in UNINTERRUPTIBLE state would really be stuck. Basically, the symptoms depend not just on how QEMU is behaving but also on the guest kernel and how many vCPUs you have configured. I think this can explain how both problems you are observing, Oliver and Mike, are a result of the same bug. At least I hope they are :). Stefan ^ permalink raw reply [flat|nested] 12+ messages in thread
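For the multi-vCPU Linux case Stefan describes, where only the task behind the stuck request sits in uninterruptible sleep, a guest that still accepts new sessions can show this directly; a quick sketch, assuming sysrq is enabled in the guest:

    # list tasks in uninterruptible sleep (state D) and the kernel symbol they block in
    ps axo pid,stat,wchan:30,cmd | awk 'NR==1 || $2 ~ /^D/'
    # ask the kernel for the same blocked-task report the hung-task timer
    # prints, but on demand
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 60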
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-05 7:48 ` Stefan Hajnoczi @ 2013-08-05 20:08 ` Mike Dawson 2013-08-13 21:26 ` Sage Weil 2013-08-08 12:40 ` Oliver Francke 1 sibling, 1 reply; 12+ messages in thread From: Mike Dawson @ 2013-08-05 20:08 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Josh Durgin, ceph-users, Oliver Francke, qemu-devel@nongnu.org Josh, Logs are uploaded to cephdrop with the file name mikedawson-rbd-qemu-deadlock. - At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0 - At about 2013-08-05 19:53:51, ran a 'virsh screenshot' Environment is: - Ceph 0.61.7 (client is co-mingled with three OSDs) - rbd cache = true and cache=writeback - qemu 1.4.0 1.4.0+dfsg-1expubuntu4 - Ubuntu Raring with 3.8.0-25-generic This issue is reproducible in my environment, and I'm willing to run any wip branch you need. What else can I provide to help? Thanks, Mike Dawson On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote: > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>> We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: > > If virsh screenshot works then this confirms that QEMU itself is still > responding. Its main loop cannot be blocked since it was able to > process the screendump command. > > This supports Josh's theory that a callback is not being invoked. The > virtio-blk I/O request would be left in a pending state. > > Now here is where the behavior varies between configurations: > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no > longer responds to ping. > > On a Linux guest with multiple vCPUs, you may see the hung task message > from the guest kernel because other vCPUs are still making progress. > Just the vCPU that issued the I/O request and whose task is in > UNINTERRUPTIBLE state would really be stuck. > > Basically, the symptoms depend not just on how QEMU is behaving but also > on the guest kernel and how many vCPUs you have configured. > > I think this can explain how both problems you are observing, Oliver and > Mike, are a result of the same bug. At least I hope they are :). > > Stefan > ^ permalink raw reply [flat|nested] 12+ messages in thread
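For reference, Mike's combination of librbd caching and QEMU writeback caching corresponds to a drive definition along these lines; pool, image and client id are made-up names, and with cache=writeback recent QEMU also switches the librbd cache on by itself:

    qemu-system-x86_64 ... \
      -drive file=rbd:rbd/vm-disk-1:id=admin:conf=/etc/ceph/ceph.conf,if=virtio,format=raw,cache=writeback
    # 'rbd cache = true' additionally sits in the [client] section of the
    # hypervisor's ceph.conf, matching the environment listed above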
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-05 20:08 ` Mike Dawson @ 2013-08-13 21:26 ` Sage Weil 2013-08-13 22:00 ` James Harper 0 siblings, 1 reply; 12+ messages in thread From: Sage Weil @ 2013-08-13 21:26 UTC (permalink / raw) To: Mike Dawson; +Cc: ceph-users, qemu-devel@nongnu.org, Stefan Hajnoczi On Mon, 5 Aug 2013, Mike Dawson wrote: > Josh, > > Logs are uploaded to cephdrop with the file name mikedawson-rbd-qemu-deadlock. > > - At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0 > - At about 2013-08-05 19:53:51, ran a 'virsh screenshot' > > > Environment is: > > - Ceph 0.61.7 (client is co-mingled with three OSDs) > - rbd cache = true and cache=writeback > - qemu 1.4.0 1.4.0+dfsg-1expubuntu4 > - Ubuntu Raring with 3.8.0-25-generic > > This issue is reproducible in my environment, and I'm willing to run any wip > branch you need. What else can I provide to help? This looks like a different issue than Oliver's. I see one anomaly in the log, where a rbd io completion is triggered a second time for no apparent reason. I opened a separate bug http://tracker.ceph.com/issues/5955 and pushed wip-5955 that will hopefully shine some light on the weird behavior I saw. Can you reproduce with this branch and debug objectcacher = 20 debug ms = 1 debug rbd = 20 debug finisher = 20 Thanks! sage > > Thanks, > Mike Dawson > > > On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote: > > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: > > > Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > > > > We can "un-wedge" the guest by opening a NoVNC session or running a > > > > 'virsh screenshot' command. After that, the guest resumes and runs as > > > > expected. At that point we can examine the guest. Each time we'll see: > > > > If virsh screenshot works then this confirms that QEMU itself is still > > responding. Its main loop cannot be blocked since it was able to > > process the screendump command. > > > > This supports Josh's theory that a callback is not being invoked. The > > virtio-blk I/O request would be left in a pending state. > > > > Now here is where the behavior varies between configurations: > > > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no > > longer responds to ping. > > > > On a Linux guest with multiple vCPUs, you may see the hung task message > > from the guest kernel because other vCPUs are still making progress. > > Just the vCPU that issued the I/O request and whose task is in > > UNINTERRUPTIBLE state would really be stuck. > > > > Basically, the symptoms depend not just on how QEMU is behaving but also > > on the guest kernel and how many vCPUs you have configured. > > > > I think this can explain how both problems you are observing, Oliver and > > Mike, are a result of the same bug. At least I hope they are :). > > > > Stefan > > > _______________________________________________ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ^ permalink raw reply [flat|nested] 12+ messages in thread
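The settings Sage asks for belong on the client (hypervisor) side, not on the OSDs; a minimal ceph.conf fragment, with the log path being an arbitrary choice that the qemu process must be able to write to:

    [client]
        debug objectcacher = 20
        debug ms = 1
        debug rbd = 20
        debug finisher = 20
        log file = /var/log/ceph/qemu-rbd.$pid.log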
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-13 21:26 ` Sage Weil @ 2013-08-13 22:00 ` James Harper 0 siblings, 0 replies; 12+ messages in thread From: James Harper @ 2013-08-13 22:00 UTC (permalink / raw) To: Sage Weil, Mike Dawson Cc: ceph-users@lists.ceph.com, qemu-devel@nongnu.org, Stefan Hajnoczi > > This looks like a different issue than Oliver's. I see one anomaly in the > log, where a rbd io completion is triggered a second time for no apparent > reason. I opened a separate bug > > http://tracker.ceph.com/issues/5955 > > and pushed wip-5955 that will hopefully shine some light on the weird > behavior I saw. Can you reproduce with this branch and > Do you think this could be a bug in rbd? I'm seeing a bug in the tapdisk rbd code and if the completion was called twice it could cause the crash I'm seeing too. Unfortunately I can't get gdb to work with pthreads so I can't get a backtrace. James ^ permalink raw reply [flat|nested] 12+ messages in thread
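If attaching works at all, a batch-mode attach is usually enough to get the per-thread backtraces James is after; nothing tapdisk-specific is assumed beyond the process name, and readable frames need debug symbols installed:

    gdb -batch -p "$(pgrep -o tapdisk)" \
        -ex 'set pagination off' \
        -ex 'thread apply all bt'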
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-05 7:48 ` Stefan Hajnoczi 2013-08-05 20:08 ` Mike Dawson @ 2013-08-08 12:40 ` Oliver Francke 2013-08-08 17:01 ` Josh Durgin 1 sibling, 1 reply; 12+ messages in thread From: Oliver Francke @ 2013-08-08 12:40 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Josh Durgin, ceph-users, Mike Dawson, qemu-devel@nongnu.org Hi Josh, I have a session logged with: debug_ms=1:debug_rbd=20:debug_objectcacher=30 as you requested from Mike, even if I think, we do have another story here, anyway. Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is 3.2.0-51-amd... Do you want me to open a ticket for that stuff? I have about 5MB compressed logfile waiting for you ;) Thnx in advance, Oliver. On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>> We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: > If virsh screenshot works then this confirms that QEMU itself is still > responding. Its main loop cannot be blocked since it was able to > process the screendump command. > > This supports Josh's theory that a callback is not being invoked. The > virtio-blk I/O request would be left in a pending state. > > Now here is where the behavior varies between configurations: > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no > longer responds to ping. > > On a Linux guest with multiple vCPUs, you may see the hung task message > from the guest kernel because other vCPUs are still making progress. > Just the vCPU that issued the I/O request and whose task is in > UNINTERRUPTIBLE state would really be stuck. > > Basically, the symptoms depend not just on how QEMU is behaving but also > on the guest kernel and how many vCPUs you have configured. > > I think this can explain how both problems you are observing, Oliver and > Mike, are a result of the same bug. At least I hope they are :). > > Stefan -- Oliver Francke filoo GmbH Moltkestraße 25a 33330 Gütersloh HRB4355 AG Gütersloh Geschäftsführer: J.Rehpöhler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh ^ permalink raw reply [flat|nested] 12+ messages in thread
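The colon-separated options Oliver quotes are in the form the QEMU rbd driver accepts directly in the drive filename, which keeps the verbose logging confined to a single test guest; pool, image and id below are placeholders:

    -drive file=rbd:rbd/test-disk:id=admin:conf=/etc/ceph/ceph.conf:debug_ms=1:debug_rbd=20:debug_objectcacher=30,if=virtio,format=raw,cache=none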
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-08 12:40 ` Oliver Francke @ 2013-08-08 17:01 ` Josh Durgin 2013-08-09 9:22 ` Oliver Francke 0 siblings, 1 reply; 12+ messages in thread From: Josh Durgin @ 2013-08-08 17:01 UTC (permalink / raw) To: Oliver Francke Cc: ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel@nongnu.org On 08/08/2013 05:40 AM, Oliver Francke wrote: > Hi Josh, > > I have a session logged with: > > debug_ms=1:debug_rbd=20:debug_objectcacher=30 > > as you requested from Mike, even if I think, we do have another story > here, anyway. > > Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is > 3.2.0-51-amd... > > Do you want me to open a ticket for that stuff? I have about 5MB > compressed logfile waiting for you ;) Yes, that'd be great. If you could include the time when you saw the guest hang that'd be ideal. I'm not sure if this is one or two bugs, but it seems likely it's a bug in rbd and not qemu. Thanks! Josh > Thnx in advance, > > Oliver. > > On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: >> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >>> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>>> We can "un-wedge" the guest by opening a NoVNC session or running a >>>> 'virsh screenshot' command. After that, the guest resumes and runs >>>> as expected. At that point we can examine the guest. Each time we'll >>>> see: >> If virsh screenshot works then this confirms that QEMU itself is still >> responding. Its main loop cannot be blocked since it was able to >> process the screendump command. >> >> This supports Josh's theory that a callback is not being invoked. The >> virtio-blk I/O request would be left in a pending state. >> >> Now here is where the behavior varies between configurations: >> >> On a Windows guest with 1 vCPU, you may see the symptom that the guest no >> longer responds to ping. >> >> On a Linux guest with multiple vCPUs, you may see the hung task message >> from the guest kernel because other vCPUs are still making progress. >> Just the vCPU that issued the I/O request and whose task is in >> UNINTERRUPTIBLE state would really be stuck. >> >> Basically, the symptoms depend not just on how QEMU is behaving but also >> on the guest kernel and how many vCPUs you have configured. >> >> I think this can explain how both problems you are observing, Oliver and >> Mike, are a result of the same bug. At least I hope they are :). >> >> Stefan > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-08 17:01 ` Josh Durgin @ 2013-08-09 9:22 ` Oliver Francke 2013-08-09 14:05 ` Andrei Mikhailovsky 2013-08-13 21:34 ` Sage Weil 0 siblings, 2 replies; 12+ messages in thread From: Oliver Francke @ 2013-08-09 9:22 UTC (permalink / raw) To: Josh Durgin Cc: ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel@nongnu.org Hi Josh, just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log. Hope it helps, Oliver. On 08/08/2013 07:01 PM, Josh Durgin wrote: > On 08/08/2013 05:40 AM, Oliver Francke wrote: >> Hi Josh, >> >> I have a session logged with: >> >> debug_ms=1:debug_rbd=20:debug_objectcacher=30 >> >> as you requested from Mike, even if I think, we do have another story >> here, anyway. >> >> Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is >> 3.2.0-51-amd... >> >> Do you want me to open a ticket for that stuff? I have about 5MB >> compressed logfile waiting for you ;) > > Yes, that'd be great. If you could include the time when you saw the > guest hang that'd be ideal. I'm not sure if this is one or two bugs, > but it seems likely it's a bug in rbd and not qemu. > > Thanks! > Josh > >> Thnx in advance, >> >> Oliver. >> >> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: >>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >>>> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>>>> We can "un-wedge" the guest by opening a NoVNC session or running a >>>>> 'virsh screenshot' command. After that, the guest resumes and runs >>>>> as expected. At that point we can examine the guest. Each time we'll >>>>> see: >>> If virsh screenshot works then this confirms that QEMU itself is still >>> responding. Its main loop cannot be blocked since it was able to >>> process the screendump command. >>> >>> This supports Josh's theory that a callback is not being invoked. The >>> virtio-blk I/O request would be left in a pending state. >>> >>> Now here is where the behavior varies between configurations: >>> >>> On a Windows guest with 1 vCPU, you may see the symptom that the >>> guest no >>> longer responds to ping. >>> >>> On a Linux guest with multiple vCPUs, you may see the hung task message >>> from the guest kernel because other vCPUs are still making progress. >>> Just the vCPU that issued the I/O request and whose task is in >>> UNINTERRUPTIBLE state would really be stuck. >>> >>> Basically, the symptoms depend not just on how QEMU is behaving but >>> also >>> on the guest kernel and how many vCPUs you have configured. >>> >>> I think this can explain how both problems you are observing, Oliver >>> and >>> Mike, are a result of the same bug. At least I hope they are :). >>> >>> Stefan >> >> > -- Oliver Francke filoo GmbH Moltkestraße 25a 33330 Gütersloh HRB4355 AG Gütersloh Geschäftsführer: J.Rehpöhler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 9:22 ` Oliver Francke @ 2013-08-09 14:05 ` Andrei Mikhailovsky 2013-08-09 15:03 ` Stefan Hajnoczi 2013-08-13 21:34 ` Sage Weil 1 sibling, 1 reply; 12+ messages in thread From: Andrei Mikhailovsky @ 2013-08-09 14:05 UTC (permalink / raw) To: Oliver Francke Cc: Josh Durgin, ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel [-- Attachment #1: Type: text/plain, Size: 3951 bytes --] I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins. I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2 Andrei ----- Original Message ----- From: "Oliver Francke" <Oliver.Francke@filoo.de> To: "Josh Durgin" <josh.durgin@inktank.com> Cc: ceph-users@lists.ceph.com, "Mike Dawson" <mike.dawson@cloudapt.com>, "Stefan Hajnoczi" <stefanha@redhat.com>, qemu-devel@nongnu.org Sent: Friday, 9 August, 2013 10:22:00 AM Subject: Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686] Hi Josh, just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log. Hope it helps, Oliver. On 08/08/2013 07:01 PM, Josh Durgin wrote: > On 08/08/2013 05:40 AM, Oliver Francke wrote: >> Hi Josh, >> >> I have a session logged with: >> >> debug_ms=1:debug_rbd=20:debug_objectcacher=30 >> >> as you requested from Mike, even if I think, we do have another story >> here, anyway. >> >> Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is >> 3.2.0-51-amd... >> >> Do you want me to open a ticket for that stuff? I have about 5MB >> compressed logfile waiting for you ;) > > Yes, that'd be great. If you could include the time when you saw the > guest hang that'd be ideal. I'm not sure if this is one or two bugs, > but it seems likely it's a bug in rbd and not qemu. > > Thanks! > Josh > >> Thnx in advance, >> >> Oliver. >> >> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: >>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >>>> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>>>> We can "un-wedge" the guest by opening a NoVNC session or running a >>>>> 'virsh screenshot' command. After that, the guest resumes and runs >>>>> as expected. At that point we can examine the guest. Each time we'll >>>>> see: >>> If virsh screenshot works then this confirms that QEMU itself is still >>> responding. Its main loop cannot be blocked since it was able to >>> process the screendump command. >>> >>> This supports Josh's theory that a callback is not being invoked. The >>> virtio-blk I/O request would be left in a pending state. >>> >>> Now here is where the behavior varies between configurations: >>> >>> On a Windows guest with 1 vCPU, you may see the symptom that the >>> guest no >>> longer responds to ping. >>> >>> On a Linux guest with multiple vCPUs, you may see the hung task message >>> from the guest kernel because other vCPUs are still making progress. 
>>> Just the vCPU that issued the I/O request and whose task is in >>> UNINTERRUPTIBLE state would really be stuck. >>> >>> Basically, the symptoms depend not just on how QEMU is behaving but >>> also >>> on the guest kernel and how many vCPUs you have configured. >>> >>> I think this can explain how both problems you are observing, Oliver >>> and >>> Mike, are a result of the same bug. At least I hope they are :). >>> >>> Stefan >> >> > -- Oliver Francke filoo GmbH Moltkestraße 25a 33330 Gütersloh HRB4355 AG Gütersloh Geschäftsführer: J.Rehpöhler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh _______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [-- Attachment #2: Type: text/html, Size: 4988 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
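A job file reproducing the parameters Andrei lists; the ioengine, write pattern, target file and size are assumptions, everything else is taken from his description:

    ; wedge-repro.fio -- 4k direct I/O, 4 jobs, queue depth 16, 60 minutes
    [global]
    ioengine=libaio
    direct=1
    bs=4k
    rw=randwrite
    numjobs=4
    iodepth=16
    runtime=3600
    time_based

    [test]
    filename=/root/fio-testfile
    size=4g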
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 14:05 ` Andrei Mikhailovsky @ 2013-08-09 15:03 ` Stefan Hajnoczi 2013-08-10 7:30 ` Josh Durgin 0 siblings, 1 reply; 12+ messages in thread From: Stefan Hajnoczi @ 2013-08-09 15:03 UTC (permalink / raw) To: Andrei Mikhailovsky Cc: Josh Durgin, ceph-users, Oliver Francke, Mike Dawson, qemu-devel On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote: > I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins. > > I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2 Josh, In addition to the Ceph logs you can also use QEMU tracing with the following events enabled: virtio_blk_handle_write virtio_blk_handle_read virtio_blk_rw_complete See docs/tracing.txt for details on usage. Inspecting the trace output will let you observe the I/O request submission/completion from the virtio-blk device perspective. You'll be able to see whether requests are never being completed in some cases. This bug seems like a corner case or race condition since most requests seem to complete just fine. The problem is that eventually the virtio-blk device becomes unusable when it runs out of descriptors (it has 128). And before that limit is reached the guest may become unusable due to the hung I/O requests. Stefan ^ permalink raw reply [flat|nested] 12+ messages in thread
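A rough recipe for the tracing Stefan suggests, assuming a source build of a QEMU of that era; the configure flag spelling has varied between releases, so docs/tracing.txt in the tree being built is authoritative:

    ./configure --enable-trace-backend=simple ... && make
    # enable only the virtio-blk request events
    cat > /tmp/blk-events <<EOF
    virtio_blk_handle_write
    virtio_blk_handle_read
    virtio_blk_rw_complete
    EOF
    # start the guest with the extra option:
    #   -trace events=/tmp/blk-events,file=/tmp/blk.trace
    # after reproducing the hang, decode the log and look for requests that
    # never get a matching virtio_blk_rw_complete:
    ./scripts/simpletrace.py trace-events /tmp/blk.trace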
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 15:03 ` Stefan Hajnoczi @ 2013-08-10 7:30 ` Josh Durgin 0 siblings, 0 replies; 12+ messages in thread From: Josh Durgin @ 2013-08-10 7:30 UTC (permalink / raw) To: Stefan Hajnoczi Cc: ceph-users, Andrei Mikhailovsky, Oliver Francke, Mike Dawson, qemu-devel On 08/09/2013 08:03 AM, Stefan Hajnoczi wrote: > On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote: >> I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins. >> >> I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2 Oliver's logs show one aio_flush() never getting completed, which means it's an issue with aio_flush in librados when rbd caching isn't used. Mike's log is from a qemu without aio_flush(), and with caching turned on, and shows all flushes completing quickly, so it's a separate bug. > Josh, > In addition to the Ceph logs you can also use QEMU tracing with the > following events enabled: > virtio_blk_handle_write > virtio_blk_handle_read > virtio_blk_rw_complete > > See docs/tracing.txt for details on usage. > > Inspecting the trace output will let you observe the I/O request > submission/completion from the virtio-blk device perspective. You'll be > able to see whether requests are never being completed in some cases. Thanks for the info. That may be the best way to check what's happening when caching is enabled. Mike, could you recompile qemu with tracing enabled and get a trace of the hang you were seeing, in addition to the ceph logs? > This bug seems like a corner case or race condition since most requests > seem to complete just fine. The problem is that eventually the > virtio-blk device becomes unusable when it runs out of descriptors (it > has 128). And before that limit is reached the guest may become > unusable due to the hung I/O requests. It seems only one request hung from an important kernel thread in Oliver's case, but it's good to be aware of the descriptor limit. Josh ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 9:22 ` Oliver Francke 2013-08-09 14:05 ` Andrei Mikhailovsky @ 2013-08-13 21:34 ` Sage Weil 1 sibling, 0 replies; 12+ messages in thread From: Sage Weil @ 2013-08-13 21:34 UTC (permalink / raw) To: Oliver Francke Cc: Josh Durgin, ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel@nongnu.org Hi Oliver, (Posted this on the bug too, but:) Your last log revealed a bug in the librados aio flush. A fix is pushed to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest please (with caching off again)? Thanks! sage On Fri, 9 Aug 2013, Oliver Francke wrote: > Hi Josh, > > just opened > > http://tracker.ceph.com/issues/5919 > > with all collected information incl. debug-log. > > Hope it helps, > > Oliver. > > On 08/08/2013 07:01 PM, Josh Durgin wrote: > > On 08/08/2013 05:40 AM, Oliver Francke wrote: > > > Hi Josh, > > > > > > I have a session logged with: > > > > > > debug_ms=1:debug_rbd=20:debug_objectcacher=30 > > > > > > as you requested from Mike, even if I think, we do have another story > > > here, anyway. > > > > > > Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is > > > 3.2.0-51-amd... > > > > > > Do you want me to open a ticket for that stuff? I have about 5MB > > > compressed logfile waiting for you ;) > > > > Yes, that'd be great. If you could include the time when you saw the guest > > hang that'd be ideal. I'm not sure if this is one or two bugs, > > but it seems likely it's a bug in rbd and not qemu. > > > > Thanks! > > Josh > > > > > Thnx in advance, > > > > > > Oliver. > > > > > > On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: > > > > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: > > > > > Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > > > > > > We can "un-wedge" the guest by opening a NoVNC session or running a > > > > > > 'virsh screenshot' command. After that, the guest resumes and runs > > > > > > as expected. At that point we can examine the guest. Each time we'll > > > > > > see: > > > > If virsh screenshot works then this confirms that QEMU itself is still > > > > responding. Its main loop cannot be blocked since it was able to > > > > process the screendump command. > > > > > > > > This supports Josh's theory that a callback is not being invoked. The > > > > virtio-blk I/O request would be left in a pending state. > > > > > > > > Now here is where the behavior varies between configurations: > > > > > > > > On a Windows guest with 1 vCPU, you may see the symptom that the guest > > > > no > > > > longer responds to ping. > > > > > > > > On a Linux guest with multiple vCPUs, you may see the hung task message > > > > from the guest kernel because other vCPUs are still making progress. > > > > Just the vCPU that issued the I/O request and whose task is in > > > > UNINTERRUPTIBLE state would really be stuck. > > > > > > > > Basically, the symptoms depend not just on how QEMU is behaving but also > > > > on the guest kernel and how many vCPUs you have configured. > > > > > > > > I think this can explain how both problems you are observing, Oliver and > > > > Mike, are a result of the same bug. At least I hope they are :). 
> > > > > > > > Stefan > > > > > > > > > > > -- > > Oliver Francke > > filoo GmbH > Moltkestra?e 25a > 33330 G?tersloh > HRB4355 AG G?tersloh > > Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz > > Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh > > _______________________________________________ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ^ permalink raw reply [flat|nested] 12+ messages in thread
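For anyone retesting as Sage asks, only the client-side librados/librbd needs to come from the wip branch; a sketch against the autotools build Ceph used at the time, with the library path and drive string purely illustrative:

    git clone --recursive https://github.com/ceph/ceph.git && cd ceph
    git checkout wip-librados-aio-flush    # bobtail-based fix; use wip-5919 on master
    ./autogen.sh && ./configure && make
    # run a test guest against the freshly built libraries without installing them
    LD_LIBRARY_PATH=$PWD/src/.libs qemu-system-x86_64 ... \
      -drive file=rbd:rbd/test-disk:id=admin,if=virtio,cache=none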