From: Andrei Mikhailovsky <andrei@arhont.com>
To: Oliver Francke <Oliver.Francke@filoo.de>
Cc: Josh Durgin <josh.durgin@inktank.com>,
	ceph-users@lists.ceph.com, Mike Dawson <mike.dawson@cloudapt.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686]
Date: Fri, 9 Aug 2013 15:05:22 +0100 (BST)
Message-ID: <13653691.7559.1376057121351.JavaMail.andrei@finka>
In-Reply-To: <5204B4B8.3080302@filoo.de>


I can confirm that I am having similar issues with Ubuntu VM guests when running fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally I see hung tasks, occasionally the guest VM stops responding without leaving anything in the logs, and sometimes I see a kernel panic on the console. I typically set the fio runtime to 60 minutes, and the guest tends to stop responding after about 10-30 minutes.

I am on Ubuntu 12.04 with the backported 3.5 kernel, using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2.
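
For anyone trying to reproduce this, here is a rough sketch of how the fio invocation could be assembled. Only bs, direct, numjobs, iodepth and the 60-minute runtime come from the test described above; the job name, the target device /dev/vdb, rw=randwrite and ioengine=libaio are assumptions added for illustration and may not match the actual setup.

```python
import shlex


def build_fio_command(target, runtime_minutes=60, rw="randwrite", ioengine="libaio"):
    """Return an fio argv list for the workload described in this message.

    target, rw and ioengine are assumptions; they are not stated in the thread.
    """
    return [
        "fio",
        "--name=rbd-hang-repro",              # hypothetical job name
        f"--filename={target}",               # assumed guest block device
        f"--rw={rw}",                         # assumed access pattern
        f"--ioengine={ioengine}",             # assumed async engine so iodepth applies
        "--bs=4k",                            # from the message
        "--direct=1",                         # from the message
        "--numjobs=4",                        # from the message
        "--iodepth=16",                       # from the message
        "--time_based",                       # keep running for the full runtime
        f"--runtime={runtime_minutes * 60}",  # fio takes seconds by default
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line rather than running fio directly.
    print(shlex.join(build_fio_command("/dev/vdb")))
```

Printing the command instead of executing it keeps the sketch safe to run on a machine where /dev/vdb holds real data.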

Andrei 
----- Original Message -----

From: "Oliver Francke" <Oliver.Francke@filoo.de> 
To: "Josh Durgin" <josh.durgin@inktank.com> 
Cc: ceph-users@lists.ceph.com, "Mike Dawson" <mike.dawson@cloudapt.com>, "Stefan Hajnoczi" <stefanha@redhat.com>, qemu-devel@nongnu.org 
Sent: Friday, 9 August, 2013 10:22:00 AM 
Subject: Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686] 

Hi Josh, 

just opened 

http://tracker.ceph.com/issues/5919 

with all collected information incl. debug-log. 

Hope it helps, 

Oliver. 

On 08/08/2013 07:01 PM, Josh Durgin wrote: 
> On 08/08/2013 05:40 AM, Oliver Francke wrote: 
>> Hi Josh, 
>> 
>> I have a session logged with: 
>> 
>> debug_ms=1:debug_rbd=20:debug_objectcacher=30 
>> 
>> as you requested from Mike, even if I think, we do have another story 
>> here, anyway. 
>> 
>> Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is 
>> 3.2.0-51-amd... 
>> 
>> Do you want me to open a ticket for that stuff? I have about 5MB 
>> compressed logfile waiting for you ;) 
> 
> Yes, that'd be great. If you could include the time when you saw the 
> guest hang that'd be ideal. I'm not sure if this is one or two bugs, 
> but it seems likely it's a bug in rbd and not qemu. 
> 
> Thanks! 
> Josh 
> 
>> Thnx in advance, 
>> 
>> Oliver. 
>> 
>> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: 
>>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: 
>>>> On 02.08.2013 at 23:47, Mike Dawson <mike.dawson@cloudapt.com> wrote: 
>>>>> We can "un-wedge" the guest by opening a NoVNC session or running a 
>>>>> 'virsh screenshot' command. After that, the guest resumes and runs 
>>>>> as expected. At that point we can examine the guest. Each time we'll 
>>>>> see: 
>>> If virsh screenshot works then this confirms that QEMU itself is still 
>>> responding. Its main loop cannot be blocked since it was able to 
>>> process the screendump command. 
>>> 
>>> This supports Josh's theory that a callback is not being invoked. The 
>>> virtio-blk I/O request would be left in a pending state. 
>>> 
>>> Now here is where the behavior varies between configurations: 
>>> 
>>> On a Windows guest with 1 vCPU, you may see the symptom that the guest 
>>> no longer responds to ping. 
>>> 
>>> On a Linux guest with multiple vCPUs, you may see the hung task message 
>>> from the guest kernel because other vCPUs are still making progress. 
>>> Just the vCPU that issued the I/O request and whose task is in 
>>> UNINTERRUPTIBLE state would really be stuck. 
>>> 
>>> Basically, the symptoms depend not just on how QEMU is behaving but also 
>>> on the guest kernel and how many vCPUs you have configured. 
>>> 
>>> I think this can explain how both problems you are observing, Oliver and 
>>> Mike, are a result of the same bug. At least I hope they are :). 
>>> 
>>> Stefan 
>> 
>> 
> 
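
Stefan's reasoning quoted above can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU or librbd code: an asyncio event loop stands in for the QEMU main loop, a future that nobody ever completes stands in for the I/O request whose completion callback is never invoked, and the names guest_io and monitor_command are hypothetical.

```python
import asyncio


async def guest_io(completion, timeout):
    """Wait for an I/O completion; report 'hung' if it never arrives."""
    try:
        await asyncio.wait_for(completion, timeout)
        return "completed"
    except asyncio.TimeoutError:
        # Loosely analogous to the guest's hung-task detector firing.
        return "hung"


async def monitor_command():
    """Stand-in for a monitor command such as a screendump."""
    await asyncio.sleep(0)  # yield to the loop once, then answer
    return "screendump ok"


async def main():
    loop = asyncio.get_running_loop()
    lost_completion = loop.create_future()  # nobody ever sets a result on this
    io_task = asyncio.create_task(guest_io(lost_completion, 0.1))
    # The loop is not blocked, so the monitor command is still serviced.
    monitor_result = await monitor_command()
    io_result = await io_task
    return monitor_result, io_result


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The loop keeps answering the monitor command while the single request stays pending, which is the combination of symptoms described in the quoted text.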


-- 

Oliver Francke 

filoo GmbH 
Moltkestraße 25a 
33330 Gütersloh 
HRB4355 AG Gütersloh 

Managing directors: J.Rehpöhler | C.Kunz 

Follow us on Twitter: http://twitter.com/filoogmbh 


