* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] [not found] ` <51FC2903.3030802@cloudapt.com> @ 2013-08-04 13:36 ` Oliver Francke 2013-08-05 7:48 ` Stefan Hajnoczi 0 siblings, 1 reply; 12+ messages in thread From: Oliver Francke @ 2013-08-04 13:36 UTC (permalink / raw) To: Mike Dawson Cc: josh.durgin@inktank.com Durgin, ceph-users, qemu-devel@nongnu.org, Stefan Hajnoczi Hi Mike, you might be the guy StefanHa was referring to on the qemu-devel mailing-list. I just made some more tests, so… Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > Oliver, > > We've had a similar situation occur. For about three months, we've run several Windows 2008 R2 guests with virtio drivers that record video surveillance. We have long suffered an issue where the guest appears to hang indefinitely (or until we intervene). For the sake of this conversation, we call this state "wedged", because it appears something (rbd, qemu, virtio, etc) gets stuck on a deadlock. When a guest gets wedged, we see the following: > > - the guest will not respond to pings If showing up the hung_task - message, I can ping and establish new ssh-sessions, just the session with a while loop does not accept any keyboard-action. > - the qemu-system-x86_64 process drops to 0% cpu > - graphite graphs show the interface traffic dropping to 0bps > - the guest will stay wedged forever (or until we intervene) > - strace of qemu-system-x86_64 shows QEMU is making progress [1][2] > nothing special here: 5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=19, events=POLLIN}, {fd=15, events=POLLIN}, {fd=4, events=POLLIN}], 11, -1) = 1 ([{fd=12, revents=POLLIN}]) [pid 11793] read(5, 0x7fff16b61f00, 16) = -1 EAGAIN (Resource temporarily unavailable) [pid 11793] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\361p\0\252\340\374\373\373!gH\10\0E\0\0Yq\374"..., 69632) = 115 [pid 11793] read(12, 0x7f0c1737fcec, 69632) = -1 EAGAIN (Resource temporarily unavailable) [pid 11793] poll([{fd=27, events=POLLIN|POLLERR|POLLHUP}, {fd=26, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=3, events=POLLIN|POLLERR|POLLHUP}, {fd= and that for many, many threads. Inside the VM I see 75% wait, but I can restart the spew-test in a second session. All that tested with rbd_cache=false,cache=none. I also test every qemu-version with a 2 CPU 2GiB mem Windows 7 VM with some high load, encountering no problem ATM. Running smooth and fast. > We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: > > - No Windows error logs whatsoever while the guest is wedged > - A time sync typically occurs right after the guest gets un-wedged > - Scheduled tasks do not run while wedged > - Windows error logs do not show any evidence of suspend, sleep, etc > > We had so many issue with guests becoming wedged, we wrote a script to 'virsh screenshot' them via cron. Then we installed some updates and had a month or so of higher stability (wedging happened maybe 1/10th as often). Until today we couldn't figure out why. > > Yesterday, I realized qemu was starting the instances without specifying cache=writeback. We corrected that, and let them run overnight. 
With RBD writeback re-enabled, wedging came back as often as we had seen in the past. I've counted ~40 occurrences in the past 12-hour period. So I feel like writeback caching in RBD certainly makes the deadlock more likely to occur. > > Joshd asked us to gather RBD client logs: > > "joshd> it could very well be the writeback cache not doing a callback at some point - if you could gather logs of a vm getting stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be great" > > We'll do that over the weekend. If you could as well, we'd love the help! > > [1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt > [2] http://www.gammacode.com/kvm/not-wedged.txt > As I wrote above, no cache so far, so omitting the verbose debugging at the moment. But will do if requested. Thnx for your report, Oliver. > Thanks, > > Mike Dawson > Co-Founder & Director of Cloud Architecture > Cloudapt LLC > 6330 East 75th Street, Suite 170 > Indianapolis, IN 46250 > > On 8/2/2013 6:22 AM, Oliver Francke wrote: >> Well, >> >> I believe, I'm the winner of buzzwords-bingo for today. >> >> But seriously speaking... as I don't have this particular problem with >> qcow2 with kernel 3.2 nor qemu-1.2.2 nor newer kernels, I hope I'm not >> alone here? >> We have a raising number of tickets from people reinstalling from ISO's >> with 3.2-kernel. >> >> Fast fallback is to start all VM's with qemu-1.2.2, but we then lose >> some features ala latency-free-RBD-cache ;) >> >> I just opened a bug for qemu per: >> >> https://bugs.launchpad.net/qemu/+bug/1207686 >> >> with all dirty details. >> >> Installing a backport-kernel 3.9.x or upgrade Ubuntu-kernel to 3.8.x >> "fixes" it. So we have a bad combination for all distros with 3.2-kernel >> and rbd as storage-backend, I assume. >> >> Any similar findings? >> Any idea of tracing/debugging ( Josh? ;) ) very welcome, >> >> Oliver. >> ^ permalink raw reply [flat|nested] 12+ messages in thread
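The cron-driven 'virsh screenshot' workaround Mike describes above could look roughly like the sketch below on the hypervisor; the script path, screenshot location and one-minute interval are placeholders rather than anything taken from his setup:

    #!/bin/sh
    # unwedge.sh -- take a throwaway screenshot of every running guest; issuing
    # the screendump command is what kicks a stalled request loose.
    for dom in $(virsh list --name); do
        virsh screenshot "$dom" "/tmp/unwedge-$dom.ppm" >/dev/null 2>&1
    done

    # /etc/cron.d/unwedge-guests -- run it once a minute
    # * * * * *  root  /usr/local/sbin/unwedge.sh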
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-04 13:36 ` [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] Oliver Francke @ 2013-08-05 7:48 ` Stefan Hajnoczi 2013-08-05 20:08 ` Mike Dawson 2013-08-08 12:40 ` Oliver Francke 0 siblings, 2 replies; 12+ messages in thread From: Stefan Hajnoczi @ 2013-08-05 7:48 UTC (permalink / raw) To: Oliver Francke Cc: Josh Durgin, ceph-users, Mike Dawson, qemu-devel@nongnu.org On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: > Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > > We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: If virsh screenshot works then this confirms that QEMU itself is still responding. Its main loop cannot be blocked since it was able to process the screendump command. This supports Josh's theory that a callback is not being invoked. The virtio-blk I/O request would be left in a pending state. Now here is where the behavior varies between configurations: On a Windows guest with 1 vCPU, you may see the symptom that the guest no longer responds to ping. On a Linux guest with multiple vCPUs, you may see the hung task message from the guest kernel because other vCPUs are still making progress. Just the vCPU that issued the I/O request and whose task is in UNINTERRUPTIBLE state would really be stuck. Basically, the symptoms depend not just on how QEMU is behaving but also on the guest kernel and how many vCPUs you have configured. I think this can explain how both problems you are observing, Oliver and Mike, are a result of the same bug. At least I hope they are :). Stefan ^ permalink raw reply [flat|nested] 12+ messages in thread
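For the multi-vCPU Linux case Stefan describes, where only the task behind the stuck request sits in uninterruptible sleep, a guest that still accepts new sessions can show this directly; a quick sketch, assuming sysrq is enabled in the guest:

    # list tasks in uninterruptible sleep (state D) and the kernel symbol they block in
    ps axo pid,stat,wchan:30,cmd | awk 'NR==1 || $2 ~ /^D/'
    # ask the kernel for the same blocked-task report the hung-task timer
    # prints, but on demand
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 60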
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-05 7:48 ` Stefan Hajnoczi @ 2013-08-05 20:08 ` Mike Dawson 2013-08-13 21:26 ` Sage Weil 2013-08-08 12:40 ` Oliver Francke 1 sibling, 1 reply; 12+ messages in thread From: Mike Dawson @ 2013-08-05 20:08 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Josh Durgin, ceph-users, Oliver Francke, qemu-devel@nongnu.org Josh, Logs are uploaded to cephdrop with the file name mikedawson-rbd-qemu-deadlock. - At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0 - At about 2013-08-05 19:53:51, ran a 'virsh screenshot' Environment is: - Ceph 0.61.7 (client is co-mingled with three OSDs) - rbd cache = true and cache=writeback - qemu 1.4.0 1.4.0+dfsg-1expubuntu4 - Ubuntu Raring with 3.8.0-25-generic This issue is reproducible in my environment, and I'm willing to run any wip branch you need. What else can I provide to help? Thanks, Mike Dawson On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote: > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>> We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: > > If virsh screenshot works then this confirms that QEMU itself is still > responding. Its main loop cannot be blocked since it was able to > process the screendump command. > > This supports Josh's theory that a callback is not being invoked. The > virtio-blk I/O request would be left in a pending state. > > Now here is where the behavior varies between configurations: > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no > longer responds to ping. > > On a Linux guest with multiple vCPUs, you may see the hung task message > from the guest kernel because other vCPUs are still making progress. > Just the vCPU that issued the I/O request and whose task is in > UNINTERRUPTIBLE state would really be stuck. > > Basically, the symptoms depend not just on how QEMU is behaving but also > on the guest kernel and how many vCPUs you have configured. > > I think this can explain how both problems you are observing, Oliver and > Mike, are a result of the same bug. At least I hope they are :). > > Stefan > ^ permalink raw reply [flat|nested] 12+ messages in thread
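For reference, Mike's combination of librbd caching and QEMU writeback caching corresponds to a drive definition along these lines; pool, image and client id are made-up names, and with cache=writeback recent QEMU also switches the librbd cache on by itself:

    qemu-system-x86_64 ... \
      -drive file=rbd:rbd/vm-disk-1:id=admin:conf=/etc/ceph/ceph.conf,if=virtio,format=raw,cache=writeback
    # 'rbd cache = true' additionally sits in the [client] section of the
    # hypervisor's ceph.conf, matching the environment listed above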
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-05 20:08 ` Mike Dawson @ 2013-08-13 21:26 ` Sage Weil 2013-08-13 22:00 ` James Harper 0 siblings, 1 reply; 12+ messages in thread From: Sage Weil @ 2013-08-13 21:26 UTC (permalink / raw) To: Mike Dawson; +Cc: ceph-users, qemu-devel@nongnu.org, Stefan Hajnoczi On Mon, 5 Aug 2013, Mike Dawson wrote: > Josh, > > Logs are uploaded to cephdrop with the file name mikedawson-rbd-qemu-deadlock. > > - At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0 > - At about 2013-08-05 19:53:51, ran a 'virsh screenshot' > > > Environment is: > > - Ceph 0.61.7 (client is co-mingled with three OSDs) > - rbd cache = true and cache=writeback > - qemu 1.4.0 1.4.0+dfsg-1expubuntu4 > - Ubuntu Raring with 3.8.0-25-generic > > This issue is reproducible in my environment, and I'm willing to run any wip > branch you need. What else can I provide to help? This looks like a different issue than Oliver's. I see one anomaly in the log, where a rbd io completion is triggered a second time for no apparent reason. I opened a separate bug http://tracker.ceph.com/issues/5955 and pushed wip-5955 that will hopefully shine some light on the weird behavior I saw. Can you reproduce with this branch and debug objectcacher = 20 debug ms = 1 debug rbd = 20 debug finisher = 20 Thanks! sage > > Thanks, > Mike Dawson > > > On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote: > > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: > > > Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > > > > We can "un-wedge" the guest by opening a NoVNC session or running a > > > > 'virsh screenshot' command. After that, the guest resumes and runs as > > > > expected. At that point we can examine the guest. Each time we'll see: > > > > If virsh screenshot works then this confirms that QEMU itself is still > > responding. Its main loop cannot be blocked since it was able to > > process the screendump command. > > > > This supports Josh's theory that a callback is not being invoked. The > > virtio-blk I/O request would be left in a pending state. > > > > Now here is where the behavior varies between configurations: > > > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no > > longer responds to ping. > > > > On a Linux guest with multiple vCPUs, you may see the hung task message > > from the guest kernel because other vCPUs are still making progress. > > Just the vCPU that issued the I/O request and whose task is in > > UNINTERRUPTIBLE state would really be stuck. > > > > Basically, the symptoms depend not just on how QEMU is behaving but also > > on the guest kernel and how many vCPUs you have configured. > > > > I think this can explain how both problems you are observing, Oliver and > > Mike, are a result of the same bug. At least I hope they are :). > > > > Stefan > > > _______________________________________________ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ^ permalink raw reply [flat|nested] 12+ messages in thread
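The settings Sage asks for belong on the client (hypervisor) side, not on the OSDs; a minimal ceph.conf fragment, with the log path being an arbitrary choice that the qemu process must be able to write to:

    [client]
        debug objectcacher = 20
        debug ms = 1
        debug rbd = 20
        debug finisher = 20
        log file = /var/log/ceph/qemu-rbd.$pid.log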
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-13 21:26 ` Sage Weil @ 2013-08-13 22:00 ` James Harper 0 siblings, 0 replies; 12+ messages in thread From: James Harper @ 2013-08-13 22:00 UTC (permalink / raw) To: Sage Weil, Mike Dawson Cc: ceph-users@lists.ceph.com, qemu-devel@nongnu.org, Stefan Hajnoczi > > This looks like a different issue than Oliver's. I see one anomaly in the > log, where a rbd io completion is triggered a second time for no apparent > reason. I opened a separate bug > > http://tracker.ceph.com/issues/5955 > > and pushed wip-5955 that will hopefully shine some light on the weird > behavior I saw. Can you reproduce with this branch and > Do you think this could be a bug in rbd? I'm seeing a bug in the tapdisk rbd code and if the completion was called twice it could cause the crash I'm seeing too. Unfortunately I can't get gdb to work with pthreads so I can't get a backtrace. James ^ permalink raw reply [flat|nested] 12+ messages in thread
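If attaching works at all, a batch-mode attach is usually enough to get the per-thread backtraces James is after; nothing tapdisk-specific is assumed beyond the process name, and readable frames need debug symbols installed:

    gdb -batch -p "$(pgrep -o tapdisk)" \
        -ex 'set pagination off' \
        -ex 'thread apply all bt'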
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-05 7:48 ` Stefan Hajnoczi 2013-08-05 20:08 ` Mike Dawson @ 2013-08-08 12:40 ` Oliver Francke 2013-08-08 17:01 ` Josh Durgin 1 sibling, 1 reply; 12+ messages in thread From: Oliver Francke @ 2013-08-08 12:40 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Josh Durgin, ceph-users, Mike Dawson, qemu-devel@nongnu.org Hi Josh, I have a session logged with: debug_ms=1:debug_rbd=20:debug_objectcacher=30 as you requested from Mike, even if I think, we do have another story here, anyway. Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is 3.2.0-51-amd... Do you want me to open a ticket for that stuff? I have about 5MB compressed logfile waiting for you ;) Thnx in advance, Oliver. On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>> We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see: > If virsh screenshot works then this confirms that QEMU itself is still > responding. Its main loop cannot be blocked since it was able to > process the screendump command. > > This supports Josh's theory that a callback is not being invoked. The > virtio-blk I/O request would be left in a pending state. > > Now here is where the behavior varies between configurations: > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no > longer responds to ping. > > On a Linux guest with multiple vCPUs, you may see the hung task message > from the guest kernel because other vCPUs are still making progress. > Just the vCPU that issued the I/O request and whose task is in > UNINTERRUPTIBLE state would really be stuck. > > Basically, the symptoms depend not just on how QEMU is behaving but also > on the guest kernel and how many vCPUs you have configured. > > I think this can explain how both problems you are observing, Oliver and > Mike, are a result of the same bug. At least I hope they are :). > > Stefan -- Oliver Francke filoo GmbH Moltkestraße 25a 33330 Gütersloh HRB4355 AG Gütersloh Geschäftsführer: J.Rehpöhler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh ^ permalink raw reply [flat|nested] 12+ messages in thread
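The colon-separated options Oliver quotes are in the form the QEMU rbd driver accepts directly in the drive filename, which keeps the verbose logging confined to a single test guest; pool, image and id below are placeholders:

    -drive file=rbd:rbd/test-disk:id=admin:conf=/etc/ceph/ceph.conf:debug_ms=1:debug_rbd=20:debug_objectcacher=30,if=virtio,format=raw,cache=none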
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-08 12:40 ` Oliver Francke @ 2013-08-08 17:01 ` Josh Durgin 2013-08-09 9:22 ` Oliver Francke 0 siblings, 1 reply; 12+ messages in thread From: Josh Durgin @ 2013-08-08 17:01 UTC (permalink / raw) To: Oliver Francke Cc: ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel@nongnu.org On 08/08/2013 05:40 AM, Oliver Francke wrote: > Hi Josh, > > I have a session logged with: > > debug_ms=1:debug_rbd=20:debug_objectcacher=30 > > as you requested from Mike, even if I think, we do have another story > here, anyway. > > Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is > 3.2.0-51-amd... > > Do you want me to open a ticket for that stuff? I have about 5MB > compressed logfile waiting for you ;) Yes, that'd be great. If you could include the time when you saw the guest hang that'd be ideal. I'm not sure if this is one or two bugs, but it seems likely it's a bug in rbd and not qemu. Thanks! Josh > Thnx in advance, > > Oliver. > > On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: >> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >>> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>>> We can "un-wedge" the guest by opening a NoVNC session or running a >>>> 'virsh screenshot' command. After that, the guest resumes and runs >>>> as expected. At that point we can examine the guest. Each time we'll >>>> see: >> If virsh screenshot works then this confirms that QEMU itself is still >> responding. Its main loop cannot be blocked since it was able to >> process the screendump command. >> >> This supports Josh's theory that a callback is not being invoked. The >> virtio-blk I/O request would be left in a pending state. >> >> Now here is where the behavior varies between configurations: >> >> On a Windows guest with 1 vCPU, you may see the symptom that the guest no >> longer responds to ping. >> >> On a Linux guest with multiple vCPUs, you may see the hung task message >> from the guest kernel because other vCPUs are still making progress. >> Just the vCPU that issued the I/O request and whose task is in >> UNINTERRUPTIBLE state would really be stuck. >> >> Basically, the symptoms depend not just on how QEMU is behaving but also >> on the guest kernel and how many vCPUs you have configured. >> >> I think this can explain how both problems you are observing, Oliver and >> Mike, are a result of the same bug. At least I hope they are :). >> >> Stefan > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-08 17:01 ` Josh Durgin @ 2013-08-09 9:22 ` Oliver Francke 2013-08-09 14:05 ` Andrei Mikhailovsky 2013-08-13 21:34 ` Sage Weil 0 siblings, 2 replies; 12+ messages in thread From: Oliver Francke @ 2013-08-09 9:22 UTC (permalink / raw) To: Josh Durgin Cc: ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel@nongnu.org Hi Josh, just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log. Hope it helps, Oliver. On 08/08/2013 07:01 PM, Josh Durgin wrote: > On 08/08/2013 05:40 AM, Oliver Francke wrote: >> Hi Josh, >> >> I have a session logged with: >> >> debug_ms=1:debug_rbd=20:debug_objectcacher=30 >> >> as you requested from Mike, even if I think, we do have another story >> here, anyway. >> >> Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is >> 3.2.0-51-amd... >> >> Do you want me to open a ticket for that stuff? I have about 5MB >> compressed logfile waiting for you ;) > > Yes, that'd be great. If you could include the time when you saw the > guest hang that'd be ideal. I'm not sure if this is one or two bugs, > but it seems likely it's a bug in rbd and not qemu. > > Thanks! > Josh > >> Thnx in advance, >> >> Oliver. >> >> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: >>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >>>> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>>>> We can "un-wedge" the guest by opening a NoVNC session or running a >>>>> 'virsh screenshot' command. After that, the guest resumes and runs >>>>> as expected. At that point we can examine the guest. Each time we'll >>>>> see: >>> If virsh screenshot works then this confirms that QEMU itself is still >>> responding. Its main loop cannot be blocked since it was able to >>> process the screendump command. >>> >>> This supports Josh's theory that a callback is not being invoked. The >>> virtio-blk I/O request would be left in a pending state. >>> >>> Now here is where the behavior varies between configurations: >>> >>> On a Windows guest with 1 vCPU, you may see the symptom that the >>> guest no >>> longer responds to ping. >>> >>> On a Linux guest with multiple vCPUs, you may see the hung task message >>> from the guest kernel because other vCPUs are still making progress. >>> Just the vCPU that issued the I/O request and whose task is in >>> UNINTERRUPTIBLE state would really be stuck. >>> >>> Basically, the symptoms depend not just on how QEMU is behaving but >>> also >>> on the guest kernel and how many vCPUs you have configured. >>> >>> I think this can explain how both problems you are observing, Oliver >>> and >>> Mike, are a result of the same bug. At least I hope they are :). >>> >>> Stefan >> >> > -- Oliver Francke filoo GmbH Moltkestraße 25a 33330 Gütersloh HRB4355 AG Gütersloh Geschäftsführer: J.Rehpöhler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 9:22 ` Oliver Francke @ 2013-08-09 14:05 ` Andrei Mikhailovsky 2013-08-09 15:03 ` Stefan Hajnoczi 2013-08-13 21:34 ` Sage Weil 1 sibling, 1 reply; 12+ messages in thread From: Andrei Mikhailovsky @ 2013-08-09 14:05 UTC (permalink / raw) To: Oliver Francke Cc: Josh Durgin, ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel [-- Attachment #1: Type: text/plain, Size: 3951 bytes --] I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins. I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2 Andrei ----- Original Message ----- From: "Oliver Francke" <Oliver.Francke@filoo.de> To: "Josh Durgin" <josh.durgin@inktank.com> Cc: ceph-users@lists.ceph.com, "Mike Dawson" <mike.dawson@cloudapt.com>, "Stefan Hajnoczi" <stefanha@redhat.com>, qemu-devel@nongnu.org Sent: Friday, 9 August, 2013 10:22:00 AM Subject: Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686] Hi Josh, just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log. Hope it helps, Oliver. On 08/08/2013 07:01 PM, Josh Durgin wrote: > On 08/08/2013 05:40 AM, Oliver Francke wrote: >> Hi Josh, >> >> I have a session logged with: >> >> debug_ms=1:debug_rbd=20:debug_objectcacher=30 >> >> as you requested from Mike, even if I think, we do have another story >> here, anyway. >> >> Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is >> 3.2.0-51-amd... >> >> Do you want me to open a ticket for that stuff? I have about 5MB >> compressed logfile waiting for you ;) > > Yes, that'd be great. If you could include the time when you saw the > guest hang that'd be ideal. I'm not sure if this is one or two bugs, > but it seems likely it's a bug in rbd and not qemu. > > Thanks! > Josh > >> Thnx in advance, >> >> Oliver. >> >> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: >>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: >>>> Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: >>>>> We can "un-wedge" the guest by opening a NoVNC session or running a >>>>> 'virsh screenshot' command. After that, the guest resumes and runs >>>>> as expected. At that point we can examine the guest. Each time we'll >>>>> see: >>> If virsh screenshot works then this confirms that QEMU itself is still >>> responding. Its main loop cannot be blocked since it was able to >>> process the screendump command. >>> >>> This supports Josh's theory that a callback is not being invoked. The >>> virtio-blk I/O request would be left in a pending state. >>> >>> Now here is where the behavior varies between configurations: >>> >>> On a Windows guest with 1 vCPU, you may see the symptom that the >>> guest no >>> longer responds to ping. >>> >>> On a Linux guest with multiple vCPUs, you may see the hung task message >>> from the guest kernel because other vCPUs are still making progress. 
>>> Just the vCPU that issued the I/O request and whose task is in >>> UNINTERRUPTIBLE state would really be stuck. >>> >>> Basically, the symptoms depend not just on how QEMU is behaving but >>> also >>> on the guest kernel and how many vCPUs you have configured. >>> >>> I think this can explain how both problems you are observing, Oliver >>> and >>> Mike, are a result of the same bug. At least I hope they are :). >>> >>> Stefan >> >> > -- Oliver Francke filoo GmbH Moltkestraße 25a 33330 Gütersloh HRB4355 AG Gütersloh Geschäftsführer: J.Rehpöhler | C.Kunz Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh _______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [-- Attachment #2: Type: text/html, Size: 4988 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
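A job file reproducing the parameters Andrei lists; the ioengine, write pattern, target file and size are assumptions, everything else is taken from his description:

    ; wedge-repro.fio -- 4k direct I/O, 4 jobs, queue depth 16, 60 minutes
    [global]
    ioengine=libaio
    direct=1
    bs=4k
    rw=randwrite
    numjobs=4
    iodepth=16
    runtime=3600
    time_based

    [test]
    filename=/root/fio-testfile
    size=4g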
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 14:05 ` Andrei Mikhailovsky @ 2013-08-09 15:03 ` Stefan Hajnoczi 2013-08-10 7:30 ` Josh Durgin 0 siblings, 1 reply; 12+ messages in thread From: Stefan Hajnoczi @ 2013-08-09 15:03 UTC (permalink / raw) To: Andrei Mikhailovsky Cc: Josh Durgin, ceph-users, Oliver Francke, Mike Dawson, qemu-devel On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote: > I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins. > > I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2 Josh, In addition to the Ceph logs you can also use QEMU tracing with the following events enabled: virtio_blk_handle_write virtio_blk_handle_read virtio_blk_rw_complete See docs/tracing.txt for details on usage. Inspecting the trace output will let you observe the I/O request submission/completion from the virtio-blk device perspective. You'll be able to see whether requests are never being completed in some cases. This bug seems like a corner case or race condition since most requests seem to complete just fine. The problem is that eventually the virtio-blk device becomes unusable when it runs out of descriptors (it has 128). And before that limit is reached the guest may become unusable due to the hung I/O requests. Stefan ^ permalink raw reply [flat|nested] 12+ messages in thread
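A rough recipe for the tracing Stefan suggests, assuming a source build of a QEMU of that era; the configure flag spelling has varied between releases, so docs/tracing.txt in the tree being built is authoritative:

    ./configure --enable-trace-backend=simple ... && make
    # enable only the virtio-blk request events
    cat > /tmp/blk-events <<EOF
    virtio_blk_handle_write
    virtio_blk_handle_read
    virtio_blk_rw_complete
    EOF
    # start the guest with the extra option:
    #   -trace events=/tmp/blk-events,file=/tmp/blk.trace
    # after reproducing the hang, decode the log and look for requests that
    # never get a matching virtio_blk_rw_complete:
    ./scripts/simpletrace.py trace-events /tmp/blk.trace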
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 15:03 ` Stefan Hajnoczi @ 2013-08-10 7:30 ` Josh Durgin 0 siblings, 0 replies; 12+ messages in thread From: Josh Durgin @ 2013-08-10 7:30 UTC (permalink / raw) To: Stefan Hajnoczi Cc: ceph-users, Andrei Mikhailovsky, Oliver Francke, Mike Dawson, qemu-devel On 08/09/2013 08:03 AM, Stefan Hajnoczi wrote: > On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote: >> I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins. >> >> I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2 Oliver's logs show one aio_flush() never getting completed, which means it's an issue with aio_flush in librados when rbd caching isn't used. Mike's log is from a qemu without aio_flush(), and with caching turned on, and shows all flushes completing quickly, so it's a separate bug. > Josh, > In addition to the Ceph logs you can also use QEMU tracing with the > following events enabled: > virtio_blk_handle_write > virtio_blk_handle_read > virtio_blk_rw_complete > > See docs/tracing.txt for details on usage. > > Inspecting the trace output will let you observe the I/O request > submission/completion from the virtio-blk device perspective. You'll be > able to see whether requests are never being completed in some cases. Thanks for the info. That may be the best way to check what's happening when caching is enabled. Mike, could you recompile qemu with tracing enabled and get a trace of the hang you were seeing, in addition to the ceph logs? > This bug seems like a corner case or race condition since most requests > seem to complete just fine. The problem is that eventually the > virtio-blk device becomes unusable when it runs out of descriptors (it > has 128). And before that limit is reached the guest may become > unusable due to the hung I/O requests. It seems only one request hung from an important kernel thread in Oliver's case, but it's good to be aware of the descriptor limit. Josh ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686] 2013-08-09 9:22 ` Oliver Francke 2013-08-09 14:05 ` Andrei Mikhailovsky @ 2013-08-13 21:34 ` Sage Weil 1 sibling, 0 replies; 12+ messages in thread From: Sage Weil @ 2013-08-13 21:34 UTC (permalink / raw) To: Oliver Francke Cc: Josh Durgin, ceph-users, Mike Dawson, Stefan Hajnoczi, qemu-devel@nongnu.org Hi Oliver, (Posted this on the bug too, but:) Your last log revealed a bug in the librados aio flush. A fix is pushed to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest please (with caching off again)? Thanks! sage On Fri, 9 Aug 2013, Oliver Francke wrote: > Hi Josh, > > just opened > > http://tracker.ceph.com/issues/5919 > > with all collected information incl. debug-log. > > Hope it helps, > > Oliver. > > On 08/08/2013 07:01 PM, Josh Durgin wrote: > > On 08/08/2013 05:40 AM, Oliver Francke wrote: > > > Hi Josh, > > > > > > I have a session logged with: > > > > > > debug_ms=1:debug_rbd=20:debug_objectcacher=30 > > > > > > as you requested from Mike, even if I think, we do have another story > > > here, anyway. > > > > > > Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is > > > 3.2.0-51-amd... > > > > > > Do you want me to open a ticket for that stuff? I have about 5MB > > > compressed logfile waiting for you ;) > > > > Yes, that'd be great. If you could include the time when you saw the guest > > hang that'd be ideal. I'm not sure if this is one or two bugs, > > but it seems likely it's a bug in rbd and not qemu. > > > > Thanks! > > Josh > > > > > Thnx in advance, > > > > > > Oliver. > > > > > > On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote: > > > > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote: > > > > > Am 02.08.2013 um 23:47 schrieb Mike Dawson <mike.dawson@cloudapt.com>: > > > > > > We can "un-wedge" the guest by opening a NoVNC session or running a > > > > > > 'virsh screenshot' command. After that, the guest resumes and runs > > > > > > as expected. At that point we can examine the guest. Each time we'll > > > > > > see: > > > > If virsh screenshot works then this confirms that QEMU itself is still > > > > responding. Its main loop cannot be blocked since it was able to > > > > process the screendump command. > > > > > > > > This supports Josh's theory that a callback is not being invoked. The > > > > virtio-blk I/O request would be left in a pending state. > > > > > > > > Now here is where the behavior varies between configurations: > > > > > > > > On a Windows guest with 1 vCPU, you may see the symptom that the guest > > > > no > > > > longer responds to ping. > > > > > > > > On a Linux guest with multiple vCPUs, you may see the hung task message > > > > from the guest kernel because other vCPUs are still making progress. > > > > Just the vCPU that issued the I/O request and whose task is in > > > > UNINTERRUPTIBLE state would really be stuck. > > > > > > > > Basically, the symptoms depend not just on how QEMU is behaving but also > > > > on the guest kernel and how many vCPUs you have configured. > > > > > > > > I think this can explain how both problems you are observing, Oliver and > > > > Mike, are a result of the same bug. At least I hope they are :). 
> > > > > > > > Stefan > > > > > > > > > > > -- > > Oliver Francke > > filoo GmbH > Moltkestra?e 25a > 33330 G?tersloh > HRB4355 AG G?tersloh > > Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz > > Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh > > _______________________________________________ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ^ permalink raw reply [flat|nested] 12+ messages in thread
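For anyone retesting as Sage asks, only the client-side librados/librbd needs to come from the wip branch; a sketch against the autotools build Ceph used at the time, with the library path and drive string purely illustrative:

    git clone --recursive https://github.com/ceph/ceph.git && cd ceph
    git checkout wip-librados-aio-flush    # bobtail-based fix; use wip-5919 on master
    ./autogen.sh && ./configure && make
    # run a test guest against the freshly built libraries without installing them
    LD_LIBRARY_PATH=$PWD/src/.libs qemu-system-x86_64 ... \
      -drive file=rbd:rbd/test-disk:id=admin,if=virtio,cache=none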