From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nathan Gamber
Subject: Re: i/o scheduler deadlocks with loopback devices
Date: Wed, 20 Oct 2010 10:30:17 -0400
Message-ID: <4CBEFCF9.8060705@liquidweb.com>
References: <4CBDDAED.5070503@liquidweb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <4CBDDAED.5070503@liquidweb.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Oddly enough, this only occurs on Intel hardware (core i5s, xeon boxen)
and not Opteron/Phenom systems.

On 10/19/10 13:52, Nathan Gamber wrote:
> Hello all,
>
> I'm able to consistently reproduce lockups in my domU with heavy I/O
> with the following error:
>
> [36841.420662] INFO: task rsyslogd:15014 blocked for more than 120 seconds.
> [36841.420843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> The task varies between any of the tasks that might be active
> (kjournald, loop0, etc.)
>
> My setup is:
> Xen dom0 version 3.4.2.
> domU: Ubuntu 10.04, 2.6.36-rc6 based on Stefano Stabellini's
> v2.6.36-rc6-urgent-fixes tree.
> Paravirtual disks and network interfaces.
> Root filesystem on /dev/xvda3, formatted ext3, mounted with default
> options.
> Both dom0 and domU are using the CFQ i/o scheduler.
>
> The xvbd is based on LVM, on top of a local SATA RAID array.
>
>
> To produce this, I can do one of the following:
>
> Set up domU as a primary drbd node, with my drbd volume on top of a
> local loopback device, and then rsync many files to the volume, delete
> them, and repeat until the crash.
>
> Mount a linux iso via loopback on a /mnt/test, rsync /mnt/test/ to
> another directory on xvda3, delete the files, and then repeat until
> the crash.
>
> This is very similar to the following situation:
>
> http://www.amailbox.org/mailarchive/linux-kernel/2010/9/1/4614107
>
> Jeremy Fitzhardinge replied to that thread, indicating that his "xen:
> use percpu interrupts for IPIs and VIRQs" and "xen: handle events as
> edge-triggered" patches should fix the issue. These were introduced
> into 2.6.36-rc3, I believe, and the issue persists. Disabling
> irqbalanced in dom0, as he suggested as a workaround, has no effect.
> I've also tried changing the scheduler, and reducing the number of
> vcpus from 4 to 1, which also had no effect.
>
> Regards,
>
> Nathan Gamber
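
For anyone trying to reproduce this, the copy/delete/repeat loop from the
second scenario in the quoted report could be sketched roughly as below.
This is only an illustration under stated assumptions, not the script
actually used: shutil.copytree stands in for rsync, the names churn() and
demo() are hypothetical, and the ISO is assumed to be loop-mounted already
(mounting needs root and is left out). demo() runs against a small
throwaway tree so the sketch can be executed anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def churn(src: Path, dst: Path, iterations: int) -> int:
    """Copy src to dst and delete it again, `iterations` times.

    Returns the total number of regular files copied across all passes.
    """
    copied = 0
    for _ in range(iterations):
        # Stand-in for `rsync /mnt/test/ <dst>` (assumption: plain copy).
        shutil.copytree(src, dst)
        copied += sum(1 for p in dst.rglob("*") if p.is_file())
        # Delete the files before the next pass.
        shutil.rmtree(dst)
    return copied


def demo() -> int:
    """Exercise churn() on a tiny temporary tree instead of a mounted ISO."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "sub").mkdir(parents=True)
        (src / "a.txt").write_text("a")
        (src / "sub" / "b.txt").write_text("b")
        return churn(src, Path(tmp) / "dst", iterations=3)


if __name__ == "__main__":
    print(demo())
```

On a real test box, src would be the loop mount (/mnt/test in the report)
and dst a directory on xvda3, with a much larger iteration count. Whether
a plain copy produces the same I/O pattern as rsync is untested here.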