From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: 3.0.4: Soft lockup in netfront in SMP build
Date: Sat, 21 Apr 2007 15:14:16 -0700
Message-ID: <462A8CB8.4080008@goop.org>
References: <342BAC0A5467384983B586A6B0B376710548B5AD@EXNA.corp.stratus.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
In-Reply-To: <342BAC0A5467384983B586A6B0B376710548B5AD@EXNA.corp.stratus.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Graham, Simon"
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Graham, Simon wrote:
> Just run into a (real) soft lockup running 3.0.4 - stack is at the end of this, but basically:
>
> . network_open acquires the rx spin lock with spin_lock() and then checks for
>   work on the queue and calls (I think) netif_rx_schedule with the lock still held which
>   can call into the hypervisor.
> . An interrupt is delivered to the bottom half of netfront which ends up calling netif_poll
>   which blocks attempting to acquire the rx spin lock.
>
> Oops!
>
> I see from the unstable tree that this code was recently modified to use spin_lock_bh() instead of spin_lock() as part of a mega-merge of IA64 code - clearly we can't merge this changeset into 3.0.4.
>
> I haven't looked too closely at all of the code yet, but I'm wondering if a judicious change of spin_lock to spin_lock_bh in netfront would be the best approach?
>

I found a few locking problems when I ran netfront with lockdep enabled.
Fixes were committed to xen-unstable in 14844:abea8d171503 and
14851:22460cfaca71. I was wondering if there had been any real cases of
these deadlocking.

    J
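
For anyone following the thread, the scenario Simon describes can be sketched as a small userspace toy model of a single CPU. This is not netfront or kernel code: every name below (toy_cpu, toy_open, toy_poll, toy_irq_exit) is hypothetical, and the model only assumes what the report states, namely that the open path holds the rx lock while a softirq that wants the same lock runs on the same CPU. In the plain spin_lock() case the poll handler finds the lock already held by its own CPU, which in a real kernel would spin forever. In the spin_lock_bh() case the softirq is deferred until the lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model. All names are hypothetical, not kernel APIs. */
struct toy_cpu {
    bool rx_lock_held;      /* stands in for the netfront rx spin lock */
    int  bh_disable_depth;  /* > 0 means softirqs are deferred on this CPU */
    bool softirq_pending;   /* an RX softirq has been raised */
    bool deadlocked;        /* poll found the lock held by its own CPU */
    int  polls_completed;   /* number of polls that ran to completion */
};

/* Stands in for netif_poll: needs the rx lock to do its work. */
void toy_poll(struct toy_cpu *cpu)
{
    if (cpu->rx_lock_held) {
        /* A real spinlock would spin here forever; the model just records it. */
        cpu->deadlocked = true;
        return;
    }
    cpu->rx_lock_held = true;
    cpu->polls_completed++;
    cpu->rx_lock_held = false;
}

/* Stands in for interrupt exit: pending softirqs run unless BHs are disabled. */
void toy_irq_exit(struct toy_cpu *cpu)
{
    if (cpu->softirq_pending && cpu->bh_disable_depth == 0) {
        cpu->softirq_pending = false;
        toy_poll(cpu);
    }
}

/* Stands in for network_open; use_bh selects spin_lock_bh-style locking. */
void toy_open(struct toy_cpu *cpu, bool use_bh)
{
    if (use_bh)
        cpu->bh_disable_depth++;    /* spin_lock_bh: disable BHs first */
    cpu->rx_lock_held = true;

    cpu->softirq_pending = true;    /* models netif_rx_schedule */
    toy_irq_exit(cpu);              /* an interrupt arrives with the lock held */

    cpu->rx_lock_held = false;
    if (use_bh) {
        cpu->bh_disable_depth--;    /* spin_unlock_bh: re-enable BHs ... */
        assert(cpu->bh_disable_depth >= 0);
        toy_irq_exit(cpu);          /* ... and run whatever was deferred */
    }
}
```

Calling toy_open(&cpu, false) on a zero-initialised struct toy_cpu leaves cpu.deadlocked set and cpu.polls_completed at 0, while toy_open(&cpu, true) completes one poll with no deadlock. The model ignores other CPUs and the hypervisor call entirely, so it only illustrates why disabling bottom halves around the lock avoids the same-CPU case.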