From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tommy Christensen
Subject: Re: [PATCH] Deadlock in af_packet/packet_rcv
Date: Tue, 30 Nov 2004 12:31:50 +0100
Message-ID: <41AC5A26.6000400@tpack.net>
References: <20041125205503.GA18083@suse.de> <41AC3E2F.2030003@tpack.net> <20041130110110.GD16970@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@oss.sgi.com
Return-path:
To: Olaf Kirch
In-Reply-To: <20041130110110.GD16970@suse.de>
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
List-Id: netdev.vger.kernel.org

Olaf Kirch wrote:
> On Tue, Nov 30, 2004 at 10:32:31AM +0100, Tommy Christensen wrote:
>
>>An interrupt handler shouldn't call dev_queue_xmit() directly. If
>>this indeed happens, it needs to be fixed. Which handler is this?
>
>
> The call path according to KDB goes like this:
>
>  application does sendmsg()
>   udp_push_pending_frames
>    ip_push_pending_frames
>     ip_output
>      dev_queue_xmit
>       dev_queue_xmit_nit
>        calls ptype->func(skb2, skb->dev, ptype),
>        where func=packet_rcv
>         packet_rcv (and this runs with BHs enabled)
>          take the &sk->sk_receive_queue spinlock
>
>  *** timer interrupt
>  net_tx_action
>   take the dev->queue_lock spin lock
>   qdisc_run
>    qdisc_restart
>     dev_queue_xmit_nit
>      as above
>       packet_rcv
>        blocks on the &sk->sk_receive_queue spinlock
>
> Before lockless-loopback this never triggered because we did a
> spin_lock_bh(&dev->xmit_lock) around the call to dev_queue_xmit_nit.
>
> Olaf

Ahh, back-traces are *so* nice to have.

I still don't agree with the conclusion, though. The spin_lock_bh()
is changed to a local_bh_disable() and an optional spin_lock().
That should not lead to what you are seeing!

I think perhaps your 'BH disabled count' has been corrupted.
There's a fix for that in 2.6.10-rc2.

-Tommy