From: Ingo Molnar <mingo@elte.hu>
To: Jeff Garzik <jgarzik@redhat.com>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
"David S. Miller" <davem@redhat.com>
Subject: [lock validator] drivers/net/8139too.c: deadlock?
Date: Thu, 26 Jan 2006 23:43:12 +0100 [thread overview]
Message-ID: <20060126224312.GA2779@elte.hu> (raw)
the lock validator i'm working on found the following scenario in the
rtl8139 driver, which it flagged as a deadlock:
---------------------------------------->
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timeout, status 0d 0000 c07f media 80.
eth0: Tx queue start entry 164281 dirty entry 164277.
eth0: Tx descriptor 0 is 000805ea.
eth0: Tx descriptor 1 is 00080042. (queue head)
eth0: Tx descriptor 2 is 00080042.
eth0: Tx descriptor 3 is 000805ea.
============================================
[ BUG: circular locking deadlock detected! ]
--------------------------------------------
hackbench/9560 [2] is trying to acquire lock {&tp->rx_lock} at:
[<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
but task is already holding lock {&dev->xmit_lock}, acquired at:
[<c045294b>] dev_watchdog+0x1b/0xc0
which lock already depends on the new lock,
which could lead to circular deadlocks!
the dependency chain (in reverse order) is:
-> #4 {&dev->xmit_lock}: [<c045294b>] dev_watchdog+0x1b/0xc0
-> #3 {&dev->queue_lock}: [<c0447154>] dev_queue_xmit+0x64/0x290
-> #2 {&((sk)->sk_lock.slock)}: [<c043eb66>] sk_clone+0x66/0x200
-> #1 {&((sk)->sk_lock.slock)}: [<c047a116>] tcp_v4_rcv+0x726/0x9d0
-> #0 {&tp->rx_lock}: [<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
other info that might help us debug this:
locks held by hackbench/9560:
#0: {net/unix/af_unix.c:&u->readlock} [<c0490e6f>] unix_stream_recvmsg+0xbf/0x4f0
#1: {&dev->xmit_lock} [<c045294b>] dev_watchdog+0x1b/0xc0
stack backtrace:
[<c010432d>] show_trace+0xd/0x10
[<c0104347>] dump_stack+0x17/0x20
[<c0137d60>] print_circular_bug_tail+0x40/0x50
[<c013922f>] debug_lock_chain+0x74f/0xd40
[<c013985d>] debug_lock_chain_spin+0x3d/0x60
[<c0266add>] _raw_spin_lock+0x2d/0x90
[<c04d9d18>] _spin_lock+0x8/0x10
[<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
[<c04529e8>] dev_watchdog+0xb8/0xc0
[<c0127615>] run_timer_softirq+0xf5/0x1f0
[<c0122f67>] __do_softirq+0x97/0x130
[<c0105519>] do_softirq+0x69/0x100
=======================
[<c0122c19>] irq_exit+0x39/0x50
[<c010f4cc>] smp_apic_timer_interrupt+0x4c/0x50
[<c010393b>] apic_timer_interrupt+0x27/0x2c
[<c0441829>] skb_release_data+0x59/0xa0
[<c0441b43>] kfree_skbmem+0x13/0xe0
[<c0441c58>] __kfree_skb+0x48/0xc0
[<c0490f7c>] unix_stream_recvmsg+0x1cc/0x4f0
[<c043d2d5>] do_sock_read+0x95/0xd0
[<c043d475>] sock_aio_read+0x75/0x80
[<c016499b>] do_sync_read+0xbb/0x110
[<c0164e88>] vfs_read+0x148/0x150
[<c01658bd>] sys_read+0x3d/0x70
[<c0102df7>] sysenter_past_esp+0x54/0x8d
eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
<----------------------------------------------
i'm wondering, is this a genuine deadlock, or a false positive? The
dependency chain is quite complex, but looks realistic:
-> #4 {&dev->xmit_lock}: [<c045294b>] dev_watchdog+0x1b/0xc0
-> #3 {&dev->queue_lock}: [<c0447154>] dev_queue_xmit+0x64/0x290
-> #2 {&((sk)->sk_lock.slock)}: [<c043eb66>] sk_clone+0x66/0x200
-> #1 {&((sk)->sk_lock.slock)}: [<c047a116>] tcp_v4_rcv+0x726/0x9d0
-> #0 {&tp->rx_lock}: [<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
and rtl8139_tx_timeout() is rare enough to not cause real lockups in
practice all that often.
explanation of the validator output: the above dependency chain does not
mean it actually occured in one single call sequence - it is a
comulative (and full) depdency graph the validator is maintaining, to
prove locking correctness. So it can easily be multiple tasks, at
distinct points in time, on different CPUs, which built this dependency
chain. The first (#0) and the last (#4) entry is the current locking
sequence's fingerprint - so do not understand the above to be an actual
locking stack - it cannot possibly occur in this order.
Ingo
next reply other threads:[~2006-01-26 22:42 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-01-26 22:43 Ingo Molnar [this message]
2006-01-27 0:35 ` [lock validator] drivers/net/8139too.c: deadlock? Francois Romieu
2006-01-27 2:22 ` Herbert Xu
2006-01-27 2:44 ` Jeff Garzik
2006-01-31 0:24 ` [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race Francois Romieu
2006-01-31 18:49 ` Ingo Molnar
2006-01-31 18:57 ` Ingo Molnar
2006-02-01 0:55 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060126224312.GA2779@elte.hu \
--to=mingo@elte.hu \
--cc=akpm@osdl.org \
--cc=davem@redhat.com \
--cc=jgarzik@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox