* [lock validator] drivers/net/8139too.c: deadlock?
@ 2006-01-26 22:43 Ingo Molnar
2006-01-27 0:35 ` Francois Romieu
0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2006-01-26 22:43 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, Andrew Morton, David S. Miller
the lock validator i'm working on found the following scenario in the
rtl8139 driver, which it flagged as a deadlock:
---------------------------------------->
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timeout, status 0d 0000 c07f media 80.
eth0: Tx queue start entry 164281 dirty entry 164277.
eth0: Tx descriptor 0 is 000805ea.
eth0: Tx descriptor 1 is 00080042. (queue head)
eth0: Tx descriptor 2 is 00080042.
eth0: Tx descriptor 3 is 000805ea.
============================================
[ BUG: circular locking deadlock detected! ]
--------------------------------------------
hackbench/9560 [2] is trying to acquire lock {&tp->rx_lock} at:
[<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
but task is already holding lock {&dev->xmit_lock}, acquired at:
[<c045294b>] dev_watchdog+0x1b/0xc0
which lock already depends on the new lock,
which could lead to circular deadlocks!
the dependency chain (in reverse order) is:
-> #4 {&dev->xmit_lock}: [<c045294b>] dev_watchdog+0x1b/0xc0
-> #3 {&dev->queue_lock}: [<c0447154>] dev_queue_xmit+0x64/0x290
-> #2 {&((sk)->sk_lock.slock)}: [<c043eb66>] sk_clone+0x66/0x200
-> #1 {&((sk)->sk_lock.slock)}: [<c047a116>] tcp_v4_rcv+0x726/0x9d0
-> #0 {&tp->rx_lock}: [<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
other info that might help us debug this:
locks held by hackbench/9560:
#0: {net/unix/af_unix.c:&u->readlock} [<c0490e6f>] unix_stream_recvmsg+0xbf/0x4f0
#1: {&dev->xmit_lock} [<c045294b>] dev_watchdog+0x1b/0xc0
stack backtrace:
[<c010432d>] show_trace+0xd/0x10
[<c0104347>] dump_stack+0x17/0x20
[<c0137d60>] print_circular_bug_tail+0x40/0x50
[<c013922f>] debug_lock_chain+0x74f/0xd40
[<c013985d>] debug_lock_chain_spin+0x3d/0x60
[<c0266add>] _raw_spin_lock+0x2d/0x90
[<c04d9d18>] _spin_lock+0x8/0x10
[<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
[<c04529e8>] dev_watchdog+0xb8/0xc0
[<c0127615>] run_timer_softirq+0xf5/0x1f0
[<c0122f67>] __do_softirq+0x97/0x130
[<c0105519>] do_softirq+0x69/0x100
=======================
[<c0122c19>] irq_exit+0x39/0x50
[<c010f4cc>] smp_apic_timer_interrupt+0x4c/0x50
[<c010393b>] apic_timer_interrupt+0x27/0x2c
[<c0441829>] skb_release_data+0x59/0xa0
[<c0441b43>] kfree_skbmem+0x13/0xe0
[<c0441c58>] __kfree_skb+0x48/0xc0
[<c0490f7c>] unix_stream_recvmsg+0x1cc/0x4f0
[<c043d2d5>] do_sock_read+0x95/0xd0
[<c043d475>] sock_aio_read+0x75/0x80
[<c016499b>] do_sync_read+0xbb/0x110
[<c0164e88>] vfs_read+0x148/0x150
[<c01658bd>] sys_read+0x3d/0x70
[<c0102df7>] sysenter_past_esp+0x54/0x8d
eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
<----------------------------------------------
i'm wondering, is this a genuine deadlock, or a false positive? The
dependency chain is quite complex, but looks realistic:
-> #4 {&dev->xmit_lock}: [<c045294b>] dev_watchdog+0x1b/0xc0
-> #3 {&dev->queue_lock}: [<c0447154>] dev_queue_xmit+0x64/0x290
-> #2 {&((sk)->sk_lock.slock)}: [<c043eb66>] sk_clone+0x66/0x200
-> #1 {&((sk)->sk_lock.slock)}: [<c047a116>] tcp_v4_rcv+0x726/0x9d0
-> #0 {&tp->rx_lock}: [<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
and rtl8139_tx_timeout() is rare enough to not cause real lockups in
practice all that often.
explanation of the validator output: the above dependency chain does not
mean it actually occured in one single call sequence - it is a
comulative (and full) depdency graph the validator is maintaining, to
prove locking correctness. So it can easily be multiple tasks, at
distinct points in time, on different CPUs, which built this dependency
chain. The first (#0) and the last (#4) entry is the current locking
sequence's fingerprint - so do not understand the above to be an actual
locking stack - it cannot possibly occur in this order.
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lock validator] drivers/net/8139too.c: deadlock?
2006-01-26 22:43 [lock validator] drivers/net/8139too.c: deadlock? Ingo Molnar
@ 2006-01-27 0:35 ` Francois Romieu
2006-01-27 2:22 ` Herbert Xu
0 siblings, 1 reply; 8+ messages in thread
From: Francois Romieu @ 2006-01-27 0:35 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Jeff Garzik, linux-kernel, Andrew Morton, David S. Miller
Ingo Molnar <mingo@elte.hu> :
[...]
> i'm wondering, is this a genuine deadlock, or a false positive? The
> dependency chain is quite complex, but looks realistic:
>
> -> #4 {&dev->xmit_lock}: [<c045294b>] dev_watchdog+0x1b/0xc0
> -> #3 {&dev->queue_lock}: [<c0447154>] dev_queue_xmit+0x64/0x290
> -> #2 {&((sk)->sk_lock.slock)}: [<c043eb66>] sk_clone+0x66/0x200
> -> #1 {&((sk)->sk_lock.slock)}: [<c047a116>] tcp_v4_rcv+0x726/0x9d0
> -> #0 {&tp->rx_lock}: [<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
It looks like watchdog racing against rx_poll (which should/can not
happen). Do you have something specific in mind ?
--
Ueimor
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lock validator] drivers/net/8139too.c: deadlock?
2006-01-27 0:35 ` Francois Romieu
@ 2006-01-27 2:22 ` Herbert Xu
2006-01-27 2:44 ` Jeff Garzik
0 siblings, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2006-01-27 2:22 UTC (permalink / raw)
To: Francois Romieu; +Cc: mingo, jgarzik, linux-kernel, akpm, davem
Francois Romieu <romieu@fr.zoreil.com> wrote:
> Ingo Molnar <mingo@elte.hu> :
> [...]
>> i'm wondering, is this a genuine deadlock, or a false positive? The
>> dependency chain is quite complex, but looks realistic:
>>
>> -> #4 {&dev->xmit_lock}: [<c045294b>] dev_watchdog+0x1b/0xc0
>> -> #3 {&dev->queue_lock}: [<c0447154>] dev_queue_xmit+0x64/0x290
>> -> #2 {&((sk)->sk_lock.slock)}: [<c043eb66>] sk_clone+0x66/0x200
>> -> #1 {&((sk)->sk_lock.slock)}: [<c047a116>] tcp_v4_rcv+0x726/0x9d0
>> -> #0 {&tp->rx_lock}: [<c033e460>] rtl8139_tx_timeout+0x110/0x1f0
>
> It looks like watchdog racing against rx_poll (which should/can not
> happen). Do you have something specific in mind ?
You've got it.
rx_poll => rtl8139_rx => netif_receive_skb => ... => tcp_v4_rcv
In fact once we're at netif_receive_skb it's easy to see how we'll grab
xmit_lock again.
Prescription: Move TX timeout handling into a work queue.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [lock validator] drivers/net/8139too.c: deadlock?
2006-01-27 2:22 ` Herbert Xu
@ 2006-01-27 2:44 ` Jeff Garzik
2006-01-31 0:24 ` [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race Francois Romieu
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2006-01-27 2:44 UTC (permalink / raw)
To: Herbert Xu; +Cc: Francois Romieu, mingo, jgarzik, linux-kernel, akpm, davem
Herbert Xu wrote:
> You've got it.
>
> rx_poll => rtl8139_rx => netif_receive_skb => ... => tcp_v4_rcv
>
> In fact once we're at netif_receive_skb it's easy to see how we'll grab
> xmit_lock again.
>
> Prescription: Move TX timeout handling into a work queue.
This was my recommendation as well.
Using a work queue not only solves this locking problem, but it also
enables the possibility of sleeping, a common operation during the
hardware resets that often occur during ->tx_timeout()
Anyone who wants to make this change in a driver, don't forget that this
could race with other operations, notably dev->stop() or suspend.
Jeff
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race
2006-01-27 2:44 ` Jeff Garzik
@ 2006-01-31 0:24 ` Francois Romieu
2006-01-31 18:49 ` Ingo Molnar
2006-02-01 0:55 ` Andrew Morton
0 siblings, 2 replies; 8+ messages in thread
From: Francois Romieu @ 2006-01-31 0:24 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Herbert Xu, mingo, linux-kernel, akpm, davem
Ingo's stealth lock validator detected that both thread acquire
dev->xmit_lock and tp->rx_lock in reverse order.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
471819cf98a56751ae0387400542fe2997855327
diff --git a/drivers/net/8139too.c b/drivers/net/8139too.c
index adfba44..2beac55 100644
--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -586,6 +586,7 @@ struct rtl8139_private {
dma_addr_t tx_bufs_dma;
signed char phys[4]; /* MII device addresses. */
char twistie, twist_row, twist_col; /* Twister tune state. */
+ unsigned int watchdog_fired : 1;
unsigned int default_port : 4; /* Last dev->if_port value. */
unsigned int have_thread : 1;
spinlock_t lock;
@@ -638,6 +639,7 @@ static void rtl8139_set_rx_mode (struct
static void __set_rx_mode (struct net_device *dev);
static void rtl8139_hw_start (struct net_device *dev);
static void rtl8139_thread (void *_data);
+static void rtl8139_tx_timeout_task(void *_data);
static struct ethtool_ops rtl8139_ethtool_ops;
/* write MMIO register, with flush */
@@ -1598,13 +1600,14 @@ static void rtl8139_thread (void *_data)
{
struct net_device *dev = _data;
struct rtl8139_private *tp = netdev_priv(dev);
- unsigned long thr_delay;
+ unsigned long thr_delay = next_tick;
- if (rtnl_shlock_nowait() == 0) {
+ if (tp->watchdog_fired) {
+ tp->watchdog_fired = 0;
+ rtl8139_tx_timeout_task(_data);
+ } else if (rtnl_shlock_nowait() == 0) {
rtl8139_thread_iter (dev, tp, tp->mmio_addr);
rtnl_unlock ();
-
- thr_delay = next_tick;
} else {
/* unlikely race. mitigate with fast poll. */
thr_delay = HZ / 2;
@@ -1631,7 +1634,8 @@ static void rtl8139_stop_thread(struct r
if (tp->have_thread) {
cancel_rearming_delayed_work(&tp->thread);
tp->have_thread = 0;
- }
+ } else
+ flush_scheduled_work();
}
static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
@@ -1642,14 +1646,13 @@ static inline void rtl8139_tx_clear (str
/* XXX account for unsent Tx packets in tp->stats.tx_dropped */
}
-
-static void rtl8139_tx_timeout (struct net_device *dev)
+static void rtl8139_tx_timeout_task (void *_data)
{
+ struct net_device *dev = _data;
struct rtl8139_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
int i;
u8 tmp8;
- unsigned long flags;
printk (KERN_DEBUG "%s: Transmit timeout, status %2.2x %4.4x %4.4x "
"media %2.2x.\n", dev->name, RTL_R8 (ChipCmd),
@@ -1670,23 +1673,34 @@ static void rtl8139_tx_timeout (struct n
if (tmp8 & CmdTxEnb)
RTL_W8 (ChipCmd, CmdRxEnb);
- spin_lock(&tp->rx_lock);
+ spin_lock_bh(&tp->rx_lock);
/* Disable interrupts by clearing the interrupt mask. */
RTL_W16 (IntrMask, 0x0000);
/* Stop a shared interrupt from scavenging while we are. */
- spin_lock_irqsave (&tp->lock, flags);
+ spin_lock_irq(&tp->lock);
rtl8139_tx_clear (tp);
- spin_unlock_irqrestore (&tp->lock, flags);
+ spin_unlock_irq(&tp->lock);
/* ...and finally, reset everything */
if (netif_running(dev)) {
rtl8139_hw_start (dev);
netif_wake_queue (dev);
}
- spin_unlock(&tp->rx_lock);
+ spin_unlock_bh(&tp->rx_lock);
}
+static void rtl8139_tx_timeout (struct net_device *dev)
+{
+ struct rtl8139_private *tp = netdev_priv(dev);
+
+ if (!tp->have_thread) {
+ INIT_WORK(&tp->thread, rtl8139_tx_timeout_task, dev);
+ schedule_delayed_work(&tp->thread, next_tick);
+ } else
+ tp->watchdog_fired = 1;
+
+}
static int rtl8139_start_xmit (struct sk_buff *skb, struct net_device *dev)
{
--
1.1.3
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race
2006-01-31 0:24 ` [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race Francois Romieu
@ 2006-01-31 18:49 ` Ingo Molnar
2006-01-31 18:57 ` Ingo Molnar
2006-02-01 0:55 ` Andrew Morton
1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2006-01-31 18:49 UTC (permalink / raw)
To: Francois Romieu; +Cc: Jeff Garzik, Herbert Xu, linux-kernel, akpm, davem
[-- Attachment #1: Type: text/plain, Size: 3907 bytes --]
* Francois Romieu <romieu@fr.zoreil.com> wrote:
> Ingo's stealth lock validator detected that both thread acquire
> dev->xmit_lock and tp->rx_lock in reverse order.
>
> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
i've ported this to -mm4 (see the attached patch), but i'm also getting
a new deadlock message. Seems to be a separate issue, not introduced by
your patch - but still needs fixing i guess.
Ingo
=========================================
[ BUG: irq lock inversion bug detected! ]
-----------------------------------------
rcu_torture_rea/263 just changed the state of lock:
(&mc->mca_lock){-+}, at: [<c04b4c0f>] mld_ifc_timer_expire+0x17f/0x2c0
but this lock took another, softirq-unsafe lock in the past:
(&tp->lock){--}, at: [<c03406d2>] rtl8139_set_rx_mode+0x22/0x180
and interrupts could create an inverse lock dependency between them,
which could lead to deadlocks!
other info that might help us debug this:
no locks held by rcu_torture_rea/263.
-> (&tp->lock){--} {
used at: [<c03413da>] rtl8139_interrupt+0x2a/0x5e0
softirq-on at: [<c044b5fa>] dev_mc_upload+0x2a/0x40
hardirq-on at: [<c0160a28>] kmem_cache_alloc+0x98/0xd0
}
... key at: [<c033fffb>] rtl8139_init_one+0x2ab/0x960
... acquired at: [<c03406d2>] rtl8139_set_rx_mode+0x22/0x180
the first lock's dependencies:
-> (&mc->mca_lock){-+} {
used at: [<c04b3440>] igmp6_group_added+0x20/0x170
in-softirq at: [<c04b4c0f>] mld_ifc_timer_expire+0x17f/0x2c0
hardirq-on at: [<c04dcaa5>] _spin_unlock_irqrestore+0x25/0x30
}
... key at: [<c04b3edd>] ipv6_dev_mc_inc+0x17d/0x3c0
... acquired at: [<c04b4c0f>] mld_ifc_timer_expire+0x17f/0x2c0
-> (&dev->xmit_lock){-.} {
used at: [<c044b5ec>] dev_mc_upload+0x1c/0x40
hardirq-on at: [<c04dba6a>] __mutex_lock_slowpath+0x27a/0x4e0
}
... key at: [<c0449fdd>] register_netdevice+0x5d/0x3a0
... acquired at: [<c044b724>] dev_mc_add+0x34/0x140
-> (&tp->lock){--} {
used at: [<c03413da>] rtl8139_interrupt+0x2a/0x5e0
softirq-on at: [<c044b5fa>] dev_mc_upload+0x2a/0x40
hardirq-on at: [<c0160a28>] kmem_cache_alloc+0x98/0xd0
}
... key at: [<c033fffb>] rtl8139_init_one+0x2ab/0x960
... acquired at: [<c03406d2>] rtl8139_set_rx_mode+0x22/0x180
-> (&base->t_base.lock){++} {
used at: [<c0126df3>] lock_timer_base+0x23/0x50
in-hardirq at: [<c0126df3>] lock_timer_base+0x23/0x50
in-softirq at: [<c01277f2>] run_timer_softirq+0x42/0x1f0
}
... key at: [<c1fd2ef0>] 0xc1fd2ef0
... acquired at: [<c0126df3>] lock_timer_base+0x23/0x50
the second lock's dependencies:
-> (&tp->lock){--} {
used at: [<c03413da>] rtl8139_interrupt+0x2a/0x5e0
softirq-on at: [<c044b5fa>] dev_mc_upload+0x2a/0x40
hardirq-on at: [<c0160a28>] kmem_cache_alloc+0x98/0xd0
}
... key at: [<c033fffb>] rtl8139_init_one+0x2ab/0x960
... acquired at: [<c03406d2>] rtl8139_set_rx_mode+0x22/0x180
stack backtrace:
[<c010437d>] show_trace+0xd/0x10
[<c0104397>] dump_stack+0x17/0x20
[<c0137db2>] print_irq_inversion_bug+0x142/0x1b0
[<c0137f91>] check_usage_forwards+0x41/0x50
[<c0139520>] mark_lock+0x180/0x350
[<c0139b83>] debug_lock_chain+0x493/0x1090
[<c013a7bd>] debug_lock_chain_spin+0x3d/0x60
[<c026777d>] _raw_spin_lock+0x2d/0x90
[<c04dc942>] _spin_lock_bh+0x12/0x20
[<c04b4c0f>] mld_ifc_timer_expire+0x17f/0x2c0
[<c01278a5>] run_timer_softirq+0xf5/0x1f0
[<c01233a7>] __do_softirq+0x97/0x130
[<c01054d9>] do_softirq+0x69/0x100
=======================
[<c0123065>] irq_exit+0x45/0x50
[<c010f534>] smp_apic_timer_interrupt+0x54/0x60
[<c010395b>] apic_timer_interrupt+0x27/0x2c
[<c0265f0c>] __delay+0xc/0x10
[<c0265f71>] __udelay+0x31/0x40
[<c0143333>] rcu_torture_reader+0x83/0x140
[<c0131dea>] kthread+0xca/0xd0
[<c0100ef5>] kernel_thread_helper+0x5/0x10
eth0: no IPv6 routers present
[-- Attachment #2: 8139too-tx-deadlock-fix.patch --]
[-- Type: text/plain, Size: 3523 bytes --]
Ingo's stealth lock validator detected that both thread acquire
dev->xmit_lock and tp->rx_lock in reverse order.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
----
drivers/net/8139too.c | 36 ++++++++++++++++++++++++++----------
1 files changed, 26 insertions(+), 10 deletions(-)
Index: linux/drivers/net/8139too.c
===================================================================
--- linux.orig/drivers/net/8139too.c
+++ linux/drivers/net/8139too.c
@@ -586,6 +586,7 @@ struct rtl8139_private {
dma_addr_t tx_bufs_dma;
signed char phys[4]; /* MII device addresses. */
char twistie, twist_row, twist_col; /* Twister tune state. */
+ unsigned int watchdog_fired : 1;
unsigned int default_port : 4; /* Last dev->if_port value. */
unsigned int have_thread : 1;
spinlock_t lock;
@@ -638,6 +639,7 @@ static void rtl8139_set_rx_mode (struct
static void __set_rx_mode (struct net_device *dev);
static void rtl8139_hw_start (struct net_device *dev);
static void rtl8139_thread (void *_data);
+static void rtl8139_tx_timeout_task(void *_data);
static struct ethtool_ops rtl8139_ethtool_ops;
/* write MMIO register, with flush */
@@ -1598,9 +1600,12 @@ static void rtl8139_thread (void *_data)
{
struct net_device *dev = _data;
struct rtl8139_private *tp = netdev_priv(dev);
- unsigned long thr_delay;
+ unsigned long thr_delay = next_tick;
- if (rtnl_shlock_nowait() != 0) {
+ if (tp->watchdog_fired) {
+ tp->watchdog_fired = 0;
+ rtl8139_tx_timeout_task(_data);
+ } else if (rtnl_shlock_nowait() != 0) {
rtl8139_thread_iter (dev, tp, tp->mmio_addr);
rtnl_unlock ();
@@ -1631,7 +1636,8 @@ static void rtl8139_stop_thread(struct r
if (tp->have_thread) {
cancel_rearming_delayed_work(&tp->thread);
tp->have_thread = 0;
- }
+ } else
+ flush_scheduled_work();
}
static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
@@ -1642,14 +1648,13 @@ static inline void rtl8139_tx_clear (str
/* XXX account for unsent Tx packets in tp->stats.tx_dropped */
}
-
-static void rtl8139_tx_timeout (struct net_device *dev)
+static void rtl8139_tx_timeout_task (void *_data)
{
+ struct net_device *dev = _data;
struct rtl8139_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
int i;
u8 tmp8;
- unsigned long flags;
printk (KERN_DEBUG "%s: Transmit timeout, status %2.2x %4.4x %4.4x "
"media %2.2x.\n", dev->name, RTL_R8 (ChipCmd),
@@ -1670,23 +1675,34 @@ static void rtl8139_tx_timeout (struct n
if (tmp8 & CmdTxEnb)
RTL_W8 (ChipCmd, CmdRxEnb);
- spin_lock(&tp->rx_lock);
+ spin_lock_bh(&tp->rx_lock);
/* Disable interrupts by clearing the interrupt mask. */
RTL_W16 (IntrMask, 0x0000);
/* Stop a shared interrupt from scavenging while we are. */
- spin_lock_irqsave (&tp->lock, flags);
+ spin_lock_irq(&tp->lock);
rtl8139_tx_clear (tp);
- spin_unlock_irqrestore (&tp->lock, flags);
+ spin_unlock_irq(&tp->lock);
/* ...and finally, reset everything */
if (netif_running(dev)) {
rtl8139_hw_start (dev);
netif_wake_queue (dev);
}
- spin_unlock(&tp->rx_lock);
+ spin_unlock_bh(&tp->rx_lock);
}
+static void rtl8139_tx_timeout (struct net_device *dev)
+{
+ struct rtl8139_private *tp = netdev_priv(dev);
+
+ if (!tp->have_thread) {
+ INIT_WORK(&tp->thread, rtl8139_tx_timeout_task, dev);
+ schedule_delayed_work(&tp->thread, next_tick);
+ } else
+ tp->watchdog_fired = 1;
+
+}
static int rtl8139_start_xmit (struct sk_buff *skb, struct net_device *dev)
{
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race
2006-01-31 18:49 ` Ingo Molnar
@ 2006-01-31 18:57 ` Ingo Molnar
0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2006-01-31 18:57 UTC (permalink / raw)
To: Francois Romieu; +Cc: Jeff Garzik, Herbert Xu, linux-kernel, akpm, davem
* Ingo Molnar <mingo@elte.hu> wrote:
> > Ingo's stealth lock validator detected that both thread acquire
> > dev->xmit_lock and tp->rx_lock in reverse order.
> >
> > Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
>
> i've ported this to -mm4 (see the attached patch), but i'm also
> getting a new deadlock message. Seems to be a separate issue, not
> introduced by your patch - but still needs fixing i guess.
sorry - this is a false alarm. (caused by the lock validator not
properly handling DEBUG_SHIRQ)
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race
2006-01-31 0:24 ` [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race Francois Romieu
2006-01-31 18:49 ` Ingo Molnar
@ 2006-02-01 0:55 ` Andrew Morton
1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2006-02-01 0:55 UTC (permalink / raw)
To: Francois Romieu; +Cc: jgarzik, herbert, mingo, linux-kernel, davem
Francois Romieu <romieu@fr.zoreil.com> wrote:
>
> + unsigned int watchdog_fired : 1;
> unsigned int default_port : 4; /* Last dev->if_port value. */
> unsigned int have_thread : 1;
Bear in mind that the compiler will put these three fields into the same
word and will use non-atomic RMWs when they are modified.
In this particualr case it's hard to see how an SMP or IRQ race can occur,
but it's a trap. Which won't show up on x86.
struct x {
unsigned int a:1;
unsigned int b:1;
};
void foo(struct x *a)
{
a->a = 1;
}
void bar(struct x *a)
{
a->b = 1;
}
On ppc:
foo:
lwz 0,0(3)
oris 0,0,0x8000
stw 0,0(3)
blr
.size foo, .-foo
.align 2
.globl bar
.type bar, @function
bar:
lwz 0,0(3)
oris 0,0,0x4000
stw 0,0(3)
blr
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-02-01 1:05 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-01-26 22:43 [lock validator] drivers/net/8139too.c: deadlock? Ingo Molnar
2006-01-27 0:35 ` Francois Romieu
2006-01-27 2:22 ` Herbert Xu
2006-01-27 2:44 ` Jeff Garzik
2006-01-31 0:24 ` [PATCH] 8139too: fix a TX timeout watchdog thread against NAPI softirq race Francois Romieu
2006-01-31 18:49 ` Ingo Molnar
2006-01-31 18:57 ` Ingo Molnar
2006-02-01 0:55 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox