netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* local_bh_enable & hard_start_xmit
@ 2005-04-18 21:37 Ben Greear
  2005-04-18 22:14 ` David S. Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Greear @ 2005-04-18 21:37 UTC (permalink / raw)
  To: 'netdev@oss.sgi.com'

This pertains to kernel 2.6.11.

I seem to have backed myself into a corner with network devices
and locking...again.

I have a thread that grabs a read lock with read_lock_irqsave,
loops through a list, making calls to dev->hard_start_xmit.

Before today, I was not messing with local_bh_enable/disable, and
I was seeing badness messages due to this call path:

Badness in local_bh_enable at kernel/softirq.c:140
  [<c01266f2>] local_bh_enable+0x92/0xa0
  [<c02b6b35>] dev_queue_xmit+0x165/0x280
  [<c0110364>] get_offset_pmtmr+0x14/0xcb0
  [<f89e2372>] vlan_dev_hwaccel_hard_start_xmit+0x62/0x70 [8021q]
  [<f89a89b2>] do_task+0x642/0x61e0 [wanlink]
...

The warning is due to this line:
void local_bh_enable(void)
{
	WARN_ON(irqs_disabled());


So, I decided to wrap my calls to dev->hard_start_xmit with local_bh_disable/enable.

That does not actually fix my problem because I still have the read-lock acquired.
I am going to try putting the disable/enable outside of that, but then I will be
potentially disabling the bh for a long time.


So, two questions:

1)  Why is it bad to have interrupts disabled when calling
     the local_bh_enable() method?

2)  Should there be a hard requirement that one must never have IRQs disabled
     when calling dev->hard_start_xmit  (this requirement seems to currently
     be in effect because VLANs can call dev_queue_xmit from their hard_start_xmit
     method, and it appears that dev_queue_xmit must not be called with IRQs disabled).


Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
  2005-04-18 21:37 local_bh_enable & hard_start_xmit Ben Greear
@ 2005-04-18 22:14 ` David S. Miller
  2005-04-18 22:59   ` Ben Greear
  0 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 2005-04-18 22:14 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

On Mon, 18 Apr 2005 14:37:22 -0700
Ben Greear <greearb@candelatech.com> wrote:

> So, two questions:
> 
> 1)  Why is it bad to have interrupts disabled when calling
>      the local_bh_enable() method?

Because it creates a deadlock.  You can always take hard IRQ disabling
locks inside of BH disabling ones, but _never_ the other way around.

local_bh_enable() potentially runs BH handlers, and this must occur with
hard IRQs enabled.

> 2)  Should there be a hard requirement that one must never have IRQs disabled
>      when calling dev->hard_start_xmit  (this requirement seems to currently
>      be in effect because VLANs can call dev_queue_xmit from their hard_start_xmit
>      method, and it appears that dev_queue_xmit must not be called with IRQs disabled).

Yes, it is another true requirement.

I even tried to disable hard IRQs during ->hard_start_xmit() to fix a
LLTX locking bug and it totally broke things.

There are ->hard_start_xmit() routines for very slow
devices which expect that jiffies continues to increment via timer
interrupts so that they can timeout feeding bytes to the chip properly.
Also, again with slow devices, it is expected that hard IRQs are enabled
so that your serial ports don't overrun.

In general, it's anti-social to IRQ response time sensitive devices in the
machine to disable hard IRQs for any non-trivial stretch of code.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
  2005-04-18 22:14 ` David S. Miller
@ 2005-04-18 22:59   ` Ben Greear
  2005-04-18 23:01     ` David S. Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Greear @ 2005-04-18 22:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

David S. Miller wrote:
> On Mon, 18 Apr 2005 14:37:22 -0700
> Ben Greear <greearb@candelatech.com> wrote:
> 
> 
>>So, two questions:
>>
>>1)  Why is it bad to have interrupts disabled when calling
>>     the local_bh_enable() method?
> 
> 
> Because it creates a deadlock.  You can always take hard IRQ disabling
> locks inside of BH disabling ones, but _never_ the other way around.
> 
> local_bh_enable() potentially runs BH handlers, and this must occur with
> hard IRQs enabled.

Ok.  It would be great if this explanation was in comments near the warning in the code.
The dev_start_xmit code in dev.c could also have a note mentioning that it
must never be called with IRQs disabled.

>>2)  Should there be a hard requirement that one must never have IRQs disabled
>>     when calling dev->hard_start_xmit  (this requirement seems to currently
>>     be in effect because VLANs can call dev_queue_xmit from their hard_start_xmit
>>     method, and it appears that dev_queue_xmit must not be called with IRQs disabled).
> 
> 
> Yes, it is another true requirement.

This would be a good addition to the Documentation/networking/netdevices.txt
file, or maybe to the dev.c file somewhere (I haven't found an complete list of
locking notes, though the comments in the dev.c file and the netdevices.txt file
are a big help.)

I should be able to fix my particular problem by using reference counting and
breaking up my big loop into smaller work units.

I assume that it is fine to nest calls to local_bh_enable/disable?  (This
seems to be required since you are supposed to have bh disabled when
calling hard_start_xmit, but hard_start_xmit can call dev_queue_xmit which
disables the bh again...)

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
  2005-04-18 22:59   ` Ben Greear
@ 2005-04-18 23:01     ` David S. Miller
  2005-04-18 23:17       ` Ben Greear
  2005-04-19  0:24       ` Ben Greear
  0 siblings, 2 replies; 8+ messages in thread
From: David S. Miller @ 2005-04-18 23:01 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

On Mon, 18 Apr 2005 15:59:05 -0700
Ben Greear <greearb@candelatech.com> wrote:

> This would be a good addition to the Documentation/networking/netdevices.txt
> file, or maybe to the dev.c file somewhere (I haven't found an complete list of
> locking notes, though the comments in the dev.c file and the netdevices.txt file
> are a big help.)

So write the patch to add such comments.  It would have taken the
same amount of typing as writing that paragraph saying how great an
addition this would be. :)

> I assume that it is fine to nest calls to local_bh_enable/disable?  (This
> seems to be required since you are supposed to have bh disabled when
> calling hard_start_xmit, but hard_start_xmit can call dev_queue_xmit which
> disables the bh again...)

Yes, this is legal, BH disabling does nest properly.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
  2005-04-18 23:01     ` David S. Miller
@ 2005-04-18 23:17       ` Ben Greear
  2005-04-19  0:24       ` Ben Greear
  1 sibling, 0 replies; 8+ messages in thread
From: Ben Greear @ 2005-04-18 23:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

David S. Miller wrote:
> On Mon, 18 Apr 2005 15:59:05 -0700
> Ben Greear <greearb@candelatech.com> wrote:
> 
> 
>>This would be a good addition to the Documentation/networking/netdevices.txt
>>file, or maybe to the dev.c file somewhere (I haven't found an complete list of
>>locking notes, though the comments in the dev.c file and the netdevices.txt file
>>are a big help.)
> 
> 
> So write the patch to add such comments.  It would have taken the
> same amount of typing as writing that paragraph saying how great an
> addition this would be. :)

I figured it would be quicker for you to paste the paragraph than to
apply my patch :)

I'll send one over directly.

Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
  2005-04-18 23:01     ` David S. Miller
  2005-04-18 23:17       ` Ben Greear
@ 2005-04-19  0:24       ` Ben Greear
       [not found]         ` <20050419231442.7e37b087.davem@davemloft.net>
  1 sibling, 1 reply; 8+ messages in thread
From: Ben Greear @ 2005-04-19  0:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

David S. Miller wrote:

> So write the patch to add such comments.  It would have taken the
> same amount of typing as writing that paragraph saying how great an
> addition this would be. :)

Signed-off-by:  Ben Greear <greearb@candelatech.com>


--- linux-2.6.11/Documentation/networking/netdevices.txt        2005-03-01 23:37:50.000000000 -0800+++ linux-2.6.11.p4s/Documentation/networking/netdevices.txt    2005-04-18 16:59:43.000000000 -0700@@ -51,6 +51,8 @@
         set_multicast_list
         Context: BHs disabled
         Notes: netif_queue_stopped() is guaranteed false
+               Interrupts must be enabled when calling hard_start_xmit.
+                (Interrupts must also be enabled when enabling the BH handler.)
         Return codes:
         o NETDEV_TX_OK everything ok.
         o NETDEV_TX_BUSY Cannot transmit packet, try later


Second patch:

[greear@lanforge-nx 2.6]$ diff -u linux-2.6.11/net/core/dev.c linux-2.6.11.mostly-clean/net/core/dev.c
--- linux-2.6.11/net/core/dev.c 2005-03-01 23:38:09.000000000 -0800
+++ linux-2.6.11.mostly-clean/net/core/dev.c    2005-04-18 17:23:21.767951600 -0700
@@ -1214,6 +1214,19 @@
   *     A negative errno code is returned on a failure. A success does not
   *     guarantee the frame will be transmitted as it may be dropped due
   *     to congestion or traffic shaping.
+ *
+ * -----------------------------------------------------------------------------------
+ *      I notice this method can also return errors from the queue disciplines,
+ *      including NET_XMIT_DROP, which is a positive value.  So, errors can also
+ *      be positive.
+ *
+ *      Regardless of the return value, the skb is consumed, so it is currently
+ *      difficult to retry a send to this method.  (You can bump the ref count
+ *      before sending to hold a reference for retry if you are careful.)
+ *
+ *      When calling this method, interrupts MUST be enabled.  This is because
+ *      the BH enable code must have IRQs enabled so that it will not deadlock.
+ *          --BLG
   */

  int dev_queue_xmit(struct sk_buff *skb)


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
       [not found]         ` <20050419231442.7e37b087.davem@davemloft.net>
@ 2005-04-22 19:39           ` Ben Greear
  2005-04-25  3:13             ` David S. Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Greear @ 2005-04-22 19:39 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 594 bytes --]

David S. Miller wrote:
> On Mon, 18 Apr 2005 17:24:52 -0700
> Ben Greear <greearb@candelatech.com> wrote:
> 
> 
>>David S. Miller wrote:
>>
>>
>>>So write the patch to add such comments.  It would have taken the
>>>same amount of typing as writing that paragraph saying how great an
>>>addition this would be. :)
>>
>>Signed-off-by:  Ben Greear <greearb@candelatech.com>
> 
> 
> -EPATCH_MUNGED_BY_MAIL_CLIENT

Sending as attachments this time...plz let me know if these work
better.

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


[-- Attachment #2: dev.c.patch --]
[-- Type: text/plain, Size: 1099 bytes --]

--- linux-2.6.11/net/core/dev.c	2005-03-01 23:38:09.000000000 -0800
+++ linux-2.6.11.mostly-clean/net/core/dev.c	2005-04-18 17:23:21.000000000 -0700
@@ -1214,6 +1214,19 @@
  *	A negative errno code is returned on a failure. A success does not
  *	guarantee the frame will be transmitted as it may be dropped due
  *	to congestion or traffic shaping.
+ *
+ * -----------------------------------------------------------------------------------
+ *      I notice this method can also return errors from the queue disciplines,
+ *      including NET_XMIT_DROP, which is a positive value.  So, errors can also
+ *      be positive.
+ *
+ *      Regardless of the return value, the skb is consumed, so it is currently
+ *      difficult to retry a send to this method.  (You can bump the ref count
+ *      before sending to hold a reference for retry if you are careful.)
+ *
+ *      When calling this method, interrupts MUST be enabled.  This is because
+ *      the BH enable code must have IRQs enabled so that it will not deadlock.
+ *          --BLG
  */
 
 int dev_queue_xmit(struct sk_buff *skb)

[-- Attachment #3: netdevices.txt.patch --]
[-- Type: text/plain, Size: 562 bytes --]

--- linux-2.6.11/Documentation/networking/netdevices.txt	2005-03-01 23:37:50.000000000 -0800
+++ linux-2.6.11.p4s/Documentation/networking/netdevices.txt	2005-04-18 16:59:43.000000000 -0700
@@ -51,6 +51,8 @@
 	set_multicast_list
 	Context: BHs disabled
 	Notes: netif_queue_stopped() is guaranteed false
+               Interrupts must be enabled when calling hard_start_xmit.
+                (Interrupts must also be enabled when enabling the BH handler.)
 	Return codes: 
 	o NETDEV_TX_OK everything ok. 
 	o NETDEV_TX_BUSY Cannot transmit packet, try later 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: local_bh_enable & hard_start_xmit
  2005-04-22 19:39           ` Ben Greear
@ 2005-04-25  3:13             ` David S. Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David S. Miller @ 2005-04-25  3:13 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

On Fri, 22 Apr 2005 12:39:10 -0700
Ben Greear <greearb@candelatech.com> wrote:

> Sending as attachments this time...plz let me know if these work
> better.

Looks good, patches applied.

Thanks Ben.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2005-04-25  3:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-18 21:37 local_bh_enable & hard_start_xmit Ben Greear
2005-04-18 22:14 ` David S. Miller
2005-04-18 22:59   ` Ben Greear
2005-04-18 23:01     ` David S. Miller
2005-04-18 23:17       ` Ben Greear
2005-04-19  0:24       ` Ben Greear
     [not found]         ` <20050419231442.7e37b087.davem@davemloft.net>
2005-04-22 19:39           ` Ben Greear
2005-04-25  3:13             ` David S. Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).