netdev.vger.kernel.org archive mirror
* d80211: How does TX flow control work?
@ 2007-01-02 13:08 Jan Kiszka
  2007-01-03 17:52 ` Jiri Benc
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2007-01-02 13:08 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Benc, Ivo Van Doorn, rt2400-devel

Hi,

Can someone explain how TX flow control in d80211 is supposed to work? I
have failed to understand the full design so far.

What I (think I) understand is that low-level drivers call
ieee80211_stop_queue() if they run out of buffers. That sets a
per-queue bit (IEEE80211_LINK_STATE_XOFF), prevents any further
frame from being passed to the low-level TX routine, and can cause up to
*one* packet per queue to be stored in
ieee80211_local::pending_packets[queue]. But it looks to me like nothing
prevents ieee80211_tx() from being invoked even when there is already
something in that single-packet storage.

That in turn triggers WARN_ONs in ieee80211_tx() under high load for me
(with rt2500usb). And it should also cause orphaned skbs because the
storage is overwritten in that case. Either I'm blind or something is
fishy...
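
A minimal sketch of the suspected failure mode, as a toy Python model
and not the d80211 code. The class ToyTxPath and all of its fields are
hypothetical; they only stand in for the XOFF bit, the
pending_packets[queue] slot and the WARN_ON, and the exact reaction of
the real ieee80211_tx() is an assumption.

```python
class ToyTxPath:
    """Toy model: one XOFF flag and one single-frame slot per queue."""

    def __init__(self, num_queues):
        self.xoff = [False] * num_queues     # stands in for IEEE80211_LINK_STATE_XOFF
        self.pending = [None] * num_queues   # stands in for pending_packets[queue]
        self.sent = []                       # frames the toy driver accepted
        self.warnings = 0                    # stands in for WARN_ON hits
        self.orphaned = []                   # frames lost when the slot is overwritten

    def stop_queue(self, queue):
        # What a driver would do when it runs out of buffers.
        self.xoff[queue] = True

    def tx(self, queue, frame):
        if self.pending[queue] is not None:
            # The slot is already occupied: warn, and the old frame is lost.
            self.warnings += 1
            self.orphaned.append(self.pending[queue])
            self.pending[queue] = None
        if self.xoff[queue]:
            self.pending[queue] = frame      # park at most one frame
        else:
            self.sent.append(frame)


path = ToyTxPath(num_queues=1)
path.tx(0, "a")          # queue running: frame goes to the driver
path.stop_queue(0)
path.tx(0, "b")          # parked in the single slot
path.tx(0, "c")          # slot already taken: warning, "b" is orphaned
print(path.sent, path.pending, path.orphaned, path.warnings)
```

In this model the second frame submitted while the queue is stopped is
what produces both the warning and the orphaned frame.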

Jan



* Re: d80211: How does TX flow control work?
  2007-01-02 13:08 d80211: How does TX flow control work? Jan Kiszka
@ 2007-01-03 17:52 ` Jiri Benc
  2007-01-03 18:10   ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jiri Benc @ 2007-01-03 17:52 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: netdev, Ivo Van Doorn, rt2400-devel

On Tue, 02 Jan 2007 14:08:21 +0100, Jan Kiszka wrote:
> What I (think I) understand is that low-level drivers call
> ieee80211_stop_queue() if they run out of buffers. That sets a
> per-queue bit (IEEE80211_LINK_STATE_XOFF), prevents any further
> frame from being passed to the low-level TX routine,

Correct.

> and can cause up to
> *one* packet per queue to be stored in
> ieee80211_local::pending_packets[queue].

This is needed due to fragmented frames. After resume, passing of
fragments to the driver has to continue where it was stopped. Returning
the half-sent fragmented frame to the 802.11 qdisc wasn't possible
until recently (I think the conversion of master interface to native
802.11 type could allow that now - but it's probably not worth the
effort).
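
A minimal sketch of why a parked frame is needed for fragments, as a
toy Python model rather than the actual stack. The class ToyFragmentTx,
the notion of "driver slots" and the wake() call are hypothetical names
chosen for illustration only.

```python
class ToyFragmentTx:
    """Toy model: resume a half-sent fragmented frame where it stopped."""

    def __init__(self, driver_slots):
        self.driver_slots = driver_slots   # free driver buffers (hypothetical)
        self.on_air = []                   # fragments handed to the driver, in order
        self.pending = None                # unsent tail of one fragmented frame

    def _push(self, fragments):
        for i, frag in enumerate(fragments):
            if self.driver_slots == 0:
                # Driver is full: remember the remaining fragments.
                self.pending = fragments[i:]
                return False
            self.driver_slots -= 1
            self.on_air.append(frag)
        self.pending = None
        return True

    def tx(self, fragments):
        # The caller must not submit a new frame while one is parked.
        assert self.pending is None
        return self._push(list(fragments))

    def wake(self, freed_slots):
        # On resume, the parked fragments go out before anything new.
        self.driver_slots += freed_slots
        if self.pending is not None:
            return self._push(self.pending)
        return True


tx = ToyFragmentTx(driver_slots=2)
tx.tx(["f1", "f2", "f3", "f4"])   # only f1 and f2 fit
tx.wake(freed_slots=3)            # f3 and f4 follow, in order
print(tx.on_air)
```

The point of the sketch is only the ordering: the tail of the
fragmented frame is kept aside and sent first after the queue resumes.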

> But it looks to me like nothing
> prevents ieee80211_tx() from being invoked even when there is already
> something in that single-packet storage.

The 802.11 qdisc (see wme_qdiscop_dequeue) takes care of that.
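
A minimal sketch of the kind of guard meant here, in Python for
brevity. It is not the code of wme_qdiscop_dequeue; it assumes the
dequeue routine simply skips any queue that is stopped or already has a
parked frame. The function name guarded_dequeue and its arguments are
hypothetical.

```python
from collections import deque


def guarded_dequeue(queues, xoff, pending):
    """Return (queue_index, frame) from the first eligible queue, else None.

    A queue is eligible only if it is not stopped and has no parked frame,
    so the TX routine is never called while the single slot is occupied.
    """
    for q, backlog in enumerate(queues):
        if xoff[q] or pending[q] is not None:
            continue                 # hold frames back for this queue
        if backlog:
            return q, backlog.popleft()
    return None


queues = [deque(["a"]), deque(["b"])]
xoff = [True, False]                 # queue 0 is stopped
pending = [None, None]
print(guarded_dequeue(queues, xoff, pending))   # frame comes from queue 1
```

With such a guard in the dequeue path, frames for a stopped queue stay
in the qdisc instead of reaching the TX routine.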

> That in turn triggers WARN_ONs in ieee80211_tx() under high load for me
> (with rt2500usb). And it should also cause orphaned skbs because the
> storage is overwritten in that case. Either I'm blind or something is
> fishy...

You are most likely hitting some bug. Could you post more information
please?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs


* Re: d80211: How does TX flow control work?
  2007-01-03 17:52 ` Jiri Benc
@ 2007-01-03 18:10   ` Jan Kiszka
  2007-01-03 18:18     ` Jiri Benc
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2007-01-03 18:10 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev, Ivo Van Doorn, rt2400-devel

Jiri Benc wrote:

> On Tue, 02 Jan 2007 14:08:21 +0100, Jan Kiszka wrote:
>> What I (think I) understand is that low-level drivers call
>> ieee80211_stop_queue() if they run out of buffers. That sets a
>> per-queue bit (IEEE80211_LINK_STATE_XOFF), prevents any further
>> frame from being passed to the low-level TX routine,
>
> Correct.
>
>> and can cause up to
>> *one* packet per queue to be stored in
>> ieee80211_local::pending_packets[queue].
>
> This is needed due to fragmented frames. After resume, passing of
> fragments to the driver has to continue where it was stopped. Returning
> the half-sent fragmented frame to the 802.11 qdisc wasn't possible
> until recently (I think the conversion of master interface to native
> 802.11 type could allow that now - but it's probably not worth the
> effort).
>
>> But it looks to me like nothing
>> prevents ieee80211_tx() from being invoked even when there is already
>> something in that single-packet storage.
>
> The 802.11 qdisc (see wme_qdiscop_dequeue) takes care of that.

Ahh, that is an interesting new piece in the puzzle.


>> That in turn triggers WARN_ONs in ieee80211_tx() under high load for me
>> (with rt2500usb). And it should also cause orphaned skbs because the
>> storage is overwritten in that case. Either I'm blind or something is
>> fishy...
>
> You are most likely hitting some bug. Could you post more information
> please?

Test scenario is rt2500usb from the rt2x00 CVS (+my currently half-pending
series), an ASUS WL167g USB stick, and hostapd driving that stick in master
mode. As soon as I trigger the AP to send out some longer TCP stream, I get
these warnings:

BUG: warning at /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
 <cfa02245> ieee80211_master_start_xmit+0x105/0x430 [80211]  <c024e35d> __ip_ct_refresh_acct+0x4d/0x60
 <c024fd11> tcp_packet+0x941/0x970  <c0217442> qdisc_restart+0x92/0x100
 <c020d43d> dev_queue_xmit+0xbd/0x1a0  <cfa050d8> ieee80211_subif_start_xmit+0x468/0x480 [80211]
 <c0207dca> skb_clone+0x3a/0x1a0  <c021d16d> nf_hook_slow+0x4d/0xc0
 <c020d495> dev_queue_xmit+0x115/0x1a0  <c0226a63> ip_output+0x1c3/0x200
 <c0225740> ip_finish_output+0x0/0x180  <c022628b> ip_queue_xmit+0x36b/0x3b0
 <c0224130> dst_output+0x0/0x10  <ce9bae7d> usb_hcd_giveback_urb+0x2d/0x60 [usbcore]
 <c0237da2> tcp_v4_send_check+0x82/0xd0  <c0237da2> tcp_v4_send_check+0x82/0xd0
 <c0233244> tcp_transmit_skb+0x5e4/0x610  <c0234b36> __tcp_push_pending_frames+0x676/0x740
 <c0207f81> __alloc_skb+0x51/0x100  <c022b817> tcp_sendmsg+0x897/0x980
 <c0153fa9> core_sys_select+0x1b9/0x2b0  <c0241f1d> inet_sendmsg+0x3d/0x50
 <c0202a8f> do_sock_write+0x8f/0xa0  <c020301f> sock_aio_write+0x5f/0x70
 <c01443d3> do_sync_write+0xc3/0x100  <c01247f0> autoremove_wake_function+0x0/0x40
 <c0144ca1> vfs_write+0xa1/0x140  <c01451d3> sys_write+0x43/0x70
 <c0102ae7> syscall_call+0x7/0xb

Does it tell you anything already? Is there something I could instrument?
What could the driver do wrong to trigger such a bug?

Jan




* Re: d80211: How does TX flow control work?
  2007-01-03 18:10   ` Jan Kiszka
@ 2007-01-03 18:18     ` Jiri Benc
  2007-01-03 18:50       ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jiri Benc @ 2007-01-03 18:18 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: netdev, Ivo Van Doorn, rt2400-devel

On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
> BUG: warning at /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
> [...]
>
> Does it tell you anything already? Is there something I could instrument?
> What could the driver do wrong to trigger such a bug?

Do you have CONFIG_NET_SCHED enabled?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs


* Re: d80211: How does TX flow control work?
  2007-01-03 18:18     ` Jiri Benc
@ 2007-01-03 18:50       ` Jan Kiszka
  2007-01-07  0:00         ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2007-01-03 18:50 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev, Ivo Van Doorn, rt2400-devel

Jiri Benc wrote:
> On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
>> BUG: warning at /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
>> [...]
>>
>> Does it tell you anything already? Is there something I could instrument?
>> What could the driver do wrong to trigger such a bug?
> 
> Do you have CONFIG_NET_SCHED enabled?
> 

Yes. Would it make a difference w.r.t. that warning if I switched it off?

Jan



* Re: d80211: How does TX flow control work?
  2007-01-03 18:50       ` Jan Kiszka
@ 2007-01-07  0:00         ` Jan Kiszka
  2007-01-08 20:18           ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2007-01-07  0:00 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev, Ivo Van Doorn, rt2400-devel

Jan Kiszka wrote:
> Jiri Benc wrote:
>> On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
>>> BUG: warning at /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
>>> [...]
>>>
>>> Does it tell you anything already? Is there something I could instrument?
>>> What could the driver do wrong to trigger such a bug?
>> Do you have CONFIG_NET_SCHED enabled?
>>

Sorry, this was most probably a false alarm for the official stack. The
problem now appears to be related to a patch against d80211 that is only
present in the rt2x00 CVS.

Jan



* Re: d80211: How does TX flow control work?
  2007-01-07  0:00         ` Jan Kiszka
@ 2007-01-08 20:18           ` Jan Kiszka
  2007-01-10 18:20             ` Jiri Benc
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kiszka @ 2007-01-08 20:18 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev, Ivo Van Doorn, rt2400-devel

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jiri Benc wrote:
>>> On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
>>>> BUG: warning at /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
>>>> [...]
>>>>
>>>> Does it tell you anything already? Is there something I could instrument?
>>>> What could the driver do wrong to trigger such a bug?
>>> Do you have CONFIG_NET_SCHED enabled?
>>>
> 
> Sorry, this was most probably a false alarm for the official stack. The
> problem now appears to be related to a patch against d80211 that is only
> present in the rt2x00 CVS.

Well, I said "most probably"...

The actual problem has meanwhile been identified: shorewall happened to
replace the queueing discipline of wmaster0 with pfifo_fast. I found
the magic knob to tell shorewall to no longer do this (at least until I
want to manage traffic control that way...), but I still wonder whether
this is an acceptable situation. Currently, the user can intentionally or
accidentally screw up the stack this way.
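
A minimal sketch of why swapping the qdisc matters, as a toy Python
simulation and not kernel code. The function simulate() and its
parameters are hypothetical; "aware" stands for a qdisc that checks the
queue state before dequeueing, while the other case stands for a plain
FIFO such as pfifo_fast that knows nothing about that state.

```python
from collections import deque


def simulate(aware, frames, stop_after):
    """Feed frames through one queue; the toy driver stops after
    `stop_after` frames. Returns (delivered, parked, overwritten)."""
    backlog = deque(frames)
    delivered, overwritten = [], []
    parked = None
    stopped = False
    for _ in range(len(frames)):             # bounded number of dequeue attempts
        if aware and (stopped or parked is not None):
            break                            # state-aware qdisc holds frames back
        frame = backlog.popleft()
        if stopped:
            if parked is not None:
                overwritten.append(parked)   # single slot gets clobbered
            parked = frame
        else:
            delivered.append(frame)
            if len(delivered) == stop_after:
                stopped = True
    return delivered, parked, overwritten


print(simulate(True, ["a", "b", "c", "d"], stop_after=2))
print(simulate(False, ["a", "b", "c", "d"], stop_after=2))
```

In the state-aware run nothing is lost; in the plain FIFO run a frame
ends up overwritten in the single slot, which matches the symptom seen
with the replaced qdisc.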

Jan


PS: Tests performed on a 2.6.17 kernel, but I don't see a reason why
newer kernels should be immune.



* Re: d80211: How does TX flow control work?
  2007-01-08 20:18           ` Jan Kiszka
@ 2007-01-10 18:20             ` Jiri Benc
  2007-01-10 18:29               ` Simon Barber
  0 siblings, 1 reply; 9+ messages in thread
From: Jiri Benc @ 2007-01-10 18:20 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: netdev, Ivo Van Doorn, rt2400-devel, Jouni Malinen, Simon Barber

On Mon, 08 Jan 2007 21:18:48 +0100, Jan Kiszka wrote:
> The actual problem has meanwhile been identified: shorewall happened to
> replace the queueing discipline of wmaster0 with pfifo_fast. I found
> the magic knob to tell shorewall to no longer do this (at least until I
> want to manage traffic control that way...), but I still wonder whether
> this is an acceptable situation. Currently, the user can intentionally or
> accidentally screw up the stack this way.

Hm, we probably need a way to tell the kernel not to remove the 802.11
qdisc. Jouni, Simon, is that possible or do we need to patch the
NET_SCHED code?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs


* RE: d80211: How does TX flow control work?
  2007-01-10 18:20             ` Jiri Benc
@ 2007-01-10 18:29               ` Simon Barber
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Barber @ 2007-01-10 18:29 UTC (permalink / raw)
  To: Jiri Benc, Jan Kiszka; +Cc: netdev, Ivo Van Doorn, rt2400-devel, Jouni Malinen

Scratches head -- this is from memory, from when I was thinking about
this problem a long time ago... I think we can return an error in the
qdisc destructor function, after first making sure that legitimate
interface removal is not the cause of the qdisc deletion, of course.

Simon 

