Netdev List


* BUG: while bridging Ethernet and wireless device:
From: Tomas Winkler @ 2010-12-16 12:11 UTC (permalink / raw)
  To: linux-netdev, linux-wireless

I would be happy if someone could give me some more insight (kernel 2.6.37-rc5).
Thanks
Tomas

Dec 15 14:36:41 User-PC kernel: [175576.120287] ------------[ cut here
]------------
Dec 15 14:36:41 User-PC kernel: [175576.120452] kernel BUG at
include/linux/skbuff.h:1178!
Dec 15 14:36:41 User-PC kernel: [175576.120609] invalid opcode: 0000 [#1] SMP
Dec 15 14:36:41 User-PC kernel: [175576.120749] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
Dec 15 14:36:41 User-PC kernel: [175576.121035] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.122712]
Dec 15 14:36:41 User-PC kernel: [175576.122769] Pid: 0, comm:
kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.123012] EIP: 0060:[<f83edd65>]
EFLAGS: 00010283 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.123193] EIP is at
br_multicast_rcv+0xc95/0xe1c [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.123362] EAX: 0000001c EBX:
f5626318 ECX: 00000000 EDX: 00000000
Dec 15 14:36:41 User-PC kernel: [175576.123550] ESI: ec512262 EDI:
f5626180 EBP: f60b5ca0 ESP: f60b5bd8
Dec 15 14:36:41 User-PC kernel: [175576.123737]  DS: 007b ES: 007b FS:
00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.123902] Process kworker/0:0
(pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.124137] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ec556500 f6d06800
f60b5be8 c01087d8 ec512262 00000030 00000024 f5626180
Dec 15 14:36:41 User-PC kernel: [175576.124181]  f572c200 ef463440
f5626300 3affffff f6d06dd0 e60766a4 000000c4 f6d06860
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ffffffff ec55652c
00000001 f6d06844 f60b5c64 c0138264 c016e451 c013e47d
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01087d8>] ?
sched_clock+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0138264>] ?
enqueue_entity+0x174/0x440
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e451>] ?
sched_clock_cpu+0x131/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c013e47d>] ?
select_task_rq_fair+0x2ad/0x730
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0524fc1>] ?
nf_iterate+0x71/0x90
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4914>] ?
br_handle_frame_finish+0x184/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ?
br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e46e9>] ?
br_handle_frame+0x189/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ?
br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4560>] ?
br_handle_frame+0x0/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff026>] ?
__netif_receive_skb+0x1b6/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04f7a30>] ?
skb_copy_bits+0x110/0x210
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0503a7f>] ?
netif_receive_skb+0x6f/0x80
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cb74c>] ?
ieee80211_deliver_skb+0x8c/0x1a0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cc836>] ?
ieee80211_rx_handlers+0xeb6/0x1aa0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff1f0>] ?
__netif_receive_skb+0x380/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e242>] ?
sched_clock_local+0xb2/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c012b688>] ?
default_spin_lock_flags+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ?
_raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cd621>] ?
ieee80211_prepare_and_rx_handle+0x201/0xa90 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82ce154>] ?
ieee80211_rx+0x2a4/0x830 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f815a8d6>] ?
iwl_update_stats+0xa6/0x2a0 [iwlcore]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8499212>] ?
iwlagn_rx_reply_rx+0x292/0x3b0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ?
_raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8483697>] ?
iwl_rx_handle+0xe7/0x350 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8486ab7>] ?
iwl_irq_tasklet+0xf7/0x5c0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01aece1>] ?
__rcu_process_callbacks+0x201/0x2d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150d05>] ?
tasklet_action+0xc5/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150a07>] ?
__do_softirq+0x97/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d910c>] ?
nmi_stack_correct+0x2f/0x34
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150970>] ?
__do_softirq+0x0/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  <IRQ>
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01508f5>] ?
irq_exit+0x65/0x70
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05df062>] ? do_IRQ+0x52/0xc0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01036b0>] ?
common_interrupt+0x30/0x38
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c03a1fc2>] ?
intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>] ?
cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] ?
cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>] ?
start_secondary+0x1e8/0x1ee
Dec 15 14:36:41 User-PC kernel: [175576.124181] Code: ff ff ff be ea
ff ff ff 8b 82 b0 00 00 00 e9 fb f5 ff ff 89 c8 e8 4c dc ff ff 85 c0
89 c6 0f 84 9b f5 ff ff 66 90 e9 be fe ff ff <0f> 0b eb fe c7 47 20 01
00 00 00 8b 43 04 89 c2 81 e2 ff ff ff
Dec 15 14:36:41 User-PC kernel: [175576.124181] EIP: [<f83edd65>]
br_multicast_rcv+0xc95/0xe1c [bridge] SS:ESP 0068:f60b5bd8
Dec 15 14:36:41 User-PC kernel: [175576.124181] BUG: scheduling while
atomic: kworker/0:0/0/0x10000100
Dec 15 14:36:41 User-PC kernel: [175576.124181] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.124181] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.124181]
Dec 15 14:36:41 User-PC kernel: [175576.124181] Pid: 0, comm:
kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.124181] EIP: 0060:[<c03a1fc2>]
EFLAGS: 00000202 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.124181] EIP is at intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.124181] EAX: 00000000 EBX:
00001494 ECX: 00000000 EDX: 00001494
Dec 15 14:36:41 User-PC kernel: [175576.124181] ESI: 00000000 EDI:
00000004 EBP: f60b1f50 ESP: f60b1f28
Dec 15 14:36:41 User-PC kernel: [175576.124181]  DS: 007b ES: 007b FS:
00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.124181] Process kworker/0:0
(pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.124181] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  0000000b 00000000
77359400 00000001 00000010 00000002 00000001 f6d0a95c
Dec 15 14:36:41 User-PC kernel: [175576.124181]  f6d0aa1c c0817f04
f60b1f60 c04daebb 00000001 00000001 f60b1f84 c0101dea
Dec 15 14:36:41 User-PC kernel: [175576.124181]  c07d0ef4 f60b1f7c
e487e262 f2eb6781 85a608d2 00000000 00000000 f60b1fb0
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>] ?
cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] ?
cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>] ?
start_secondary+0x1e8/0x1ee
Dec 15 14:36:41 User-PC kernel: [175576.124181] Code: f6 89 e0 25 00
e0 ff ff 8b 40 08 a8 08 75 08 b1 01 8b 45 e8 0f 01 c9 e8 cd fc dc ff
29 d8 19 f2 e8 04 d6 da ff 89 c6 89 d3 fb 90 <8d> 74 26 00 85 3d 78 41
7e c0 75 0d 8d 55 f0 b8 05 00 00 00 e8
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>]
cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>]
start_secondary+0x1e8/0x1ee
Dec 15 14:36:41 User-PC kernel: [175576.487562] BUG: scheduling while
atomic: kworker/0:0/0/0x10000100
Dec 15 14:36:41 User-PC kernel: [175576.497058] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.522221] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.550740]
Dec 15 14:36:41 User-PC kernel: [175576.557947] Pid: 0, comm:
kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.565201] EIP: 0060:[<c03a1fc2>]
EFLAGS: 00000202 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.572280] EIP is at intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.579125] EAX: 00000000 EBX:
00001494 ECX: 00000000 EDX: 00001494
Dec 15 14:36:41 User-PC kernel: [175576.585850] ESI: 00000000 EDI:
00000004 EBP: f60b1f50 ESP: f60b1f28
Dec 15 14:36:41 User-PC kernel: [175576.592460]  DS: 007b ES: 007b FS:
00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.599021] Process kworker/0:0
(pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.605632] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.612158]  0000000b 00000000
77359400 00000001 00000010 00000002 00000001 f6d0a95c
Dec 15 14:36:41 User-PC kernel: [175576.618953]  f6d0aa1c c0817f04
f60b1f60 c04daebb 00000001 00000001 f60b1f84 c0101dea
Dec 15 14:36:41 User-PC kernel: [175576.625818]  c07d0ef4 f60b1f7c
e487e262 f2eb6781 85a608d2 00000000 00000000 f60b1fb0
Dec 15 14:36:41 User-PC kernel: [175576.632737] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.639461]  [<c04daebb>] ?
cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.646168]  [<c0101dea>] ?
cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.652826]  [<c05d2702>] ?
start_secondary+0x1e8/0x1ee
Dec 15 14:36:41 User-PC kernel: [175576.659441] Code: f6 89 e0 25 00
e0 ff ff 8b 40 08 a8 08 75 08 b1 01 8b 45 e8 0f 01 c9 e8 cd fc dc ff
29 d8 19 f2 e8 04 d6 da ff 89 c6 89 d3 fb 90 <8d> 74 26 00 85 3d 78 41
7e c0 75 0d 8d 55 f0 b8 05 00 00 00 e8
Dec 15 14:36:41 User-PC kernel: [175576.673805] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.680668]  [<c04daebb>]
cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.687612]  [<c0101dea>] cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.694516]  [<c05d2702>]
start_secondary+0x1e8/0x1ee
Dec 15 14:36:41 User-PC kernel: [175576.711906] BUG: scheduling while
atomic: kworker/0:0/0/0x10000100
Dec 15 14:36:41 User-PC kernel: [175576.716280] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.734197] Modules linked in:
oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn
snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev
snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper
snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device
cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt
uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid
video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.753330]
Dec 15 14:36:41 User-PC kernel: [175576.757845] Pid: 0, comm:
kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.762389] EIP: 0060:[<c03a1fc2>]
EFLAGS: 00000202 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.766809] EIP is at intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.771119] EAX: 00000000 EBX:
00001494 ECX: 00000000 EDX: 00001494
Dec 15 14:36:41 User-PC kernel: [175576.775348] ESI: 00000000 EDI:
00000004 EBP: f60b1f50 ESP: f60b1f28
Dec 15 14:36:41 User-PC kernel: [175576.780037]  DS: 007b ES: 007b FS:
00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.784159] Process kworker/0:0
(pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.788286] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.792375]  0000000b 00000000
77359400 00000001 00000010 00000002 00000001 f6d0a95c

^ permalink raw reply

* Re: BUG: while bridging Ethernet and wireless device:
From: Johannes Berg @ 2010-12-16 12:16 UTC (permalink / raw)
  To: Tomas Winkler; +Cc: linux-netdev, linux-wireless
In-Reply-To: <AANLkTikYvBspVmAZ0DCMXJ-3WxkotwX+n8NpTtM+97_i-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Thu, 2010-12-16 at 14:11 +0200, Tomas Winkler wrote:

> Dec 15 14:36:41 User-PC kernel: [175576.120452] kernel BUG at include/linux/skbuff.h:1178!

> Dec 15 14:36:41 User-PC kernel: [175576.123193] EIP is at br_multicast_rcv+0xc95/0xe1c [bridge]

So as I said to Tomas in private before -- it kinda looks like something
here is not handling paged SKBs correctly? But I would imagine that
causing more issues, unless there was a bug here that made bridging
require more data in the skb header than we put in there right now -- it
can end up being empty, I believe.

Thing is, I looked at the code and it seemed fine.

johannes
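
For readers following the paged-SKB point above, here is a rough userspace sketch of the distinction being drawn: a buffer whose bytes are split between a directly addressable linear head and paged fragments. Everything in it (`toy_skb`, `toy_headlen`, `toy_may_pull`, `toy_pull`) is a hypothetical name made up for illustration; it is not the kernel's implementation and it is not a diagnosis of this particular BUG. The kernel's real helper, `pskb_may_pull()`, can additionally copy bytes out of the fragments into the head, which this toy model does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only (hypothetical type): len counts every byte in the
 * packet, data_len counts the bytes that live in page fragments.
 * Assumes data_len <= len on entry. */
struct toy_skb {
	size_t len;      /* total bytes in the packet */
	size_t data_len; /* bytes held in paged fragments */
};

/* Bytes that are directly addressable in the linear head. */
static size_t toy_headlen(const struct toy_skb *skb)
{
	return skb->len - skb->data_len;
}

/* True if n header bytes can be consumed from the linear head alone. */
static bool toy_may_pull(const struct toy_skb *skb, size_t n)
{
	return n <= toy_headlen(skb);
}

/* Consume n header bytes. Returns false and leaves the buffer
 * untouched when the linear head is too short. */
static bool toy_pull(struct toy_skb *skb, size_t n)
{
	if (!toy_may_pull(skb, n))
		return false;
	skb->len -= n;
	/* Invariant that an unchecked pull could violate. */
	assert(skb->len >= skb->data_len);
	return true;
}
```

With a fully paged packet (empty linear head, as described above) any non-zero pull is refused by the check, whereas code that subtracts from `len` without checking would leave `len` smaller than `data_len`.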


^ permalink raw reply

* Re: [PATCH 0/15] RFC: create drivers/net/legacy for ISA, EISA, MCA drivers
From: Jan Engelhardt @ 2010-12-16 12:22 UTC (permalink / raw)
  To: paul.gortmaker; +Cc: Joe Perches, Maciej W. Rozycki, Jeff Kirsher, netdev
In-Reply-To: <alpine.LNX.2.01.1012161253560.3000@obet.zrqbmnf.qr>

[adding missing cc:netdev]

A few comments, since I have just been made aware of the Netconf slides,

Paul Gortmaker wrote:
>
>If in fact this series gets agreement, I'm figuring it makes sense to 
>have it go in either at the beginning of a dev cycle, or at the very 
>end

I think I have seen git properly coping with renames, if your and other 
developers' branches are git-merged (patchwise application of course 
leads to rejects).

>classifying drivers according to the physical layer they support or if 
>multiple are, such as with the Ethernet that is backwards compatible, 
>the newest variation they do?

Above all I would probably like to see

- getting rid of the "1000 Mbit" and "10000 Mbit" submenus. jme.ko for 
example is put under 1000 Mbit, but the jme chip I have ("197b:0260 (rev 
02) JMicron Technology Corp. JMC260 PCI Express Fast Ethernet 
Controller") does not do 1000.

- getting rid of CONFIG_NET_PCI and move dependencies to individual 
driver config options. It looks odd that only drivers from the 10/100 
Mbit category — and then, not even all — depend on this. Of course you 
probably won't see Gbit adapters for ISA, but SUN, 3com, HP cards are 
also available on PCI.

Jeff Kirsher puts forward in:
>
>http://vger.kernel.org/netconf2010_slides/netconf-jtk.pdf
>
>Create /drivers/net/sw for vlan, 8021q, bonding, bridging, etc drivers

That does not seem too nice. Currently, bridge is at net/bridge/, and 
moving it into drivers/net/sw/bridge/ is just elongating the path name 
for a rename that is.. a bit disputable as far as I read the thread.

What is in net/, leave it there for now.
drivers/net/ethernet/, I am fine with.

^ permalink raw reply

* [PATCH v2] e1000e: convert to stats64
From: Flavio Leitner @ 2010-12-16 12:31 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: e1000-devel, netdev, Jeff Kirsher
In-Reply-To: <1292362173.2478.6.camel@edumazet-laptop>

On Tue, Dec 14, 2010 at 10:29:33PM +0100, Eric Dumazet wrote:
> On Tuesday, 14 December 2010 at 18:32 -0200, Flavio Leitner wrote:
> > Provides accurate stats at the time user reads them.
> > 
> > Signed-off-by: Flavio Leitner <fleitner@redhat.com>
> > ---
> >  drivers/net/e1000e/e1000.h   |    5 ++-
> >  drivers/net/e1000e/ethtool.c |   27 +++++++++-------
> >  drivers/net/e1000e/netdev.c  |   68 ++++++++++++++++++++++++-----------------
> >  3 files changed, 59 insertions(+), 41 deletions(-)
> > 
> > diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
> > index fdc67fe..5a5e944 100644
> > --- a/drivers/net/e1000e/e1000.h
> > +++ b/drivers/net/e1000e/e1000.h
> > @@ -363,6 +363,8 @@ struct e1000_adapter {
> >  	/* structs defined in e1000_hw.h */
> >  	struct e1000_hw hw;
> >  
> > +	spinlock_t stats64_lock;
> > +	struct rtnl_link_stats64 stats64;
> 
> I am not sure why you add this stats64 in e1000_adapter ?
> 
> Why isn't it provided by callers (automatic variable, or provided to
> ndo_get_stats64())? I don't see accumulators, only a full rewrite of this
> structure in e1000e_update_stats().

Good point. I have modified the patch to fix that. 
thanks!

From 3487bd7dacd0c23bba315270139dab6e00e5ff02 Mon Sep 17 00:00:00 2001
From: Flavio Leitner <fleitner@redhat.com>
Date: Thu, 16 Dec 2010 10:26:03 -0200
Subject: [PATCH] e1000e: convert to stats64

Provides accurate stats at the time user reads them.

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
---
 drivers/net/e1000e/e1000.h   |    3 ++
 drivers/net/e1000e/ethtool.c |   25 ++++++++-------
 drivers/net/e1000e/netdev.c  |   68 +++++++++++++++++++++++++++++++++--------
 3 files changed, 70 insertions(+), 26 deletions(-)

diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
index fdc67fe..d8fc3bc 100644
--- a/drivers/net/e1000e/e1000.h
+++ b/drivers/net/e1000e/e1000.h
@@ -363,6 +363,7 @@ struct e1000_adapter {
 	/* structs defined in e1000_hw.h */
 	struct e1000_hw hw;
 
+	spinlock_t stats64_lock;
 	struct e1000_hw_stats stats;
 	struct e1000_phy_info phy_info;
 	struct e1000_phy_stats phy_stats;
@@ -493,6 +494,8 @@ extern int e1000e_setup_tx_resources(struct e1000_adapter *adapter);
 extern void e1000e_free_rx_resources(struct e1000_adapter *adapter);
 extern void e1000e_free_tx_resources(struct e1000_adapter *adapter);
 extern void e1000e_update_stats(struct e1000_adapter *adapter);
+extern struct rtnl_link_stats64 *e1000e_get_stats64(struct net_device *netdev,
+                                                   struct rtnl_link_stats64 *stats);
 extern void e1000e_set_interrupt_capability(struct e1000_adapter *adapter);
 extern void e1000e_reset_interrupt_capability(struct e1000_adapter *adapter);
 extern void e1000e_disable_aspm(struct pci_dev *pdev, u16 state);
diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index 8984d16..f43b412 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -49,8 +49,8 @@ struct e1000_stats {
 				sizeof(((struct e1000_adapter *)0)->m), \
 		      		offsetof(struct e1000_adapter, m)
 #define E1000_NETDEV_STAT(m)	NETDEV_STATS, \
-				sizeof(((struct net_device *)0)->m), \
-				offsetof(struct net_device, m)
+				sizeof(((struct rtnl_link_stats64 *)0)->m), \
+				offsetof(struct rtnl_link_stats64, m)
 
 static const struct e1000_stats e1000_gstrings_stats[] = {
 	{ "rx_packets", E1000_STAT(stats.gprc) },
@@ -61,21 +61,21 @@ static const struct e1000_stats e1000_gstrings_stats[] = {
 	{ "tx_broadcast", E1000_STAT(stats.bptc) },
 	{ "rx_multicast", E1000_STAT(stats.mprc) },
 	{ "tx_multicast", E1000_STAT(stats.mptc) },
-	{ "rx_errors", E1000_NETDEV_STAT(stats.rx_errors) },
-	{ "tx_errors", E1000_NETDEV_STAT(stats.tx_errors) },
-	{ "tx_dropped", E1000_NETDEV_STAT(stats.tx_dropped) },
+	{ "rx_errors", E1000_NETDEV_STAT(rx_errors) },
+	{ "tx_errors", E1000_NETDEV_STAT(tx_errors) },
+	{ "tx_dropped", E1000_NETDEV_STAT(tx_dropped) },
 	{ "multicast", E1000_STAT(stats.mprc) },
 	{ "collisions", E1000_STAT(stats.colc) },
-	{ "rx_length_errors", E1000_NETDEV_STAT(stats.rx_length_errors) },
-	{ "rx_over_errors", E1000_NETDEV_STAT(stats.rx_over_errors) },
+	{ "rx_length_errors", E1000_NETDEV_STAT(rx_length_errors) },
+	{ "rx_over_errors", E1000_NETDEV_STAT(rx_over_errors) },
 	{ "rx_crc_errors", E1000_STAT(stats.crcerrs) },
-	{ "rx_frame_errors", E1000_NETDEV_STAT(stats.rx_frame_errors) },
+	{ "rx_frame_errors", E1000_NETDEV_STAT(rx_frame_errors) },
 	{ "rx_no_buffer_count", E1000_STAT(stats.rnbc) },
 	{ "rx_missed_errors", E1000_STAT(stats.mpc) },
 	{ "tx_aborted_errors", E1000_STAT(stats.ecol) },
 	{ "tx_carrier_errors", E1000_STAT(stats.tncrs) },
-	{ "tx_fifo_errors", E1000_NETDEV_STAT(stats.tx_fifo_errors) },
-	{ "tx_heartbeat_errors", E1000_NETDEV_STAT(stats.tx_heartbeat_errors) },
+	{ "tx_fifo_errors", E1000_NETDEV_STAT(tx_fifo_errors) },
+	{ "tx_heartbeat_errors", E1000_NETDEV_STAT(tx_heartbeat_errors) },
 	{ "tx_window_errors", E1000_STAT(stats.latecol) },
 	{ "tx_abort_late_coll", E1000_STAT(stats.latecol) },
 	{ "tx_deferred_ok", E1000_STAT(stats.dc) },
@@ -1972,14 +1972,15 @@ static void e1000_get_ethtool_stats(struct net_device *netdev,
 				    u64 *data)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
+	struct rtnl_link_stats64 net_stats;
 	int i;
 	char *p = NULL;
 
-	e1000e_update_stats(adapter);
+	e1000e_get_stats64(netdev, &net_stats);
 	for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) {
 		switch (e1000_gstrings_stats[i].type) {
 		case NETDEV_STATS:
-			p = (char *) netdev +
+			p = (char *) &net_stats +
 					e1000_gstrings_stats[i].stat_offset;
 			break;
 		case E1000_STATS:
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index c4ca162..b3c4b7d 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -894,8 +894,6 @@ next_desc:
 
 	adapter->total_rx_bytes += total_rx_bytes;
 	adapter->total_rx_packets += total_rx_packets;
-	netdev->stats.rx_bytes += total_rx_bytes;
-	netdev->stats.rx_packets += total_rx_packets;
 	return cleaned;
 }
 
@@ -1051,8 +1049,6 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter)
 	}
 	adapter->total_tx_bytes += total_tx_bytes;
 	adapter->total_tx_packets += total_tx_packets;
-	netdev->stats.tx_bytes += total_tx_bytes;
-	netdev->stats.tx_packets += total_tx_packets;
 	return count < tx_ring->count;
 }
 
@@ -1240,8 +1236,6 @@ next_desc:
 
 	adapter->total_rx_bytes += total_rx_bytes;
 	adapter->total_rx_packets += total_rx_packets;
-	netdev->stats.rx_bytes += total_rx_bytes;
-	netdev->stats.rx_packets += total_rx_packets;
 	return cleaned;
 }
 
@@ -1421,8 +1415,6 @@ next_desc:
 
 	adapter->total_rx_bytes += total_rx_bytes;
 	adapter->total_rx_packets += total_rx_packets;
-	netdev->stats.rx_bytes += total_rx_bytes;
-	netdev->stats.rx_packets += total_rx_packets;
 	return cleaned;
 }
 
@@ -3367,6 +3359,11 @@ void e1000e_down(struct e1000_adapter *adapter)
 	del_timer_sync(&adapter->phy_info_timer);
 
 	netif_carrier_off(netdev);
+
+	spin_lock(&adapter->stats64_lock);
+	e1000e_update_stats(adapter);
+	spin_unlock(&adapter->stats64_lock);
+
 	adapter->link_speed = 0;
 	adapter->link_duplex = 0;
 
@@ -3408,6 +3405,8 @@ static int __devinit e1000_sw_init(struct e1000_adapter *adapter)
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
+	spin_lock_init(&adapter->stats64_lock);
+
 	e1000e_set_interrupt_capability(adapter);
 
 	if (e1000_alloc_queues(adapter))
@@ -4279,7 +4278,9 @@ static void e1000_watchdog_task(struct work_struct *work)
 	}
 
 link_up:
+	spin_lock(&adapter->stats64_lock);
 	e1000e_update_stats(adapter);
+	spin_unlock(&adapter->stats64_lock);
 
 	mac->tx_packet_delta = adapter->stats.tpt - adapter->tpt_old;
 	adapter->tpt_old = adapter->stats.tpt;
@@ -4891,16 +4892,55 @@ static void e1000_reset_task(struct work_struct *work)
 }
 
 /**
- * e1000_get_stats - Get System Network Statistics
+ * e1000e_get_stats64 - Get System Network Statistics
  * @netdev: network interface device structure
+ * @stats: rtnl_link_stats64 pointer
  *
  * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
  **/
-static struct net_device_stats *e1000_get_stats(struct net_device *netdev)
+struct rtnl_link_stats64 *e1000e_get_stats64(struct net_device *netdev,
+						   struct rtnl_link_stats64 *stats)
 {
-	/* only return the current stats */
-	return &netdev->stats;
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+
+	memset(stats, 0, sizeof(struct rtnl_link_stats64));
+	spin_lock(&adapter->stats64_lock);
+	e1000e_update_stats(adapter);
+	/* Fill out the OS statistics structure */
+	stats->rx_bytes = adapter->stats.gorc;
+	stats->rx_packets = adapter->stats.gprc;
+	stats->tx_bytes = adapter->stats.gotc;
+	stats->tx_packets = adapter->stats.gptc;
+	stats->multicast = adapter->stats.mprc;
+	stats->collisions = adapter->stats.colc;
+
+	/* Rx Errors */
+
+	/*
+	 * RLEC on some newer hardware can be incorrect so build
+	 * our own version based on RUC and ROC
+	 */
+	stats->rx_errors = adapter->stats.rxerrc +
+		adapter->stats.crcerrs + adapter->stats.algnerrc +
+		adapter->stats.ruc + adapter->stats.roc +
+		adapter->stats.cexterr;
+	stats->rx_length_errors = adapter->stats.ruc +
+					      adapter->stats.roc;
+	stats->rx_crc_errors = adapter->stats.crcerrs;
+	stats->rx_frame_errors = adapter->stats.algnerrc;
+	stats->rx_missed_errors = adapter->stats.mpc;
+
+	/* Tx Errors */
+	stats->tx_errors = adapter->stats.ecol +
+				       adapter->stats.latecol;
+	stats->tx_aborted_errors = adapter->stats.ecol;
+	stats->tx_window_errors = adapter->stats.latecol;
+	stats->tx_carrier_errors = adapter->stats.tncrs;
+
+	/* Tx Dropped needs to be maintained elsewhere */
+
+	spin_unlock(&adapter->stats64_lock);
+	return stats;
 }
 
 /**
@@ -5624,7 +5664,7 @@ static const struct net_device_ops e1000e_netdev_ops = {
 	.ndo_open		= e1000_open,
 	.ndo_stop		= e1000_close,
 	.ndo_start_xmit		= e1000_xmit_frame,
-	.ndo_get_stats		= e1000_get_stats,
+	.ndo_get_stats64	= e1000e_get_stats64,
 	.ndo_set_multicast_list	= e1000_set_multi,
 	.ndo_set_mac_address	= e1000_set_mac,
 	.ndo_change_mtu		= e1000_change_mtu,
-- 
1.7.3.1



^ permalink raw reply related

* Re: [PATCH v2] e1000e: convert to stats64
From: Eric Dumazet @ 2010-12-16 12:50 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: netdev, e1000-devel, Jeff Kirsher
In-Reply-To: <20101216123131.GA3070@redhat.com>

On Thursday, 16 December 2010 at 10:31 -0200, Flavio Leitner wrote:

> -static struct net_device_stats *e1000_get_stats(struct net_device *netdev)
> +struct rtnl_link_stats64 *e1000e_get_stats64(struct net_device *netdev,
> +						   struct rtnl_link_stats64 *stats)
>  {
> -	/* only return the current stats */
> -	return &netdev->stats;
> +	struct e1000_adapter *adapter = netdev_priv(netdev);
> +
> +	memset(stats, 0, sizeof(struct rtnl_link_stats64));

You don't need this memset(); stats is cleared by the caller (dev_get_stats()
in net/core/dev.c), as has always been done ;)
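
As a rough illustration of that calling convention, the userspace sketch below has a core-side wrapper zero the structure once before handing it to a driver callback, so the callback only needs to write the fields it knows about. All names here (`toy_stats64`, `toy_dev_get_stats`, `toy_driver_get_stats64`) and the counter values are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct rtnl_link_stats64. */
struct toy_stats64 {
	uint64_t rx_packets;
	uint64_t tx_packets;
	uint64_t rx_errors;
};

typedef struct toy_stats64 *(*toy_get_stats64_fn)(struct toy_stats64 *stats);

/* Driver-side callback: fills in only the counters it tracks and
 * relies on the caller having cleared everything else. The values
 * are made up for the example. */
static struct toy_stats64 *toy_driver_get_stats64(struct toy_stats64 *stats)
{
	stats->rx_packets = 10;
	stats->tx_packets = 7;
	return stats;
}

/* Core-side wrapper: clears the storage once, then calls the driver. */
static struct toy_stats64 *toy_dev_get_stats(toy_get_stats64_fn fn,
					     struct toy_stats64 *storage)
{
	memset(storage, 0, sizeof(*storage));
	return fn(storage);
}
```

Because the clearing happens in one place, a second memset() inside each callback would only repeat work the wrapper has already done.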

> +	spin_lock(&adapter->stats64_lock);
> +	e1000e_update_stats(adapter);
> +	/* Fill out the OS statistics structure */
> +	stats->rx_bytes = adapter->stats.gorc;
> +	stats->rx_packets = adapter->stats.gprc;
> +	stats->tx_bytes = adapter->stats.gotc;
> +	stats->tx_packets = adapter->stats.gptc;
> +	stats->multicast = adapter->stats.mprc;
> +	stats->collisions = adapter->stats.colc;
> +


^ permalink raw reply

* Re: [PATCH v2] net_sched: sch_sfq: fix allot handling
From: Eric Dumazet @ 2010-12-16 13:08 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David Miller, netdev, Jarek Poplawski
In-Reply-To: <4D08F4F4.3050501@trash.net>

On Wednesday, 15 December 2010 at 18:03 +0100, Patrick McHardy wrote:
> On 15.12.2010 17:55, Eric Dumazet wrote:

> > I was thinking in allowing more packets per SFQ (but keep the 126 active
> > flows limit), what do you think ?
> 
> I keep forgetting why this limit exists, let me try to figure
> it out once more :)

I took a look, and found we are already at the max, unless we change
the sfq_index type (from u8 to u16).

SFQ_SLOTS is 128 (the constant does not really exist, but this is the number of slots)
SFQ_DEPTH is 128 (max depth for one slot)
Constraints are:
 SFQ_SLOTS < 256
 SFQ_DEPTH < 256
 SFQ_SLOTS + SFQ_DEPTH < 256, because of the dep[] array.

We could avoid the "struct sk_buff_head" structure with its un-needed
spinlock, and eventually group data for a given slot in a structure to
reduce number of cache lines we touch...

struct sfq_slot {
	struct sk_buff *first;
	struct sk_buff *last;
	u8 qlen;
	sfq_index next; /* dequeue chain */
	u16 hash;
	short allot;
	/* 16bit hole */
};

This would save 768 bytes on x86_64 (and much more if LOCKDEP is used)
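
To check how the proposed structure would be laid out on a given machine, a small userspace sketch along these lines may help. The typedefs stand in for the kernel's `u8`/`u16`, `sfq_index` is taken to be `u8` as stated above, and the three helper functions are hypothetical names added for the example; the sketch does not attempt to reproduce the 768-byte figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types. */
typedef uint8_t u8;
typedef uint16_t u16;
typedef u8 sfq_index;	/* u8, per the message above */

struct sk_buff;		/* opaque here; only pointers are stored */

struct sfq_slot {
	struct sk_buff *first;
	struct sk_buff *last;
	u8 qlen;
	sfq_index next;	/* dequeue chain */
	u16 hash;
	short allot;
};

/* Sum of the field sizes, ignoring any padding. */
static size_t sfq_slot_field_bytes(void)
{
	return 2 * sizeof(struct sk_buff *) + sizeof(u8) +
	       sizeof(sfq_index) + sizeof(u16) + sizeof(short);
}

/* Padding the compiler added to the structure on this ABI. */
static size_t sfq_slot_padding_bytes(void)
{
	return sizeof(struct sfq_slot) - sfq_slot_field_bytes();
}

/* Memory taken by an array of nslots such structures. */
static size_t sfq_slots_total_bytes(size_t nslots)
{
	return nslots * sizeof(struct sfq_slot);
}
```

On a typical LP64 x86_64 build one would expect `sfq_slot_padding_bytes()` to report the 2-byte hole noted in the structure comment, though the exact layout depends on the ABI and compiler.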

^ permalink raw reply

* Re: [PATCH net-next-2.6] net_sched: sch_sfq: add backlog info in sfq_dump_class_stats()
From: Jarek Poplawski @ 2010-12-16 13:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Patrick McHardy
In-Reply-To: <1292497384.2883.89.camel@edumazet-laptop>

On Thu, Dec 16, 2010 at 12:03:04PM +0100, Eric Dumazet wrote:
> On Thursday, 16 December 2010 at 08:16 +0000, Jarek Poplawski wrote:
> > On 2010-12-15 19:18, Eric Dumazet wrote:
> > > We currently return for each active SFQ slot the number of packets in
> > > queue. We can also give number of bytes accounted for these packets.
...
> > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > > ---
> > >  net/sched/sch_sfq.c |    7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> > > index 3cf478d..cb331de 100644
> > > --- a/net/sched/sch_sfq.c
> > > +++ b/net/sched/sch_sfq.c
> > > @@ -548,8 +548,13 @@ static int sfq_dump_class_stats(struct Qdisc *sch, unsigned long cl,
> > >  {
> > >  	struct sfq_sched_data *q = qdisc_priv(sch);
> > >  	sfq_index idx = q->ht[cl-1];
> > > -	struct gnet_stats_queue qs = { .qlen = q->qs[idx].qlen };
> > > +	struct sk_buff_head *list = &q->qs[idx];
> > > +	struct gnet_stats_queue qs = { .qlen = list->qlen };
> > >  	struct tc_sfq_xstats xstats = { .allot = q->allot[idx] };
> > > +	struct sk_buff *skb;
> > > +
> > > +	skb_queue_walk(list, skb)
> > > +		qs.backlog += qdisc_pkt_len(skb);
> > 
> > I don't think you can walk this list without the qdisc lock.
> 
> So after checking, I confirm the qdisc lock is held at this point; the patch
> is valid.
> 
> tc_fill_tclass() calls gnet_stats_start_copy_compat() (and locks
> qdisc_root_sleeping_lock()), before calling 
>  cl_ops->dump_stats(q, cl, &d)
> 
> Thanks !

You are right. Sorry for the confusion.

Jarek P.

^ permalink raw reply

* Re: [PATCH v2] e1000e: convert to stats64
From: Flavio Leitner @ 2010-12-16 13:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: e1000-devel, netdev, Jeff Kirsher
In-Reply-To: <1292503830.2883.92.camel@edumazet-laptop>

On Thu, Dec 16, 2010 at 01:50:30PM +0100, Eric Dumazet wrote:
> On Thursday, 16 December 2010 at 10:31 -0200, Flavio Leitner wrote:
> 
> > -static struct net_device_stats *e1000_get_stats(struct net_device *netdev)
> > +struct rtnl_link_stats64 *e1000e_get_stats64(struct net_device *netdev,
> > +						   struct rtnl_link_stats64 *stats)
> >  {
> > -	/* only return the current stats */
> > -	return &netdev->stats;
> > +	struct e1000_adapter *adapter = netdev_priv(netdev);
> > +
> > +	memset(stats, 0, sizeof(struct rtnl_link_stats64));
> 
> You don't need this memset(); stats is already cleared by the caller
> (dev_get_stats() in net/core/dev.c), as has always been the case ;)

Yes, but e1000_get_ethtool_stats() also calls it and doesn't do that.
I could move the memset to the caller, but I thought it would be cleaner
to leave it where it is now.

 
> > +	spin_lock(&adapter->stats64_lock);
> > +	e1000e_update_stats(adapter);
> > +	/* Fill out the OS statistics structure */
> > +	stats->rx_bytes = adapter->stats.gorc;
> > +	stats->rx_packets = adapter->stats.gprc;
> > +	stats->tx_bytes = adapter->stats.gotc;
> > +	stats->tx_packets = adapter->stats.gptc;
> > +	stats->multicast = adapter->stats.mprc;
> > +	stats->collisions = adapter->stats.colc;
> > +
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Flavio

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: [PATCH v2] e1000e: convert to stats64
From: Eric Dumazet @ 2010-12-16 13:18 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: netdev, e1000-devel, Jeff Kirsher
In-Reply-To: <20101216131021.GA20139@redhat.com>

On Thursday, 16 December 2010 at 11:10 -0200, Flavio Leitner wrote:

> Yes, but e1000_get_ethtool_stats() also calls it and doesn't do that.
> I could move the memset to the caller, but I thought it would be cleaner
> to leave it where it is now.

Yes, no problem !

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply

* Re: Possible regression: Packet drops during iptables calls
From: Jesper Dangaard Brouer @ 2010-12-16 14:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Steven Rostedt, Eric Dumazet,
	Alexander Duyck
  Cc: Stephen Hemminger, netfilter-devel, netdev, Peter P Waskiewicz Jr
In-Reply-To: <1292343855.5934.27.camel@edumazet-laptop>

On Tue, 2010-12-14 at 17:24 +0100, Eric Dumazet wrote:
> On Tuesday, 14 December 2010 at 17:09 +0100, Jesper Dangaard Brouer wrote:
> > On Tue, 2010-12-14 at 16:31 +0100, Eric Dumazet wrote:
> > > On Tuesday, 14 December 2010 at 15:46 +0100, Jesper Dangaard Brouer wrote:
> > > > I'm experiencing RX packet drops during calls to iptables on my
> > > > production servers.
> > > > 
> > > > Further investigation showed that it's only the CPU executing the
> > > > iptables command that experiences packet drops!?  Thus, a quick fix was
> > > > to force the iptables command to run on one of the idle CPUs (this can
> > > > be achieved with the "taskset" command).
> > > > 
> > > > I have a 2x Xeon 5550 CPU system, thus 16 CPUs (with HT enabled).  We
> > > > only use 8 CPUs due to a multiqueue limitation of 8 queues in the
> > > > 1Gbit/s NICs (82576 chips).  CPUs 0 to 7 are assigned to packet
> > > > processing via smp_affinity.
> > > > 
> > > > Can someone explain why the packet drops only occur on the CPU
> > > > executing the iptables command?
> > > > 
> > > 
> > > It blocks BH
> > > 
> > > take a look at commits :
> > > 
> > > 24b36f0193467fa727b85b4c004016a8dae999b9
> > > netfilter: {ip,ip6,arp}_tables: dont block bottom half more than
> > > necessary 
> > > 
> > > 001389b9581c13fe5fc357a0f89234f85af4215d
> > > netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
<... cut ...>
> > 
> > Looking closer at the two code changes combined, I see that the code path
> > has been improved (a bit), as the local BH is only disabled inside the
> > for_each_possible_cpu(cpu) loop.  Before, local BH was disabled for the whole
> > function.  I guess I need to reproduce this in my test lab.

To do some further investigation into the unfortunate behavior of the
iptables get_counters() function I started to use "ftrace".  This is a
really useful tool (thanks Steven Rostedt).

 # Select the tracer "function_graph"
 echo function_graph > /sys/kernel/debug/tracing/current_tracer

 # Limit the number of functions we look at:
 echo local_bh_\*  >   /sys/kernel/debug/tracing/set_ftrace_filter
 echo get_counters >>  /sys/kernel/debug/tracing/set_ftrace_filter

 # Enable tracing while calling iptables
 cd /sys/kernel/debug/tracing
 echo 0 > trace
 echo 1 > tracing_enabled;
   taskset 1 iptables -vnL > /dev/null ;
 echo 0 > tracing_enabled
 cat trace | less

The reduced output:

# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
  2)   2.772 us    |  local_bh_disable();
....
  0)   0.228 us    |  local_bh_enable();
  0)               |  get_counters() {
  0)   0.232 us    |    local_bh_disable();
  0)   7.919 us    |    local_bh_enable();
  0) ! 109467.1 us |  }
  0)   2.344 us    |  local_bh_disable();

The output shows that we spend no less than 100 ms with local BH
disabled.  So it is no wonder that this causes packet drops in the NIC
(attached to this CPU).

My iptables rule set in question is also very large, it contains:
 Chains: 20929
 Rules: 81239

The vmalloc size is approx 19 MB (19.820.544 bytes) (see
/proc/vmallocinfo).  Looking through vmallocinfo I realized that
even though I only have 16 CPUs, there are 32 allocated rulesets
"xt_alloc_table_info" (for the filter table). Thus, I have approx
634MB of iptables filter rules in the kernel, half of which is totally
unused.
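
The 634MB figure follows from one ruleset copy per possible CPU. A minimal
sketch of that arithmetic, using the numbers quoted above; the two helper
names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes when one ruleset copy is allocated per possible CPU. */
uint64_t ruleset_total_bytes(uint64_t bytes_per_copy, unsigned int possible_cpus)
{
	assert(possible_cpus > 0);
	return bytes_per_copy * possible_cpus;
}

/* Bytes held by copies for CPUs that are possible but not online. */
uint64_t ruleset_unused_bytes(uint64_t bytes_per_copy,
			      unsigned int possible_cpus,
			      unsigned int online_cpus)
{
	assert(online_cpus <= possible_cpus);
	return bytes_per_copy * (possible_cpus - online_cpus);
}
```

With 19820544 bytes per copy, 32 possible CPUs and 16 online CPUs, this gives
634257408 bytes in total, of which 317128704 bytes belong to CPUs that are
never used.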

I guess this is because we use "for_each_possible_cpu" instead of
"for_each_online_cpu". (Feel free to fix this, or point me to some
documentation of this CPU hotplug stuff... I see we are missing
get_cpu() and put_cpu() in a lot of places).

The GOOD NEWS is that moving the local BH disable section into the
"for_each_possible_cpu" loop fixed the problem with packet drops during
iptables calls.

I wanted to profile the new code with ftrace, but I cannot get the
measurement I want. Perhaps Steven or Acme can help?

Now I want to measure the time spent between the local_bh_disable() and
local_bh_enable() calls within the loop, but I cannot figure out how to do that.
The new trace looks almost the same as before, just with a lot of
local_bh_* calls inside the get_counters() function call.

 My guess is that the time spent is: 100 ms / 32 = 3.125 ms.

Now I just need to calculate, how large a NIC buffer I need to buffer
3.125 ms at 1Gbit/s.

 3.125 ms *  1Gbit/s = 390625 bytes

Can this be correct?
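
A small sketch to check the estimate above, assuming the 100 ms is spread
evenly over the 32 per-CPU iterations and that the link runs at full line
rate for the whole window. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Per-iteration window in microseconds if total_us is split evenly. */
uint64_t window_per_cpu_us(uint64_t total_us, unsigned int ncpus)
{
	assert(ncpus > 0);
	return total_us / ncpus;
}

/* Bytes arriving on a link of rate_bps bits/s during window_us microseconds. */
uint64_t bytes_in_window(uint64_t rate_bps, uint64_t window_us)
{
	uint64_t bits = rate_bps * window_us / 1000000ULL;

	return bits / 8;
}
```

window_per_cpu_us(100000, 32) gives 3125 us, and
bytes_in_window(1000000000, 3125) gives 390625 bytes, which matches the
figure above.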

How much buffer does each queue have in the 82576 NIC?
(Hope Alexander Duyck can answer this one?)

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Junchang Wang @ 2010-12-16 14:05 UTC (permalink / raw)
  To: Changli Gao
  Cc: David S. Miller, Eric Dumazet, Tom Herbert, Jiri Pirko,
	Fenghua Yu, Xinan Tang, netdev
In-Reply-To: <1292479045-3136-1-git-send-email-xiaosuo@gmail.com>

On Thu, Dec 16, 2010 at 1:57 PM, Changli Gao <xiaosuo@gmail.com> wrote:
> In dev_queue_xmit_nit(), we have to clone skbs as we need to mangle skbs,
> however, we don't need to clone skbs for all the packet_types.
>
> Except for the first packet_type, we increase skb->users instead of
> skb_clone().

Hi Changli,
Taking af_packet as an example, I can't see any benefit from this patch.

> +static inline int deliver_skb(struct sk_buff *skb,
> +                             struct packet_type *pt_prev,
> +                             struct net_device *orig_dev)
> +{
> +       atomic_inc(&skb->users);
> +       return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
> +}
The increment will make the skb_shared() check in packet_rcv fire.
In practice, packet_rcv then has to clone this packet by itself.


Thanks

-- 
--Junchang

^ permalink raw reply

* Re: Possible regression: Packet drops during iptables calls
From: Eric Dumazet @ 2010-12-16 14:12 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Arnaldo Carvalho de Melo, Steven Rostedt, Alexander Duyck,
	Stephen Hemminger, netfilter-devel, netdev, Peter P Waskiewicz Jr
In-Reply-To: <1292508266.31289.12.camel@firesoul.comx.local>

On Thursday, 16 December 2010 at 15:04 +0100, Jesper Dangaard Brouer wrote:
> On Tue, 2010-12-14 at 17:24 +0100, Eric Dumazet wrote:
> > On Tuesday, 14 December 2010 at 17:09 +0100, Jesper Dangaard Brouer wrote:
> > > On Tue, 2010-12-14 at 16:31 +0100, Eric Dumazet wrote:
> > > > On Tuesday, 14 December 2010 at 15:46 +0100, Jesper Dangaard Brouer wrote:
> > > > > I'm experiencing RX packet drops during calls to iptables on my
> > > > > production servers.
> > > > > 
> > > > > Further investigation showed that it's only the CPU executing the
> > > > > iptables command that experiences packet drops!?  Thus, a quick fix was
> > > > > to force the iptables command to run on one of the idle CPUs (this can
> > > > > be achieved with the "taskset" command).
> > > > > 
> > > > > I have a 2x Xeon 5550 CPU system, thus 16 CPUs (with HT enabled).  We
> > > > > only use 8 CPUs due to a multiqueue limitation of 8 queues in the
> > > > > 1Gbit/s NICs (82576 chips).  CPUs 0 to 7 are assigned to packet
> > > > > processing via smp_affinity.
> > > > > 
> > > > > Can someone explain why the packet drops only occur on the CPU
> > > > > executing the iptables command?
> > > > > 
> > > > 
> > > > It blocks BH
> > > > 
> > > > take a look at commits :
> > > > 
> > > > 24b36f0193467fa727b85b4c004016a8dae999b9
> > > > netfilter: {ip,ip6,arp}_tables: dont block bottom half more than
> > > > necessary 
> > > > 
> > > > 001389b9581c13fe5fc357a0f89234f85af4215d
> > > > netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
> <... cut ...>
> > > 
> > > Looking closer at the two code changes combined, I see that the code path
> > > has been improved (a bit), as the local BH is only disabled inside the
> > > for_each_possible_cpu(cpu) loop.  Before, local BH was disabled for the whole
> > > function.  I guess I need to reproduce this in my test lab.
> 
> 
> To do some further investigation into the unfortunate behavior of the
> iptables get_counters() function I started to use "ftrace".  This is a
> really useful tool (thanks Steven Rostedt).
> 
>  # Select the tracer "function_graph"
>  echo function_graph > /sys/kernel/debug/tracing/current_tracer
> 
>  # Limit the number of functions we look at:
>  echo local_bh_\*  >   /sys/kernel/debug/tracing/set_ftrace_filter
>  echo get_counters >>  /sys/kernel/debug/tracing/set_ftrace_filter
> 
>  # Enable tracing while calling iptables
>  cd /sys/kernel/debug/tracing
>  echo 0 > trace
>  echo 1 > tracing_enabled;
>    taskset 1 iptables -vnL > /dev/null ;
>  echo 0 > tracing_enabled
>  cat trace | less
> 
> 
> The reduced output:
> 
> # tracer: function_graph
> #
> # CPU  DURATION                  FUNCTION CALLS
> # |     |   |                     |   |   |   |
>   2)   2.772 us    |  local_bh_disable();
> ....
>   0)   0.228 us    |  local_bh_enable();
>   0)               |  get_counters() {
>   0)   0.232 us    |    local_bh_disable();
>   0)   7.919 us    |    local_bh_enable();
>   0) ! 109467.1 us |  }
>   0)   2.344 us    |  local_bh_disable();
> 
> 
> The output shows that we spend no less than 100 ms with local BH
> disabled.  So it is no wonder that this causes packet drops in the NIC
> (attached to this CPU).
> 
> My iptables rule set in question is also very large, it contains:
>  Chains: 20929
>  Rules: 81239
> 
> The vmalloc size is approx 19 MB (19.820.544 bytes) (see
> /proc/vmallocinfo).  Looking through vmallocinfo I realized that
> even though I only have 16 CPUs, there are 32 allocated rulesets
> "xt_alloc_table_info" (for the filter table). Thus, I have approx
> 634MB of iptables filter rules in the kernel, half of which is totally
> unused.

Boot your machine with : "maxcpus=16 possible_cpus=16", it will be much
better ;)

> 
> I guess this is because we use "for_each_possible_cpu" instead of
> "for_each_online_cpu". (Feel free to fix this, or point me to some
> documentation of this CPU hotplug stuff... I see we are missing
> get_cpu() and put_cpu() in a lot of places).

Are you really using cpu hotplug ? If not, the "maxcpus=16
possible_cpus=16" trick should be enough for you.

> 
> 
> The GOOD NEWS is that moving the local BH disable section into the
> "for_each_possible_cpu" loop fixed the problem with packet drops during
> iptables calls.
> 
> I wanted to profile the new code with ftrace, but I cannot get the
> measurement I want. Perhaps Steven or Acme can help?
> 
> Now I want to measure the time spent between the local_bh_disable() and
> local_bh_enable() calls within the loop, but I cannot figure out how to do that.
> The new trace looks almost the same as before, just with a lot of
> local_bh_* calls inside the get_counters() function call.
> 
>  My guess is that the time spent is: 100 ms / 32 = 3.125 ms.
> 

Yes, approximately.

To speed this up, you could pre-fill the CPU cache before
the local_bh_disable() (just by reading the table), so that the critical
section is short because the data is mostly in your CPU cache.

> Now I just need to calculate, how large a NIC buffer I need to buffer
> 3.125 ms at 1Gbit/s.
> 
>  3.125 ms *  1Gbit/s = 390625 bytes
> 
> Can this be correct?
> 
> How much buffer does each queue have in the 82576 NIC?
> (Hope Alexander Duyck can answer this one?)
> 



^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Changli Gao @ 2010-12-16 14:12 UTC (permalink / raw)
  To: Junchang Wang
  Cc: David S. Miller, Eric Dumazet, Tom Herbert, Jiri Pirko,
	Fenghua Yu, Xinan Tang, netdev
In-Reply-To: <AANLkTikdYi-Cs=4UtjJx-X4bW+LSKTQBTcSv23NS2S+S@mail.gmail.com>

On Thu, Dec 16, 2010 at 10:05 PM, Junchang Wang <junchangwang@gmail.com> wrote:
>
> Hi Changli,
> Taking af_packet as an example, I can't see any benefit from this patch.
>
>> +static inline int deliver_skb(struct sk_buff *skb,
>> +                             struct packet_type *pt_prev,
>> +                             struct net_device *orig_dev)
>> +{
>> +       atomic_inc(&skb->users);
>> +       return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
>> +}
> The increment will make the skb_shared() check in packet_rcv fire.
> In practice, packet_rcv then has to clone this packet by itself.
>
>

This happens when run_filter returns non-zero. In your case, only
a small fraction of the packets match the BPF filter.


-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply

* Re: Possible regression: Packet drops during iptables calls
From: Eric Dumazet @ 2010-12-16 14:13 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Arnaldo Carvalho de Melo, Steven Rostedt, Alexander Duyck,
	Stephen Hemminger, netfilter-devel, netdev, Peter P Waskiewicz Jr
In-Reply-To: <1292508266.31289.12.camel@firesoul.comx.local>

On Thursday, 16 December 2010 at 15:04 +0100, Jesper Dangaard Brouer wrote:

> Now I just need to calculate, how large a NIC buffer I need to buffer
> 3.125 ms at 1Gbit/s.
> 
>  3.125 ms *  1Gbit/s = 390625 bytes
> 
> Can this be correct?
> 
> How much buffer does each queue have in the 82576 NIC?
> (Hope Alexander Duyck can answer this one?)
> 

Worst case is if you receive very small frames, because 3.125 ms is
about 5000 frames at 1Gbit/s
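
A rough sketch of where a figure of this order comes from, assuming
minimum-size 64-byte Ethernet frames plus 8 bytes of preamble and 12 bytes of
inter-frame gap on the wire (these per-frame overheads are standard Ethernet
values, not taken from this thread). The macro and function names are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIN_FRAME_BYTES 64  /* minimum Ethernet frame, including FCS */
#define SKETCH_PREAMBLE_BYTES   8  /* preamble + start-of-frame delimiter */
#define SKETCH_IFG_BYTES       12  /* inter-frame gap */

/* Whole minimum-size frames that fit on the wire during window_us. */
uint64_t min_frames_in_window(uint64_t rate_bps, uint64_t window_us)
{
	uint64_t wire_bits = (SKETCH_MIN_FRAME_BYTES + SKETCH_PREAMBLE_BYTES +
			      SKETCH_IFG_BYTES) * 8ULL;
	uint64_t bits = rate_bps * window_us / 1000000ULL;

	assert(wire_bits > 0);
	return bits / wire_bits;
}
```

min_frames_in_window(1000000000, 3125) gives 4650 frames, which is in line
with the "about 5000" above.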




^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Eric Dumazet @ 2010-12-16 14:18 UTC (permalink / raw)
  To: Junchang Wang
  Cc: Changli Gao, David S. Miller, Tom Herbert, Jiri Pirko, Fenghua Yu,
	Xinan Tang, netdev
In-Reply-To: <AANLkTikdYi-Cs=4UtjJx-X4bW+LSKTQBTcSv23NS2S+S@mail.gmail.com>

On Thursday, 16 December 2010 at 22:05 +0800, Junchang Wang wrote:
> On Thu, Dec 16, 2010 at 1:57 PM, Changli Gao <xiaosuo@gmail.com> wrote:
> > In dev_queue_xmit_nit(), we have to clone skbs as we need to mangle skbs,
> > however, we don't need to clone skbs for all the packet_types.
> >
> > Except for the first packet_type, we increase skb->users instead of
> > skb_clone().
> 
> Hi Changli,
> Taking af_packet as an example, I can't see any benefit from this patch.
> 
> > +static inline int deliver_skb(struct sk_buff *skb,
> > +                             struct packet_type *pt_prev,
> > +                             struct net_device *orig_dev)
> > +{
> > +       atomic_inc(&skb->users);
> > +       return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
> > +}
> The increment will make the skb_shared() check in packet_rcv fire.
> In practice, packet_rcv then has to clone this packet by itself.
> 

Yes, and no.

Consider the case where you have one receiver.

The packet given after Changli's patch won't be shared, so packet_rcv won't
clone it: that's a win. Only one skb_clone() is done instead of two.

Consider the case with 2 receivers:

The first time we call packet_rcv, the packet is shared (because we call
deliver_skb()), so packet_rcv clones it. This is the normal situation; we
really need to clone it.

The second time, we give a non-shared packet: that's a win over the previous
situation.




^ permalink raw reply

* Re: Possible regression: Packet drops during iptables calls
From: Steven Rostedt @ 2010-12-16 14:20 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Arnaldo Carvalho de Melo, Eric Dumazet, Alexander Duyck,
	Stephen Hemminger, netfilter-devel, netdev, Peter P Waskiewicz Jr
In-Reply-To: <1292508266.31289.12.camel@firesoul.comx.local>

On Thu, 2010-12-16 at 15:04 +0100, Jesper Dangaard Brouer wrote:

> 
> To do some further investigation into the unfortunate behavior of the
> iptables get_counters() function I started to use "ftrace".  This is a
> really useful tool (thanks Steven Rostedt).
> 
>  # Select the tracer "function_graph"
>  echo function_graph > /sys/kernel/debug/tracing/current_tracer
> 
>  # Limit the number of functions we look at:
>  echo local_bh_\*  >   /sys/kernel/debug/tracing/set_ftrace_filter
>  echo get_counters >>  /sys/kernel/debug/tracing/set_ftrace_filter
> 
>  # Enable tracing while calling iptables
>  cd /sys/kernel/debug/tracing
>  echo 0 > trace
>  echo 1 > tracing_enabled;
>    taskset 1 iptables -vnL > /dev/null ;
>  echo 0 > tracing_enabled
>  cat trace | less

Just an FYI, you can do the above much more easily with trace-cmd:

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git

# trace-cmd record -p function_graph -l 'local_bh_*' -l get_counters taskset 1 iptables -vnL > /dev/null
# trace-cmd report

-- Steve

> 
> 
> The reduced output:
> 
> # tracer: function_graph
> #
> # CPU  DURATION                  FUNCTION CALLS
> # |     |   |                     |   |   |   |
>   2)   2.772 us    |  local_bh_disable();
> ....
>   0)   0.228 us    |  local_bh_enable();
>   0)               |  get_counters() {
>   0)   0.232 us    |    local_bh_disable();
>   0)   7.919 us    |    local_bh_enable();
>   0) ! 109467.1 us |  }
>   0)   2.344 us    |  local_bh_disable();
> 
> 
> The output shows that we spend no less than 100 ms with local BH
> disabled.  So it is no wonder that this causes packet drops in the NIC
> (attached to this CPU).
> 
> My iptables rule set in question is also very large, it contains:
>  Chains: 20929
>  Rules: 81239
> 
> The vmalloc size is approx 19 MB (19.820.544 bytes) (see
> /proc/vmallocinfo).  Looking through vmallocinfo I realized that
> even though I only have 16 CPUs, there are 32 allocated rulesets
> "xt_alloc_table_info" (for the filter table). Thus, I have approx
> 634MB of iptables filter rules in the kernel, half of which is totally
> unused.
> 
> I guess this is because we use "for_each_possible_cpu" instead of
> "for_each_online_cpu". (Feel free to fix this, or point me to some
> documentation of this CPU hotplug stuff... I see we are missing
> get_cpu() and put_cpu() in a lot of places).
> 
> 
> The GOOD NEWS is that moving the local BH disable section into the
> "for_each_possible_cpu" loop fixed the problem with packet drops during
> iptables calls.
> 
> I wanted to profile the new code with ftrace, but I cannot get the
> measurement I want. Perhaps Steven or Acme can help?
> 
> Now I want to measure the time spent between the local_bh_disable() and
> local_bh_enable() calls within the loop, but I cannot figure out how to do that.
> The new trace looks almost the same as before, just with a lot of
> local_bh_* calls inside the get_counters() function call.
> 
>  My guess is that the time spent is: 100 ms / 32 = 3.125 ms.
> 
> Now I just need to calculate, how large a NIC buffer I need to buffer
> 3.125 ms at 1Gbit/s.
> 
>  3.125 ms *  1Gbit/s = 390625 bytes
> 
> Can this be correct?
> 
> How much buffer does each queue have in the 82576 NIC?
> (Hope Alexander Duyck can answer this one?)
> 



^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Junchang Wang @ 2010-12-16 14:20 UTC (permalink / raw)
  To: Changli Gao
  Cc: Eric Dumazet, David S. Miller, Tom Herbert, Jiri Pirko,
	Fenghua Yu, Xinan Tang, netdev
In-Reply-To: <AANLkTi=BX8AMaiwV3aDAVqRA=br00eRo-42vPNwZfwS6@mail.gmail.com>

On Thu, Dec 16, 2010 at 3:23 PM, Changli Gao <xiaosuo@gmail.com> wrote:

>> You beat me, but I was thinking of a different way, adding a new
>> pt_prev->xmit_func(), handling all the details (no need for atomic ops
>> on skb users if packet is not delivered at all).
>>
>> By the way, your patch is not 100% safe/OK, because af_packet rcv()
>> handler writes on skb (skb_pull() and all)
>>
>
> But af_packet_rcv() restores skbs at last.
>
>        if (skb_head != skb->data && skb_shared(skb)) {
>                skb->data = skb_head;
>                skb->len = skb_len;
>        }
>
If af_packet's packet_rcv invokes skb_clone, that skb differs from the original one.
Eric's warning is right.


-- 
--Junchang

^ permalink raw reply

* Re: Possible regression: Packet drops during iptables calls
From: Jesper Dangaard Brouer @ 2010-12-16 14:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Arnaldo Carvalho de Melo, Steven Rostedt, Alexander Duyck,
	Stephen Hemminger, netfilter-devel, netdev, Peter P Waskiewicz Jr
In-Reply-To: <1292508733.2883.152.camel@edumazet-laptop>

On Thu, 2010-12-16 at 15:12 +0100, Eric Dumazet wrote:
> On Thursday, 16 December 2010 at 15:04 +0100, Jesper Dangaard Brouer wrote:

> > The vmalloc size is approx 19 MB (19.820.544 bytes) (see
> > /proc/vmallocinfo).  Looking through vmallocinfo I realized that
> > even though I only have 16 CPUs, there are 32 allocated rulesets
> > "xt_alloc_table_info" (for the filter table). Thus, I have approx
> > 634MB of iptables filter rules in the kernel, half of which is totally
> > unused.
> 
> Boot your machine with : "maxcpus=16 possible_cpus=16", it will be much
> better ;)

Good trick.  I'll use that.

> > I guess this is because we use "for_each_possible_cpu" instead of
> > "for_each_online_cpu". (Feel free to fix this, or point me to some
> > documentation of this CPU hotplug stuff... I see we are missing
> > get_cpu() and put_cpu() in a lot of places).
> 
> Are you really using cpu hotplug ? If not, the "maxcpus=16
> possible_cpus=16" trick should be enough for you.

No, not using hotplug CPUs.  It's just a pity that we waste kernel
memory on this for everyone who does not know the "maxcpus=16
possible_cpus=16" trick...

But as I don't have a hotplug CPU system, I have no chance of testing a
potential code fix/patch.


> > 
> 
> To speed this up, you could pre-fill the CPU cache before
> the local_bh_disable() (just by reading the table), so that the critical
> section is short because the data is mostly in your CPU cache.

In my case I think this will not help. I'll kill the cache anyways, as
the ruleset is 19MB and my CPU cache is 8MB.


-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network Kernel Developer
  Cand. Scient Datalog / MSc.CS
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer



^ permalink raw reply

* Re: Possible regression: Packet drops during iptables calls
From: Eric Dumazet @ 2010-12-16 14:29 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Arnaldo Carvalho de Melo, Steven Rostedt, Alexander Duyck,
	Stephen Hemminger, netfilter-devel, netdev, Peter P Waskiewicz Jr
In-Reply-To: <1292509489.31289.20.camel@firesoul.comx.local>

On Thursday, 16 December 2010 at 15:24 +0100, Jesper Dangaard Brouer wrote:

> In my case I think this will not help. I'll kill the cache anyways, as
> the ruleset is 19MB and my CPU cache is 8MB.
> 
> 

Yep ;)

By the way, you speak of a 'possible regression', but we have always masked
BH while doing get_counters().

Only very recent kernels mask them for each unit (cpu) of work.

There was an attempt to use a lockless read for each counter (using a
seqlock), but it was not completed. I guess we could do something to
resurrect this idea.
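
To illustrate what a seqlock-style lockless counter read looks like in
general, here is a userspace sketch using C11 atomics. It is not the
unfinished kernel attempt mentioned above, and the struct and function names
are hypothetical; it assumes a single writer per counter (for example the
owning CPU).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct xt_counter_sketch {
	atomic_uint seq;          /* odd while a write is in progress */
	_Atomic uint64_t bytes;
	_Atomic uint64_t packets;
};

/* Writer side: bump seq to odd, update both fields, bump seq to even. */
void counter_add(struct xt_counter_sketch *c, uint64_t bytes, uint64_t packets)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);

	assert((s & 1) == 0);     /* single writer: no write already running */
	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&c->bytes,
		atomic_load_explicit(&c->bytes, memory_order_relaxed) + bytes,
		memory_order_relaxed);
	atomic_store_explicit(&c->packets,
		atomic_load_explicit(&c->packets, memory_order_relaxed) + packets,
		memory_order_relaxed);
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: takes no lock, retries if a write overlapped the read. */
void counter_read(struct xt_counter_sketch *c, uint64_t *bytes, uint64_t *packets)
{
	unsigned int s1, s2;

	do {
		s1 = atomic_load_explicit(&c->seq, memory_order_acquire);
		*bytes = atomic_load_explicit(&c->bytes, memory_order_relaxed);
		*packets = atomic_load_explicit(&c->packets, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);
}
```

The point of such a scheme is that the reader never blocks the writer, so
reading the counters would not need BH masked on the reading side; the reader
only pays with a retry when it races with an update.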




^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Eric Dumazet @ 2010-12-16 14:30 UTC (permalink / raw)
  To: Junchang Wang
  Cc: Changli Gao, David S. Miller, Tom Herbert, Jiri Pirko, Fenghua Yu,
	Xinan Tang, netdev
In-Reply-To: <AANLkTim--M0TvW9oUi0UO6Ej_8aLezneQsdRx7X0CXDt@mail.gmail.com>

On Thursday, 16 December 2010 at 22:20 +0800, Junchang Wang wrote:
> On Thu, Dec 16, 2010 at 3:23 PM, Changli Gao <xiaosuo@gmail.com> wrote:
> 
> >> You beat me, but I was thinking of a different way, adding a new
> >> pt_prev->xmit_func(), handling all the details (no need for atomic ops
> >> on skb users if packet is not delivered at all).
> >>
> >> By the way, your patch is not 100% safe/OK, because af_packet rcv()
> >> handler writes on skb (skb_pull() and all)
> >>
> >
> > But af_packet_rcv() restores skbs at last.
> >
> >        if (skb_head != skb->data && skb_shared(skb)) {
> >                skb->data = skb_head;
> >                skb->len = skb_len;
> >        }
> >
> If af_packet's packet_rcv invokes skb_clone, that skb differs from the original one.
> Eric's warning is right.

It was a false alarm.

If packet_rcv() invokes skb_clone(), skb still points to original skb.
No worry.



^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Junchang Wang @ 2010-12-16 14:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Changli Gao, David S. Miller, Tom Herbert, Jiri Pirko, Fenghua Yu,
	Xinan Tang, netdev
In-Reply-To: <1292509118.2883.167.camel@edumazet-laptop>

On Thu, Dec 16, 2010 at 10:18 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Yes, and no.
>
> Consider the case where you have one receiver.
>
> The packet given after Changli's patch won't be shared, so packet_rcv won't
> clone it: that's a win. Only one skb_clone() is done instead of two.
>
> Consider the case with 2 receivers:
>
> The first time we call packet_rcv, the packet is shared (because we call
> deliver_skb()), so packet_rcv clones it. This is the normal situation; we
> really need to clone it.

Got it. Thanks.

>
> The second time, we give a non-shared packet: that's a win over the previous
> situation.
>
But if we have N receivers, only the last one is a win; the first N-1 will
still go through deliver_skb().


Thanks.
-- 
--Junchang

^ permalink raw reply

* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Junchang Wang @ 2010-12-16 14:36 UTC (permalink / raw)
  To: Changli Gao
  Cc: David S. Miller, Eric Dumazet, Tom Herbert, Jiri Pirko,
	Fenghua Yu, Xinan Tang, netdev
In-Reply-To: <AANLkTi=DvyuZeWzzi5WJR7AeTZba+U7Vz9cJYR5uRBVc@mail.gmail.com>

On Thu, Dec 16, 2010 at 10:12 PM, Changli Gao <xiaosuo@gmail.com> wrote:

>
> This happens when run_filter returns non-zero. In your case, only
> a small fraction of the packets match the BPF filter.
Hi Changli,
In most cases, I want user-space applications to see everything. :)

Thanks.
-- 
--Junchang

^ permalink raw reply

* Re: iwl rfkill suddenly dropped to hard block
From: Evgeniy Polyakov @ 2010-12-16 14:40 UTC (permalink / raw)
  To: John W. Linville
  Cc: Intel Linux Wireless, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	wey-yi.w.guy-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <20101215201126.GG2377-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 565 bytes --]

On Wed, Dec 15, 2010 at 03:11:27PM -0500, John W. Linville (linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org) wrote:
> To be honest, nearly every report of "suddenly my rfkill is stuck
> on" is because the laptop has multiple rfkill keys, usually with
> one of them a slider along the edge of the case.  In particular,
> Thinkpads have such switches.  The slider gets accidentally engaged
> (possibly while the laptop is being transported or somesuch) and
> suddenly wireless stops working.

I feel incredibly stupid, but...
I found the key :)

-- 
	Evgeniy Polyakov

[-- Attachment #2: 16-12-2010-i-must-be-blond.png --]
[-- Type: image/png, Size: 141801 bytes --]


* Re: [PATCH] net: increase skb->users instead of skb_clone()
From: Eric Dumazet @ 2010-12-16 14:41 UTC (permalink / raw)
  To: Junchang Wang
  Cc: Changli Gao, David S. Miller, Tom Herbert, Jiri Pirko, Fenghua Yu,
	Xinan Tang, netdev
In-Reply-To: <AANLkTimd96KoT1-KqeiF7i10pH3pfML=rsUho=cCancU@mail.gmail.com>

On Thursday, 16 December 2010 at 22:31 +0800, Junchang Wang wrote:
> On Thu, Dec 16, 2010 at 10:18 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > Yes, and no.
> >
> > Consider the case you have one receiver.
> >
> > The packet given after Changli's patch won't be shared, so packet_rcv won't
> > clone it: that's a win. Only one skb_clone() is done instead of two.
> >
> > Consider the case with 2 receivers:
> >
> > The first time we call packet_rcv, the packet is shared (because we call
> > deliver_skb()), so packet_rcv clones it. That is the normal situation; we
> > really need to clone it.
> 
> Got it. Thanks.
> 
> >
> > The second time, we give a non-shared packet: that's a win over the previous
> > situation.
> >
> But if we have N receivers, only the last one gets that win; the first N-1
> will still go through deliver_skb().
> 

Yes, but you want to, because each receiver has to make a private copy
of the skb.

The big win is that if the packet is filtered out (not accepted by the
socket filter), you end up with no extra skb_clone() at all.

Say you have 8 receivers, with a filter matching some hash/cpu, and only
one af_packet socket will take the message.

Before patch: 8 skb_clone() calls

After patch: one skb_clone() call

If I understood the patch intent ;)
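
The 8-versus-1 count above can be reproduced with a small user-space toy
model. This is only a sketch and not kernel code: struct toy_skb,
clones_before_patch() and clones_after_patch() are hypothetical names made
up for illustration, and the model assumes the buffer is still shared
whenever a receiver accepts it (it ignores the case, discussed earlier in
the thread, where the last receiver is handed a non-shared packet).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff: only the reference count is modelled. */
struct toy_skb {
    int users;
};

/*
 * Assumed "before" behaviour: one clone is made per receiver up front,
 * before that receiver's socket filter has had a chance to reject it.
 */
int clones_before_patch(const bool *accepts, size_t n)
{
    int clones = 0;

    assert(accepts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        clones++;               /* clone made before the filter runs */
        if (!accepts[i])
            continue;           /* rejected: that clone was wasted */
    }
    return clones;
}

/*
 * Assumed "after" behaviour: the core only takes a reference; a receiver
 * clones for itself only if its filter accepts a buffer that is shared.
 */
int clones_after_patch(const bool *accepts, size_t n)
{
    struct toy_skb skb = { .users = 1 };
    int clones = 0;

    assert(accepts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        skb.users++;            /* reference taken instead of a clone */
        if (accepts[i] && skb.users > 1)
            clones++;           /* accepted and shared: private copy needed */
        skb.users--;            /* receiver drops its reference */
    }
    return clones;
}
```

With an array of 8 receivers in which exactly one filter accepts, the first
function counts 8 clones and the second counts 1, which matches the
arithmetic in the message.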





* Re: iwl rfkill suddenly dropped to hard block
From: John W. Linville @ 2010-12-16 14:42 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: Intel Linux Wireless, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	wey-yi.w.guy-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <20101216144000.GA16183-i6C2adt8DTjR7s880joybQ@public.gmane.org>

On Thu, Dec 16, 2010 at 05:40:00PM +0300, Evgeniy Polyakov wrote:
> On Wed, Dec 15, 2010 at 03:11:27PM -0500, John W. Linville (linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org) wrote:
> > To be honest, nearly every report of "suddenly my rfkill is stuck
> > on" is because the laptop has multiple rfkill keys, usually with
> > one of them a slider along the edge of the case.  In particular,
> > Thinkpads have such switches.  The slider gets accidently engaged
> > (possibly while the laptop is being transported or somesuch) and
> > suddenly wireless stops working.
> 
> I feel incredibly stupid, but...
> I found the key :)

Nice to start the day with a laugh! ;-)

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org			might be all we have.  Be ready.

