* Re: 2.6.38.x, 2.6.39 sfq? kernel panic in sfq_enqueue
From: Eric Dumazet @ 2011-05-23 12:50 UTC
To: Denys Fedoryshchenko; +Cc: netdev, hadi
In-Reply-To: <1306153938.20687.2.camel@edumazet-laptop>
On Monday, 23 May 2011 at 14:32 +0200, Eric Dumazet wrote:
> Ouch, that's an ip_fragment() bug, I am afraid... nothing to do with SFQ
>
> It calls
>
> err = output(skb);
>
> and a bit later does :
>
> skb = frag;
> frag = skb->next; // that's completely illegal here!
> skb->next = NULL;
>
> I am cooking a patch and will send it in a couple of minutes.
Oh well, false alarm; I am still trying to understand the case.
Some other reports would be appreciated, because here is the strange
thing:
[ 4461.969603] Code: b6 70 10
3b b3 08 01 00 00
0f 8d df 01 00 00       jge ....
41 8b 74 24 28          mov 0x28(%r12),%esi     qdisc_pkt_len(skb)
01 b3 b4 00 00 00                               sch->qstats.backlog += qdisc_pkt_len(skb);

RAX = slot
R12 = SKB

48 8b 70 08             mov 0x8(%rax),%rsi      slot->skblist_prev
49 89 04 24             mov %rax,(%r12)         skb->next = (struct sk_buff *)slot;
49 89 74 24 08          mov %rsi,0x8(%r12)      skb->prev = slot->skblist_prev;
48 8b 70 08             mov 0x8(%rax),%rsi      slot->skblist_prev (refetch)
<4c> 89 26              mov %r12,(%rsi)         slot->skblist_prev->next = skb; // CRASH
0f b6 f2                movzbl %dl,%esi
4c 89 60 08             mov %r12,0x8(%rax)      slot->skblist_prev = skb;
48 8d 3c 76             lea
48 8d bc fb 90 01 00
And in your report RAX = R12 !!! (ffff8801172a7d08). I can't see how
this can happen (it's not even a valid skb address, since an skb should
be 64-byte aligned).
If available, a disassembly of sfq_enqueue() would be appreciated too ;)
Thanks !
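For readers following the annotation above: the four stores correspond to
a tail add on a circular doubly linked list. Below is a minimal userspace
sketch of that operation. It assumes a plain struct node standing in for
both the skb and the slot's list head; the names node, list_init(),
tail_add() and is_64_byte_aligned() are illustrative and are not taken
from sch_sfq.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the list links of an skb and of the slot
 * head. On x86_64, next sits at offset 0 and prev at offset 8, like
 * the (%r12) and 0x8(%rax) operands in the annotated dump. */
struct node {
	struct node *next;
	struct node *prev;
};

/* An empty circular list points back at its own head. */
static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Tail add, in the same store order as the annotated instructions. */
static void tail_add(struct node *head, struct node *n)
{
	n->next = head;          /* skb->next = slot                     */
	n->prev = head->prev;    /* skb->prev = slot->skblist_prev       */
	head->prev->next = n;    /* faults here if head->prev is NULL    */
	head->prev = n;          /* slot->skblist_prev = skb             */
}

/* True when addr is a multiple of 64. */
static int is_64_byte_aligned(uint64_t addr)
{
	return (addr & 63) == 0;
}
```

The third store is the only one that dereferences the old tail pointer,
which fits the register dump in the report (RSI and CR2 are both 0). The
alignment helper applied to ffff8801172a7d08 shows the low six bits are
0x08, so that address is not 64-byte aligned.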
* [RFC Patch] bonding: move to net/ directory
From: Américo Wang @ 2011-05-23 12:45 UTC
To: Linux Kernel Network Developers
Cc: David Miller, Jay Vosburgh, Andy Gospodarek
Hello, Jay, Andy,
Is there any particular reason that the bonding code has to stay
in the drivers/net/ directory?
Bonding and bridge are somewhat similar: both are used to "bond"
two or more devices into one. Since the bridge code is already in
net/bridge/, I think it makes sense to move the bonding code into
net/bonding/ as well.
This could also make it easier to grep the source. :)
What do you think about the patch below?
(Note, this patch is against net-next-2.6.)
Thanks!
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: "David Miller" <davem@davemloft.net>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
---
[-- Attachment #2: net-bonding-movement.diff --]
[-- Type: text/x-patch, Size: 6476 bytes --]
Documentation/networking/bonding.txt | 2 +-
MAINTAINERS | 2 +-
drivers/net/Kconfig | 18 ------------------
drivers/net/Makefile | 1 -
net/Kconfig | 19 +++++++++++++++++++
net/Makefile | 1 +
{drivers/net => net}/bonding/Makefile | 0
{drivers/net => net}/bonding/bond_3ad.c | 0
{drivers/net => net}/bonding/bond_3ad.h | 0
{drivers/net => net}/bonding/bond_alb.c | 0
{drivers/net => net}/bonding/bond_alb.h | 0
{drivers/net => net}/bonding/bond_debugfs.c | 0
{drivers/net => net}/bonding/bond_ipv6.c | 0
{drivers/net => net}/bonding/bond_main.c | 0
{drivers/net => net}/bonding/bond_procfs.c | 0
{drivers/net => net}/bonding/bond_sysfs.c | 0
{drivers/net => net}/bonding/bonding.h | 0
17 files changed, 22 insertions(+), 21 deletions(-)
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 1f45bd8..c1b3924 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -114,7 +114,7 @@ the following steps:
-----------------------------------------------
The current version of the bonding driver is available in the
-drivers/net/bonding subdirectory of the most recent kernel source
+net/bonding subdirectory of the most recent kernel source
(which is available on http://kernel.org). Most users "rolling their
own" will want to use the most recent kernel from kernel.org.
diff --git a/MAINTAINERS b/MAINTAINERS
index 49a0bf3..2eb2e79 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1495,7 +1495,7 @@ M: Andy Gospodarek <andy@greyhouse.net>
L: netdev@vger.kernel.org
W: http://sourceforge.net/projects/bonding/
S: Supported
-F: drivers/net/bonding/
+F: net/bonding/
F: include/linux/if_bonding.h
BROADCOM B44 10/100 ETHERNET DRIVER
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 19f04a3..a8d39e3 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -60,24 +60,6 @@ config DUMMY
Instead of 'dummy', the devices will then be called 'dummy0',
'dummy1' etc.
-config BONDING
- tristate "Bonding driver support"
- depends on INET
- depends on IPV6 || IPV6=n
- ---help---
- Say 'Y' or 'M' if you wish to be able to 'bond' multiple Ethernet
- Channels together. This is called 'Etherchannel' by Cisco,
- 'Trunking' by Sun, 802.3ad by the IEEE, and 'Bonding' in Linux.
-
- The driver supports multiple bonding modes to allow for both high
- performance and high availability operation.
-
- Refer to <file:Documentation/networking/bonding.txt> for more
- information.
-
- To compile this driver as a module, choose M here: the module
- will be called bonding.
-
config MACVLAN
tristate "MAC-VLAN support (EXPERIMENTAL)"
depends on EXPERIMENTAL
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 209fbb7..6f1a3ca 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -25,7 +25,6 @@ obj-$(CONFIG_CHELSIO_T4) += cxgb4/
obj-$(CONFIG_CHELSIO_T4VF) += cxgb4vf/
obj-$(CONFIG_EHEA) += ehea/
obj-$(CONFIG_CAN) += can/
-obj-$(CONFIG_BONDING) += bonding/
obj-$(CONFIG_ATL1) += atlx/
obj-$(CONFIG_ATL2) += atlx/
obj-$(CONFIG_ATL1E) += atl1e/
diff --git a/net/Kconfig b/net/Kconfig
index 878151c..e963cde 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -191,6 +191,25 @@ source "net/bridge/netfilter/Kconfig"
endif
+config BONDING
+ tristate "Bonding driver support"
+ depends on INET
+ depends on IPV6 || IPV6=n
+ ---help---
+ Say 'Y' or 'M' if you wish to be able to 'bond' multiple Ethernet
+ Channels together. This is called 'Etherchannel' by Cisco,
+ 'Trunking' by Sun, 802.3ad by the IEEE, and 'Bonding' in Linux.
+
+ The driver supports multiple bonding modes to allow for both high
+ performance and high availability operation.
+
+ Refer to <file:Documentation/networking/bonding.txt> for more
+ information.
+
+ To compile this driver as a module, choose M here: the module
+ will be called bonding.
+
+
source "net/dccp/Kconfig"
source "net/sctp/Kconfig"
source "net/rds/Kconfig"
diff --git a/net/Makefile b/net/Makefile
index a51d946..1e74030 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_NET) += ipv6/
obj-$(CONFIG_PACKET) += packet/
obj-$(CONFIG_NET_KEY) += key/
obj-$(CONFIG_BRIDGE) += bridge/
+obj-$(CONFIG_BONDING) += bonding/
obj-$(CONFIG_NET_DSA) += dsa/
obj-$(CONFIG_IPX) += ipx/
obj-$(CONFIG_ATALK) += appletalk/
diff --git a/drivers/net/bonding/Makefile b/net/bonding/Makefile
similarity index 100%
rename from drivers/net/bonding/Makefile
rename to net/bonding/Makefile
diff --git a/drivers/net/bonding/bond_3ad.c b/net/bonding/bond_3ad.c
similarity index 100%
rename from drivers/net/bonding/bond_3ad.c
rename to net/bonding/bond_3ad.c
diff --git a/drivers/net/bonding/bond_3ad.h b/net/bonding/bond_3ad.h
similarity index 100%
rename from drivers/net/bonding/bond_3ad.h
rename to net/bonding/bond_3ad.h
diff --git a/drivers/net/bonding/bond_alb.c b/net/bonding/bond_alb.c
similarity index 100%
rename from drivers/net/bonding/bond_alb.c
rename to net/bonding/bond_alb.c
diff --git a/drivers/net/bonding/bond_alb.h b/net/bonding/bond_alb.h
similarity index 100%
rename from drivers/net/bonding/bond_alb.h
rename to net/bonding/bond_alb.h
diff --git a/drivers/net/bonding/bond_debugfs.c b/net/bonding/bond_debugfs.c
similarity index 100%
rename from drivers/net/bonding/bond_debugfs.c
rename to net/bonding/bond_debugfs.c
diff --git a/drivers/net/bonding/bond_ipv6.c b/net/bonding/bond_ipv6.c
similarity index 100%
rename from drivers/net/bonding/bond_ipv6.c
rename to net/bonding/bond_ipv6.c
diff --git a/drivers/net/bonding/bond_main.c b/net/bonding/bond_main.c
similarity index 100%
rename from drivers/net/bonding/bond_main.c
rename to net/bonding/bond_main.c
diff --git a/drivers/net/bonding/bond_procfs.c b/net/bonding/bond_procfs.c
similarity index 100%
rename from drivers/net/bonding/bond_procfs.c
rename to net/bonding/bond_procfs.c
diff --git a/drivers/net/bonding/bond_sysfs.c b/net/bonding/bond_sysfs.c
similarity index 100%
rename from drivers/net/bonding/bond_sysfs.c
rename to net/bonding/bond_sysfs.c
diff --git a/drivers/net/bonding/bonding.h b/net/bonding/bonding.h
similarity index 100%
rename from drivers/net/bonding/bonding.h
rename to net/bonding/bonding.h
* Re: 2.6.38.x, 2.6.39 sfq? kernel panic in sfq_enqueue
From: Eric Dumazet @ 2011-05-23 12:32 UTC
To: Denys Fedoryshchenko; +Cc: netdev, hadi
In-Reply-To: <598fe111e91c6236b8bfdfca323b9a17@visp.net.lb>
On Monday, 23 May 2011 at 13:01 +0300, Denys Fedoryshchenko wrote:
> This report is not mine; I am just helping to forward the information
> to netdev.
> Arch Linux, x86_64, NAS for PPTP (with PPTP "acceleration" enabled)
> If any other info is required, please let me know.
>
> Here is the panic message:
>
> [ 4461.966303] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [ 4461.969603] IP: [<ffffffffa019fb90>] sfq_enqueue+0xe0/0x630
> [sch_sfq]
> [ 4461.969603] PGD 1179a0067 PUD 1160fa067 PMD 0
> [ 4461.969603] Oops: 0002 [#1] PREEMPT SMP
> [ 4461.969603] last sysfs file: /sys/devices/virtual/net/ppp41/uevent
> [ 4461.969603] CPU 0
> [ 4461.969603] Modules linked in: act_police sch_ingress cls_u32
> sch_sfq sch_htb xt_TCPMSS xt_tcpudp iptable_filter ip_tables x_tables
> igb l2tp_ppp l2tp_netlink l2tp_core pptp pppox ppp_generic slhc gre
> bonding i2c_i801 firewire_ohci psmouse uhci_hcd iTCO_wdt button evdev
> firewire_core processor intel_agp intel_gtt i2c_core asus_atk0110 dca
> iTCO_vendor_support ehci_hcd pcspkr serio_raw sg usbcore crc_itu_t ipv6
> ext2 mbcache sd_mod ahci libahci pata_jmicron pata_acpi libata scsi_mod
> [ 4461.969603]
> [ 4461.969603] Pid: 0, comm: swapper Not tainted 2.6.39-ARCH #1 System
> manufacturer System Product Name/P5QD TURBO
> [ 4461.969603] RIP: 0010:[<ffffffffa019fb90>] [<ffffffffa019fb90>]
> sfq_enqueue+0xe0/0x630 [sch_sfq]
> [ 4461.969603] RSP: 0018:ffff88011fc03940 EFLAGS: 00010206
> [ 4461.969603] RAX: ffff8801172a7d08 RBX: ffff8801172a7000 RCX:
> ffff8801172a7100
> [ 4461.969603] RDX: 000000000000007b RSI: 0000000000000000 RDI:
> ffff8801172a7d08
> [ 4461.969603] RBP: ffff88011fc03980 R08: ffff8801170e8ac0 R09:
> 0000000000000007
> [ 4461.969603] R10: 0000000000000001 R11: ffffc900110a1000 R12:
> ffff8801172a7d08
> [ 4461.969603] R13: 0000000017cc394b R14: 00000000a54672c3 R15:
> ffff880117f8109c
> [ 4461.969603] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000)
> knlGS:0000000000000000
> [ 4461.969603] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 4461.969603] CR2: 0000000000000000 CR3: 00000001171b2000 CR4:
> 00000000000406f0
> [ 4461.969603] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 4461.969603] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [ 4461.969603] Process swapper (pid: 0, threadinfo ffffffff81600000,
> task ffffffff8169b020)
> [ 4461.969603] Stack:
> [ 4461.969603] ffff88011fc039a0 ffff880117787d80 ffff88011fc03980
> ffff880117f81000
> [ 4461.969603] ffff880117b0f400 ffff8801172a7d08 ffff8801172a7000
> ffff880117f8109c
> [ 4461.969603] ffff88011fc039d0 ffffffffa0165080 ffff880114016f00
> ffff880114016f00
> [ 4461.969603] Call Trace:
> [ 4461.969603] <IRQ>
> [ 4461.969603] [<ffffffffa0165080>] htb_enqueue+0xb0/0x3c0 [sch_htb]
> [ 4461.969603] [<ffffffff8112f0d3>] ?
> __kmalloc_node_track_caller+0x33/0x240
> [ 4461.969603] [<ffffffff813127b3>] dev_queue_xmit+0x1d3/0x680
> [ 4461.969603] [<ffffffff81345300>] ? ip_fragment+0x1d0/0x960
> [ 4461.969603] [<ffffffff81344252>] ip_finish_output2+0x1e2/0x2c0
> [ 4461.969603] [<ffffffff8134534e>] ip_fragment+0x21e/0x960
> [ 4461.969603] [<ffffffff81344070>] ? ip_send_check+0x50/0x50
> [ 4461.969603] [<ffffffff81345d8f>] ip_finish_output+0x27f/0x360
> [ 4461.969603] [<ffffffff81342840>] ? ip_frag_mem+0x10/0x10
> [ 4461.969603] [<ffffffff81346928>] ip_output+0xc8/0xe0
> [ 4461.969603] [<ffffffff8134287f>] ip_forward_finish+0x3f/0x50
> [ 4461.969603] [<ffffffff81342b25>] ip_forward+0x295/0x430
> [ 4461.969603] [<ffffffff81340d61>] ip_rcv_finish+0x131/0x370
> [ 4461.969603] [<ffffffff81303a1a>] ? __alloc_skb+0x4a/0x230
> [ 4461.969603] [<ffffffff8134163e>] ip_rcv+0x21e/0x2f0
> [ 4461.969603] [<ffffffff8130f80a>] __netif_receive_skb+0x30a/0x6c0
> [ 4461.969603] [<ffffffff81011759>] ? read_tsc+0x9/0x20
> [ 4461.969603] [<ffffffff813103bd>] netif_receive_skb+0xad/0xc0
> [ 4461.969603] [<ffffffff8121867c>] ? is_swiotlb_buffer+0x3c/0x50
> [ 4461.969603] [<ffffffff81310d08>] napi_skb_finish+0x48/0x60
> [ 4461.969603] [<ffffffff81310dcd>] napi_gro_receive+0xad/0xc0
> [ 4461.969603] [<ffffffffa01b79b7>] igb_poll+0x8c7/0xd60 [igb]
> [ 4461.969603] [<ffffffff81310699>] net_rx_action+0x149/0x300
> [ 4461.969603] [<ffffffff81011759>] ? read_tsc+0x9/0x20
> [ 4461.969603] [<ffffffff8105ea78>] __do_softirq+0xa8/0x280
> [ 4461.969603] [<ffffffff81089538>] ?
> tick_dev_program_event+0x48/0x110
> [ 4461.969603] [<ffffffff8108961a>] ? tick_program_event+0x1a/0x20
> [ 4461.969603] [<ffffffff813cc51c>] call_softirq+0x1c/0x30
> [ 4461.969603] [<ffffffff8100caf5>] do_softirq+0x65/0xa0
> [ 4461.969603] [<ffffffff8105ef86>] irq_exit+0x96/0xb0
> [ 4461.969603] [<ffffffff8102784b>] smp_apic_timer_interrupt+0x6b/0xa0
> [ 4461.969603] [<ffffffff813cbcd3>] apic_timer_interrupt+0x13/0x20
> [ 4461.969603] <EOI>
> [ 4461.969603] [<ffffffff81012beb>] ? mwait_idle+0x9b/0x2d0
> [ 4461.969603] [<ffffffff81009226>] cpu_idle+0xb6/0x100
> [ 4461.969603] [<ffffffff813a903d>] rest_init+0x91/0xa4
> [ 4461.969603] [<ffffffff81722c32>] start_kernel+0x3ed/0x3fa
> [ 4461.969603] [<ffffffff81722347>]
> x86_64_start_reservations+0x132/0x136
> [ 4461.969603] [<ffffffff81722140>] ? early_idt_handlers+0x140/0x140
> [ 4461.969603] [<ffffffff8172244d>] x86_64_start_kernel+0x102/0x111
> [ 4461.969603] Code: b6 70 10 3b b3 08 01 00 00 0f 8d df 01 00 00 41 8b
> 74 24 28 01 b3 b4 00 00 00 48 8b 70 08 49 89 04 24 49 89 74 24 08 48 8b
> 70 08 <4c> 89 26 0f b6 f2 4c 89 60 08 48 8d 3c 76 48 8d bc fb 90 01 00
> [ 4461.969603] RIP [<ffffffffa019fb90>] sfq_enqueue+0xe0/0x630
> [sch_sfq]
> [ 4461.969603] RSP <ffff88011fc03940>
> [ 4461.969603] CR2: 0000000000000000
> [ 4463.351117] ---[ end trace f04e6b6edad2d731 ]---
> [ 4463.364930] Kernel panic - not syncing: Fatal exception in interrupt
> [ 4463.383963] Pid: 0, comm: swapper Tainted: G D 2.6.39-ARCH
> #1
> [ 4463.403515] Call Trace:
> [ 4463.410847] <IRQ> [<ffffffff813c16b9>] panic+0x9b/0x1a8
> [ 4463.427073] [<ffffffff8100e322>] oops_end+0xe2/0xf0
> [ 4463.441944] [<ffffffff813c117f>] no_context+0x204/0x213
> [ 4463.457857] [<ffffffff81346928>] ? ip_output+0xc8/0xe0
> [ 4463.473509] [<ffffffff813c1317>] __bad_area_nosemaphore+0x189/0x1ac
> [ 4463.492542] [<ffffffffa01b4c4b>] ?
> igb_xmit_frame_ring_adv+0x1fb/0xc30 [igb]
> [ 4463.513913] [<ffffffff813c1348>] bad_area_nosemaphore+0xe/0x10
> [ 4463.531645] [<ffffffff81036329>] do_page_fault+0x3c9/0x4b0
> [ 4463.548337] [<ffffffff812068e4>] ? timerqueue_add+0x74/0xc0
> [ 4463.565291] [<ffffffff8107c4d3>] ? enqueue_hrtimer+0x33/0xe0
> [ 4463.582503] [<ffffffff8107d0fe>] ?
> __hrtimer_start_range_ns+0x1be/0x520
> [ 4463.602576] [<ffffffff810b95fd>] ? handle_edge_irq+0x7d/0x120
> [ 4463.620048] [<ffffffff813cae85>] page_fault+0x25/0x30
> [ 4463.635438] [<ffffffffa019fb90>] ? sfq_enqueue+0xe0/0x630 [sch_sfq]
> [ 4463.654471] [<ffffffffa0165080>] htb_enqueue+0xb0/0x3c0 [sch_htb]
> [ 4463.672983] [<ffffffff8112f0d3>] ?
> __kmalloc_node_track_caller+0x33/0x240
> [ 4463.693576] [<ffffffff813127b3>] dev_queue_xmit+0x1d3/0x680
> [ 4463.710528] [<ffffffff81345300>] ? ip_fragment+0x1d0/0x960
> [ 4463.727221] [<ffffffff81344252>] ip_finish_output2+0x1e2/0x2c0
> [ 4463.744953] [<ffffffff8134534e>] ip_fragment+0x21e/0x960
> [ 4463.761124] [<ffffffff81344070>] ? ip_send_check+0x50/0x50
> [ 4463.777817] [<ffffffff81345d8f>] ip_finish_output+0x27f/0x360
> [ 4463.795289] [<ffffffff81342840>] ? ip_frag_mem+0x10/0x10
> [ 4463.811462] [<ffffffff81346928>] ip_output+0xc8/0xe0
> [ 4463.826592] [<ffffffff8134287f>] ip_forward_finish+0x3f/0x50
> [ 4463.843806] [<ffffffff81342b25>] ip_forward+0x295/0x430
> [ 4463.859719] [<ffffffff81340d61>] ip_rcv_finish+0x131/0x370
> [ 4463.876411] [<ffffffff81303a1a>] ? __alloc_skb+0x4a/0x230
> [ 4463.892842] [<ffffffff8134163e>] ip_rcv+0x21e/0x2f0
> [ 4463.907714] [<ffffffff8130f80a>] __netif_receive_skb+0x30a/0x6c0
> [ 4463.925966] [<ffffffff81011759>] ? read_tsc+0x9/0x20
> [ 4463.941099] [<ffffffff813103bd>] netif_receive_skb+0xad/0xc0
> [ 4463.958310] [<ffffffff8121867c>] ? is_swiotlb_buffer+0x3c/0x50
> [ 4463.976044] [<ffffffff81310d08>] napi_skb_finish+0x48/0x60
> [ 4463.992735] [<ffffffff81310dcd>] napi_gro_receive+0xad/0xc0
> [ 4464.009688] [<ffffffffa01b79b7>] igb_poll+0x8c7/0xd60 [igb]
> [ 4464.026639] [<ffffffff81310699>] net_rx_action+0x149/0x300
> [ 4464.043333] [<ffffffff81011759>] ? read_tsc+0x9/0x20
> [ 4464.058465] [<ffffffff8105ea78>] __do_softirq+0xa8/0x280
> [ 4464.074636] [<ffffffff81089538>] ?
> tick_dev_program_event+0x48/0x110
> [ 4464.093928] [<ffffffff8108961a>] ? tick_program_event+0x1a/0x20
> [ 4464.111920] [<ffffffff813cc51c>] call_softirq+0x1c/0x30
> [ 4464.127833] [<ffffffff8100caf5>] do_softirq+0x65/0xa0
> [ 4464.143225] [<ffffffff8105ef86>] irq_exit+0x96/0xb0
> [ 4464.158098] [<ffffffff8102784b>] smp_apic_timer_interrupt+0x6b/0xa0
> [ 4464.177130] [<ffffffff813cbcd3>] apic_timer_interrupt+0x13/0x20
> [ 4464.195120] <EOI> [<ffffffff81012beb>] ? mwait_idle+0x9b/0x2d0
> [ 4464.213145] [<ffffffff81009226>] cpu_idle+0xb6/0x100
> [ 4464.228274] [<ffffffff813a903d>] rest_init+0x91/0xa4
> [ 4464.243404] [<ffffffff81722c32>] start_kernel+0x3ed/0x3fa
> [ 4464.259837] [<ffffffff81722347>]
> x86_64_start_reservations+0x132/0x136
> [ 4464.279649] [<ffffffff81722140>] ? early_idt_handlers+0x140/0x140
> [ 4464.298161] [<ffffffff8172244d>] x86_64_start_kernel+0x102/0x111
>
>
Ouch, that's an ip_fragment() bug, I am afraid... nothing to do with SFQ
It calls
err = output(skb);
and a bit later does:
skb = frag;
frag = skb->next; // that's completely illegal here!
skb->next = NULL;
I am cooking a patch and will send it in a couple of minutes.
Thanks !
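The follow-up in this thread calls the above a false alarm. One way to
see why the three-line pattern can be harmless is to model the loop in
userspace. The sketch below is a simplified model, not the kernel's
ip_fragment(); struct frag_buf, model_output() and send_fragments() are
made-up names, and the assumption is that output() takes ownership of the
buffer passed to it while the next fragment is still owned by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only the list link and a
 * flag recording whether the buffer was already handed to output(). */
struct frag_buf {
	struct frag_buf *next;
	int sent;
};

/* Model of output(): after this call the caller no longer owns buf. */
static int model_output(struct frag_buf *buf)
{
	buf->sent = 1;
	return 0;
}

/* Send skb, then every buffer on the frag chain. Returns how many times
 * ->next was read from a buffer that had already been sent; 0 means no
 * link was touched after ownership was given away. */
static int send_fragments(struct frag_buf *skb, struct frag_buf *frag,
			  int *count)
{
	int stale_reads = 0;

	*count = 0;
	for (;;) {
		if (model_output(skb))
			break;
		(*count)++;
		if (!frag)
			break;

		skb = frag;           /* next fragment, not sent yet     */
		if (skb->sent)
			stale_reads++;
		frag = skb->next;     /* read the link before sending it */
		skb->next = NULL;     /* detach it from the chain        */
	}
	return stale_reads;
}
```

In this model the ->next that gets read always belongs to the fragment
about to be sent, never to the one output() has just consumed, so the
function reports zero stale reads for any well-formed chain.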
* [PATCH] be2net: hash key for rss-config cmd not set
From: Sathya Perla @ 2011-05-23 12:25 UTC
To: netdev; +Cc: Sathya Perla
Need a random hash key to effectively hash incoming connections into
multiple RX rings.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
drivers/net/benet/be_cmds.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/net/benet/be_cmds.c b/drivers/net/benet/be_cmds.c
index 2463b1c..81654ae 100644
--- a/drivers/net/benet/be_cmds.c
+++ b/drivers/net/benet/be_cmds.c
@@ -1703,7 +1703,8 @@ int be_cmd_rss_config(struct be_adapter *adapter, u8 *rsstable, u16 table_size)
{
struct be_mcc_wrb *wrb;
struct be_cmd_req_rss_config *req;
- u32 myhash[10];
+ u32 myhash[10] = {0x0123, 0x4567, 0x89AB, 0xCDEF, 0x01EF,
+ 0x0123, 0x4567, 0x89AB, 0xCDEF, 0x01EF};
int status;
if (mutex_lock_interruptible(&adapter->mbox_lock))
--
1.7.4
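As background on why the key matters: RSS implementations commonly use
the Toeplitz hash, where the key is slid bit by bit across the input.
Whether this adapter uses exactly that function is an assumption here;
the patch does not say. A minimal sketch, with toeplitz_hash() as an
illustrative name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toeplitz hash over len input bytes. The key must hold at least
 * len + 4 bytes, because a 32-bit window of the key slides one bit
 * to the left for every input bit. Returns 0 if the key is too short. */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t i;
	int bit;

	if (key_len < len + 4)
		return 0;

	/* Start with the first 32 key bits, most significant byte first. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			/* Each set input bit XORs in the current window. */
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window: shift in the next key bit. */
			window <<= 1;
			if (key[i + 4] & (1u << bit))
				window |= 1;
		}
	}
	return hash;
}
```

With an all-zero key the window is always zero, so every input hashes to
0 and all flows would land on the same ring. A key left uninitialized
gives no guarantee about its contents.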
* Re: stateless nat *please* tell me how I'm supposed to use it
From: Erik Slagter @ 2011-05-23 12:12 UTC
To: netdev; +Cc: rpartearroyo
Hi everybody,
I am a little disappointed that nobody can or wants to tell me how
stateless nat is supposed to be used. As no other documentation exists
on this subject, this gives the impression that this knowledge is a
secret.
For people who run into the same problem: I have found the solution,
with help from Rodrigo Partearroyo González. The key is that packet
munging at this level is only useful if performed before routing. As
the (normal) egress qdisc is called only just before handing the packet
to the device, the stateless nat has to be performed by the "ingress"
qdisc, so the nat action / filter needs to be attached to a tc ingress
qdisc, which has to be added first. And then it works, e.g.
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 10 u32 \
    match ip src 1.2.3.4 action nat egress 1.2.3.4/32 5.6.7.8
I guess the "pedit" and related actions work alike.
Now I am still wondering what the "tc action" syntax is for.
Erik Slagter.
* Re: [PATCHv2 10/14] virtio_net: limit xmit polling
From: Michael S. Tsirkin @ 2011-05-23 11:19 UTC
To: Rusty Russell
Cc: Krishna Kumar, Carsten Otte, lguest, Shirley Ma, kvm, linux-s390,
netdev, habanero, Heiko Carstens, linux-kernel, virtualization, steved,
Christian Borntraeger, Tom Lendacky, Martin Schwidefsky, linux390
In-Reply-To: <87boyutbjg.fsf-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
On Mon, May 23, 2011 at 11:37:15AM +0930, Rusty Russell wrote:
> On Sun, 22 May 2011 15:10:08 +0300, "Michael S. Tsirkin" <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On Sat, May 21, 2011 at 11:49:59AM +0930, Rusty Russell wrote:
> > > On Fri, 20 May 2011 02:11:56 +0300, "Michael S. Tsirkin" <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > > > Current code might introduce a lot of latency variation
> > > > if there are many pending bufs at the time we
> > > > attempt to transmit a new one. This is bad for
> > > > real-time applications and can't be good for TCP either.
> > >
> > > Do we have more than speculation to back that up, BTW?
> >
> > Need to dig this up: I thought we saw some reports of this on the list?
>
> I think so too, but a reference needs to be here too.
>
> It helps to have exact benchmarks on what's being tested, otherwise we
> risk unexpected interaction with the other optimization patches.
>
> > > > struct sk_buff *skb;
> > > > unsigned int len;
> > > > -
> > > > - while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> > > > + bool c;
> > > > + int n;
> > > > +
> > > > + /* We try to free up at least 2 skbs per one sent, so that we'll get
> > > > + * all of the memory back if they are used fast enough. */
> > > > + for (n = 0;
> > > > + ((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
> > > > + ((skb = virtqueue_get_buf(vi->svq, &len)));
> > > > + ++n) {
> > > > pr_debug("Sent skb %p\n", skb);
> > > > vi->dev->stats.tx_bytes += skb->len;
> > > > vi->dev->stats.tx_packets++;
> > > > dev_kfree_skb_any(skb);
> > > > }
> > > > + return !c;
> > >
> > > This is for() abuse :)
> > >
> > > Why is the capacity check in there at all? Surely it's simpler to try
> > > to free 2 skbs each time around?
> >
> > This is in case we can't use indirect: we want to free up
> > enough buffers for the following add_buf to succeed.
>
> Sure, or we could just count the frags of the skb we're taking out,
> which would be accurate for both cases and far more intuitive.
>
> ie. always try to free up twice as much as we're about to put in.
>
> Can we hit problems with OOM? Sure, but no worse than now...
> The problem is that this "virtqueue_get_capacity()" returns the worst
> case, not the normal case. So using it is deceptive.
>
Maybe just document this?
I still believe capacity really needs to be decided
at the virtqueue level, not in the driver.
E.g. with indirect each skb uses a single entry: freeing
1 small skb is always enough to have space for a large one.
I do understand how it seems a waste to leave direct space
in the ring while we might in practice have space
due to indirect. Didn't come up with a nice way to
solve this yet - but 'no worse than now :)'
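For reference, the reclaim loop from the quoted patch can be written
without folding everything into the for() header. The sketch below is a
userspace model only: struct model_vq, model_get_buf() and
free_old_xmit() are hypothetical, free_slots stands in for what
virtqueue_get_capacity() would report, and each buffer is assumed to
occupy exactly one slot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the send virtqueue. */
struct model_vq {
	unsigned int used;        /* completed buffers not yet reclaimed */
	unsigned int free_slots;  /* models virtqueue_get_capacity()     */
};

/* Reclaim one completed buffer; returns false when none is pending. */
static bool model_get_buf(struct model_vq *vq)
{
	if (vq->used == 0)
		return false;
	vq->used--;
	vq->free_slots++;
	return true;
}

/* Reclaim until capacity slots are free and at least min_free buffers
 * were reclaimed, or until nothing is left to reclaim. Returns true
 * when the capacity target was met; *freed gets the reclaim count. */
static bool free_old_xmit(struct model_vq *vq, unsigned int capacity,
			  unsigned int min_free, unsigned int *freed)
{
	unsigned int n = 0;

	while (vq->free_slots < capacity || n < min_free) {
		if (!model_get_buf(vq))
			break;
		n++;
	}
	*freed = n;
	return vq->free_slots >= capacity;
}
```

Calling free_old_xmit(&vq, capacity, 2, &freed) mirrors the "at least 2
per one sent" rule from the patch comment. The return value plays the
role of !c in the quoted code.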
> > I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
> > sure we have enough space in the buffer. Another way to do
> > that is with a define :).
>
> To do this properly, we should really be using the actual number of sg
> elements needed, but we'd have to do most of xmit_skb beforehand so we
> know how many.
>
> Cheers,
> Rusty.
Maybe I'm confused here. The problem isn't the failing
add_buf for the given skb IIUC. What we are trying to do here is stop
the queue *before xmit_skb fails*. We can't look at the
number of fragments in the current skb - the next one can be
much larger. That's why we check capacity after xmit_skb,
not before it, right?
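To make the ordering concrete, here is a small userspace model of
"check capacity after xmit". Everything in it is an assumption made for
illustration: struct model_txq and model_xmit() are invented names, and
the fragment limit of 18 is only a plausible value for MAX_SKB_FRAGS,
not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed worst case for one packet: 2 + MAX_SKB_FRAGS descriptors.
 * The value 18 is an assumption of this sketch. */
#define MODEL_MAX_SKB_FRAGS 18u
#define MODEL_WORST_CASE    (2u + MODEL_MAX_SKB_FRAGS)

struct model_txq {
	unsigned int free_slots;  /* descriptors still available         */
	bool stopped;             /* set once the stack must stop sending */
};

/* Queue a packet needing slots descriptors. Returns false if it did
 * not fit, which is the failure the post-xmit check tries to prevent. */
static bool model_xmit(struct model_txq *q, unsigned int slots)
{
	if (q->stopped || slots > q->free_slots)
		return false;
	q->free_slots -= slots;

	/* The size of the next packet is unknown, so compare what is left
	 * against the worst case rather than against this packet. */
	if (q->free_slots < MODEL_WORST_CASE)
		q->stopped = true;
	return true;
}
```

Because the queue stops as soon as fewer than the worst-case number of
slots remain, a packet accepted while the queue is running always fits,
whatever its fragment count.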
--
MST
* Re: [PATCH 1/3] vlan: Do not support clearing VLAN_FLAG_REORDER_HDR
From: Jiri Pirko @ 2011-05-23 10:43 UTC
To: Eric W. Biederman
Cc: Changli Gao, Ben Greear, David Miller, Nicolas de Pesloüan,
netdev, shemminger, kaber, fubar, eric.dumazet, andy, Jesse Gross
In-Reply-To: <m139k57nzx.fsf@fess.ebiederm.org>
Mon, May 23, 2011 at 11:41:22AM CEST, ebiederm@xmission.com wrote:
>Changli Gao <xiaosuo@gmail.com> writes:
>
>> On Mon, May 23, 2011 at 9:45 AM, Eric W. Biederman
>> <ebiederm@xmission.com> wrote:
>>>> On the other hand, is there a specification which defines the
>>>> hw-accel-vlan-rx?
>>>
>>> I don't know.
>>>
>>> I have just been trying to clean up the mess since some of the
>>> hw-accel-vlan code broke my use case, by delivering packets with
>>> priority but no vlan (aka vlan 0 packets) twice to my pf_packet sockets.
>>>
>>
>> OK. But if we have decided to simulate the hw-accel-vlan-rx, I think
>> we'd better adjust the place where we put the emulation code. The very
>> beginnings of netif_rx() and netif_receive_skb() are better. Then rps
>> can support vlan packets without any change.
>
>That sounds nice. Patches are welcome.
>
>In principle it should be doable with some code motion. I don't think
>moving vlan_untag earlier constitutes a bug fix.
I do not think that is doable. Consider multi-tagged packets. The place
just after "another_round" takes care of that.
Btw, what's the rationale for moving untag to an earlier position?
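To illustrate the multi-tag case: a frame can carry several stacked
802.1Q tags, so stripping has to be repeated once per tag. The userspace
sketch below counts the tags in a raw frame. count_vlan_tags() is a
made-up helper, and it only recognizes the 0x8100 TPID; it is not a
model of the kernel's receive path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_8021Q 0x8100u
#define ETH_ADDRS_LEN   12u   /* destination + source MAC         */
#define VLAN_TAG_LEN    4u    /* TPID (2 bytes) + TCI (2 bytes)   */

/* Count stacked 802.1Q tags in a raw Ethernet frame and store the
 * ethertype found after the last tag in *inner_type. Returns -1 if
 * the frame is too short for the headers it claims to carry. */
static int count_vlan_tags(const uint8_t *frame, size_t len,
			   uint16_t *inner_type)
{
	size_t off = ETH_ADDRS_LEN;
	int tags = 0;

	for (;;) {
		uint16_t type;

		if (len < off + 2)
			return -1;
		type = (uint16_t)((frame[off] << 8) | frame[off + 1]);
		if (type != ETHERTYPE_8021Q) {
			*inner_type = type;
			return tags;
		}
		/* One more tag: skip it and look again, one round per tag. */
		if (len < off + VLAN_TAG_LEN)
			return -1;
		off += VLAN_TAG_LEN;
		tags++;
	}
}
```

A single untag step placed before the loop point would handle only the
outermost tag of such a frame; each further tag needs another pass.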
>
>In my investigation earlier I found a non-trivial number of paths into
>__netif_receive_skb. So it was not clear to me in the slightest how to
>move the check earlier without modifying every networking driver and a
>few other pieces of code.
>
>Why should receive packet steering be affected by vlan tags at all?
>
>Eric
>
* 2.6.38.x, 2.6.39 sfq? kernel panic in sfq_enqueue
From: Denys Fedoryshchenko @ 2011-05-23 10:01 UTC
To: netdev, hadi
This report is not mine; I am just helping to forward the information
to netdev.
Arch Linux, x86_64, NAS for PPTP (with PPTP "acceleration" enabled)
If any other info is required, please let me know.
Here is the panic message:
[ 4461.966303] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 4461.969603] IP: [<ffffffffa019fb90>] sfq_enqueue+0xe0/0x630
[sch_sfq]
[ 4461.969603] PGD 1179a0067 PUD 1160fa067 PMD 0
[ 4461.969603] Oops: 0002 [#1] PREEMPT SMP
[ 4461.969603] last sysfs file: /sys/devices/virtual/net/ppp41/uevent
[ 4461.969603] CPU 0
[ 4461.969603] Modules linked in: act_police sch_ingress cls_u32
sch_sfq sch_htb xt_TCPMSS xt_tcpudp iptable_filter ip_tables x_tables
igb l2tp_ppp l2tp_netlink l2tp_core pptp pppox ppp_generic slhc gre
bonding i2c_i801 firewire_ohci psmouse uhci_hcd iTCO_wdt button evdev
firewire_core processor intel_agp intel_gtt i2c_core asus_atk0110 dca
iTCO_vendor_support ehci_hcd pcspkr serio_raw sg usbcore crc_itu_t ipv6
ext2 mbcache sd_mod ahci libahci pata_jmicron pata_acpi libata scsi_mod
[ 4461.969603]
[ 4461.969603] Pid: 0, comm: swapper Not tainted 2.6.39-ARCH #1 System
manufacturer System Product Name/P5QD TURBO
[ 4461.969603] RIP: 0010:[<ffffffffa019fb90>] [<ffffffffa019fb90>]
sfq_enqueue+0xe0/0x630 [sch_sfq]
[ 4461.969603] RSP: 0018:ffff88011fc03940 EFLAGS: 00010206
[ 4461.969603] RAX: ffff8801172a7d08 RBX: ffff8801172a7000 RCX:
ffff8801172a7100
[ 4461.969603] RDX: 000000000000007b RSI: 0000000000000000 RDI:
ffff8801172a7d08
[ 4461.969603] RBP: ffff88011fc03980 R08: ffff8801170e8ac0 R09:
0000000000000007
[ 4461.969603] R10: 0000000000000001 R11: ffffc900110a1000 R12:
ffff8801172a7d08
[ 4461.969603] R13: 0000000017cc394b R14: 00000000a54672c3 R15:
ffff880117f8109c
[ 4461.969603] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000)
knlGS:0000000000000000
[ 4461.969603] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 4461.969603] CR2: 0000000000000000 CR3: 00000001171b2000 CR4:
00000000000406f0
[ 4461.969603] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 4461.969603] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 4461.969603] Process swapper (pid: 0, threadinfo ffffffff81600000,
task ffffffff8169b020)
[ 4461.969603] Stack:
[ 4461.969603] ffff88011fc039a0 ffff880117787d80 ffff88011fc03980
ffff880117f81000
[ 4461.969603] ffff880117b0f400 ffff8801172a7d08 ffff8801172a7000
ffff880117f8109c
[ 4461.969603] ffff88011fc039d0 ffffffffa0165080 ffff880114016f00
ffff880114016f00
[ 4461.969603] Call Trace:
[ 4461.969603] <IRQ>
[ 4461.969603] [<ffffffffa0165080>] htb_enqueue+0xb0/0x3c0 [sch_htb]
[ 4461.969603] [<ffffffff8112f0d3>] ?
__kmalloc_node_track_caller+0x33/0x240
[ 4461.969603] [<ffffffff813127b3>] dev_queue_xmit+0x1d3/0x680
[ 4461.969603] [<ffffffff81345300>] ? ip_fragment+0x1d0/0x960
[ 4461.969603] [<ffffffff81344252>] ip_finish_output2+0x1e2/0x2c0
[ 4461.969603] [<ffffffff8134534e>] ip_fragment+0x21e/0x960
[ 4461.969603] [<ffffffff81344070>] ? ip_send_check+0x50/0x50
[ 4461.969603] [<ffffffff81345d8f>] ip_finish_output+0x27f/0x360
[ 4461.969603] [<ffffffff81342840>] ? ip_frag_mem+0x10/0x10
[ 4461.969603] [<ffffffff81346928>] ip_output+0xc8/0xe0
[ 4461.969603] [<ffffffff8134287f>] ip_forward_finish+0x3f/0x50
[ 4461.969603] [<ffffffff81342b25>] ip_forward+0x295/0x430
[ 4461.969603] [<ffffffff81340d61>] ip_rcv_finish+0x131/0x370
[ 4461.969603] [<ffffffff81303a1a>] ? __alloc_skb+0x4a/0x230
[ 4461.969603] [<ffffffff8134163e>] ip_rcv+0x21e/0x2f0
[ 4461.969603] [<ffffffff8130f80a>] __netif_receive_skb+0x30a/0x6c0
[ 4461.969603] [<ffffffff81011759>] ? read_tsc+0x9/0x20
[ 4461.969603] [<ffffffff813103bd>] netif_receive_skb+0xad/0xc0
[ 4461.969603] [<ffffffff8121867c>] ? is_swiotlb_buffer+0x3c/0x50
[ 4461.969603] [<ffffffff81310d08>] napi_skb_finish+0x48/0x60
[ 4461.969603] [<ffffffff81310dcd>] napi_gro_receive+0xad/0xc0
[ 4461.969603] [<ffffffffa01b79b7>] igb_poll+0x8c7/0xd60 [igb]
[ 4461.969603] [<ffffffff81310699>] net_rx_action+0x149/0x300
[ 4461.969603] [<ffffffff81011759>] ? read_tsc+0x9/0x20
[ 4461.969603] [<ffffffff8105ea78>] __do_softirq+0xa8/0x280
[ 4461.969603] [<ffffffff81089538>] ?
tick_dev_program_event+0x48/0x110
[ 4461.969603] [<ffffffff8108961a>] ? tick_program_event+0x1a/0x20
[ 4461.969603] [<ffffffff813cc51c>] call_softirq+0x1c/0x30
[ 4461.969603] [<ffffffff8100caf5>] do_softirq+0x65/0xa0
[ 4461.969603] [<ffffffff8105ef86>] irq_exit+0x96/0xb0
[ 4461.969603] [<ffffffff8102784b>] smp_apic_timer_interrupt+0x6b/0xa0
[ 4461.969603] [<ffffffff813cbcd3>] apic_timer_interrupt+0x13/0x20
[ 4461.969603] <EOI>
[ 4461.969603] [<ffffffff81012beb>] ? mwait_idle+0x9b/0x2d0
[ 4461.969603] [<ffffffff81009226>] cpu_idle+0xb6/0x100
[ 4461.969603] [<ffffffff813a903d>] rest_init+0x91/0xa4
[ 4461.969603] [<ffffffff81722c32>] start_kernel+0x3ed/0x3fa
[ 4461.969603] [<ffffffff81722347>]
x86_64_start_reservations+0x132/0x136
[ 4461.969603] [<ffffffff81722140>] ? early_idt_handlers+0x140/0x140
[ 4461.969603] [<ffffffff8172244d>] x86_64_start_kernel+0x102/0x111
[ 4461.969603] Code: b6 70 10 3b b3 08 01 00 00 0f 8d df 01 00 00 41 8b 74 24 28 01 b3 b4 00 00 00 48 8b 70 08 49 89 04 24 49 89 74 24 08 48 8b 70 08 <4c> 89 26 0f b6 f2 4c 89 60 08 48 8d 3c 76 48 8d bc fb 90 01 00
[ 4461.969603] RIP [<ffffffffa019fb90>] sfq_enqueue+0xe0/0x630 [sch_sfq]
[ 4461.969603] RSP <ffff88011fc03940>
[ 4461.969603] CR2: 0000000000000000
[ 4463.351117] ---[ end trace f04e6b6edad2d731 ]---
[ 4463.364930] Kernel panic - not syncing: Fatal exception in interrupt
[ 4463.383963] Pid: 0, comm: swapper Tainted: G D 2.6.39-ARCH #1
[ 4463.403515] Call Trace:
[ 4463.410847] <IRQ> [<ffffffff813c16b9>] panic+0x9b/0x1a8
[ 4463.427073] [<ffffffff8100e322>] oops_end+0xe2/0xf0
[ 4463.441944] [<ffffffff813c117f>] no_context+0x204/0x213
[ 4463.457857] [<ffffffff81346928>] ? ip_output+0xc8/0xe0
[ 4463.473509] [<ffffffff813c1317>] __bad_area_nosemaphore+0x189/0x1ac
[ 4463.492542] [<ffffffffa01b4c4b>] ? igb_xmit_frame_ring_adv+0x1fb/0xc30 [igb]
[ 4463.513913] [<ffffffff813c1348>] bad_area_nosemaphore+0xe/0x10
[ 4463.531645] [<ffffffff81036329>] do_page_fault+0x3c9/0x4b0
[ 4463.548337] [<ffffffff812068e4>] ? timerqueue_add+0x74/0xc0
[ 4463.565291] [<ffffffff8107c4d3>] ? enqueue_hrtimer+0x33/0xe0
[ 4463.582503] [<ffffffff8107d0fe>] ? __hrtimer_start_range_ns+0x1be/0x520
[ 4463.602576] [<ffffffff810b95fd>] ? handle_edge_irq+0x7d/0x120
[ 4463.620048] [<ffffffff813cae85>] page_fault+0x25/0x30
[ 4463.635438] [<ffffffffa019fb90>] ? sfq_enqueue+0xe0/0x630 [sch_sfq]
[ 4463.654471] [<ffffffffa0165080>] htb_enqueue+0xb0/0x3c0 [sch_htb]
[ 4463.672983] [<ffffffff8112f0d3>] ? __kmalloc_node_track_caller+0x33/0x240
[ 4463.693576] [<ffffffff813127b3>] dev_queue_xmit+0x1d3/0x680
[ 4463.710528] [<ffffffff81345300>] ? ip_fragment+0x1d0/0x960
[ 4463.727221] [<ffffffff81344252>] ip_finish_output2+0x1e2/0x2c0
[ 4463.744953] [<ffffffff8134534e>] ip_fragment+0x21e/0x960
[ 4463.761124] [<ffffffff81344070>] ? ip_send_check+0x50/0x50
[ 4463.777817] [<ffffffff81345d8f>] ip_finish_output+0x27f/0x360
[ 4463.795289] [<ffffffff81342840>] ? ip_frag_mem+0x10/0x10
[ 4463.811462] [<ffffffff81346928>] ip_output+0xc8/0xe0
[ 4463.826592] [<ffffffff8134287f>] ip_forward_finish+0x3f/0x50
[ 4463.843806] [<ffffffff81342b25>] ip_forward+0x295/0x430
[ 4463.859719] [<ffffffff81340d61>] ip_rcv_finish+0x131/0x370
[ 4463.876411] [<ffffffff81303a1a>] ? __alloc_skb+0x4a/0x230
[ 4463.892842] [<ffffffff8134163e>] ip_rcv+0x21e/0x2f0
[ 4463.907714] [<ffffffff8130f80a>] __netif_receive_skb+0x30a/0x6c0
[ 4463.925966] [<ffffffff81011759>] ? read_tsc+0x9/0x20
[ 4463.941099] [<ffffffff813103bd>] netif_receive_skb+0xad/0xc0
[ 4463.958310] [<ffffffff8121867c>] ? is_swiotlb_buffer+0x3c/0x50
[ 4463.976044] [<ffffffff81310d08>] napi_skb_finish+0x48/0x60
[ 4463.992735] [<ffffffff81310dcd>] napi_gro_receive+0xad/0xc0
[ 4464.009688] [<ffffffffa01b79b7>] igb_poll+0x8c7/0xd60 [igb]
[ 4464.026639] [<ffffffff81310699>] net_rx_action+0x149/0x300
[ 4464.043333] [<ffffffff81011759>] ? read_tsc+0x9/0x20
[ 4464.058465] [<ffffffff8105ea78>] __do_softirq+0xa8/0x280
[ 4464.074636] [<ffffffff81089538>] ? tick_dev_program_event+0x48/0x110
[ 4464.093928] [<ffffffff8108961a>] ? tick_program_event+0x1a/0x20
[ 4464.111920] [<ffffffff813cc51c>] call_softirq+0x1c/0x30
[ 4464.127833] [<ffffffff8100caf5>] do_softirq+0x65/0xa0
[ 4464.143225] [<ffffffff8105ef86>] irq_exit+0x96/0xb0
[ 4464.158098] [<ffffffff8102784b>] smp_apic_timer_interrupt+0x6b/0xa0
[ 4464.177130] [<ffffffff813cbcd3>] apic_timer_interrupt+0x13/0x20
[ 4464.195120] <EOI> [<ffffffff81012beb>] ? mwait_idle+0x9b/0x2d0
[ 4464.213145] [<ffffffff81009226>] cpu_idle+0xb6/0x100
[ 4464.228274] [<ffffffff813a903d>] rest_init+0x91/0xa4
[ 4464.243404] [<ffffffff81722c32>] start_kernel+0x3ed/0x3fa
[ 4464.259837] [<ffffffff81722347>] x86_64_start_reservations+0x132/0x136
[ 4464.279649] [<ffffffff81722140>] ? early_idt_handlers+0x140/0x140
[ 4464.298161] [<ffffffff8172244d>] x86_64_start_kernel+0x102/0x111
Here is the shaper code:
#!/bin/sh
#ABillS %DATE% %TIME%
#
# When the ppp link comes up, this script is called with the following
# parameters
# $1 the interface name used by pppd (e.g. ppp3)
# $2 the tty device name
# $3 the tty device speed
# $4 the local IP address for the interface
# $5 the remote IP address
# $6 the parameter specified by the 'ipparam' option to pppd
#
AWK="/bin/awk"
TC="/usr/sbin/tc"
TCQA=$TC" qdisc add"
TCCA=$TC" class add"
TCFA=$TC" filter add"
TCCR=$TC" class replace"
TCFR=$TC" filter replace"
OUTPUT=$1
if [ "$OUTPUT" = "" ];
then
OUTPUT="$PPP_IFACE"
fi
if [ -f /var/run/radattr.$OUTPUT ]
then
$TC qdisc del dev $OUTPUT root >/dev/null 2>&1
$TC qdisc del dev $OUTPUT ingress >/dev/null 2>&1
DOWNSPEED=`$AWK '/PPPD-Downstream-Speed-Limit/ {print $2}' /var/run/radattr.$OUTPUT`
UPSPEED=`$AWK '/PPPD-Upstream-Speed-Limit/ {print $2}' /var/run/radattr.$OUTPUT`
[ w"$DOWNSPEED" = w ] && DOWNSPEED=0
[ w"$UPSPEED" = w ] && UPSPEED=0
FILTERS=`$AWK '/Filter-Id/ {print $2}' /var/run/radattr.$OUTPUT`
##### speed server->client
if [ "$UPSPEED" != "0" ] ;
then
UBURST=$((DOWNSPEED/8))
UBURSTS=$((DOWNSPEED*2))
$TCQA dev $OUTPUT root handle 1: htb default 20 r2q 1
$TCCA dev $OUTPUT parent 1: classid 1:1 htb rate 100mbit ceil 100mbit
$TCCA dev $OUTPUT parent 1:1 classid 1:10 htb rate ${UPSPEED}kbit ceil ${UBURSTS}kbit burst ${UBURST}k prio 1
$TCCA dev $OUTPUT parent 1:1 classid 1:20 htb rate ${UPSPEED}kbit ceil ${UBURSTS}kbit burst ${UBURST}k prio 2
$TCCA dev $OUTPUT parent 1:1 classid 1:30 htb rate 100mbit ceil 100mbit prio 3
$TCQA dev $OUTPUT parent 1:10 handle 10: sfq perturb 10 quantum 1400
$TCQA dev $OUTPUT parent 1:20 handle 20: sfq perturb 10 quantum 1400
$TCQA dev $OUTPUT parent 1:30 handle 30: sfq perturb 10 quantum 1400
$TCFA dev $OUTPUT parent 1:0 protocol ip prio 5 u32 match ip tos 0x10 0xff flowid 1:30
$TCFA dev $OUTPUT parent 1:0 protocol ip prio 10 u32 match ip tos 0x10 0xff flowid 1:10
$TCFA dev $OUTPUT parent 1:0 protocol ip prio 10 u32 match ip protocol 1 0xff flowid 1:10
$TCFA dev $OUTPUT parent 1: protocol ip prio 10 u32 match ip protocol 6 0xff match u8 0x05 0x0f at 0 match u16 0x0000 0xffc0 at 2 match u8 0x10 0xff at 33 flowid 1:10
fi
##### speed client->server
if [ "$DOWNSPEED" != "0" ] ;
then
DBURST=$((UPSPEED/4))
DBURSTS=$((UPSPEED*2))
$TCQA dev $OUTPUT handle ffff: ingress
$TCFA dev $OUTPUT parent ffff: protocol ip u32 match ip src 0.0.0.0/0 police rate ${DOWNSPEED}kbit burst ${DBURST}kb drop flowid :1
fi
#### Filters
# if [ w$FILTERS != w ] ;
# then
# echo "filters not supported"
# fi
fi
* Re: [PATCH 1/3] vlan: Do not support clearing VLAN_FLAG_REORDER_HDR
From: Eric W. Biederman @ 2011-05-23 9:41 UTC (permalink / raw)
To: Changli Gao
Cc: Ben Greear, David Miller, Jiri Pirko, Nicolas de Pesloüan,
netdev, shemminger, kaber, fubar, eric.dumazet, andy, Jesse Gross
In-Reply-To: <BANLkTi=BN6Aza7eU=D+WqJgVsVuswT5PDg@mail.gmail.com>
Changli Gao <xiaosuo@gmail.com> writes:
> On Mon, May 23, 2011 at 9:45 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>>> On the other hand, is there a specification which defines the
>>> hw-accel-vlan-rx?
>>
>> I don't know.
>>
>> I have just been trying to clean up the mess since some of the
>> hw-accel-vlan code broke my use case, by delivering packets with
>> priority but no vlan (aka vlan 0 packets) twice to my pf_packet sockets.
>>
>
> OK. But if we have decided to simulate the hw-accel-vlan-rx, I think
> we'd better adjust the place where we put the emulation code. The very
> beginnings of netif_rx() and netif_receive_skb() are better. Then rps
> can support vlan packets without any change.
That sounds nice. Patches are welcome.
In principle it should be doable with some code motion. I don't think
moving vlan_untag earlier constitutes a bug fix.
In my investigation earlier I found a non-trivial number of paths into
__netif_receive_skb. So it was not clear to me in the slightest how to
move the check earlier without modifying every networking driver and a
few other pieces of code.
Why should receive packet steering be affected by vlan tags at all?
Eric
* Re: [v3 00/39] faster tree-based sysctl implementation
From: Eric W. Biederman @ 2011-05-23 9:32 UTC (permalink / raw)
To: Lucian Adrian Grijincu
Cc: linux-kernel, netdev, Alexey Dobriyan, Octavian Purdila,
David S . Miller
In-Reply-To: <BANLkTinqX_=TLU3TuKJFFTJBBxm1scZ3Ew@mail.gmail.com>
Lucian Adrian Grijincu <lucian.grijincu@gmail.com> writes:
> On Mon, May 23, 2011 at 7:27 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> This patchset looks like it is deserving of some close scrutiny, and
>> not just the high level design overview I have given the previous
>> patches. This is going to be a busy week for me so I probably won't
>> get through all of the patches for a while.
>
>
> I have one more question. The current implementation uses a single
> sysctl_lock to synchronize all changes to the data structures.
>
> In my algorithm I change a few places to use a per-header read-write
> lock. Even though the code is organized to handle a per-header rwlock,
> the implementation uses a single global rwlock. In v2 I got rid of the
> rwlock and replaced the subdirs/files regular lists with rcu-protected
> lists and that's why I did not bother giving each header a rwlock.
>
>
> I have no idea how to use rcu with rbtree. Should I now give each
> header its own lock to reduce contention?
I would only walk down that path if we can find some profile data
showing that the lock is where we are hot.
> I'm asking this because I don't know why there is only a global sysctl
> spin lock, when multiple locks could have been used, each to protect
> its own domain of values.
Mostly it is simplicity. There is also the fact that the spin lock is
used in the implementation of something that is essentially a
reader/writer lock already.
With the help of the reference counts we block when we are unregistering
until there are no more users.
In that context I'm not certain I am comfortable with separating proc
inode usage from other proc usage. But I haven't read through that
section of your code well enough yet to tell if you are making sense.
One of the things that would be very nice to do is add lockdep
annotations like I have to sysfs_activate and sysfs_deactivate, so we
can catch the all too common case of someone unregistering a sysctl
table when there are problems.
Personally I'm not happy with the state of the locking abstractions in
sysctl today. It is all much too obscure, and there are too few
warnings. However for your set of changes I think the thing to focus
on is getting sysctl to better data structures so that it can scale.
Once the data structures are simple enough any remaining issues should
be fixable with small straight forward patches.
Eric
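A minimal sketch of the reference-count scheme described above, where an unregister has to wait until there are no more users. This is a single-threaded userspace model only: the struct and function names are illustrative assumptions, and the real code does its bookkeeping under sysctl_lock and blocks instead of returning a flag.

```c
#include <assert.h>

/* Illustrative model of a table header with a user count. */
struct table_header {
	int used;          /* number of active users */
	int unregistering; /* set once unregistration has started */
};

/* Take a reference. Fails (returns 0) once unregistration has begun. */
int use_table(struct table_header *h)
{
	if (h->unregistering)
		return 0;
	h->used++;
	return 1;
}

/* Drop a reference taken with use_table(). */
void unuse_table(struct table_header *h)
{
	h->used--;
}

/*
 * Begin unregistration. Returns 1 if teardown may proceed right away,
 * 0 if there are still users and the caller would have to wait.
 */
int start_unregister(struct table_header *h)
{
	h->unregistering = 1;
	return h->used == 0;
}
```

In this model new users are refused as soon as unregistration starts, so the count can only drain toward zero.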
* Re: [PATCH 1/3] vlan: Do not support clearing VLAN_FLAG_REORDER_HDR
From: Eric W. Biederman @ 2011-05-23 9:00 UTC (permalink / raw)
To: Ben Greear
Cc: David Miller, Jiri Pirko, Nicolas de Pesloüan, Changli Gao,
netdev, shemminger, kaber, fubar, eric.dumazet, andy, Jesse Gross
In-Reply-To: <4DD9F81D.6070806@candelatech.com>
Ben Greear <greearb@candelatech.com> writes:
> On 05/22/2011 03:38 PM, Eric W. Biederman wrote:
>> Ben Greear<greearb@candelatech.com> writes:
>>
>>> On 05/22/2011 12:39 PM, Eric W. Biederman wrote:
>>>>
>>>> Simplify the vlan handling code by not supporting clearing of
>>>> VLAN_FLAG_REORDER_HDR. Which means we always make the vlan handling
>>>> code strip the vlan header from the packets, and always insert the vlan
>>>> header when transmitting packets.
>>>>
>>>> Not stripping the vlan header has always been broken in combination with
>>>> vlan hardware acceleration. Now that we are making everything look
>>>> like accelerated vlan handling not stripping the vlan header is always
>>>> broken.
>>>>
>>>> I don't think anyone actually cares so simply stop supporting the broken
>>>> case.
>>>
>>> I've lost track of the VLAN code a bit. Is there any documentation
>>> somewhere about what happens in these various cases:
>>
>> Other than the code I don't know about documentation.
>
> These cases are tricky and probably have changed over
> the years. It would be nice to have it written down
> somewhere, even if just in comments somewhere in the
> VLAN code.
>
>>
>>> * Open a raw packet socket on eth0.
>> I assume you mean a pf_packet socket.
>
> Yes.
>
>>> * Do we get tagged VLAN packets? (I'd expect yes.)
>> yes.
>>
>>> * If we sent a tagged VLAN packet, it's sent without modification? (I'd expect yes.)
>>> ** Without "yes" to the two above, one cannot do user-space bridging properly.
>>
>> This is sort of. If you set the PACKET_AUXDATA option and use recv_cmsg
>> you get the priority and the vlan identifier in the auxdata.
>>
>> I think that is a pretty horrible answer myself but it has been that way
>> since sometime mid 2008. So I'm not immediately prepared to call it
>> a regression, or a bug.
>
> I believe we have been getting tagged VLAN packets properly
> in our test cases. We would not be creating any VLAN devices
> in this case, so perhaps the NIC isn't doing any stripping.
>
> To me, it seems like we should get the fully tagged packet
> without having to go muck with aux-data, though it would
> be fine if it were *also* in aux-data.
Given that pf_packet is a portable interface that works on multiple OS's
I tend to agree. Certainly my users would be happier if they don't
have to change their code and not having to change tcpdump would
also be nice.
I'm not certain exactly where in the code it makes sense to put the
vlan header back on for pf_packet sockets. The simplest thing would
be just before we run the socket filter. If we don't do the simplest
thing this raises the question how do we avoid breaking socket filters
that look at the packet data and know there is going to be a vlan
header there.
Still the current situation is better than seeing vlan 0 tagged packets
twice.
My gut feel says if we can cheaply get the socket filters to act as if
they see the vlan tag (where the vlan tag belongs) we should not actually
put the vlan tag back on until we copy the packet to userspace.
It would be nice to have the perspective of someone who cares about this
and whose hardware supports the vlan tagging optimizations.
> I'll try to test this again this coming week to make sure
> it's working like I think it is.
Thanks.
Eric
* Re: [PATCH] net: ping: cleanups ping_v4_unhash()
From: Eric Dumazet @ 2011-05-23 9:00 UTC (permalink / raw)
To: Vasiliy Kulikov; +Cc: David Miller, netdev
In-Reply-To: <20110523084308.GA6244@albatros>
On Monday 23 May 2011 at 12:43 +0400, Vasiliy Kulikov wrote:
> On Mon, May 23, 2011 at 10:23 +0200, Eric Dumazet wrote:
> > net/ipv4/ping.c: In function ‘ping_v4_unhash’:
> > net/ipv4/ping.c:140:28: warning: variable ‘hslot’ set but not used
> >
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> Acked-by: Vasiliy Kulikov <segoon@openwall.com>
>
> hslot was used for debugging purposes here.
>
>
> BTW, what gcc version do you use? I have no warning with 4.4.3
> (Ubuntu 4.4.3-4ubuntu5):
>
> $ make net/ipv4/ping.o
> CHK include/linux/version.h
> CHK include/generated/utsrelease.h
> CALL scripts/checksyscalls.sh
> CC net/ipv4/ping.o
> $
>
I got this warning with gcc-4.6.0 on a 32bit (x86) host
* Re: [PATCH] net: ping: cleanups ping_v4_unhash()
From: Vasiliy Kulikov @ 2011-05-23 8:43 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1306138981.2869.2.camel@edumazet-laptop>
On Mon, May 23, 2011 at 10:23 +0200, Eric Dumazet wrote:
> net/ipv4/ping.c: In function ‘ping_v4_unhash’:
> net/ipv4/ping.c:140:28: warning: variable ‘hslot’ set but not used
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
hslot was used for debugging purposes here.
BTW, what gcc version do you use? I have no warning with 4.4.3
(Ubuntu 4.4.3-4ubuntu5):
$ make net/ipv4/ping.o
CHK include/linux/version.h
CHK include/generated/utsrelease.h
CALL scripts/checksyscalls.sh
CC net/ipv4/ping.o
$
Thanks,
--
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments
* [PATCH] ipv6: xfrm6: fix dubious code
From: Eric Dumazet @ 2011-05-23 8:42 UTC (permalink / raw)
To: David Miller; +Cc: netdev
net/ipv6/xfrm6_tunnel.c: In function ‘xfrm6_tunnel_rcv’:
net/ipv6/xfrm6_tunnel.c:244:53: warning: the omitted middle operand
in ?: will always be ‘true’, suggest explicit middle operand
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/ipv6/xfrm6_tunnel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c
index a6770a0..fb9b0c3 100644
--- a/net/ipv6/xfrm6_tunnel.c
+++ b/net/ipv6/xfrm6_tunnel.c
@@ -241,7 +241,7 @@ static int xfrm6_tunnel_rcv(struct sk_buff *skb)
__be32 spi;
spi = xfrm6_tunnel_spi_lookup(net, (const xfrm_address_t *)&iph->saddr);
- return xfrm6_rcv_spi(skb, IPPROTO_IPV6, spi) > 0 ? : 0;
+ return xfrm6_rcv_spi(skb, IPPROTO_IPV6, spi) > 0 ? 1 : 0;
}
static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
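A small sketch of why the change above should not alter behavior. With the GNU omitted-middle-operand extension, "a ? : b" yields the value of the condition itself when it is non-zero, and here the condition is the comparison "... > 0", which is always 0 or 1. The helper names below are made up for illustration, and the first one spells the omitted operand out in standard C rather than using the extension.

```c
#include <assert.h>

/*
 * What "rc > 0 ? : 0" evaluates to: the omitted middle operand is the
 * value of the condition (rc > 0), not rc itself.
 */
int omitted_middle_form(int rc)
{
	int cond = rc > 0;

	return cond ? cond : 0;
}

/* The explicit form used by the patch. */
int explicit_form(int rc)
{
	return rc > 0 ? 1 : 0;
}
```

Both helpers return 1 for any positive input and 0 otherwise, which is why the compiler can say the omitted operand "will always be true".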
* [PATCH] snap: remove one synchronize_net()
From: Eric Dumazet @ 2011-05-23 8:41 UTC (permalink / raw)
To: David Miller; +Cc: netdev
No need to wait for a rcu grace period after list insertion.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/802/psnap.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/802/psnap.c b/net/802/psnap.c
index 21cde8f..db6baf7 100644
--- a/net/802/psnap.c
+++ b/net/802/psnap.c
@@ -147,7 +147,6 @@ struct datalink_proto *register_snap_client(const unsigned char *desc,
out:
spin_unlock_bh(&snap_lock);
- synchronize_net();
return proto;
}
* Re: [PATCH v4 1/1] can: add pruss CAN driver.
From: Marc Kleine-Budde @ 2011-05-23 8:23 UTC (permalink / raw)
To: Oliver Hartkopp
Cc: sachi-EvXpCiN+lbve9wHmmfpqLFaTQe2KTcn/,
davinci-linux-open-source-VycZQUHpC/PFrsHnngEfi1aTQe2KTcn/,
Arnd Bergmann, Subhasish Ghosh, nsekhar-l0cyMroinI0, open list,
CAN NETWORK DRIVERS, Alan Cox,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Netdev-u79uwXL29TY76Z2rM5mHXA, m-watkins-l0cyMroinI0,
Wolfgang Grandegger
In-Reply-To: <4DD9FCFC.10803-fJ+pQTUTwRTk1uMJSBkQmQ@public.gmane.org>
On 05/23/2011 08:21 AM, Oliver Hartkopp wrote:
[...]
> In 'real world' CAN setups you'll never see 21.000 CAN frames per second (and
> therefore 21.000 irqs/s) - you are usually designing CAN network traffic with
> less than 60% busload. So interrupt rates somewhere below 1000 irqs/s can be
> assumed.
>
> From what I've seen so far a 3-4 messages rx FIFO and NAPI support just make it.
>
> @Marc/Wolfgang: Would this be also your recommendation for a CAN controller
> design that supports SocketCAN in the best way?
If you have an rx FIFO, NAPI is the way to go. For a single mailbox it
adds overhead, if you can read the CAN frame in the interrupt handler.
The error messages should probably be generated from NAPI, too. Especially
the I'm-the-only-CAN-node-on-the-net-and-get-no-ACK error message.
However IIRC David said that every new driver should implement NAPI.
> As the Linux network stack supports hardware timestamps too, this could be an
> additional (optional!) feature.
regards, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
* [PATCH] net: ping: cleanups ping_v4_unhash()
From: Eric Dumazet @ 2011-05-23 8:23 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Vasiliy Kulikov
net/ipv4/ping.c: In function ‘ping_v4_unhash’:
net/ipv4/ping.c:140:28: warning: variable ‘hslot’ set but not used
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Vasiliy Kulikov <segoon@openwall.com>
---
net/ipv4/ping.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 1f3bb11..9aaa671 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -137,9 +137,6 @@ static void ping_v4_unhash(struct sock *sk)
struct inet_sock *isk = inet_sk(sk);
pr_debug("ping_v4_unhash(isk=%p,isk->num=%u)\n", isk, isk->inet_num);
if (sk_hashed(sk)) {
- struct hlist_nulls_head *hslot;
-
- hslot = ping_hashslot(&ping_table, sock_net(sk), isk->inet_num);
write_lock_bh(&ping_table.lock);
hlist_nulls_del(&sk->sk_nulls_node);
sock_put(sk);
* [PATCH] e100: Correct firmware memory leak
From: Simon Kagstrom @ 2011-05-23 7:07 UTC (permalink / raw)
To: netdev, e1000-devel
Cc: Jeff Kirsher, Jesse Brandeburg, Bruce Allan, Carolyn Wyborny,
Don Skidmore, Greg Rose, PJ Waskiewicz, Alex Duyck, John Ronciak
kmemleak reports
unreferenced object 0xcfaf4f00 (size 32):
comm "ifconfig", pid 682, jiffies 87369
backtrace:
[<c00252b4>] save_stack_trace+0x20/0x24
[<c00a5f98>] create_object+0x118/0x20c
[<c00a61a8>] kmemleak_alloc+0x40/0x84
[<c00a2de4>] kmem_cache_alloc+0x114/0x1a4
[<c016ce50>] _request_firmware+0x3c/0x540
[<c016d3f8>] request_firmware+0x14/0x18
[<c0170774>] e100_hw_init+0xf0/0x3d8
[<c0171340>] e100_up+0x38/0x16c
[<c0171494>] e100_open+0x20/0x54
[<c019779c>] dev_open+0xcc/0x134
[<c0196cf0>] dev_change_flags+0xb0/0x190
[<c01e0998>] devinet_ioctl+0x2f0/0x6fc
[<c01e1dc4>] inet_ioctl+0xcc/0x104
[<c01861d8>] sock_ioctl+0x200/0x25c
[<c00b4cbc>] vfs_ioctl+0x34/0x78
[<c00b5400>] do_vfs_ioctl+0x4e4/0x53c
when the interface is taken up because the firmware is loaded with
request_firmware, but never released in the callback where it's used,
so fix that.
Problem introduced in
9ac32e1bc0518b01b47dd34a733dce8634a38ed3
as far as I can tell.
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
---
drivers/net/e100.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index b0aa9e6..f2b44ef 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -1320,6 +1320,8 @@ static void e100_setup_ucode(struct nic *nic, struct cb *cb,
cb->u.ucode[min_size] |= cpu_to_le32((BUNDLESMALL) ? 0xFFFF : 0xFF80);
cb->command = cpu_to_le16(cb_ucode | cb_el);
+
+ release_firmware(fw);
}
static inline int e100_load_ucode_wait(struct nic *nic)
--
1.7.0.4
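The pairing rule behind this fix can be sketched in userspace as follows. Everything here is a stand-in: the fake_* helpers and the counter are hypothetical and only model the idea that each successful request must be matched by one release after the image has been consumed. They are not the kernel firmware API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a loaded firmware image. */
struct fake_firmware {
	size_t size;
};

static struct fake_firmware fake_image;

/* Number of requests that have not been released yet. */
int fake_fw_outstanding;

/* Model of "request": hands out the image and counts it. */
int fake_request_firmware(const struct fake_firmware **fw)
{
	fake_fw_outstanding++;
	*fw = &fake_image;
	return 0;
}

/* Model of "release": gives the image back. */
void fake_release_firmware(const struct fake_firmware *fw)
{
	if (fw)
		fake_fw_outstanding--;
}

/*
 * Model of the setup callback. With release_after_use == 0 it behaves
 * like the code before the patch (the request is never released); with
 * release_after_use != 0 it behaves like the patched code.
 */
int setup_ucode(int release_after_use)
{
	const struct fake_firmware *fw;

	if (fake_request_firmware(&fw))
		return -1;

	/* ... the image would be copied into the command block here ... */

	if (release_after_use)
		fake_release_firmware(fw);
	return 0;
}
```

Calling setup_ucode(0) leaves the counter at one, which corresponds to one unreferenced object per interface bring-up; setup_ucode(1) leaves it at zero.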
* [GIT] Networking
From: David Miller @ 2011-05-23 6:54 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
Ok, this gets rid of the need to include linux/prefetch.h in
linux/skbuff.h, thanks largely to Paul Gortmaker.
1) bnx2x bug fixes from Vladislav Zolotarov and Dmitry Kravkov.
2) Unicast frame handling fix in macvlan from David Ward.
3) Dave Jones was hitting ip_rt_bug(), add a backtrace so we
can diagnose it further.
4) Remove some spurious synchronize_{net,rcu}() calls during device
teardown, from Eric Dumazet.
5) netpoll fixes wrt. bridging from Amerigo Wang.
6) CAIF bug fixes from Sjur Brandeland.
Please pull, thanks a lot!
The following changes since commit 71a8638480eb8fb6cfabe2ee9ca3fbc6e3453a14:
Merge branch 'viafb-next' of git://github.com/schandinat/linux-2.6 (2011-05-22 12:39:58 -0700)
are available in the git repository at:
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master
Amerigo Wang (4):
netpoll: disable netpoll when enslave a device
bridge: call NETDEV_JOIN notifiers when add a slave
net: rename NETDEV_BONDING_DESLAVE to NETDEV_RELEASE
rtnetlink: ignore NETDEV_RELEASE and NETDEV_JOIN event
Dave Jones (1):
ipv4: Give backtrace in ip_rt_bug().
David S. Miller (5):
net: Remove prefetches from SKB list handlers.
rionet: Remove pointless printk of skb pointer.
netlabel: Remove prefetches from list handlers.
ipv4: Include linux/prefetch.h in fib_trie.c
net: Remove linux/prefetch.h include from linux/skbuff.h
David Ward (1):
macvlan: Forward unicast frames in bridge mode to lowerdev
Dmitry Kravkov (2):
bnx2x: fix DMAE timeout according to hw specifications
bnx2x: allow device properly initialize after hotplug
Emmanuel Grumbach (1):
net: skb_trim explicitely check the linearity instead of data_len
Eric Dumazet (2):
net: remove synchronize_net() from netdev_set_master()
net: avoid synchronize_rcu() in dev_deactivate_many
Heiko Carstens (1):
net: filter: move forward declarations to avoid compile warnings
Paul Gortmaker (1):
drivers/net: add prefetch header for prefetch users
Vladislav Zolotarov (2):
bnx2x: call dev_kfree_skb_any instead of dev_kfree_skb
bnx2x: properly handle CFC DEL in cnic flow
WANG Cong (2):
pktgen: use vzalloc_node() instead of vmalloc_node() + memset()
pktgen: refactor pg_init() code
sjur.brandeland@stericsson.com (5):
caif: Bugfix add check NULL pointer before calling functions.
caif: Fixes freeze on Link layer removal.
caif: Fix freezes when running CAIF loopback device
caif: Update documentation of CAIF transmit and receive functions.
caif: Plug memory leak for checksum error
drivers/net/benet/be_main.c | 1 +
drivers/net/bna/bnad.c | 1 +
drivers/net/bnx2x/bnx2x_cmn.c | 5 +-
drivers/net/bnx2x/bnx2x_cmn.h | 2 +-
drivers/net/bnx2x/bnx2x_main.c | 72 +++++++++++++----------------------
drivers/net/bonding/bond_main.c | 4 +-
drivers/net/chelsio/sge.c | 1 +
drivers/net/cnic.c | 1 +
drivers/net/cxgb3/sge.c | 1 +
drivers/net/cxgb4/sge.c | 1 +
drivers/net/cxgb4vf/sge.c | 1 +
drivers/net/e1000/e1000_main.c | 1 +
drivers/net/e1000e/netdev.c | 1 +
drivers/net/ehea/ehea_qmr.h | 1 +
drivers/net/enic/enic_main.c | 1 +
drivers/net/forcedeth.c | 1 +
drivers/net/igb/igb_main.c | 1 +
drivers/net/igbvf/netdev.c | 1 +
drivers/net/ixgb/ixgb_main.c | 1 +
drivers/net/ixgbe/ixgbe_main.c | 1 +
drivers/net/ixgbevf/ixgbevf_main.c | 1 +
drivers/net/macvlan.c | 6 +--
drivers/net/myri10ge/myri10ge.c | 1 +
drivers/net/netconsole.c | 26 ++++++++----
drivers/net/pasemi_mac.c | 1 +
drivers/net/pch_gbe/pch_gbe_main.c | 1 +
drivers/net/qla3xxx.c | 1 +
drivers/net/qlge/qlge_main.c | 1 +
drivers/net/r8169.c | 1 +
drivers/net/rionet.c | 4 +-
drivers/net/s2io.c | 1 +
drivers/net/sb1250-mac.c | 1 +
drivers/net/sfc/rx.c | 1 +
drivers/net/skge.c | 1 +
drivers/net/stmmac/stmmac_main.c | 1 +
drivers/net/tc35815.c | 1 +
drivers/net/vxge/vxge-main.c | 1 +
drivers/net/vxge/vxge-traffic.c | 1 +
include/linux/filter.h | 7 ++-
include/linux/notifier.h | 3 +-
include/linux/skbuff.h | 9 ++--
include/net/caif/caif_layer.h | 36 ++++++++++-------
net/bridge/br_if.c | 3 +
net/caif/caif_dev.c | 7 +++-
net/caif/caif_socket.c | 13 ++----
net/caif/cfcnfg.c | 44 +++++++++------------
net/caif/cfctrl.c | 44 +++++++++++++++------
net/caif/cfmuxl.c | 49 +++++++++++++++++++-----
net/core/dev.c | 4 +-
net/core/pktgen.c | 22 ++++++----
net/core/rtnetlink.c | 2 +
net/ipv4/fib_trie.c | 1 +
net/ipv4/route.c | 1 +
net/netlabel/netlabel_addrlist.h | 8 ++--
net/sched/sch_generic.c | 17 +++++++-
55 files changed, 256 insertions(+), 164 deletions(-)
* Re: [PATCH net-next 04/11] bnx2x: Add TX fault check for fiber PHYs
From: Yaniv Rosner @ 2011-05-23 6:46 UTC (permalink / raw)
To: Ben Hutchings
Cc: Yaniv Rosner, davem@davemloft.net, netdev@vger.kernel.org,
Eilon Greenstein
In-Reply-To: <1306128018.3456.35.camel@localhost>
On Sun, 2011-05-22 at 22:20 -0700, Ben Hutchings wrote:
> On Sun, 2011-05-22 at 14:32 +0300, Yaniv Rosner wrote:
> > In case TX fault is detected on Fiber PHYs, declare the link as down
> > until TX fault is gone.
> [...]
> > --- a/drivers/net/bnx2x/bnx2x_reg.h
> > +++ b/drivers/net/bnx2x/bnx2x_reg.h
> > @@ -6037,6 +6037,7 @@ Theotherbitsarereservedandshouldbezero*/
> > #define MDIO_PMA_REG_BCM_CTRL 0x0096
> > #define MDIO_PMA_REG_FEC_CTRL 0x00ab
> > #define MDIO_PMA_REG_RX_ALARM_CTRL 0x9000
> > +#define MDIO_PMA_REG_TX_ALARM_CTRL 0x9001
> > #define MDIO_PMA_REG_LASI_CTRL 0x9002
> > #define MDIO_PMA_REG_RX_ALARM 0x9003
> > #define MDIO_PMA_REG_TX_ALARM 0x9004
>
> By the way, the LASI registers are already named in <linux/mdio.h>:
We will remove those private redundant definitions.
>
> #define MDIO_PMA_LASI_RXCTRL 0x9000 /* RX_ALARM control */
> #define MDIO_PMA_LASI_TXCTRL 0x9001 /* TX_ALARM control */
> #define MDIO_PMA_LASI_CTRL 0x9002 /* LASI control */
> #define MDIO_PMA_LASI_RXSTAT 0x9003 /* RX_ALARM status */
> #define MDIO_PMA_LASI_TXSTAT 0x9004 /* TX_ALARM status */
> #define MDIO_PMA_LASI_STAT 0x9005 /* LASI status */
>
> Ben.
>
Thanks,
Yaniv
* Re: [v3 00/39] faster tree-based sysctl implementation
From: Lucian Adrian Grijincu @ 2011-05-23 6:37 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-kernel, netdev, Alexey Dobriyan, Octavian Purdila,
David S . Miller
In-Reply-To: <m14o4mavod.fsf@fess.ebiederm.org>
On Mon, May 23, 2011 at 7:27 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> This patchset looks like it is deserving of some close scrutiny, and
> not just the high level design overview I have given the previous
> patches. This is going to be a busy week for me so I probably won't
> get through all of the patches for a while.
I have one more question. The current implementation uses a single
sysctl_lock to synchronize all changes to the data structures.
In my algorithm I change a few places to use a per-header read-write
lock. Even though the code is organized to handle a per-header rwlock,
the implementation uses a single global rwlock. In v2 I got rid of the
rwlock and replaced the subdirs/files regular lists with rcu-protected
lists and that's why I did not bother giving each header a rwlock.
I have no idea how to use rcu with rbtree. Should I now give each
header its own lock to reduce contention?
I'm asking this because I don't know why there is only a global sysctl
spin lock, when multiple locks could have been used, each to protect
its own domain of values.
If you'd like to keep locking as simple as possible (to reduce all the
potential problems brought on by too many locks), or if in general
contention is low enough, then a global lock is better. If not, then
I'll change the code to support per-header rwlocks (increasing the
ctl_table_header structure size).
--
.
..: Lucian
* Re: [PATCH v4 1/1] can: add pruss CAN driver.
From: Oliver Hartkopp @ 2011-05-23 6:21 UTC (permalink / raw)
To: Arnd Bergmann
Cc: sachi-EvXpCiN+lbve9wHmmfpqLFaTQe2KTcn/,
davinci-linux-open-source-VycZQUHpC/PFrsHnngEfi1aTQe2KTcn/,
Alan Cox, Subhasish Ghosh, nsekhar-l0cyMroinI0, open list,
CAN NETWORK DRIVERS, Marc Kleine-Budde,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Netdev-u79uwXL29TY76Z2rM5mHXA, m-watkins-l0cyMroinI0,
Wolfgang Grandegger
In-Reply-To: <201105221230.56243.arnd-r2nGTMty4D4@public.gmane.org>
On 22.05.2011 12:30, Arnd Bergmann wrote:
> On Thursday 12 May 2011 16:41:58 Oliver Hartkopp wrote:
>> E.g. assume you need the CAN-IDs 0x100, 0x200 and 0x300 in your application
>> and for that reason you configure these IDs in the pruss CAN driver.
>>
>> What if someone generates a 100% CAN busload exactly on CAN-ID 0x100 then?
>>
>> Worst case (1MBit/s, DLC=0) you would need to handle about 21.000 irqs/s for
>> the correctly received CAN frames with the filtered CAN-ID 0x100 ...
>
> Then I guess the main thing that a "smart" CAN implementation like pruss
> should do is interrupt mitigation. When you have a constant flow of
> packets coming in, the hardware should be able to DMA a lot of
> them into kernel memory before the driver is required to pick them up,
> and only get into interrupt driven mode when the kernel has managed
> to process all outstanding packets.
>
>> This all depends heavily on Linux networking (skb handling, caching, etc) and
>> is pretty fast and optimized!! That was also the reason why it ran on the old
>> PowerPC that smoothly. The mostly seen effect if anything drops is when the
>> application (holding the socket) was not fast enough to handle the incoming
>> data. NB: For that reason we implemented a CAN content filter (CAN_BCM) that
>> is able to do content filtering and timeout monitoring in Kernelspace - all
>> performed in the SoftIRQ.
>
> Right, dropping packets that no process is waiting for should be done as
> early as possible. In pruss-can, the idea was to do it in hardware, which
> doesn't really work all that well for the reasons discussed before.
> Dropping the frames in the NAPI poll function (softirq time) seems like a
> logical choice.
In 'real world' CAN setups you'll never see 21.000 CAN frames per second (and
therefore 21.000 irqs/s) - you are usually designing CAN network traffic with
less than 60% busload. So interrupt rates somewhere below 1000 irqs/s can be
assumed.
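As a rough cross-check of the worst-case figure quoted above, here is a
minimal sketch of the arithmetic. It assumes classic CAN frames with an
11-bit identifier, ignores stuff bits, and assumes one interrupt per
received frame; the function names are made up for illustration and are not
part of any driver.

```c
#include <assert.h>

/*
 * Bits on the wire for one classic CAN data frame with an 11-bit
 * identifier, ignoring stuff bits:
 *   SOF(1) + ID(11) + RTR(1) + IDE(1) + r0(1) + DLC(4) + data(8 * dlc)
 *   + CRC(15) + CRC delimiter(1) + ACK slot(1) + ACK delimiter(1)
 *   + EOF(7) + interframe space(3)
 * The fixed part adds up to 47 bits.
 */
unsigned int can_sff_frame_bits(unsigned int dlc)
{
	assert(dlc <= 8);	/* classic CAN carries at most 8 data bytes */
	return 47 + 8 * dlc;
}

/*
 * Upper bound on frames per second on a fully loaded bus
 * (hypothetical helper, integer division rounds down).
 */
unsigned int can_max_frames_per_sec(unsigned int bitrate, unsigned int dlc)
{
	return bitrate / can_sff_frame_bits(dlc);
}
```

With bitrate = 1000000 and dlc = 0 this gives 21276 frames per second,
which matches the "about 21.000" worst case; with dlc = 8 the bound drops
to 9009. Stuff bits would lower both numbers slightly.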
From what I've seen so far, a 3-4 message rx FIFO and NAPI support just make it.
@Marc/Wolfgang: Would this be also your recommendation for a CAN controller
design that supports SocketCAN in the best way?
As the Linux network stack supports hardware timestamps too, this could be an
additional (optional!) feature.
Regards,
Oliver
>> Having 'Mailboxes' bound to CAN-IDs is something that's useful for 8/16 bit
>> CPUs where an application is tightly bound to the embedded ECUs functionality.
>
> Makes sense.
>
> Arnd
* Re: [PATCH 1/3] vlan: Do not support clearing VLAN_FLAG_REORDER_HDR
From: Ben Greear @ 2011-05-23 6:01 UTC (permalink / raw)
To: Eric W. Biederman
Cc: David Miller, Jiri Pirko, Nicolas de Pesloüan, Changli Gao,
netdev, shemminger, kaber, fubar, eric.dumazet, andy, Jesse Gross
In-Reply-To: <m1wrhigy37.fsf@fess.ebiederm.org>
On 05/22/2011 03:38 PM, Eric W. Biederman wrote:
> Ben Greear<greearb@candelatech.com> writes:
>
>> On 05/22/2011 12:39 PM, Eric W. Biederman wrote:
>>>
>>> Simplify the vlan handling code by not supporting clearing of
>>> VLAN_FLAG_REORDER_HDR. Which means we always make the vlan handling
>>> code strip the vlan header from the packets, and always insert the vlan
>>> header when transmitting packets.
>>>
>>> Not stripping the vlan header has always been broken in combination with
>>> vlan hardware acceleration. Now that we are making everything look
>>> like accelerated vlan handling not stripping the vlan header is always
>>> broken.
>>>
>>> I don't think anyone actually cares so simply stop supporting the broken
>>> case.
>>
>> I've lost track of the VLAN code a bit. Is there any documentation
>> somewhere about what happens in these various cases:
>
> Other than the code I don't know about documentation.
These cases are tricky and probably have changed over
the years. It would be nice to have it written down
somewhere, even if just in comments somewhere in the
VLAN code.
>
>> * Open a raw packet socket on eth0.
> I assume you mean a pf_packet socket.
Yes.
>> * Do we get tagged VLAN packets? (I'd expect yes.)
> yes.
>
>> * If we sent a tagged VLAN packet, it's sent without modification? (I'd expect yes.)
>> ** Without "yes" to the two above, one cannot do user-space bridging properly.
>
> This is sort of. If you set the PACKET_AUXDATA option and use recv_cmsg
> you get the priority and the vlan identifier in the auxdata.
>
> I think that is a pretty horrible answer myself but it has been that way
> since sometime mid 2008. So I'm not immediately prepared to call it
> a regression, or a bug.
I believe we have been getting tagged VLAN packets properly
in our test cases. We would not be creating any VLAN devices
in this case, so perhaps the NIC isn't doing any stripping.
To me, it seems like we should get the fully tagged packet
without having to go muck with aux-data, though it would
be fine if it were *also* in aux-data.
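For reference, the aux-data path mentioned above could look roughly like
the sketch below. It assumes a pf_packet socket and the PACKET_AUXDATA
option with struct tpacket_auxdata from <linux/if_packet.h>; the two
function names are invented for illustration, and error handling is left to
the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Ask the kernel to attach auxdata to every packet received on fd. */
int enable_auxdata(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_PACKET, PACKET_AUXDATA, &one, sizeof(one));
}

/*
 * Walk the control messages of a msghdr filled in by recvmsg() and copy
 * out the VLAN TCI (priority and VLAN identifier) from the auxdata.
 * Returns 1 if a PACKET_AUXDATA message was found, 0 otherwise.
 */
int vlan_tci_from_auxdata(struct msghdr *msg, uint16_t *tci)
{
	struct cmsghdr *cmsg;

	assert(msg && tci);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		struct tpacket_auxdata aux;

		if (cmsg->cmsg_level != SOL_PACKET ||
		    cmsg->cmsg_type != PACKET_AUXDATA ||
		    cmsg->cmsg_len < CMSG_LEN(sizeof(aux)))
			continue;

		/* Copy instead of casting to avoid alignment assumptions. */
		memcpy(&aux, CMSG_DATA(cmsg), sizeof(aux));
		*tci = aux.tp_vlan_tci;
		return 1;
	}
	return 0;
}
```

The caller would set msg_control to a buffer of at least
CMSG_SPACE(sizeof(struct tpacket_auxdata)) bytes before calling recvmsg().
Whether the tag is also still present in the packet data itself is exactly
the open question here, so this sketch only shows the aux-data side.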
I'll try to test this again this coming week to make sure
it's working like I think it is.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [v3 00/39] faster tree-based sysctl implementation
From: Lucian Adrian Grijincu @ 2011-05-23 5:59 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-kernel, netdev, Alexey Dobriyan, Octavian Purdila,
David S . Miller
In-Reply-To: <m14o4mavod.fsf@fess.ebiederm.org>
On Mon, May 23, 2011 at 7:27 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> I will mention a couple of nits I noticed while I was skimming through
> your patches.
> - There can be multiple proc superblocks and thus multiple inodes
> referring to the same /proc/sys file, if there are multiple pid
> namespaces.
OK. I'll revert the patch that converts an int counter into a u8
and post an update. I guess this is what you were referring to:
https://github.com/luciang/linux-2.6-new-sysctl/commit/b7a547b8ce7484ae22eea34a860f674fcb19c11b
> - I have a hope to move /proc/sys into /proc/<pid>/sys so we don't have
> to look at current to determine the namespace we want to display.
> That would allow the deeply magic sysctl_is_seen check to be removed
> from proc_sys_compare. That is not your problem, but it is an
> explanation of why the namespaces are passed through.
OK, I'll go and change things to pass the current namespace as it was
before my changes.
Thank you.
--
.
..: Lucian
* Re: [PATCH net-next 04/11] bnx2x: Add TX fault check for fiber PHYs
From: Ben Hutchings @ 2011-05-23 5:20 UTC (permalink / raw)
To: Yaniv Rosner; +Cc: davem, netdev, eilong
In-Reply-To: <1306063927.20872.86.camel@lb-tlvb-dmitry>
On Sun, 2011-05-22 at 14:32 +0300, Yaniv Rosner wrote:
> In case TX fault is detected on Fiber PHYs, declare the link as down
> until TX fault is gone.
[...]
> --- a/drivers/net/bnx2x/bnx2x_reg.h
> +++ b/drivers/net/bnx2x/bnx2x_reg.h
> @@ -6037,6 +6037,7 @@ Theotherbitsarereservedandshouldbezero*/
> #define MDIO_PMA_REG_BCM_CTRL 0x0096
> #define MDIO_PMA_REG_FEC_CTRL 0x00ab
> #define MDIO_PMA_REG_RX_ALARM_CTRL 0x9000
> +#define MDIO_PMA_REG_TX_ALARM_CTRL 0x9001
> #define MDIO_PMA_REG_LASI_CTRL 0x9002
> #define MDIO_PMA_REG_RX_ALARM 0x9003
> #define MDIO_PMA_REG_TX_ALARM 0x9004
By the way, the LASI registers are already named in <linux/mdio.h>:
#define MDIO_PMA_LASI_RXCTRL 0x9000 /* RX_ALARM control */
#define MDIO_PMA_LASI_TXCTRL 0x9001 /* TX_ALARM control */
#define MDIO_PMA_LASI_CTRL 0x9002 /* LASI control */
#define MDIO_PMA_LASI_RXSTAT 0x9003 /* RX_ALARM status */
#define MDIO_PMA_LASI_TXSTAT 0x9004 /* TX_ALARM status */
#define MDIO_PMA_LASI_STAT 0x9005 /* LASI status */
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.