* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Eric Dumazet @ 2011-10-18 2:30 UTC (permalink / raw)
To: Elmar Vonlanthen; +Cc: linux-kernel, netdev, Timo Teräs, Herbert Xu
In-Reply-To: <CA+0Zf5A-8X5MaXC4+BbLKtMb8S8B-Ln_a3JNfz0jR4O-ruefuw@mail.gmail.com>
On Monday, 17 October 2011 at 09:16 +0200, Elmar Vonlanthen wrote:
> 2011/10/14 Eric Dumazet <eric.dumazet@gmail.com>:
> > Please try following patch :
> >
> > [PATCH] ip_gre: dont increase dev->needed_headroom on a live device
> >
> > It seems ip_gre is able to change dev->needed_headroom on the fly.
> >
> > This is unfortunately not legal and triggers a BUG in raw_sendmsg()
> >
> > skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev)
> >
> > < another CPU changes dev->needed_headroom (making it bigger)
> >
> > ...
> > skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
> >
> > We end up with LL_RESERVED_SPACE() being bigger than LL_ALLOCATED_SPACE()
> > -> we crash later because the skb head is exhausted.
> >
> > Bug introduced in commit 243aad83 in 2.6.34 (ip_gre: include route
> > header_len in max_headroom calculation)
> >
> > Reported-by: Elmar Vonlanthen <evonlanthen@gmail.com>
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > CC: Timo Teräs <timo.teras@iki.fi>
> > CC: Herbert Xu <herbert@gondor.apana.org.au>
> > ---
> > diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> > index 8871067..1505dcf 100644
> > --- a/net/ipv4/ip_gre.c
> > +++ b/net/ipv4/ip_gre.c
> > @@ -835,8 +835,6 @@ static netdev_tx_t ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev
> > if (skb_headroom(skb) < max_headroom || skb_shared(skb)||
> > (skb_cloned(skb) && !skb_clone_writable(skb, 0))) {
> > struct sk_buff *new_skb = skb_realloc_headroom(skb, max_headroom);
> > - if (max_headroom > dev->needed_headroom)
> > - dev->needed_headroom = max_headroom;
> > if (!new_skb) {
> > ip_rt_put(rt);
> > dev->stats.tx_dropped++;
>
> Hello
>
> I tried this patch and I was no longer able to reproduce the kernel
> oops. So the patch solved the bug.
> Thank you very much!
>
> Would it be possible to add the patch to the long term kernel 2.6.35
> as well? Because this is the one I use at the moment in production.
>
Thanks for testing.
If David/Herbert/Timo agree, the patch should find its way into the
current kernel, and then into the stable trees as well.
Thanks
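
For readers following the thread, the race described in the patch can be modeled in userspace. This is only a sketch under simplifying assumptions: struct model_dev, model_ll_space() and model_send() are hypothetical names, and the single helper stands in for both LL_ALLOCATED_SPACE() and LL_RESERVED_SPACE(), leaving out any alignment or tailroom handling the real macros perform.

```c
#include <assert.h>

/* Hypothetical, simplified device: only the fields the race involves. */
struct model_dev {
	unsigned int hard_header_len;
	unsigned int needed_headroom;
};

/* Simplified stand-in for LL_ALLOCATED_SPACE() and LL_RESERVED_SPACE(). */
unsigned int model_ll_space(const struct model_dev *dev)
{
	return dev->hard_header_len + dev->needed_headroom;
}

/*
 * Model of the raw_sendmsg() sequence: size the buffer, let the device
 * headroom change to headroom_at_reserve, then reserve.  Returns the
 * bytes left over after the reserve and the payload; a negative value
 * means the reserve plus payload no longer fit in the allocation.
 */
long model_send(struct model_dev *dev, unsigned int payload,
		unsigned int headroom_at_reserve)
{
	unsigned int allocated = payload + model_ll_space(dev);	/* first read */
	unsigned int reserved;

	dev->needed_headroom = headroom_at_reserve;	/* the other CPU */
	reserved = model_ll_space(dev);			/* second read */

	return (long)allocated - (long)reserved - (long)payload;
}

void model_send_selfcheck(void)
{
	struct model_dev live = { 14, 0 };
	struct model_dev fixed = { 14, 0 };

	/* Headroom grows between the two reads: 28 bytes short. */
	assert(model_send(&live, 100, 28) == -28);
	/* Headroom left alone on a live device: everything fits. */
	assert(model_send(&fixed, 100, 0) == 0);
}
```

In this model the shortfall is exactly the amount by which the headroom grew between the two reads, which is why not touching dev->needed_headroom on a live device makes the problem go away.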
^ permalink raw reply
* [PATCH] pptp: fix skb leak in pptp_xmit()
From: Eric Dumazet @ 2011-10-18 3:01 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Dmitry Kozlov
If we can't transmit the skb, we must free it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dmitry Kozlov <xeb@mail.ru>
---
drivers/net/pptp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/pptp.c b/drivers/net/pptp.c
index eae542a..9c0403d 100644
--- a/drivers/net/pptp.c
+++ b/drivers/net/pptp.c
@@ -285,8 +285,10 @@ static int pptp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
ip_send_check(iph);
ip_local_out(skb);
+ return 1;
tx_error:
+ kfree_skb(skb);
return 1;
}
^ permalink raw reply related
* Re: Bug#645589: linux-image-3.0.0-2-amd64: sky2 rx errors on 3.0, 2.6.32 works
From: Ben Hutchings @ 2011-10-18 3:27 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: 645589, Antti Salmela, netdev
In-Reply-To: <20111017074016.6840.77265.reportbug@thor.viidakko.fi>
[-- Attachment #1: Type: text/plain, Size: 1851 bytes --]
On Mon, 2011-10-17 at 10:40 +0300, Antti Salmela wrote:
> Package: linux-2.6
> Version: 3.0.0-5
> Severity: normal
>
>
> sky2 loses packets on 3.0 (-3 and -5) and 3.1-rc7; 2.6.32-38 works, and
> so does setting the interface to promiscuous mode.
>
> [ 60.118244] sky2 0000:02:00.0: eth0: rx error, status 0xb92100 length 185
> [ 62.664370] sky2 0000:02:00.0: eth0: rx error, status 0x602100 length 96
> [ 63.370051] sky2 0000:02:00.0: eth0: rx error, status 0x422100 length 66
> [ 63.714672] sky2 0000:02:00.0: eth0: rx error, status 0x722100 length 114
> [ 64.513458] device eth0 entered promiscuous mode
It looks like this is a bug in accounting of VLAN tags, though I don't
see what difference promiscuous mode should make.
The log messages show that status has the VLAN flag (bit 13) set and the
length field (bits 16:28) equals the length passed into sky2_receive(),
but that function expects the length field to be greater by VLAN_HLEN.
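
As a rough illustration of that decoding, the logged status words can be picked apart as follows. The macro and function names are hypothetical (they are not the driver's own), the bit positions are the ones stated above, and VLAN_HLEN is the usual 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit layout as described in the message above. */
#define MODEL_RX_VLAN		(1u << 13)	/* VLAN flag, bit 13 */
#define MODEL_RX_LEN_SHIFT	16		/* length field, bits 16:28 */
#define MODEL_RX_LEN_MASK	0x1fffu		/* 13 bits wide */
#define MODEL_VLAN_HLEN		4u

unsigned int model_rx_len(uint32_t status)
{
	return (status >> MODEL_RX_LEN_SHIFT) & MODEL_RX_LEN_MASK;
}

int model_rx_has_vlan(uint32_t status)
{
	return (status & MODEL_RX_VLAN) != 0;
}

/*
 * Model of the expectation described above: with the VLAN flag set, the
 * length field should exceed the passed-in length by VLAN_HLEN.
 */
int model_rx_len_consistent(uint32_t status, unsigned int length)
{
	unsigned int expected = length;

	if (model_rx_has_vlan(status))
		expected += MODEL_VLAN_HLEN;
	return model_rx_len(status) == expected;
}

void model_rx_selfcheck(void)
{
	/* "rx error, status 0xb92100 length 185" from the report */
	assert(model_rx_has_vlan(0xb92100));
	assert(model_rx_len(0xb92100) == 185);
	assert(!model_rx_len_consistent(0xb92100, 185));
}
```

All four logged status words decode the same way: VLAN flag set, length field equal to the reported length, hence the mismatch.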
This device is:
[...]
> 02:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller [11ab:4362] (rev 19)
> Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet controller PCIe (Asus) [1043:8142]
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 16 bytes
> Interrupt: pin A routed to IRQ 43
> Region 0: Memory at cdefc000 (64-bit, non-prefetchable) [size=16K]
> Region 2: I/O ports at c800 [size=256]
> Expansion ROM at cdec0000 [disabled] [size=128K]
> Capabilities: <access denied>
> Kernel driver in use: sky2
[...]
Ben.
--
Ben Hutchings
No political challenge can be met by shopping. - George Monbiot
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply
* Re: [PATCH] fib_rules: fix unresolved_rules counting
From: Eric Dumazet @ 2011-10-18 3:47 UTC (permalink / raw)
To: Yan, Zheng; +Cc: netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <4E9CD45C.9050608@intel.com>
On Tuesday, 18 October 2011 at 09:20 +0800, Yan, Zheng wrote:
> We should decrement ops->unresolved_rules when deleting an unresolved rule.
>
> Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
> ---
> diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
> index 3231b46..27071ee 100644
> --- a/net/core/fib_rules.c
> +++ b/net/core/fib_rules.c
> @@ -475,8 +475,11 @@ static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
>
> list_del_rcu(&rule->list);
>
> - if (rule->action == FR_ACT_GOTO)
> + if (rule->action == FR_ACT_GOTO) {
> ops->nr_goto_rules--;
> + if (rtnl_dereference(rule->ctarget) == NULL)
> + ops->unresolved_rules--;
> + }
>
> /*
> * Check if this rule is a target to any of them. If so,
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply
* Re: BUG in skb_pull with e1000e, PPTP, and L2TP
From: Eric Dumazet @ 2011-10-18 3:51 UTC (permalink / raw)
To: Bradley Peterson
Cc: netdev, Jeff Kirsher, Jesse Brandeburg, Bruce Allan,
Carolyn Wyborny, Don Skidmore, Greg Rose, PJ Waskiewicz,
Alex Duyck, John Ronciak, e1000-devel
In-Reply-To: <1318904666.2571.33.camel@edumazet-laptop>
On Tuesday, 18 October 2011 at 04:24 +0200, Eric Dumazet wrote:
> diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
> index eae542a..d0197e3 100644
> --- a/drivers/net/ppp/pptp.c
> +++ b/drivers/net/ppp/pptp.c
> @@ -305,11 +305,16 @@ static int pptp_rcv_core(struct sock *sk, struct sk_buff *skb)
> }
>
> header = (struct pptp_gre_header *)(skb->data);
> + headersize = sizeof(*header);
>
> /* test if acknowledgement present */
> if (PPTP_GRE_IS_A(header->ver)) {
> - __u32 ack = (PPTP_GRE_IS_S(header->flags)) ?
> - header->ack : header->seq; /* ack in different place if S = 0 */
> + __u32 ack;
> +
> + if (!pskb_may_pull(skb, headersize))
> + goto drop;
Oh well, this is buggy: I need to set header again after
pskb_may_pull(). I'll send an updated patch.
header = (struct pptp_gre_header *)(skb->data);
> + ack = (PPTP_GRE_IS_S(header->flags)) ?
> + header->ack : header->seq; /* ack in different place if S = 0 */
>
^ permalink raw reply
* Re: BUG in skb_pull with e1000e, PPTP, and L2TP
From: Eric Dumazet @ 2011-10-18 3:59 UTC (permalink / raw)
To: Bradley Peterson
Cc: Don, e1000-devel, netdev, Bruce Allan, Jesse Brandeburg,
John Ronciak
In-Reply-To: <1318909879.2571.43.camel@edumazet-laptop>
On Tuesday, 18 October 2011 at 05:51 +0200, Eric Dumazet wrote:
> On Tuesday, 18 October 2011 at 04:24 +0200, Eric Dumazet wrote:
>
> > diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
> > index eae542a..d0197e3 100644
> > --- a/drivers/net/ppp/pptp.c
> > +++ b/drivers/net/ppp/pptp.c
> > @@ -305,11 +305,16 @@ static int pptp_rcv_core(struct sock *sk, struct sk_buff *skb)
> > }
> >
> > header = (struct pptp_gre_header *)(skb->data);
> > + headersize = sizeof(*header);
> >
> > /* test if acknowledgement present */
> > if (PPTP_GRE_IS_A(header->ver)) {
> > - __u32 ack = (PPTP_GRE_IS_S(header->flags)) ?
> > - header->ack : header->seq; /* ack in different place if S = 0 */
> > + __u32 ack;
> > +
> > + if (!pskb_may_pull(skb, headersize))
> > + goto drop;
>
> Oh well, this is buggy, I need to set header again, I'll send an updated
> patch
>
[PATCH v2] pptp: pptp_rcv_core() misses pskb_may_pull() call
e1000e uses paged frags, so any layer incorrectly pulling bytes from skb
can trigger a BUG in skb_pull()
[951.142737] [<ffffffff813d2f36>] skb_pull+0x15/0x17
[951.142737] [<ffffffffa0286824>] pptp_rcv_core+0x126/0x19a [pptp]
[951.152725] [<ffffffff813d17c4>] sk_receive_skb+0x69/0x105
[951.163558] [<ffffffffa0286993>] pptp_rcv+0xc8/0xdc [pptp]
[951.165092] [<ffffffffa02800a3>] gre_rcv+0x62/0x75 [gre]
[951.165092] [<ffffffff81410784>] ip_local_deliver_finish+0x150/0x1c1
[951.177599] [<ffffffff81410634>] ? ip_local_deliver_finish+0x0/0x1c1
[951.177599] [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.177599] [<ffffffff81410996>] ip_local_deliver+0x51/0x55
[951.177599] [<ffffffff814105b9>] ip_rcv_finish+0x31a/0x33e
[951.177599] [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[951.204898] [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.214651] [<ffffffff81410bb5>] ip_rcv+0x21b/0x246
pptp_rcv_core() is a nice example of a function assuming everything it
needs is available in skb head.
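
The rule the v2 patch follows (check with pskb_may_pull(), then re-derive the header pointer from skb->data) can be illustrated with a small userspace model. This is only a sketch: struct model_pkt, model_may_pull() and model_read_byte() are hypothetical stand-ins, not kernel APIs, and the "frag" is just a second byte array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a linear head plus bytes held elsewhere. */
struct model_pkt {
	unsigned char *head;		/* linear area, like skb->data */
	size_t head_len;		/* bytes directly addressable */
	const unsigned char *frag;	/* remaining bytes ("paged frags") */
	size_t frag_len;
};

/*
 * Model of pskb_may_pull(): make the first len bytes linear.
 * Returns 1 on success, 0 if the packet is too short or allocation
 * fails.  On success p->head may point to a new buffer.
 */
int model_may_pull(struct model_pkt *p, size_t len)
{
	unsigned char *nhead;
	size_t need;

	if (len <= p->head_len)
		return 1;
	need = len - p->head_len;
	if (need > p->frag_len)
		return 0;
	nhead = malloc(len);
	if (!nhead)
		return 0;
	memcpy(nhead, p->head, p->head_len);
	memcpy(nhead + p->head_len, p->frag, need);
	free(p->head);			/* old pointers into head are now stale */
	p->head = nhead;
	p->head_len = len;
	p->frag += need;
	p->frag_len -= need;
	return 1;
}

/* Read one byte at offset off: pull first, then reload from p->head. */
int model_read_byte(struct model_pkt *p, size_t off, unsigned char *out)
{
	if (!model_may_pull(p, off + 1))
		return 0;
	*out = p->head[off];
	return 1;
}
```

Any pointer computed from the head before the pull has to be treated as invalid afterwards, which is what the extra "header = ..." assignment in the patch takes care of.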
Reported-by: Bradley Peterson <despite@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
drivers/net/ppp/pptp.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index eae542a..29730fd 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -305,11 +305,18 @@ static int pptp_rcv_core(struct sock *sk, struct sk_buff *skb)
}
header = (struct pptp_gre_header *)(skb->data);
+ headersize = sizeof(*header);
/* test if acknowledgement present */
if (PPTP_GRE_IS_A(header->ver)) {
- __u32 ack = (PPTP_GRE_IS_S(header->flags)) ?
- header->ack : header->seq; /* ack in different place if S = 0 */
+ __u32 ack;
+
+ if (!pskb_may_pull(skb, headersize))
+ goto drop;
+ header = (struct pptp_gre_header *)(skb->data);
+
+ /* ack in different place if S = 0 */
+ ack = PPTP_GRE_IS_S(header->flags) ? header->ack : header->seq;
ack = ntohl(ack);
@@ -318,21 +325,18 @@ static int pptp_rcv_core(struct sock *sk, struct sk_buff *skb)
/* also handle sequence number wrap-around */
if (WRAPPED(ack, opt->ack_recv))
opt->ack_recv = ack;
+ } else {
+ headersize -= sizeof(header->ack);
}
-
/* test if payload present */
if (!PPTP_GRE_IS_S(header->flags))
goto drop;
- headersize = sizeof(*header);
payload_len = ntohs(header->payload_len);
seq = ntohl(header->seq);
- /* no ack present? */
- if (!PPTP_GRE_IS_A(header->ver))
- headersize -= sizeof(header->ack);
/* check for incomplete packet (length smaller than expected) */
- if (skb->len - headersize < payload_len)
+ if (!pskb_may_pull(skb, headersize + payload_len))
goto drop;
payload = skb->data + headersize;
^ permalink raw reply related
* [PATCH net-next-2.6] [IPV6] cleanup: remove unused IPV6 stats entries.
From: Kevin Wilson @ 2011-10-18 5:36 UTC (permalink / raw)
To: davem, netdev
[-- Attachment #1: Type: text/plain, Size: 185 bytes --]
Hi,
This cleanup patch removes three unused IPV6 stats entries
from the rt6_statistics struct (ip6_fib.h).
Regards,
wkevils@gmail.com
Signed-off-by: Kevin Wilson <wkevils@gmail.com>
[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 1180 bytes --]
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 5735a0f..ac6f91b 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -139,11 +139,8 @@ struct fib6_walker_t {
};
struct rt6_statistics {
- __u32 fib_nodes;
__u32 fib_route_nodes;
- __u32 fib_rt_alloc; /* permanent routes */
__u32 fib_rt_entries; /* rt entries in table */
- __u32 fib_rt_cache; /* cache routes */
__u32 fib_discarded_routes;
};
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index fb545ed..7f7a80c 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2672,12 +2672,9 @@ static const struct file_operations ipv6_route_proc_fops = {
static int rt6_stats_seq_show(struct seq_file *seq, void *v)
{
struct net *net = (struct net *)seq->private;
- seq_printf(seq, "%04x %04x %04x %04x %04x %04x %04x\n",
- net->ipv6.rt6_stats->fib_nodes,
+ seq_printf(seq, "%04x %04x %04x %04x\n",
net->ipv6.rt6_stats->fib_route_nodes,
- net->ipv6.rt6_stats->fib_rt_alloc,
net->ipv6.rt6_stats->fib_rt_entries,
- net->ipv6.rt6_stats->fib_rt_cache,
dst_entries_get_slow(&net->ipv6.ip6_dst_ops),
net->ipv6.rt6_stats->fib_discarded_routes);
^ permalink raw reply related
* Re: Linux 3.1-rc9
From: Simon Kirby @ 2011-10-18 5:40 UTC (permalink / raw)
To: Linux Kernel Mailing List, netdev
In-Reply-To: <20111012213555.GC24461@hostway.ca>
On Wed, Oct 12, 2011 at 02:35:55PM -0700, Simon Kirby wrote:
> > > patching file kernel/posix-cpu-timers.c
> > > patching file kernel/sched_stats.h
> >
> > yes that would be fine.
>
> This patch (s/raw_//) has been stable on 5 boxes for a day. I'll push to
> another 15 shortly and confirm tomorrow. Meanwhile, we had another ~4
> boxes lock up on 3.1-rc9 _with_ d670ec13 reverted (all CPUs spinning),
> but there weren't enough serial cables to log all of them and we haven't
> been lucky enough to capture anything other than what fits on 80x25.
> I'm hoping it's just the same bug you've already fixed.
Looks to be a different bug. It just happened on a box with serial
console logging, on the same build I was testing the above patch on --
Linus master circa Oct 7th. This seems to be specific to TCP. I'm not
sure what is with all of the doubled backtraces. I've only seen this on
a couple of different boxes so far.
Full log at http://0x.ca/sim/ref/3.1-rc9/3.1-rc9-tcp-lockup.log
First 100 lines:
[516112.140013] BUG: soft lockup - CPU#0 stuck for 22s! [swapper:0]
[516112.144001] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
[516112.144001] CPU 0
[516112.144001] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
[516112.144001]
[516112.144001] Pid: 0, comm: swapper Not tainted 3.1.0-rc9-hw+ #48 Dell Inc. PowerEdge 1950/0UR033
[516112.144001] RIP: 0010:[<ffffffff816b6694>] [<ffffffff816b6694>] _raw_spin_lock+0x14/0x20
[516112.144001] RSP: 0018:ffff88022fc03e10 EFLAGS: 00000297
[516112.144001] RAX: 0000000000000100 RBX: ffffffff81022674 RCX: ffffffff81b4df20
[516112.144001] RDX: ffff8801002aebe0 RSI: dead000000200200 RDI: ffff8801002ad188
[516112.144001] RBP: ffff88022fc03e10 R08: 00000000000000f7 R09: 0000000000000000
[516112.144001] R10: 0000000000000000 R11: 0000000000000010 R12: ffff88022fc03d88
[516112.144001] R13: ffffffff816bed1e R14: ffff88022fc03e10 R15: ffffffff81b4df00
[516112.144001] FS: 0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
[516112.244020] BUG: soft lockup - CPU#1 stuck for 22s! [kworker/0:0:0]
[516112.244024] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
[516112.244033] CPU 1
[516112.244035] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
[516112.244041]
[516112.244044] Pid: 0, comm: kworker/0:0 Not tainted 3.1.0-rc9-hw+ #48 Dell Inc. PowerEdge 1950/0UR033
[516112.244048] RIP: 0010:[<ffffffff816b6694>] [<ffffffff816b6694>] _raw_spin_lock+0x14/0x20
[516112.244057] RSP: 0018:ffff88022fc43e10 EFLAGS: 00000297
[516112.244059] RAX: 0000000000000100 RBX: ffffffff81022674 RCX: ffff880226888020
[516112.244062] RDX: ffff88001ece1aa0 RSI: dead000000200200 RDI: ffff88001ece1f88
[516112.244064] RBP: ffff88022fc43e10 R08: 00000000000000df R09: 0000000000000000
[516112.244066] R10: 0000000000000000 R11: 0000000000000010 R12: ffff88022fc43d88
[516112.244068] R13: ffffffff816bed1e R14: ffff88022fc43e10 R15: ffff880226888000
[516112.244071] FS: 0000000000000000(0000) GS:ffff88022fc40000(0000) knlGS:0000000000000000
[516112.244074] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[516112.244076] CR2: ffffffffff600400 CR3: 0000000126d93000 CR4: 00000000000006e0
[516112.244078] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[516112.244081] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[516112.244083] Process kworker/0:0 (pid: 0, threadinfo ffff880226918000, task ffff880226911640)
[516112.244085] Stack:
[516112.244086] ffff88022fc43e40 ffffffff8162a613 0000000000000000 0000000000000000
[516112.244090] ffff880226888000 ffff88001ece20e0 ffff88022fc43ee0 ffffffff810692dc
[516112.244094] 0000000000000000 ffff880226919fd8 ffff880226919fd8 ffff880226919fd8
[516112.244098] Call Trace:
[516112.244099] <IRQ>
[516112.244105] [<ffffffff8162a613>] tcp_keepalive_timer+0x23/0x260
[516112.244110] [<ffffffff810692dc>] run_timer_softirq+0x1ac/0x310
[516112.244113] [<ffffffff8162a5f0>] ? tcp_init_xmit_timers+0x20/0x20
[516112.244118] [<ffffffff8102e838>] ? lapic_next_event+0x18/0x20
[516112.244121] [<ffffffff81060bf0>] __do_softirq+0xe0/0x1d0
[516112.244125] [<ffffffff816c04ac>] call_softirq+0x1c/0x30
[516112.244129] [<ffffffff81014255>] do_softirq+0x65/0xa0
[516112.244132] [<ffffffff810608fd>] irq_exit+0xad/0xe0
[516112.244135] [<ffffffff8102f569>] smp_apic_timer_interrupt+0x69/0xa0
[516112.244139] [<ffffffff816bed1e>] apic_timer_interrupt+0x6e/0x80
[516112.244140] <EOI>
[516112.244144] [<ffffffff8101a337>] ? mwait_idle+0x117/0x120
[516112.244147] [<ffffffff810120c6>] cpu_idle+0x86/0xe0
[516112.244151] [<ffffffff816ae77c>] start_secondary+0x1a3/0x1e7
[516112.244153] Code: 0f b6 c2 85 c0 c9 0f 95 c0 0f b6 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 38 e0 74 06 f3 90 <8a> 07 eb f6 c9 c3 66 0f 1f 44 00 00 55 48 89 e5 9c 58 66 66 90
[516112.244173] Call Trace:
[516112.244174] <IRQ> [<ffffffff8162a613>] tcp_keepalive_timer+0x23/0x260
[516112.244179] [<ffffffff810692dc>] run_timer_softirq+0x1ac/0x310
[516112.244182] [<ffffffff8162a5f0>] ? tcp_init_xmit_timers+0x20/0x20
[516112.244185] [<ffffffff8102e838>] ? lapic_next_event+0x18/0x20
[516112.244188] [<ffffffff81060bf0>] __do_softirq+0xe0/0x1d0
[516112.244191] [<ffffffff816c04ac>] call_softirq+0x1c/0x30
[516112.244194] [<ffffffff81014255>] do_softirq+0x65/0xa0
[516112.244197] [<ffffffff810608fd>] irq_exit+0xad/0xe0
[516112.244199] [<ffffffff8102f569>] smp_apic_timer_interrupt+0x69/0xa0
[516112.244202] [<ffffffff816bed1e>] apic_timer_interrupt+0x6e/0x80
[516112.244204] <EOI> [<ffffffff8101a337>] ? mwait_idle+0x117/0x120
[516112.244209] [<ffffffff810120c6>] cpu_idle+0x86/0xe0
[516112.244212] [<ffffffff816ae77c>] start_secondary+0x1a3/0x1e7
[516112.344023] BUG: soft lockup - CPU#2 stuck for 23s! [php:1486]
[516112.344025] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
[516112.344033] CPU 2
[516112.344034] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler xt_recent nf_conntrack_ftp xt_state xt_owner nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bnx2
[516112.344040]
[516112.344042] Pid: 1486, comm: php Not tainted 3.1.0-rc9-hw+ #48 Dell Inc. PowerEdge 1950/0UR033
[516112.344046] RIP: 0010:[<ffffffff816b6694>] [<ffffffff816b6694>] _raw_spin_lock+0x14/0x20
[516112.344051] RSP: 0000:ffff88022fc83e10 EFLAGS: 00000297
[516112.344053] RAX: 0000000000000100 RBX: ffffffff81022674 RCX: ffff880226920020
[516112.344056] RDX: ffff88022198c660 RSI: dead000000200200 RDI: ffff8800ac758cc8
[516112.344058] RBP: ffff88022fc83e10 R08: 00000000000000ef R09: 0000000000000000
[516112.344060] R10: 000000000000018b R11: 0000000000000010 R12: ffff88022fc83d88
[516112.344062] R13: ffffffff816bed1e R14: ffff88022fc83e10 R15: ffff880226920000
[516112.344065] FS: 00007faafda03720(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
[516112.344068] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[516112.344070] CR2: ffffffffff600400 CR3: 00000002223de000 CR4: 00000000000006e0
[516112.344072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[516112.344075] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[516112.344077] Process php (pid: 1486, threadinfo ffff880039262000, task ffff88003e675900)
[516112.344079] Stack:
[516112.344081] ffff88022fc83e40 ffffffff8162a613 0000000000000000 0000000000000000
[516112.344084] ffff880226920000 ffff8800ac758e20 ffff88022fc83ee0 ffffffff810692dc
[516112.344088] 0000000000000001 ffff880039263fd8 ffff880039263fd8 ffff880039263fd8
[516112.344091] Call Trace:
[516112.344093] <IRQ>
[516112.344099] [<ffffffff8162a613>] tcp_keepalive_timer+0x23/0x260
[516112.344104] [<ffffffff810692dc>] run_timer_softirq+0x1ac/0x310
[516112.344107] [<ffffffff8162a5f0>] ? tcp_init_xmit_timers+0x20/0x20
[516112.344111] [<ffffffff8102e838>] ? lapic_next_event+0x18/0x20
[516112.344115] [<ffffffff81060bf0>] __do_softirq+0xe0/0x1d0
[516112.344119] [<ffffffff816c04ac>] call_softirq+0x1c/0x30
[516112.344123] [<ffffffff81014255>] do_softirq+0x65/0xa0
Simon-
^ permalink raw reply
* Re: [PATCH v13 0/6] flexcan: Add support for powerpc flexcan (freescale p1010)
From: Kumar Gala @ 2011-10-18 5:44 UTC (permalink / raw)
To: Robin Holt
Cc: David S. Miller, Wolfgang Grandegger, Marc Kleine-Budde,
U Bhaskar-B22300, socketcan-core, netdev, PPC list
In-Reply-To: <1313551944-28603-1-git-send-email-holt@sgi.com>
On Aug 16, 2011, at 10:32 PM, Robin Holt wrote:
> David,
>
> The following set of patches have been reviewed by the above parties and
> all comments have been integrated. Although the patches stray from the
> drivers/net/can directory, the diversions are related to changes for
> the flexcan driver.
>
> The patch set is based upon your net-next-2.6 tree's commit 6c37e46.
>
> Could you please queue these up for the next appropriate push to Linus'
> tree?
>
> Thanks,
> Robin Holt
Robin,
Do you remember why we went with just 'fsl,p1010-flexcan' as the device tree compatible? Do we feel the FlexCAN on P1010 isn't the same as on MPC5xxx or the ARM SoCs?
- k
^ permalink raw reply
* RE: IPv6 routing requests ignore NLM_F_CREATE and NLM_F_REPLACE
From: Vaittinen, Matti (EXT-Other - FI/Oulu) @ 2011-10-18 6:02 UTC (permalink / raw)
To: netdev
Hi again.
> -----Original Message-----
> From: Vaittinen, Matti (EXT-Other - FI/Oulu)
> Sent: Monday, October 17, 2011 12:06 PM
> To: 'netdev@vger.kernel.org'
> Subject: IPv6 routing requests ignore NLM_F_CREATE and NLM_F_REPLACE
>
> Hi dee Ho!
>
> I was enhancing a userspace application that configures IPv4 routes
> via netlink sockets to support IPv6 route configuration too. While
> doing this I noticed that the NLM_F_* flags seemed to have no handling
> on the IPv6 side. For example, on the IPv4 side, replacing a route to
> some destination with a route having a different pref_src (or metric
> or gateway or...) can be done by specifying the NLM_F_REPLACE flag in
> the netlink request and leaving out NLM_F_CREATE.
>
> However, with IPv6, if the new route being requested has different
> properties (like gateway or metric or...), the existing one will not
> be replaced. Instead a new route will be created, even if NLM_F_CREATE
> was not specified in the request.
>
> That causes some inconvenience when a route is being changed. Routes
> need to be queried, and the matching route needs to be explicitly
> deleted by the userspace application. Also, creating a new route even
> without NLM_F_CREATE feels a bit strange to me.
>
> I was wondering if this is a bug or wanted behaviour? I was thinking
> of trying to write a patch to add support for replacing a route, but I
> feel I'm a bit lost with the fib :) I guess the fib6_add_rt2node
> function could be changed to inspect the NLM_F_ flags from the nl_info
> pointer, and to perform a replace instead of returning -EEXIST /
> performing an insertion. Also, returning an error when NLM_F_CREATE is
> not specified and an existing route is not found could probably be
> implemented.
>
> Anyways, before I spend more time trying to understand the data
> structures in fib6, I would like to ask if the handling of NLM_F_*
> flags was dropped on purpose?
I do not intend to push this topic, but is this the correct list for
this question? Is there something I could clarify regarding my question?
If this is not the correct list, could someone please point me to the
right one?
>
>
> Br. Matti Vaittinen
>
> --
>
> Theory:
> Theoretical approach means that everything is well known, but still
> nothing works.
> Practice:
> Practical approach means that everything works but no one knows why.
>
> Thank God we have theory and practice balanced here. Nothing works,
> and no one knows why...
>
>
Regards.
-Matti Vaittinen
^ permalink raw reply
* Re: [PATCH net-next-2.6] [IPV6] cleanup: remove unused IPV6 stats entries.
From: Eric Dumazet @ 2011-10-18 6:44 UTC (permalink / raw)
To: Kevin Wilson; +Cc: davem, netdev
In-Reply-To: <CAGXs5wVxkqLeCEMV6Lanczj-zarv6iu4ekdjk5zZKaadabwvbw@mail.gmail.com>
On Tuesday, 18 October 2011 at 07:36 +0200, Kevin Wilson wrote:
> Hi,
>
> This cleanup patch removes three unused IPV6 stats entries
> from rt6_statistics struct. (ip6_fib.h).
>
> Regards,
> wkevils@gmail.com
>
>
> Signed-off-by: Kevin Wilson <wkevils@gmail.com>
Hi Kevin
Did you change all /proc/net/rt6_stats users so that they continue to
work, including closed-source ones?
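
To make the concern concrete, here is a sketch of the kind of userspace parser that could exist for the current seven-field line. The function name is hypothetical, and the assumption is simply that such a tool insists on seven hex fields, matching the seq_printf() format in the patch.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_RT6_STATS_FIELDS 7

/*
 * Hypothetical parser for one line of the current rt6_stats format
 * ("%04x" repeated seven times).  Returns 0 on success and -1 if the
 * line does not carry exactly the seven leading hex fields it expects.
 */
int model_parse_rt6_stats(const char *line,
			  unsigned int out[MODEL_RT6_STATS_FIELDS])
{
	int n = sscanf(line, "%x %x %x %x %x %x %x",
		       &out[0], &out[1], &out[2], &out[3],
		       &out[4], &out[5], &out[6]);

	return n == MODEL_RT6_STATS_FIELDS ? 0 : -1;
}

void model_rt6_stats_selfcheck(void)
{
	unsigned int f[MODEL_RT6_STATS_FIELDS];

	/* Seven fields, as printed today: accepted. */
	assert(model_parse_rt6_stats("0000 0012 0000 0034 0000 0005 0001\n", f) == 0);
	/* Four fields, as printed after the patch: rejected. */
	assert(model_parse_rt6_stats("0012 0034 0005 0001\n", f) == -1);
}
```

A parser that is less strict would fare no better: it would silently read the remaining fields at the wrong positions.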
^ permalink raw reply
* RE: IPv6 routing requests ignore NLM_F_CREATE and NLM_F_REPLACE
From: Eric Dumazet @ 2011-10-18 6:54 UTC (permalink / raw)
To: Vaittinen, Matti (EXT-Other - FI/Oulu); +Cc: netdev
In-Reply-To: <82C9FC7ED59434458AD4E09AFF2DE230B534F9@FIESEXC006.nsn-intra.net>
On Tuesday, 18 October 2011 at 09:02 +0300, Vaittinen, Matti (EXT-Other -
FI/Oulu) wrote:
> Hi again.
>
...
> I do not intend pushing this topic but is this the correct list to ask
> this? Is there something I could clarify regarding my question? If this
> is not correct list, could someone please point me the right one.
>
Well, this is the correct list. And yes, please send us a patch.
IPv6 lacks some features found in IPv4, not on purpose, but because
nobody wanted them or had the time to implement them.
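
As a starting point for such a patch, the flag semantics described in the original question could be written down as a small decision function. This is a simplified model, not fib6 or fib4 code: route_decide() and enum route_action are hypothetical, and the constants repeat the NLM_F_* values from <linux/netlink.h> so the sketch builds anywhere.

```c
#include <assert.h>

/* Same values as the NLM_F_* flags in <linux/netlink.h>. */
#define MODEL_NLM_F_REPLACE	0x100
#define MODEL_NLM_F_EXCL	0x200
#define MODEL_NLM_F_CREATE	0x400

enum route_action {
	ROUTE_ADD,		/* insert a new route */
	ROUTE_REPLACE,		/* overwrite the matching route */
	ROUTE_ERR_EXIST,	/* would map to -EEXIST */
	ROUTE_ERR_NOENT		/* would map to -ENOENT */
};

/* exists: nonzero if a matching route is already in the table. */
enum route_action route_decide(int exists, unsigned int flags)
{
	if (exists && (flags & MODEL_NLM_F_EXCL))
		return ROUTE_ERR_EXIST;
	if (exists && (flags & MODEL_NLM_F_REPLACE))
		return ROUTE_REPLACE;
	if (!(flags & MODEL_NLM_F_CREATE))
		return ROUTE_ERR_NOENT;
	return ROUTE_ADD;
}

void route_decide_selfcheck(void)
{
	/* Replace without create: only succeeds if the route exists. */
	assert(route_decide(1, MODEL_NLM_F_REPLACE) == ROUTE_REPLACE);
	assert(route_decide(0, MODEL_NLM_F_REPLACE) == ROUTE_ERR_NOENT);
}
```

What counts as a "matching route" (same prefix only, or also same metric) is deliberately left out of the model; that is a design decision for the actual patch.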
^ permalink raw reply
* [PATCH] route:ip_rt_frag_needed always return unzero
From: Gao feng @ 2011-10-18 7:04 UTC (permalink / raw)
To: davem, kuznet, jmorris, eric.dumazet; +Cc: netdev, Gao feng
In function ip_rt_frag_needed(), if peer is NULL,
there is no need to call ipprot->err_handler.
Am I right?
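
For reference, the difference between the two return expressions can be shown in isolation. The helper names below are hypothetical, and the GNU "a ? : b" shorthand is spelled out as "a ? a : b" so the sketch is plain C.

```c
#include <assert.h>

/* Current behaviour: "est_mtu ? : new_mtu", written without the GNU shorthand. */
unsigned short model_ret_old(unsigned short est_mtu, unsigned short new_mtu)
{
	return est_mtu ? est_mtu : new_mtu;
}

/* Proposed behaviour: return est_mtu as is, which may be zero. */
unsigned short model_ret_new(unsigned short est_mtu, unsigned short new_mtu)
{
	(void)new_mtu;
	return est_mtu;
}

void model_ret_selfcheck(void)
{
	/* No estimate recorded: old code falls back to new_mtu, new code returns 0. */
	assert(model_ret_old(0, 576) == 576);
	assert(model_ret_new(0, 576) == 0);
	/* With an estimate, both agree. */
	assert(model_ret_old(1400, 576) == 1400);
	assert(model_ret_new(1400, 576) == 1400);
}
```

So with a nonzero new_mtu the current code always returns a nonzero value, while the patched code returns zero whenever est_mtu was never set.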
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
net/ipv4/route.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 075212e..6cde0fa 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1574,7 +1574,7 @@ unsigned short ip_rt_frag_needed(struct net *net, const struct iphdr *iph,
atomic_inc(&__rt_peer_genid);
}
- return est_mtu ? : new_mtu;
+ return est_mtu;
}
static void check_peer_pmtu(struct dst_entry *dst, struct inet_peer *peer)
--
1.7.1
^ permalink raw reply related
* [patch] filter: use unsigned int to silence static checker warning
From: Dan Carpenter @ 2011-10-18 7:04 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Eric Dumazet, Changli Gao, kernel-janitors
This is just a cleanup.
My testing version of Smatch warns about this:
net/core/filter.c +380 check_load_and_stores(6)
warn: check 'flen' for negative values
flen comes from the user. We try to clamp the values here between 1
and BPF_MAXINSNS but the clamp doesn't work because it could be
negative. This is a bug, but it's not exploitable.
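
A minimal sketch of why the signed version of the clamp lets negative values through, assuming the check has the form "flen == 0 || flen > BPF_MAXINSNS" with BPF_MAXINSNS being 4096. The function names are hypothetical:

```c
#include <assert.h>

#define MODEL_BPF_MAXINSNS 4096

/* Signed parameter: a negative flen is neither 0 nor above the limit. */
int model_flen_ok_signed(int flen)
{
	return !(flen == 0 || flen > MODEL_BPF_MAXINSNS);
}

/* Unsigned parameter: the same bit pattern is a huge value and is rejected. */
int model_flen_ok_unsigned(unsigned int flen)
{
	return !(flen == 0 || flen > MODEL_BPF_MAXINSNS);
}

void model_flen_selfcheck(void)
{
	assert(model_flen_ok_signed(-1) == 1);			/* slips through */
	assert(model_flen_ok_unsigned((unsigned int)-1) == 0);	/* caught */
	assert(model_flen_ok_unsigned(4096) == 1);		/* limit itself is fine */
}
```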
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 741956f..8eeb205 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -155,7 +155,7 @@ extern unsigned int sk_run_filter(const struct sk_buff *skb,
const struct sock_filter *filter);
extern int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk);
extern int sk_detach_filter(struct sock *sk);
-extern int sk_chk_filter(struct sock_filter *filter, int flen);
+extern int sk_chk_filter(struct sock_filter *filter, unsigned int flen);
#ifdef CONFIG_BPF_JIT
extern void bpf_jit_compile(struct sk_filter *fp);
diff --git a/net/core/filter.c b/net/core/filter.c
index 8fcc2d7..5dea452 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -436,7 +436,7 @@ error:
*
* Returns 0 if the rule set is legal or -EINVAL if not.
*/
-int sk_chk_filter(struct sock_filter *filter, int flen)
+int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
{
/*
* Valid instructions are initialized to non-0.
^ permalink raw reply related
* Re: [PATCH v13 0/6] flexcan: Add support for powerpc flexcan (freescale p1010)
From: Wolfgang Grandegger @ 2011-10-18 7:13 UTC (permalink / raw)
To: Kumar Gala
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, U Bhaskar-B22300,
socketcan-core-0fE9KPoRgkgATYTw5x5z8w, Marc Kleine-Budde,
PPC list, David S. Miller
In-Reply-To: <16FBAA47-5133-43A1-80CE-C6D63B79FB5D-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
Hi Kumar,
On 10/18/2011 07:44 AM, Kumar Gala wrote:
>
> On Aug 16, 2011, at 10:32 PM, Robin Holt wrote:
>
>> David,
>>
>> The following set of patches have been reviewed by the above parties and
>> all comments have been integrated. Although the patches stray from the
>> drivers/net/can directory, the diversions are related to changes for
>> the flexcan driver.
>>
>> The patch set is based upon your net-next-2.6 tree's commit 6c37e46.
>>
>> Could you please queue these up for the next appropriate push to Linus'
>> tree?
>>
>> Thanks,
>> Robin Holt
>
> Robin,
>
> Do you remember why we went with just 'fsl,p1010-flexcan' as the device tree compatible? Do we feel the FlexCAN on P1010 isn't the same as on MPC5xxx or the ARM SoCs?
The MPC5xxx SoCs have an MSCAN controller, which is different from the
Flexcan and handled by another driver. But the Flexcans on the
Freescale ARM SoCs are identical and supported by that driver as well,
and "fsl,flexcan" would work *perfectly*. Actually, Grant instructed us
to be more explicit and use "fsl,p1010-flexcan". Anyway,
"fsl,p1010-flexcan" should work on ARM SoCs if the source frequency is
provided via the boot loader or the DTS file. Compatibility was one of
our main concerns.
Wolfgang.
^ permalink raw reply
* RE: [PATCH 6/7] mlx4_en: Adding rxhash support
From: Yevgeny Petrilin @ 2011-10-18 7:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <1318902525.2571.24.camel@edumazet-laptop>
> > rss_context->flags = rss_mask;
> > + rss_context->hash_fn = 1;
> > + for (i = 0; i < 10; i++)
> > + rss_context->rss_key[i] = random32();
> >
>
> That's a bit of a problem: two NICs will have different seeds, and thus provide different rxhash values for a given flow. A bonding of two NICs will
> not be able to provide a consistent rxhash.
>
> drivers/net/ethernet/intel/igb/igb_main.c uses a static table to avoid this problem.
>
Hello Eric, thanks for your review.
I agree that in this case two ports will have different seeds.
But even if we use static values for the key, what about bonding of 2 NICs from different vendors?
How can we ensure we get same rxhash value for all NICs?
There are also other drivers that use random values as well, for example:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
Thanks,
Yevgeny
^ permalink raw reply
* RE: [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
From: Yevgeny Petrilin @ 2011-10-18 7:43 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <1318902799.2571.26.camel@edumazet-laptop>
>
> > @@ -91,7 +91,7 @@
> > /* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
> > * and 4K allocations) */
> > enum {
> > - FRAG_SZ0 = 512 - NET_IP_ALIGN,
> > + FRAG_SZ0 = 2048,
> > FRAG_SZ1 = 1024,
> > FRAG_SZ2 = 4096,
> > FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
>
> Is the 512 -> 2048 change really wanted ? Its not mentioned in changelog and is confusing.
>
> This means mlx4 lost the ability to use a small frag (512 bytes) to store small frames.
>
The change is wanted as an optimization for our HW.
We do get better numbers with this change, even with small packets.
You are correct, I should have mentioned it in the changelog.
Yevgeny
^ permalink raw reply
* [PATCH] Fix guest memory leak and panic
From: Krishna Kumar @ 2011-10-18 8:05 UTC (permalink / raw)
To: rusty, mst
Cc: Ian.Campbell, netdev, linux-kernel, virtualization, davem,
Krishna Kumar
Commit 86ee8130 ("virtionet: convert to SKB paged frag API")
introduced a bug in guest. During RX testing, guest runs out
of memory within seconds, causing oom-killer; which then
panics the system: "Kernel panic - not syncing: Out of memory
and no killable processes...". /proc/meminfo just before the
panic shows MemFree is a few MB's:
MemFree: 1928544 kB (starts here)
...
...
MemFree: 27488 kB
MemFree: 26248 kB
MemFree: 24636 kB
MemFree: 22632 kB
MemFree: 19580 kB
MemFree: 17928 kB
MemFree: 15548 kB
(Panic)
The extra reference to the fragment pages causes those pages to
not get freed in skb_release_data(). The following patch fixes
the bug. I have not checked if any other converted driver has
the same issue.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
drivers/net/virtio_net.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff -ruNp org/drivers/net/virtio_net.c new/drivers/net/virtio_net.c
--- org/drivers/net/virtio_net.c 2011-10-18 08:49:46.000000000 +0530
+++ new/drivers/net/virtio_net.c 2011-10-18 12:55:32.000000000 +0530
@@ -143,18 +143,15 @@ static void skb_xmit_done(struct virtque
static void set_skb_frag(struct sk_buff *skb, struct page *page,
unsigned int offset, unsigned int *len)
{
+ int size = min((unsigned)PAGE_SIZE - offset, *len);
int i = skb_shinfo(skb)->nr_frags;
- skb_frag_t *f;
- f = &skb_shinfo(skb)->frags[i];
- f->size = min((unsigned)PAGE_SIZE - offset, *len);
- f->page_offset = offset;
- __skb_frag_set_page(f, page);
+ __skb_fill_page_desc(skb, i, page, offset, size);
- skb->data_len += f->size;
- skb->len += f->size;
+ skb->data_len += size;
+ skb->len += size;
skb_shinfo(skb)->nr_frags++;
- *len -= f->size;
+ *len -= size;
}
static struct sk_buff *page_to_skb(struct virtnet_info *vi,
^ permalink raw reply
* [PATCH net-next] neigh: fix rcu splat in neigh_update()
From: roy.qing.li @ 2011-10-18 8:32 UTC (permalink / raw)
To: ari.m.savolainen, netdev
When dst_get_neighbour() is used to get the neighbour, the caller must
hold rcu_read_lock(), since dst_get_neighbour() uses
rcu_dereference().
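A minimal user-space sketch of the rule above, assuming a toy nesting counter in place of real RCU. Every toy_ name is hypothetical; only the calling pattern (lock, dereference and use, unlock) follows the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the RCU read-side nesting depth (not real RCU). */
static int toy_rcu_depth;
/* Number of dereferences made outside a read-side section. */
static int toy_rcu_splats;

void toy_rcu_read_lock(void)
{
    toy_rcu_depth++;
}

void toy_rcu_read_unlock(void)
{
    assert(toy_rcu_depth > 0);
    toy_rcu_depth--;
}

/* Like a checked dereference: record a "splat" if unprotected. */
void *toy_rcu_dereference(void *ptr)
{
    if (toy_rcu_depth == 0)
        toy_rcu_splats++;
    return ptr;
}

struct toy_dst { void *neighbour; };

/* Pattern before the patch: dereference with no read-side section. */
int toy_has_neighbour_unprotected(struct toy_dst *dst)
{
    return toy_rcu_dereference(dst->neighbour) != NULL;
}

/* Pattern after the patch: dereference and use inside lock/unlock. */
int toy_has_neighbour_protected(struct toy_dst *dst)
{
    int found;

    toy_rcu_read_lock();
    found = toy_rcu_dereference(dst->neighbour) != NULL;
    toy_rcu_read_unlock();
    return found;
}

/* Run one lookup and report how many splats it produced. */
int toy_count_splats(int use_lock)
{
    int value = 42;
    struct toy_dst dst = { &value };

    toy_rcu_splats = 0;
    if (use_lock)
        toy_has_neighbour_protected(&dst);
    else
        toy_has_neighbour_unprotected(&dst);
    return toy_rcu_splats;
}
```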
The bug was reported by Ari Savolainen <ari.m.savolainen@gmail.com>
[ 105.612095]
[ 105.612096] ===================================================
[ 105.612100] [ INFO: suspicious rcu_dereference_check() usage. ]
[ 105.612101] ---------------------------------------------------
[ 105.612103] include/net/dst.h:91 invoked rcu_dereference_check() without protection!
[ 105.612105]
[ 105.612106] other info that might help us debug this:
[ 105.612106]
[ 105.612108]
[ 105.612108] rcu_scheduler_active = 1, debug_locks = 0
[ 105.612110] 1 lock held by dnsmasq/2618:
[ 105.612111] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815df8c7>] rtnl_lock+0x17/0x20
[ 105.612120]
[ 105.612121] stack backtrace:
[ 105.612123] Pid: 2618, comm: dnsmasq Not tainted 3.1.0-rc1 #41
[ 105.612125] Call Trace:
[ 105.612129] [<ffffffff810ccdcb>] lockdep_rcu_dereference+0xbb/0xc0
[ 105.612132] [<ffffffff815dc5a9>] neigh_update+0x4f9/0x5f0
[ 105.612135] [<ffffffff815da001>] ? neigh_lookup+0xe1/0x220
[ 105.612139] [<ffffffff81639298>] arp_req_set+0xb8/0x230
[ 105.612142] [<ffffffff8163a59f>] arp_ioctl+0x1bf/0x310
[ 105.612146] [<ffffffff810baa40>] ? lock_hrtimer_base.isra.26+0x30/0x60
[ 105.612150] [<ffffffff8163fb75>] inet_ioctl+0x85/0x90
[ 105.612154] [<ffffffff815b5520>] sock_do_ioctl+0x30/0x70
[ 105.612157] [<ffffffff815b55d3>] sock_ioctl+0x73/0x280
[ 105.612162] [<ffffffff811b7698>] do_vfs_ioctl+0x98/0x570
[ 105.612165] [<ffffffff811a5c40>] ? fget_light+0x340/0x3a0
[ 105.612168] [<ffffffff811b7bbf>] sys_ioctl+0x4f/0x80
[ 105.612172] [<ffffffff816fdcab>] system_call_fastpath+0x16/0x1b
Reported-by: Ari Savolainen <ari.m.savolainen@gmail.com>
Signed-off-by: RongQing <roy.qing.li@gmail.com>
---
net/core/neighbour.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 4344964..909ecb3 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1168,10 +1168,14 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
struct dst_entry *dst = skb_dst(skb);
struct neighbour *n2, *n1 = neigh;
write_unlock_bh(&neigh->lock);
+
+ rcu_read_lock();
/* On shaper/eql skb->dst->neighbour != neigh :( */
if (dst && (n2 = dst_get_neighbour(dst)) != NULL)
n1 = n2;
n1->output(n1, skb);
+ rcu_read_unlock();
+
write_lock_bh(&neigh->lock);
}
skb_queue_purge(&neigh->arp_queue);
--
1.7.1
^ permalink raw reply related
* RE: [PATCH 6/7] mlx4_en: Adding rxhash support
From: Eric Dumazet @ 2011-10-18 8:34 UTC (permalink / raw)
To: Yevgeny Petrilin; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <953B660C027164448AE903364AC447D2235EEB65@MTLDAG01.mtl.com>
Le mardi 18 octobre 2011 à 07:36 +0000, Yevgeny Petrilin a écrit :
> > > rss_context->flags = rss_mask;
> > > + rss_context->hash_fn = 1;
> > > + for (i = 0; i < 10; i++)
> > > + rss_context->rss_key[i] = random32();
> > >
> >
> > Thats bit of a problem : Two NICS will have different seeds, and thus provide different rxhash for a given flow. A bonding of two NICS will
> > not be able to provide a consistent rxhash.
> >
> > drivers/net/ethernet/intel/igb/igb_main.c uses a static table to avoid this problem.
> >
>
> Hello Eric, thanks for your review.
>
> I agree that in this case two ports will have different seeds.
> But even if we use static values for the key, what about bonding of 2 NICs from different vendors?
> How can we ensure we get same rxhash value for all NICs?
>
> There are also other drivers that use random values as well, for example:
> drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
>
What is the gain of using random values?
Usually, we tend to have the same hardware in a single machine, or we use
active-backup bonding mode, and an active slave flip can change rxhash
values with little effect, since this does not happen often.
I really prefer non-random values, because they allow replayable
configurations: for a given TCP flow, the same rxhash value is given
and the same CPU target in RPS. It's way easier to tune your machine for some
workloads.
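A minimal sketch of why the key matters, assuming the NIC computes a Toeplitz-style hash over the flow tuple (common for RSS, but not stated in this thread). The two keys, the flow bytes, and the modulo-based pick_cpu() are arbitrary illustrative choices, not values from any driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_LEN   40 /* 10 x 32-bit words, as in the quoted loop */
#define TUPLE_LEN 12 /* IPv4 source, destination, and both TCP ports */

/* Toeplitz hash: XOR a sliding 32-bit key window for every set input bit. */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                       const uint8_t *data, size_t data_len)
{
    uint32_t result = 0;
    uint32_t window;

    assert(key_len >= data_len + 4);
    window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
             ((uint32_t)key[2] << 8) | (uint32_t)key[3];

    for (size_t i = 0; i < data_len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                result ^= window;
            /* Slide the window one bit further along the key. */
            window <<= 1;
            if (key[i + 4] & (1u << bit))
                window |= 1u;
        }
    }
    return result;
}

/* Two example keys (arbitrary values) standing in for two NICs. */
const uint8_t key_nic_a[KEY_LEN] = { 0x12, 0x34, 0x56, 0x78 };
const uint8_t key_nic_b[KEY_LEN] = { 0x9a, 0xbc, 0xde, 0xf5 };

/* Example flow tuple with only its first bit set, to keep the math easy. */
const uint8_t flow[TUPLE_LEN] = { 0x80 };

/* Map a hash to one of n_cpus targets (simplified steering step). */
unsigned int pick_cpu(uint32_t hash, unsigned int n_cpus)
{
    assert(n_cpus > 0);
    return hash % n_cpus;
}
```

With these example keys the same flow hashes to 0x12345678 under one key and 0x9abcdef5 under the other, so with 8 CPUs it would be steered to CPU 0 or CPU 5 depending on which NIC received it. A shared static key would remove that difference between identical NICs.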
^ permalink raw reply
* Re: [PATCH] Fix guest memory leak and panic
From: Ian Campbell @ 2011-10-18 8:36 UTC (permalink / raw)
To: Krishna Kumar
Cc: rusty@rustcorp.com.au, mst@redhat.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, davem@davemloft.net
In-Reply-To: <20111018080523.16861.55402.sendpatchset@krkumar2.in.ibm.com>
On Tue, 2011-10-18 at 09:05 +0100, Krishna Kumar wrote:
> The extra reference to the fragment pages causes those pages to
> not get freed in skb_release_data(). The following patch fixes
> the bug. I have not checked if any other converted driver has
> the same issue.
Damn. You are completely correct and I appear to have made this same
mistake several times. A quick look suggests that at least cxgb,
myri10ge, vmxnet, cassini and bnx2 may potentially have a similar
issue :-( (I stopped looking at that point, I'll obviously do a full
audit).
I considered quite carefully whether (__)skb_frag_set_page should take a
reference or not and decided yes but I'm starting to reconsider whether
I made the right choice. It seems that it is just confusing and violates
the principle of least surprise to have a function called "set" take a
new reference. In reality all existing drivers expect that adding a page
to an SKB frag will just take over the existing reference.
I think the best thing might be to remove the additional ref taking from
the setter function and audit the previous changes to ensure they
conform. I'll do that right away and post a fixup patch ASAP.
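A minimal user-space sketch of the leak described above, assuming a toy page with a plain integer reference count. The toy_ types and functions are hypothetical and only model the two possible setter semantics.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page with a reference count (not the kernel's struct page). */
struct toy_page { int refcount; };

/* Toy skb holding at most one fragment page. */
struct toy_skb { struct toy_page *frag; };

/* Setter that takes its own extra reference (the surprising variant). */
void toy_frag_set_page_get(struct toy_skb *skb, struct toy_page *page)
{
    page->refcount++;
    skb->frag = page;
}

/* Setter that takes over the caller's existing reference. */
void toy_frag_set_page_steal(struct toy_skb *skb, struct toy_page *page)
{
    skb->frag = page;
}

/* Releasing the skb drops exactly one reference per fragment. */
void toy_skb_release(struct toy_skb *skb)
{
    assert(skb->frag != NULL);
    skb->frag->refcount--;
    skb->frag = NULL;
}

/* Return the refcount left after one attach/release cycle. */
int toy_leftover_refs(int setter_takes_ref)
{
    struct toy_page page = { 1 }; /* allocation hands back one reference */
    struct toy_skb skb = { NULL };

    if (setter_takes_ref)
        toy_frag_set_page_get(&skb, &page);
    else
        toy_frag_set_page_steal(&skb, &page);
    toy_skb_release(&skb);
    return page.refcount;
}
```

A leftover count of 1 means the page is never freed, which matches the shrinking MemFree in the report. A leftover count of 0 is what callers that hand over their reference expect.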
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
This change is correct as things stand today, so:
Acked-by: Ian Campbell <ian.campbell@citrix.com>
but perhaps it would be better to hold off and let me fix all of these
all at once.
Thanks for bringing this to my attention Krishna.
Ian.
^ permalink raw reply
* RE: [PATCH 3/7] mlx4_en: Incoming traffic alignment optimizations
From: Eric Dumazet @ 2011-10-18 8:40 UTC (permalink / raw)
To: Yevgeny Petrilin; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <953B660C027164448AE903364AC447D2235EEB7E@MTLDAG01.mtl.com>
Le mardi 18 octobre 2011 à 07:43 +0000, Yevgeny Petrilin a écrit :
> >
> > > @@ -91,7 +91,7 @@
> > > /* Receive fragment sizes; we use at most 4 fragments (for 9600 byte MTU
> > > * and 4K allocations) */
> > > enum {
> > > - FRAG_SZ0 = 512 - NET_IP_ALIGN,
> > > + FRAG_SZ0 = 2048,
> > > FRAG_SZ1 = 1024,
> > > FRAG_SZ2 = 4096,
> > > FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
> >
> > Is the 512 -> 2048 change really wanted ? Its not mentioned in changelog and is confusing.
> >
> > This means mlx4 lost the ability to use a small frag (512 bytes) to store small frames.
> >
> The change is wanted as an optimization for our HW.
> We do get better numbers with this change, even with small packets.
> You are correct, I should have mentioned it in the changelog.
Oh my...
Of course you are aware that the 'truesize' stuff around means that
using big frag size will probably lower your performance number, unless
you allow protocol stacks to use more ram ?
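A rough arithmetic sketch of the truesize point, assuming each received frame is charged its fragment size plus a fixed per-skb overhead. The 256-byte overhead and 64 KiB receive budget used in the note below are assumed round numbers, not kernel constants.

```c
#include <assert.h>

/*
 * Frames that fit in a receive budget when each frame is charged
 * its fragment size plus a fixed per-skb overhead.
 */
unsigned int frames_in_budget(unsigned int rcvbuf, unsigned int frag_size,
                              unsigned int skb_overhead)
{
    assert(frag_size + skb_overhead > 0);
    return rcvbuf / (frag_size + skb_overhead);
}
```

Under these assumptions a 65536-byte budget holds 85 small frames with 512-byte frags but only 28 with 2048-byte frags, so the same socket limit is reached about three times sooner.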
The only possible drawback of using 512 bytes instead of 2048 is the cache-line
bounce on the page->_count field. So I would say your change hides a
performance issue of your driver.
Maybe you should make sure you don't touch it too often [ You should use
a single add per allocated PAGE, not 2 (for 2048-byte frags) or 8 (for
512-byte frags) ]
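A minimal sketch of the "single add per page" suggestion, assuming a 4096-byte page and a toy counter that records how many times the reference count is written. The names are hypothetical and nothing here is taken from the mlx4 driver.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096u

/* Toy page: reference count plus a tally of counter updates. */
struct rx_page {
    unsigned int refs;
    unsigned int atomic_ops;
};

/* One counter update per fragment carved out of the page. */
void carve_per_frag(struct rx_page *p, unsigned int frag_size)
{
    assert(frag_size > 0 && frag_size <= TOY_PAGE_SIZE);
    for (unsigned int off = 0; off + frag_size <= TOY_PAGE_SIZE;
         off += frag_size) {
        p->refs += 1;
        p->atomic_ops += 1;
    }
}

/* One counter update covering every fragment of the page. */
void carve_batched(struct rx_page *p, unsigned int frag_size)
{
    assert(frag_size > 0 && frag_size <= TOY_PAGE_SIZE);
    p->refs += TOY_PAGE_SIZE / frag_size;
    p->atomic_ops += 1;
}

/* Report how many counter updates one page costs in each scheme. */
unsigned int ops_per_page(unsigned int frag_size, int batched)
{
    struct rx_page p = { 0, 0 };

    if (batched)
        carve_batched(&p, frag_size);
    else
        carve_per_frag(&p, frag_size);
    return p.atomic_ops;
}
```

Both schemes end with the same reference count. Only the number of writes to the shared counter differs, and that is what causes the cache-line bounce.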
^ permalink raw reply
* [net-next 0/7] stmmac: update to Oct 2011 version (V2)
From: Giuseppe CAVALLARO @ 2011-10-18 8:42 UTC (permalink / raw)
To: netdev; +Cc: davem, Giuseppe Cavallaro
These patches update the driver, adding the chained
descriptor mode and some useful new fixes.
I've reviewed some patches after the V1 and V2:
stmmac: allow mtu bigger than 1500 in case of normal desc (V3)
|-> removed the useless max_mtu init: Thx Eric's feedback
stmmac: add CHAINED descriptor mode support (V3)
|-> removed ifdef in the C and added small routines
specialised for chained/ring modes. See comment
within the patch itself. Thx David's feedback.
stmmac: allow mmc usage only if feature actually available (V3)
|-> added a check if interface is NULL
In the end, I added two new small patches:
stmmac: use predefined macros for HW cap register fields (V3)
stmmac: allow mmc usage only if feature actually available (V3)
made by Rayagond and reviewed/reworked by myself
Let me know if it's ok.
Giuseppe Cavallaro (5):
stmmac: protect tx process with lock (V3)
stmmac: update the driver version and doc (V3)
stmmac: allow mtu bigger than 1500 in case of normal desc (V3)
stmmac: allow mmc usage only if feature actually available (V3)
stmmac: add CHAINED descriptor mode support (V3)
Rayagond Kokatanur (1):
stmmac: use predefined macros for HW cap register fields (V3)
Srinivas Kandagatla (1):
stmmac: Stop advertising 1000Base capabilties for non GMII iface
(V3).
Documentation/networking/stmmac.txt | 11 +-
drivers/net/ethernet/stmicro/stmmac/Kconfig | 18 ++
drivers/net/ethernet/stmicro/stmmac/Makefile | 2 +
drivers/net/ethernet/stmicro/stmmac/chain_mode.c | 137 +++++++++++++
drivers/net/ethernet/stmicro/stmmac/common.h | 43 ++++
drivers/net/ethernet/stmicro/stmmac/descs_com.h | 126 ++++++++++++
drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 22 +--
drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 14 +-
drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 126 ++++++++++++
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 5 +-
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 24 ++-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 212 ++++++++++----------
12 files changed, 602 insertions(+), 138 deletions(-)
create mode 100644 drivers/net/ethernet/stmicro/stmmac/chain_mode.c
create mode 100644 drivers/net/ethernet/stmicro/stmmac/descs_com.h
create mode 100644 drivers/net/ethernet/stmicro/stmmac/ring_mode.c
--
1.7.4.4
^ permalink raw reply