* Re: [0/8] netpoll/bridge fixes
From: Eric Dumazet @ 2010-07-19 16:52 UTC (permalink / raw)
To: David Miller; +Cc: herbert, mst, shemminger, frzhang, netdev, amwang, mpm
In-Reply-To: <20100719.090503.73693858.davem@davemloft.net>
Le lundi 19 juillet 2010 à 09:05 -0700, David Miller a écrit :
> I thought we did that already.... oh I see, we did it for bonding:
>
> commit c22d7ac844f1cb9c6a5fd20f89ebadc2feef891b
> Author: Andy Gospodarek <andy@greyhouse.net>
> Date: Fri Jun 25 09:50:44 2010 +0000
>
BTW, this added following warning :
[PATCH] bonding: avoid a warning
drivers/net/bonding/bond_main.c:179:12: warning: ‘disable_netpoll’
defined but not used
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8228088..20f45cb 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -176,7 +176,9 @@ static int arp_ip_count;
static int bond_mode = BOND_MODE_ROUNDROBIN;
static int xmit_hashtype = BOND_XMIT_POLICY_LAYER2;
static int lacp_fast;
+#ifdef CONFIG_NET_POLL_CONTROLLER
static int disable_netpoll = 1;
+#endif
const struct bond_parm_tbl bond_lacp_tbl[] = {
{ "slow", AD_LACP_SLOW},
^ permalink raw reply related
* Re: Are concurrent calls to tc action ipt safe?
From: Jan Engelhardt @ 2010-07-19 16:44 UTC (permalink / raw)
To: Gerd v. Egidy; +Cc: netfilter-devel, netdev
In-Reply-To: <201007191623.40423.lists@egidy.de>
On Monday 2010-07-19 16:23, Gerd v. Egidy wrote:
>AFAIK, current iptables has a short race condition when two rules within the
>same table are changed at once.
>
>E.g. when two users simultaneously call something like this
>iptables -t filter -A INPUT -s 192.168.1.1 -j MARK --set-mark 1
>and
>iptables -t filter -A INPUT -s 192.168.1.2 -j MARK --set-mark 2
>one of these entries can get lost.
There are many serialization techniques possible to serialize iptables
execution.
>tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 \
>match ip src 192.168.1.1 \
>action ipt -j MARK --set-mark 1
>
>Since this call uses the xtables targets I'm currently not sure if the same
>problem regarding concurrent changes exists or not. Can anyone tell me if
>concurrent calls like this are safe?
This target invocation is not in any table, thus there is no race
condition.
^ permalink raw reply
* Re: [PATCH net-next-2.6] net: 64bit stats for netdev_queue
From: David Miller @ 2010-07-19 16:35 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1279546422.2553.45.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 19 Jul 2010 15:33:42 +0200
> Since struct netdev_queue tx_bytes/tx_packets/tx_dropped are already
> protected by _xmit_lock, its easy to convert these fields to u64 instead
> of unsigned long.
> This completes 64bit stats for devices using them (vlan, macvlan, ...)
>
> Strictly, we could avoid the locking in dev_txq_stats_fold() on 64bit
> arches, but its slow path and we prefer keep it simple.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied, thanks a lot Eric.
^ permalink raw reply
* Re: [BUG net-next-2.6] vlan, bonding, bnx2 problems
From: David Miller @ 2010-07-19 16:35 UTC (permalink / raw)
To: eric.dumazet; +Cc: mchan, pedro.netdev, netdev, kaber, bhutchings
In-Reply-To: <1279545854.2553.37.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 19 Jul 2010 15:24:14 +0200
> [RFC net-next-2.6] bonding: fix bond_inet6addr_event()
>
> After commit ad1afb0039391 (vlan_dev: VLAN 0 should be treated
> as "no vlan tag" (802.1p packet)),
> bond_inet6addr_event() might be called with a NULL bond->vlgrp pointer, and
> a non empty bond->vlan_list. vlan_group_get_device() is dereferencing a NULL pointer.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
I'll apply this bandaid for now, but yes we need to think more
deeply about this.
^ permalink raw reply
* Re: [PATCH] s2io: Remove unnecessary memset of netdev private data
From: David Miller @ 2010-07-19 16:28 UTC (permalink / raw)
To: tklauser; +Cc: netdev, kernel-janitors
In-Reply-To: <1279529758-7901-1-git-send-email-tklauser@distanz.ch>
From: Tobias Klauser <tklauser@distanz.ch>
Date: Mon, 19 Jul 2010 10:55:58 +0200
> The memory for the private data is allocated using kzalloc in
> alloc_etherdev (or alloc_netdev_mq respectively) so there is no need to
> set it to 0 again.
>
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Applied, thanks.
^ permalink raw reply
* Re: [RFC 1/2] netfilter: xt_condition: export list management code
From: Jan Engelhardt @ 2010-07-19 16:13 UTC (permalink / raw)
To: Luciano Coelho
Cc: Netfilter Developer Mailing List, netdev, Patrick McHardy, sameo
In-Reply-To: <1279548947-10470-2-git-send-email-luciano.coelho@nokia.com>
On Monday 2010-07-19 16:15, Luciano Coelho wrote:
>From: Luciano Coelho <coelho@testbed>
>
>This patch isolates and exports the condition list management code, in
>preparation for the CONDITION target to use it. No functional change,
>just reorganization of the code.
Well, I guess it would make more sense if the two extensions be in a
single file. That would alleviate the need for export reorganizations,
and also works because the module metadata overhead is large already.
>@@ -3,12 +3,27 @@
>
> #include <linux/types.h>
>
>+#define XT_CONDITION_MAX_NAME_SIZE 30
>+
> struct xt_condition_mtinfo {
>- char name[31];
>+ char name[XT_CONDITION_MAX_NAME_SIZE + 1];
> __u8 invert;
Oh noes. Please please avoid any math operations inside []. It has
already driven XT_FUNCTION_MAXNAMELEN into nuts ("was it now +1 or -1,
or even -2 that we needed to pass for various functions?"). Just let MAX
be 31 and have name[MAX].
> MODULE_ALIAS("ip6t_condition");
>
>-struct condition_variable {
>- struct list_head list;
>- struct proc_dir_entry *status_proc;
>- unsigned int refcount;
>- bool enabled;
>-};
Given your excellent usage example of a CONDITION target, I think it
even makes sense to enlarge the "enabled" variable to a full-fledged
32-bit value that can be &, | and ^'d, similar to nfmark.
^ permalink raw reply
* Re: [0/8] netpoll/bridge fixes
From: David Miller @ 2010-07-19 16:05 UTC (permalink / raw)
To: herbert; +Cc: mst, shemminger, frzhang, netdev, amwang, mpm
In-Reply-To: <20100719115411.GA22758@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 19 Jul 2010 19:54:11 +0800
> Still, it might be a good idea to disable bridge netpoll in
> 2.6.35.
I thought we did that already.... oh I see, we did it for bonding:
commit c22d7ac844f1cb9c6a5fd20f89ebadc2feef891b
Author: Andy Gospodarek <andy@greyhouse.net>
Date: Fri Jun 25 09:50:44 2010 +0000
bonding: prevent netpoll over bonded interfaces
I'm fine with disabling it for bridging too, just send me a patch
similar to the bonding one.
^ permalink raw reply
* Re: bnx2/5709: Strange interrupts spread
From: Christophe Ngo Van Duc @ 2010-07-19 15:55 UTC (permalink / raw)
To: netdev
In-Reply-To: <AANLkTiniNNPV9ztxXHtX4np7PIZabkm0I4v5O29chf8i@mail.gmail.com>
Dear list,
So i've been able to do some test today:
If I put the 2 interface in a bridge with no IP adress, the interrupts
are on 1 CPU
If I put the 2 interface in a bridge with IP adress, the interrupts
are still on 1 CPU
If I put the 2 interface outside the bridge with IP address,
everything works fine the interrupts get spread on the CPU
So the conclusion seems to be that when the bnx2 is put into
promiscuous mode by the bridge, the RSS hash stop to work even if
traffic is IP in nature.
Best regards,
Christophe.
On Fri, Jul 2, 2010 at 5:33 PM, Christophe Ngo Van Duc
<cngovanduc@gmail.com> wrote:
> Dear list,
>
> I hope I am posting to the correct place...
>
> I am facing a strange issue on a HP DL 360.
>
> I have 2 internal ethernet cards (the one that came by default with
> the server) and 2 additional ethernet cards for a total for 4 ethernet
> cards.
>
> The 2 internal cards are running fine as of interrupts (for example eth1):
> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
> CPU6 CPU7
>
> 71: 604 11933 40 1537 0
> 0 0 6043 PCI-MSI-edge eth1-0
> 72: 24805 9795 3606 0 128
> 0 3365 0 PCI-MSI-edge eth1-1
> 73: 0 279 0 429 38
> 16540 0 30843 PCI-MSI-edge eth1-2
> 74: 0 0 25365 267 0
> 0 89 15541 PCI-MSI-edge eth1-3
> 75: 7244 24108 0 0 16488
> 0 240 0 PCI-MSI-edge eth1-4
> 76: 21378 3628 7726 0 49
> 247 2871 0 PCI-MSI-edge eth1-5
> 77: 0 0 47199 459 13
> 46 63064 18 PCI-MSI-edge eth1-6
> 78: 0 6230 67 283 259
> 82 7846 27130 PCI-MSI-edge eth1-7
>
> On eth2 (external card) all interrupts goes to CPU0
> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
> CPU6 CPU7
> 80: 46973077 0 0 0 0
> 0 0 PCI-MSI-edge eth2-0
> 81: 0 0 0 0 0
> 0 0 0 PCI-MSI-edge eth2-1
> 82: 0 0 0 0 0
> 0 0 0 PCI-MSI-edge eth2-2
> 83: 0 0 0 0 0
> 0 0 0 PCI-MSI-edge eth2-3
> 84: 0 0 0 0 0
> 0 0 0 PCI-MSI-edge eth2-4
> 85: 0 0 0 0 0
> 0 0 0 PCI-MSI-edge eth2-5
> 86: 0 0 2445 0 37
> 0 8463 13 PCI-MSI-edge eth2-6
> 87: 0 0 0 0 0
> 0 0 0 PCI-MSI-edge eth2-7
>
> If I understand correctly the RSS hash is used to dispatch the packets
> into the different queues running on the different CPU.
>
> Why then my internal cards are running fine but the additional cards
> (eth2 and eth3) are presenting this behavior where all interrupts goes
> to one CPU?
>
> Thanks for your help in understanding this. (see below for config details)
>
> Christophe.
>
> All are detected correctly at boot:
> Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.8e (April 13, 2010)
> bnx2 0000:02:00.0: PCI INT A -> GSI 31 (level, low) -> IRQ 31
> bnx2 0000:02:00.0: setting latency timer to 64
> eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found
> at mem f4000000, IRQ 31, node addr f4:ce:46:86:a1:00
> bnx2 0000:02:00.1: PCI INT B -> GSI 39 (level, low) -> IRQ 39
> bnx2 0000:02:00.1: setting latency timer to 64
> eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found
> at mem f2000000, IRQ 39, node addr f4:ce:46:86:a1:02
> bnx2 0000:07:00.0: PCI INT A -> GSI 24 (level, low) -> IRQ 24
> bnx2 0000:07:00.0: setting latency timer to 64
> eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found
> at mem fa000000, IRQ 24, node addr 00:26:55:87:17:98
> bnx2 0000:07:00.1: PCI INT B -> GSI 34 (level, low) -> IRQ 34
> bnx2 0000:07:00.1: setting latency timer to 64
> eth3: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found
> at mem f8000000, IRQ 34, node addr 00:26:55:87:17:9a
>
> Kernel is 2.6.31-13
> Broadcom driver bnx2 v2.0.8e
>
> eth0 is a normal interface with an Ip address
> eth1 is a normal interface with an Ip address
> eth2 belongs to a bridge interface without an ip address, running tc (htb)
> eth3 belongs to the same bridge interface without an ip address
>
^ permalink raw reply
* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
From: Tejun Heo @ 2010-07-19 14:57 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Lennart Schulte, Eric Dumazet, David S. Miller, lkml,
netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert
In-Reply-To: <alpine.DEB.2.00.1007161448330.13946@melkinpaasi.cs.helsinki.fi>
Hello,
On 07/16/2010 02:02 PM, Ilpo Järvinen wrote:
> Besides, Tejun has also found that it's hint->next ptr which is NULL in
> his case so this won't solve his case anyway. Tejun, can you confirm
> whether it was retransmit_skb_hint->next being NULL on _entry time_ to
> tcp_xmit_retransmit_queue() or later on in the loop after the updates done
> by the loop itself to the hint (or that your testing didn't conclude
> either)?
Sorry about the delay. I was traveling last week. Unfortunately, I
don't know whether ->next was NULL on entry or not. I hacked up the
following ugly patch for the next test run. It should have everything
which has come up till now + list and hint sanity checking before
starting processing them. I'm planning on deploying it w/ crashdump
enabled in several days. If I've missed something, please let me
know.
Thanks.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..1c8b1e0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2190,6 +2190,53 @@ static int tcp_can_forward_retransmit(struct sock *sk)
return 1;
}
+static void print_queue(struct sock *sk, struct sk_buff *old, struct sk_buff *hole)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct sk_buff *skb, *prev;
+ bool do_panic = false;
+
+ skb = tcp_write_queue_head(sk);
+ prev = (struct sk_buff *)(&sk->sk_write_queue);
+
+ if (skb == NULL) {
+ printk("XXX NULL head, pkts %u\n", tp->packets_out);
+ do_panic = true;
+ }
+
+ printk("XXX head %p tail %p sendhead %p oldhint %p now %p hole %p high %u\n",
+ tcp_write_queue_head(sk), tcp_write_queue_tail(sk),
+ tcp_send_head(sk), old, tp->retransmit_skb_hint, hole,
+ tp->retransmit_high);
+
+ while (skb) {
+ printk("XXX skb %p (%u-%u) next %p prev %p sacked %u\n",
+ skb, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq,
+ skb->next, skb->prev, TCP_SKB_CB(skb)->sacked);
+ if (prev != skb->prev) {
+ printk("XXX Inconsistent prev\n");
+ do_panic = true;
+ }
+
+ if (skb == tcp_write_queue_tail(sk)) {
+ if (skb->next != (struct sk_buff *)(&sk->sk_write_queue)) {
+ printk("XXX Improper next at tail\n");
+ do_panic = true;
+ }
+ break;
+ }
+
+ prev = skb;
+ skb = skb->next;
+ }
+ if (!skb) {
+ printk("XXX Encountered unexpected NULL\n");
+ do_panic = true;
+ }
+ if (do_panic)
+ panic("XXX panicking");
+}
+
/* This gets called after a retransmit timeout, and the initially
* retransmitted data is acknowledged. It tries to continue
* resending the rest of the retransmit queue, until either
@@ -2198,19 +2245,53 @@ static int tcp_can_forward_retransmit(struct sock *sk)
* based retransmit packet might feed us FACK information again.
* If so, we use it to avoid unnecessarily retransmissions.
*/
+static unsigned int caught_it;
+
void tcp_xmit_retransmit_queue(struct sock *sk)
{
const struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
- struct sk_buff *skb;
+ struct sk_buff *skb, *prev;
struct sk_buff *hole = NULL;
+ struct sk_buff *old = tp->retransmit_skb_hint;
u32 last_lost;
int mib_idx;
int fwd_rexmitting = 0;
+ bool saw_hint = false;
+
+ if (!tp->packets_out) {
+ if (net_ratelimit())
+ printk("XXX !tp->packets_out, retransmit_skb_hint=%p, write_queue_head=%p\n",
+ tp->retransmit_skb_hint, tcp_write_queue_head(sk));
+ return;
+ }
if (!tp->lost_out)
tp->retransmit_high = tp->snd_una;
+ for (skb = tcp_write_queue_head(sk),
+ prev = (struct sk_buff *)&sk->sk_write_queue;
+ skb != (struct sk_buff *)&sk->sk_write_queue;
+ prev = skb, skb = skb->next) {
+ if (prev != skb->prev) {
+ printk("XXX sanity check: prev corrupt\n");
+ print_queue(sk, old, hole);
+ }
+ if (skb == tp->retransmit_skb_hint)
+ saw_hint = true;
+ if (skb == tcp_write_queue_tail(sk) &&
+ skb->next != (struct sk_buff *)(&sk->sk_write_queue)) {
+ printk("XXX sanity check: end corrupt\n");
+ print_queue(sk, old, hole);
+ }
+ }
+ if (tp->retransmit_skb_hint && !saw_hint) {
+ printk("XXX sanity check: retransmit_skb_hint=%p is not on list, claring hint\n",
+ tp->retransmit_skb_hint);
+ print_queue(sk, old, hole);
+ tp->retransmit_skb_hint = NULL;
+ }
+
if (tp->retransmit_skb_hint) {
skb = tp->retransmit_skb_hint;
last_lost = TCP_SKB_CB(skb)->end_seq;
@@ -2218,7 +2299,17 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
last_lost = tp->retransmit_high;
} else {
skb = tcp_write_queue_head(sk);
- last_lost = tp->snd_una;
+ if (skb)
+ last_lost = tp->snd_una;
+ }
+
+checknull:
+ if (skb == NULL) {
+ print_queue(sk, old, hole);
+ caught_it++;
+ if (net_ratelimit())
+ printk("XXX Errors caught so far %u\n", caught_it);
+ return;
}
tcp_for_write_queue_from(skb, sk) {
@@ -2261,7 +2352,7 @@ begin_fwd:
} else if (!(sacked & TCPCB_LOST)) {
if (hole == NULL && !(sacked & (TCPCB_SACKED_RETRANS|TCPCB_SACKED_ACKED)))
hole = skb;
- continue;
+ goto cont;
} else {
last_lost = TCP_SKB_CB(skb)->end_seq;
@@ -2272,7 +2363,7 @@ begin_fwd:
}
if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
- continue;
+ goto cont;
if (tcp_retransmit_skb(sk, skb))
return;
@@ -2282,6 +2373,9 @@ begin_fwd:
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
inet_csk(sk)->icsk_rto,
TCP_RTO_MAX);
+cont:
+ skb = skb->next;
+ goto checknull;
}
}
--
tejun
^ permalink raw reply related
* Re: [PATCH 07/11] Removing dead ARCH_PNX010X
From: Christoph Egger @ 2010-07-19 14:37 UTC (permalink / raw)
To: David Miller
Cc: joe, shemminger, dongdong.deng, jkosina, netdev, linux-kernel,
vamos-dev
In-Reply-To: <20100714.133916.71109591.davem@davemloft.net>
On Wed, Jul 14, 2010 at 01:39:16PM -0700, David Miller wrote:
> From: Christoph Egger <siccegge@cs.fau.de>
> Date: Wed, 14 Jul 2010 14:41:09 +0200
>
> > ARCH_PNX010X doesn't exist in Kconfig, therefore removing all
> > references for it from the source code.
> >
> > Signed-off-by: Christoph Egger <siccegge@cs.fau.de>
>
> If you are going to kill this off, kill the references in
> driver/net/Kconfig at the same time.
>
> Please fix this up and resubmit your patch, thanks.
DOne, patch below
Thanks
CHristoph
---
>From ed6ffbfd77e14f17fa7d75ddf70b0d3b0126848c Mon Sep 17 00:00:00 2001
From: Christoph Egger <siccegge@cs.fau.de>
Date: Wed, 14 Jul 2010 14:19:15 +0200
Subject: [PATCH] Removing dead ARCH_PNX010X
ARCH_PNX010X doesn't exist in Kconfig, therefore removing all
references for it from the source code/Kconfig.
Signed-off-by: Christoph Egger <siccegge@cs.fau.de>
---
drivers/net/Kconfig | 4 ++--
drivers/net/cs89x0.c | 45 ---------------------------------------------
2 files changed, 2 insertions(+), 47 deletions(-)
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index ce2fcdd..ba5b862 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1463,7 +1463,7 @@ config FORCEDETH
config CS89x0
tristate "CS89x0 support"
depends on NET_ETHERNET && (ISA || EISA || MACH_IXDP2351 \
- || ARCH_IXDP2X01 || ARCH_PNX010X || MACH_MX31ADS)
+ || ARCH_IXDP2X01 || MACH_MX31ADS)
---help---
Support for CS89x0 chipset based Ethernet cards. If you have a
network (Ethernet) card of this type, say Y and read the
@@ -1477,7 +1477,7 @@ config CS89x0
config CS89x0_NONISA_IRQ
def_bool y
depends on CS89x0 != n
- depends on MACH_IXDP2351 || ARCH_IXDP2X01 || ARCH_PNX010X || MACH_MX31ADS
+ depends on MACH_IXDP2351 || ARCH_IXDP2X01 || MACH_MX31ADS
config TC35815
tristate "TOSHIBA TC35815 Ethernet support"
diff --git a/drivers/net/cs89x0.c b/drivers/net/cs89x0.c
index 2ccb9f1..7a5d787 100644
--- a/drivers/net/cs89x0.c
+++ b/drivers/net/cs89x0.c
@@ -180,12 +180,6 @@ static unsigned int cs8900_irq_map[] = {IRQ_IXDP2351_CS8900, 0, 0, 0};
#elif defined(CONFIG_ARCH_IXDP2X01)
static unsigned int netcard_portlist[] __used __initdata = {IXDP2X01_CS8900_VIRT_BASE, 0};
static unsigned int cs8900_irq_map[] = {IRQ_IXDP2X01_CS8900, 0, 0, 0};
-#elif defined(CONFIG_ARCH_PNX010X)
-#include <mach/gpio.h>
-#define CIRRUS_DEFAULT_BASE IO_ADDRESS(EXT_STATIC2_s0_BASE + 0x200000) /* = Physical address 0x48200000 */
-#define CIRRUS_DEFAULT_IRQ VH_INTC_INT_NUM_CASCADED_INTERRUPT_1 /* Event inputs bank 1 - ID 35/bit 3 */
-static unsigned int netcard_portlist[] __used __initdata = {CIRRUS_DEFAULT_BASE, 0};
-static unsigned int cs8900_irq_map[] = {CIRRUS_DEFAULT_IRQ, 0, 0, 0};
#elif defined(CONFIG_MACH_MX31ADS)
#include <mach/board-mx31ads.h>
static unsigned int netcard_portlist[] __used __initdata = {
@@ -372,18 +366,6 @@ writeword(unsigned long base_addr, int portno, u16 value)
{
__raw_writel(value, base_addr + (portno << 1));
}
-#elif defined(CONFIG_ARCH_PNX010X)
-static u16
-readword(unsigned long base_addr, int portno)
-{
- return inw(base_addr + (portno << 1));
-}
-
-static void
-writeword(unsigned long base_addr, int portno, u16 value)
-{
- outw(value, base_addr + (portno << 1));
-}
#else
static u16
readword(unsigned long base_addr, int portno)
@@ -546,30 +528,6 @@ cs89x0_probe1(struct net_device *dev, int ioaddr, int modular)
#endif
}
-#ifdef CONFIG_ARCH_PNX010X
- initialize_ebi();
-
- /* Map GPIO registers for the pins connected to the CS8900a. */
- if (map_cirrus_gpio() < 0)
- return -ENODEV;
-
- reset_cirrus();
-
- /* Map event-router registers. */
- if (map_event_router() < 0)
- return -ENODEV;
-
- enable_cirrus_irq();
-
- unmap_cirrus_gpio();
- unmap_event_router();
-
- dev->base_addr = ioaddr;
-
- for (i = 0 ; i < 3 ; i++)
- readreg(dev, 0);
-#endif
-
/* Grab the region so we can find another board if autoIRQ fails. */
/* WTF is going on here? */
if (!request_region(ioaddr & ~3, NETCARD_IO_EXTENT, DRV_NAME)) {
@@ -1391,9 +1349,6 @@ net_open(struct net_device *dev)
case A_CNF_MEDIA_10B_2: result = lp->adapter_cnf & A_CNF_10B_2; break;
default: result = lp->adapter_cnf & (A_CNF_10B_T | A_CNF_AUI | A_CNF_10B_2);
}
-#ifdef CONFIG_ARCH_PNX010X
- result = A_CNF_10B_T;
-#endif
if (!result) {
printk(KERN_ERR "%s: EEPROM is configured for unavailable media\n", dev->name);
release_dma:
--
1.7.0.4
^ permalink raw reply related
* Re: [RFC 0/2] netfilter: xtables: CONDITION target implementation
From: Luciano Coelho @ 2010-07-19 14:31 UTC (permalink / raw)
To: ext Changli Gao
Cc: netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
kaber@trash.net, jengelh@medozas.de, sameo@linux.intel.com
In-Reply-To: <AANLkTilRtCiuAwox2sTsEu73fE89OLEUtIbVQ_njtkqz@mail.gmail.com>
On Mon, 2010-07-19 at 16:27 +0200, ext Changli Gao wrote:
> On Mon, Jul 19, 2010 at 10:15 PM, Luciano Coelho
> <luciano.coelho@nokia.com> wrote:
> > From: Luciano Coelho <coelho@testbed.(none)>
> >
> > Hi all,
> >
> > As discussed earlier, I've been looking for a way to enable and disable the
> > condition match automatically, in the netfilter tables themselves (ie. without
> > the need to use procfs).
> >
> > This is my initial implementation. Please let me know how it looks. The first
> > patch is based on the xt_condition patch that Jan sent to the list (but which
> > has not been finalized for inclusion yet). Once the condition match gets
> > applied, I'll forward port my patch and submit it again.
> >
> > Cheers,
> > Luca.
> >
> > Luciano Coelho (2):
> > netfilter: xt_condition: export list management code
> > netfilter: xtables: implement CONDITION target
> >
> > include/linux/netfilter/Kbuild | 1 +
> > include/linux/netfilter/xt_CONDITION.h | 39 +++++++++++
> > include/linux/netfilter/xt_condition.h | 17 +++++-
> > net/netfilter/Kconfig | 12 ++++
> > net/netfilter/Makefile | 1 +
> > net/netfilter/xt_CONDITION.c | 112 ++++++++++++++++++++++++++++++++
> > net/netfilter/xt_condition.c | 82 ++++++++++++++----------
>
> Why not combine xt_CONDITION.c and xt_condition.c into xt_condition.c,
> like xt_mark.c?
I just thought that someone may want to use the condition match without
using the CONDITION target, that's why I've put it in a different
module.
But I don't have a strong opinion about this. If everybody agrees on
that, I can merge the code into a single module.
Thanks for your comment.
--
Cheers,
Luca.
^ permalink raw reply
* Re: [RFC 0/2] netfilter: xtables: CONDITION target implementation
From: Changli Gao @ 2010-07-19 14:27 UTC (permalink / raw)
To: Luciano Coelho; +Cc: netfilter-devel, netdev, kaber, jengelh, sameo
In-Reply-To: <1279548947-10470-1-git-send-email-luciano.coelho@nokia.com>
On Mon, Jul 19, 2010 at 10:15 PM, Luciano Coelho
<luciano.coelho@nokia.com> wrote:
> From: Luciano Coelho <coelho@testbed.(none)>
>
> Hi all,
>
> As discussed earlier, I've been looking for a way to enable and disable the
> condition match automatically, in the netfilter tables themselves (ie. without
> the need to use procfs).
>
> This is my initial implementation. Please let me know how it looks. The first
> patch is based on the xt_condition patch that Jan sent to the list (but which
> has not been finalized for inclusion yet). Once the condition match gets
> applied, I'll forward port my patch and submit it again.
>
> Cheers,
> Luca.
>
> Luciano Coelho (2):
> netfilter: xt_condition: export list management code
> netfilter: xtables: implement CONDITION target
>
> include/linux/netfilter/Kbuild | 1 +
> include/linux/netfilter/xt_CONDITION.h | 39 +++++++++++
> include/linux/netfilter/xt_condition.h | 17 +++++-
> net/netfilter/Kconfig | 12 ++++
> net/netfilter/Makefile | 1 +
> net/netfilter/xt_CONDITION.c | 112 ++++++++++++++++++++++++++++++++
> net/netfilter/xt_condition.c | 82 ++++++++++++++----------
Why not combine xt_CONDITION.c and xt_condition.c into xt_condition.c,
like xt_mark.c?
--
Regards,
Changli Gao(xiaosuo@gmail.com)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Are concurrent calls to tc action ipt safe?
From: Gerd v. Egidy @ 2010-07-19 14:23 UTC (permalink / raw)
To: netfilter-devel; +Cc: netdev
Hi,
AFAIK, current iptables has a short race condition when two rules within the
same table are changed at once.
E.g. when two users simultaneously call something like this
iptables -t filter -A INPUT -s 192.168.1.1 -j MARK --set-mark 1
and
iptables -t filter -A INPUT -s 192.168.1.2 -j MARK --set-mark 2
one of these entries can get lost.
Jan Engelhard recently posted his xt2 patchset to overcome problems like this,
but it seems to still have performance issues.
I have a set of simple rules which need to change often and are subject to
this problem. I now wonder if I can solve this by using tc and the ipt action:
tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 \
match ip src 192.168.1.1 \
action ipt -j MARK --set-mark 1
Since this call uses the xtables targets I'm currently not sure if the same
problem regarding concurrent changes exists or not. Can anyone tell me if
concurrent calls like this are safe?
Thank you very much.
Kind regards,
Gerd
--
Address (better: trap) for people I really don't want to get mail from:
jonas@cactusamerica.com
^ permalink raw reply
* [RFC 0/2] netfilter: xtables: CONDITION target implementation
From: Luciano Coelho @ 2010-07-19 14:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: netdev, kaber, jengelh, sameo
From: Luciano Coelho <coelho@testbed.(none)>
Hi all,
As discussed earlier, I've been looking for a way to enable and disable the
condition match automatically, in the netfilter tables themselves (ie. without
the need to use procfs).
This is my initial implementation. Please let me know how it looks. The first
patch is based on the xt_condition patch that Jan sent to the list (but which
has not been finalized for inclusion yet). Once the condition match gets
applied, I'll forward port my patch and submit it again.
Cheers,
Luca.
Luciano Coelho (2):
netfilter: xt_condition: export list management code
netfilter: xtables: implement CONDITION target
include/linux/netfilter/Kbuild | 1 +
include/linux/netfilter/xt_CONDITION.h | 39 +++++++++++
include/linux/netfilter/xt_condition.h | 17 +++++-
net/netfilter/Kconfig | 12 ++++
net/netfilter/Makefile | 1 +
net/netfilter/xt_CONDITION.c | 112 ++++++++++++++++++++++++++++++++
net/netfilter/xt_condition.c | 82 ++++++++++++++----------
7 files changed, 229 insertions(+), 35 deletions(-)
create mode 100644 include/linux/netfilter/xt_CONDITION.h
create mode 100644 net/netfilter/xt_CONDITION.c
^ permalink raw reply
* [RFC 2/2] netfilter: xtables: implement CONDITION target
From: Luciano Coelho @ 2010-07-19 14:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: netdev, kaber, jengelh, sameo
In-Reply-To: <1279548947-10470-1-git-send-email-luciano.coelho@nokia.com>
From: Luciano Coelho <coelho@testbed>
This patch implements a condition target, which let's the user set
netfilter rules that enable/disable the conditions used by the
condition match. Originally, the condition match only allowed the
variable to be changed via procfs. This new target makes it easy to
enable or disable the condition depending on the rules set.
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
---
include/linux/netfilter/Kbuild | 1 +
include/linux/netfilter/xt_CONDITION.h | 39 +++++++++++
net/netfilter/Kconfig | 12 ++++
net/netfilter/Makefile | 1 +
net/netfilter/xt_CONDITION.c | 112 ++++++++++++++++++++++++++++++++
5 files changed, 165 insertions(+), 0 deletions(-)
create mode 100644 include/linux/netfilter/xt_CONDITION.h
create mode 100644 net/netfilter/xt_CONDITION.c
diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
index c57e099..72eff3a 100644
--- a/include/linux/netfilter/Kbuild
+++ b/include/linux/netfilter/Kbuild
@@ -4,6 +4,7 @@ header-y += nfnetlink_conntrack.h
header-y += nfnetlink_log.h
header-y += nfnetlink_queue.h
header-y += xt_CLASSIFY.h
+header-y += xt_CONDITION.h
header-y += xt_CONNMARK.h
header-y += xt_CONNSECMARK.h
header-y += xt_CT.h
diff --git a/include/linux/netfilter/xt_CONDITION.h b/include/linux/netfilter/xt_CONDITION.h
new file mode 100644
index 0000000..cbffe3f
--- /dev/null
+++ b/include/linux/netfilter/xt_CONDITION.h
@@ -0,0 +1,39 @@
+/*
+ * linux/include/linux/netfilter/xt_CONDITION.h
+ *
+ * Header file for Xtables timer target module.
+ *
+ * Copyright (C) 2010 Nokia Corporation
+ *
+ * Contact: Luciano Coelho <luciano.coelho@nokia.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
+ * 02110-1301 USA
+ */
+
+#ifndef _XT_CONDITION_TG_H
+#define _XT_CONDITION_TG_H
+
+#include <linux/types.h>
+#include <linux/netfilter/xt_condition.h>
+
+struct condition_tg_info {
+ char name[XT_CONDITION_MAX_NAME_SIZE + 1];
+ __u8 enabled;
+
+ /* Used internally by the kernel */
+ void *condvar __attribute__((aligned(8)));
+};
+
+#endif
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index e54e216..1877c6a 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -310,6 +310,18 @@ config NETFILTER_XT_MARK
"Use netfilter MARK value as routing key") and can also be used by
other subsystems to change their behavior.
+config NETFILTER_XT_TARGET_CONDITION
+ tristate "'CONDITION' target support"
+ depends on NETFILTER_ADVANCED
+ select NETFILTER_XT_MATCH_CONDITION
+ help
+
+ Allows changing the condition match value in procfs from the
+ netfilter tables, without requiring userspace to change the
+ condition value.
+
+ To compile it as a module, choose M here. If unsure, say N.
+
config NETFILTER_XT_CONNMARK
tristate 'ctmark target and match support'
depends on NF_CONNTRACK
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 474dd06..9237a67 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_NETFILTER_XT_CONNMARK) += xt_connmark.o
# targets
obj-$(CONFIG_NETFILTER_XT_TARGET_CLASSIFY) += xt_CLASSIFY.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_CONDITION) += xt_CONDITION.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CT) += xt_CT.o
obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o
diff --git a/net/netfilter/xt_CONDITION.c b/net/netfilter/xt_CONDITION.c
new file mode 100644
index 0000000..8150352
--- /dev/null
+++ b/net/netfilter/xt_CONDITION.c
@@ -0,0 +1,112 @@
+/*
+ * linux/net/netfilter/xt_CONDITION.c
+ *
+ * Netfilter module to trigger a timer when packet matches.
+ * After timer expires a kevent will be sent.
+ *
+ * Copyright (C) 2010 Nokia Corporation
+ *
+ * Contact: Luciano Coelho <luciano.coelho@nokia.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
+ * 02110-1301 USA
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_CONDITION.h>
+
+static unsigned int condition_tg_target(struct sk_buff *skb,
+ const struct xt_action_param *par)
+{
+ const struct condition_tg_info *info = par->targinfo;
+
+ pr_debug("setting condition %s, enabled %d\n",
+ info->name, info->enabled);
+
+ xt_condition_set(info->condvar, info->enabled);
+
+ return XT_CONTINUE;
+}
+
+static int condition_tg_checkentry(const struct xt_tgchk_param *par)
+{
+ struct condition_tg_info *info = par->targinfo;
+ struct condition_variable *var;
+
+ pr_debug("checkentry %s\n", info->name);
+
+ /* Forbid certain names */
+ if (*info->name == '\0' || *info->name == '.' ||
+ info->name[sizeof(info->name)-1] != '\0' ||
+ memchr(info->name, '/', sizeof(info->name)) != NULL) {
+ pr_info("name not allowed or too long: \"%.*s\"\n",
+ (unsigned int)sizeof(info->name), info->name);
+ return -EINVAL;
+ }
+
+ var = xt_condition_insert(info->name);
+ if (var == NULL)
+ return -ENOMEM;
+
+ info->condvar = var;
+ return 0;
+}
+
+static void condition_tg_destroy(const struct xt_tgdtor_param *par)
+{
+ const struct condition_tg_info *info = par->targinfo;
+
+ pr_debug("destroy %s\n", info->name);
+
+ xt_condition_put(info->condvar);
+}
+
+static struct xt_target condition_tg __read_mostly = {
+ .name = "CONDITION",
+ .family = NFPROTO_UNSPEC,
+ .target = condition_tg_target,
+ .targetsize = sizeof(struct condition_tg_info),
+ .checkentry = condition_tg_checkentry,
+ .destroy = condition_tg_destroy,
+ .me = THIS_MODULE,
+};
+
+static int __init condition_tg_init(void)
+{
+ int err;
+
+ err = xt_register_target(&condition_tg);
+ if (err < 0) {
+ pr_debug("couldn't register xt target\n");
+ return err;
+ }
+
+ return 0;
+}
+
+static void __exit condition_tg_exit(void)
+{
+ xt_unregister_target(&condition_tg);
+}
+
+module_init(condition_tg_init);
+module_exit(condition_tg_exit);
+
+MODULE_AUTHOR("Luciano Coelho <luciano.coelho@nokia.com>");
+MODULE_DESCRIPTION("Xtables: condition target");
+MODULE_LICENSE("GPL v2");
--
1.7.0.4
^ permalink raw reply related
* [RFC 1/2] netfilter: xt_condition: export list management code
From: Luciano Coelho @ 2010-07-19 14:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: netdev, kaber, jengelh, sameo
In-Reply-To: <1279548947-10470-1-git-send-email-luciano.coelho@nokia.com>
From: Luciano Coelho <coelho@testbed>
This patch isolates and exports the condition list management code, in
preparation for the CONDITION target to use it. No functional change,
just reorganization of the code.
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
---
include/linux/netfilter/xt_condition.h | 17 ++++++-
net/netfilter/xt_condition.c | 82 ++++++++++++++++++-------------
2 files changed, 64 insertions(+), 35 deletions(-)
diff --git a/include/linux/netfilter/xt_condition.h b/include/linux/netfilter/xt_condition.h
index 4faf3ca..eebf41a 100644
--- a/include/linux/netfilter/xt_condition.h
+++ b/include/linux/netfilter/xt_condition.h
@@ -3,12 +3,27 @@
#include <linux/types.h>
+#define XT_CONDITION_MAX_NAME_SIZE 30
+
struct xt_condition_mtinfo {
- char name[31];
+ char name[XT_CONDITION_MAX_NAME_SIZE + 1];
__u8 invert;
/* Used internally by the kernel */
void *condvar __attribute__((aligned(8)));
};
+#ifdef __KERNEL__
+struct condition_variable {
+ struct list_head list;
+ struct proc_dir_entry *status_proc;
+ unsigned int refcount;
+ bool enabled;
+};
+
+struct condition_variable *xt_condition_insert(const char *name);
+void xt_condition_put(struct condition_variable *var);
+void xt_condition_set(struct condition_variable *var, bool enabled);
+#endif /* __KERNEL__ */
+
#endif /* _XT_CONDITION_H */
diff --git a/net/netfilter/xt_condition.c b/net/netfilter/xt_condition.c
index a7ccea3..dec97fe 100644
--- a/net/netfilter/xt_condition.c
+++ b/net/netfilter/xt_condition.c
@@ -43,13 +43,6 @@ MODULE_PARM_DESC(condition_gid_perms, "default group owner of /proc/net/nf_condi
MODULE_ALIAS("ipt_condition");
MODULE_ALIAS("ip6t_condition");
-struct condition_variable {
- struct list_head list;
- struct proc_dir_entry *status_proc;
- unsigned int refcount;
- bool enabled;
-};
-
/* proc_lock is a user context only semaphore used for write access */
/* to the conditions' list. */
static DEFINE_MUTEX(proc_lock);
@@ -100,47 +93,34 @@ condition_mt(const struct sk_buff *skb, struct xt_action_param *par)
return var->enabled ^ info->invert;
}
-static int condition_mt_check(const struct xt_mtchk_param *par)
+struct condition_variable *xt_condition_insert(const char *name)
{
- struct xt_condition_mtinfo *info = par->matchinfo;
struct condition_variable *var;
- /* Forbid certain names */
- if (*info->name == '\0' || *info->name == '.' ||
- info->name[sizeof(info->name)-1] != '\0' ||
- memchr(info->name, '/', sizeof(info->name)) != NULL) {
- pr_info("name not allowed or too long: \"%.*s\"\n",
- (unsigned int)sizeof(info->name), info->name);
- return -EINVAL;
- }
/*
* Let's acquire the lock, check for the condition and add it
* or increase the reference counter.
*/
mutex_lock(&proc_lock);
list_for_each_entry(var, &conditions_list, list) {
- if (strcmp(info->name, var->status_proc->name) == 0) {
+ if (strcmp(name, var->status_proc->name) == 0) {
++var->refcount;
- mutex_unlock(&proc_lock);
- info->condvar = var;
- return 0;
+ goto out;
}
}
/* At this point, we need to allocate a new condition variable. */
var = kmalloc(sizeof(struct condition_variable), GFP_KERNEL);
- if (var == NULL) {
- mutex_unlock(&proc_lock);
- return -ENOMEM;
- }
+ if (var == NULL)
+ goto out;
/* Create the condition variable's proc file entry. */
- var->status_proc = create_proc_entry(info->name, condition_list_perms,
+ var->status_proc = create_proc_entry(name, condition_list_perms,
proc_net_condition);
if (var->status_proc == NULL) {
kfree(var);
- mutex_unlock(&proc_lock);
- return -ENOMEM;
+ var = NULL;
+ goto out;
}
var->refcount = 1;
@@ -151,16 +131,14 @@ static int condition_mt_check(const struct xt_mtchk_param *par)
var->status_proc->uid = condition_uid_perms;
var->status_proc->gid = condition_gid_perms;
list_add(&var->list, &conditions_list);
+out:
mutex_unlock(&proc_lock);
- info->condvar = var;
- return 0;
+ return var;
}
+EXPORT_SYMBOL_GPL(xt_condition_insert);
-static void condition_mt_destroy(const struct xt_mtdtor_param *par)
+void xt_condition_put(struct condition_variable *var)
{
- const struct xt_condition_mtinfo *info = par->matchinfo;
- struct condition_variable *var = info->condvar;
-
mutex_lock(&proc_lock);
if (--var->refcount == 0) {
list_del(&var->list);
@@ -171,6 +149,42 @@ static void condition_mt_destroy(const struct xt_mtdtor_param *par)
}
mutex_unlock(&proc_lock);
}
+EXPORT_SYMBOL_GPL(xt_condition_put);
+
+void xt_condition_set(struct condition_variable *var, bool enabled)
+{
+ var->enabled = enabled;
+}
+EXPORT_SYMBOL_GPL(xt_condition_set);
+
+static int condition_mt_check(const struct xt_mtchk_param *par)
+{
+ struct xt_condition_mtinfo *info = par->matchinfo;
+ struct condition_variable *var;
+
+ /* Forbid certain names */
+ if (*info->name == '\0' || *info->name == '.' ||
+ info->name[sizeof(info->name)-1] != '\0' ||
+ memchr(info->name, '/', sizeof(info->name)) != NULL) {
+ pr_info("name not allowed or too long: \"%.*s\"\n",
+ (unsigned int)sizeof(info->name), info->name);
+ return -EINVAL;
+ }
+
+ var = xt_condition_insert(info->name);
+ if (var == NULL)
+ return -ENOMEM;
+
+ info->condvar = var;
+ return 0;
+}
+
+static void condition_mt_destroy(const struct xt_mtdtor_param *par)
+{
+ const struct xt_condition_mtinfo *info = par->matchinfo;
+
+ xt_condition_put(info->condvar);
+}
static struct xt_match condition_mt_reg __read_mostly = {
.name = "condition",
--
1.7.0.4
^ permalink raw reply related
* Re: [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
From: Eric Dumazet @ 2010-07-19 14:09 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Lennart Schulte, David Miller, Tejun Heo, lkml,
netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert
In-Reply-To: <alpine.DEB.2.00.1007191319010.13002@wel-95.cs.helsinki.fi>
Le lundi 19 juillet 2010 à 14:16 +0300, Ilpo Järvinen a écrit :
> Thanks for testing.
>
> DaveM, I think this oops was introduced for 2.6.28 (in
> 08ebd1721ab8fd362e90ae17b461c07b23fa2824 it seems, to be exact) so to
> stables it should go too please. I've only tweaked the message (so no need
> for Lennart to retest v2 :-)).
>
> --
> [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
>
> It can happen that there are no packets in queue while calling
> tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
> NULL and that gets deref'ed to get sacked into a local var.
>
> There is no work to do if no packets are outstanding so we just
> exit early.
>
> This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
> guard to make joining diff nicer).
>
But prior to commit 08ebd1721ab8fd3, we were not testing
tp->packets_out, but tp->lost_out
if it was 0, we were not doing the tcp_for_write_queue_from() loop.
Not sure it makes a difference ?
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> Reported-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
> Tested-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
> ---
> net/ipv4/tcp_output.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index b4ed957..7ed9dc1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2208,6 +2208,9 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
> int mib_idx;
> int fwd_rexmitting = 0;
>
> + if (!tp->packets_out)
> + return;
> +
> if (!tp->lost_out)
> tp->retransmit_high = tp->snd_una;
>
^ permalink raw reply
* [PATCH net-next-2.6] net: 64bit stats for netdev_queue
From: Eric Dumazet @ 2010-07-19 13:33 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Since struct netdev_queue tx_bytes/tx_packets/tx_dropped are already
protected by _xmit_lock, its easy to convert these fields to u64 instead
of unsigned long.
This completes 64bit stats for devices using them (vlan, macvlan, ...)
Strictly, we could avoid the locking in dev_txq_stats_fold() on 64bit
arches, but its slow path and we prefer keep it simple.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/linux/netdevice.h | 6 +++---
net/core/dev.c | 4 +++-
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index fdc3f29..b626289 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -501,9 +501,9 @@ struct netdev_queue {
* please use this field instead of dev->trans_start
*/
unsigned long trans_start;
- unsigned long tx_bytes;
- unsigned long tx_packets;
- unsigned long tx_dropped;
+ u64 tx_bytes;
+ u64 tx_packets;
+ u64 tx_dropped;
} ____cacheline_aligned_in_smp;
#ifdef CONFIG_RPS
diff --git a/net/core/dev.c b/net/core/dev.c
index 1c002c7..9de75cd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5282,15 +5282,17 @@ void netdev_run_todo(void)
void dev_txq_stats_fold(const struct net_device *dev,
struct rtnl_link_stats64 *stats)
{
- unsigned long tx_bytes = 0, tx_packets = 0, tx_dropped = 0;
+ u64 tx_bytes = 0, tx_packets = 0, tx_dropped = 0;
unsigned int i;
struct netdev_queue *txq;
for (i = 0; i < dev->num_tx_queues; i++) {
txq = netdev_get_tx_queue(dev, i);
+ spin_lock_bh(&txq->_xmit_lock);
tx_bytes += txq->tx_bytes;
tx_packets += txq->tx_packets;
tx_dropped += txq->tx_dropped;
+ spin_unlock_bh(&txq->_xmit_lock);
}
if (tx_bytes || tx_packets || tx_dropped) {
stats->tx_bytes = tx_bytes;
^ permalink raw reply related
* [BUG net-next-2.6] vlan, bonding, bnx2 problems
From: Eric Dumazet @ 2010-07-19 13:24 UTC (permalink / raw)
To: David Miller, Michael Chan; +Cc: pedro.netdev, netdev, kaber, bhutchings
In-Reply-To: <20100718.153910.67919508.davem@davemloft.net>
Le dimanche 18 juillet 2010 à 15:39 -0700, David Miller a écrit :
> From: Pedro Garcia <pedro.netdev@dondevamos.com>
> Date: Sun, 18 Jul 2010 18:43:25 +0200
>
> > - Without the 8021q module loaded in the kernel, all 802.1p packets
> > (VLAN 0 but QoS tagging) are silently discarded (as expected, as
> > the protocol is not loaded).
> >
> > - Without this patch in 8021q module, these packets are forwarded to
> > the module, but they are discarded also if VLAN 0 is not configured,
> > which should not be the default behaviour, as VLAN 0 is not really
> > a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
> > impossible to communicate with mixed 802.1p and non 802.1p devices on
> > the same network due to arp table issues.
> >
> > - Changed logic to skip vlan specific code in vlan_skb_recv if VLAN
> > is 0 and we have not defined a VLAN with ID 0, but we accept the
> > packet with the encapsulated proto and pass it later to netif_rx.
> >
> > - In the vlan device event handler, added some logic to add VLAN 0
> > to HW filter in devices that support it (this prevented any traffic
> > in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
> > and probably also with other HW filtered cards, so we fix it here).
> >
> > - In the vlan unregister logic, prevent the elimination of VLAN 0
> > in devices with HW filter.
> >
> > - The default behaviour is to ignore the VLAN 0 tagging and accept
> > the packet as if it was not tagged, but we can still define a
> > VLAN 0 if desired (so it is backwards compatible).
> >
> > Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
>
> Applied, thanks Pedro.
Hmm, current net-next-2.6 is not working with bonding and bnx2.
I got some fatal oops.
modprobe bond0
ifconfig bond0 down
echo 100 >/sys/class/net/bond0/bonding/miimon
echo 1 >/sys/class/net/bond0/bonding/mode
ifconfig bond0 up
ifenslave bond0 eth1 eth2
ip link set eth1 up
ip link set eth2 up
After some debugging to avoid crashes, I get :
[ 31.784308] bonding: bond0: Setting MII monitoring interval to 100.
[ 31.784391] bonding: bond0: setting mode to active-backup (1).
[ 31.784900] 8021q: adding VLAN 0 to HW filter on device bond0
[ 31.784903] ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 31.904440] ------------[ cut here ]------------
[ 31.904500] WARNING: at drivers/net/bonding/bond_ipv6.c:185 bond_inet6addr_event+0x179/0x240 [bonding]()
[ 31.904576] Hardware name: ProLiant BL460c G1
[ 31.904629] Modules linked in: ipmi_si ipmi_msghandler hpilo bonding ipv6
[ 31.904873] Pid: 4586, comm: ifenslave Tainted: G W 2.6.35-rc1-01453-g3e12451-dirty #836
[ 31.904948] Call Trace:
[ 31.905002] [<c13421c4>] ? printk+0x18/0x1c
[ 31.905057] [<c103c8fd>] warn_slowpath_common+0x6d/0xa0
[ 31.905114] [<f8cf5fd9>] ? bond_inet6addr_event+0x179/0x240 [bonding]
[ 31.905172] [<f8cf5fd9>] ? bond_inet6addr_event+0x179/0x240 [bonding]
[ 31.905236] [<c103c94d>] warn_slowpath_null+0x1d/0x20
[ 31.905296] [<f8cf5fd9>] bond_inet6addr_event+0x179/0x240 [bonding]
[ 31.905354] [<c105b061>] notifier_call_chain+0x41/0x60
[ 31.905409] [<c105b0cd>] atomic_notifier_call_chain+0x1d/0x20
[ 31.905471] [<f8b88b31>] addrconf_ifdown+0x211/0x320 [ipv6]
[ 31.905529] [<f8b897ae>] addrconf_notify+0x6e/0x870 [ipv6]
[ 31.905586] [<c1344912>] ? _raw_write_unlock_bh+0x12/0x20
[ 31.905642] [<c1344912>] ? _raw_write_unlock_bh+0x12/0x20
[ 31.905701] [<f8b8f1f0>] ? fib6_clean_all+0x70/0x80 [ipv6]
[ 31.905770] [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
[ 31.905830] [<c104a106>] ? lock_timer_base+0x26/0x50
[ 31.905884] [<c104a279>] ? del_timer+0x69/0xb0
[ 31.905938] [<c134493d>] ? _raw_spin_unlock_bh+0xd/0x10
[ 31.905997] [<f8b8f267>] ? fib6_run_gc+0x67/0xe0 [ipv6]
[ 31.906052] [<c105b061>] notifier_call_chain+0x41/0x60
[ 31.906107] [<c105b19a>] raw_notifier_call_chain+0x1a/0x20
[ 31.906165] [<c129fe37>] call_netdevice_notifiers+0x27/0x60
[ 31.906221] [<c12ac0cd>] ? rtmsg_ifinfo+0xbd/0xf0
[ 31.906276] [<c12a183c>] __dev_notify_flags+0x5c/0x80
[ 31.906333] [<c12a1897>] dev_change_flags+0x37/0x60
[ 31.906390] [<c12f6291>] devinet_ioctl+0x591/0x6f0
[ 31.906445] [<c11726be>] ? copy_to_user+0x2e/0x40
[ 31.906500] [<c12f7212>] inet_ioctl+0xa2/0xd0
[ 31.906555] [<c128f65e>] sock_ioctl+0x4e/0x240
[ 31.906610] [<c10d3a44>] vfs_ioctl+0x34/0xa0
[ 31.906664] [<c10c7cab>] ? alloc_file+0x1b/0xa0
[ 31.906718] [<c128f610>] ? sock_ioctl+0x0/0x240
[ 31.906771] [<c10d4186>] do_vfs_ioctl+0x66/0x550
[ 31.906827] [<c1022ca0>] ? do_page_fault+0x0/0x350
[ 31.906881] [<c1022e41>] ? do_page_fault+0x1a1/0x350
[ 31.906936] [<c129098c>] ? sys_socket+0x5c/0x70
[ 31.906990] [<c1291860>] ? sys_socketcall+0x60/0x270
[ 31.907045] [<c10d46a9>] sys_ioctl+0x39/0x60
[ 31.907099] [<c1002bd0>] sysenter_do_call+0x12/0x26
[ 31.907153] ---[ end trace 5c4638450a77a22f ]---
[ 32.046479] BUG: scheduling while atomic: ifenslave/4586/0x00000100
[ 32.046540] Modules linked in: ipmi_si ipmi_msghandler hpilo bonding ipv6
[ 32.046784] Pid: 4586, comm: ifenslave Tainted: G W 2.6.35-rc1-01453-g3e12451-dirty #836
[ 32.046860] Call Trace:
[ 32.046910] [<c13421c4>] ? printk+0x18/0x1c
[ 32.046965] [<c10315c9>] __schedule_bug+0x59/0x60
[ 32.047019] [<c1342a2c>] schedule+0x57c/0x850
[ 32.047074] [<c104a106>] ? lock_timer_base+0x26/0x50
[ 32.047128] [<c1342f78>] schedule_timeout+0x118/0x250
[ 32.047183] [<c104a2c0>] ? process_timeout+0x0/0x10
[ 32.047238] [<c13430c5>] schedule_timeout_uninterruptible+0x15/0x20
[ 32.047295] [<c104a345>] msleep+0x15/0x20
[ 32.047350] [<c1227082>] bnx2_napi_disable+0x52/0x80
[ 32.047405] [<c122b56f>] bnx2_netif_stop+0x3f/0xa0
[ 32.047460] [<c122b62a>] bnx2_vlan_rx_register+0x5a/0x80
[ 32.047516] [<f8ced776>] bond_enslave+0x526/0xa90 [bonding]
[ 32.047576] [<f8b8f0d0>] ? fib6_clean_node+0x0/0xb0 [ipv6]
[ 32.047634] [<f8b8dda0>] ? fib6_age+0x0/0x90 [ipv6]
[ 32.047689] [<c129d2d3>] ? netdev_set_master+0x3/0xc0
[ 32.047746] [<f8cee4cb>] bond_do_ioctl+0x31b/0x430 [bonding]
[ 32.047804] [<c105b19a>] ? raw_notifier_call_chain+0x1a/0x20
[ 32.047861] [<c12abd5d>] ? __rtnl_unlock+0xd/0x10
[ 32.047915] [<c129f8cd>] ? __dev_get_by_name+0x7d/0xa0
[ 32.047970] [<c12a19b0>] dev_ifsioc+0xf0/0x290
[ 32.048025] [<f8cee1b0>] ? bond_do_ioctl+0x0/0x430 [bonding]
[ 32.048081] [<c12a1ce1>] dev_ioctl+0x191/0x610
[ 32.048136] [<c12eeb20>] ? udp_ioctl+0x0/0x70
[ 32.048189] [<c128f67c>] sock_ioctl+0x6c/0x240
[ 32.048243] [<c10d3a44>] vfs_ioctl+0x34/0xa0
[ 32.048297] [<c10c7cab>] ? alloc_file+0x1b/0xa0
[ 32.048351] [<c128f610>] ? sock_ioctl+0x0/0x240
[ 32.048404] [<c10d4186>] do_vfs_ioctl+0x66/0x550
[ 32.048459] [<c1022ca0>] ? do_page_fault+0x0/0x350
[ 32.048513] [<c1022e41>] ? do_page_fault+0x1a1/0x350
[ 32.048568] [<c129098c>] ? sys_socket+0x5c/0x70
[ 32.048622] [<c1291860>] ? sys_socketcall+0x60/0x270
[ 32.048677] [<c10d46a9>] sys_ioctl+0x39/0x60
[ 32.048730] [<c1002bd0>] sysenter_do_call+0x12/0x26
[ 32.052025] bonding: bond0: enslaving eth1 as a backup interface with a down link.
[ 32.100207] tg3 0000:14:04.0: PME# enabled
[ 32.100222] pci0000:00: wake-up capability enabled by ACPI
[ 32.224488] pci0000:00: wake-up capability disabled by ACPI
[ 32.224492] tg3 0000:14:04.0: PME# disabled
[ 32.348516] tg3 0000:14:04.0: BAR 0: set to [mem 0xfdff0000-0xfdffffff 64bit] (PCI address [0xfdff0000-0xfdffffff]
[ 32.348524] tg3 0000:14:04.0: BAR 2: set to [mem 0xfdfe0000-0xfdfeffff 64bit] (PCI address [0xfdfe0000-0xfdfeffff]
[ 32.363711] bonding: bond0: enslaving eth2 as a backup interface with a down link.
For bnx2, it seems commit 212f9934afccf9c9739921
was not sufficient to correct the "scheduling while atomic" bug...
enslaving a bnx2 on a bond device with one vlan already set :
bond_enslave -> bnx2_vlan_rx_register -> bnx2_netif_stop -> bnx2_napi_disable -> msleep()
For the first oops, following patch cures it, but I am not pleased
with it. This zero-vid registration seems wrong at the beginning.
Thanks
[RFC net-next-2.6] bonding: fix bond_inet6addr_event()
After commit ad1afb0039391 (vlan_dev: VLAN 0 should be treated
as "no vlan tag" (802.1p packet)),
bond_inet6addr_event() might be called with a NULL bond->vlgrp pointer, and
a non empty bond->vlan_list. vlan_group_get_device() is dereferencing a NULL pointer.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/drivers/net/bonding/bond_ipv6.c b/drivers/net/bonding/bond_ipv6.c
index 969ffed..121b073 100644
--- a/drivers/net/bonding/bond_ipv6.c
+++ b/drivers/net/bonding/bond_ipv6.c
@@ -178,6 +178,8 @@ static int bond_inet6addr_event(struct notifier_block *this,
}
list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
+ if (!bond->vlgrp)
+ continue;
vlan_dev = vlan_group_get_device(bond->vlgrp,
vlan->vlan_id);
if (vlan_dev == event_dev) {
^ permalink raw reply related
* [PATCH] smsc911x: Add spinlocks around registers access
From: Catalin Marinas @ 2010-07-19 12:42 UTC (permalink / raw)
To: netdev; +Cc: Steve Glendinning, stable
On SMP systems, the SMSC911x registers may be accessed by multiple CPUs
and this seems to put the chip in an inconsistent state. The patch adds
spinlocks to the smsc911x_reg_read, smsc911x_reg_write,
smsc911x_rx_readfifo and smsc911x_tx_writefifo functions.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Glendinning <steve.glendinning@smsc.com>
Cc: stable@kernel.org
---
I've had this patch in my tree for some time. It's the only way I can
get the ARM SMP systems using this controller to work reliable. I
haven't got an ack from Steve yet but the only alternative is to mark
this driver BROKEN_ON_SMP and revert the ARM boards to use the old
smc911x.c driver.
Thanks.
drivers/net/smsc911x.c | 92 +++++++++++++++++++++++++++---------------------
1 files changed, 52 insertions(+), 40 deletions(-)
diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c
index cc55974..7a7b01a 100644
--- a/drivers/net/smsc911x.c
+++ b/drivers/net/smsc911x.c
@@ -84,8 +84,7 @@ struct smsc911x_data {
*/
spinlock_t mac_lock;
- /* spinlock to ensure 16-bit accesses are serialised.
- * unused with a 32-bit bus */
+ /* spinlock to ensure register accesses are serialised */
spinlock_t dev_lock;
struct phy_device *phy_dev;
@@ -118,37 +117,33 @@ struct smsc911x_data {
unsigned int hashlo;
};
-/* The 16-bit access functions are significantly slower, due to the locking
- * necessary. If your bus hardware can be configured to do this for you
- * (in response to a single 32-bit operation from software), you should use
- * the 32-bit access functions instead. */
-
-static inline u32 smsc911x_reg_read(struct smsc911x_data *pdata, u32 reg)
+static inline u32 __smsc911x_reg_read(struct smsc911x_data *pdata, u32 reg)
{
if (pdata->config.flags & SMSC911X_USE_32BIT)
return readl(pdata->ioaddr + reg);
- if (pdata->config.flags & SMSC911X_USE_16BIT) {
- u32 data;
- unsigned long flags;
-
- /* these two 16-bit reads must be performed consecutively, so
- * must not be interrupted by our own ISR (which would start
- * another read operation) */
- spin_lock_irqsave(&pdata->dev_lock, flags);
- data = ((readw(pdata->ioaddr + reg) & 0xFFFF) |
+ if (pdata->config.flags & SMSC911X_USE_16BIT)
+ return ((readw(pdata->ioaddr + reg) & 0xFFFF) |
((readw(pdata->ioaddr + reg + 2) & 0xFFFF) << 16));
- spin_unlock_irqrestore(&pdata->dev_lock, flags);
-
- return data;
- }
BUG();
return 0;
}
-static inline void smsc911x_reg_write(struct smsc911x_data *pdata, u32 reg,
- u32 val)
+static inline u32 smsc911x_reg_read(struct smsc911x_data *pdata, u32 reg)
+{
+ u32 data;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pdata->dev_lock, flags);
+ data = __smsc911x_reg_read(pdata, reg);
+ spin_unlock_irqrestore(&pdata->dev_lock, flags);
+
+ return data;
+}
+
+static inline void __smsc911x_reg_write(struct smsc911x_data *pdata, u32 reg,
+ u32 val)
{
if (pdata->config.flags & SMSC911X_USE_32BIT) {
writel(val, pdata->ioaddr + reg);
@@ -156,44 +151,54 @@ static inline void smsc911x_reg_write(struct smsc911x_data *pdata, u32 reg,
}
if (pdata->config.flags & SMSC911X_USE_16BIT) {
- unsigned long flags;
-
- /* these two 16-bit writes must be performed consecutively, so
- * must not be interrupted by our own ISR (which would start
- * another read operation) */
- spin_lock_irqsave(&pdata->dev_lock, flags);
writew(val & 0xFFFF, pdata->ioaddr + reg);
writew((val >> 16) & 0xFFFF, pdata->ioaddr + reg + 2);
- spin_unlock_irqrestore(&pdata->dev_lock, flags);
return;
}
BUG();
}
+static inline void smsc911x_reg_write(struct smsc911x_data *pdata, u32 reg,
+ u32 val)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&pdata->dev_lock, flags);
+ __smsc911x_reg_write(pdata, reg, val);
+ spin_unlock_irqrestore(&pdata->dev_lock, flags);
+}
+
/* Writes a packet to the TX_DATA_FIFO */
static inline void
smsc911x_tx_writefifo(struct smsc911x_data *pdata, unsigned int *buf,
unsigned int wordcount)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&pdata->dev_lock, flags);
+
if (pdata->config.flags & SMSC911X_SWAP_FIFO) {
while (wordcount--)
- smsc911x_reg_write(pdata, TX_DATA_FIFO, swab32(*buf++));
- return;
+ __smsc911x_reg_write(pdata, TX_DATA_FIFO,
+ swab32(*buf++));
+ goto out;
}
if (pdata->config.flags & SMSC911X_USE_32BIT) {
writesl(pdata->ioaddr + TX_DATA_FIFO, buf, wordcount);
- return;
+ goto out;
}
if (pdata->config.flags & SMSC911X_USE_16BIT) {
while (wordcount--)
- smsc911x_reg_write(pdata, TX_DATA_FIFO, *buf++);
- return;
+ __smsc911x_reg_write(pdata, TX_DATA_FIFO, *buf++);
+ goto out;
}
BUG();
+out:
+ spin_unlock_irqrestore(&pdata->dev_lock, flags);
}
/* Reads a packet out of the RX_DATA_FIFO */
@@ -201,24 +206,31 @@ static inline void
smsc911x_rx_readfifo(struct smsc911x_data *pdata, unsigned int *buf,
unsigned int wordcount)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&pdata->dev_lock, flags);
+
if (pdata->config.flags & SMSC911X_SWAP_FIFO) {
while (wordcount--)
- *buf++ = swab32(smsc911x_reg_read(pdata, RX_DATA_FIFO));
- return;
+ *buf++ = swab32(__smsc911x_reg_read(pdata,
+ RX_DATA_FIFO));
+ goto out;
}
if (pdata->config.flags & SMSC911X_USE_32BIT) {
readsl(pdata->ioaddr + RX_DATA_FIFO, buf, wordcount);
- return;
+ goto out;
}
if (pdata->config.flags & SMSC911X_USE_16BIT) {
while (wordcount--)
- *buf++ = smsc911x_reg_read(pdata, RX_DATA_FIFO);
- return;
+ *buf++ = __smsc911x_reg_read(pdata, RX_DATA_FIFO);
+ goto out;
}
BUG();
+out:
+ spin_unlock_irqrestore(&pdata->dev_lock, flags);
}
/* waits for MAC not busy, with timeout. Only called by smsc911x_mac_read
^ permalink raw reply related
* Re: [0/8] netpoll/bridge fixes
From: Herbert Xu @ 2010-07-19 11:54 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Stephen Hemminger, Qianfeng Zhang, David S. Miller, netdev,
WANG Cong, Matt Mackall
In-Reply-To: <20100719105300.GA22179@gondor.apana.org.au>
On Mon, Jul 19, 2010 at 06:53:00PM +0800, Herbert Xu wrote:
>
> > Meanwhile, should we just disable netpoll for bridge in 2.6.35 and -stable?
> > We are getting crash reports in virtualization which I suspect are
> > related to this:
> > https://bugzilla.kernel.org/show_bug.cgi?id=16413
>
> I think that's probably the best solution, Dave?
I take that back :)
It turns out that 16413 has nothing to do with bridge netpoll
(which was not merged until after 2.6.34) since he's running
2.6.34.
Still, it might be a good idea to disable bridge netpoll in
2.6.35.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* [PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
From: Ilpo Järvinen @ 2010-07-19 11:16 UTC (permalink / raw)
To: Lennart Schulte, David Miller
Cc: Eric Dumazet, Tejun Heo, lkml, netdev@vger.kernel.org,
Fehrmann, Henning, Carsten Aulbert
In-Reply-To: <4C440771.7080107@nets.rwth-aachen.de>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1951 bytes --]
On Mon, 19 Jul 2010, Lennart Schulte wrote:
> I ran tests for about 2 hours with this patch and I got no output from the
> debug patch. This seems to have solved at least my problem :)
>
> Thanks!
> > [PATCH] tcp: fix crash in tcp_xmit_retransmit_queue
> >
> > It can happen that there are no packets in queue while calling
> > tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
> > NULL and that gets deref'ed to get sacked into a local var.
> >
> > There is no work to do if no packets are outstanding so we just
> > exit early.
> >
> > There may still be another bug affecting this same function.
Thanks for testing.
DaveM, I think this oops was introduced for 2.6.28 (in
08ebd1721ab8fd362e90ae17b461c07b23fa2824 it seems, to be exact) so to
stables it should go too please. I've only tweaked the message (so no need
for Lennart to retest v2 :-)).
--
[PATCHv2] tcp: fix crash in tcp_xmit_retransmit_queue
It can happen that there are no packets in queue while calling
tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
NULL and that gets deref'ed to get sacked into a local var.
There is no work to do if no packets are outstanding so we just
exit early.
This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
guard to make joining diff nicer).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Tested-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
---
net/ipv4/tcp_output.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..7ed9dc1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2208,6 +2208,9 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
int mib_idx;
int fwd_rexmitting = 0;
+ if (!tp->packets_out)
+ return;
+
if (!tp->lost_out)
tp->retransmit_high = tp->snd_una;
--
1.5.6.5
^ permalink raw reply related
* Re: [0/8] netpoll/bridge fixes
From: Herbert Xu @ 2010-07-19 10:53 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Stephen Hemminger, Qianfeng Zhang, David S. Miller, netdev,
WANG Cong, Matt Mackall
In-Reply-To: <20100719101904.GA31743@redhat.com>
On Mon, Jul 19, 2010 at 01:19:04PM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 11, 2010 at 12:11:42PM +1000, Herbert Xu wrote:
> > On Fri, Jun 11, 2010 at 08:48:39AM +1000, Herbert Xu wrote:
> > > On Thu, Jun 10, 2010 at 02:59:15PM -0700, Stephen Hemminger wrote:
> > > >
> > > > Okay, then add a comment where in_irq is used?
> > >
> > > Actually let me put it into a wrapper. I'll respin the patches.
> >
> > OK here is a repost. And this time it really is 8 patches :)
> > I've tested it lightly.
> >
> > Cheers,
>
> Meanwhile, should we just disable netpoll for bridge in 2.6.35 and -stable?
> We are getting crash reports in virtualization which I suspect are
> related to this:
> https://bugzilla.kernel.org/show_bug.cgi?id=16413
I think that's probably the best solution, Dave?
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [0/8] netpoll/bridge fixes
From: Michael S. Tsirkin @ 2010-07-19 10:19 UTC (permalink / raw)
To: Herbert Xu
Cc: Stephen Hemminger, Qianfeng Zhang, David S. Miller, netdev,
WANG Cong, Matt Mackall
In-Reply-To: <20100611021142.GA24490@gondor.apana.org.au>
On Fri, Jun 11, 2010 at 12:11:42PM +1000, Herbert Xu wrote:
> On Fri, Jun 11, 2010 at 08:48:39AM +1000, Herbert Xu wrote:
> > On Thu, Jun 10, 2010 at 02:59:15PM -0700, Stephen Hemminger wrote:
> > >
> > > Okay, then add a comment where in_irq is used?
> >
> > Actually let me put it into a wrapper. I'll respin the patches.
>
> OK here is a repost. And this time it really is 8 patches :)
> I've tested it lightly.
>
> Cheers,
Meanwhile, should we just disable netpoll for bridge in 2.6.35 and -stable?
We are getting crash reports in virtualization which I suspect are
related to this:
https://bugzilla.kernel.org/show_bug.cgi?id=16413
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] s2io: Remove unnecessary memset of netdev private data
From: Tobias Klauser @ 2010-07-19 8:55 UTC (permalink / raw)
To: netdev, David S. Miller; +Cc: kernel-janitors
The memory for the private data is allocated using kzalloc in
alloc_etherdev (or alloc_netdev_mq respectively) so there is no need to
set it to 0 again.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
---
drivers/net/s2io.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index d0af924..aa6cbb0 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -7886,7 +7886,6 @@ s2io_init_nic(struct pci_dev *pdev, const struct pci_device_id *pre)
/* Private member variable initialized to s2io NIC structure */
sp = netdev_priv(dev);
- memset(sp, 0, sizeof(struct s2io_nic));
sp->dev = dev;
sp->pdev = pdev;
sp->high_dma_flag = dma_flag;
--
1.7.0.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox