Netdev List


* Re: [PATCH 6/9] tproxy: added IPv6 socket lookup function to nf_tproxy_core
From: Jan Engelhardt @ 2010-10-21  8:42 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <20101020112118.6260.89471.stgit@este.odu>


On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
>+
>+	pr_debug("tproxy socket lookup: proto %u %pI6:%u -> %pI6:%u, lookup type: %d, sock %p\n",
>+		 protocol, saddr, ntohs(sport), daddr, ntohs(dport), lookup_type, sk);

Shorts should preferably be used with %hd/%hu.

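A minimal userspace sketch of the point above, assuming plain snprintf() rather than the kernel's pr_debug(). ntohs() returns a 16-bit value, and %hu states that width explicitly. The helper name format_port() is hypothetical and not part of the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a port that is held in network byte order.
 * ntohs() yields a uint16_t, so %hu matches the width of the argument. */
static int format_port(char *buf, size_t len, uint16_t port_be)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "port %hu", ntohs(port_be));
}
```

Because of the default argument promotions, %u prints the same digits here; %hu mainly documents the intended width to the reader and to format checkers.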

* Re: [PATCH 4/9] tproxy: added tproxy sockopt interface in the IPV6 layer
From: KOVACS Krisztian @ 2010-10-21  8:46 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <alpine.LNX.2.01.1010211037520.22922@obet.zrqbmnf.qr>

Hi,

On Thu, 2010-10-21 at 10:39 +0200, Jan Engelhardt wrote:
> On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
> 
> >@@ -268,6 +268,10 @@ struct in6_flowlabel_req {
> > /* RFC5082: Generalized Ttl Security Mechanism */
> > #define IPV6_MINHOPCOUNT		73
> > 
> >+#define IPV6_ORIGDSTADDR        74
> >+#define IPV6_RECVORIGDSTADDR    IPV6_ORIGDSTADDR
> >+#define IPV6_TRANSPARENT        75
> >+
> 
> Why do we actually need two names for the same thing?

IPV6_RECVORIGDSTADDR is the name of the socket option you're supposed to
set if you require the original destination address. IPV6_ORIGDSTADDR is
the name of the ancillary message you get with the actual address in it.
Just like we have it for IP_TOS/IP_RECVTOS, for example.

--KK

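The IP_TOS/IP_RECVTOS pairing mentioned above can be sketched in userspace C as follows. This is only an illustration of the naming split, using the long-standing IPv4 pair rather than the new IPv6 options. The helper names enable_recv_tos(), find_cmsg() and tos_from_msg() are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* The socket option is set under the RECV name ... */
static int enable_recv_tos(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on));
}

/* Walk the ancillary data of a received message and return the first
 * control message with the given level and type, or NULL if absent. */
static struct cmsghdr *find_cmsg(struct msghdr *msg, int level, int type)
{
	struct cmsghdr *c;

	assert(msg != NULL);
	for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c))
		if (c->cmsg_level == level && c->cmsg_type == type)
			return c;
	return NULL;
}

/* ... while the ancillary message that carries the value arrives under
 * the plain name (IP_TOS here). Returns the TOS byte, or -1 if absent. */
static int tos_from_msg(struct msghdr *msg)
{
	struct cmsghdr *c = find_cmsg(msg, IPPROTO_IP, IP_TOS);
	unsigned char tos;

	if (c == NULL)
		return -1;
	memcpy(&tos, CMSG_DATA(c), sizeof(tos));
	return tos;
}
```

A receiver would call enable_recv_tos() once on the socket and tos_from_msg() on each msghdr filled in by recvmsg(). By analogy, the patch sets IPV6_RECVORIGDSTADDR and expects an IPV6_ORIGDSTADDR message back.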




* Re: [PATCH 7/9] tproxy: added IPv6 support to the TPROXY target
From: Jan Engelhardt @ 2010-10-21  8:47 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <20101020112118.6260.51773.stgit@este.odu>


On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
> 
> /* TPROXY target is capable of marking the packet to perform
>  * redirection. We can get rid of that whenever we get support for
>  * mutliple targets in the same rule. */
>-struct xt_tproxy_target_info {
>+struct xt_tproxy_target_info_v0 {
> 	u_int32_t mark_mask;
> 	u_int32_t mark_value;
> 	__be32 laddr;
> 	__be16 lport;
> };

You cannot change the struct name either, or it may break userspace
compilations.



* Re: [PATCH 7/9] tproxy: added IPv6 support to the TPROXY target
From: KOVACS Krisztian @ 2010-10-21  8:50 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <alpine.LNX.2.01.1010211042190.22922@obet.zrqbmnf.qr>

Hi,

On Thu, 2010-10-21 at 10:47 +0200, Jan Engelhardt wrote:
> On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
> > 
> > /* TPROXY target is capable of marking the packet to perform
> >  * redirection. We can get rid of that whenever we get support for
> >  * mutliple targets in the same rule. */
> >-struct xt_tproxy_target_info {
> >+struct xt_tproxy_target_info_v0 {
> > 	u_int32_t mark_mask;
> > 	u_int32_t mark_value;
> > 	__be32 laddr;
> > 	__be16 lport;
> > };
> 
> You cannot change the struct name either, or it may break userspace
> compilations.

True, though iptables has its own copy of the header anyway.

--KK




* Re: [RFC PATCH 1/9] ipvs network name space aware
From: Eric Dumazet @ 2010-10-21  8:58 UTC (permalink / raw)
  To: paulmck
  Cc: Hans Schillstrom, Daniel Lezcano, lvs-devel@vger.kernel.org,
	netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	horms@verge.net.au, ja@ssi.bg, wensong@linux-vs.org
In-Reply-To: <20101020160205.GB2386@linux.vnet.ibm.com>


> You said that there were a lot of "stepi" commands to get through
> rcu_read_lock() on x86_64.  This is quite surprising, especially if you
> built with CONFIG_RCU_TREE.  Even if you built with CONFIG_PREEMPT_RCU_TREE,
> you should only see something like the following from rcu_read_lock():
> 
> 000000b7 <__rcu_read_lock>:
>       b7:	55                   	push   %ebp
>       b8:	64 a1 00 00 00 00    	mov    %fs:0x0,%eax
>       be:	ff 80 80 01 00 00    	incl   0x180(%eax)
>       c4:	89 e5                	mov    %esp,%ebp
>       c6:	5d                   	pop    %ebp
>       c7:	c3                   	ret    
> 
> Unless you have some sort of debugging options turned on.  Or unless
> six instructions counts for "quite many" stepi commands.  ;-)
> 

Paul, this should be inlined, don't you think?

Also, I don't understand why we use ACCESS_ONCE() in rcu_read_lock():

ACCESS_ONCE(current->rcu_read_lock_nesting)++;

Apparently, some compilers are a bit noisy here.

mov    0x1b0(%rdx),%eax
inc    %eax
mov    %eax,0x1b0(%rdx)

instead of :

incl   0x1b0(%rax)

So if the ACCESS_ONCE() is needed, we might add a comment, because it's
not obvious ;)





* [net-next-2.6 PATCH 1/3] ixgbe: update copyright info
From: Jeff Kirsher @ 2010-10-21  8:59 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Emil Tantilov, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

Update copyright notice

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_mbx.c   |    2 +-
 drivers/net/ixgbe/ixgbe_mbx.h   |    2 +-
 drivers/net/ixgbe/ixgbe_sriov.c |    2 +-
 drivers/net/ixgbe/ixgbe_sriov.h |    2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_mbx.c b/drivers/net/ixgbe/ixgbe_mbx.c
index 435e028..471f0f2 100644
--- a/drivers/net/ixgbe/ixgbe_mbx.c
+++ b/drivers/net/ixgbe/ixgbe_mbx.c
@@ -1,7 +1,7 @@
 /*******************************************************************************
 
   Intel 10 Gigabit PCI Express Linux driver
-  Copyright(c) 1999 - 2009 Intel Corporation.
+  Copyright(c) 1999 - 2010 Intel Corporation.
 
   This program is free software; you can redistribute it and/or modify it
   under the terms and conditions of the GNU General Public License,
diff --git a/drivers/net/ixgbe/ixgbe_mbx.h b/drivers/net/ixgbe/ixgbe_mbx.h
index c5ae4b4..7e0d08f 100644
--- a/drivers/net/ixgbe/ixgbe_mbx.h
+++ b/drivers/net/ixgbe/ixgbe_mbx.h
@@ -1,7 +1,7 @@
 /*******************************************************************************
 
   Intel 10 Gigabit PCI Express Linux driver
-  Copyright(c) 1999 - 2009 Intel Corporation.
+  Copyright(c) 1999 - 2010 Intel Corporation.
 
   This program is free software; you can redistribute it and/or modify it
   under the terms and conditions of the GNU General Public License,
diff --git a/drivers/net/ixgbe/ixgbe_sriov.c b/drivers/net/ixgbe/ixgbe_sriov.c
index a6b720a..5428153 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ixgbe/ixgbe_sriov.c
@@ -1,7 +1,7 @@
 /*******************************************************************************
 
   Intel 10 Gigabit PCI Express Linux driver
-  Copyright(c) 1999 - 2009 Intel Corporation.
+  Copyright(c) 1999 - 2010 Intel Corporation.
 
   This program is free software; you can redistribute it and/or modify it
   under the terms and conditions of the GNU General Public License,
diff --git a/drivers/net/ixgbe/ixgbe_sriov.h b/drivers/net/ixgbe/ixgbe_sriov.h
index 9a424bb..49dc14d 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.h
+++ b/drivers/net/ixgbe/ixgbe_sriov.h
@@ -1,7 +1,7 @@
 /*******************************************************************************
 
   Intel 10 Gigabit PCI Express Linux driver
-  Copyright(c) 1999 - 2009 Intel Corporation.
+  Copyright(c) 1999 - 2010 Intel Corporation.
 
   This program is free software; you can redistribute it and/or modify it
   under the terms and conditions of the GNU General Public License,



* [net-next-2.6 PATCH 2/3] ixgbe: fix stats handling
From: Jeff Kirsher @ 2010-10-21  9:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Eric Dumazet, Don Skidmore, Jeff Kirsher
In-Reply-To: <20101021085740.12059.20577.stgit@localhost.localdomain>

From: Eric Dumazet <eric.dumazet@gmail.com>

The current ixgbe stats have the following problems:

- Not 64-bit safe (on 32-bit arches)

- Not safe in ixgbe_clean_rx_irq():
   All CPUs dirty a common location (netdev->stats.rx_bytes &
netdev->stats.rx_packets) without proper synchronization.
   This slows down multiqueue operations a bit, and may miss some
updates.

Fixes:

Implement the ndo_get_stats64() method to provide accurate 64-bit rx|tx
bytes/packets counters, using 64-bit safe infrastructure.

ixgbe_get_ethtool_stats() also uses this infrastructure to provide 64-bit
safe counters.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe.h         |    3 ++-
 drivers/net/ixgbe/ixgbe_ethtool.c |   29 ++++++++++++++++-----------
 drivers/net/ixgbe/ixgbe_main.c    |   40 ++++++++++++++++++++++++++++++++++---
 3 files changed, 56 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index a8c47b0..944d9e2 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -180,8 +180,9 @@ struct ixgbe_ring {
 					 */
 
 	struct ixgbe_queue_stats stats;
-	unsigned long reinit_state;
+	struct u64_stats_sync syncp;
 	int numa_node;
+	unsigned long reinit_state;
 	u64 rsc_count;			/* stat for coalesced packets */
 	u64 rsc_flush;			/* stats for flushed packets */
 	u32 restart_queue;		/* track tx queue restarts */
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index d4ac943..3c7f15d 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -999,12 +999,11 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
                                     struct ethtool_stats *stats, u64 *data)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
-	u64 *queue_stat;
-	int stat_count = sizeof(struct ixgbe_queue_stats) / sizeof(u64);
 	struct rtnl_link_stats64 temp;
 	const struct rtnl_link_stats64 *net_stats;
-	int j, k;
-	int i;
+	unsigned int start;
+	struct ixgbe_ring *ring;
+	int i, j;
 	char *p = NULL;
 
 	ixgbe_update_stats(adapter);
@@ -1025,16 +1024,22 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
 		           sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
 	}
 	for (j = 0; j < adapter->num_tx_queues; j++) {
-		queue_stat = (u64 *)&adapter->tx_ring[j]->stats;
-		for (k = 0; k < stat_count; k++)
-			data[i + k] = queue_stat[k];
-		i += k;
+		ring = adapter->tx_ring[j];
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			data[i]   = ring->stats.packets;
+			data[i+1] = ring->stats.bytes;
+		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		i += 2;
 	}
 	for (j = 0; j < adapter->num_rx_queues; j++) {
-		queue_stat = (u64 *)&adapter->rx_ring[j]->stats;
-		for (k = 0; k < stat_count; k++)
-			data[i + k] = queue_stat[k];
-		i += k;
+		ring = adapter->rx_ring[j];
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			data[i]   = ring->stats.packets;
+			data[i+1] = ring->stats.bytes;
+		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		i += 2;
 	}
 	if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
 		for (j = 0; j < MAX_TX_PACKET_BUFFERS; j++) {
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 790a0da..d066ba3 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -824,8 +824,10 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 
 	tx_ring->total_bytes += total_bytes;
 	tx_ring->total_packets += total_packets;
+	u64_stats_update_begin(&tx_ring->syncp);
 	tx_ring->stats.packets += total_packets;
 	tx_ring->stats.bytes += total_bytes;
+	u64_stats_update_end(&tx_ring->syncp);
 	return count < tx_ring->work_limit;
 }
 
@@ -1172,7 +1174,6 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 			       int *work_done, int work_to_do)
 {
 	struct ixgbe_adapter *adapter = q_vector->adapter;
-	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
 	struct ixgbe_rx_buffer *rx_buffer_info, *next_buffer;
@@ -1298,8 +1299,10 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 					rx_ring->rsc_count++;
 				rx_ring->rsc_flush++;
 			}
+			u64_stats_update_begin(&rx_ring->syncp);
 			rx_ring->stats.packets++;
 			rx_ring->stats.bytes += skb->len;
+			u64_stats_update_end(&rx_ring->syncp);
 		} else {
 			if (rx_ring->flags & IXGBE_RING_RX_PS_ENABLED) {
 				rx_buffer_info->skb = next_buffer->skb;
@@ -1375,8 +1378,6 @@ next_desc:
 
 	rx_ring->total_packets += total_rx_packets;
 	rx_ring->total_bytes += total_rx_bytes;
-	netdev->stats.rx_bytes += total_rx_bytes;
-	netdev->stats.rx_packets += total_rx_packets;
 
 	return cleaned;
 }
@@ -6558,6 +6559,38 @@ static void ixgbe_netpoll(struct net_device *netdev)
 }
 #endif
 
+static struct rtnl_link_stats64 *ixgbe_get_stats64(struct net_device *netdev,
+						   struct rtnl_link_stats64 *stats)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	int i;
+
+	/* accurate rx/tx bytes/packets stats */
+	dev_txq_stats_fold(netdev, stats);
+	for (i = 0; i < adapter->num_rx_queues; i++) {
+		struct ixgbe_ring *ring = adapter->rx_ring[i];
+		u64 bytes, packets;
+		unsigned int start;
+
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			packets = ring->stats.packets;
+			bytes   = ring->stats.bytes;
+		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		stats->rx_packets += packets;
+		stats->rx_bytes   += bytes;
+	}
+
+	/* following stats updated by ixgbe_watchdog_task() */
+	stats->multicast	= netdev->stats.multicast;
+	stats->rx_errors	= netdev->stats.rx_errors;
+	stats->rx_length_errors	= netdev->stats.rx_length_errors;
+	stats->rx_crc_errors	= netdev->stats.rx_crc_errors;
+	stats->rx_missed_errors	= netdev->stats.rx_missed_errors;
+	return stats;
+}
+
+
 static const struct net_device_ops ixgbe_netdev_ops = {
 	.ndo_open		= ixgbe_open,
 	.ndo_stop		= ixgbe_close,
@@ -6577,6 +6610,7 @@ static const struct net_device_ops ixgbe_netdev_ops = {
 	.ndo_set_vf_vlan	= ixgbe_ndo_set_vf_vlan,
 	.ndo_set_vf_tx_rate	= ixgbe_ndo_set_vf_bw,
 	.ndo_get_vf_config	= ixgbe_ndo_get_vf_config,
+	.ndo_get_stats64	= ixgbe_get_stats64,
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= ixgbe_netpoll,
 #endif

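The begin/retry pattern used in the patch above can be sketched in userspace with a sequence counter. This is not the kernel's u64_stats_sync implementation, only an illustration of the idea with C11 atomics; all names are hypothetical. The counters are relaxed atomics here purely to keep the sketch well-defined under the C11 memory model, whereas the kernel code uses plain u64 fields.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct ring_stats {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

static void stats_init(struct ring_stats *s)
{
	atomic_init(&s->seq, 0);
	atomic_init(&s->packets, 0);
	atomic_init(&s->bytes, 0);
}

/* Writer side. Assumes a single writer per ring. */
static void stats_add(struct ring_stats *s, uint64_t pkts, uint64_t len)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* no nested or concurrent writer */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->packets,
		atomic_load_explicit(&s->packets, memory_order_relaxed) + pkts,
		memory_order_relaxed);
	atomic_store_explicit(&s->bytes,
		atomic_load_explicit(&s->bytes, memory_order_relaxed) + len,
		memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both counters come from the same update. */
static void stats_read(struct ring_stats *s, uint64_t *pkts, uint64_t *len)
{
	unsigned int start;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		*pkts = atomic_load_explicit(&s->packets, memory_order_relaxed);
		*len = atomic_load_explicit(&s->bytes, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1) ||
		 atomic_load_explicit(&s->seq, memory_order_relaxed) != start);
}
```

The design choice matches the commit message: writers never block, each ring has its own counters, and readers pay for consistency only when they collide with an update.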


* [net-next-2.6 PATCH 3/3] ixgbe: add a refcnt when turning on/off FCoE offload capability
From: Jeff Kirsher @ 2010-10-21  9:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Yi Zou, Jeff Kirsher
In-Reply-To: <20101021085740.12059.20577.stgit@localhost.localdomain>

From: Yi Zou <yi.zou@intel.com>

The FCoE offload is enabled/disabled per adapter, but upper FCoE protocol
stack could have multiple FCoE instances created on the same physical network
interface, e.g., FCoE on multiple VLAN interfaces on the same physical
network interface. In this case we want to turn on FCoE offload at the first
request from ndo_fcoe_enable() but only turn off FCoE offload at the very last
call to ndo_fcoe_disable(). This is fixed by adding a refcnt in the per adapter
FCoE structure and tear down FCoE offload when refcnt decrements to zero.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_fcoe.c |    6 ++++++
 drivers/net/ixgbe/ixgbe_fcoe.h |    1 +
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_fcoe.c b/drivers/net/ixgbe/ixgbe_fcoe.c
index 2f1de8b..05efa6a 100644
--- a/drivers/net/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ixgbe/ixgbe_fcoe.c
@@ -604,11 +604,13 @@ int ixgbe_fcoe_enable(struct net_device *netdev)
 {
 	int rc = -EINVAL;
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	struct ixgbe_fcoe *fcoe = &adapter->fcoe;
 
 
 	if (!(adapter->flags & IXGBE_FLAG_FCOE_CAPABLE))
 		goto out_enable;
 
+	atomic_inc(&fcoe->refcnt);
 	if (adapter->flags & IXGBE_FLAG_FCOE_ENABLED)
 		goto out_enable;
 
@@ -648,6 +650,7 @@ int ixgbe_fcoe_disable(struct net_device *netdev)
 {
 	int rc = -EINVAL;
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	struct ixgbe_fcoe *fcoe = &adapter->fcoe;
 
 	if (!(adapter->flags & IXGBE_FLAG_FCOE_CAPABLE))
 		goto out_disable;
@@ -655,6 +658,9 @@ int ixgbe_fcoe_disable(struct net_device *netdev)
 	if (!(adapter->flags & IXGBE_FLAG_FCOE_ENABLED))
 		goto out_disable;
 
+	if (!atomic_dec_and_test(&fcoe->refcnt))
+		goto out_disable;
+
 	e_info(drv, "Disabling FCoE offload features.\n");
 	netdev->features &= ~NETIF_F_FCOE_CRC;
 	netdev->features &= ~NETIF_F_FSO;
diff --git a/drivers/net/ixgbe/ixgbe_fcoe.h b/drivers/net/ixgbe/ixgbe_fcoe.h
index abf4b2b..4bc2c55 100644
--- a/drivers/net/ixgbe/ixgbe_fcoe.h
+++ b/drivers/net/ixgbe/ixgbe_fcoe.h
@@ -66,6 +66,7 @@ struct ixgbe_fcoe {
 	u8 tc;
 	u8 up;
 #endif
+	atomic_t refcnt;
 	spinlock_t lock;
 	struct pci_pool *pool;
 	struct ixgbe_fcoe_ddp ddp[IXGBE_FCOE_DDP_MAX];

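The first-enable/last-disable behaviour described in the commit message can be sketched in isolation as follows. This is a simplified userspace model, not the driver code: the struct and function names are hypothetical, it omits the capability and enabled-flag checks of the patch, and it assumes callers are serialized with respect to the enabled state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-adapter state. */
struct offload {
	atomic_int refcnt;
	bool enabled;
};

/* The first user turns the offload on; later users only take a reference. */
static void offload_get(struct offload *o)
{
	if (atomic_fetch_add(&o->refcnt, 1) == 0)
		o->enabled = true;
}

/* Only the last user turns the offload off. The comparison with 1 is the
 * userspace equivalent of atomic_dec_and_test() returning true. */
static void offload_put(struct offload *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* unbalanced put */
	if (old == 1)
		o->enabled = false;
}
```

With two FCoE instances on VLANs of the same port, the first offload_put() leaves the offload active and the second one tears it down.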


* Re: [PATCH 9/9] tproxy: use the interface primary IP address as a default value for --on-ip
From: Jan Engelhardt @ 2010-10-21  9:12 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <20101020112118.6260.54362.stgit@este.odu>


On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
>+
>+	if (!ipv6_addr_any(user_laddr))
>+		return user_laddr;
>+	laddr = NULL;
>+
>+	rcu_read_lock();
>+	indev = __in6_dev_get(skb->dev);
>+	if (indev)
>+		list_for_each_entry(ifa, &indev->addr_list, if_list) {
>+			/* FIXME: address selection */

Per our real-world discussion, I believe we should add checks for
some conditions (RFC 4862 section 2):

1. ignore tentative addresses

	if (ifa->ifa_flags & IFA_F_TENTATIVE)
		continue;

2. skip deprecated addresses (ifa->preferred_lft == 0) for new connections:

	if ((ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED) &&
	    (ifa->ifa_flags & IFA_F_DEPRECATED))
		continue;

3. skip invalid addresses
(there might be a flag for this, as there is for tentative ones)

	if (ifa->valid_lft == 0)
		continue;



* Re: [PATCH 7/9] tproxy: added IPv6 support to the TPROXY target
From: Jan Engelhardt @ 2010-10-21  9:14 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <1287651014.13326.2.camel@este.odu>

On Thursday 2010-10-21 10:50, KOVACS Krisztian wrote:

>Hi,
>
>On Thu, 2010-10-21 at 10:47 +0200, Jan Engelhardt wrote:
>> On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
>> > 
>> > /* TPROXY target is capable of marking the packet to perform
>> >  * redirection. We can get rid of that whenever we get support for
>> >  * mutliple targets in the same rule. */
>> >-struct xt_tproxy_target_info {
>> >+struct xt_tproxy_target_info_v0 {
>> > 	u_int32_t mark_mask;
>> > 	u_int32_t mark_value;
>> > 	__be32 laddr;
>> > 	__be16 lport;
>> > };
>> 
>> You cannot change the struct name either, or it may break userspace
>> compilations.
>
>True, though iptables has its own copy of the header anyway.

There are - or so I always hear - other userspace programs.

As for iptables, we only do the copy so that it compiles independent of 
the kernel version. You have to assume that the headers can be updated 
at any time.


* Re: [PATCH 7/9] tproxy: added IPv6 support to the TPROXY target
From: KOVACS Krisztian @ 2010-10-21  9:33 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <alpine.LNX.2.01.1010211113020.22922@obet.zrqbmnf.qr>

Hi,

On Thu, 2010-10-21 at 11:14 +0200, Jan Engelhardt wrote:
> On Thursday 2010-10-21 10:50, KOVACS Krisztian wrote:
> 
> >Hi,
> >
> >On Thu, 2010-10-21 at 10:47 +0200, Jan Engelhardt wrote:
> >> On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
> >> > 
> >> > /* TPROXY target is capable of marking the packet to perform
> >> >  * redirection. We can get rid of that whenever we get support for
> >> >  * mutliple targets in the same rule. */
> >> >-struct xt_tproxy_target_info {
> >> >+struct xt_tproxy_target_info_v0 {
> >> > 	u_int32_t mark_mask;
> >> > 	u_int32_t mark_value;
> >> > 	__be32 laddr;
> >> > 	__be16 lport;
> >> > };
> >> 
> >> You cannot change the struct name either, or it may break userspace
> >> compilations.
> >
> >True, though iptables has its own copy of the header anyway.
> 
> There are - or so I always hear - other userspace programs.
> 
> As for iptables, we only do the copy so that it compiles independent of 
> the kernel version. You have to assume that the headers can be updated 
> at any time.

Sure, I wasn't implying we shouldn't fix this in the patch, I just doubt
there's anything else other than iptables using this and iptables itself
isn't affected.

Anyway, I've fixed it. Thanks, Jan.

--KK




* Re: [PATCH 6/9] tproxy: added IPv6 socket lookup function to nf_tproxy_core
From: KOVACS Krisztian @ 2010-10-21  9:48 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <alpine.LNX.2.01.1010211040250.22922@obet.zrqbmnf.qr>

Hi,

On Thu, 2010-10-21 at 10:42 +0200, Jan Engelhardt wrote:
> On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
> >+
> >+	pr_debug("tproxy socket lookup: proto %u %pI6:%u -> %pI6:%u, lookup type: %d, sock %p\n",
> >+		 protocol, saddr, ntohs(sport), daddr, ntohs(dport), lookup_type, sk);
> 
> Shorts should preferably be used with %hd/%hu.

Fixed, thanks Jan.

--KK




* Re: [PATCH net-2.6] net/sched: fix missing spinlock init
From: David Miller @ 2010-10-21 10:10 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1287206554.2799.32.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 16 Oct 2010 07:22:34 +0200

> Under network load, doing :
> 
> tc qdisc del dev eth0 root
> 
> triggers :
 ...
> commit 79640a4ca695 (add additional lock to qdisc to increase
> throughput) forgot to initialize  noop_qdisc and noqueue_qdisc busylock 
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---

Applied.


* Re: [PATCH net-next] fib: introduce fib_alias_accessed() helper
From: David Miller @ 2010-10-21 10:11 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1287648218.6871.18.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 21 Oct 2010 10:03:38 +0200

> Perf tools session at NFWS 2010 pointed out a false sharing on struct
> fib_alias that can be avoided pretty easily, if we set FA_S_ACCESSED bit
> only if needed (ie : not already set)
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.


* Re: [PATCH] Drivers: atm: Makefile: replace the use of <module>-objs with <module>-y
From: David Miller @ 2010-10-21 10:11 UTC (permalink / raw)
  To: tdent48227; +Cc: chas, linux-kernel, netdev, linux-atm-general
In-Reply-To: <1287201209-2101-1-git-send-email-tdent48227@gmail.com>

From: Tracey Dent <tdent48227@gmail.com>
Date: Fri, 15 Oct 2010 23:53:29 -0400

> Changed <module>-objs to <module>-y in Makefile.
> 
> Signed-off-by: Tracey Dent <tdent48227@gmail.com>

Applied.


* Re: [PATCH net-next] netxen: make local function static.
From: David Miller @ 2010-10-21 10:11 UTC (permalink / raw)
  To: shemminger; +Cc: amit.salecha, narender.kumar, rajesh.borundia, netdev
In-Reply-To: <20101018204010.1dc01291@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Mon, 18 Oct 2010 20:40:10 -0700

> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.


* Re: [PATCH net-next] socket: localize functions
From: David Miller @ 2010-10-21 10:12 UTC (permalink / raw)
  To: shemminger; +Cc: netdev
In-Reply-To: <20101018172729.68755c69@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Mon, 18 Oct 2010 17:27:29 -0700

> A couple of functions in socket.c are only used there and
> should be localized.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.


* Re: [PATCH] bridge: make br_parse_ip_options static
From: David Miller @ 2010-10-21 10:12 UTC (permalink / raw)
  To: bandan.das; +Cc: shemminger, netdev
In-Reply-To: <20101019112234.GB12005@stratus.com>

From: Bandan Das <bandan.das@stratus.com>
Date: Tue, 19 Oct 2010 07:22:34 -0400

> On  0, Stephen Hemminger <shemminger@vyatta.com> wrote:
>> 
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>> 
>> --- a/net/bridge/br_netfilter.c	2010-10-18 17:01:36.903364885 -0700
>> +++ b/net/bridge/br_netfilter.c	2010-10-18 17:01:48.106569141 -0700
>> @@ -213,7 +213,7 @@ static inline void nf_bridge_update_prot
>>   * expected format
>>   */
>>  
>> -int br_parse_ip_options(struct sk_buff *skb)
>> +static int br_parse_ip_options(struct sk_buff *skb)
>>  {
>>  	struct ip_options *opt;
>>  	struct iphdr *iph;
>> 
> 
> My main motivation behind not making this static was that
> there would be possibly other places in the bridge code 
> (besides br_netfilter.c) where we enter the IP stack and might 
> want to call this. Not sure if it's indeed the case though..

You can un-static it when the use is added.

Patch applied, thanks Stephen.


* [PATCH] ipv4: synchronize bind() with RTM_NEWADDR notifications
From: Timo Teräs @ 2010-10-21 10:12 UTC (permalink / raw)
  To: netdev; +Cc: Timo Teräs

Otherwise there is a race condition visible to userland:
 1. process A changes IP address
 2. kernel sends RTM_NEWADDR
 3. process B gets notification
 4. process B tries to bind() to new IP but that fails with
EADDRNOTAVAIL because FIB is not yet updated and inet_addr_type() in
inet_bind() does not recognize the IP as local
 5. kernel calls inetaddr_chain notifiers which updates FIB

IPv6 side seems to handle the notifications properly: bind()
immediately after RTM_NEWADDR succeeds as expected. This is because
ipv6_chk_addr() uses inet6_addr_lst which is updated before address
notification.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
---
 net/ipv4/af_inet.c  |    9 +++++++++
 net/ipv6/af_inet6.c |    4 +++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 6a1100c..21200e4 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -466,6 +466,15 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 	if (addr_len < sizeof(struct sockaddr_in))
 		goto out;
 
+	/* Acquire rtnl_lock to synchronize with possible simultaneous
+	 * IP-address changes. This is needed because when RTM_NEWADDR
+	 * is sent the new IP is not yet in FIB, but alas inet_addr_type
+	 * checks the address type using FIB. Acquiring rtnl lock once
+	 * makes sure that any address for which RTM_NEWADDR was sent
+	 * earlier exists also in FIB. */
+	rtnl_lock();
+	rtnl_unlock();
+
 	chk_addr_ret = inet_addr_type(sock_net(sk), addr->sin_addr.s_addr);
 
 	/* Not specified by any standard per-se, however it breaks too
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 56b9bf2..6fc37f4 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -300,7 +300,9 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 			goto out;
 		}
 
-		/* Reproduce AF_INET checks to make the bindings consitant */
+		/* Reproduce AF_INET checks to make the bindings consistent */
+		rtnl_lock();
+		rtnl_unlock();
 		v4addr = addr->sin6_addr.s6_addr32[3];
 		chk_addr_ret = inet_addr_type(net, v4addr);
 		if (!sysctl_ip_nonlocal_bind &&
-- 
1.7.1

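The empty lock/unlock pair in the patch acts as a barrier: it cannot return until whoever held the lock at that moment has finished. Below is a userspace sketch of that idiom with pthreads. It assumes, as the comment in the patch implies, that the notification and the table update happen inside one critical section. All names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for RTNL */
static atomic_bool announced;	/* models "RTM_NEWADDR was sent" */
static bool table_updated;	/* models "FIB is updated"; written under big_lock */

/* Side A: announce first, update the table later, all under the lock. */
static void *change_address(void *arg)
{
	int rc = pthread_mutex_lock(&big_lock);

	(void)arg;
	assert(rc == 0);
	(void)rc;
	atomic_store(&announced, true);
	/* window in which a listener may already react to the notification */
	table_updated = true;
	pthread_mutex_unlock(&big_lock);
	return NULL;
}

/* Side B: after seeing the notification, pass through the lock once.
 * Acquiring it is only possible after side A has released it, so the
 * table update is guaranteed to be visible afterwards. */
static bool table_ready_after_notification(void)
{
	while (!atomic_load(&announced))
		;	/* wait for the notification */
	pthread_mutex_lock(&big_lock);
	pthread_mutex_unlock(&big_lock);
	return table_updated;
}
```

The cost of this design is that every caller on side B serializes on the global lock, even when no address change is in flight.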


* Re: [PATCH net-next-2.6 1/5] jme: Fix PHY power-off error
From: David Miller @ 2010-10-21 10:12 UTC (permalink / raw)
  To: cooldavid; +Cc: netdev, stable
In-Reply-To: <1287447044-24471-1-git-send-email-cooldavid@cooldavid.org>


All 5 patches applied, thanks.


* Re: [PATCH net-next] bnx2: Increase max rx ring size from 1K to 2K
From: David Miller @ 2010-10-21 10:13 UTC (permalink / raw)
  To: mchan; +Cc: andy, jfeeney, netdev
In-Reply-To: <1287448254-14173-1-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Mon, 18 Oct 2010 17:30:54 -0700

> A number of customers are reporting packet loss under certain workloads
> (e.g. heavy bursts of small packets) with flow control disabled.  A larger
> rx ring helps to prevent these losses.
> 
> No change in default rx ring size and memory consumption.
> 
> Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
> Acked-by: John Feeney <jfeeney@redhat.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Ok, since the new limit is not the default, applied.

Thanks for the explanation Michael.


* Re: [PATCH net-next] sfc: make functions static
From: David Miller @ 2010-10-21 10:13 UTC (permalink / raw)
  To: bhutchings; +Cc: shemminger, shodgson, linux-net-drivers, netdev
In-Reply-To: <1287421518.2252.219.camel@achroite.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 18 Oct 2010 18:05:18 +0100

> On Mon, 2010-10-18 at 08:27 -0700, Stephen Hemminger wrote:
>> Make local functions and variable static. Do some rearrangement
>> of the string table stuff to put it where it gets used.
>> 
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> Acked-by: Ben Hutchings <bhutchings@solarflare.com>
> 
> We might have to change some of these back in future, but I suppose
> there is no harm in making them static now.

Applied.


* Re: [PATCH net-next-2.6] mlx4: make functions local and remove dead code.
From: David Miller @ 2010-10-21 10:13 UTC (permalink / raw)
  To: yevgenyp; +Cc: shemminger, netdev, eli
In-Reply-To: <E113D394D7C5DB4F8FF691FA7EE9DB443CC3716F65@MTLMAIL.mtl.com>

From: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Date: Tue, 19 Oct 2010 09:37:44 +0200

>> 
>> There is a whole section of code in this driver related to vlan tables
>> which is not accessed from any kernel code.
>> 
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>
> 
> Thanks for sending this, 
> There are patches under review at the moment (RDMA over Ethernet) that use this code:
> http://www.spinics.net/lists/linux-rdma/msg05512.html, which will be broken if the VLAN tables code is removed.

ok.


* Re: [PATCH net-next] bonding: make bond_resend_igmp_join_requests_delayed static
From: David Miller @ 2010-10-21 10:14 UTC (permalink / raw)
  To: shemminger; +Cc: fubar, netdev, bonding-devel
In-Reply-To: <20101015140256.4192fd34@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 14:02:56 -0700

> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.


* Re: [PATCH net-next] vmxnet3: make bit twiddle routines inline
From: David Miller @ 2010-10-21 10:14 UTC (permalink / raw)
  To: shemminger; +Cc: sbhatewara, pv-drivers, netdev
In-Reply-To: <20101015140620.63d0a615@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 14:06:20 -0700

> Gcc doesn't usually handle inline across compilation units, and the
> functions don't have to be global in scope. Move the set/reset flag
> functions into the existing vmxnet3 header.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

