Netdev List
* Re: [net] e1000e: remove use of IP payload checksum
From: David Miller @ 2012-07-01  7:26 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: bruce.w.allan, netdev, gospo, sassmann, stable
In-Reply-To: <1341122562-17382-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Sat, 30 Jun 2012 23:02:42 -0700

> From: Bruce Allan <bruce.w.allan@intel.com>
> 
> Currently only used when packet split mode is enabled with jumbo frames,
> IP payload checksum (for fragmented UDP packets) is mutually exclusive with
> receive hashing offload since the hardware uses the same space in the
> receive descriptor for the hardware-provided packet checksum and the RSS
> hash, respectively.  Users currently must disable jumbos when receive
> hashing offload is enabled, or vice versa, because of this incompatibility.
> Since testing has shown that IP payload checksum does not provide any real
> benefit, just remove it so that there is no longer a choice between jumbos
> or receive hashing offload but not both as done in other Intel GbE drivers
> (e.g. e1000, igb).
> 
> Also, add a missing check for IP checksum error reported by the hardware;
> let the stack verify the checksum when this happens.
> 
> CC: stable <stable@vger.kernel.org> [3.4]
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied, thanks Jeff.


* [PATCH v4 2/2] Update bonding driver documentation to include IPv6 transmit hashing algorithm.
From: John Eaglesham @ 2012-07-01  7:01 UTC (permalink / raw)
  To: netdev; +Cc: John Eaglesham
In-Reply-To: <cover.1341125875.git.linux@8192.net>

---
 Documentation/networking/bonding.txt | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index bfea8a3..5db14fe 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -752,12 +752,22 @@ xmit_hash_policy
 		protocol information to generate the hash.
 
 		Uses XOR of hardware MAC addresses and IP addresses to
-		generate the hash.  The formula is
+		generate the hash.  The IPv4 formula is
 
 		(((source IP XOR dest IP) AND 0xffff) XOR
 			( source MAC XOR destination MAC ))
 				modulo slave count
 
+		The IPv6 formula is
+
+		iphash =
+			(source ip quad 2 XOR dest IP quad 2) XOR
+			(source ip quad 3 XOR dest IP quad 3) XOR
+			(source ip quad 4 XOR dest IP quad 4)
+
+		((iphash >> 16) XOR (iphash >> 8) XOR iphash)
+			modulo slave count
+
 		This algorithm will place all traffic to a particular
 		network peer on the same slave.  For non-IP traffic,
 		the formula is the same as for the layer2 transmit
@@ -778,19 +788,30 @@ xmit_hash_policy
 		slaves, although a single connection will not span
 		multiple slaves.
 
-		The formula for unfragmented TCP and UDP packets is
+		The formula for unfragmented IPv4 TCP and UDP packets is
 
 		((source port XOR dest port) XOR
 			 ((source IP XOR dest IP) AND 0xffff)
 				modulo slave count
 
-		For fragmented TCP or UDP packets and all other IP
-		protocol traffic, the source and destination port
+		The formula for unfragmented IPv6 TCP and UDP packets is
+
+		iphash =
+			(source ip quad 2 XOR dest IP quad 2) XOR
+			(source ip quad 3 XOR dest IP quad 3) XOR
+			(source ip quad 4 XOR dest IP quad 4)
+
+		((source port XOR dest port) XOR
+			(iphash >> 16) XOR (iphash >> 8) XOR iphash)
+				modulo slave count
+
+		For fragmented TCP or UDP packets and all other IPv4 and
+		IPv6 protocol traffic, the source and destination port
 		information is omitted.  For non-IP traffic, the
 		formula is the same as for the layer2 transmit hash
 		policy.
 
-		This policy is intended to mimic the behavior of
+		The IPv4 policy is intended to mimic the behavior of
 		certain switches, notably Cisco switches with PFC2 as
 		well as some Foundry and IBM products.
 
-- 
1.7.11


* [PATCH v4 1/2] Add support for IPv6 and bounds checking to transmit hashing functions.
From: John Eaglesham @ 2012-07-01  7:01 UTC (permalink / raw)
  To: netdev; +Cc: John Eaglesham
In-Reply-To: <cover.1341125875.git.linux@8192.net>

---
 drivers/net/bonding/bond_main.c | 91 +++++++++++++++++++++++++++++------------
 1 file changed, 64 insertions(+), 27 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f5a40b9..b138d84 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3345,56 +3345,93 @@ static struct notifier_block bond_netdev_notifier = {
 /*---------------------------- Hashing Policies -----------------------------*/
 
 /*
+ * Hash for the output device based upon layer 2 data
+ */
+static int bond_xmit_hash_policy_l2(struct sk_buff *skb, int count)
+{
+	struct ethhdr *data = (struct ethhdr *)skb->data;
+
+	if (skb_headlen(skb) >= offsetof(struct ethhdr, h_proto))
+		return (data->h_dest[5] ^ data->h_source[5]) % count;
+
+	return 0;
+}
+
+/*
  * Hash for the output device based upon layer 2 and layer 3 data. If
- * the packet is not IP mimic bond_xmit_hash_policy_l2()
+ * the packet is not IP, fall back on bond_xmit_hash_policy_l2()
  */
 static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
 {
 	struct ethhdr *data = (struct ethhdr *)skb->data;
-	struct iphdr *iph = ip_hdr(skb);
+	struct iphdr *iph;
+	struct ipv6hdr *ipv6h;
+	u32 v6hash;
 
-	if (skb->protocol == htons(ETH_P_IP)) {
+	if (skb->protocol == htons(ETH_P_IP) &&
+		skb_network_header_len(skb) >= sizeof(struct iphdr)) {
+		iph = ip_hdr(skb);
 		return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
 			(data->h_dest[5] ^ data->h_source[5])) % count;
-	}
-
-	return (data->h_dest[5] ^ data->h_source[5]) % count;
+	} else if (skb->protocol == htons(ETH_P_IPV6) &&
+		skb_network_header_len(skb) >= sizeof(struct ipv6hdr)) {
+		ipv6h = ipv6_hdr(skb);
+		v6hash =
+			(ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
+			(ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
+			(ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3]);
+		v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;
+		return (v6hash ^ data->h_dest[5] ^ data->h_source[5]) % count;
+	}
+
+	return bond_xmit_hash_policy_l2(skb, count);
 }
 
 /*
  * Hash for the output device based upon layer 3 and layer 4 data. If
  * the packet is a frag or not TCP or UDP, just use layer 3 data.  If it is
- * altogether not IP, mimic bond_xmit_hash_policy_l2()
+ * altogether not IP, fall back on bond_xmit_hash_policy_l2()
  */
 static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
 {
-	struct ethhdr *data = (struct ethhdr *)skb->data;
-	struct iphdr *iph = ip_hdr(skb);
-	__be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
-	int layer4_xor = 0;
+	u32 layer4_xor = 0;
+	struct iphdr *iph;
+	struct ipv6hdr *ipv6h;
 
 	if (skb->protocol == htons(ETH_P_IP)) {
+		iph = ip_hdr(skb);
 		if (!ip_is_fragment(iph) &&
-		    (iph->protocol == IPPROTO_TCP ||
-		     iph->protocol == IPPROTO_UDP)) {
+			(iph->protocol == IPPROTO_TCP ||
+			iph->protocol == IPPROTO_UDP)) {
+			__be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
+			if (iph->ihl * sizeof(u32) + sizeof(__be16) * 2 >
+				skb_headlen(skb) - skb_network_offset(skb))
+				goto short_header;
 			layer4_xor = ntohs((*layer4hdr ^ *(layer4hdr + 1)));
+		} else if (skb_network_header_len(skb) < sizeof(struct iphdr)) {
+			goto short_header;
 		}
-		return (layer4_xor ^
-			((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
-
+		return (layer4_xor ^ ((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		ipv6h = ipv6_hdr(skb);
+		if (ipv6h->nexthdr == IPPROTO_TCP || ipv6h->nexthdr == IPPROTO_UDP) {
+			__be16 *layer4hdrv6 = (__be16 *)((u8 *)ipv6h + sizeof(struct ipv6hdr));
+			if (sizeof(struct ipv6hdr) + sizeof(__be16) * 2 >
+				skb_headlen(skb) - skb_network_offset(skb))
+				goto short_header;
+			layer4_xor = (*layer4hdrv6 ^ *(layer4hdrv6 + 1));
+		} else if (skb_network_header_len(skb) < sizeof(struct ipv6hdr)) {
+			goto short_header;
+		}
+		layer4_xor ^=
+			(ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
+			(ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
+			(ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3]);
+		return ((layer4_xor >> 16) ^ (layer4_xor >> 8) ^ layer4_xor) % count;
 	}
 
-	return (data->h_dest[5] ^ data->h_source[5]) % count;
-}
-
-/*
- * Hash for the output device based upon layer 2 data
- */
-static int bond_xmit_hash_policy_l2(struct sk_buff *skb, int count)
-{
-	struct ethhdr *data = (struct ethhdr *)skb->data;
-
-	return (data->h_dest[5] ^ data->h_source[5]) % count;
+short_header:
+	return bond_xmit_hash_policy_l2(skb, count);
 }
 
 /*-------------------------- Device entry points ----------------------------*/
-- 
1.7.11


* [PATCH v4 0/2] bonding support for IPv6 transmit hashing
From: John Eaglesham @ 2012-07-01  7:01 UTC (permalink / raw)
  To: netdev; +Cc: John Eaglesham

Currently the "bonding" driver does not support load balancing outgoing
traffic in LACP mode for IPv6 traffic. IPv4 (and TCP or UDP over IPv4)
are currently supported; this patch adds transmit hashing for IPv6 (and
TCP or UDP over IPv6), bringing IPv6 up to par with IPv4 support in the
bonding driver.

The algorithm (XORing the bottom three quads, then XORing the bottom three
bytes of the result) was chosen after testing almost 400,000
unique IPv6 addresses harvested from server logs. This algorithm had the
most even distribution for both big- and little-endian architectures while
still using few instructions.
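
For readers who want to experiment with the folding step outside the
kernel, here is a minimal userspace sketch. It assumes a hypothetical
struct v6addr with four host-order 32-bit quads in place of struct
in6_addr, and it leaves out the MAC and port terms that the real policies
mix in. It is an illustration of the formula, not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct in6_addr: four 32-bit quads. */
struct v6addr {
	uint32_t quad[4];
};

/* XOR quads 2..4 (indices 1..3) of both addresses, then fold the
 * upper bytes of the result down onto the low byte. */
static uint32_t v6_fold(const struct v6addr *src, const struct v6addr *dst)
{
	uint32_t h = (src->quad[1] ^ dst->quad[1]) ^
		     (src->quad[2] ^ dst->quad[2]) ^
		     (src->quad[3] ^ dst->quad[3]);

	return (h >> 16) ^ (h >> 8) ^ h;
}

/* Map the folded hash onto a slave index. */
static unsigned int v6_pick_slave(const struct v6addr *src,
				  const struct v6addr *dst,
				  unsigned int count)
{
	assert(count != 0);	/* modulo by zero is undefined */
	return v6_fold(src, dst) % count;
}
```

With source 2001:db8::1 and destination 2001:db8::102 the quads XOR to
0x103, which folds to 0x102, so a four-slave bond would pick index 2.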

The IPv6 flow label was intentionally not included in the hash as it appears
to be unset in the vast majority of IPv6 traffic sampled, and the current
algorithm not using the flow label already offers a very even distribution.

Fragmented IPv6 packets are handled the same way as fragmented IPv4 packets,
i.e., they are not balanced based on layer 4 information. Additionally,
IPv6 packets with intermediate headers are not balanced based on layer
4 information. In practice these intermediate headers are not common and
this should not cause any problems, and the alternative (a packet-parsing
loop and look-up table) seemed slow and complicated for little gain.
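
The "no intermediate headers" rule amounts to looking only at the Next
Header byte of the fixed IPv6 header. A rough userspace sketch follows,
assuming a raw byte buffer rather than an skb; the function name and the
PROTO_* constants are made up for the example. A fragmented packet
carries a Fragment header (44) in that byte, so it skips the port step
without any extra test. Unlike the patch, which falls back to the layer 2
hash on a short header, this sketch simply returns 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_TCP	6	/* IANA protocol numbers */
#define PROTO_UDP	17
#define IPV6_HDR_LEN	40	/* size of the fixed IPv6 header */

/*
 * Return source port XOR destination port, or 0 when layer 4 data
 * should not contribute.  pkt points at the IPv6 header and len is the
 * number of contiguous bytes available from there.
 */
static uint16_t v6_port_xor(const uint8_t *pkt, size_t len)
{
	const uint8_t *l4;
	uint8_t nexthdr;

	assert(pkt != NULL);
	if (len < IPV6_HDR_LEN)
		return 0;		/* header itself is truncated */
	nexthdr = pkt[6];		/* Next Header field */
	if (nexthdr != PROTO_TCP && nexthdr != PROTO_UDP)
		return 0;		/* extension header or other protocol */
	if (len < IPV6_HDR_LEN + 4)
		return 0;		/* ports are not in the buffer */
	l4 = pkt + IPV6_HDR_LEN;
	/* ports are big-endian on the wire: bytes 0-1 source, 2-3 dest */
	return (uint16_t)(((l4[0] ^ l4[2]) << 8) | (l4[1] ^ l4[3]));
}
```

The length checks come before every read, which is the same idea as the
bounds checking added in patch 1/2.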

This is an update to a prior patch I submitted. This version includes
a clarified description, thorough bounds checking, updates functions to
call bond_xmit_hash_policy_l2 rather than re-implement the same logic,
incorporates Jay's style suggestions, and patches against net-next. Patch
has been tested and performs as expected.

John Eaglesham (2):
  Add support for IPv6 and bounds checking to transmit hashing
    functions.
  Update bonding driver documentation to include IPv6 transmit hashing
    algorithm.

 Documentation/networking/bonding.txt | 31 ++++++++++--
 drivers/net/bonding/bond_main.c      | 91 +++++++++++++++++++++++++-----------
 2 files changed, 90 insertions(+), 32 deletions(-)

-- 
1.7.11


* [net] e1000e: remove use of IP payload checksum
From: Jeff Kirsher @ 2012-07-01  6:02 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, stable, Jeff Kirsher

From: Bruce Allan <bruce.w.allan@intel.com>

Currently only used when packet split mode is enabled with jumbo frames,
IP payload checksum (for fragmented UDP packets) is mutually exclusive with
receive hashing offload since the hardware uses the same space in the
receive descriptor for the hardware-provided packet checksum and the RSS
hash, respectively.  Users currently must disable jumbos when receive
hashing offload is enabled, or vice versa, because of this incompatibility.
Since testing has shown that IP payload checksum does not provide any real
benefit, just remove it so that there is no longer a choice between jumbos
or receive hashing offload but not both as done in other Intel GbE drivers
(e.g. e1000, igb).

Also, add a missing check for IP checksum error reported by the hardware;
let the stack verify the checksum when this happens.

CC: stable <stable@vger.kernel.org> [3.4]
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/defines.h |    1 +
 drivers/net/ethernet/intel/e1000e/netdev.c  |   75 +++++----------------------
 2 files changed, 15 insertions(+), 61 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 351a409..76edbc1 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -103,6 +103,7 @@
 #define E1000_RXD_ERR_SEQ       0x04    /* Sequence Error */
 #define E1000_RXD_ERR_CXE       0x10    /* Carrier Extension Error */
 #define E1000_RXD_ERR_TCPE      0x20    /* TCP/UDP Checksum Error */
+#define E1000_RXD_ERR_IPE       0x40    /* IP Checksum Error */
 #define E1000_RXD_ERR_RXE       0x80    /* Rx Data Error */
 #define E1000_RXD_SPC_VLAN_MASK 0x0FFF  /* VLAN ID is in lower 12 bits */
 
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 31d37a2..623e30b 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -496,7 +496,7 @@ static void e1000_receive_skb(struct e1000_adapter *adapter,
  * @sk_buff: socket buffer with received data
  **/
 static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
-			      __le16 csum, struct sk_buff *skb)
+			      struct sk_buff *skb)
 {
 	u16 status = (u16)status_err;
 	u8 errors = (u8)(status_err >> 24);
@@ -511,8 +511,8 @@ static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
 	if (status & E1000_RXD_STAT_IXSM)
 		return;
 
-	/* TCP/UDP checksum error bit is set */
-	if (errors & E1000_RXD_ERR_TCPE) {
+	/* TCP/UDP checksum error bit or IP checksum error bit is set */
+	if (errors & (E1000_RXD_ERR_TCPE | E1000_RXD_ERR_IPE)) {
 		/* let the stack verify checksum errors */
 		adapter->hw_csum_err++;
 		return;
@@ -523,19 +523,7 @@ static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
 		return;
 
 	/* It must be a TCP or UDP packet with a valid checksum */
-	if (status & E1000_RXD_STAT_TCPCS) {
-		/* TCP checksum is good */
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
-	} else {
-		/*
-		 * IP fragment with UDP payload
-		 * Hardware complements the payload checksum, so we undo it
-		 * and then put the value in host order for further stack use.
-		 */
-		__sum16 sum = (__force __sum16)swab16((__force u16)csum);
-		skb->csum = csum_unfold(~sum);
-		skb->ip_summed = CHECKSUM_COMPLETE;
-	}
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
 	adapter->hw_csum_good++;
 }
 
@@ -954,8 +942,7 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, int *work_done,
 		skb_put(skb, length);
 
 		/* Receive Checksum Offload */
-		e1000_rx_checksum(adapter, staterr,
-				  rx_desc->wb.lower.hi_dword.csum_ip.csum, skb);
+		e1000_rx_checksum(adapter, staterr, skb);
 
 		e1000_rx_hash(netdev, rx_desc->wb.lower.hi_dword.rss, skb);
 
@@ -1341,8 +1328,7 @@ copydone:
 		total_rx_bytes += skb->len;
 		total_rx_packets++;
 
-		e1000_rx_checksum(adapter, staterr,
-				  rx_desc->wb.lower.hi_dword.csum_ip.csum, skb);
+		e1000_rx_checksum(adapter, staterr, skb);
 
 		e1000_rx_hash(netdev, rx_desc->wb.lower.hi_dword.rss, skb);
 
@@ -1512,9 +1498,8 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_ring *rx_ring, int *work_done,
 			}
 		}
 
-		/* Receive Checksum Offload XXX recompute due to CRC strip? */
-		e1000_rx_checksum(adapter, staterr,
-				  rx_desc->wb.lower.hi_dword.csum_ip.csum, skb);
+		/* Receive Checksum Offload */
+		e1000_rx_checksum(adapter, staterr, skb);
 
 		e1000_rx_hash(netdev, rx_desc->wb.lower.hi_dword.rss, skb);
 
@@ -3098,19 +3083,10 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 
 	/* Enable Receive Checksum Offload for TCP and UDP */
 	rxcsum = er32(RXCSUM);
-	if (adapter->netdev->features & NETIF_F_RXCSUM) {
+	if (adapter->netdev->features & NETIF_F_RXCSUM)
 		rxcsum |= E1000_RXCSUM_TUOFL;
-
-		/*
-		 * IPv4 payload checksum for UDP fragments must be
-		 * used in conjunction with packet-split.
-		 */
-		if (adapter->rx_ps_pages)
-			rxcsum |= E1000_RXCSUM_IPPCSE;
-	} else {
+	else
 		rxcsum &= ~E1000_RXCSUM_TUOFL;
-		/* no need to clear IPPCSE as it defaults to 0 */
-	}
 	ew32(RXCSUM, rxcsum);
 
 	if (adapter->hw.mac.type == e1000_pch2lan) {
@@ -5241,22 +5217,10 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
 
 	/* Jumbo frame support */
-	if (max_frame > ETH_FRAME_LEN + ETH_FCS_LEN) {
-		if (!(adapter->flags & FLAG_HAS_JUMBO_FRAMES)) {
-			e_err("Jumbo Frames not supported.\n");
-			return -EINVAL;
-		}
-
-		/*
-		 * IP payload checksum (enabled with jumbos/packet-split when
-		 * Rx checksum is enabled) and generation of RSS hash is
-		 * mutually exclusive in the hardware.
-		 */
-		if ((netdev->features & NETIF_F_RXCSUM) &&
-		    (netdev->features & NETIF_F_RXHASH)) {
-			e_err("Jumbo frames cannot be enabled when both receive checksum offload and receive hashing are enabled.  Disable one of the receive offload features before enabling jumbos.\n");
-			return -EINVAL;
-		}
+	if ((max_frame > ETH_FRAME_LEN + ETH_FCS_LEN) &&
+	    !(adapter->flags & FLAG_HAS_JUMBO_FRAMES)) {
+		e_err("Jumbo Frames not supported.\n");
+		return -EINVAL;
 	}
 
 	/* Supported frame sizes */
@@ -6030,17 +5994,6 @@ static int e1000_set_features(struct net_device *netdev,
 			 NETIF_F_RXALL)))
 		return 0;
 
-	/*
-	 * IP payload checksum (enabled with jumbos/packet-split when Rx
-	 * checksum is enabled) and generation of RSS hash is mutually
-	 * exclusive in the hardware.
-	 */
-	if (adapter->rx_ps_pages &&
-	    (features & NETIF_F_RXCSUM) && (features & NETIF_F_RXHASH)) {
-		e_err("Enabling both receive checksum offload and receive hashing is not possible with jumbo frames.  Disable jumbos or enable only one of the receive offload features.\n");
-		return -EINVAL;
-	}
-
 	if (changed & NETIF_F_RXFCS) {
 		if (features & NETIF_F_RXFCS) {
 			adapter->flags2 &= ~FLAG2_CRC_STRIPPING;
-- 
1.7.10.4
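
For reference, the decision e1000_rx_checksum() makes once this patch is
applied can be approximated in userspace as below. The two error bits are
the values from the defines.h hunk. The three status bit values and the
"checksum not calculated" test are assumptions taken from memory of the
driver, since the patch context does not show them. The enum and function
names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Error bits, values as in the defines.h hunk of the patch. */
#define RXD_ERR_TCPE	0x20	/* TCP/UDP checksum error */
#define RXD_ERR_IPE	0x40	/* IP checksum error */

/* Status bits: ASSUMED values, not shown in the patch. */
#define RXD_STAT_IXSM	0x04	/* ignore checksum indication */
#define RXD_STAT_UDPCS	0x10	/* UDP checksum calculated */
#define RXD_STAT_TCPCS	0x20	/* TCP checksum calculated */

enum csum_verdict { CSUM_NONE, CSUM_UNNECESSARY };

/* Status lives in the low 16 bits, errors in the top 8 bits. */
static enum csum_verdict rx_csum_verdict(uint32_t status_err)
{
	uint16_t status = (uint16_t)status_err;
	uint8_t errors = (uint8_t)(status_err >> 24);

	if (status & RXD_STAT_IXSM)
		return CSUM_NONE;
	if (errors & (RXD_ERR_TCPE | RXD_ERR_IPE))
		return CSUM_NONE;	/* let the stack verify */
	if (!(status & (RXD_STAT_TCPCS | RXD_STAT_UDPCS)))
		return CSUM_NONE;	/* hardware did not check it */
	return CSUM_UNNECESSARY;
}

/* Two spot checks of the rule above. */
void rx_csum_examples(void)
{
	assert(rx_csum_verdict(0x00000020) == CSUM_UNNECESSARY);
	assert(rx_csum_verdict(0x40000020) == CSUM_NONE);
}
```

With the CHECKSUM_COMPLETE branch gone there are only two outcomes, so
the descriptor's csum field is never read and that space is free for the
RSS hash.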


* Re: [PATCH v6] sctp: be more restrictive in transport selection on bundled sacks
From: David Miller @ 2012-07-01  5:44 UTC (permalink / raw)
  To: vyasevich; +Cc: nhorman, netdev, linux-sctp
In-Reply-To: <a1fb36e6-783a-4a89-9771-a7010c2da4fb@email.android.com>

From: Vlad Yasevich <vyasevich@gmail.com>
Date: Sat, 30 Jun 2012 23:17:52 -0400

> David Miller <davem@davemloft.net> wrote:
> 
>>Once this has Vlad's ACK I'll apply it.
> 
> Acked-by: Vlad Yasevich <vyasevich@gmail.com>

Applied, thanks everyone.


* Re: [PATCH] ipv4: Elide fib_validate_source() completely when possible.
From: David Miller @ 2012-07-01  5:39 UTC (permalink / raw)
  To: ja; +Cc: netdev
In-Reply-To: <alpine.LFD.2.00.1206301300530.1593@ja.ssi.bg>

From: Julian Anastasov <ja@ssi.bg>
Date: Sat, 30 Jun 2012 13:45:52 +0300 (EEST)

> 	If we really want a change in behavior we should
> at least update the accept_local info in
> Documentation/networking/ip-sysctl.txt ?

Thanks for pointing this out, that's what I will do.

====================
ipv4: Clarify in docs that accept_local requires rp_filter.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ip-sysctl.txt |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 99d0e05..47b6c79 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -857,9 +857,14 @@ accept_source_route - BOOLEAN
 		FALSE (host)
 
 accept_local - BOOLEAN
-	Accept packets with local source addresses. In combination with
-	suitable routing, this can be used to direct packets between two
-	local interfaces over the wire and have them accepted properly.
+	Accept packets with local source addresses. In combination
+	with suitable routing, this can be used to direct packets
+	between two local interfaces over the wire and have them
+	accepted properly.
+
+	rp_filter must be set to a non-zero value in order for
+	accept_local to have an effect.
+
 	default FALSE
 
 route_localnet - BOOLEAN
-- 
1.7.10.4


* Re: [net-next] e1000e: remove use of IP payload checksum
From: Jeff Kirsher @ 2012-07-01  5:32 UTC (permalink / raw)
  To: David Miller; +Cc: ben, bruce.w.allan, netdev, gospo, sassmann
In-Reply-To: <20120630.173752.1993136000245136259.davem@davemloft.net>


On Sat, 2012-06-30 at 17:37 -0700, David Miller wrote:
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Sat, 30 Jun 2012 22:36:36 +0100
> 
> > On Sat, 2012-06-30 at 03:35 -0700, Jeff Kirsher wrote:
> >> From: Bruce Allan <bruce.w.allan@intel.com>
> >> 
> >> Currently only used when packet split mode is enabled with jumbo frames,
> >> IP payload checksum (for fragmented UDP packets) is mutually exclusive with
> >> receive hashing offload since the hardware uses the same space in the
> >> receive descriptor for the hardware-provided packet checksum and the RSS
> >> hash, respectively.  Users currently must disable jumbos when receive
> >> hashing offload is enabled, or vice versa, because of this incompatibility.
> >> Since testing has shown that IP payload checksum does not provide any real
> >> benefit, just remove it so that there is no longer a choice between jumbos
> >> or receive hashing offload but not both as done in other Intel GbE drivers
> >> (e.g. e1000, igb).
> >> 
> >> Also, add a missing check for IP checksum error reported by the hardware;
> >> let the stack verify the checksum when this happens.
> > [...]
> > 
> > The change to enable RX hashing in 3.4, with this odd restriction, seems
> > to have broken most existing systems using jumbo MTU on e1000e.  None of
> > the distro scripts or network management daemons will automatically
> > change offload configuration before MTU; how could they know?
> > 
> > Therefore this needs to be fixed in 3.5 and 3.4.y, not net-next.
> 
> Agreed.

Ok, I will prepare it for net and stable 3.4.  I know it will require a
backported patch for stable 3.4.y since the current patch only applies
to net & net-next.

Bruce wanted it applied to net & stable, but I was not sure based on the
patch content and description, so that is why I submitted it for
net-next.



* Re: [PATCH V3 2/2] bonding support for IPv6 transmit hashing
From: Hannes Frederic Sowa @ 2012-07-01  3:57 UTC (permalink / raw)
  To: John; +Cc: netdev
In-Reply-To: <4FEF55A7.6070502@8192.net>

On Sat, Jun 30, 2012 at 9:38 PM, John <linux@8192.net> wrote:
> On 6/30/2012 4:59 AM, Hannes Frederic Sowa wrote:
>> On Sat, Jun 30, 2012 at 8:17 AM, John <linux@8192.net> wrote:
>>>
>>> diff --git a/Documentation/networking/bonding.txt
>>> b/Documentation/networking/bonding.txt
>>> index bfea8a3..5db14fe 100644
>>> --- a/Documentation/networking/bonding.txt
>>> +++ b/Documentation/networking/bonding.txt
>>> @@ -752,12 +752,22 @@ xmit_hash_policy
>>>                  protocol information to generate the hash.
>>>
>>>                  Uses XOR of hardware MAC addresses and IP addresses to
>>> -               generate the hash.  The formula is
>>> +               generate the hash.  The IPv4 formula is
>>>
>>>                  (((source IP XOR dest IP) AND 0xffff) XOR
>>>                          ( source MAC XOR destination MAC ))
>>>                                  modulo slave count
>>>
>>> +               The IPv6 formula is
>>> +
>>> +               iphash =
>>> +                       (source ip quad 2 XOR dest IP quad 2) XOR
>>> +                       (source ip quad 3 XOR dest IP quad 3) XOR
>>> +                       (source ip quad 4 XOR dest IP quad 4)
>>> +
>>> +               ((iphash >> 16) XOR (iphash >> 8) XOR iphash)
>>> +                       modulo slave count
>>> +
>>
>>
>> Wouldn't it be beneficial to include the ipv6 flow label in the hash
>> calculation?
>
> Hannes,
>
> In all of the traffic I inspected I don't believe I saw a single flow label
> set. Even if it were set 100% of the time by Linux, any packets routed or
> bridged from another operating system wouldn't see any benefit. The current
> algorithm distributes the traffic very well; I don't believe adding the flow
> label would be beneficial even if it were set more frequently.
>
> If you feel strongly about its inclusion, though, I am willing to
> reconsider.

It would definitely help to load balance tunnelled traffic over a
bonded interface. But as I currently don't use such a setup, I don't
have a strong opinion on that.

Greetings,

  Hannes


* Re: [PATCH v6] sctp: be more restrictive in transport selection on bundled sacks
From: Vlad Yasevich @ 2012-07-01  3:17 UTC (permalink / raw)
  To: David Miller, nhorman; +Cc: netdev, linux-sctp
In-Reply-To: <20120630.173945.173993639982489712.davem@davemloft.net>

David Miller <davem@davemloft.net> wrote:

>From: Neil Horman <nhorman@tuxdriver.com>
>Date: Sat, 30 Jun 2012 09:04:26 -0400
>
>> It was noticed recently that when we send data on a transport, it's possible that
>> we might bundle a sack that arrived on a different transport.  While this isn't
>> a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
>> 2960:
>> 
>>  An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
>>    etc.) to the same destination transport address from which it
>>    received the DATA or control chunk to which it is replying.  This
>>    rule should also be followed if the endpoint is bundling DATA chunks
>>    together with the reply chunk.
>> 
>> This patch seeks to correct that.  It restricts the bundling of sack operations
>> to only those transports which have moved the ctsn of the association forward
>> since the last sack.  By doing this we guarantee that we only bundle outbound
>> sacks on a transport that has received a chunk since the last sack.  This brings
>> us into stricter compliance with the RFC.
>> 
>> Vlad had initially suggested that we strictly allow only sack bundling on the
>> transport that last moved the ctsn forward.  While this makes sense, I was
>> concerned that doing so prevented us from bundling in the case where we had
>> received chunks that moved the ctsn on multiple transports.  In those cases, the
>> RFC allows us to select any of the transports having received chunks to bundle
>> the sack on.  So I've modified the approach to allow for that, by adding a state
>> variable to each transport that tracks whether it has moved the ctsn since the
>> last sack.  This, I think, keeps our behavior (and performance) close enough to
>> our current profile that we can do this without a sysctl knob to
>> enable/disable it.
>> 
>> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>> CC: Vlad Yaseivch <vyasevich@gmail.com>
>> CC: David S. Miller <davem@davemloft.net>
>> CC: linux-sctp@vger.kernel.org
>> Reported-by: Michele Baldessari <michele@redhat.com>
>> Reported-by: sorin serban <sserban@redhat.com>
>
>Once this has Vlad's ACK I'll apply it.
>

Acked-by: Vlad Yasevich <vyasevich@gmail.com>

Sorry for the delay.

-vlad

>There has to be a better way to handle this situation, wherein the
>responsible party has ACK'd the patch but I just ask for a few coding
>style fixups and whatnot.  As it stands now I have to twiddle my
>thumbs waiting for the new ACK.


-- 
Sent from my Android phone with SkitMail. Please excuse my brevity.


* Re: [PATCH net-next 06/15] netfilter: Add NFPROTO_BUS hook constant for AF_BUS socket family
From: Jan Engelhardt @ 2012-07-01  2:15 UTC (permalink / raw)
  To: Vincent Sanders
  Cc: netdev, linux-kernel, David S. Miller, Javier Martinez Canillas
In-Reply-To: <1340988354-26981-7-git-send-email-vincent.sanders@collabora.co.uk>

On Friday 2012-06-29 18:45, Vincent Sanders wrote:

>AF_BUS sockets add a netfilter NF_HOOK() on the packet sending path.
>This allows packets to be mangled by registered netfilter hooks.

If you do touch netfilter, consider adding that mailing list as well.

>diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
>index c613cf0..0698924 100644
>--- a/include/linux/netfilter.h
>+++ b/include/linux/netfilter.h
>@@ -67,6 +67,7 @@ enum {
> 	NFPROTO_BRIDGE =  7,
> 	NFPROTO_IPV6   = 10,
> 	NFPROTO_DECNET = 12,
>+	NFPROTO_BUS,
> 	NFPROTO_NUMPROTO,
> };

Make use of the holes that were left.
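
To illustrate what filling a hole would look like: the sketch below
repeats the enum with NFPROTO_BUS given an explicit value between BRIDGE
and IPV6. The value 8 is purely an example picked for this sketch, not a
proposal from the thread, and the UNSPEC/IPV4/ARP values are quoted from
memory of include/linux/netfilter.h rather than from the hunk above.

```c
#include <assert.h>

enum {
	NFPROTO_UNSPEC =  0,	/* assumed, not in the quoted hunk */
	NFPROTO_IPV4   =  2,	/* assumed, not in the quoted hunk */
	NFPROTO_ARP    =  3,	/* assumed, not in the quoted hunk */
	NFPROTO_BRIDGE =  7,
	NFPROTO_BUS    =  8,	/* hypothetical: any unused slot would do */
	NFPROTO_IPV6   = 10,
	NFPROTO_DECNET = 12,
	NFPROTO_NUMPROTO,
};

/* Filling a hole leaves NFPROTO_NUMPROTO where it was. */
static_assert(NFPROTO_NUMPROTO == 13, "NFPROTO_NUMPROTO must not grow");
```

Appending NFPROTO_BUS after DECNET, as in the posted patch, would push
NFPROTO_NUMPROTO to 14 and presumably enlarge anything sized by it.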


* Re: [net] igbvf: fix divide by zero
From: David Miller @ 2012-07-01  0:41 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: mitch.a.williams, netdev, gospo, sassmann, stable, daahern
In-Reply-To: <1341051799-8824-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Sat, 30 Jun 2012 03:23:19 -0700

> From: Mitch A Williams <mitch.a.williams@intel.com>
> 
> Using ethtool -C ethX rx-usecs 0 crashes with a divide by zero.
> Refactor this function to fix this issue and make it more clear
> what the intent of each conditional is. Add comment regarding
> using a setting of zero.
> 
> CC: stable <stable@vger.kernel.org> [3.3+]
> CC: David Ahern <daahern@cisco.com>
> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied, thanks.


* Re: [PATCH v6] sctp: be more restrictive in transport selection on bundled sacks
From: David Miller @ 2012-07-01  0:39 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, vyasevich, linux-sctp
In-Reply-To: <1341061466-4186-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Sat, 30 Jun 2012 09:04:26 -0400

> It was noticed recently that when we send data on a transport, it's possible that
> we might bundle a sack that arrived on a different transport.  While this isn't
> a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
> 2960:
> 
>  An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
>    etc.) to the same destination transport address from which it
>    received the DATA or control chunk to which it is replying.  This
>    rule should also be followed if the endpoint is bundling DATA chunks
>    together with the reply chunk.
> 
> This patch seeks to correct that.  It restricts the bundling of sack operations
> to only those transports which have moved the ctsn of the association forward
> since the last sack.  By doing this we guarantee that we only bundle outbound
> sacks on a transport that has received a chunk since the last sack.  This brings
> us into stricter compliance with the RFC.
> 
> Vlad had initially suggested that we strictly allow only sack bundling on the
> transport that last moved the ctsn forward.  While this makes sense, I was
> concerned that doing so prevented us from bundling in the case where we had
> received chunks that moved the ctsn on multiple transports.  In those cases, the
> RFC allows us to select any of the transports having received chunks to bundle
> the sack on.  So I've modified the approach to allow for that, by adding a state
> variable to each transport that tracks whether it has moved the ctsn since the
> last sack.  This, I think, keeps our behavior (and performance) close enough to
> our current profile that we can do this without a sysctl knob to
> enable/disable it.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Vlad Yasevich <vyasevich@gmail.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: linux-sctp@vger.kernel.org
> Reported-by: Michele Baldessari <michele@redhat.com>
> Reported-by: sorin serban <sserban@redhat.com>

Once this has Vlad's ACK I'll apply it.

There has to be a better way to handle this situation, wherein the
responsible party has ACK'd the patch but I just ask for a few coding
style fixups and whatnot.  As it stands now I have to twiddle my
thumbs waiting for the new ACK.

^ permalink raw reply

* Re: [PATCH v5] sctp: be more restrictive in transport selection on bundled sacks
From: David Miller @ 2012-07-01  0:38 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, vyasevich, linux-sctp
In-Reply-To: <20120630122647.GA22647@neilslaptop.think-freely.org>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Sat, 30 Jun 2012 08:26:47 -0400

> This is wrong.  It's a counter that increments every time we call sctp_make_sack,
> so that we can create a unique generation identifier for use in tagging which
> transports move ctsn in a given generation.  It saves us from having to iterate
> over a list every time we send a sack. 

Sorry, I missed the counter bump.
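
For readers following the thread, the generation idea Neil describes can be
modelled in a few lines of userspace C.  This is only a sketch under stated
assumptions: the struct and function names below are hypothetical and are not
taken from the patch, and counter wraparound is deliberately left unhandled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; the real kernel structures are far larger. */
struct assoc_model {
	uint32_t sack_generation;	/* bumped every time a SACK is built */
};

struct transport_model {
	uint32_t sack_generation;	/* generation in which ctsn last moved */
};

/* Called when a chunk received on this transport moves the ctsn forward:
 * tag the transport with the association's current generation. */
static void transport_moved_ctsn(const struct assoc_model *a,
				 struct transport_model *t)
{
	t->sack_generation = a->sack_generation;
}

/* A SACK may be bundled only on a transport tagged in the current
 * generation, i.e. one that moved the ctsn since the last SACK. */
static bool may_bundle_sack(const struct assoc_model *a,
			    const struct transport_model *t)
{
	return t->sack_generation == a->sack_generation;
}

/* Building a SACK starts a new generation, which invalidates every
 * transport's tag at once without walking the transport list.
 * Wraparound of the counter is not handled in this sketch. */
static void make_sack(struct assoc_model *a)
{
	a->sack_generation++;
}
```

The point of the counter is the last function: one increment replaces a loop
that would otherwise clear a per-transport flag each time a SACK is sent.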

^ permalink raw reply

* Re: [net-next] e1000e: remove use of IP payload checksum
From: David Miller @ 2012-07-01  0:37 UTC (permalink / raw)
  To: ben; +Cc: jeffrey.t.kirsher, bruce.w.allan, netdev, gospo, sassmann
In-Reply-To: <1341092196.4852.43.camel@deadeye.wl.decadent.org.uk>

From: Ben Hutchings <ben@decadent.org.uk>
Date: Sat, 30 Jun 2012 22:36:36 +0100

> On Sat, 2012-06-30 at 03:35 -0700, Jeff Kirsher wrote:
>> From: Bruce Allan <bruce.w.allan@intel.com>
>> 
>> Currently only used when packet split mode is enabled with jumbo frames,
>> IP payload checksum (for fragmented UDP packets) is mutually exclusive with
>> receive hashing offload since the hardware uses the same space in the
>> receive descriptor for the hardware-provided packet checksum and the RSS
>> hash, respectively.  Users currently must disable jumbos when receive
>> hashing offload is enabled, or vice versa, because of this incompatibility.
>> Since testing has shown that IP payload checksum does not provide any real
>> benefit, just remove it so that there is no longer a choice between jumbos
>> or receive hashing offload but not both as done in other Intel GbE drivers
>> (e.g. e1000, igb).
>> 
>> Also, add a missing check for IP checksum error reported by the hardware;
>> let the stack verify the checksum when this happens.
> [...]
> 
> The change to enable RX hashing in 3.4, with this odd restriction, seems
> to have broken most existing systems using jumbo MTU on e1000e.  None of
> the distro scripts or network management daemons will automatically
> change offload configuration before MTU; how could they know?
> 
> Therefore this needs to be fixed in 3.5 and 3.4.y, not net-next.

Agreed.

^ permalink raw reply

* Re: AF_BUS socket address family
From: David Miller @ 2012-07-01  0:33 UTC (permalink / raw)
  To: alan; +Cc: vincent.sanders, netdev, linux-kernel
In-Reply-To: <20120630141222.60df95a5@pyramind.ukuu.org.uk>

From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Sat, 30 Jun 2012 14:12:22 +0100

> In fact if you look up the stack you'll find a large number of multicast
> messaging systems which do reliable transport built on top of IP. In fact
> Red Hat provides a high level messaging cluster service that does exactly
> this. (as well as dbus which does it on the desktop level) plus a ton of
> stuff on top of that (JGroups etc)
> 
> Everybody at the application level has been using these 'receiver
> reliable'  multicast services for years (Websphere MQ, TIBCO, RTPGM,
> OpenPGM, MS-PGM, you name it). There are even accelerators for PGM based
> protocols in things like Cisco routers and Solarflare can do much of it
> on the card for 10Gbit.

The issue is that what to do when a receiver goes deaf is a policy
issue.

^ permalink raw reply

* Re: [patch -next] netfilter: use kfree_skb() not kfree()
From: David Miller @ 2012-07-01  0:27 UTC (permalink / raw)
  To: dan.carpenter
  Cc: netfilter, coreteam, netdev, bridge, kernel-janitors,
	bart.de.schuymer, netfilter-devel, shemminger, pablo
In-Reply-To: <20120630114853.GA22767@elgon.mountain>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Sat, 30 Jun 2012 14:48:53 +0300

> This should be a kfree_skb() here to free the sk_buff pointer.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

My bad, applied, thanks Dan.

^ permalink raw reply

* Re: [net-next] e1000e: remove use of IP payload checksum
From: Ben Hutchings @ 2012-06-30 21:36 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, Bruce Allan, netdev, gospo, sassmann
In-Reply-To: <1341052528-2444-1-git-send-email-jeffrey.t.kirsher@intel.com>

On Sat, 2012-06-30 at 03:35 -0700, Jeff Kirsher wrote:
> From: Bruce Allan <bruce.w.allan@intel.com>
> 
> Currently only used when packet split mode is enabled with jumbo frames,
> IP payload checksum (for fragmented UDP packets) is mutually exclusive with
> receive hashing offload since the hardware uses the same space in the
> receive descriptor for the hardware-provided packet checksum and the RSS
> hash, respectively.  Users currently must disable jumbos when receive
> hashing offload is enabled, or vice versa, because of this incompatibility.
> Since testing has shown that IP payload checksum does not provide any real
> benefit, just remove it so that there is no longer a choice between jumbos
> or receive hashing offload but not both as done in other Intel GbE drivers
> (e.g. e1000, igb).
> 
> Also, add a missing check for IP checksum error reported by the hardware;
> let the stack verify the checksum when this happens.
[...]

The change to enable RX hashing in 3.4, with this odd restriction, seems
to have broken most existing systems using jumbo MTU on e1000e.  None of
the distro scripts or network management daemons will automatically
change offload configuration before MTU; how could they know?

Therefore this needs to be fixed in 3.5 and 3.4.y, not net-next.

Ben.

-- 
Ben Hutchings
Lowery's Law:
             If it jams, force it. If it breaks, it needed replacing anyway.

^ permalink raw reply

* Re: AF_BUS socket address family
From: Hans-Peter Jansen @ 2012-06-30 20:41 UTC (permalink / raw)
  To: Vincent Sanders; +Cc: netdev, linux-kernel, David S. Miller
In-Reply-To: <1340988354-26981-1-git-send-email-vincent.sanders@collabora.co.uk>

Dear Vincent,

On Friday 29 June 2012, 18:45:39 Vincent Sanders wrote:
> This series adds the bus address family (AF_BUS); it is against
> net-next as of yesterday.
>
> AF_BUS is a message oriented inter process communication system.
>
> The principal features are:
>
>  - Reliable datagram based communication (all sockets are of type
>    SOCK_SEQPACKET)
>
>  - Multicast message delivery (one to many, unicast as a subset)
>
>  - Strict ordering (messages are delivered to every client in the
> same order)
>
>  - Ability to pass file descriptors
>
>  - Ability to pass credentials
>
> The basic concept is to provide a virtual bus on which multiple
> processes can communicate and policy is imposed by a "bus master".
>
> Introduction
> ------------
>
> AF_BUS is based upon AF_UNIX but extended for multicast operation and
> removes stream operation. Responding to extensive feedback on
> previous approaches, we have made the implementation as isolated as
> possible. There are opportunities in the future to integrate the
> socket garbage collector with that of the unix socket implementation.
>
> The impetus for creating this IPC mechanism is to replace the
> underlying transport for D-Bus. The D-Bus system currently emulates
> this IPC mechanism using AF_UNIX sockets in userspace and has
> numerous undesirable behaviours. D-Bus is now widely deployed in many
> areas and has become a de-facto IPC standard. Using this IPC
> mechanism as a transport gives a significant (100% or more)
> improvement to throughput with comparable improvement to latency.
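
For reference, the "reliable datagram" behaviour named in the quoted feature
list (SOCK_SEQPACKET) can already be observed on plain AF_UNIX sockets.  The
sketch below is an illustration only: it uses AF_UNIX, not AF_BUS, and the
helper name is made up for this example.  It sends two messages back to back
and reports how many bytes the first receive returns.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size of the first message read from an
 * AF_UNIX SOCK_SEQPACKET pair after two sends, or -1 on error. */
static long seqpacket_first_read(void)
{
	int sv[2];
	char buf[64];
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	/* Two separate messages, queued before the receiver reads anything. */
	if (send(sv[0], "hello", 5, 0) == 5 &&
	    send(sv[0], "world!", 6, 0) == 6)
		n = recv(sv[1], buf, sizeof(buf), 0);

	close(sv[0]);
	close(sv[1]);
	return (long)n;
}
```

On a SOCK_SEQPACKET socket each receive returns a single message, so the two
sends are not merged even though the buffer could hold both; a SOCK_STREAM
pair gives no such guarantee.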

Your introduction is missing a comprehensive "Discussion" section, where 
you compare the AF_UNIX based implementation with AF_BUS ones. 

You should elaborate on each of the above noted undesirable behaviours, 
why and how AF_BUS is advantageous. Show the workarounds, that are 
needed by AF_UNIX to operate (properly?!?) and how the new 
implementation is going to improve this situation.

This will help to get some progress into the indurated discussion here.

Please also note, that, while your aims are nice and sound, it's even 
more important for IPC mechanisms to operate properly - even during 
persisting error conditions (crashed bus master and clients, 
misbehaving or even abusing members). It would be cool to create a 
D-BUS test rig, that not only measures performance numbers, but also 
checks for deadlocks, corner cases and abuse attempts in both IPC 
implementations.

It's a juggling act: while AF_UNIX might suffer from downsides, the code 
is heavily exercised in every aspect. Your implementation will only be 
exercised by a handful of users (basically one lib), but in order to 
justify its existence in kernel space, such extensions need different 
kinds of users, and the basic concepts need to fit in the whole kernel 
picture as well, or you need to call it AF_DBUS with even less chance 
to get it into mainstream.

Wishing you all the best and good luck,
Pete

^ permalink raw reply

* Re: [BUG, regression, bisected] Marvell 88E8055 NIC (sky2) fails to detect link after resume from S3
From: Francois Romieu @ 2012-06-30 20:02 UTC (permalink / raw)
  To: Michal Zatloukal; +Cc: Stephen Hemminger, netdev
In-Reply-To: <CAKKZj2DVX7Kr4Ag7jubtr7fSa5sSLYR=kt2b6=PV=8fV6q0d8Q@mail.gmail.com>

Michal Zatloukal <myxal.mxl@gmail.com> :
[...]
> Is there something I can try?

I have not used it for quite some time but comparing mmiotrace output
(see Documentation/trace/mmiotrace.txt) before and after the regression
commit may give some hint.

Otherwise I would ask for help on linux-pm@vger.kernel.org

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH V3 1/2] bonding support for IPv6 transmit hashing
From: John Eaglesham @ 2012-06-30 19:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120630.010514.79398765104671796.davem@davemloft.net>

On 6/30/2012 1:05 AM, David Miller wrote:
>
> If you're going to post multiple patches, give them unique
> subject line texts describing what each change does uniquely.
> Do not use identical subject lines ever, that is very unhelpful
> for the people reading your changes.
>
> From: John <linux@8192.net>
> Date: Fri, 29 Jun 2012 23:17:11 -0700
>
>> + skb_network_header_len(skb) >= sizeof(struct ipv6hdr)) {
>> +		ipv6h = ipv6_hdr(skb);
>> +		v6hash =
>> + (ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
>> + (ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
>> + (ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3]);
>> +		v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;
>> + return (v6hash ^ data->h_dest[5] ^ data->h_source[5]) % count;
>
> Either you formatted this terribly, or your email client corrupted
> your patches.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Thanks for the feedback. It must have been my mail client formatting 
that incorrectly. I will re-submit with useful subject lines in a method 
that preserves the intended indentation.

John
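
For readers trying to follow the mangled hunk quoted above, here is a
userspace model of the same arithmetic with the indentation restored.  It is
a sketch, not the submitted patch: the function name and parameter layout are
hypothetical, the address words are taken as plain 32-bit values in whatever
byte order the caller supplies, and only the last byte of each MAC address is
passed in, mirroring the h_dest[5] / h_source[5] terms in the quote.

```c
#include <stdint.h>

/* Hypothetical model of the quoted IPv6 transmit hash.
 * saddr/daddr hold the four 32-bit words of each IPv6 address; words
 * 1..3 are XORed together, word 0 is not used. */
static uint32_t bond_ipv6_hash_model(const uint32_t saddr[4],
				     const uint32_t daddr[4],
				     uint8_t src_mac_last,
				     uint8_t dst_mac_last,
				     unsigned int slave_count)
{
	uint32_t v6hash;

	v6hash = (saddr[1] ^ daddr[1]) ^
		 (saddr[2] ^ daddr[2]) ^
		 (saddr[3] ^ daddr[3]);

	/* Fold the upper bits down into the low bits. */
	v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;

	/* Mix in the last MAC bytes and pick a slave index. */
	return (v6hash ^ dst_mac_last ^ src_mac_last) % slave_count;
}
```

Because every step is an XOR of source and destination values, swapping the
two endpoints yields the same result, so both directions of a flow map to the
same slave index in this model.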

^ permalink raw reply

* Re: [PATCH V3 2/2] bonding support for IPv6 transmit hashing
From: John @ 2012-06-30 19:38 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev
In-Reply-To: <CAMEyesBOj2cd6sAmowjk1vATrMjxHreeJoDEdksVwENkEWWqLQ@mail.gmail.com>

On 6/30/2012 4:59 AM, Hannes Frederic Sowa wrote:
> On Sat, Jun 30, 2012 at 8:17 AM, John <linux@8192.net> wrote:
>> diff --git a/Documentation/networking/bonding.txt
>> b/Documentation/networking/bonding.txt
>> index bfea8a3..5db14fe 100644
>> --- a/Documentation/networking/bonding.txt
>> +++ b/Documentation/networking/bonding.txt
>> @@ -752,12 +752,22 @@ xmit_hash_policy
>>                  protocol information to generate the hash.
>>
>>                  Uses XOR of hardware MAC addresses and IP addresses to
>> -               generate the hash.  The formula is
>> +               generate the hash.  The IPv4 formula is
>>
>>                  (((source IP XOR dest IP) AND 0xffff) XOR
>>                          ( source MAC XOR destination MAC ))
>>                                  modulo slave count
>>
>> +               The IPv6 formula is
>> +
>> +               iphash =
>> +                       (source ip quad 2 XOR dest IP quad 2) XOR
>> +                       (source ip quad 3 XOR dest IP quad 3) XOR
>> +                       (source ip quad 4 XOR dest IP quad 4)
>> +
>> +               ((iphash >> 16) XOR (iphash >> 8) XOR iphash)
>> +                       modulo slave count
>> +
>
> Wouldn't it be beneficial to include the ipv6 flow label in the hash
> calculation?
>
> Greetings,
>
>    Hannes

Hannes,

In all of the traffic I inspected I don't believe I saw a single flow 
label set. Even if it were set 100% of the time by Linux, any packets 
routed or bridged from another operating system wouldn't see any 
benefit. The current algorithm distributes the traffic very well; I 
don't believe adding the flow label would be beneficial even if it were 
set more frequently.

If you feel strongly about its inclusion, though, I am willing to 
reconsider.

John

^ permalink raw reply

* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Paul Bolle @ 2012-06-30 19:11 UTC (permalink / raw)
  To: Emmanuel Grumbach
  Cc: Stanislaw Gruszka, John W. Linville, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <CANUX_P3kypkUv+-meABAOGa6GQ9cR5ugySJj+SjvVfCykCQuYQ@mail.gmail.com>

On Sat, 2012-06-30 at 21:18 +0300, Emmanuel Grumbach wrote:
> You may want to try this one:
> http://www.spinics.net/lists/stable-commits/msg18110.html

That issue looks similar, though it regards iwlwifi. Thanks anyway.
Perhaps commit d6ee27eb13beab94056e0de52d81220058ca2297 ("iwlwifi: don't
mess up the SCD when removing a key") can be ported to iwlegacy. We'll
see whether I manage to port it and whether it helps.

Note that iwlwifi also seems to print the (default) timeout in its
message and not how long a queue was actually stuck. So perhaps
something like my patch could be ported to iwlwifi too.


Paul Bolle

^ permalink raw reply

* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Emmanuel Grumbach @ 2012-06-30 18:18 UTC (permalink / raw)
  To: Paul Bolle
  Cc: Stanislaw Gruszka, John W. Linville, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1341062406.1911.76.camel-uMdlDhfIn7prKue/0VVhAg@public.gmane.org>

>
> On Wed, 2012-06-27 at 10:36 +0200, Paul Bolle wrote:
> > Every now and then, after resuming from suspend, the iwlegacy driver
> > prints
> >     iwl4965 0000:03:00.0: Queue 2 stuck for 2000 ms.
> >     iwl4965 0000:03:00.0: On demand firmware reload
> >
> > I have no idea what causes these errors. But the code currently uses
> > wd_timeout in the first error. wd_timeout will generally be set at
> > IL_DEF_WD_TIMEOUT (ie, 2000). Perhaps printing for how long the queue
> > was actually stuck can clarify the cause of these errors.
>

You may want to try this one:
http://www.spinics.net/lists/stable-commits/msg18110.html
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply
