Netdev List
* Re: net/hsr Patch - Help
From: Arvid Brodin @ 2013-10-15 19:09 UTC (permalink / raw)
  To: Elías Molina Muñoz
  Cc: netdev@vger.kernel.org, Stephen Hemminger, Javier Boticario,
	balferreira, Joe Perches
In-Reply-To: <525CF1F3.2050006@ehu.es>

On 2013-10-15 09:42, Elías Molina Muñoz wrote:
> On 09/09/2013 20:15, Arvid Brodin wrote:
>> On 2013-09-06 10:25, Elías Molina Muñoz wrote:
>>> Dear Mr. Brodin,
>>> 
>>> I would like to introduce myself. My name is Elías Molina, PhD. 
>>> Student at University of Basque Country (Spain). I am writing to 
>>> enquire about your HSR patch.
>> Hi!
>> 
>>> I have read "This is a patch against net-next (2013-08-21)" in
>>> its last version (v3) so I have tried with several kernel
>>> versions but I do not know which is the repo's correct version
>>> of 
>>> http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/
>>> .
>>> 
>>> Could you tell me which is the kernel version to apply your
>>> patch?
>> I made an error when I sent that patch, so it won't apply to any
>> kernel version.
>> 
>> The below patch should work (cd to the net-next directory and apply
>> with patch -Np1):
>> 
[removed]

> Dear Mr. Brodin,
>  
> Thanks for getting back to me and I apologize for being so late
> replying.
>  
> I am writing to ask whether, once the kernel is compiled with your patch,
> there is a sample application for verifying the correct operation of
> HSR, as you did in http://patchwork.ozlabs.org/patch/191165/ with
> Documentation/networking/hsr/hsr_genl.c
>  
> Thank you very much. Best regards,
>  
> Elías Molina

Hi again,

I'm CC:ing the netdev list and others who've shown interest in HSR, since 
they might be interested as well.

Yes, I have patches for iproute2 (to make it possible to add HSR devices)
and also a "hsrinfo" program which can be used to query an HSR interface 
for statistics, and to listen for any HSR errors detected. The hsrinfo 
program is based on the hsr_genl program that you mention. It requires 
the libnl3 library.

The iproute patch is below (I'll send a separate message with the hsrinfo
code).

(I don't think the patch below will get accepted into iproute2 before
the HSR patch itself is accepted into the kernel, and I'm beginning to
doubt that will ever happen, unfortunately.)


This patch adds support to iproute2 for adding High-Availability Seamless
Redundancy (HSR) network devices.

Signed-off-by: Arvid Brodin <arvid.brodin@xdin.com>
---
 include/linux/if_link.h | 12 +++++++
 ip/Makefile             |  2 +-
 ip/iplink_hsr.c         | 86 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 ip/iplink_hsr.c

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index d07aeca..bab39e8 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -465,4 +465,16 @@ enum {
 
 #define IFLA_IPOIB_MAX (__IFLA_IPOIB_MAX - 1)
 
+/* HSR section */
+
+enum {
+	IFLA_HSR_UNSPEC,
+	IFLA_HSR_SLAVE1,
+	IFLA_HSR_SLAVE2,
+	IFLA_HSR_MULTICAST_SPEC,
+	__IFLA_HSR_MAX,
+};
+
+#define IFLA_HSR_MAX (__IFLA_HSR_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/ip/Makefile b/ip/Makefile
index 48bd4a1..5ef1562 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -5,7 +5,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     iplink_vlan.o link_veth.o link_gre.o iplink_can.o \
     iplink_macvlan.o iplink_macvtap.o ipl2tp.o link_vti.o \
     iplink_vxlan.o tcp_metrics.o iplink_ipoib.o ipnetconf.o link_ip6tnl.o \
-    link_iptnl.o
+    link_iptnl.o iplink_hsr.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/iplink_hsr.c b/ip/iplink_hsr.c
new file mode 100644
index 0000000..c7c00d6
--- /dev/null
+++ b/ip/iplink_hsr.c
@@ -0,0 +1,86 @@
+/*
+ * iplink_hsr.c	HSR device support
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Arvid Brodin <arvid.brodin@xdin.com>
+ *
+ * 		Based on iplink_vlan.c by Patrick McHardy <kaber@trash.net>
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/socket.h>  /* Needed by linux/if.h for some reason */
+#include <linux/if.h>
+
+#include "utils.h"
+#include "ip_common.h"
+
+static void usage(void)
+{
+	fprintf(stderr,
+"Usage:\tip link add name NAME type hsr slave1 SLAVE1-IF slave2 SLAVE2-IF\n"
+"\t[ multicast ADDR-BYTE ]\n"
+"\n"
+"NAME\n"
+"	name of new hsr device (e.g. hsr0)\n"
+"SLAVE1-IF, SLAVE2-IF\n"
+"	the two slave devices bound to the HSR device\n"
+"ADDR-BYTE\n"
+"	0-255; the last byte of the multicast address used for HSR supervision\n"
+"	frames (default = 0)\n");
+}
+
+static int hsr_parse_opt(struct link_util *lu, int argc, char **argv,
+			  struct nlmsghdr *n)
+{
+	int ifindex;
+	unsigned char multicast_spec;
+
+	while (argc > 0) {
+		if (matches(*argv, "multicast") == 0) {
+			NEXT_ARG();
+			if (get_u8(&multicast_spec, *argv, 0))
+				invarg("ADDR-BYTE is invalid", *argv);
+			addattr_l(n, 1024, IFLA_HSR_MULTICAST_SPEC, &multicast_spec, 1);
+		} else if (matches(*argv, "slave1") == 0) {
+			NEXT_ARG();
+			ifindex = ll_name_to_index(*argv);
+			if (ifindex == 0)
+				invarg("No such interface", *argv);
+			addattr_l(n, 1024, IFLA_HSR_SLAVE1, &ifindex, 4);
+		} else if (matches(*argv, "slave2") == 0) {
+			NEXT_ARG();
+			ifindex = ll_name_to_index(*argv);
+			if (ifindex == 0)
+				invarg("No such interface", *argv);
+			addattr_l(n, 1024, IFLA_HSR_SLAVE2, &ifindex, 4);
+		} else if (matches(*argv, "help") == 0) {
+			usage();
+			return -1;
+		} else {
+			fprintf(stderr, "hsr: what is \"%s\"?\n", *argv);
+			usage();
+			return -1;
+		}
+		argc--, argv++;
+	}
+
+	return 0;
+}
+
+static void hsr_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+	fprintf(f, "hsr_print_opt() called\n");
+}
+
+struct link_util hsr_link_util = {
+	.id		= "hsr",
+	.maxattr	= IFLA_HSR_MAX,
+	.parse_opt	= hsr_parse_opt,
+	.print_opt	= hsr_print_opt,
+};
-- 
1.8.1.5
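
The usage text in the patch above describes ADDR-BYTE as the last byte of the
multicast address used for HSR supervision frames. A minimal sketch of what
that means follows. The base address 01:15:4e:00:01:00 is an assumption (it
is not stated in this thread), and the names hsr_sup_base and
hsr_sup_multicast_addr are hypothetical, not part of iproute2 or the kernel.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Assumed base address for HSR supervision frames; not taken from the
 * patch above. */
static const uint8_t hsr_sup_base[MAC_LEN] = {
	0x01, 0x15, 0x4e, 0x00, 0x01, 0x00
};

/* Fill 'out' with the supervision multicast address whose last byte is
 * addr_byte, i.e. the value given as "multicast ADDR-BYTE" (0-255). */
static void hsr_sup_multicast_addr(uint8_t out[MAC_LEN], uint8_t addr_byte)
{
	memcpy(out, hsr_sup_base, MAC_LEN);
	out[MAC_LEN - 1] = addr_byte;
}
```

With the default ADDR-BYTE of 0 the result is simply the base address; any
other value only changes the final byte.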



-- 
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarnäsgatan 7 | SE-164 40 Kista | Sweden | xdin.com


* RE: [Ilw] drivers/net/wireless/iwlwifi/dvm/tx.c:456 iwlagn_tx_skb+0x6c5/0x883()
From: Grumbach, Emmanuel @ 2013-10-15 19:11 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: John W. Linville, Berg, Johannes, ilw@linux.intel.com,
	netdev@vger.kernel.org, linux-wireless@vger.kernel.org
In-Reply-To: <1764863131.20131015210406@eikelenboom.it>

[-- Attachment #1: Type: text/plain, Size: 989 bytes --]

> > Please apply this:
> > diff --git a/drivers/net/wireless/iwlwifi/dvm/tx.c b/drivers/net/wireless/iwlwifi/dvm/tx.c
> > index d131f85..5968f19 100644
> > --- a/drivers/net/wireless/iwlwifi/dvm/tx.c
> > +++ b/drivers/net/wireless/iwlwifi/dvm/tx.c
> > @@ -457,8 +457,8 @@ int iwlagn_tx_skb(struct iwl_priv *priv,
> >         WARN_ON_ONCE(is_agg &&
> >                      priv->queue_to_mac80211[txq_id] != info->hw_queue);
> >
> > -       IWL_DEBUG_TX(priv, "TX to [%d|%d] Q:%d - seq: 0x%x\n", sta_id, tid,
> > -                    txq_id, seq_number);
> > +       IWL_DEBUG_TX(priv, "TX to [%d|%d] Q:%d info Q %d - seq: 0x%x\n", sta_id, tid,
> > +                    txq_id, info->hw_queue, seq_number);
> >
> >         if (iwl_trans_tx(priv->trans, skb, dev_cmd, txq_id))
> >                 goto drop_unlock_sta;
> 
> > and send the output back to me
> 
> > Thanks.
> 

Can you please apply the patch attached (and remove the previous change)?
Thanks.


[-- Attachment #2: 0001-iwlwifi-dvm-don-t-override-mac80211-s-queue-setting.patch --]
[-- Type: application/octet-stream, Size: 1902 bytes --]

From fdda79b24213483034cd9a173bd1b078309cb4b9 Mon Sep 17 00:00:00 2001
From: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date: Tue, 15 Oct 2013 22:04:54 +0300
Subject: [PATCH] iwlwifi: dvm: don't override mac80211's queue setting

Since we set IEEE80211_HW_QUEUE_CONTROL, we can let
mac80211 do the queue assignment and don't need to
override its decisions.
This is true for offchannel packets, packets to be sent
after DTIM, but not for AMPDUs since we have a special
queue for them. So for AMPDU, we still override
info->hw_queue by the AMPDU queue.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
---
 drivers/net/wireless/iwlwifi/dvm/tx.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/dvm/tx.c b/drivers/net/wireless/iwlwifi/dvm/tx.c
index d131f85..86196a5 100644
--- a/drivers/net/wireless/iwlwifi/dvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/dvm/tx.c
@@ -433,27 +433,19 @@ int iwlagn_tx_skb(struct iwl_priv *priv,
 	/* Copy MAC header from skb into command buffer */
 	memcpy(tx_cmd->hdr, hdr, hdr_len);
 
+	txq_id = info->hw_queue;
+
 	if (is_agg)
 		txq_id = priv->tid_data[sta_id][tid].agg.txq_id;
 	else if (info->flags & IEEE80211_TX_CTL_SEND_AFTER_DTIM) {
 		/*
-		 * Send this frame after DTIM -- there's a special queue
-		 * reserved for this for contexts that support AP mode.
-		 */
-		txq_id = ctx->mcast_queue;
-
-		/*
 		 * The microcode will clear the more data
 		 * bit in the last frame it transmits.
 		 */
 		hdr->frame_control |=
 			cpu_to_le16(IEEE80211_FCTL_MOREDATA);
-	} else if (info->flags & IEEE80211_TX_CTL_TX_OFFCHAN)
-		txq_id = IWL_AUX_QUEUE;
-	else
-		txq_id = ctx->ac_to_queue[skb_get_queue_mapping(skb)];
+	}
 
-	WARN_ON_ONCE(!is_agg && txq_id != info->hw_queue);
 	WARN_ON_ONCE(is_agg &&
 		     priv->queue_to_mac80211[txq_id] != info->hw_queue);
 
-- 
1.8.1.msysgit.1
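
The queue choice that remains after the attached patch can be sketched in
isolation as below. This is only an illustration: struct tx_info_sketch and
select_txq() are hypothetical stand-ins, and agg_txq_id plays the role of
priv->tid_data[sta_id][tid].agg.txq_id from the driver.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the part of the TX info used here. */
struct tx_info_sketch {
	int hw_queue;	/* queue already chosen by mac80211 */
};

/* Return the queue a frame is sent on: mac80211's choice, unless the
 * frame belongs to an aggregation session, which has its own queue. */
static int select_txq(const struct tx_info_sketch *info, bool is_agg,
		      int agg_txq_id)
{
	int txq_id = info->hw_queue;

	if (is_agg)
		txq_id = agg_txq_id;

	return txq_id;
}
```

The point is that the off-channel and after-DTIM cases no longer need their
own branches, because mac80211 has already put the right queue in hw_queue.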



* Re: [PATCH] bridge: Correctly clamp MAX forward_delay when enabling STP
From: Veaceslav Falico @ 2013-10-15 19:17 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: netdev, Herbert Xu, Stephen Hemminger
In-Reply-To: <1381863465-27304-1-git-send-email-vyasevic@redhat.com>

On Tue, Oct 15, 2013 at 02:57:45PM -0400, Vlad Yasevich wrote:
>Commit be4f154d5ef0ca147ab6bcd38857a774133f5450
>	bridge: Clamp forward_delay when enabling STP
>had a typo when attempting to clamp maximum forward delay.
>
>It is possible to set bridge_forward_delay to be higher than the
>permitted maximum when STP is off.  When turning STP on, the
>higher than allowed delay has to be clamped down to the max value.
>
>CC: Herbert Xu <herbert@gondor.apana.org.au>
>CC: Stephen Hemminger <shemminger@vyatta.com>
>Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>

I think it should also be queued for stable, as it's present there also.
David, mind adding it?

As for the code - great catch!

Reviewed-by: Veaceslav Falico <vfalico@redhat.com>

>---
> net/bridge/br_stp_if.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
>index 108084a..656a6f3 100644
>--- a/net/bridge/br_stp_if.c
>+++ b/net/bridge/br_stp_if.c
>@@ -134,7 +134,7 @@ static void br_stp_start(struct net_bridge *br)
>
> 	if (br->bridge_forward_delay < BR_MIN_FORWARD_DELAY)
> 		__br_set_forward_delay(br, BR_MIN_FORWARD_DELAY);
>-	else if (br->bridge_forward_delay < BR_MAX_FORWARD_DELAY)
>+	else if (br->bridge_forward_delay > BR_MAX_FORWARD_DELAY)
> 		__br_set_forward_delay(br, BR_MAX_FORWARD_DELAY);
>
> 	if (r == 0) {
>-- 
>1.8.3.1
>

* Re: [PATCH RFC 4/5] net: macb: Use devm_request_irq()
From: Sergei Shtylyov @ 2013-10-15 19:21 UTC (permalink / raw)
  To: Soren Brinkmann; +Cc: Nicolas Ferre, netdev, linux-kernel, Michal Simek
In-Reply-To: <1381795140-10792-5-git-send-email-soren.brinkmann@xilinx.com>

Hello.

On 10/15/2013 03:58 AM, Soren Brinkmann wrote:

> Use the device managed interface to request the IRQ, simplifying error
> paths.

> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> ---
>   drivers/net/ethernet/cadence/macb.c | 8 +++-----
>   1 file changed, 3 insertions(+), 5 deletions(-)

> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index 436aecc31732..603844b1d483 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -1825,7 +1825,8 @@ static int __init macb_probe(struct platform_device *pdev)
>   	}
>
>   	dev->irq = platform_get_irq(pdev, 0);
> -	err = request_irq(dev->irq, macb_interrupt, 0, dev->name, dev);
> +	err = devm_request_irq(&pdev->dev, dev->irq, macb_interrupt, 0,
> +			dev->name, dev);

    You should start the continuation line right under &.

WBR, Sergei


* [PATCH 2/2] tcp: remove the sk_can_gso() check from tcp_set_skb_tso_segs()
From: Eric Dumazet @ 2013-10-15 19:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng

From: Eric Dumazet <edumazet@google.com>

sk_can_gso() should only be used as a performance hint in tcp_sendmsg(),
to build GSO packets in the first place.

Once we have GSO packets in the write queue, we cannot decide they are no
longer GSO just because the flow now uses a route which doesn't handle
TSO/GSO.

The core networking stack handles this case very well for us; all we need
is to keep track of packet counts in MSS terms, regardless of the
segmentation done later (in GSO or hardware).

Right now, if tcp_fragment() splits a GSO packet into two parts,
@left and @right, and the route has changed to a non-GSO device,
both @left and @right have pcount set to 1, which is wrong
and leads to incorrect packet_count tracking.

This problem was added in commit d5ac99a648 ("[TCP]: skb pcount with MTU
discovery")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_output.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 8fad1c1..d46f214 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -989,8 +989,7 @@ static void tcp_set_skb_tso_segs(const struct sock *sk, struct sk_buff *skb,
 	/* Make sure we own this skb before messing gso_size/gso_segs */
 	WARN_ON_ONCE(skb_cloned(skb));
 
-	if (skb->len <= mss_now || !sk_can_gso(sk) ||
-	    skb->ip_summed == CHECKSUM_NONE) {
+	if (skb->len <= mss_now || skb->ip_summed == CHECKSUM_NONE) {
 		/* Avoid the costly divide in the normal
 		 * non-TSO case.
 		 */
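
Counting packets "in MSS terms" can be illustrated with a small standalone
sketch. pcount_for() is a hypothetical helper, not the kernel function; it
only shows the rounding-up division that the commit message relies on.

```c
/* Number of MSS-sized segments needed to carry 'len' bytes. A packet that
 * fits in one MSS counts as a single segment; 'mss' must be non-zero. */
static unsigned int pcount_for(unsigned int len, unsigned int mss)
{
	if (len <= mss)
		return 1;
	return (len + mss - 1) / mss;	/* round up */
}
```

For example, with an MSS of 1448, splitting a 4345-byte packet into parts of
2896 and 1449 bytes gives two segments for each part. Setting both counts to
1 because the route lost GSO support would under-count by two.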


* Re: net/hsr Patch - Help
From: Arvid Brodin @ 2013-10-15 19:27 UTC (permalink / raw)
  To: Elías Molina Muñoz
  Cc: netdev@vger.kernel.org, Stephen Hemminger, Javier Boticario,
	balferreira, Joe Perches
In-Reply-To: <525CF1F3.2050006@ehu.es>

On 2013-10-15 09:42, Elías Molina Muñoz wrote:
> On 09/09/2013 20:15, Arvid Brodin wrote:
>> On 2013-09-06 10:25, Elías Molina Muñoz wrote:
>>> Dear Mr. Brodin,
>>> 
>>> I would like to introduce myself. My name is Elías Molina, PhD. 
>>> Student at University of Basque Country (Spain). I am writing to 
>>> enquire about your HSR patch.
>> Hi!
>> 
>>> I have read "This is a patch against net-next (2013-08-21)" in
>>> its last version (v3) so I have tried with several kernel
>>> versions but I do not know which is the repo's correct version
>>> of 
>>> http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/
>>> .
>>> 
>>> Could you tell me which is the kernel version to apply your
>>> patch?
>> I made an error when I sent that patch, so it won't apply to any
>> kernel version.
>> 
>> The below patch should work (cd to the net-next directory and apply
>> with patch -Np1):
>> 
[removed]

> Dear Mr. Brodin,
>  
> Thanks for getting back to me and I apologize for being so late
> replying.
>  
> I am writing to ask whether, once the kernel is compiled with your patch,
> there is a sample application for verifying the correct operation of
> HSR, as you did in http://patchwork.ozlabs.org/patch/191165/ with
> Documentation/networking/hsr/hsr_genl.c
>  
> Thank you very much. Best regards,
>  
> Elías Molina

Hi again,

I'm CC:ing the netdev list and others who've shown interest in HSR, since 
they might be interested as well.

Yes, I have patches for iproute2 (to make it possible to add HSR devices)
and also a "hsrinfo" program which can be used to query an HSR interface 
for statistics, and to listen for any HSR errors detected. The hsrinfo 
program is based on the hsr_genl program that you mention. It requires 
the libnl3 library.


The hsrinfo program is below (the iproute2 patch was sent in a previous
message). This code is not up to the quality standards you would expect
of a kernel/iproute2 patch. The idea is to rewrite it so that it becomes
part of iproute2, but I won't spend time on that unless there is some
progress with the HSR kernel patch.


---
diff -Nurp hsrinfo-a/hsr_netlink.h hsrinfo-b/hsr_netlink.h
--- hsrinfo-a/hsr_netlink.h	1970-01-01 01:00:00.000000000 +0100
+++ hsrinfo-b/hsr_netlink.h	2013-10-15 21:21:09.058960618 +0200
@@ -0,0 +1,72 @@
+/*
+ * Copyright 2011-2013 Autronica Fire and Security AS
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * Author(s):
+ *	2011-2013 Arvid Brodin, arvid.brodin@xdin.com
+ */
+
+#ifndef __HSR_NETLINK_H
+#define __HSR_NETLINK_H
+
+/* attributes */
+enum {
+	HSR_A_UNSPEC,
+	HSR_A_NODE_ADDR,
+	HSR_A_IFINDEX,
+	HSR_A_IF1_AGE,
+	HSR_A_IF2_AGE,
+	HSR_A_NODE_ADDR_B,
+	HSR_A_IF1_SEQ,
+	HSR_A_IF2_SEQ,
+	HSR_A_IF1_IFINDEX,
+	HSR_A_IF2_IFINDEX,
+	HSR_A_ADDR_B_IFINDEX,
+	__HSR_A_MAX,
+};
+#define HSR_A_MAX (__HSR_A_MAX - 1)
+
+
+#ifdef __KERNEL__
+
+#include <linux/if_ether.h>
+#include <linux/module.h>
+
+int __init hsr_netlink_init(void);
+void __exit hsr_netlink_exit(void);
+
+void hsr_nl_ringerror(unsigned char addr[ETH_ALEN], int dev_idx);
+void hsr_nl_nodedown(unsigned char addr[ETH_ALEN]);
+void hsr_nl_framedrop(int dropcount, int dev_idx);
+void hsr_nl_linkdown(int dev_idx);
+
+
+/*
+ * Generic Netlink HSR family definition
+ */
+
+
+#endif /* __KERNEL__ */
+
+
+
+/* commands */
+enum {
+	HSR_C_UNSPEC,
+	HSR_C_RING_ERROR,
+	HSR_C_NODE_DOWN,
+	HSR_C_GET_NODE_STATUS,
+	HSR_C_SET_NODE_STATUS,
+	HSR_C_GET_NODE_LIST,
+	HSR_C_SET_NODE_LIST,
+	__HSR_C_MAX,
+};
+#define HSR_C_MAX (__HSR_C_MAX - 1)
+
+
+
+#endif /* __HSR_NETLINK_H */
diff -Nurp hsrinfo-a/hsrinfo.c hsrinfo-b/hsrinfo.c
--- hsrinfo-a/hsrinfo.c	1970-01-01 01:00:00.000000000 +0100
+++ hsrinfo-b/hsrinfo.c	2013-10-15 21:23:32.983517217 +0200
@@ -0,0 +1,504 @@
+/*
+ * Copyright 2011-2013 Autronica Fire and Security AS
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * Author(s):
+ *	2011-2013 Arvid Brodin, arvid.brodin@xdin.com
+ *
+ * Userspace example of using Generic Netlink (through libnl-3) to get HSR
+ * ("High-availability Seamless Redundancy") link/network status.
+ */
+
+
+/*
+
+Manual static cross-build:
+
+$ PATH=[toolchain-path]/usr/bin/:${PATH} avr32-unknown-linux-uclibc-gcc -Wall -g -I[toolchain-path]/usr/include/libnl3 -static -L[toolchain-path]/usr/lib hsrinfo.c -o hsrinfo -lnl-genl-3 -lnl-3 -pthread -lm
+
+Native build:
+$ gcc -Wall -g -I /usr/include/libnl3/ hsrinfo.c -o hsrinfo-x86 -lnl-genl-3 -lnl-3
+
+*/
+
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+#include <string.h>
+#include <netlink/netlink.h>
+#include <netlink/socket.h>
+#include <netlink/attr.h>
+#include <netlink/genl/genl.h>
+#include <netlink/genl/ctrl.h>
+#include <net/if.h>
+#include <linux/if_ether.h>
+#include "hsr_netlink.h"
+
+struct node_item {
+	struct node_item *next;
+	unsigned char addr[ETH_ALEN];
+};
+
+static struct node_item *node_head = NULL;
+
+static int seq_nr = 10;
+
+
+static void nodelist_clear(struct node_item **ni)
+{
+	if (!*ni)
+		return;
+
+	nodelist_clear(&(*ni)->next);
+
+	free(*ni);
+	(*ni) = NULL;
+}
+
+static void nodelist_add(struct node_item **head, const char addr[ETH_ALEN])
+{
+	struct node_item **ni;
+
+	ni = head;
+	while (*ni)
+		ni = &(*ni)->next;
+
+	*ni = calloc(1, sizeof(struct node_item));
+	if (!*ni)
+		return; // No mem
+
+	memcpy((*ni)->addr, addr, ETH_ALEN);
+}
+
+
+static void print_mac(const unsigned char *addr)
+{
+	int i;
+
+	if (!addr) {
+		printf("(null)           ");
+		return;
+	}
+
+	for (i = 0; i < ETH_ALEN - 1; i++)
+		printf("%02x:", addr[i]);
+	printf("%02x", addr[ETH_ALEN - 1]);
+}
+
+
+static void parse_ring_error(struct genlmsghdr *hdr)
+{
+	struct nlattr *attr;
+	unsigned char *AddrA;
+	int ifindex;
+	char ifname[IF_NAMESIZE];
+	char *nameptr;
+
+	printf("Ring error: ");
+
+	AddrA = NULL;
+	ifindex = -1;
+
+	attr = genlmsg_attrdata(hdr, 0);
+	int remaining = genlmsg_attrlen(hdr, 0);
+	while (nla_ok(attr, remaining)) {
+		switch (attr->nla_type) {
+		case HSR_A_NODE_ADDR:
+			AddrA = nla_data(attr);
+			break;
+		case HSR_A_IFINDEX:
+			ifindex = nla_get_u32(attr);
+			break;
+		default:
+			printf("unknown attribute type: %d\n", attr->nla_type);
+		}
+		attr = nla_next(attr, &remaining);
+	}
+
+	if (!AddrA) {
+		printf("Error: invalid HSR_C_RING_ERROR packet\n");
+		return;
+	}
+
+	nameptr = if_indextoname(ifindex, ifname);
+	if (!nameptr)
+		snprintf(ifname, IF_NAMESIZE, "if%d", ifindex);
+	printf("interface %s, node ", ifname);
+	print_mac(AddrA);
+	printf("\n");
+}
+
+static void parse_node_down(struct genlmsghdr *hdr)
+{
+	struct nlattr *attr;
+	unsigned char *AddrA;
+
+	printf("Node down: ");
+
+	AddrA = NULL;
+
+	attr = genlmsg_attrdata(hdr, 0);
+	int remaining = genlmsg_attrlen(hdr, 0);
+	while (nla_ok(attr, remaining)) {
+		switch (attr->nla_type) {
+		case HSR_A_NODE_ADDR:
+			AddrA = nla_data(attr);
+			break;
+		default:
+			printf("unknown attribute type: %d\n", attr->nla_type);
+		}
+		attr = nla_next(attr, &remaining);
+	}
+
+	if (!AddrA) {
+		printf("Error: invalid HSR_C_NODE_DOWN packet\n");
+		return;
+	}
+
+	print_mac(AddrA);
+	printf("\n");
+}
+
+static void parse_node_status(struct genlmsghdr *hdr)
+{
+	unsigned char *AddrA, *AddrB;
+	int if1_age, if2_age;
+	int if1_seq, if2_seq;
+	int if1_ifindex, if2_ifindex, addr_b_ifindex;
+	char if1_ifname[IF_NAMESIZE];
+	char if2_ifname[IF_NAMESIZE];
+	char addr_b_ifname[IF_NAMESIZE];
+	char *nameptr;
+	struct nlattr *attr;
+
+	AddrA = NULL;
+	AddrB = NULL;
+	if1_age = -1;
+	if2_age = -1;
+	if1_seq = -1;
+	if2_seq = -1;
+	if1_ifindex = -1;
+	if2_ifindex = -1;
+	addr_b_ifindex = -1;
+
+	attr = genlmsg_attrdata(hdr, 0);
+	int remaining = genlmsg_attrlen(hdr, 0);
+	while (nla_ok(attr, remaining)) {
+		switch (attr->nla_type) {
+		case HSR_A_NODE_ADDR:
+			if (AddrA)
+				printf("%s: Too many AddrA in message!\n", __func__);
+			AddrA = nla_data(attr);
+			break;
+		case HSR_A_NODE_ADDR_B:
+			if (AddrB)
+				printf("%s: Too many AddrB in message!\n", __func__);
+			AddrB = nla_data(attr);
+			break;
+		case HSR_A_IFINDEX:
+			break;
+		case HSR_A_IF1_AGE:
+			if1_age = (int) nla_get_u32(attr);
+			break;
+		case HSR_A_IF2_AGE:
+			if2_age = (int) nla_get_u32(attr);
+			break;
+		case HSR_A_IF1_SEQ:
+			if1_seq = nla_get_u16(attr);
+			break;
+		case HSR_A_IF2_SEQ:
+			if2_seq = nla_get_u16(attr);
+			break;
+		case HSR_A_IF1_IFINDEX:
+			if1_ifindex = nla_get_u32(attr);
+			break;
+		case HSR_A_IF2_IFINDEX:
+			if2_ifindex = nla_get_u32(attr);
+			break;
+		case HSR_A_ADDR_B_IFINDEX:
+			addr_b_ifindex = nla_get_u32(attr);
+			break;
+		default:
+			printf("%s: unknown attribute type: %d\n", __func__, attr->nla_type);
+		}
+		attr = nla_next(attr, &remaining);
+	}
+
+	if (!AddrA) {
+		printf("Error: invalid HSR_C_SET_NODE_STATUS packet\n");
+		return;
+	}
+
+	nameptr = if_indextoname(if1_ifindex, if1_ifname);
+	if (!nameptr)
+		snprintf(if1_ifname, IF_NAMESIZE, "if%d", if1_ifindex);
+	nameptr = if_indextoname(if2_ifindex, if2_ifname);
+	if (!nameptr)
+		snprintf(if2_ifname, IF_NAMESIZE, "if%d", if2_ifindex);
+	nameptr = if_indextoname(addr_b_ifindex, addr_b_ifname);
+	if (!nameptr)
+		snprintf(addr_b_ifname, IF_NAMESIZE, "if%d", addr_b_ifindex);
+
+	printf("Node: ");
+	print_mac(AddrA);
+	if (AddrB) {
+		printf("    AddrB: ");
+		print_mac(AddrB);
+		printf(" (over %s)", addr_b_ifname);
+	}
+	printf("\n      Sequence nr (age): %s: %5d (%5d ms); %s: %5d (%5d ms)\n",
+		if1_ifname, if1_seq, if1_age,
+		if2_ifname, if2_seq, if2_age);
+}
+
+static void parse_node_list(struct genlmsghdr *hdr)
+{
+	struct nlattr *attr;
+
+	nodelist_clear(&node_head);
+
+	attr = genlmsg_attrdata(hdr, 0);
+	int remaining = genlmsg_attrlen(hdr, 0);
+	while (nla_ok(attr, remaining)) {
+		switch (attr->nla_type) {
+		case HSR_A_NODE_ADDR:
+			nodelist_add(&node_head, nla_data(attr));
+			break;
+		default:
+			printf("Unknown attribute type for HSR_C_SET_NODE_LIST: %d\n", attr->nla_type);
+		}
+		attr = nla_next(attr, &remaining);
+	}
+}
+
+
+static int parse_genlmsg(struct nl_msg *msg, void *arg)
+{
+	struct genlmsghdr *hdr;
+
+	/*
+	 * Extract command ID from "message" -> "netlink header" ->
+	 * "generic netlink header".
+	 *
+	 * These are the command enums used when creating a genl msg header
+	 * in the kernel with genlmsg_put().
+	 */
+	hdr = genlmsg_hdr(nlmsg_hdr(msg));
+
+//	printf("%d: ", nlmsg_hdr(msg)->nlmsg_seq);
+	switch (hdr->cmd) {
+	case HSR_C_RING_ERROR:
+		parse_ring_error(hdr);
+		break;
+	case HSR_C_NODE_DOWN:
+		parse_node_down(hdr);
+		break;
+	case HSR_C_SET_NODE_STATUS:
+		parse_node_status(hdr);
+		break;
+	case HSR_C_SET_NODE_LIST:
+		parse_node_list(hdr);
+		break;
+	default:
+		printf("Unknown genl message received (%d)\n", hdr->cmd);
+	}
+
+	return 0;
+}
+
+
+static int query_get_node_status(struct nl_sock *nlsk, int family, int ifindex,
+					const unsigned char node_addr[ETH_ALEN])
+{
+	struct nl_msg *msg;
+	void *user_hdr;
+
+	msg = nlmsg_alloc();
+	if (!msg)
+		return -1;
+
+	user_hdr = genlmsg_put(msg, NL_AUTO_PORT, seq_nr++, family,
+				0, NLM_F_REQUEST, HSR_C_GET_NODE_STATUS, 1);
+	if (!user_hdr)
+		goto nla_put_failure;
+
+	NLA_PUT_U32(msg, HSR_A_IFINDEX, ifindex);
+	NLA_PUT(msg, HSR_A_NODE_ADDR, ETH_ALEN, node_addr);
+
+/*
+	printf("Querying if %d for status of node ", ifindex);
+	print_mac(node_addr);
+	printf("\n");
+*/
+
+	return (nl_send_auto(nlsk, msg));
+
+nla_put_failure:
+	nlmsg_free(msg);
+	return -1;
+}
+
+
+static int query_get_node_list(struct nl_sock *nlsk, int family, int ifindex)
+{
+	struct nl_msg *msg;
+	void *user_hdr;
+
+	msg = nlmsg_alloc();
+	if (!msg)
+		return -1;
+
+	user_hdr = genlmsg_put(msg, NL_AUTO_PORT, seq_nr++, family,
+				0, NLM_F_REQUEST, HSR_C_GET_NODE_LIST, 1);
+	if (!user_hdr)
+		goto nla_put_failure;
+
+	NLA_PUT_U32(msg, HSR_A_IFINDEX, ifindex);
+
+//	printf("Querying if %d for node list\n", ifindex);
+
+	return (nl_send_auto(nlsk, msg));
+
+nla_put_failure:
+	nlmsg_free(msg);
+	return -1;
+}
+
+
+
+static void print_usage(const char *name)
+{
+	printf(
+"Usage: %s [-q] interface [node mac address]\n"
+"Display ring error messages for a HSR network interface, or\n"
+"(-q) query the interface node database. The node address parameter is only\n"
+"valid with -q, and limits the output to data about a specific node.\n", name);
+}
+
+
+static const char optstring[] = "+q";
+
+int main(int argc, char **argv)
+{
+	struct nl_sock *nlsk;
+	int hsr_mgroup;
+	int query;
+	int opt, rc = EXIT_FAILURE;	/* default for early "goto out" paths */
+	int hsr_ifindex;
+	struct node_item *ni;
+
+	query = 0;
+	opt = getopt(argc, argv, optstring);
+	while (opt != -1) {
+		switch (opt) {
+		case 'q':
+			query = 1;
+			break;
+		default:
+			print_usage(argv[0]);
+			return EXIT_FAILURE;
+		}
+		opt = getopt(argc, argv, optstring);
+	}
+
+
+	nlsk = nl_socket_alloc();
+	if (!nlsk) {
+		printf("nl_socket_alloc() failed\n");
+		return EXIT_FAILURE;
+	}
+	nl_socket_modify_cb(nlsk, NL_CB_VALID, NL_CB_CUSTOM, parse_genlmsg, NULL);
+	genl_connect(nlsk);
+
+	/*
+	 * Sign up for HSR messages
+	 */
+	hsr_mgroup = genl_ctrl_resolve_grp(nlsk, "HSR", "hsr-network");
+	if (hsr_mgroup < 0) {
+		printf("genl_ctrl_resolve_grp() failed: %d\n", hsr_mgroup);
+		rc = EXIT_FAILURE;
+		goto out;
+	}
+
+	nl_socket_disable_seq_check(nlsk);
+
+	if (!query) {
+		if (argc - optind != 1) {
+			print_usage(argv[0]);
+			return EXIT_FAILURE;
+		}
+
+//		printf("Registering for multicast group %d\n", hsr_mgroup);
+		rc = nl_socket_add_memberships(nlsk, hsr_mgroup, 0);
+		if (rc < 0) {
+			printf("nl_socket_add_memberships() failed: %d\n", rc);
+			goto out;
+		}
+
+		while (1)
+			nl_recvmsgs_default(nlsk);
+
+		/* Not reached */
+	}
+
+
+	if (argc - optind < 1) {
+		print_usage(argv[0]);
+		return EXIT_FAILURE;
+	}
+
+	/* The hsr if we send the enquiry to (get it with e.g.
+	 * 'cat /sys/class/net/hsr0/ifindex'): */
+	hsr_ifindex = if_nametoindex(argv[optind]);
+	if (hsr_ifindex == 0) {
+		printf("%s: %s\n", argv[optind], strerror(errno));
+		exit(EXIT_FAILURE);
+	}
+
+	/* Get node list */
+	int hsr_family;
+
+	hsr_family = genl_ctrl_resolve(nlsk, "HSR");
+	if (hsr_family < 0) {
+		printf("genl_ctrl_resolve() failed: %d\n", hsr_family);
+		goto out;
+	}
+
+	rc = query_get_node_list(nlsk, hsr_family, hsr_ifindex);
+//	printf("query_get_node_list() returned %d\n", rc);
+
+	rc = nl_recvmsgs_default(nlsk);
+//	printf("nl_recvmsgs_default() returned %d\n", rc);
+
+	ni = node_head;
+	while (ni) {
+		/*
+		 * Send a query about the status of another node on the HSR network:
+		 */
+		/* The node to enquire about: */
+		//const unsigned char node[ETH_ALEN] = {0x00, 0x24, 0x74, 0x00, 0x17, 0xAD};
+
+		rc = query_get_node_status(nlsk, hsr_family, hsr_ifindex, ni->addr);
+//		printf("query_node_status() returned %d\n", rc);
+
+		ni = ni->next;
+	}
+
+	while (1) {
+		rc = nl_recvmsgs_default(nlsk);
+//		printf("nl_recvmsgs_default() returned %d\n", rc);
+	}
+
+
+	rc = EXIT_SUCCESS;
+out:
+	nl_close(nlsk);
+	nl_socket_free(nlsk);
+	return rc;
+}
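
The parse_*() functions in hsrinfo.c all walk the attribute list with
nla_ok() and nla_next(). What that loop does can be sketched without libnl,
assuming the usual netlink attribute layout: a 2-byte length that includes
the header, a 2-byte type, then the payload padded to a 4-byte boundary. The
struct and function names below are hypothetical and are not libnl API.

```c
#include <stdint.h>
#include <string.h>

/* Assumed attribute header layout, in host byte order. */
struct attr_hdr_sketch {
	uint16_t len;	/* header plus payload, without padding */
	uint16_t type;
};

#define ATTR_ALIGN_SKETCH(n) (((n) + 3u) & ~3u)

/* Return a pointer to the payload of the first attribute of the given
 * type, or NULL if there is none. *payload_len receives its length. */
static const unsigned char *find_attr(const unsigned char *buf,
				      size_t remaining, uint16_t type,
				      size_t *payload_len)
{
	while (remaining >= sizeof(struct attr_hdr_sketch)) {
		struct attr_hdr_sketch h;
		size_t step;

		memcpy(&h, buf, sizeof(h));
		if (h.len < sizeof(h) || h.len > remaining)
			break;	/* malformed or truncated attribute */

		if (h.type == type) {
			*payload_len = h.len - sizeof(h);
			return buf + sizeof(h);
		}

		step = ATTR_ALIGN_SKETCH(h.len);
		if (step > remaining)
			break;	/* last attribute, no trailing padding */
		buf += step;
		remaining -= step;
	}
	return NULL;
}
```

hsrinfo itself switches on every attribute in one pass rather than searching
for a single type, but the bounds checks and the aligned step are the same
idea.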


-- 
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarnäsgatan 7 | SE-164 40 Kista | Sweden | xdin.com


* Re: [PATCH 2/2] tcp: remove the sk_can_gso() check from tcp_set_skb_tso_segs()
From: Eric Dumazet @ 2013-10-15 19:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng
In-Reply-To: <1381865094.2045.69.camel@edumazet-glaptop.roam.corp.google.com>

On Tue, 2013-10-15 at 12:24 -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>

> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> ---

Reported-by: Maciej Żenczykowski <maze@google.com>


* Re: [PATCH v4] net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)
From: Arvid Brodin @ 2013-10-15 19:59 UTC (permalink / raw)
  To: netdev
  Cc: David Miller, shemminger, joe, jboticario, balferreira,
	elias.molina, Arvid Brodin
In-Reply-To: <5249942F.3090504@xdin.com>

On 2013-09-30 17:09, Arvid Brodin wrote:
> On 2013-09-20 21:10, David Miller wrote:
>> From: Arvid Brodin <arvid.brodin@xdin.com>
>> Date: Thu, 19 Sep 2013 03:11:58 +0200
>>
>>> +#if !defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
>>> +		/* We need to memmove the whole header to work around
>>> +		 * alignment problems caused by the 6-byte HSR tag.
>>> +		 */
>>> +		memmove(skb_deliver->data - HSR_TAGLEN, skb_deliver->data,
>>> +			skb_headlen(skb_deliver));
>>> +		skb_deliver->data -= HSR_TAGLEN;
>>> +		skb_deliver->tail -= HSR_TAGLEN;
>>> +#endif
>>
>> You can't do this.
>>
>> First of all, you have no idea if subtracting skb->data a given amount
>> will underflow the skb buffer start.  You aren't even checking, all
>> of the standard skb_*() data adjustment interfaces do.
> 
> (Shorter and more to the point than my previous replies:)
> 
> I _do_ know: this can't possibly underflow since strip_hsr_tag() a 
> few lines above pulled the same amount of data. I will rename 
> strip_hsr_tag() to hsr_pull_tag() to make this clearer.
> 
> 
>> Secondly, everything after the header is now at the wrong offset from
>> the beginning of the packet.
> 
> How does this matter? The memmove moves everything back (restores the 
> changes made to the packet on the sending side) so that it is at the
> "normal" position for an ethernet packet.
> 

Obviously, David is too busy to help me figure out what the problem is 
(I know he reviews several thousand patches each year, so maybe that's
no wonder). 

If anyone else has got an idea you are very welcome to chime in, and
perhaps we can solve this. I can't fix the problem if I don't understand
it.


On 2013-09-20 21:10, David Miller wrote:
> Secondly, everything after the header is now at the wrong offset from
> the beginning of the packet.

Maybe he's talking about systems with NET_SKBUFF_DATA_USES_OFFSET here? 
This means the transport, network and mac headers are all relative to 
skb->head, if I understand correctly. But at the point in the protocol 
stack where this code is (the Ethernet protocol handler), the transport 
and network headers have not been set yet, and the mac header is not
moved by the code. And tail is updated by the code. So that should not 
be a problem?
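
To make the underflow argument concrete, here is a minimal userspace sketch. It is not kernel code and not part of the patch: struct toy_buf and the toy_* helpers are hypothetical stand-ins for the skb linear area, and TAGLEN stands in for HSR_TAGLEN. It only models the claim that shifting the data back by N bytes cannot run past the start of the buffer if the same N bytes were pulled just before.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAGLEN 6	/* stands in for HSR_TAGLEN from the quoted patch */

/* Toy stand-in for an skb's linear area; NOT the kernel's struct sk_buff. */
struct toy_buf {
	unsigned char mem[64];
	size_t data;	/* offset of the first valid byte */
	size_t tail;	/* offset one past the last valid byte */
};

/* Drop n bytes from the front, similar in spirit to a pull.
 * Returns 0 on success, -1 if fewer than n bytes are available.
 */
int toy_pull(struct toy_buf *b, size_t n)
{
	assert(b->data <= b->tail);
	if (b->tail - b->data < n)
		return -1;
	b->data += n;
	return 0;
}

/* Move the remaining bytes n positions towards the start of the buffer.
 * Returns -1 instead of moving if that would go below offset 0; this is
 * the underflow check the review comment asks about.
 */
int toy_shift_back(struct toy_buf *b, size_t n)
{
	assert(b->data <= b->tail);
	if (b->data < n)
		return -1;
	memmove(b->mem + b->data - n, b->mem + b->data, b->tail - b->data);
	b->data -= n;
	b->tail -= n;
	return 0;
}
```

In this model, a successful toy_pull(b, TAGLEN) leaves b->data >= TAGLEN, so a following toy_shift_back(b, TAGLEN) always passes its check. Whether the same reasoning is acceptable for the real skb code is exactly the open question above.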


-- 
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarnäsgatan 7 | SE-164 40 Kista | Sweden | xdin.com

^ permalink raw reply

* Re: [Ilw] drivers/net/wireless/iwlwifi/dvm/tx.c:456 iwlagn_tx_skb+0x6c5/0x883()
From: Sander Eikelenboom @ 2013-10-15 20:19 UTC (permalink / raw)
  To: Grumbach, Emmanuel
  Cc: John W. Linville, Berg, Johannes,
	ilw-VuQAYsv1563Yd54FQh9/CA@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <0BA3FCBA62E2DC44AF3030971E174FB301DC4E24-yLYCJVCiOGwLt2AQoY/u9bfspsVTdybXVpNB7YpNyf8@public.gmane.org>


Tuesday, October 15, 2013, 9:11:36 PM, you wrote:

>> > Please apply this:
>> > diff --git a/drivers/net/wireless/iwlwifi/dvm/tx.c b/drivers/net/wireless/iwlwifi/dvm/tx.c
>> > index d131f85..5968f19 100644
>> > --- a/drivers/net/wireless/iwlwifi/dvm/tx.c
>> > +++ b/drivers/net/wireless/iwlwifi/dvm/tx.c
>> > @@ -457,8 +457,8 @@ int iwlagn_tx_skb(struct iwl_priv *priv,
>> >         WARN_ON_ONCE(is_agg &&
>> >                      priv->queue_to_mac80211[txq_id] != info->hw_queue);
>> >
>> > -       IWL_DEBUG_TX(priv, "TX to [%d|%d] Q:%d - seq: 0x%x\n", sta_id, tid,
>> > -                    txq_id, seq_number);
>> > +       IWL_DEBUG_TX(priv, "TX to [%d|%d] Q:%d info Q %d - seq: 0x%x\n", sta_id, tid,
>> > +                    txq_id, info->hw_queue, seq_number);
>> >
>> >         if (iwl_trans_tx(priv->trans, skb, dev_cmd, txq_id))
>> >                 goto drop_unlock_sta;
>> 
>> > and send the output back to me
>> 
>> > Thanks.
>> 

> Can you please apply the patch attached (and remove the previous change)?
> Thanks.


That seems to make the warning go away :-)

[    7.306696] iwlwifi 0000:02:00.0: L1 Disabled; Enabling L0S
[    7.315790] iwlwifi 0000:02:00.0: Radio type=0x2-0x1-0x0
[    7.362212] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 9 on FIFO 7 WrPtr: 0
[    7.364973] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 0 on FIFO 3 WrPtr: 0
[    7.365090] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 1 on FIFO 2 WrPtr: 0
[    7.365208] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 2 on FIFO 1 WrPtr: 0
[    7.365324] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 3 on FIFO 0 WrPtr: 0
[    7.365440] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 4 on FIFO 0 WrPtr: 0
[    7.365556] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 5 on FIFO 4 WrPtr: 0
[    7.365672] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 6 on FIFO 2 WrPtr: 0
[    7.365789] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 7 on FIFO 5 WrPtr: 0
[    7.365905] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 8 on FIFO 4 WrPtr: 0
[    7.366034] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 10 on FIFO 5 WrPtr: 0
[    7.602726] iwlwifi 0000:02:00.0: L1 Disabled; Enabling L0S
[    7.612133] iwlwifi 0000:02:00.0: Radio type=0x2-0x1-0x0
[    7.658168] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 9 on FIFO 7 WrPtr: 0
[    7.661021] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 0 on FIFO 3 WrPtr: 0
[    7.663693] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 1 on FIFO 2 WrPtr: 0
[    7.666341] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 2 on FIFO 1 WrPtr: 0
[    7.668914] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 3 on FIFO 0 WrPtr: 0
[    7.671464] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 4 on FIFO 0 WrPtr: 0
[    7.673057] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 5 on FIFO 4 WrPtr: 0
[    7.674631] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 6 on FIFO 2 WrPtr: 0
[    7.676174] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 7 on FIFO 5 WrPtr: 0
[    7.677657] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 8 on FIFO 4 WrPtr: 0
[    7.679133] iwlwifi 0000:02:00.0: U iwl_trans_pcie_txq_enable Activate queue 10 on FIFO 5 WrPtr: 0
[    7.730037] device wlan0 entered promiscuous mode
[    7.732390] xen_bridge: port 2(wlan0) entered forwarding state
[    7.733984] xen_bridge: port 2(wlan0) entered forwarding state
[    7.735692] cfg80211: Pending regulatory request, waiting for it to be processed...
[    7.743541] iwlwifi 0000:02:00.0: I iwlagn_tx_skb TX to [14|8] Q:8 - seq: 0x0
[    7.745347] device wlan0 left promiscuous mode
[    7.747088] xen_bridge: port 2(wlan0) entered disabled state
[    7.748034] iwlwifi 0000:02:00.0: I iwl_trans_pcie_reclaim [Q 8] 0 -> 1 (1)
[    7.748039] iwlwifi 0000:02:00.0: I iwlagn_rx_reply_tx TXQ 8 status SUCCESS (0x00000201)
[    7.748042] iwlwifi 0000:02:00.0: I iwlagn_rx_reply_tx                               initial_rate 0x820a retries 0, idx=0 ssn=1 seq_ctl=0x0
[    7.769963] iwlwifi 0000:02:00.0: I iwl_pcie_txq_inc_wr_ptr Q:9 WR: 0x23
[    7.821813] iwlwifi 0000:02:00.0: I iwl_pcie_txq_inc_wr_ptr Q:9 WR: 0x24
[    7.824914] iwlwifi 0000:02:00.0: I iwl_pcie_txq_inc_wr_ptr Q:9 WR: 0x25
[    7.828033] iwlwifi 0000:02:00.0: I iwl_pcie_txq_inc_wr_ptr Q:9 WR: 0x26
[    7.831078] iwlwifi 0000:02:00.0: I iwl_pcie_txq_inc_wr_ptr Q:9 WR: 0x27
[    7.834005] iwlwifi 0000:02:00.0: I iwl_pcie_txq_inc_wr_ptr Q:9 WR: 0x28
[    7.836087] iwlwifi 0000:02:00.0: I iwl_pcie_txq_unmap Q 9 Free 39

^ permalink raw reply

* Re: [PATCH RFC 4/5] net: macb: Use devm_request_irq()
From: Sören Brinkmann @ 2013-10-15 20:20 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Nicolas Ferre, netdev, linux-kernel, Michal Simek
In-Reply-To: <525D95A4.3020706@cogentembedded.com>

On Tue, Oct 15, 2013 at 11:21:08PM +0400, Sergei Shtylyov wrote:
> Hello.
> 
> On 10/15/2013 03:58 AM, Soren Brinkmann wrote:
> 
> >Use the device managed interface to request the IRQ, simplifying error
> >paths.
> 
> >Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> >---
> >  drivers/net/ethernet/cadence/macb.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> >diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> >index 436aecc31732..603844b1d483 100644
> >--- a/drivers/net/ethernet/cadence/macb.c
> >+++ b/drivers/net/ethernet/cadence/macb.c
> >@@ -1825,7 +1825,8 @@ static int __init macb_probe(struct platform_device *pdev)
> >  	}
> >
> >  	dev->irq = platform_get_irq(pdev, 0);
> >-	err = request_irq(dev->irq, macb_interrupt, 0, dev->name, dev);
> >+	err = devm_request_irq(&pdev->dev, dev->irq, macb_interrupt, 0,
> >+			dev->name, dev);
> 
>    You should start the continuation line right under &.
Actually this one is a good example why I don't do such alignment: You
do a simple search & replace - in this case request_irq ->
devm_request_irq - and all alignment is gone.

	Sören

^ permalink raw reply

* Re: [PATCH] veth: Showing peer of veth type dev in ip link (kernel side)
From: Eric W. Biederman @ 2013-10-15 20:34 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: Stephen Hemminger, David Miller, yamato, netdev
In-Reply-To: <525D7109.4010004@6wind.com>

Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:

> Le 10/10/2013 02:17, Eric W. Biederman a écrit :
>>
>> Right.
>>
>> IFLA_NET_NS_PID is not invertible as there may be no processes running
>> in a pid namespace.
>>
>> IFLA_NET_NS_FD is in principle invertible.  We just need to add a file
>> descriptor to the callers fd table.  I don't see IFLA_NET_NS_FD being
>> invertible for broadcast messages, but for unicast it looks like a bit
>> of a pain but there are no fundamental problems.
> I'm not sure I understand why it is invertible only for unicast messages.
> Or are you saying that it is invertible only for the netns where the
> caller stands (and then not for the veth peer)?

The pain is that it is a special case of SCM_RIGHTS aka passing file
descriptors.  Right now we don't support SCM_RIGHTS on netlink sockets
and so from that perspective IFLA_NET_NS_FD is a bit of a hack.

For unicast messages we can just stuff a file descriptor in the calling
process and be done with it.  For multicast messages we have to be much
more complete.
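
For comparison, this is roughly what ordinary SCM_RIGHTS passing looks like on an AF_UNIX socket, where it is supported today. It is only a userspace sketch: send_fd() and recv_fd() are hypothetical helper names, and nothing here works on netlink sockets, which is the point made above.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one file descriptor over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on failure.
 */
int send_fd(int sock, int fd)
{
	char byte = 0;	/* at least one byte of real data must accompany it */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; the kernel installs it in the receiver's
 * fd table. Returns the new descriptor, or -1 on failure.
 */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

Note that the receiver here has to supply a control buffer to get the descriptor at all, which is the kind of explicit opt-in that a plain IFLA_NET_NS_FD attribute in a reply would not have.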

>> I don't know if we care enough yet to write the code for the
>> IFLA_NET_NS_FD attribute but it is doable.
> I care ;-)
> Has somebody already started to write a patch?

For IFLA_NET_NS_FD not that I know of.

Mostly it is doable but there are some silly cases.
- Do we need to actually implement SCM_RIGHTS to prevent people
  accepting file descriptors unknowingly and hitting their file
  descriptor limits?

  In which case we need to call the attribute IFLA_NET_NS_SCM_FD
  so we know it is just an index into the passed file descriptors.

- Do we need an extra permission check to prevent keeping a network
  namespace alive longer than necessary?  Aka there are some permission
  checks on opening and bind mounting /proc/<pid>/ns/net; do we need
  a similar check?  Perhaps we would need to require CAP_NET_ADMIN over
  the target network namespace.

Beyond that it is just the logistics to open what is essentially
/proc/<pid>/ns/net and add it to the file descriptor table of the
requesting process.  Exactly which mount of proc we are going to
find the appropriate file to open I don't know.

It isn't likely to be lots of code but it is code that the necessary
infrastructure is not in place for, and a bunch of moderately hairy
corner cases to deal with.

Eric

^ permalink raw reply

* Re: kernel policy routing table src ip not respected since 2.6.37 and commit 9fc3bbb4a752
From: Julian Anastasov @ 2013-10-15 20:36 UTC (permalink / raw)
  To: Vincent Li; +Cc: netdev@vger.kernel.org, Joel Sing
In-Reply-To: <CAK3+h2w13isZEOurBMv57L0H_pkqMdYmPaNE3Kn5vPxfqErOMw@mail.gmail.com>


	Hello,

On Tue, 15 Oct 2013, Vincent Li wrote:

> it is strange though when 10.1.1.9 is unreachable address and the ping
> utility reports error 'Destination Host Unreachable' with source
> 10.1.1.1.  Before 2.6.37, it reports 10.1.1.2.

	I see, it is the icmp_send() function that uses
inet_select_addr() to select primary source for locally generated
ICMP errors that are sent back to the sender (10.1.1.2).
In your case it is the error_report from the ARP code.

	So, you are correct about the commit but I don't
see any problem with this behavior. IIRC, users preferred
to see primary addresses in the traceroute output.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH RFC 4/5] net: macb: Use devm_request_irq()
From: Sergei Shtylyov @ 2013-10-15 20:38 UTC (permalink / raw)
  To: Sören Brinkmann; +Cc: Nicolas Ferre, netdev, linux-kernel, Michal Simek
In-Reply-To: <ffb65abe-36e2-45c3-9d40-02050545184d@TX2EHSMHS038.ehs.local>

On 10/16/2013 12:20 AM, Sören Brinkmann wrote:

>>> Use the device managed interface to request the IRQ, simplifying error
>>> paths.

>>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>>> ---
>>>   drivers/net/ethernet/cadence/macb.c | 8 +++-----
>>>   1 file changed, 3 insertions(+), 5 deletions(-)

>>> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
>>> index 436aecc31732..603844b1d483 100644
>>> --- a/drivers/net/ethernet/cadence/macb.c
>>> +++ b/drivers/net/ethernet/cadence/macb.c
>>> @@ -1825,7 +1825,8 @@ static int __init macb_probe(struct platform_device *pdev)
>>>   	}
>>>
>>>   	dev->irq = platform_get_irq(pdev, 0);
>>> -	err = request_irq(dev->irq, macb_interrupt, 0, dev->name, dev);
>>> +	err = devm_request_irq(&pdev->dev, dev->irq, macb_interrupt, 0,
>>> +			dev->name, dev);

>>     You should start the continuation line right under &.

> Actually this one is a good example why I don't do such alignment: You
> do a simple search & replace - in this case request_irq ->
> devm_request_irq - and all alignment is gone.

    I didn't understand why this is a good example. In this case you broke the 
line yourself and did it incorrectly, not following the networking coding 
style which assumes Emacs-style alignment for broken lines.

> 	Sören

WBR, Sergei

^ permalink raw reply

* "xfrm: Fix the gc threshold value for ipv4" broke my IPSec connections
From: Damian Pietras @ 2013-10-15 20:40 UTC (permalink / raw)
  To: netdev

I've recently upgraded from 3.4.x to 3.10.x and this broke my IPSec
setup in transport mode. The simplest test case is to set up a few such
connections between a few boxes like this:

spdadd 192.168.1.100 192.168.2.100 any -P out ipsec
           esp/transport//require
           ah/transport//require;

spdadd 192.168.2.100 192.168.1.100 any -P in ipsec
           esp/transport//require
           ah/transport//require;

Then set up an HTTP server on one box and run ab on the other box to
create some TCP connections:

ab -n 10000 -c 50 http://192.168.1.100/

Then the connect() call will very quickly start returning ENOBUFS. I
haven't seen anything wrong with my simple setup (just a copy of
ipsec-howto.org in transport mode with pre-shared keys) and started
bisecting. That way I found that this commit breaks my case:

703fb94ec58e0e8769380c2877a8a34aeb5b6c97
xfrm: Fix the gc threshold value for ipv4

Reverting it on 3.10.15 fixes my issue. The commit seems to have been there
since 3.7 and I don't really believe such a simple case stayed broken for so
long. Am I missing something, or is there really a bug?

If someone is interested in the details of this configuration and the commands
I'm running, just let me know. This was reproduced with a few VMs under Xen.

-- 
Damian Pietras

^ permalink raw reply

* Re: [PATCH ipsec] xfrm: prevent ipcomp scratch buffer race condition
From: Michal Kubecek @ 2013-10-15 20:55 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Herbert Xu, David S. Miller, netdev
In-Reply-To: <20131015083348.GW7660@secunet.com>

On Tue, Oct 15, 2013 at 10:33:48AM +0200, Steffen Klassert wrote:
> > diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
> > index 2906d52..96946fb 100644
> > --- a/net/xfrm/xfrm_ipcomp.c
> > +++ b/net/xfrm/xfrm_ipcomp.c
> > @@ -48,9 +48,11 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
> >  	const int cpu = get_cpu();
> >  	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
> >  	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
> > -	int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
> > +	int err;
> >  	int len;
> >  
> > +	local_bh_disable();
> 
> Maybe we could disable the BHs before we fetch the percpu pointers.
> Then we can use smp_processor_id() to get the cpu. With that we
> could get rid of a (now useless) preempt_disable()/preempt_enable()
> pair. Same could be done in ipcomp_compress().

Sounds like a good idea. I'll send v2 after some basic testing.

                                                 Michal Kubecek

^ permalink raw reply

* Re: "xfrm: Fix the gc threshold value for ipv4" broke my IPSec connections
From: Eric Dumazet @ 2013-10-15 21:02 UTC (permalink / raw)
  To: Damian Pietras; +Cc: netdev
In-Reply-To: <525DA855.1010905@daper.net>

On Tue, 2013-10-15 at 22:40 +0200, Damian Pietras wrote:
> I've recently upgraded from 3.4.x to 3.10.x and this broke my IPSec
> setup in transport mode. The simplest test case is to set up a few such
> connections between a few boxes like this:
> 
> spdadd 192.168.1.100 192.168.2.100 any -P out ipsec
>            esp/transport//require
>            ah/transport//require;
> 
> spdadd 192.168.2.100 192.168.1.100 any -P in ipsec
>            esp/transport//require
>            ah/transport//require;
> 
> Then set up an HTTP server on one box and run ab on the other box to
> create some TCP connections:
> 
> ab -n 10000 -c 50 http://192.168.1.100/
> 
> Then the connect() call will very quickly start returning ENOBUFS. I
> haven't seen anything wrong with my simple setup (just a copy of
> ipsec-howto.org in transport mode with pre-shared keys) and started
> bisecting. That way I found that this commit breaks my case:
> 
> 703fb94ec58e0e8769380c2877a8a34aeb5b6c97
> xfrm: Fix the gc threshold value for ipv4
> 
> Reverting it on 3.10.15 fixes my issue. The commit seems to have been there
> since 3.7 and I don't really believe such a simple case stayed broken for so
> long. Am I missing something, or is there really a bug?
> 
> If someone is interested in the details of this configuration and the commands
> I'm running, just let me know. This was reproduced with a few VMs under Xen.
> 

It looks like you need to tune /proc/sys/net/ipv4/xfrm4_gc_thresh to a
sensible value given your workload.

Try:

echo 65536 >/proc/sys/net/ipv4/xfrm4_gc_thresh

Presumably the 1024 default is really too small...

^ permalink raw reply

* Re: kernel policy routing table src ip not respected since 2.6.37 and commit 9fc3bbb4a752
From: Vincent Li @ 2013-10-15 21:38 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev@vger.kernel.org, Joel Sing
In-Reply-To: <alpine.LFD.2.03.1310152242170.1919@ssi.bg>

ok, thanks for the clarification.

On Tue, Oct 15, 2013 at 1:36 PM, Julian Anastasov <ja@ssi.bg> wrote:
>
>         Hello,
>
> On Tue, 15 Oct 2013, Vincent Li wrote:
>
>> it is strange though when 10.1.1.9 is unreachable address and the ping
>> utility reports error 'Destination Host Unreachable' with source
>> 10.1.1.1.  Before 2.6.37, it reports 10.1.1.2.
>
>         I see, it is the icmp_send() function that uses
> inet_select_addr() to select primary source for locally generated
> ICMP errors that are sent back to the sender (10.1.1.2).
> In your case it is the error_report from the ARP code.
>
>         So, you are correct about the commit but I don't
> see any problem with this behavior. IIRC, users preferred
> to see primary addresses in the traceroute output.
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* [PATCH ipsec v2] xfrm: prevent ipcomp scratch buffer race condition
From: Michal Kubecek @ 2013-10-15 21:40 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Herbert Xu, David S. Miller, netdev
In-Reply-To: <20131015083348.GW7660@secunet.com>

In ipcomp_compress(), softirq is enabled too early, allowing the
per-cpu scratch buffer to be rewritten by ipcomp_decompress()
(called on the same CPU in softirq context) between populating
the buffer and copying the compressed data to the skb.

Add similar protection into ipcomp_decompress() as it can be
called from process context as well (even if such scenario seems
a bit artificial).

v2: as pointed out by Steffen Klassert, if we also move the
local_bh_disable() before reading the per-cpu pointers, we can
get rid of get_cpu()/put_cpu().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 net/xfrm/xfrm_ipcomp.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index 2906d52..9fe4f42 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -45,12 +45,17 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 	const int plen = skb->len;
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	const u8 *start = skb->data;
-	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
-	int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
+	struct crypto_comp *tfm;
+	u8 *scratch;
+	int cpu;
+	int err;
 	int len;
 
+	local_bh_disable();
+	cpu = smp_processor_id();
+	scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
+	tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
 	if (err)
 		goto out;
 
@@ -103,7 +108,7 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 	err = 0;
 
 out:
-	put_cpu();
+	local_bh_enable();
 	return err;
 }
 
@@ -141,14 +146,16 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	const int plen = skb->len;
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	u8 *start = skb->data;
-	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	struct crypto_comp *tfm;
+	u8 *scratch;
+	int cpu;
 	int err;
 
 	local_bh_disable();
+	cpu = smp_processor_id();
+	scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
+	tfm = *per_cpu_ptr(ipcd->tfms, cpu);
 	err = crypto_comp_compress(tfm, start, plen, scratch, &dlen);
-	local_bh_enable();
 	if (err)
 		goto out;
 
@@ -158,13 +165,13 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	}
 
 	memcpy(start + sizeof(struct ip_comp_hdr), scratch, dlen);
-	put_cpu();
+	local_bh_enable();
 
 	pskb_trim(skb, dlen + sizeof(struct ip_comp_hdr));
 	return 0;
 
 out:
-	put_cpu();
+	local_bh_enable();
 	return err;
 }
 
-- 
1.8.1.4
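
As an illustration only (not part of the patch), the ordering problem can be modelled deterministically in userspace. Everything below is a hypothetical stand-in: scratch plays the per-cpu buffer, the hook parameter marks the point where a softirq could run on the same CPU once BHs are enabled, and clobber_scratch() plays ipcomp_decompress() reusing the buffer.

```c
#include <stddef.h>
#include <string.h>

#define SCRATCH_SIZE 32

/* One shared buffer, standing in for the per-cpu scratch area. */
static unsigned char scratch[SCRATCH_SIZE];

/* Hypothetical hook: called where BHs would be (re-)enabled, i.e. where
 * another user of the scratch buffer could run on the same CPU.
 */
typedef void (*bh_hook)(void);

/* Stand-in for the compression step: just copies input into scratch. */
static void fill_scratch(const unsigned char *src, size_t len)
{
	memcpy(scratch, src, len);
}

/* Old ordering: BHs are enabled between filling scratch and copying
 * the result out, so the hook may overwrite scratch in that window.
 */
int process_racy(const unsigned char *in, unsigned char *out, size_t len,
		 bh_hook hook)
{
	if (len > SCRATCH_SIZE)
		return -1;
	fill_scratch(in, len);
	if (hook)
		hook();		/* window: scratch is not yet consumed */
	memcpy(out, scratch, len);
	return 0;
}

/* New ordering: the result is copied out before BHs are enabled. */
int process_fixed(const unsigned char *in, unsigned char *out, size_t len,
		  bh_hook hook)
{
	if (len > SCRATCH_SIZE)
		return -1;
	fill_scratch(in, len);
	memcpy(out, scratch, len);
	if (hook)
		hook();		/* scratch already consumed, harmless */
	return 0;
}

/* Example interloper that reuses (overwrites) the scratch buffer. */
void clobber_scratch(void)
{
	memset(scratch, 0xff, SCRATCH_SIZE);
}
```

With clobber_scratch as the hook, process_racy() returns overwritten data while process_fixed() returns the original input. The real race is of course timing dependent; this model only shows why the copy has to happen before local_bh_enable().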

^ permalink raw reply related

* Re: [PATCH] net: sh_eth: Fix RX packets errors on R8A7740
From: Sergei Shtylyov @ 2013-10-15 21:48 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Nguyen Hong Ky, David S. Miller, netdev, Ryusuke Sakato,
	Simon Horman
In-Reply-To: <Pine.LNX.4.64.1310150926220.5601@axis700.grange>

Hello.

On 10/15/2013 11:28 AM, Guennadi Liakhovetski wrote:

>>> This patch fixes RX packet errors when receiving large amounts
>>> of data by setting bit RNC = 1.

>>> RNC - Receive Enable Control

>>> 0: Upon completion of reception of one frame, the E-DMAC writes
>>> the receive status to the descriptor and clears the RR bit in
>>> EDRRR to 0.

>>> 1: Upon completion of reception of one frame, the E-DMAC writes
>>> (writes back) the receive status to the descriptor. In addition,
>>> the E-DMAC reads the next descriptor and prepares for reception
>>> of the next frame.

>>> In addition, to make packet reception more stable, I set the
>>> maximum size for the transmit/receive FIFO and insert padding
>>> in receive data.

>>> Signed-off-by: Nguyen Hong Ky <nh-ky@jinso.co.jp>
>>> ---
>>>    drivers/net/ethernet/renesas/sh_eth.c |    4 ++++
>>>    1 files changed, 4 insertions(+), 0 deletions(-)

>>> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
>>> index a753928..11d34f0 100644
>>> --- a/drivers/net/ethernet/renesas/sh_eth.c
>>> +++ b/drivers/net/ethernet/renesas/sh_eth.c
>>> @@ -649,12 +649,16 @@ static struct sh_eth_cpu_data r8a7740_data = {
>>>    	.eesr_err_check	= EESR_TWB1 | EESR_TWB | EESR_TABT | EESR_RABT |
>>>    			  EESR_RFE | EESR_RDE | EESR_RFRMER | EESR_TFE |
>>>    			  EESR_TDE | EESR_ECI,
>>> +	.fdr_value	= 0x0000070f,
>>> +	.rmcr_value	= 0x00000001,
>>>
>>>    	.apr		= 1,
>>>    	.mpr		= 1,
>>>    	.tpauser	= 1,
>>>    	.bculr		= 1,
>>>    	.hw_swap	= 1,
>>> +	.rpadir		= 1,
>>> +	.rpadir_value   = 2 << 16,
>>>    	.no_trimd	= 1,
>>>    	.no_ade		= 1,
>>>    	.tsu		= 1,

>>     Guennadi, could you check if this patch fixes your issue with NFS. Make
>> sure it applies to 'r8a7740_data' (it was misapplied to DaveM's tree).

> Yes, the current -next, which includes this patch (in a slightly different
> form) boots fine over NFS for me.

    I don't know what you mean by "slightly different form" exactly. Also, I 
was unable to locate the fresh -next tree. 'net-next.git' contains this patch 
in a mismerged form, 'net.git' has Simon's patch that corrects this mismerge.

> Thanks
> Guennadi

WBR, Sergei

^ permalink raw reply

* Re: [PATCH v3 net-next] openvswitch: fix vport-netdev unregister
From: Alexei Starovoitov @ 2013-10-15 21:49 UTC (permalink / raw)
  To: Jesse Gross
  Cc: David S. Miller, Pravin B Shelar, Jiri Pirko, Cong Wang,
	dev@openvswitch.org, netdev
In-Reply-To: <CAMEtUuxysgzNqeBpBQ-ajLtymYCvEF-GMiUcQuy1b-QA=dFhdw@mail.gmail.com>

On Tue, Oct 15, 2013 at 9:53 AM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> On Tue, Oct 15, 2013 at 8:31 AM, Jesse Gross <jesse@nicira.com> wrote:
>> On Sun, Oct 13, 2013 at 8:50 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
>>> diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c
>>> index c323567..ffa429a 100644
>>> --- a/net/openvswitch/dp_notify.c
>>> +++ b/net/openvswitch/dp_notify.c
>>> @@ -59,15 +59,9 @@ void ovs_dp_notify_wq(struct work_struct *work)
>>>                         struct hlist_node *n;
>>>
>>>                         hlist_for_each_entry_safe(vport, n, &dp->ports[i], dp_hash_node) {
>>> -                               struct netdev_vport *netdev_vport;
>>> -
>>>                                 if (vport->ops->type != OVS_VPORT_TYPE_NETDEV)
>>>                                         continue;
>>> -
>>> -                               netdev_vport = netdev_vport_priv(vport);
>>> -                               if (netdev_vport->dev->reg_state == NETREG_UNREGISTERED ||
>>> -                                   netdev_vport->dev->reg_state == NETREG_UNREGISTERING)
>>> -                                       dp_detach_port_notify(vport);
>>> +                               dp_detach_port_notify(vport);
>>
>> Doesn't this free *all* ports of type OVS_VPORT_TYPE_NETDEV when any
>> one of them is removed?
>
> sorry. not sure what I was thinking on Sunday evening. will respin

I take that back. The check was removed to prevent a hang when a device
moves between network namespaces, since reg_state will still be
NETREG_REGISTERED in that case. But yes, a different check is needed.
Sending v4.

^ permalink raw reply

* [PATCH v4 net-next] openvswitch: fix vport-netdev unregister
From: Alexei Starovoitov @ 2013-10-15 21:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: Jesse Gross, Pravin B Shelar, Jiri Pirko, Cong Wang, dev, netdev

The combination of two commits:
commit 8e4e1713e4
("openvswitch: Simplify datapath locking.")
commit 2537b4dd0a
("openvswitch: link upper device for port devices")

introduced a bug where upper_dev wasn't unlinked upon
NETDEV_UNREGISTER notification.

The following steps:

  modprobe openvswitch
  ovs-dpctl add-dp test
  ip tuntap add dev tap1 mode tap
  ovs-dpctl add-if test tap1
  ip tuntap del dev tap1 mode tap

are causing multiple warnings:

[   62.747557] gre: GRE over IPv4 demultiplexor driver
[   62.749579] openvswitch: Open vSwitch switching datapath
[   62.755087] device test entered promiscuous mode
[   62.765911] device tap1 entered promiscuous mode
[   62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready
[   62.769017] ------------[ cut here ]------------
[   62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240()
[   62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[   62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60
[   62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   62.769053]  0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006
[   62.769055]  0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58
[   62.769057]  ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8
[   62.769059] Call Trace:
[   62.769062]  [<ffffffff8175e575>] dump_stack+0x55/0x76
[   62.769065]  [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[   62.769067]  [<ffffffff8105319a>] warn_slowpath_null+0x1a/0x20
[   62.769069]  [<ffffffff8162a04f>] rollback_registered_many+0x20f/0x240
[   62.769071]  [<ffffffff8162a101>] rollback_registered+0x31/0x40
[   62.769073]  [<ffffffff8162a488>] unregister_netdevice_queue+0x58/0x90
[   62.769075]  [<ffffffff8154f900>] __tun_detach+0x140/0x340
[   62.769077]  [<ffffffff8154fb36>] tun_chr_close+0x36/0x60
[   62.769080]  [<ffffffff811bddaf>] __fput+0xff/0x260
[   62.769082]  [<ffffffff811bdf5e>] ____fput+0xe/0x10
[   62.769084]  [<ffffffff8107b515>] task_work_run+0xb5/0xe0
[   62.769087]  [<ffffffff810029b9>] do_notify_resume+0x59/0x80
[   62.769089]  [<ffffffff813a41fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   62.769091]  [<ffffffff81770f5a>] int_signal+0x12/0x17
[   62.769093] ---[ end trace 838756c62e156ffb ]---
[   62.769481] ------------[ cut here ]------------
[   62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[   62.769486] sysfs: can not remove 'master', no directory
[   62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[   62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G        W    3.12.0-rc3+ #60
[   62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch]
[   62.769519]  0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[   62.769521]  ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28
[   62.769523]  0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500
[   62.769525] Call Trace:
[   62.769528]  [<ffffffff8175e575>] dump_stack+0x55/0x76
[   62.769529]  [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[   62.769531]  [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[   62.769533]  [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[   62.769535]  [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[   62.769538]  [<ffffffff81631ef7>] __netdev_adjacent_dev_remove+0xf7/0x150
[   62.769540]  [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[   62.769542]  [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[   62.769544]  [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[   62.769548]  [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[   62.769550]  [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[   62.769552]  [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[   62.769555]  [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[   62.769557]  [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[   62.769559]  [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[   62.769562]  [<ffffffff8107659b>] worker_thread+0x11b/0x370
[   62.769564]  [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[   62.769566]  [<ffffffff8107f44a>] kthread+0xea/0xf0
[   62.769568]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769570]  [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[   62.769572]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769573] ---[ end trace 838756c62e156ffc ]---
[   62.769574] ------------[ cut here ]------------
[   62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
[   62.769577] sysfs: can not remove 'upper_test', no directory
[   62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
[   62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G        W    3.12.0-rc3+ #60
[   62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[   62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch]
[   62.769607]  0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
[   62.769609]  ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58
[   62.769611]  0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500
[   62.769613] Call Trace:
[   62.769615]  [<ffffffff8175e575>] dump_stack+0x55/0x76
[   62.769617]  [<ffffffff8105314c>] warn_slowpath_common+0x8c/0xc0
[   62.769619]  [<ffffffff81053236>] warn_slowpath_fmt+0x46/0x50
[   62.769621]  [<ffffffff8123e7e9>] sysfs_hash_and_remove+0xa9/0xb0
[   62.769622]  [<ffffffff81240e96>] sysfs_remove_link+0x26/0x30
[   62.769624]  [<ffffffff81631f22>] __netdev_adjacent_dev_remove+0x122/0x150
[   62.769627]  [<ffffffff81632037>] __netdev_adjacent_dev_unlink_lists+0x27/0x50
[   62.769629]  [<ffffffff8163213a>] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
[   62.769631]  [<ffffffff8163218d>] netdev_upper_dev_unlink+0x3d/0x140
[   62.769633]  [<ffffffffa033c2db>] netdev_destroy+0x4b/0x80 [openvswitch]
[   62.769636]  [<ffffffffa033b696>] ovs_vport_del+0x46/0x60 [openvswitch]
[   62.769638]  [<ffffffffa0335314>] ovs_dp_detach_port+0x44/0x60 [openvswitch]
[   62.769640]  [<ffffffffa0336574>] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
[   62.769642]  [<ffffffff81075c28>] process_one_work+0x1d8/0x6a0
[   62.769644]  [<ffffffff81075bc8>] ? process_one_work+0x178/0x6a0
[   62.769646]  [<ffffffff8107659b>] worker_thread+0x11b/0x370
[   62.769648]  [<ffffffff81076480>] ? rescuer_thread+0x350/0x350
[   62.769650]  [<ffffffff8107f44a>] kthread+0xea/0xf0
[   62.769652]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769654]  [<ffffffff81770bac>] ret_from_fork+0x7c/0xb0
[   62.769656]  [<ffffffff8107f360>] ? flush_kthread_worker+0x150/0x150
[   62.769657] ---[ end trace 838756c62e156ffd ]---
[   62.769724] device tap1 left promiscuous mode

This patch also affects moving devices between net namespaces.

OVS used to ignore netns move notifications, which caused problems.
For example:
  ovs-dpctl add-if test tap1
  ip link set tap1 netns 3512
and then removing tap1 inside the namespace would cause a hang on the
missing dev_put.

With this patch, OVS will detach the dev upon receiving a netns move event.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 net/openvswitch/dp_notify.c    |    7 +++++--
 net/openvswitch/vport-netdev.c |   16 +++++++++++++---
 net/openvswitch/vport-netdev.h |    1 +
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/dp_notify.c b/net/openvswitch/dp_notify.c
index c323567..5c2dab2 100644
--- a/net/openvswitch/dp_notify.c
+++ b/net/openvswitch/dp_notify.c
@@ -65,8 +65,7 @@ void ovs_dp_notify_wq(struct work_struct *work)
 					continue;
 
 				netdev_vport = netdev_vport_priv(vport);
-				if (netdev_vport->dev->reg_state == NETREG_UNREGISTERED ||
-				    netdev_vport->dev->reg_state == NETREG_UNREGISTERING)
+				if (!(netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH))
 					dp_detach_port_notify(vport);
 			}
 		}
@@ -88,6 +87,10 @@ static int dp_device_event(struct notifier_block *unused, unsigned long event,
 		return NOTIFY_DONE;
 
 	if (event == NETDEV_UNREGISTER) {
+		/* upper_dev_unlink and decrement promisc immediately */
+		ovs_netdev_detach_dev(vport);
+
+		/* schedule vport destroy, dev_put and genl notification */
 		ovs_net = net_generic(dev_net(dev), ovs_net_id);
 		queue_work(system_wq, &ovs_net->dp_notify_work);
 	}
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 09d93c1..d21f77d 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -150,15 +150,25 @@ static void free_port_rcu(struct rcu_head *rcu)
 	ovs_vport_free(vport_from_priv(netdev_vport));
 }
 
-static void netdev_destroy(struct vport *vport)
+void ovs_netdev_detach_dev(struct vport *vport)
 {
 	struct netdev_vport *netdev_vport = netdev_vport_priv(vport);
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	netdev_vport->dev->priv_flags &= ~IFF_OVS_DATAPATH;
 	netdev_rx_handler_unregister(netdev_vport->dev);
-	netdev_upper_dev_unlink(netdev_vport->dev, get_dpdev(vport->dp));
+	netdev_upper_dev_unlink(netdev_vport->dev,
+				netdev_master_upper_dev_get(netdev_vport->dev));
 	dev_set_promiscuity(netdev_vport->dev, -1);
+}
+
+static void netdev_destroy(struct vport *vport)
+{
+	struct netdev_vport *netdev_vport = netdev_vport_priv(vport);
+
+	rtnl_lock();
+	if (netdev_vport->dev->priv_flags & IFF_OVS_DATAPATH)
+		ovs_netdev_detach_dev(vport);
 	rtnl_unlock();
 
 	call_rcu(&netdev_vport->rcu, free_port_rcu);
diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h
index dd298b5..8df01c11 100644
--- a/net/openvswitch/vport-netdev.h
+++ b/net/openvswitch/vport-netdev.h
@@ -39,5 +39,6 @@ netdev_vport_priv(const struct vport *vport)
 }
 
 const char *ovs_netdev_get_name(const struct vport *);
+void ovs_netdev_detach_dev(struct vport *);
 
 #endif /* vport_netdev.h */
-- 
1.7.9.5
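
The guard that the patch adds to netdev_destroy() can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct model_dev, MODEL_IFF_OVS_DATAPATH, model_detach_dev() and model_destroy() are hypothetical stand-ins invented for illustration, not kernel definitions, and locking (rtnl) is left out entirely. It shows why testing the flag before detaching keeps the teardown from running twice when the notifier has already detached the device.

```c
/* Hypothetical userspace stand-in for the IFF_OVS_DATAPATH flag. */
#define MODEL_IFF_OVS_DATAPATH 0x1u

/* Hypothetical stand-in for the few net_device fields the patch touches. */
struct model_dev {
	unsigned int priv_flags;
	int promiscuity;
	int detach_calls;	/* how many times the teardown actually ran */
};

/* Plays the role of ovs_netdev_detach_dev(): clear the flag, undo setup. */
static void model_detach_dev(struct model_dev *dev)
{
	dev->priv_flags &= ~MODEL_IFF_OVS_DATAPATH;
	dev->promiscuity -= 1;
	dev->detach_calls += 1;
}

/* Plays the role of netdev_destroy(): detach only if still attached. */
static void model_destroy(struct model_dev *dev)
{
	if (dev->priv_flags & MODEL_IFF_OVS_DATAPATH)
		model_detach_dev(dev);
}
```

Calling model_detach_dev() first (as the notifier would) and model_destroy() afterwards leaves the promiscuity count decremented exactly once; calling model_destroy() alone still performs the detach.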


* Re: "xfrm: Fix the gc threshold value for ipv4" broke my IPSec connections
From: Damian Pietras @ 2013-10-15 22:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1381870957.2045.73.camel@edumazet-glaptop.roam.corp.google.com>

On 15.10.2013 23:02, Eric Dumazet wrote:
>> 703fb94ec58e0e8769380c2877a8a34aeb5b6c97
>> xfrm: Fix the gc threshold value for ipv4
>>
>> Reverting it on 3.10.15 fixes my issue. This seems to be there from 3.7
>> and I don't really believe such a simple case stayed broken for so long.
>> Am I missing something, or is there really a bug?
>>
>> If someone is interested in details of this configuration and commands
>> I'm running, just let me know. This was reproduced with a few VMs under Xen.
>>
> 
> It looks like you need to tune /proc/sys/net/ipv4/xfrm4_gc_thresh to a
> sensible value given your workload.
> 
> try :
> 
> echo 65536 >/proc/sys/net/ipv4/xfrm4_gc_thresh
> 
> Presumably the 1024 default is really too small...

Now it's working in my test setup, I'm changing it on the production
boxes, thanks!
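
Before rolling the change out, it may help to check which value a box is currently running with. The following is a minimal sketch, not part of any existing tool: read_sysctl_long() and sysctl_below() are hypothetical helper names, and the only thing taken from this thread is the path /proc/sys/net/ipv4/xfrm4_gc_thresh and the values 1024 and 65536.

```c
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file.
 * Returns 0 on success and stores the value in *out, -1 on any failure.
 */
static int read_sysctl_long(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Returns 1 if the value in the file is below `wanted`, 0 if not, -1 on error. */
static int sysctl_below(const char *path, long wanted)
{
	long cur;

	if (read_sysctl_long(path, &cur) != 0)
		return -1;
	return cur < wanted;
}
```

For example, sysctl_below("/proc/sys/net/ipv4/xfrm4_gc_thresh", 65536) would return 1 on a box that is still at the 1024 default, and -1 if the file cannot be read.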


-- 
Damian Pietras


* [PATCH] sh_eth: add/use RMCR.RNC bit
From: Sergei Shtylyov @ 2013-10-15 22:29 UTC (permalink / raw)
  To: netdev; +Cc: nobuhiro.iwamatsu.yj, linux-sh, davem, horms

Declare 'enum RMCR_BIT' containing the single member for the RMCR.RNC bit and
replace bare numbers in the driver with this mnemonic.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

---
This patch is against DaveM's 'net.git' repo but is intended for the
'net-next.git' repo, because 'net-next.git' doesn't contain the required patch
by Simon Horman yet.

 drivers/net/ethernet/renesas/sh_eth.c |    6 +++---
 drivers/net/ethernet/renesas/sh_eth.h |    3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

Index: net/drivers/net/ethernet/renesas/sh_eth.c
===================================================================
--- net.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net/drivers/net/ethernet/renesas/sh_eth.c
@@ -483,7 +483,7 @@ static struct sh_eth_cpu_data sh7757_dat
 	.register_type	= SH_ETH_REG_FAST_SH4,
 
 	.eesipr_value	= DMAC_M_RFRMER | DMAC_M_ECI | 0x003fffff,
-	.rmcr_value	= 0x00000001,
+	.rmcr_value	= RMCR_RNC,
 
 	.tx_check	= EESR_FTC | EESR_CND | EESR_DLC | EESR_CD | EESR_RTO,
 	.eesr_err_check	= EESR_TWB | EESR_TABT | EESR_RABT | EESR_RFE |
@@ -561,7 +561,7 @@ static struct sh_eth_cpu_data sh7757_dat
 			  EESR_RFE | EESR_RDE | EESR_RFRMER | EESR_TFE |
 			  EESR_TDE | EESR_ECI,
 	.fdr_value	= 0x0000072f,
-	.rmcr_value	= 0x00000001,
+	.rmcr_value	= RMCR_RNC,
 
 	.irq_flags	= IRQF_SHARED,
 	.apr		= 1,
@@ -689,7 +689,7 @@ static struct sh_eth_cpu_data r8a7740_da
 			  EESR_RFE | EESR_RDE | EESR_RFRMER | EESR_TFE |
 			  EESR_TDE | EESR_ECI,
 	.fdr_value	= 0x0000070f,
-	.rmcr_value	= 0x00000001,
+	.rmcr_value	= RMCR_RNC,
 
 	.apr		= 1,
 	.mpr		= 1,
Index: net/drivers/net/ethernet/renesas/sh_eth.h
===================================================================
--- net.orig/drivers/net/ethernet/renesas/sh_eth.h
+++ net/drivers/net/ethernet/renesas/sh_eth.h
@@ -321,6 +321,9 @@ enum TD_STS_BIT {
 #define TD_TFP	(TD_TFP1|TD_TFP0)
 
 /* RMCR */
+enum RMCR_BIT {
+	RMCR_RNC = 0x00000001,
+};
 #define DEFAULT_RMCR_VALUE	0x00000000
 
 /* ECMR */
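
As a quick illustration of what the mnemonic buys outside of the initializers, here is a small userspace sketch. The enum value is the one from the patch above; rmcr_rnc_is_set() is a hypothetical helper that does not exist in the driver and is only here to show a named bit test in place of a bare 0x00000001.

```c
#include <stdint.h>

/* Same name and value as in the patch; the rest of this block is made up. */
enum RMCR_BIT {
	RMCR_RNC = 0x00000001,
};

/* Hypothetical helper: is the RNC bit set in a raw RMCR register value? */
static int rmcr_rnc_is_set(uint32_t rmcr)
{
	return (rmcr & RMCR_RNC) != 0;
}
```

With the enum in place, a reader of a test like rmcr_rnc_is_set(value) no longer has to look up which bit 0x00000001 refers to.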


* Re: [PATCH ipsec v2] xfrm: prevent ipcomp scratch buffer race condition
From: Eric Dumazet @ 2013-10-15 22:44 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: Steffen Klassert, Herbert Xu, David S. Miller, netdev
In-Reply-To: <20131015214030.B0D06E8A60@unicorn.suse.cz>

On Tue, 2013-10-15 at 23:40 +0200, Michal Kubecek wrote:
> In ipcomp_compress(), softirq is enabled too early, allowing the
> per-cpu scratch buffer to be rewritten by ipcomp_decompress()
> (called on the same CPU in softirq context) between populating
> the buffer and copying the compressed data to the skb.
> 
> Add similar protection into ipcomp_decompress() as it can be
> called from process context as well (even if such a scenario seems
> a bit artificial).
> 
> v2: as pointed out by Steffen Klassert, if we also move the
> local_bh_disable() before reading the per-cpu pointers, we can
> get rid of get_cpu()/put_cpu().
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> ---
>  net/xfrm/xfrm_ipcomp.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
> index 2906d52..9fe4f42 100644
> --- a/net/xfrm/xfrm_ipcomp.c
> +++ b/net/xfrm/xfrm_ipcomp.c
> @@ -45,12 +45,17 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
>  	const int plen = skb->len;
>  	int dlen = IPCOMP_SCRATCH_SIZE;
>  	const u8 *start = skb->data;
> -	const int cpu = get_cpu();
> -	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
> -	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
> -	int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
> +	struct crypto_comp *tfm;
> +	u8 *scratch;
> +	int cpu;
> +	int err;
>  	int len;
>  
> +	local_bh_disable();
> +	cpu = smp_processor_id();
> +	scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
> +	tfm = *per_cpu_ptr(ipcd->tfms, cpu);

Have you tried this_cpu_ptr() instead?


* Re: "xfrm: Fix the gc threshold value for ipv4" broke my IPSec connections
From: Eric Dumazet @ 2013-10-15 22:51 UTC (permalink / raw)
  To: Damian Pietras, Steffen Klassert; +Cc: netdev
In-Reply-To: <525DBE65.1070707@daper.net>

On Wed, 2013-10-16 at 00:15 +0200, Damian Pietras wrote:
> On 15.10.2013 23:02, Eric Dumazet wrote:
> >> 703fb94ec58e0e8769380c2877a8a34aeb5b6c97
> >> xfrm: Fix the gc threshold value for ipv4
> >>
> >> Reverting it on 3.10.15 fixes my issue. This seems to be there from 3.7
> >> and I don't really believe such a simple case stayed broken for so long.
> >> Am I missing something, or is there really a bug?
> >>
> >> If someone is interested in details of this configuration and commands
> >> I'm running, just let me know. This was reproduced with a few VMs under Xen.
> >>
> > 
> > It looks like you need to tune /proc/sys/net/ipv4/xfrm4_gc_thresh to a
> > sensible value given your workload.
> > 
> > try :
> > 
> > echo 65536 >/proc/sys/net/ipv4/xfrm4_gc_thresh
> > 
> > Presumably the 1024 default is really too small...
> 
> Now it's working in my test setup, I'm changing it on the production
> boxes, thanks!
> 
> 

Steffen, what do you think?

1024 seems really small, given we had much higher values.

(256 K on a 1GB host)

This sysctl also needs an entry in
Documentation/networking/ip-sysctl.txt


