Netdev List
* [XFRM][RFC v1] Fix unexpected SA hard expiration after setting new date
From: fan.du @ 2012-06-18  8:24 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev, fdu


First, I'm not sure whether I Cc'ed the right people; if not,
I apologize for the noise.


*Background*:
Once IPsec SAs are created between two peers, the kernel sets up a timer to
monitor two events: soft and hard expiration. However, the timer handler uses
xtime (wall-clock time) to calculate whether a soft or hard expiration has
occurred.

Normal code flow (hard expire time: 100s, soft expire time: 82s):

a) When new SAs are created, xfrm_timer_handler is called one second
after their creation. At this point it calculates the soft expire
interval (81s) and sets up the timer.

b) Soft expire occurs: rearm the timer with the hard expire interval (18s),
then notify racoon2 about the soft expire event. racoon2 will create
new SAs.

c) Hard expire happens: notify racoon2 about it. racoon2 will delete
the old SAs.

*Scenario*:
Setting a new date after a) and before b) can cause c) to happen first.
As a result, the old SAs are deleted before the new ones are created.
Normally new SAs will be created by the next network traffic, but there
is a short period during which the network connection is down. This can
cause upper-layer connections to fail in telecom deployments, so it is
better to keep the events strictly in sequence.

*Workaround*:
Setting a new time can happen:
1) before a): the SAs are then updated with the new time.
2) after a) and before b):
2a) When new SAs are created, xfrm_timer_handler is called one second
after their creation. At this point it calculates the soft expire
interval (81s) and sets up the timer (and sets a flag to mark that the
next expiry should be a soft expire).

<<---- new date comes

2b) Soft expire occurs: the calculation results in a hard expire
event, but the flag is set, so we catch it. Sync the add time, rearm
the timer with the hard expire interval (18s), then notify racoon2
about the soft expire event.

2c) Hard expire happens: notify racoon2 about it,
so everything is in order.

3) after b): hard expire always happens anyway.
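
Case 2) can be modelled in userspace C to compare the unpatched and
patched handler. This is only a sketch under assumptions: struct sa_model,
handle_timer() and the "patched" switch are made-up names, "dying" stands
in for the kernel's x->km.dying, and only the add-time lifetimes are
modelled.

```c
enum sa_event { SA_EVENT_NONE, SA_EVENT_SOFT, SA_EVENT_HARD };

/* Hypothetical userspace model of the fields the timer handler looks at. */
struct sa_model {
    long add_time;    /* wall-clock time the SA was added, in seconds */
    long soft_secs;   /* soft lifetime */
    long hard_secs;   /* hard lifetime */
    int  expect_soft; /* models x->flags from the patch */
    long saved_tmo;   /* models x->saved_tmo from the patch */
    int  dying;       /* soft expire already reported */
};

/*
 * One timer run at wall-clock time "now". On SA_EVENT_NONE and
 * SA_EVENT_SOFT, *next holds the seconds until the timer fires again;
 * on SA_EVENT_HARD it is left untouched.
 */
static enum sa_event handle_timer(struct sa_model *sa, long now,
                                  int patched, long *next)
{
    long hard_tmo = sa->hard_secs + sa->add_time - now;
    long soft_tmo;

    if (hard_tmo <= 0) {
        if (!patched || !sa->expect_soft)
            return SA_EVENT_HARD;
        /* Clock jumped while a soft expire was pending: resync. */
        sa->add_time = now - sa->saved_tmo - 1;
        hard_tmo = sa->hard_secs - sa->saved_tmo;
    }
    *next = hard_tmo;

    if (sa->dying)
        return SA_EVENT_NONE;

    soft_tmo = sa->soft_secs + sa->add_time - now;
    if (soft_tmo <= 0) {
        sa->expect_soft = 0;
        sa->dying = 1;
        return SA_EVENT_SOFT;
    }
    if (soft_tmo < *next) {
        *next = soft_tmo;
        sa->expect_soft = 1;
        sa->saved_tmo = soft_tmo;
    }
    return SA_EVENT_NONE;
}
```

With add_time = 0, soft = 82 and hard = 100, a first call at now = 1 arms
an 81s timer. If the date is then set forward by 1000s, the next call sees
now = 1082: the unpatched path reports a hard expire straight away, while
the patched path reports a soft expire and rearms the timer for the hard one.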


So, could you please give your comments on this?

thanks


* [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: fan.du @ 2012-06-18  8:24 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev, fdu
In-Reply-To: <1340007856-27651-1-git-send-email-fan.du@windriver.com>

After an SA is set up, a timer is armed to detect soft/hard expiration;
however, the timer handler uses xtime to do the math. After the date is
changed by a large interval, this makes hard expiration occur before
soft expiration. As a result, the child SA is deleted before a new one
has been rekeyed.

Signed-off-by: fan.du <fan.du@windriver.com>
---
 include/net/xfrm.h    |    2 ++
 net/xfrm/xfrm_state.c |   22 ++++++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 2933d74..1734acc 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -214,6 +214,8 @@ struct xfrm_state
 	/* Private data of this transformer, format is opaque,
 	 * interpreted by xfrm_type methods. */
 	void			*data;
+	u32				flags;
+	long			saved_tmo;
 };
 
 /* xflags - make enum if more show up */
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index fd77cf0..da2cc78 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -442,8 +442,18 @@ static void xfrm_timer_handler(unsigned long data)
 	if (x->lft.hard_add_expires_seconds) {
 		long tmo = x->lft.hard_add_expires_seconds +
 			x->curlft.add_time - now;
-		if (tmo <= 0)
-			goto expired;
+		if (tmo <= 0) {
+			if (x->flags != 1)
+				goto expired;
+			else {
+				/* enter hard expire without soft expire first?!
+				 * setting a new date could trigger this.
+				 * workaround: fix x->curlft.add_time as below:
+				 */
+				x->curlft.add_time = now - x->saved_tmo - 1;
+				tmo = x->lft.hard_add_expires_seconds - x->saved_tmo;
+			}
+		}
 		if (tmo < next)
 			next = tmo;
 	}
@@ -460,10 +470,14 @@ static void xfrm_timer_handler(unsigned long data)
 	if (x->lft.soft_add_expires_seconds) {
 		long tmo = x->lft.soft_add_expires_seconds +
 			x->curlft.add_time - now;
-		if (tmo <= 0)
+		if (tmo <= 0) {
 			warn = 1;
-		else if (tmo < next)
+			x->flags = 0;
+		} else if (tmo < next) {
 			next = tmo;
+			x->flags = 1;
+			x->saved_tmo = tmo;
+		}
 	}
 	if (x->lft.soft_use_expires_seconds) {
 		long tmo = x->lft.soft_use_expires_seconds +
-- 
1.6.3.1


* [PATCH] net: added support for 40GbE link.
From: Parav Pandit @ 2012-06-18 12:44 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, Parav Pandit

1. Defined the 40GbE link speed and the four modes KR4, CR4, SR4 and LR4.
2. Removed the duplicated tov calculation for 1G and 10G and
made it common to 1G, 10G and 40G.
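
The merged switch case in item 2 relies on the old constants 1000/1000
and 10000/1000 both being "speed / 1000". A minimal sketch of just that
step follows; retire_tmo_div() is a made-up helper name, and how the
kernel uses the divisor afterwards is not shown here.

```c
#include <stdint.h>

/* Link speeds in Mb/s, mirroring the SPEED_* values in the patch. */
#define SPEED_1000   1000
#define SPEED_10000  10000
#define SPEED_40000  40000

/*
 * Hypothetical helper: returns the divisor for the speeds handled by
 * the common case, or 0 for any other speed.
 */
static unsigned int retire_tmo_div(uint32_t speed_mbps)
{
    switch (speed_mbps) {
    case SPEED_1000:
    case SPEED_10000:
    case SPEED_40000:
        /* Same result as the old per-speed constants: 1 and 10. */
        return speed_mbps / 1000;
    default:
        return 0;
    }
}
```

For 1G and 10G this gives the same divisors as before (1 and 10); 40G
gets 40 without a new branch.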

Port cost calculation changes for bridging at 40G will be done once there is
more clarity from the 802.1d spec in the coming days.

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
---
 include/linux/ethtool.h |   11 ++++++++++-
 net/packet/af_packet.c  |    8 +++-----
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 297370a..1ebfa6e 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -1153,6 +1153,10 @@ struct ethtool_ops {
 #define SUPPORTED_10000baseR_FEC	(1 << 20)
 #define SUPPORTED_20000baseMLD2_Full	(1 << 21)
 #define SUPPORTED_20000baseKR2_Full	(1 << 22)
+#define SUPPORTED_40000baseKR4_Full	(1 << 23)
+#define SUPPORTED_40000baseCR4_Full	(1 << 24)
+#define SUPPORTED_40000baseSR4_Full	(1 << 25)
+#define SUPPORTED_40000baseLR4_Full	(1 << 26)
 
 /* Indicates what features are advertised by the interface. */
 #define ADVERTISED_10baseT_Half		(1 << 0)
@@ -1178,6 +1182,10 @@ struct ethtool_ops {
 #define ADVERTISED_10000baseR_FEC	(1 << 20)
 #define ADVERTISED_20000baseMLD2_Full	(1 << 21)
 #define ADVERTISED_20000baseKR2_Full	(1 << 22)
+#define ADVERTISED_40000baseKR4_Full	(1 << 23)
+#define ADVERTISED_40000baseCR4_Full	(1 << 24)
+#define ADVERTISED_40000baseSR4_Full	(1 << 25)
+#define ADVERTISED_40000baseLR4_Full	(1 << 26)
 
 /* The following are all involved in forcing a particular link
  * mode for the device for setting things.  When getting the
@@ -1185,12 +1193,13 @@ struct ethtool_ops {
  * it was forced up into this mode or autonegotiated.
  */
 
-/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE. */
+/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE, 40GbE. */
 #define SPEED_10		10
 #define SPEED_100		100
 #define SPEED_1000		1000
 #define SPEED_2500		2500
 #define SPEED_10000		10000
+#define SPEED_40000		40000
 #define SPEED_UNKNOWN		-1
 
 /* Duplex, half or full. */
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8a10d5b..dd0e503 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -542,13 +542,11 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po,
 	rtnl_unlock();
 	if (!err) {
 		switch (ecmd.speed) {
-		case SPEED_10000:
-			msec = 1;
-			div = 10000/1000;
-			break;
 		case SPEED_1000:
+		case SPEED_10000:
+		case SPEED_40000:
 			msec = 1;
-			div = 1000/1000;
+			div = ecmd.speed / 1000;
 			break;
 		/*
 		 * If the link speed is so slow you don't really
-- 
1.6.0.2


* [PATCH] ethtool: added support for 40G link.
From: Parav Pandit @ 2012-06-18 12:45 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, Parav Pandit

1. Defined the 40G link speed.
2. Defined values for the KR4, CR4, SR4 and LR4 PHYs.
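
As an illustration of how the new bits are meant to be consumed, the
sketch below decodes the four 40G modes from an advertised-mode mask.
The bit positions and name strings are the ones in this patch; struct
mode_name and list_40g_modes() are made-up names, and the constants are
written with 1u to avoid signed shifts.

```c
#include <stddef.h>
#include <stdint.h>

#define ADVERTISED_40000baseKR4_Full (1u << 23)
#define ADVERTISED_40000baseCR4_Full (1u << 24)
#define ADVERTISED_40000baseSR4_Full (1u << 25)
#define ADVERTISED_40000baseLR4_Full (1u << 26)

struct mode_name {
    uint32_t bit;
    const char *name;
};

static const struct mode_name modes_40g[] = {
    { ADVERTISED_40000baseKR4_Full, "40000baseKR4/Full" },
    { ADVERTISED_40000baseCR4_Full, "40000baseCR4/Full" },
    { ADVERTISED_40000baseSR4_Full, "40000baseSR4/Full" },
    { ADVERTISED_40000baseLR4_Full, "40000baseLR4/Full" },
};

/*
 * Hypothetical helper: stores the names of the 40G modes set in "mask"
 * into out[] (at most "max" entries) and returns how many were stored.
 */
static size_t list_40g_modes(uint32_t mask, const char **out, size_t max)
{
    size_t i, n = 0;

    for (i = 0; i < sizeof(modes_40g) / sizeof(modes_40g[0]); i++) {
        if ((mask & modes_40g[i].bit) && n < max)
            out[n++] = modes_40g[i].name;
    }
    return n;
}
```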

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
---
 ethtool-copy.h |   11 ++++++++++-
 ethtool.c      |    4 ++++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/ethtool-copy.h b/ethtool-copy.h
index 3027ca3..3bbc0ed 100644
--- a/ethtool-copy.h
+++ b/ethtool-copy.h
@@ -913,6 +913,10 @@ enum ethtool_sfeatures_retval_bits {
 #define SUPPORTED_10000baseR_FEC	(1 << 20)
 #define SUPPORTED_20000baseMLD2_Full	(1 << 21)
 #define SUPPORTED_20000baseKR2_Full	(1 << 22)
+#define SUPPORTED_40000baseKR4_Full	(1 << 23)
+#define SUPPORTED_40000baseCR4_Full	(1 << 24)
+#define SUPPORTED_40000baseSR4_Full	(1 << 25)
+#define SUPPORTED_40000baseLR4_Full	(1 << 26)
 
 /* Indicates what features are advertised by the interface. */
 #define ADVERTISED_10baseT_Half		(1 << 0)
@@ -938,6 +942,10 @@ enum ethtool_sfeatures_retval_bits {
 #define ADVERTISED_10000baseR_FEC	(1 << 20)
 #define ADVERTISED_20000baseMLD2_Full	(1 << 21)
 #define ADVERTISED_20000baseKR2_Full	(1 << 22)
+#define ADVERTISED_40000baseKR4_Full	(1 << 23)
+#define ADVERTISED_40000baseCR4_Full	(1 << 24)
+#define ADVERTISED_40000baseSR4_Full	(1 << 25)
+#define ADVERTISED_40000baseLR4_Full	(1 << 26)
 
 /* The following are all involved in forcing a particular link
  * mode for the device for setting things.  When getting the
@@ -945,12 +953,13 @@ enum ethtool_sfeatures_retval_bits {
  * it was forced up into this mode or autonegotiated.
  */
 
-/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE. */
+/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE, 40GbE. */
 #define SPEED_10		10
 #define SPEED_100		100
 #define SPEED_1000		1000
 #define SPEED_2500		2500
 #define SPEED_10000		10000
+#define SPEED_40000		40000
 #define SPEED_UNKNOWN		-1
 
 /* Duplex, half or full. */
diff --git a/ethtool.c b/ethtool.c
index b0d3eea..6e9418e 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -460,6 +460,10 @@ dump_link_caps(const char *prefix, const char *an_prefix, u32 mask)
 		{ 0, ADVERTISED_10000baseT_Full,    "10000baseT/Full" },
 		{ 0, ADVERTISED_10000baseKX4_Full,  "10000baseKX4/Full" },
 		{ 0, ADVERTISED_20000baseMLD2_Full, "20000baseMLD2/Full" },
+		{ 0, ADVERTISED_40000baseKR4_Full,  "40000baseKR4/Full" },
+		{ 0, ADVERTISED_40000baseCR4_Full,  "40000baseCR4/Full" },
+		{ 0, ADVERTISED_40000baseSR4_Full,  "40000baseSR4/Full" },
+		{ 0, ADVERTISED_40000baseLR4_Full,  "40000baseLR4/Full" },
 	};
 	int indent;
 	int did1, new_line_pend, i;
-- 
1.6.0.2


* Re: [XFRM][RFC v1] Fix unexpected SA hard expiration after setting new date
From: Herbert Xu @ 2012-06-18  8:40 UTC (permalink / raw)
  To: fan.du; +Cc: davem, netdev, fdu
In-Reply-To: <1340007856-27651-1-git-send-email-fan.du@windriver.com>

On Mon, Jun 18, 2012 at 04:24:15PM +0800, fan.du wrote:
> 
> So, could you please give your comments on this?

Well, this used to work back when we were using relative time
instead of absolute time.  Then someone came along and changed
it for suspend/resume.

I didn't like this new behaviour but Dave convinced me that
it is a good thing :)

I guess I can live with your workaround if Dave is happy with
it.  But IMHO we should just go back to relative time and fix
the suspend/resume user-space scripts instead.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [XFRM][RFC v1] Fix unexpected SA hard expiration after setting new date
From: Fan Du @ 2012-06-18  8:53 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, netdev, fdu
In-Reply-To: <20120618084021.GA24902@gondor.apana.org.au>



On 2012-06-18 16:40, Herbert Xu wrote:
> On Mon, Jun 18, 2012 at 04:24:15PM +0800, fan.du wrote:
>>
>> So, could you please give your comments on this?
>
> Well, this used to work back when we were using relative time
> instead of absolute time.  Then someone came along and changed
> it for suspend/resume.
>
> I didn't like this new behaviour but Dave convinced me that
> it is a good thing :)
>

One of our customers complained that networking goes down when the date
is changed. Even though it lasts less than 10 seconds, they complain anyway :)
So it probably hurts quite a bit in practice for telecom companies.


> I guess I can live with your workaround if Dave is happy with
> it.  But IMHO we should just go back to relative time and fix
> the suspend/resume user-space scripts instead.

Ok, let's see what Dave will say about this.
And thanks for your comments.

>
> Cheers,

-- 

Love each day!
--fan


* [PATCH] phy/micrel: change phy_id_mask for KSZ9021 and KS8001
From: Hui Wang @ 2012-06-18  8:52 UTC (permalink / raw)
  To: david.choi, davem, nobuhiro.iwamatsu.yj; +Cc: netdev

On a Freescale i.MX6Q platform, a KSZ9021 PHY chip is recognized as
a KS8001 chip by the current driver, like this:
eth0: Freescale FEC PHY driver [Micrel KS8001 or KS8721]

KSZ9021 has phy_id 0x00221610, while KSZ8001 has phy_id 0x0022161a.
The current phy_id_masks (0x00fffff0/0x000fff10) can't distinguish
them, so change the phy_id_masks to resolve this problem.

Although the Micrel datasheet says that the 4 LSBs of the phyid2 register
contain the chip revision number, and the current driver is designed
to follow this rule, in reality the chip implementation doesn't follow
it.
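
The ambiguity can be checked with a few lines of C. This is a sketch of
the masked comparison only, with the two IDs and the masks taken from
this mail and the diff below; phy_id_matches() is a made-up name.

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_ID_KSZ9021  0x00221610u
#define PHY_ID_KS8001   0x0022161Au

/* A driver claims a chip when both IDs agree on every bit set in the mask. */
static bool phy_id_matches(uint32_t hw_id, uint32_t drv_id, uint32_t mask)
{
    return (hw_id & mask) == (drv_id & mask);
}
```

With the old KS8001 mask 0x00fffff0, a KSZ9021 (0x00221610) matches the
KS8001 entry because the low nibble is ignored. With the new masks
0x00ffffff and 0x000ffffe, the two IDs differ in bit 1 and bit 3 of the
low nibble, so neither driver claims the other chip.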

Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Hui Wang <jason77.wang@gmail.com>
---
 drivers/net/phy/micrel.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 590f902..9d6c80c 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -161,7 +161,7 @@ static struct phy_driver ks8051_driver = {
 static struct phy_driver ks8001_driver = {
 	.phy_id		= PHY_ID_KS8001,
 	.name		= "Micrel KS8001 or KS8721",
-	.phy_id_mask	= 0x00fffff0,
+	.phy_id_mask	= 0x00ffffff,
 	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
@@ -174,7 +174,7 @@ static struct phy_driver ks8001_driver = {
 
 static struct phy_driver ksz9021_driver = {
 	.phy_id		= PHY_ID_KSZ9021,
-	.phy_id_mask	= 0x000fff10,
+	.phy_id_mask	= 0x000ffffe,
 	.name		= "Micrel KSZ9021 Gigabit PHY",
 	.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause
 				| SUPPORTED_Asym_Pause),
@@ -240,8 +240,8 @@ MODULE_AUTHOR("David J. Choi");
 MODULE_LICENSE("GPL");
 
 static struct mdio_device_id __maybe_unused micrel_tbl[] = {
-	{ PHY_ID_KSZ9021, 0x000fff10 },
-	{ PHY_ID_KS8001, 0x00fffff0 },
+	{ PHY_ID_KSZ9021, 0x000ffffe },
+	{ PHY_ID_KS8001, 0x00ffffff },
 	{ PHY_ID_KS8737, 0x00fffff0 },
 	{ PHY_ID_KS8041, 0x00fffff0 },
 	{ PHY_ID_KS8051, 0x00fffff0 },
-- 
1.7.6


* Re: [PATCH net-next v2 01/12] netfilter: fix problem with proto register
From: Pablo Neira Ayuso @ 2012-06-18  9:06 UTC (permalink / raw)
  To: Gao feng; +Cc: netdev, netfilter-devel
In-Reply-To: <4FDE7D5F.5080703@cn.fujitsu.com>

On Mon, Jun 18, 2012 at 08:59:11AM +0800, Gao feng wrote:
> On 2012-06-16 18:50, Pablo Neira Ayuso wrote:
> > On Sat, Jun 16, 2012 at 11:41:12AM +0800, Gao feng wrote:
> >> commit 2c352f444ccfa966a1aa4fd8e9ee29381c467448
> >> (netfilter: nf_conntrack: prepare namespace support for
> >> l4 protocol trackers) register proto before register sysctl.
> >>
> >> it changes the behavior that when register sysctl failed, the
> >> proto should not be registered too.
> >>
> >> so change to register sysctl before register protos.
> > 
> > Could you explain why we need to change the order in the registration?
> > ie. now first proto->init_net then sysctl things.
> 
> before commit 2c352f444ccfa966a1aa4fd8e9ee29381c467448, we register sysctl before
> register protos, so if sysctl is registered faild, the protos will not be registered.
> 
> but now, we register protos first, and when register sysctl failed, we can use protos
> too, it's different from before.

That makes sense.

IMO, this is the thing that should be included in the description.


* unstable 10GBE performance with recent kernels (> 3.0.X)
From: Stefan Priebe - Profihost AG @ 2012-06-18  9:22 UTC (permalink / raw)
  To: Linux Netdev List

Hello list,

I've discovered very unstable 10GbE performance using recent kernels.
I'm using some optimized settings mentioned by Intel here
(section "Improving Performance"):
http://downloadmirror.intel.com/5874/eng/README.txt

I'm using Intel X520 cards (ixgbe driver version: 3.9.17-NAPI in all
tests).

I'm measuring the performance with iperf.

With 3.0.32 I get a constant 9.90 Gbit/s in both directions simultaneously.

With 3.4.2 or 3.5.0-rc2 I sometimes get 9.9 Gbit/s, sometimes 3 Gbit/s,
or sometimes even only 1 Gbit/s throughput.

I also tried changing tcp_congestion_control from cubic to reno,
bic and highspeed, but saw no change.

I also tried to bisect the issue, but there are so many changes in the
networking part of the kernel that I'm unable to identify the problem, as
with some commits I only get 0-300kb/s.

Any ideas?

Regards,
Stefan


* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Eric Dumazet @ 2012-06-18  9:45 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Linux Netdev List
In-Reply-To: <4FDEF34C.808@profihost.ag>

On Mon, 2012-06-18 at 11:22 +0200, Stefan Priebe - Profihost AG wrote:
> Hello list,
> 
> i've discovered very unstable 10GBE performance using recent kernels. 
> I'm using some optimized settings mentioned by intel here 
> (part:Improving Performance): 
> http://downloadmirror.intel.com/5874/eng/README.txt
> 

like what settings ?

> I'm using Intel X520 cards (ixgbe driver version: 3.9.17-NAPI in all 
> tests).
> 
> I'm measuring the performance with iperf.
> 
> With 3.0.32 i get constant 9,90 Gbit/s in both directions simultaneously.
> 
> With 3.4.2 or 3.5.0-rc2 it get sometimes 9,9gbit/s - sometimes 3gbit/s 
> or even sometimes only 1gbit/s throughput.
> 
> I also tried to change the tcp_congestion_control from cubic to reno, 
> bic and highspeed but no change.
> 
> I also tried to bisect the issue but there are so many changes in the 
> net kernel part that i'm unable to identify the problem as with some 
> commits i only get 0-300kb/s performance.
> 
> Any ideas?

ethtool -S eth0
ethtool -k eth0
netstat -s   (on sender, on receiver)

single flow , multiple flows ?


* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Stefan Priebe - Profihost AG @ 2012-06-18 10:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <1340012706.7491.770.camel@edumazet-glaptop>

On 18.06.2012 11:45, Eric Dumazet wrote:
> On Mon, 2012-06-18 at 11:22 +0200, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> i've discovered very unstable 10GBE performance using recent kernels.
>> I'm using some optimized settings mentioned by intel here
>> (part:Improving Performance):
>> http://downloadmirror.intel.com/5874/eng/README.txt
>>
> like what settings ?

Sorry these ones:
http://pastebin.com/raw.php?i=J2XRFAjD

>> Any ideas?
>
> ethtool -S eth0
 From host1 and 2: http://pastebin.com/raw.php?i=0Pp8Vs3Y

> ethtool -k eth0
 From host1 and 2: # ethtool -k eth2
Offload parameters for eth2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
ntuple-filters: off
receive-hashing: on

# ethtool -k eth2
Offload parameters for eth2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
ntuple-filters: off
receive-hashing: on

> netstat -s   (on sender, on receiver)

 From host1 and 2: http://pastebin.com/raw.php?i=P9T96j0L

> single flow , multiple flows ?

host1: iperf -s
host2: iperf -c ssdstor002 -d -t 600 -i 10;

In this test I got around 7.4 Gbit/s.

Stefan


* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Eric Dumazet @ 2012-06-18 10:25 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Linux Netdev List
In-Reply-To: <4FDEFD55.7010204@profihost.ag>

On Mon, 2012-06-18 at 12:05 +0200, Stefan Priebe - Profihost AG wrote:
> On 18.06.2012 11:45, Eric Dumazet wrote:
> > On Mon, 2012-06-18 at 11:22 +0200, Stefan Priebe - Profihost AG wrote:
> >> Hello list,
> >>
> >> i've discovered very unstable 10GBE performance using recent kernels.
> >> I'm using some optimized settings mentioned by intel here
> >> (part:Improving Performance):
> >> http://downloadmirror.intel.com/5874/eng/README.txt
> >>
> > like what settings ?
> 
> Sorry these ones:
> http://pastebin.com/raw.php?i=J2XRFAjD
> 

I would remove all this (pretty old and obsolete) stuff and use standard
params.

Only thing you could do is :

ethtool -K eth2 ntuple on

on both machines


* Re: [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: Steffen Klassert @ 2012-06-18 11:05 UTC (permalink / raw)
  To: fan.du; +Cc: davem, herbert, netdev, fdu
In-Reply-To: <1340007856-27651-2-git-send-email-fan.du@windriver.com>

On Mon, Jun 18, 2012 at 04:24:16PM +0800, fan.du wrote:
> After SA is setup, one timer is armed to detect soft/hard expiration,
> however the timer handler uses xtime to do the math. This makes hard
> expiration occurs first before soft expiration after setting new date
> with big interval. As a result new child SA is deleted before rekeying
> the new one.
> 
> Signed-off-by: fan.du <fan.du@windriver.com>
> ---
>  include/net/xfrm.h    |    2 ++
>  net/xfrm/xfrm_state.c |   22 ++++++++++++++++++----
>  2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 2933d74..1734acc 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -214,6 +214,8 @@ struct xfrm_state
>  	/* Private data of this transformer, format is opaque,
>  	 * interpreted by xfrm_type methods. */
>  	void			*data;
> +	u32				flags;

We already have the xflags field, it holds exactly one flag
at the moment. So I think we don't need yet another u32 that
holds one flag too.


* [PATCH net-next] em_canid: Ematch rule to match CAN frames according to their CAN IDs
From: Rostislav Lisovy @ 2012-06-18 12:22 UTC (permalink / raw)
  To: netdev; +Cc: linux-can, lartc, pisa, sojkam1, Rostislav Lisovy

This ematch makes it possible to classify CAN frames (AF_CAN) according
to their identifiers. This functionality cannot be easily achieved with
existing classifiers, such as u32, because the CAN ID is always stored in
native endianness, whereas u32 expects network byte order.

The filtering rules for EFF frames are stored in an array, which
is traversed during classification. A bitmap is used to store SFF
rules -- one bit for each ID.
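
The two storage schemes described above can be sketched in plain
userspace C. This is not the kernel code from the patch below: struct
matcher, matcher_add() and matcher_match() are made-up names, the bitmap
is a simple byte array, and the flag and mask constants are written out
locally with the values used by linux/can.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define SFF_ID_BITS  11
#define SFF_IDS      (1u << SFF_ID_BITS)
#define SFF_MASK     0x000007FFu
#define EFF_MASK     0x1FFFFFFFu
#define EFF_FLAG     0x80000000u
#define MAX_RULES    32

struct rule {
    uint32_t can_id;
    uint32_t can_mask;
};

struct matcher {
    struct rule eff_rules[MAX_RULES];   /* EFF rules: walked on each match */
    int eff_count;
    uint8_t sff_bitmap[SFF_IDS / 8];    /* SFF rules: one bit per 11-bit ID */
};

static void matcher_add(struct matcher *m, struct rule r)
{
    uint32_t mask, id, i;

    if ((r.can_id & EFF_FLAG) && (r.can_mask & EFF_FLAG)) {
        if (m->eff_count < MAX_RULES)
            m->eff_rules[m->eff_count++] = r;
        return;
    }

    /* Expand the SFF rule once, at configuration time. */
    mask = r.can_mask & SFF_MASK;
    id = r.can_id & mask;
    for (i = 0; i < SFF_IDS; i++) {
        if ((i & mask) == id)
            m->sff_bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
    }
}

static bool matcher_match(const struct matcher *m, uint32_t can_id)
{
    uint32_t id;
    int i;

    if (can_id & EFF_FLAG) {
        id = can_id & EFF_MASK;
        for (i = 0; i < m->eff_count; i++) {
            const struct rule *r = &m->eff_rules[i];

            if (!((r->can_id ^ id) & r->can_mask & EFF_MASK))
                return true;
        }
        return false;
    }

    /* SFF: a single bit test, independent of the number of rules. */
    id = can_id & SFF_MASK;
    return (m->sff_bitmap[id / 8] >> (id % 8)) & 1u;
}
```

The trade-off is memory for time: the SFF bitmap costs 256 bytes per
matcher in this sketch, but a lookup is a single bit test, whereas EFF
IDs (29 bits) are too many to expand and are compared rule by rule.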

It is possible to pass up to 32 'rules' to this ematch during
configuration.

Signed-off-by: Rostislav Lisovy <lisovy@gmail.com>
---

This patch contains a reworked classifier initially posted in
http://www.spinics.net/lists/netdev/msg200114.html
The functionality is the same; however, the source code is almost 50%
shorter.

A simple benchmark was performed on an MPC5200, an embedded PowerPC CPU
(e300 core, G2 LE) at 396 MHz with 128 MiB of RAM, running Linux kernel 3.4.2.
The benchmark simply generated CAN frames with different identifiers and
measured the time spent in the can_send() function.

CAN device was configured as follows:
ip link set can0 type can bitrate 1000000
ip link set can0 txqueuelen 1000

CAN frames were generated with command:
cangen can0 -I ${ID} -L 8 -D i -g 1 -n 100

With no extra filter/qdisc configured, median of the time spent in can_send()
was about 27 us -- with prio qdisc with 5 bands and 5 appropriate cls_can
filters (previous patch), it was about 30 us -- with prio qdisc with 5 bands
and 5 appropriate em_can filters (this patch), it was about 34 us.

---
 include/linux/can.h     |    3 +
 include/linux/pkt_cls.h |    5 +-
 net/sched/Kconfig       |   10 ++
 net/sched/Makefile      |    1 +
 net/sched/em_canid.c    |  252 +++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 269 insertions(+), 2 deletions(-)
 create mode 100644 net/sched/em_canid.c

diff --git a/include/linux/can.h b/include/linux/can.h
index 9a19bcb..08d1610 100644
--- a/include/linux/can.h
+++ b/include/linux/can.h
@@ -38,6 +38,9 @@
  */
 typedef __u32 canid_t;
 
+#define CAN_SFF_ID_BITS		11
+#define CAN_EFF_ID_BITS		29
+
 /*
  * Controller Area Network Error Frame Mask structure
  *
diff --git a/include/linux/pkt_cls.h b/include/linux/pkt_cls.h
index defbde2..7fbe6c2 100644
--- a/include/linux/pkt_cls.h
+++ b/include/linux/pkt_cls.h
@@ -451,8 +451,9 @@ enum {
 #define	TCF_EM_U32		3
 #define	TCF_EM_META		4
 #define	TCF_EM_TEXT		5
-#define        TCF_EM_VLAN		6
-#define	TCF_EM_MAX		6
+#define	TCF_EM_VLAN		6
+#define        TCF_EM_CANID		7
+#define	TCF_EM_MAX		7
 
 enum {
 	TCF_EM_PROG_TC
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 75b58f8..bc0ceab 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -485,6 +485,16 @@ config NET_EMATCH_TEXT
 	  To compile this code as a module, choose M here: the
 	  module will be called em_text.
 
+config NET_EMATCH_CANID
+	tristate "CAN ID"
+	depends on NET_EMATCH && CAN
+	---help---
+          Say Y here if you want to be able to classify CAN frames based
+          on CAN ID.
+
+	  To compile this code as a module, choose M here: the
+	  module will be called em_canid.
+
 config NET_CLS_ACT
 	bool "Actions"
 	---help---
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 8cdf4e2..47f9262 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -53,3 +53,4 @@ obj-$(CONFIG_NET_EMATCH_NBYTE)	+= em_nbyte.o
 obj-$(CONFIG_NET_EMATCH_U32)	+= em_u32.o
 obj-$(CONFIG_NET_EMATCH_META)	+= em_meta.o
 obj-$(CONFIG_NET_EMATCH_TEXT)	+= em_text.o
+obj-$(CONFIG_NET_EMATCH_CANID)	+= em_canid.o
diff --git a/net/sched/em_canid.c b/net/sched/em_canid.c
new file mode 100644
index 0000000..5cc6e5e
--- /dev/null
+++ b/net/sched/em_canid.c
@@ -0,0 +1,252 @@
+/*
+ * em_canid.c  Ematch rule to match CAN frames according to their CAN IDs
+ *
+ *              This program is free software; you can distribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Idea:       Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
+ * Copyright:  (c) 2011 Czech Technical University in Prague
+ *             (c) 2011 Volkswagen Group Research
+ * Authors:    Michal Sojka <sojkam1@fel.cvut.cz>
+ *             Pavel Pisa <pisa@cmp.felk.cvut.cz>
+ *             Rostislav Lisovy <lisovy@gmail.cz>
+ * Funded by:  Volkswagen Group Research
+ */
+
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/skbuff.h>
+#include <net/pkt_cls.h>
+#include <linux/can.h>
+
+#define EM_CAN_RULES_SIZE				32
+
+struct canid_match {
+	/*
+	 * Raw rules copied from netlink message; Used for sending
+	 * information to userspace (when 'tc filter show' is invoked)
+	 * AND when matching EFF frames
+	 */
+	struct can_filter rules_raw[EM_CAN_RULES_SIZE];
+
+	/* For each SFF CAN ID (11 bit) there is one record in this bitfield */
+	DECLARE_BITMAP(match_sff, (1 << CAN_SFF_ID_BITS));
+
+	int rules_count;
+	int eff_rules_count;
+	int sff_rules_count;
+};
+
+/**
+ * em_canid_get_id() - Extracts Can ID out of the sk_buff structure.
+ */
+static canid_t em_canid_get_id(struct sk_buff *skb)
+{
+	/* CAN ID is stored within the data field */
+	struct can_frame *cf = (struct can_frame *)skb->data;
+
+	return cf->can_id;
+}
+
+static void em_canid_sff_match_add(struct canid_match *cm, u32 can_id,
+					u32 can_mask)
+{
+	int i;
+
+	/*
+	 * Limit can_mask and can_id to SFF range to
+	 * protect against write after end of array
+	 */
+	can_mask &= CAN_SFF_MASK;
+	can_id &= can_mask;
+
+	/* Single frame */
+	if (can_mask == CAN_SFF_MASK) {
+		set_bit(can_id, cm->match_sff);
+		return;
+	}
+
+	/* All frames */
+	if (can_mask == 0) {
+		bitmap_fill(cm->match_sff, (1 << CAN_SFF_ID_BITS));
+		return;
+	}
+
+	/*
+	 * Individual frame filter.
+	 * Add record (set bit to 1) for each ID that
+	 * conforms particular rule
+	 */
+	for (i = 0; i < (1 << CAN_SFF_ID_BITS); i++) {
+		if ((i & can_mask) == can_id)
+			set_bit(i, cm->match_sff);
+	}
+}
+
+static inline struct canid_match *em_canid_priv(struct tcf_ematch *m)
+{
+	return (struct canid_match *) (m)->data;
+}
+
+static int em_canid_match(struct sk_buff *skb, struct tcf_ematch *m,
+			 struct tcf_pkt_info *info)
+{
+	struct canid_match *cm = em_canid_priv(m);
+	canid_t can_id;
+	unsigned int match = false;
+	int i;
+
+	can_id = em_canid_get_id(skb);
+
+	if (can_id & CAN_EFF_FLAG) {
+		can_id &= CAN_EFF_MASK;
+
+		for (i = 0; i < cm->eff_rules_count; i++) {
+			if (!(((cm->rules_raw[i].can_id ^ can_id) &
+			    cm->rules_raw[i].can_mask) & CAN_EFF_MASK)) {
+				match = true;
+				break;
+			}
+		}
+	} else { /* SFF */
+		can_id &= CAN_SFF_MASK;
+		match = test_bit(can_id, cm->match_sff);
+	}
+
+	if (match)
+		return 1;
+
+	return 0;
+}
+
+static int em_canid_change(struct tcf_proto *tp, void *data, int len,
+			  struct tcf_ematch *m)
+{
+	struct can_filter *conf = data; /* Array with rules,
+					 * fixed size EM_CAN_RULES_SIZE
+					 */
+	struct canid_match *cm;
+	int err;
+	int i;
+
+	if (len < sizeof(struct can_filter))
+		return -EINVAL;
+
+	err = -ENOBUFS;
+	cm = kzalloc(sizeof(*cm), GFP_KERNEL);
+	if (cm == NULL)
+		goto errout;
+
+	cm->sff_rules_count = 0;
+	cm->eff_rules_count = 0;
+	cm->rules_count = len/sizeof(struct can_filter);
+	err = -EINVAL;
+
+	/* Be sure to fit into the array */
+	if (cm->rules_count > EM_CAN_RULES_SIZE)
+		goto errout_free;
+
+	/*
+	 * We need two for() loops for copying rules into
+	 * two contiguous areas in rules_raw
+	 */
+
+	/* Process EFF frame rules*/
+	for (i = 0; i < cm->rules_count; i++) {
+		if ((conf[i].can_id & CAN_EFF_FLAG) &&
+		    (conf[i].can_mask & CAN_EFF_FLAG)) {
+			memcpy(cm->rules_raw + cm->eff_rules_count,
+				&conf[i],
+				sizeof(struct can_filter));
+
+			cm->eff_rules_count++;
+		} else {
+			continue;
+		}
+	}
+
+	/* Process SFF frame rules */
+	for (i = 0; i < cm->rules_count; i++) {
+		if ((conf[i].can_id & CAN_EFF_FLAG) &&
+		    (conf[i].can_mask & CAN_EFF_FLAG)) {
+			continue;
+		} else {
+			memcpy(cm->rules_raw
+				+ cm->eff_rules_count
+				+ cm->sff_rules_count,
+				&conf[i], sizeof(struct can_filter));
+
+			cm->sff_rules_count++;
+
+			em_canid_sff_match_add(cm,
+				conf[i].can_id, conf[i].can_mask);
+		}
+	}
+
+	m->datalen = sizeof(*cm);
+	m->data = (unsigned long) cm;
+
+	return 0;
+
+errout_free:
+	kfree(cm);
+errout:
+	return err;
+}
+
+static void em_canid_destroy(struct tcf_proto *tp, struct tcf_ematch *m)
+{
+	struct canid_match *cm = em_canid_priv(m);
+
+	kfree(cm);
+}
+
+static int em_canid_dump(struct sk_buff *skb, struct tcf_ematch *m)
+{
+	struct canid_match *cm = em_canid_priv(m);
+
+	/*
+	 * When configuring this ematch 'rules_count' is set not to exceed
+	 * 'rules_raw' array size
+	 */
+	if (nla_put_nohdr(skb, sizeof(cm->rules_raw[0]) * cm->rules_count,
+	    &cm->rules_raw) < 0)
+		goto nla_put_failure;
+
+	return 0;
+
+nla_put_failure:
+	return -1;
+}
+
+static struct tcf_ematch_ops em_canid_ops = {
+	.kind	  = TCF_EM_CANID,
+	.change	  = em_canid_change,
+	.match	  = em_canid_match,
+	.destroy  = em_canid_destroy,
+	.dump	  = em_canid_dump,
+	.owner	  = THIS_MODULE,
+	.link	  = LIST_HEAD_INIT(em_canid_ops.link)
+};
+
+static int __init init_em_canid(void)
+{
+	return tcf_em_register(&em_canid_ops);
+}
+
+static void __exit exit_em_canid(void)
+{
+	tcf_em_unregister(&em_canid_ops);
+}
+
+MODULE_LICENSE("GPL");
+
+module_init(init_em_canid);
+module_exit(exit_em_canid);
+
+MODULE_ALIAS_TCF_EMATCH(TCF_EM_CANID);
-- 
1.7.10


* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Stefan Priebe - Profihost AG @ 2012-06-18 12:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <1340015143.7491.855.camel@edumazet-glaptop>

On 18.06.2012 12:25, Eric Dumazet wrote:
> I would remove all this (pretty old and obsolete) stuff and use standard
> params.
OK, thanks. done.

I've one RHEL6 system (using default 2.6.32 kernel) where only using 
"ntuple on" results in just 4-5Gbit/s while the others using 3.5.0-rc2 
are working fine.

> Only thing you could do is :
> ethtool -K eth2 ntuple on
>
> on both machines
Thanks that works great and boosts the performance a lot. What does it 
do? man ethtool doesn't show anything useful.

Is this also recommended for 1Gb/s?

Could you recommend a tcp_congestion_control module? RHEL6 uses cubic by 
default. Some others use reno or bic.

Stefan

^ permalink raw reply

* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Eric Dumazet @ 2012-06-18 12:51 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Linux Netdev List
In-Reply-To: <4FDF20B2.2010603@profihost.ag>

On Mon, 2012-06-18 at 14:36 +0200, Stefan Priebe - Profihost AG wrote:
> On 18.06.2012 12:25, Eric Dumazet wrote:
> > I would remove all this (pretty old and obsolete) stuff and use standard
> > params.
> OK, thanks. done.
> 
> I've one RHEL6 system (using default 2.6.32 kernel) where only using 
> "ntuple on" results in just 4-5Gbit/s while the others using 3.5.0-rc2 
> are working fine.
> 

OK, then you might play with affinities, that's the only thing that
might need sysadmin tuning (irqbalance needs to be disabled)

> > Only thing you could do is :
> > ethtool -K eth2 ntuple on
> >
> > on both machines
> Thanks that works great and boosts the performance a lot. What does it 
> do? man ethtool doesn't show anything useful.
> 

I am not an Intel guy, but you might find some info in:

http://downloadmirror.intel.com/14687/eng/README.txt



> Is this also recommended for 1Gb/s?
> 

Not sure it's supported on ixgb


> Could you recommend a tcp_congestion_control module? RHEL6 uses cubic by 
> default. Some others use reno or bic.

Hard to tell, there is no generic answer. Default should be fine for
most uses.

^ permalink raw reply

* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Stefan Priebe - Profihost AG @ 2012-06-18 13:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <1340023876.7491.1153.camel@edumazet-glaptop>

On 18.06.2012 14:51, Eric Dumazet wrote:
 > On Mon, 2012-06-18 at 14:36 +0200, Stefan Priebe - Profihost AG wrote:
 >> On 18.06.2012 12:25, Eric Dumazet wrote:
 >>> I would remove all this (pretty old and obsolete) stuff and use standard
 >>> params.
 >> OK, thanks. done.
 >>
 >> I've one RHEL6 system (using default 2.6.32 kernel) where only using
 >> "ntuple on" results in just 4-5Gbit/s while the others using 3.5.0-rc2
 >> are working fine.
 >>
 >
 > OK, then you might play with affinities, that's the only thing that
 > might need sysadmin tuning (irqbalance needs to be disabled)

I've now used this one:

# script provided by intel
/root/set_irq_affinity.sh eth2
ifconfig eth2 mtu 9000
ethtool -K eth2 ntuple on

irqbalance is off

But with the latest long term stable vanilla kernel 3.0.34.

# iperf -c HOST -t 60

works fine, but with -d (both directions) I see around 9.8Gbit/s
in one direction and only 300kb/s in the other direction.

Stefan

^ permalink raw reply

* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Eric Dumazet @ 2012-06-18 13:29 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: Linux Netdev List
In-Reply-To: <4FDF2829.6050705@profihost.ag>

On Mon, 2012-06-18 at 15:07 +0200, Stefan Priebe - Profihost AG wrote:
> On 18.06.2012 14:51, Eric Dumazet wrote:
>  > On Mon, 2012-06-18 at 14:36 +0200, Stefan Priebe - Profihost AG wrote:
>  >> On 18.06.2012 12:25, Eric Dumazet wrote:
>  >>> I would remove all this (pretty old and obsolete) stuff and use standard
>  >>> params.
>  >> OK, thanks. done.
>  >>
>  >> I've one RHEL6 system (using default 2.6.32 kernel) where only using
>  >> "ntuple on" results in just 4-5Gbit/s while the others using 3.5.0-rc2
>  >> are working fine.
>  >>
>  >
>  > OK, then you might play with affinities, that's the only thing that
>  > might need sysadmin tuning (irqbalance needs to be disabled)
> 
> I've now used this one:
> 
> # script provided by intel
> /root/set_irq_affinity.sh eth2
> ifconfig eth2 mtu 9000
> ethtool -K eth2 ntuple on
> 
> irqbalance is off
> 
> But with the latest long term stable vanilla kernel 3.0.34.
> 
> # iperf -c HOST -t 60
> 
> works fine, but with -d (both directions) I see around 9.8Gbit/s
> in one direction and only 300kb/s in the other direction.
> 
> Stefan


I would try using regular MTU.

And post "netstat -s" results before/after your iperf.

^ permalink raw reply

* Re: unstable 10GBE performance with recent kernels (> 3.0.X)
From: Stefan Priebe - Profihost AG @ 2012-06-18 14:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <1340026144.7491.1231.camel@edumazet-glaptop>

On 18.06.2012 15:29, Eric Dumazet wrote:
>> But with the latest long term stable vanilla kernel 3.0.34.
>>
>> # iperf -c HOST -t 60
>>
>> works fine, but with -d (both directions) I see around 9.8Gbit/s
>> in one direction and only 300kb/s in the other direction.
>
> I would try using regular MTU.
For 3.5.0-rc2 this works fine but decreases the speed from 9.9Gbit/s to
9.3Gbit/s. With MTU 9000 I get around 3-4Gbit/s (netstat -s attached below).

With 3.0.34, MTU 9000 works fine with a constant 9.9Gbit/s in both directions.

> And post "netstat -s" results before/after your iperf.

Tests with mtu 9000 and 3.5rc2:
Pre: http://pastebin.com/raw.php?i=uBjsFKjs
Post: http://pastebin.com/raw.php?i=TAK8JS6j

Thanks again!

Stefan

^ permalink raw reply

* Re: [PATCH] usbnet: Activate halt interrupt endpoint before re-submit URB
From: Huajun Li @ 2012-06-18 16:25 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: David Miller, tom.leiming-Re5JQEeQqe8AvxtiuMwx3w,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <201206180923.36091.oneukum-l3A5Bk7waGM@public.gmane.org>

On Mon, Jun 18, 2012 at 3:23 PM, Oliver Neukum <oneukum-l3A5Bk7waGM@public.gmane.org> wrote:
> On Monday, 18 June 2012, 01:30:17, David Miller wrote:
>> From: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date: Wed, 13 Jun 2012 20:50:31 +0800
>>
>> > intr_complete() submits the URB even if the interrupt endpoint stalls.
>> > This patch will try to activate the endpoint once the exception
>> > occurs, and then re-submit the URB if the endpoint works again.
>> >
>> > Signed-off-by: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>
>> Review from USB experts would be appreciated.
>
> The code implements a minimum error handler correctly.
> Did you observe a stall in actual hardware or is this a just
> in case patch?
>

This one is a just-in-case patch; thanks for your comments.

>        Regards
>                Oliver
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] net: added support for 40GbE link.
From: Rick Jones @ 2012-06-18 16:27 UTC (permalink / raw)
  To: Parav Pandit; +Cc: netdev, bhutchings
In-Reply-To: <0c7c97b0-bfe1-4143-a562-2019f86912fc@exht1.ad.emulex.com>

On 06/18/2012 05:44 AM, Parav Pandit wrote:

> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index 297370a..1ebfa6e 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -1153,6 +1153,10 @@ struct ethtool_ops {
>   #define SUPPORTED_10000baseR_FEC	(1<<  20)
>   #define SUPPORTED_20000baseMLD2_Full	(1<<  21)
>   #define SUPPORTED_20000baseKR2_Full	(1<<  22)
> +#define SUPPORTED_40000baseKR4_Full	(1<<  23)
> +#define SUPPORTED_40000baseCR4_Full	(1<<  24)
> +#define SUPPORTED_40000baseSR4_Full	(1<<  25)
> +#define SUPPORTED_40000baseLR4_Full	(1<<  26)
>
>   /* Indicates what features are advertised by the interface. */
>   #define ADVERTISED_10baseT_Half		(1<<  0)
> @@ -1178,6 +1182,10 @@ struct ethtool_ops {
>   #define ADVERTISED_10000baseR_FEC	(1<<  20)
>   #define ADVERTISED_20000baseMLD2_Full	(1<<  21)
>   #define ADVERTISED_20000baseKR2_Full	(1<<  22)
> +#define ADVERTISED_40000baseKR4_Full	(1<<  23)
> +#define ADVERTISED_40000baseCR4_Full	(1<<  24)
> +#define ADVERTISED_40000baseSR4_Full	(1<<  25)
> +#define ADVERTISED_40000baseLR4_Full	(1<<  26)

Any idea how many defines will be wanted for 100 Gbit Ethernet? 
Supported and advertising in ethtool_cmd are __u32s...

rick jones

^ permalink raw reply

* Re: [PATCH] usbnet: Activate halt interrupt endpoint before re-submit URB
From: Oliver Neukum @ 2012-06-18 16:51 UTC (permalink / raw)
  To: Huajun Li
  Cc: David Miller, tom.leiming-Re5JQEeQqe8AvxtiuMwx3w,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CA+v9cxZG6b1O7kOQpeTELtv0vHzqy-8NH84boMpim0BM+tp1eQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Monday, 18 June 2012, 18:25:15, Huajun Li wrote:
> On Mon, Jun 18, 2012 at 3:23 PM, Oliver Neukum <oneukum-l3A5Bk7waGM@public.gmane.org> wrote:
> > On Monday, 18 June 2012, 01:30:17, David Miller wrote:
> >> From: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> Date: Wed, 13 Jun 2012 20:50:31 +0800
> >>
> >> > intr_complete() submits the URB even if the interrupt endpoint stalls.
> >> > This patch will try to activate the endpoint once the exception
> >> > occurs, and then re-submit the URB if the endpoint works again.
> >> >
> >> > Signed-off-by: Huajun Li <huajun.li.lee-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >>
> >> Review from USB experts would be appreciated.
> >
> > The code implements a minimum error handler correctly.
> > Did you observe a stall in actual hardware or is this a just
> > in case patch?
> >
> 
> This one is just a patch, thanks for your comments.

Then I am inclined to say that this is not fully thought through.
If the endpoint stalls, the device has detected an error.
We have no idea how to generically handle this error.
We might end up in a vicious circle. Could you add a sanity limit on the
number of halts we clear?
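
One way such a sanity limit could look, as a minimal sketch only: the cap value, the struct and the function names below (MAX_HALT_CLEARS, halt_guard, halt_guard_try_clear, halt_guard_reset) are all hypothetical and not taken from usbnet or from the patch under review. The idea is to count clear-halt attempts and refuse further ones once the cap is hit, resetting the count after a good completion.

```c
#include <stdbool.h>

/* Hypothetical cap; the thread does not propose a specific number. */
#define MAX_HALT_CLEARS 3

struct halt_guard {
	unsigned int clears;	/* halts cleared since the last good completion */
};

/* Returns true if another clear-halt attempt is allowed, and counts it. */
static bool halt_guard_try_clear(struct halt_guard *g)
{
	if (g->clears >= MAX_HALT_CLEARS)
		return false;	/* give up instead of looping forever */
	g->clears++;
	return true;
}

/* Call after a successful completion so rare stalls do not accumulate. */
static void halt_guard_reset(struct halt_guard *g)
{
	g->clears = 0;
}
```

In a driver, the completion handler would consult halt_guard_try_clear() before scheduling the clear-halt work and stop resubmitting the URB once it returns false.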

	Regards
		Oliver

^ permalink raw reply

* Re: [PATCH] net: added support for 40GbE link.
From: Ben Hutchings @ 2012-06-18 16:56 UTC (permalink / raw)
  To: Rick Jones; +Cc: Parav Pandit, netdev
In-Reply-To: <4FDF56FB.9080509@hp.com>

On Mon, 2012-06-18 at 09:27 -0700, Rick Jones wrote:
> On 06/18/2012 05:44 AM, Parav Pandit wrote:
> 
> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> > index 297370a..1ebfa6e 100644
> > --- a/include/linux/ethtool.h
> > +++ b/include/linux/ethtool.h
> > @@ -1153,6 +1153,10 @@ struct ethtool_ops {
> >   #define SUPPORTED_10000baseR_FEC	(1<<  20)
> >   #define SUPPORTED_20000baseMLD2_Full	(1<<  21)
> >   #define SUPPORTED_20000baseKR2_Full	(1<<  22)
> > +#define SUPPORTED_40000baseKR4_Full	(1<<  23)
> > +#define SUPPORTED_40000baseCR4_Full	(1<<  24)
> > +#define SUPPORTED_40000baseSR4_Full	(1<<  25)
> > +#define SUPPORTED_40000baseLR4_Full	(1<<  26)
> >
> >   /* Indicates what features are advertised by the interface. */
> >   #define ADVERTISED_10baseT_Half		(1<<  0)
> > @@ -1178,6 +1182,10 @@ struct ethtool_ops {
> >   #define ADVERTISED_10000baseR_FEC	(1<<  20)
> >   #define ADVERTISED_20000baseMLD2_Full	(1<<  21)
> >   #define ADVERTISED_20000baseKR2_Full	(1<<  22)
> > +#define ADVERTISED_40000baseKR4_Full	(1<<  23)
> > +#define ADVERTISED_40000baseCR4_Full	(1<<  24)
> > +#define ADVERTISED_40000baseSR4_Full	(1<<  25)
> > +#define ADVERTISED_40000baseLR4_Full	(1<<  26)
> 
> Any idea how many defines will be wanted for 100 Gbit Ethernet? 
> Supported and advertising in ethtool_cmd are __u32s...

There are 9 bytes reserved in struct ethtool_cmd, so we can potentially
extend each of supported, advertising and lp_advertising by 24 bits.
But it might be better to define a new, cleaner struct ethtool_cmd.
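
The bit budget discussed here can be checked with a little arithmetic. The sketch below assumes that bits 0 through 26 of the 32-bit masks are all in use after the quoted patch and takes the 9 reserved bytes from the paragraph above; the macro and function names are made up for illustration.

```c
/* Width of the supported/advertising masks (__u32) in bits. */
#define LINK_MODE_WORD_BITS	32u
/* Highest bit index used by the quoted patch (..._40000baseLR4_Full). */
#define HIGHEST_USED_BIT	26u
/* Reserved bytes in struct ethtool_cmd, as stated in the reply above. */
#define RESERVED_BYTES		9u
/* supported, advertising and lp_advertising. */
#define LINK_MODE_FIELDS	3u

/* Bits still free in each 32-bit mask once the 40G modes are added. */
static unsigned int free_bits_in_mask(void)
{
	return LINK_MODE_WORD_BITS - (HIGHEST_USED_BIT + 1u);
}

/* Extra bits per mask if the reserved bytes were split evenly. */
static unsigned int extra_bits_per_mask(void)
{
	return (RESERVED_BYTES * 8u) / LINK_MODE_FIELDS;
}
```

Under these assumptions only 5 bits remain in each mask today, and the reserved space would add 24 more per mask.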

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] net: added support for 40GbE link.
From: Ben Hutchings @ 2012-06-18 17:09 UTC (permalink / raw)
  To: Parav Pandit; +Cc: netdev
In-Reply-To: <0c7c97b0-bfe1-4143-a562-2019f86912fc@exht1.ad.emulex.com>

On Mon, 2012-06-18 at 18:14 +0530, Parav Pandit wrote:
> 1. link speed of 40GbE and #4 KR4, CR4, SR4, LR4 modes defined.
> 2. removed code replication for tov calculation for 1G, 10G and
> made it common for 1G, 10G, 40G.
[...]
> @@ -1185,12 +1193,13 @@ struct ethtool_ops {
>   * it was forced up into this mode or autonegotiated.
>   */
>  
> -/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE. */
> +/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE, 40GbE. */
>  #define SPEED_10		10
>  #define SPEED_100		100
>  #define SPEED_1000		1000
>  #define SPEED_2500		2500
>  #define SPEED_10000		10000
> +#define SPEED_40000		40000
>  #define SPEED_UNKNOWN		-1

I don't think there's any need to name all possible link speeds, and it
just encourages the bad practice of ethtool API users checking for
specific values.  You may notice there is no SPEED_20000.

>  /* Duplex, half or full. */
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 8a10d5b..dd0e503 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -542,13 +542,11 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po,
>  	rtnl_unlock();
>  	if (!err) {
>  		switch (ecmd.speed) {
> -		case SPEED_10000:
> -			msec = 1;
> -			div = 10000/1000;
> -			break;
>  		case SPEED_1000:
> +		case SPEED_10000:
> +		case SPEED_40000:
>  			msec = 1;
> -			div = 1000/1000;
> +			div = ecmd.speed / 1000;
>  			break;

This function should be fixed properly.  Firstly, it must use
ethtool_cmd_speed() rather than directly accessing ecmd.speed.
Secondly, it should allow any speed value rather than checking for
specific values.  Then there will be no need to make further changes for
100G or any other new speed.
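
A rough userspace sketch of what such a speed-agnostic calculation might look like follows. The struct and function names (ecmd_sketch, ecmd_sketch_speed, retire_params) are stand-ins invented for this example; ecmd_sketch_speed() only mimics how ethtool_cmd_speed() combines speed_hi and speed, and the return-0 fallback for slow or unknown links is an assumption about how the caller would pick its default timeout.

```c
#include <stdint.h>

#define SPEED_1000	1000u
#define SPEED_UNKNOWN	((uint32_t)-1)

/* Stand-in for the two speed fields of struct ethtool_cmd. */
struct ecmd_sketch {
	uint16_t speed;		/* low 16 bits of the speed in Mbit/s */
	uint16_t speed_hi;	/* high 16 bits of the speed in Mbit/s */
};

/* Mimics ethtool_cmd_speed(): combine both halves into one value. */
static uint32_t ecmd_sketch_speed(const struct ecmd_sketch *e)
{
	return ((uint32_t)e->speed_hi << 16) | e->speed;
}

/*
 * Fill *msec and *div for any link of 1 Gbit/s or faster and return 1.
 * Return 0 for slower or unknown speeds so the caller can fall back
 * to its default timeout.
 */
static int retire_params(const struct ecmd_sketch *e,
			 unsigned int *msec, unsigned int *div)
{
	uint32_t speed = ecmd_sketch_speed(e);

	if (speed == SPEED_UNKNOWN || speed < SPEED_1000)
		return 0;

	*msec = 1;
	*div = speed / 1000;	/* 1 for 1G, 10 for 10G, 40 for 40G, ... */
	return 1;
}
```

With this shape no case labels are needed, so a 100G link (speed_hi = 1, speed = 34464) yields div = 100 without further changes.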

Ben.

>  		/*
>  		 * If the link speed is so slow you don't really

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] bnx2x: fix panic when TX ring is full
From: Tomas Hruby @ 2012-06-18 17:18 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Dmitry Kravkov, David Miller, netdev@vger.kernel.org,
	therbert@google.com, evansr@google.com, Eilon Greenstein,
	Merav Sicron, Yaniv Rosner, willemb@google.com
In-Reply-To: <1340005136.7491.609.camel@edumazet-glaptop>

On Mon, Jun 18, 2012 at 12:38 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sat, 2012-06-16 at 07:40 +0000, Dmitry Kravkov wrote:
>> Hi Eric and Tomas
>>
>> > From: netdev-owner@vger.kernel.org [mailto:netdev-
>> > owner@vger.kernel.org] On Behalf Of David Miller
>> > Sent: Saturday, June 16, 2012 1:31 AM
>> > To: eric.dumazet@gmail.com
>> > Cc: netdev@vger.kernel.org; therbert@google.com; evansr@google.com;
>> > Eilon Greenstein; Merav Sicron; Yaniv Rosner; willemb@google.com;
>> > thruby@google.com
>> > Subject: Re: [PATCH] bnx2x: fix panic when TX ring is full
>> >
>> > From: Eric Dumazet <eric.dumazet@gmail.com>
>> > Date: Wed, 13 Jun 2012 21:45:16 +0200
>> >
>> > > From: Eric Dumazet <edumazet@google.com>
>> > >
>> > > There is an off-by-one error in the minimal number of BDs in
>> > > bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming tx
>> > queue.
>> > >
>> > > A full size GSO packet, with data included in skb->head really needs
>> > > (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split()
>> > >
>> > > This error triggers if BQL is disabled and heavy TCP transmit traffic
>> > > occurs.
>> > >
>> > > bnx2x_tx_split() definitely can be called, remove a wrong comment.
>> > >
>> > > Reported-by: Tomas Hruby <thruby@google.com>
>> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
>>
>> Theoretically I can't see how we can reach the case with 4 BDs required apart from frags.
>> Usually we need 2, or 3 when split is invoked:
>> 1.Start
>> 2.Start(split)
>> 3.Parsing
>> + Frags
>>
>> Next pages descriptors and 2 extras for full indication are not counted as available.
>>
>> In practice I have been running the traffic for more than a day without hitting the panic.
>>
>> Can you describe in detail the scenario in which you reproduced this? And which code panicked?
>
> That's pretty immediate.

yes

> Disable bql on your NIC.
>
> Say you have 4 queues :
>
> for q in 0 1 2 3
> do
>  echo max >/sys/class/net/eth0/queues/tx-$q/byte_queue_limits/limit_min
> done
>
> Then start 40 netperf
>
> for i in `seq 1 40`
> do
>  netperf -H 192.168.1.4 &
> done

This is enough in my case too; it is perfectly reproducible on
different machines. Replacing +3 with +4 fixes the problem.

T.

^ permalink raw reply

