Netdev List
 help / color / mirror / Atom feed
* Re: [RFC] TCP:  Support configurable delayed-ack parameters.
From: Eric Dumazet @ 2012-06-19  5:11 UTC (permalink / raw)
  To: greearb; +Cc: netdev, Daniel Baluta
In-Reply-To: <1340067163-29329-1-git-send-email-greearb@candelatech.com>

On Mon, 2012-06-18 at 17:52 -0700, greearb@candelatech.com wrote:
> From: Ben Greear <greearb@candelatech.com>
> 
> RFC2581 ($4.2) specifies when an ACK should be generated as follows:
> 
> " .. an ACK SHOULD be generated for at least every second
>   full-sized segment, and MUST be generated within 500 ms
>   of the arrival of the first unacknowledged packet.
> "
> 
> We export the number of segments and the timeout limits
> specified above, so that a user can tune them according
> to their needs.
> 
> Specifically:
> 	* /proc/sys/net/ipv4/tcp_default_delack_segs, represents
> 	the threshold for the number of segments.
> 	* /proc/sys/net/ipv4/tcp_default_delack_min, specifies
> 	the minimum timeout value
> 	* /proc/sys/net/ipv4/tcp_default_delack_max, specifies
> 	the maximum timeout value.
> 
> In addition, new TCP socket options are added to allow
> per-socket configuration:
> 
> TCP_DELACK_SEGS
> TCP_DELACK_MIN
> TCP_DELACK_MAX
> 
> In order to keep a multiply out of the hot path, the segs * mss
> computation is recalculated and cached whenever segs or mss changes.
> 

I know David was worried about this multiply, but current cpus do a
multiply in at most 3 cycles.

Addding an u32 field in socket structure adds 1/16 of a cache line, and
adds more penalty.

Avoiding to build/send an ACK packet can save us so many cpu cycles that
the multiply is pure noise.

^ permalink raw reply

* RE: [PATCH] net: added support for 40GbE link.
From: Parav.Pandit @ 2012-06-19  5:20 UTC (permalink / raw)
  To: rick.jones2; +Cc: netdev, bhutchings
In-Reply-To: <4FDF56FB.9080509@hp.com>



> -----Original Message-----
> From: Rick Jones [mailto:rick.jones2@hp.com]
> Sent: Monday, June 18, 2012 9:58 PM
> To: Pandit, Parav
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com
> Subject: Re: [PATCH] net: added support for 40GbE link.
> 
> On 06/18/2012 05:44 AM, Parav Pandit wrote:
> 
> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index
> > 297370a..1ebfa6e 100644
> > --- a/include/linux/ethtool.h
> > +++ b/include/linux/ethtool.h
> > @@ -1153,6 +1153,10 @@ struct ethtool_ops {
> >   #define SUPPORTED_10000baseR_FEC	(1<<  20)
> >   #define SUPPORTED_20000baseMLD2_Full	(1<<  21)
> >   #define SUPPORTED_20000baseKR2_Full	(1<<  22)
> > +#define SUPPORTED_40000baseKR4_Full	(1<<  23)
> > +#define SUPPORTED_40000baseCR4_Full	(1<<  24)
> > +#define SUPPORTED_40000baseSR4_Full	(1<<  25)
> > +#define SUPPORTED_40000baseLR4_Full	(1<<  26)
> >
> >   /* Indicates what features are advertised by the interface. */
> >   #define ADVERTISED_10baseT_Half		(1<<  0)
> > @@ -1178,6 +1182,10 @@ struct ethtool_ops {
> >   #define ADVERTISED_10000baseR_FEC	(1<<  20)
> >   #define ADVERTISED_20000baseMLD2_Full	(1<<  21)
> >   #define ADVERTISED_20000baseKR2_Full	(1<<  22)
> > +#define ADVERTISED_40000baseKR4_Full	(1<<  23)
> > +#define ADVERTISED_40000baseCR4_Full	(1<<  24)
> > +#define ADVERTISED_40000baseSR4_Full	(1<<  25)
> > +#define ADVERTISED_40000baseLR4_Full	(1<<  26)
> 
> Any idea how many defines will be wanted for 100 Gbit Ethernet?
> Supported and advertising in ethtool_cmd are __u32s...
> 
100G supports CR10, ER4, LR4, Base-R and SR10 interfaces. So 5 bits. We have space from 27 to 31 bits for 100G as well.

> rick jones

^ permalink raw reply

* Re: [PATCH] ipv4: Early TCP socket demux.
From: David Miller @ 2012-06-19  6:07 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1340080661.7491.2205.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 19 Jun 2012 06:37:41 +0200

> On Mon, 2012-06-18 at 21:27 -0700, David Miller wrote:
> 
>> How are we not using it?  We use the cached SKB socket no matter what
>> happens.
>> 
>> Look at how inet hash lookup works.
>> 
>> The error tells the caller solely whether a route lookup is still
>> necessary.
> 
> OK, remove the unlikely() in __inet_lookup_skb() so that its obvious we
> have this skb_steal_sock() thing :)

Sure thing.

We also need to add some dst->ops->check() handling as well.

^ permalink raw reply

* Re: [PATCH] ipv4: Early TCP socket demux.
From: Eric Dumazet @ 2012-06-19  6:28 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120618.230723.612232275251003129.davem@davemloft.net>

On Mon, 2012-06-18 at 23:07 -0700, David Miller wrote:

> Sure thing.
> 
> We also need to add some dst->ops->check() handling as well.

Yes, rp_filter comes to mind.

^ permalink raw reply

* Re: [PATCH] sctp: fix warning when compiling without IPv6
From: David Miller @ 2012-06-19  7:28 UTC (permalink / raw)
  To: dhalperi; +Cc: netdev
In-Reply-To: <B7062EEE-046D-4435-B5FC-54FF3F763645@cs.washington.edu>

From: Daniel Halperin <dhalperi@cs.washington.edu>
Date: Mon, 18 Jun 2012 14:04:55 -0700

> net/sctp/protocol.c: In function ‘sctp_addr_wq_timeout_handler’:
> net/sctp/protocol.c:676: warning: label ‘free_next’ defined but not used
> 
> Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>

Applied.

^ permalink raw reply

* Re: [PATCH RESEND] net: lpc_eth: Driver cleanup
From: David Miller @ 2012-06-19  7:28 UTC (permalink / raw)
  To: stigge
  Cc: eric.dumazet, netdev, linux-kernel, kevin.wells, srinivas.bakki,
	aletes.xgr, linux-arm-kernel
In-Reply-To: <1340050482-22666-1-git-send-email-stigge@antcom.de>

From: Roland Stigge <stigge@antcom.de>
Date: Mon, 18 Jun 2012 22:14:42 +0200

> This patch removes some nowadays superfluous definitions (one unused define and
> an obsolete function forward declaration) and corrects a netdev_err() to
> netdev_dbg().
> 
> Signed-off-by: Roland Stigge <stigge@antcom.de>
> Signed-off-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: added support for 40GbE link.
From: David Miller @ 2012-06-19  7:29 UTC (permalink / raw)
  To: bhutchings; +Cc: parav.pandit, netdev
In-Reply-To: <1340039376.2913.13.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 18 Jun 2012 18:09:36 +0100

> On Mon, 2012-06-18 at 18:14 +0530, Parav Pandit wrote:
...
>> -/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE. */
>> +/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE, 40GbE. */
> 
> I don't think there's any need to name all possible link speeds, and it
> just encourages the bad practice of ethtool API users checking for
> specific values.  You may notice there is no SPEED_20000.

Agreed.

>> @@ -542,13 +542,11 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po,
 ...
> This function should be fixed properly.  Firstly, it must use
> ethtool_cmd_speed() rather than directly accessing ecmd.speed.
> Secondly, it should allow any speed value rather than checking for
> specific values.  Then there will be no need to make further changes for
> 100G or any other new speed.

Agreed.

^ permalink raw reply

* Re: [PATCH] phy/micrel: change phy_id_mask for KSZ9021 and KS8001
From: David Miller @ 2012-06-19  7:31 UTC (permalink / raw)
  To: jason77.wang; +Cc: david.choi, nobuhiro.iwamatsu.yj, netdev
In-Reply-To: <1340009529-4231-1-git-send-email-jason77.wang@gmail.com>

From: Hui Wang <jason77.wang@gmail.com>
Date: Mon, 18 Jun 2012 16:52:09 +0800

> On a freescale imx6q platform, a hardware phy chip KSZ9021 is
> recognized as a KS8001 chip by the current driver like this:
> eth0: Freescale FEC PHY driver [Micrel KS8001 or KS8721]
> 
> KSZ9021 has phy_id 0x00221610, while KSZ8001 has phy_id
> 0x0022161a, the current phy_id_mask (0x00fffff0/0x00ffff10) can't
> distinguish them. So change phy_id_mask to resolve this problem.
> 
> Although the micrel datasheet says that the 4 LSB of phyid2 register
> contains the chip revision number and the current driver is designed
> to follow this rule, in reality the chip implementation doesn't follow
> it.
> 
> Cc: David J. Choi <david.choi@micrel.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
> Signed-off-by: Hui Wang <jason77.wang@gmail.com>

Applied, thanks.

^ permalink raw reply

* RE: [PATCH] net: added support for 40GbE link.
From: Parav.Pandit @ 2012-06-19  7:33 UTC (permalink / raw)
  To: davem, bhutchings; +Cc: netdev
In-Reply-To: <20120619.002905.922583388766089167.davem@davemloft.net>



> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Tuesday, June 19, 2012 12:59 PM
> To: bhutchings@solarflare.com
> Cc: Pandit, Parav; netdev@vger.kernel.org
> Subject: Re: [PATCH] net: added support for 40GbE link.
> 
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Mon, 18 Jun 2012 18:09:36 +0100
> 
> > On Mon, 2012-06-18 at 18:14 +0530, Parav Pandit wrote:
> ...
> >> -/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE. */
> >> +/* The forced speed, 10Mb, 100Mb, gigabit, 2.5Gb, 10GbE, 40GbE. */
> >
> > I don't think there's any need to name all possible link speeds, and
> > it just encourages the bad practice of ethtool API users checking for
> > specific values.  You may notice there is no SPEED_20000.
> 
> Agreed.

Should eventually all net driver should remove using SPEED_xxxxxx and start using hard coded value of 10, 100, 1000, 20000?

> 
> >> @@ -542,13 +542,11 @@ static int prb_calc_retire_blk_tmo(struct
> >> packet_sock *po,
>  ...
> > This function should be fixed properly.  Firstly, it must use
> > ethtool_cmd_speed() rather than directly accessing ecmd.speed.
> > Secondly, it should allow any speed value rather than checking for
> > specific values.  Then there will be no need to make further changes
> > for 100G or any other new speed.
> 
> Agreed.

That means ethtool_cmd_speed() should not be called in this function?

If I understand correctly, it should return the value of 8ms (for 10Mb,s 100Mbps, 2.5 Gbps, 20Gbps) as today and remaining it should return calculated value?
Or
Function needs a fix for all these speeds (10Mbps, 100Mbs, 20Gbps too)?

^ permalink raw reply

* Re: [XFRM][RFC v1] Fix unexpected SA hard expiration after setting new date
From: David Miller @ 2012-06-19  7:34 UTC (permalink / raw)
  To: herbert; +Cc: fan.du, netdev, fdu
In-Reply-To: <20120618084021.GA24902@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 18 Jun 2012 16:40:21 +0800

> I guess I can live with your workaround if Dave is happy with
> it.  But IMHO we should just go back to relative time and fix
> the suspend/resume user-space scripts instead.

In the short term I'm happy with it too.

But I seem to remember that last time this issue came up the
suggestion was to use highres timers.  Someone tried but they did an
amazingly poor job so the effort just died off.  It takes someone with
some skill because highres timers operate with different context
requirements than normal timers.

^ permalink raw reply

* Re: [PATCH] net: added support for 40GbE link.
From: David Miller @ 2012-06-19  7:35 UTC (permalink / raw)
  To: Parav.Pandit; +Cc: bhutchings, netdev
In-Reply-To: <5D6C0ABE6A236946864C45679362BBE20AC52269@CMEXMB1.ad.emulex.com>

From: <Parav.Pandit@Emulex.Com>
Date: Tue, 19 Jun 2012 07:33:12 +0000

> Should eventually all net driver should remove using SPEED_xxxxxx and start using hard coded value of 10, 100, 1000, 20000?

No, the ones that exist can stay, just no new ones.

> That means ethtool_cmd_speed() should not be called in this function?

Ben said that it must be called, what are you talking about?

^ permalink raw reply

* RE: [PATCH] net: added support for 40GbE link.
From: Parav.Pandit @ 2012-06-19  7:42 UTC (permalink / raw)
  To: davem; +Cc: bhutchings, netdev
In-Reply-To: <20120619.003517.2067426285873138772.davem@davemloft.net>

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Tuesday, June 19, 2012 1:05 PM
> To: Pandit, Parav
> Cc: bhutchings@solarflare.com; netdev@vger.kernel.org
> Subject: Re: [PATCH] net: added support for 40GbE link.
> 
> From: <Parav.Pandit@Emulex.Com>
> Date: Tue, 19 Jun 2012 07:33:12 +0000
> 
> > Should eventually all net driver should remove using SPEED_xxxxxx and
> start using hard coded value of 10, 100, 1000, 20000?
> 
> No, the ones that exist can stay, just no new ones.
> 
So driver which supports 40Gpbs, 100Gbps should hardcode to 40000, 100000 respectively?

> > That means ethtool_cmd_speed() should not be called in this function?
> 
> Ben said that it must be called, what are you talking about?

Sorry, I wanted to ask - Do you need switch case for speed like below new code or its should be speed independent code?
                switch (ethtool_cmd_speed()) {
                case SPEED_100:
                case SPEED_10:
                        return DEFAULT_PRB_RETIRE_TOV;
                default:
                        msec = 1;
                        div = ethtool_cmd_speed() / 1000;
                        break;
                /*
                }

^ permalink raw reply

* Re: [XFRM][RFC v1] Fix unexpected SA hard expiration after setting new date
From: Herbert Xu @ 2012-06-19  7:43 UTC (permalink / raw)
  To: David Miller; +Cc: fan.du, netdev, fdu
In-Reply-To: <20120619.003422.1109786261235176550.davem@davemloft.net>

On Tue, Jun 19, 2012 at 12:34:22AM -0700, David Miller wrote:
> 
> But I seem to remember that last time this issue came up the
> suggestion was to use highres timers.  Someone tried but they did an
> amazingly poor job so the effort just died off.  It takes someone with
> some skill because highres timers operate with different context
> requirements than normal timers.

Sorry I think I didn't see that discussion or if I did, then
it has completely left my mind :)

So excuse my ignorance, are highres timers relative like jiffies
or absolute? If it's absolute, I don't see how it would impact this
case.

If it's relative, what exactly does it buy for us over jiffies?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [XFRM][RFC v2] Fix unexpected SA hard expiration after setting new date
From: fan.du @ 2012-06-19  7:51 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev, fdu


Hi, Dave and Herbert

Could you give a try to review this?
thanks :)

Changelog:
v1->v2
1) use xflags instead of creating new flags(suggested by Steffen Klassert)


*Background*:
Once IPsec SAs are created between two peers, kernel setup a timer to monitor
two events: soft/hard expiration. However the timer handler use xtime to
caculate whether it's soft or hard expiration event.

normal code flow(hard expire time:100s, soft expire time:82s)

a) When new SAs created, xfrm_timer_handler is called one second
after its creation. At this point, calculate soft expire
interval(81s), setup the timer;

b) soft expire occur, rearm the timer with hard expire interval(18s)
then notify racoon2 about soft expire event. racoon2 will create
new SAs.

c) hard expire happen, notify racoon2 about it. racoon2 will delete
the old SAs.

*Scenario*:
Setting a new date before b),and after a) could result c) happens first,
As a result, old SAs is deleted before new ones are created. Normally
new SAs will be created by the next time networking traffic, but there
is a small time being when networking connection is down, this could
result in upper layer connections failed in tel comm area, thus it's
better to keep it strict in sequence.

*Workaround*:
set new time could happen:
1) before a), then SAs is updated with new time.
2) before b),and after a)
2a) When new SAs created, xfrm_timer_handler is called one second
after its creation. At this point, calculate soft expire
interval(81s), setup the timer;(set flag to mark next time should
be soft time expire)

<<---- new date comes

2b) soft expire occur, the calculation results in a hard time expire
event, but flag is set, so catch ya. Sync the addtime, and rearm
the timer with hard expire interval(18s), then notify racoon2
about soft expire event;

2c) hard expire happen, notify racoon2 about it;
so everything is in order.

3) after b), hard expire always happened anyway.

^ permalink raw reply

* [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: fan.du @ 2012-06-19  7:51 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev, fdu
In-Reply-To: <1340092269-6136-1-git-send-email-fan.du@windriver.com>

From: "fan.du" <fan.du@windriver.com>

After SA is setup, one timer is armed to detect soft/hard expiration,
however the timer handler uses xtime to do the math. This makes hard
expiration occurs first before soft expiration after setting new date
with big interval. As a result new child SA is deleted before rekeying
the new one.

Signed-off-by: fan.du <fan.du@windriver.com>
---
 include/net/xfrm.h    |  3 +++
 net/xfrm/xfrm_state.c | 21 +++++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 2933d74..8d16777 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -197,6 +197,8 @@ struct xfrm_state
 
 	struct xfrm_lifetime_cur curlft;
 	struct timer_list	timer;
+	/* used to fix curlft->add_time when changing date */
+	long		saved_tmo;
 
 	/* Last used time */
 	unsigned long		lastused;
@@ -218,6 +220,7 @@ struct xfrm_state
 
 /* xflags - make enum if more show up */
 #define XFRM_TIME_DEFER	1
+#define XFRM_SOFT_EXPIRE 2
 
 enum {
 	XFRM_STATE_VOID,
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index fd77cf0..ab4aa0f 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -442,8 +442,17 @@ static void xfrm_timer_handler(unsigned long data)
 	if (x->lft.hard_add_expires_seconds) {
 		long tmo = x->lft.hard_add_expires_seconds +
 			x->curlft.add_time - now;
-		if (tmo <= 0)
-			goto expired;
+		if (tmo <= 0) {
+			if (x->xflags & XFRM_SOFT_EXPIRE) {
+				/* enter hard expire without soft expire first?!
+				 * setting a new date could trigger this.
+				 * workarbound: fix x->curflt.add_time by below:
+				 */
+				x->curlft.add_time = now - x->saved_tmo - 1;
+				tmo = x->lft.hard_add_expires_seconds - x->saved_tmo;
+			} else
+				goto expired;
+		}
 		if (tmo < next)
 			next = tmo;
 	}
@@ -460,10 +469,14 @@ static void xfrm_timer_handler(unsigned long data)
 	if (x->lft.soft_add_expires_seconds) {
 		long tmo = x->lft.soft_add_expires_seconds +
 			x->curlft.add_time - now;
-		if (tmo <= 0)
+		if (tmo <= 0) {
 			warn = 1;
-		else if (tmo < next)
+			x->xflags |= ~XFRM_SOFT_EXPIRE;
+		} else if (tmo < next) {
 			next = tmo;
+			x->xflags |= XFRM_SOFT_EXPIRE;
+			x->saved_tmo = tmo;
+		}
 	}
 	if (x->lft.soft_use_expires_seconds) {
 		long tmo = x->lft.soft_use_expires_seconds +
-- 
1.7.11

^ permalink raw reply related

* Re: [PATCH net-next v2 06/12] netfilter: merge udpv[4,6]_net_init into udp_net_init
From: Gao feng @ 2012-06-19  8:08 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, netfilter-devel
In-Reply-To: <20120616112259.GG18251@1984>

于 2012年06月16日 19:22, Pablo Neira Ayuso 写道:
> On Sat, Jun 16, 2012 at 11:41:17AM +0800, Gao feng wrote:
>> merge udpv4_net_init and udpv6_net_init into udp_net_init to
>> reduce the redundancy codes.
>>
>> and use nf_proto_net.users to identify if it's the first time
>> we use the nf_proto_net. when it's the first time,we will
>> initialized it.
>>
>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>> ---
>>  net/netfilter/nf_conntrack_proto_udp.c |   56 ++++++++++---------------------
>>  1 files changed, 18 insertions(+), 38 deletions(-)
>>
>> diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
>> index 2b978e6..61bca4f 100644
>> --- a/net/netfilter/nf_conntrack_proto_udp.c
>> +++ b/net/netfilter/nf_conntrack_proto_udp.c
>> @@ -270,52 +270,32 @@ static int udp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn)
>>  	return 0;
>>  }
>>
>> -static void udp_init_net_data(struct nf_udp_net *un)
>> +static int udp_init_net(struct net *net, u_int16_t proto)
>>  {
>> -	int i;
>> -#ifdef CONFIG_SYSCTL
>> -	if (!un->pn.ctl_table) {
>> -#else
>> -	if (!un->pn.users++) {
>> -#endif
>> +	int ret;
>> +	struct nf_udp_net *un = udp_pernet(net);
>> +	struct nf_proto_net *pn = &un->pn;
>> +
>> +	if (!pn->users) {
>> +		int i;
>>  		for (i = 0; i < UDP_CT_MAX; i++)
>>  			un->timeouts[i] = udp_timeouts[i];
>>  	}
>> -}
>> -
>> -static int udpv4_init_net(struct net *net, u_int16_t proto)
>> -{
>> -	int ret;
>> -	struct nf_udp_net *un = udp_pernet(net);
>> -	struct nf_proto_net *pn = (struct nf_proto_net *)un;
>>
>> -	udp_init_net_data(un);
>> +	if (proto == AF_INET) {
> 
> I think we can remove that u_int16_t proto that I proposed to make
> something like:
> 
> static int udp_kmemdup_compat_sysctl_table(struct nf_proto_net *pn)
> {
> #ifdef CONFIG_SYSCTL
> #ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
>         struct nf_udp_net *un = (struct nf_udp_net *)pn;
> +
> +       if (pn->ctl_compat_table)
> +               return 0;
> +
>         pn->ctl_compat_table = kmemdup(udp_compat_sysctl_table,
>                                        sizeof(udp_compat_sysctl_table),
>                                        GFP_KERNEL);
>         if (!pn->ctl_compat_table)
>                 return -ENOMEM;
> 
> That should be enough to ensure that the compat is registered once. No
> matter if it's done by the IPv4 or IPv6 invocation of udp_init_net.
> 
> Thus, it will look consistent with udp_kmemdup_sysctl_table.


yes, but this will be very terrible to unregister compat sysctl
and free compat sysctl table.

thinking about, we may insmod nf_conntrack_ipv6 only, as your idea,
we will allocate compat_sysctl_table.so we have to free it when
rmmod nf_conntrack_ipv6.

in order to implement it, we have to change the logic of
nf_ct_l4proto_unregister_sysctl and nf_ct_unregister_sysctl. because we
only free the sysctl table when we unregister the proto.

^ permalink raw reply

* Re: [PATCH net-next v2 06/12] netfilter: merge udpv[4,6]_net_init into udp_net_init
From: Gao feng @ 2012-06-19  8:12 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, netfilter-devel
In-Reply-To: <4FE0337F.6030000@cn.fujitsu.com>

于 2012年06月19日 16:08, Gao feng 写道:
> 
> in order to implement it, we have to change the logic of
> nf_ct_l4proto_unregister_sysctl and nf_ct_unregister_sysctl. because we
> only free the sysctl table when we unregister the proto.

I means free the sysctl table when we register sysctl success
(pn->ctl_table_header != NULL).

^ permalink raw reply

* Re: [PATCH net-next v2 11/12] netfilter: nf_conntrack_l4proto_icmp cleanup
From: Gao feng @ 2012-06-19  8:14 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, netfilter-devel
In-Reply-To: <20120616110828.GE18251@1984>

于 2012年06月16日 19:08, Pablo Neira Ayuso 写道:
> extra line unrequiered.
> 
> I'm sorry, I'm stressing a lot on this because I don't like abusing
> follow-up patches that clean up extra lines / missing lines and that
> sort of nitpicks...
> 

Sorry for this,I will double check this when I send the v3 patch.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: David Miller @ 2012-06-19  9:01 UTC (permalink / raw)
  To: fan.du; +Cc: herbert, netdev, fdu
In-Reply-To: <1340092269-6136-2-git-send-email-fan.du@windriver.com>

From: "fan.du" <fan.du@windriver.com>
Date: Tue, 19 Jun 2012 15:51:09 +0800

> From: "fan.du" <fan.du@windriver.com>

Please don't put your email user name instead of real name in quotes
there.  Thank you.

^ permalink raw reply

* Re: [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: David Miller @ 2012-06-19  9:05 UTC (permalink / raw)
  To: fan.du; +Cc: herbert, netdev, fdu
In-Reply-To: <20120619.020111.1596676335106760442.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Tue, 19 Jun 2012 02:01:11 -0700 (PDT)

> From: "fan.du" <fan.du@windriver.com>
> Date: Tue, 19 Jun 2012 15:51:09 +0800
> 
>> From: "fan.du" <fan.du@windriver.com>
> 
> Please don't put your email user name instead of real name in quotes
> there.  Thank you.

Also fdu@windriver.com bounces, do not put it in the CC: list.

^ permalink raw reply

* [XFRM][RFC v3] Fix unexpected SA hard expiration after setting new date
From: Fan Du @ 2012-06-19  9:28 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev


Hi, Dave and Herbert

Could you give a try to review this?
thanks

Changelog:
v1->v2
1) use xflags instead of creating new flags(suggested by Steffen Klassert)

v2->v3
1) fix email problem, and remove cc to myself(requested by David Miller)


*Background*:
Once IPsec SAs are created between two peers, kernel setup a timer to monitor
two events: soft/hard expiration. However the timer handler use xtime to
caculate whether it's soft or hard expiration event.

normal code flow(hard expire time:100s, soft expire time:82s)

a) When new SAs created, xfrm_timer_handler is called one second
after its creation. At this point, calculate soft expire
interval(81s), setup the timer;

b) soft expire occur, rearm the timer with hard expire interval(18s)
then notify racoon2 about soft expire event. racoon2 will create
new SAs.

c) hard expire happen, notify racoon2 about it. racoon2 will delete
the old SAs.

*Scenario*:
Setting a new date before b),and after a) could result c) happens first,
As a result, old SAs is deleted before new ones are created. Normally
new SAs will be created by the next time networking traffic, but there
is a small time being when networking connection is down, this could
result in upper layer connections failed in tel comm area, thus it's
better to keep it strict in sequence.

*Workaround*:
set new time could happen:
1) before a), then SAs is updated with new time.
2) before b),and after a)
2a) When new SAs created, xfrm_timer_handler is called one second
after its creation. At this point, calculate soft expire
interval(81s), setup the timer;(set flag to mark next time should
be soft time expire)

<<---- new date comes

2b) soft expire occur, the calculation results in a hard time expire
event, but flag is set, so catch ya. Sync the addtime, and rearm
the timer with hard expire interval(18s), then notify racoon2
about soft expire event;

2c) hard expire happen, notify racoon2 about it;
so everything is in order.

3) after b), hard expire always happened anyway.

^ permalink raw reply

* [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: Fan Du @ 2012-06-19  9:28 UTC (permalink / raw)
  To: davem, herbert; +Cc: netdev
In-Reply-To: <1340098118-21529-1-git-send-email-fdu@windriver.com>

After SA is setup, one timer is armed to detect soft/hard expiration,
however the timer handler uses xtime to do the math. This makes hard
expiration occurs first before soft expiration after setting new date
with big interval. As a result new child SA is deleted before rekeying
the new one.

Signed-off-by: Fan Du <fdu@windriver.com>
---
 include/net/xfrm.h    |  3 +++
 net/xfrm/xfrm_state.c | 21 +++++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 2933d74..8d16777 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -197,6 +197,8 @@ struct xfrm_state
 
 	struct xfrm_lifetime_cur curlft;
 	struct timer_list	timer;
+	/* used to fix curlft->add_time when changing date */
+	long		saved_tmo;
 
 	/* Last used time */
 	unsigned long		lastused;
@@ -218,6 +220,7 @@ struct xfrm_state
 
 /* xflags - make enum if more show up */
 #define XFRM_TIME_DEFER	1
+#define XFRM_SOFT_EXPIRE 2
 
 enum {
 	XFRM_STATE_VOID,
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index fd77cf0..ab4aa0f 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -442,8 +442,17 @@ static void xfrm_timer_handler(unsigned long data)
 	if (x->lft.hard_add_expires_seconds) {
 		long tmo = x->lft.hard_add_expires_seconds +
 			x->curlft.add_time - now;
-		if (tmo <= 0)
-			goto expired;
+		if (tmo <= 0) {
+			if (x->xflags & XFRM_SOFT_EXPIRE) {
+				/* enter hard expire without soft expire first?!
+				 * setting a new date could trigger this.
+				 * workarbound: fix x->curflt.add_time by below:
+				 */
+				x->curlft.add_time = now - x->saved_tmo - 1;
+				tmo = x->lft.hard_add_expires_seconds - x->saved_tmo;
+			} else
+				goto expired;
+		}
 		if (tmo < next)
 			next = tmo;
 	}
@@ -460,10 +469,14 @@ static void xfrm_timer_handler(unsigned long data)
 	if (x->lft.soft_add_expires_seconds) {
 		long tmo = x->lft.soft_add_expires_seconds +
 			x->curlft.add_time - now;
-		if (tmo <= 0)
+		if (tmo <= 0) {
 			warn = 1;
-		else if (tmo < next)
+			x->xflags |= ~XFRM_SOFT_EXPIRE;
+		} else if (tmo < next) {
 			next = tmo;
+			x->xflags |= XFRM_SOFT_EXPIRE;
+			x->saved_tmo = tmo;
+		}
 	}
 	if (x->lft.soft_use_expires_seconds) {
 		long tmo = x->lft.soft_use_expires_seconds +
-- 
1.7.11

^ permalink raw reply related

* [PATCH net-next 0/5] qmi_wwan fixes for 3.6
From: Bjørn Mork @ 2012-06-19 10:41 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Oliver Neukum, Dan Williams, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	Andrew Bird (Sphere Systems), Bjørn Mork

None of these are critical, so I don't think they are appropriate
for 3.5 at this point. I've therefore based them on net-next.

The two first patches prepare the driver for the new probing
model introduced by patch #3, which is the main change in this
set.  A RFC version of this was posted to linux-usb 29 May 2012 
for discussion, as it also implicitly affects the cdc-wdm driver, 
without any comments so far.

Probing on the data interface was a mistake which was introduced
at a point where I didn't realize that the cdc-wdm subdriver
support was required in any case, because most of the supported
devices only provide a single USB interface. Apart from creating
a confusing and inconsistent driver usage, where some devices would
use cdc-wdm as a subdriver and some would not, a number of real
problems has also shown up.  The most important one at the moment
is the need to make cdc_ether and qmi_wwan cooperate wrt which
devices are supported by which driver.  This is close to impossible
unless the two probes both look at the same control interface. It
is not enough that the probe does this. We need to fine tune the
alias matching, which implies fine tuning the device ID tables.

Completing this change will require removing the now unnecessary
and conflicting device ID entries from cdc-wdm.  This will be
submitted as a separate patch against usb-next to avoid mixing
patches for both trees in one patch set.  There is no build
dependency.

The two last patches are minor obvious editorial fixes.


Bjørn Mork (5):
  net: qmi_wwan: define a structure for driver specific state
  net: qmi_wwan: rearranging to prepare for code sharing
  net: qmi_wwan: bind to both control and data interface
  net: qmi_wwan: shorten driver description
  net: qmi_wwan: use module_usb_driver macro

 drivers/net/usb/qmi_wwan.c |  308 +++++++++++++++++++++++---------------------
 1 file changed, 163 insertions(+), 145 deletions(-)

-- 
1.7.10

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH net-next 1/5] net: qmi_wwan: define a structure for driver specific state
From: Bjørn Mork @ 2012-06-19 10:41 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Oliver Neukum, Dan Williams, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	Andrew Bird (Sphere Systems), Bjørn Mork
In-Reply-To: <1340102523-23990-1-git-send-email-bjorn-yOkvZcmFvRU@public.gmane.org>

usbnet allocates a fixed size array for minidriver specific
state.  Naming the fields and taking advantage of type checking
is a bit more failsafe than casting array elements each time
they are referenced.

Signed-off-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
---
 drivers/net/usb/qmi_wwan.c |   49 ++++++++++++++++++++++++++------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 3b20678..c7b9be8 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -54,6 +54,13 @@
  *     corresponding management interface
  */
 
+/* driver specific data */
+struct qmi_wwan_state {
+	struct usb_driver *subdriver;
+	atomic_t pmcount;
+	unsigned long unused[3];
+};
+
 static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 {
 	int status = -1;
@@ -65,9 +72,11 @@ static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 	struct usb_cdc_ether_desc *cdc_ether = NULL;
 	u32 required = 1 << USB_CDC_HEADER_TYPE | 1 << USB_CDC_UNION_TYPE;
 	u32 found = 0;
-	atomic_t *pmcount = (void *)&dev->data[1];
+	struct qmi_wwan_state *info = (void *)&dev->data;
+
+	BUILD_BUG_ON((sizeof(((struct usbnet *)0)->data) < sizeof(struct qmi_wwan_state)));
 
-	atomic_set(pmcount, 0);
+	atomic_set(&info->pmcount, 0);
 
 	/*
 	 * assume a data interface has no additional descriptors and
@@ -177,12 +186,12 @@ err:
 /* using a counter to merge subdriver requests with our own into a combined state */
 static int qmi_wwan_manage_power(struct usbnet *dev, int on)
 {
-	atomic_t *pmcount = (void *)&dev->data[1];
+	struct qmi_wwan_state *info = (void *)&dev->data;
 	int rv = 0;
 
-	dev_dbg(&dev->intf->dev, "%s() pmcount=%d, on=%d\n", __func__, atomic_read(pmcount), on);
+	dev_dbg(&dev->intf->dev, "%s() pmcount=%d, on=%d\n", __func__, atomic_read(&info->pmcount), on);
 
-	if ((on && atomic_add_return(1, pmcount) == 1) || (!on && atomic_dec_and_test(pmcount))) {
+	if ((on && atomic_add_return(1, &info->pmcount) == 1) || (!on && atomic_dec_and_test(&info->pmcount))) {
 		/* need autopm_get/put here to ensure the usbcore sees the new value */
 		rv = usb_autopm_get_interface(dev->intf);
 		if (rv < 0)
@@ -212,7 +221,7 @@ static int qmi_wwan_bind_shared(struct usbnet *dev, struct usb_interface *intf)
 {
 	int rv;
 	struct usb_driver *subdriver = NULL;
-	atomic_t *pmcount = (void *)&dev->data[1];
+	struct qmi_wwan_state *info = (void *)&dev->data;
 
 	/* ZTE makes devices where the interface descriptors and endpoint
 	 * configurations of two or more interfaces are identical, even
@@ -228,7 +237,7 @@ static int qmi_wwan_bind_shared(struct usbnet *dev, struct usb_interface *intf)
 		goto err;
 	}
 
-	atomic_set(pmcount, 0);
+	atomic_set(&info->pmcount, 0);
 
 	/* collect all three endpoints */
 	rv = usbnet_get_endpoints(dev, intf);
@@ -251,7 +260,7 @@ static int qmi_wwan_bind_shared(struct usbnet *dev, struct usb_interface *intf)
 	dev->status = NULL;
 
 	/* save subdriver struct for suspend/resume wrappers */
-	dev->data[0] = (unsigned long)subdriver;
+	info->subdriver = subdriver;
 
 err:
 	return rv;
@@ -282,12 +291,12 @@ err:
 
 static void qmi_wwan_unbind_shared(struct usbnet *dev, struct usb_interface *intf)
 {
-	struct usb_driver *subdriver = (void *)dev->data[0];
+	struct qmi_wwan_state *info = (void *)&dev->data;
 
-	if (subdriver && subdriver->disconnect)
-		subdriver->disconnect(intf);
+	if (info->subdriver && info->subdriver->disconnect)
+		info->subdriver->disconnect(intf);
 
-	dev->data[0] = (unsigned long)NULL;
+	info->subdriver = NULL;
 }
 
 /* suspend/resume wrappers calling both usbnet and the cdc-wdm
@@ -299,15 +308,15 @@ static void qmi_wwan_unbind_shared(struct usbnet *dev, struct usb_interface *int
 static int qmi_wwan_suspend(struct usb_interface *intf, pm_message_t message)
 {
 	struct usbnet *dev = usb_get_intfdata(intf);
-	struct usb_driver *subdriver = (void *)dev->data[0];
+	struct qmi_wwan_state *info = (void *)&dev->data;
 	int ret;
 
 	ret = usbnet_suspend(intf, message);
 	if (ret < 0)
 		goto err;
 
-	if (subdriver && subdriver->suspend)
-		ret = subdriver->suspend(intf, message);
+	if (info->subdriver && info->subdriver->suspend)
+		ret = info->subdriver->suspend(intf, message);
 	if (ret < 0)
 		usbnet_resume(intf);
 err:
@@ -317,16 +326,16 @@ err:
 static int qmi_wwan_resume(struct usb_interface *intf)
 {
 	struct usbnet *dev = usb_get_intfdata(intf);
-	struct usb_driver *subdriver = (void *)dev->data[0];
+	struct qmi_wwan_state *info = (void *)&dev->data;
 	int ret = 0;
 
-	if (subdriver && subdriver->resume)
-		ret = subdriver->resume(intf);
+	if (info->subdriver && info->subdriver->resume)
+		ret = info->subdriver->resume(intf);
 	if (ret < 0)
 		goto err;
 	ret = usbnet_resume(intf);
-	if (ret < 0 && subdriver && subdriver->resume && subdriver->suspend)
-		subdriver->suspend(intf, PMSG_SUSPEND);
+	if (ret < 0 && info->subdriver && info->subdriver->resume && info->subdriver->suspend)
+		info->subdriver->suspend(intf, PMSG_SUSPEND);
 err:
 	return ret;
 }
-- 
1.7.10

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH net-next 5/5] net: qmi_wwan: use module_usb_driver macro
From: Bjørn Mork @ 2012-06-19 10:42 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Oliver Neukum, Dan Williams, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	Andrew Bird (Sphere Systems), Bjørn Mork
In-Reply-To: <1340102523-23990-1-git-send-email-bjorn-yOkvZcmFvRU@public.gmane.org>

Signed-off-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
---
 drivers/net/usb/qmi_wwan.c |   12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index f12ba3c..f1e7791 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -593,17 +593,7 @@ static struct usb_driver qmi_wwan_driver = {
 	.disable_hub_initiated_lpm = 1,
 };
 
-static int __init qmi_wwan_init(void)
-{
-	return usb_register(&qmi_wwan_driver);
-}
-module_init(qmi_wwan_init);

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox