Netdev List


* Re: [PATCH] inet_diag: fix panic when unload inet_diag
From: Gao feng @ 2012-09-25  7:20 UTC
  To: Eric Dumazet; +Cc: davem, stephen.hemminger, jengelh, kuznet, netdev
In-Reply-To: <1348555011.26828.2031.camel@edumazet-glaptop>

On 2012-09-25 14:36, Eric Dumazet wrote:
> On Tue, 2012-09-25 at 10:48 +0800, Gao feng wrote:
>> When inet_diag is compiled as a module, inet_diag_handler_dump
>> sets netlink_dump_control.dump to inet_diag_dump, so if the
>> inet_diag module is unloaded, netlink will still try to call this
>> function in netlink_dump. This will cause a kernel panic.
>>
>> Fix this by taking a reference on the inet_diag module before
>> setting netlink_callback, and releasing this reference in
>> netlink_callback.done.
>>
>> Thanks for all the help from Stephen, Jan and Eric.
> ...
> 
>>  
>> @@ -1001,8 +1025,26 @@ static int inet_diag_handler_dump(struct sk_buff *skb, struct nlmsghdr *h)
>>  		{
>>  			struct netlink_dump_control c = {
>>  				.dump = inet_diag_dump,
>> +				.done = inet_diag_done,
>>  			};
>> -			return netlink_dump_start(net->diag_nlsk, skb, h, &c);
>> +			int err;
>> +			/*
>> +			 * netlink_dump will call inet_diag_dump,
>> +			 * so we need a reference of THIS_MODULE.
>> +			 */
>> +			if (!try_module_get(THIS_MODULE))
>> +				return -EPROTONOSUPPORT;
>> +
>> +			err = netlink_dump_start(net->diag_nlsk, skb, h, &c);
>> +
>> +			if ((err != -EINTR) && (err != -ENOBUFS)) {
>> +				/*
>> +				 * netlink_callback set failed, release the
>> +				 * referenct of THIS_MODULE.
>> +				 */
>> +				module_put(THIS_MODULE);
>> +			}
>> +			return err;
>>  		}
>>  	}
>>  
> 
> Hmm... this seems error prone...
> 
> In the future, netlink_dump_start() could be changed to return other
> errors than EINTR or ENOBUFS that need the module_put()
> 

EINTR and ENOBUFS are returned by netlink_dump, which netlink_dump_start
calls after netlink_callback has been set successfully.
So the check for EINTR and ENOBUFS here is to determine whether we set
netlink_callback successfully.

I think that, to make this less error prone, we have to change
netlink_dump_start so that it reports whether netlink_callback was set
successfully.

> I would change netlink_dump_start() to __netlink_dump_start() and add a
> module param to it, so that this module stuff is centralized in
> __netlink_dump_start()
> 
> Then, instead of calling (from inet_diag)
> 
> netlink_dump_start(net->diag_nlsk, skb, nlh, &c);
> 
> you would use :
> 
> __netlink_dump_start(net->diag_nlsk, skb, nlh, &c, THIS_MODULE);
> 
> I wonder if this fix is not needed elsewhere eventually
> (net/unix/af_unix.c for example ?)
> 

Do you mean net/unix/unix_diag.c?
I tested the nfnetlink module; it has the same problem.

We need to modify netlink_dump_start itself, not only wrap it.

* Re: [PATCH net-next] net: raw: revert unrelated change
From: David Miller @ 2012-09-25  7:11 UTC
  To: eric.dumazet; +Cc: sfr, netdev, linux-next, linux-kernel
In-Reply-To: <1348554076.26828.1990.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 25 Sep 2012 08:21:16 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Commit 5640f7685831 ("net: use a per task frag allocator")
> accidentally contained an unrelated change to net/ipv4/raw.c,
> later committed (without the pr_err() debugging bits) in
> net tree as commit ab43ed8b749 (ipv4: raw: fix icmp_filter())
> 
> This patch reverts this glitch, noticed by Stephen Rothwell.
> 
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

* Re: [PATCH] inet_diag: fix panic when unload inet_diag
From: Eric Dumazet @ 2012-09-25  6:36 UTC
  To: Gao feng; +Cc: davem, stephen.hemminger, jengelh, kuznet, netdev
In-Reply-To: <1348541310-31913-1-git-send-email-gaofeng@cn.fujitsu.com>

On Tue, 2012-09-25 at 10:48 +0800, Gao feng wrote:
> When inet_diag is compiled as a module, inet_diag_handler_dump
> sets netlink_dump_control.dump to inet_diag_dump, so if the
> inet_diag module is unloaded, netlink will still try to call this
> function in netlink_dump. This will cause a kernel panic.
> 
> Fix this by taking a reference on the inet_diag module before
> setting netlink_callback, and releasing this reference in
> netlink_callback.done.
> 
> Thanks for all the help from Stephen, Jan and Eric.
...

>  
> @@ -1001,8 +1025,26 @@ static int inet_diag_handler_dump(struct sk_buff *skb, struct nlmsghdr *h)
>  		{
>  			struct netlink_dump_control c = {
>  				.dump = inet_diag_dump,
> +				.done = inet_diag_done,
>  			};
> -			return netlink_dump_start(net->diag_nlsk, skb, h, &c);
> +			int err;
> +			/*
> +			 * netlink_dump will call inet_diag_dump,
> +			 * so we need a reference of THIS_MODULE.
> +			 */
> +			if (!try_module_get(THIS_MODULE))
> +				return -EPROTONOSUPPORT;
> +
> +			err = netlink_dump_start(net->diag_nlsk, skb, h, &c);
> +
> +			if ((err != -EINTR) && (err != -ENOBUFS)) {
> +				/*
> +				 * netlink_callback set failed, release the
> +				 * referenct of THIS_MODULE.
> +				 */
> +				module_put(THIS_MODULE);
> +			}
> +			return err;
>  		}
>  	}
>  

Hmm... this seems error prone...

In the future, netlink_dump_start() could be changed to return other
errors than EINTR or ENOBUFS that need the module_put()

I would change netlink_dump_start() to __netlink_dump_start() and add a
module param to it, so that this module stuff is centralized in
__netlink_dump_start()

Then, instead of calling (from inet_diag)

netlink_dump_start(net->diag_nlsk, skb, nlh, &c);

you would use :

__netlink_dump_start(net->diag_nlsk, skb, nlh, &c, THIS_MODULE);

I wonder if this fix is not needed elsewhere eventually
(net/unix/af_unix.c for example ?)
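
The idea of centralizing the module reference can be sketched in plain
user-space C. This is only an illustration under stated assumptions, not
the kernel implementation: every fake_* name below is hypothetical, the
"module" is reduced to a refcount plus an unloading flag, and the dump
itself is not run. It shows the one property being asked for: the
reference is taken and released in one place, so callers never have to
guess from the error code whether a put is needed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: a refcount and an unloading flag. */
struct fake_module {
	int refcnt;
	bool unloading;
};

/* Mimics the contract of try_module_get(): fails once unloading has begun. */
static bool fake_try_module_get(struct fake_module *m)
{
	if (m == NULL)
		return true;	/* built-in code: nothing to pin */
	if (m->unloading)
		return false;
	m->refcnt++;
	return true;
}

static void fake_module_put(struct fake_module *m)
{
	if (m != NULL)
		m->refcnt--;
}

/* Hypothetical dump control that also names the owner of the callback. */
struct fake_dump_control {
	int (*dump)(void *arg);
	struct fake_module *module;
};

struct fake_socket {
	struct fake_dump_control cb;
	bool cb_installed;
};

/*
 * Central place for the module handling: take the reference before the
 * callback is installed, and drop it again on every failure path.
 */
static int fake_dump_start(struct fake_socket *sk,
			   const struct fake_dump_control *c)
{
	if (!fake_try_module_get(c->module))
		return -EPROTONOSUPPORT;

	if (sk->cb_installed) {
		/* A dump is already in progress: callback not installed. */
		fake_module_put(c->module);
		return -EBUSY;
	}

	sk->cb = *c;
	sk->cb_installed = true;
	return 0;
}

/* Runs when the dump finishes: the only place the reference is released. */
static void fake_dump_done(struct fake_socket *sk)
{
	if (!sk->cb_installed)
		return;
	sk->cb_installed = false;
	fake_module_put(sk->cb.module);
}
```

With this shape a caller only passes its owner in the control structure;
it never calls the get/put helpers itself.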

* Re: [PATCH v4] lxt PHY: Support for the buggy LXT973 rev A2
From: leroy christophe @ 2012-09-25  6:23 UTC
  To: Richard Cochran; +Cc: David S Miller, netdev, linux-kernel
In-Reply-To: <20120924183035.GA2252@netboy.at.omicron.at>


On 24/09/2012 20:30, Richard Cochran wrote:
> On Mon, Sep 24, 2012 at 04:00:58PM +0200, Christophe Leroy wrote:
>
>> diff -u a/drivers/net/phy/lxt.c b/drivers/net/phy/lxt.c
>> --- a/drivers/net/phy/lxt.c	2012-09-23 03:08:48.000000000 +0200
>> +++ b/drivers/net/phy/lxt.c	2012-09-23 03:18:00.000000000 +0200
> ...
>
>> @@ -175,6 +292,16 @@
>>   	.driver		= { .owner = THIS_MODULE,},
>>   }, {
>>   	.phy_id		= 0x00137a10,
>> +	.name		= "LXT973-A2",
>> +	.phy_id_mask	= 0xffffffff,
>> +	.features	= PHY_BASIC_FEATURES,
>> +	.flags		= 0,
>> +	.probe		= lxt973_probe,
>> +	.config_aneg	= lxt973_config_aneg,
>> +	.read_status	= lxt973a2_read_status,
> I like this way of matching the A2 chips much better than what you had
> before. But are you sure this will work correctly?
Apparently it does.
>
> What do A3 chips have in the last nibble of phy_id?

A2 chip has phy_id 0x00137a10
A3 chip has phy_id 0x00137a11

Christophe
>
>> +	.driver		= { .owner = THIS_MODULE,},
>> +}, {
>> +	.phy_id		= 0x00137a10,
>>   	.name		= "LXT973",
>>   	.phy_id_mask	= 0xfffffff0,
>>   	.features	= PHY_BASIC_FEATURES,
> Thanks,
> Richard
>
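
To make the matching question concrete, here is a small user-space
sketch of a masked phy_id comparison using the two IDs given above. It
assumes that a driver entry matches when (id & mask) equals
(phy_id & mask) and that the first matching entry in table order wins;
the struct, table and function names are made up for the illustration
and are not the PHY core's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a PHY driver entry. */
struct phy_entry {
	uint32_t phy_id;
	uint32_t phy_id_mask;
	const char *name;
};

/* Same order as in the patch: the exact A2 match comes first. */
static const struct phy_entry lxt_table[] = {
	{ 0x00137a10, 0xffffffff, "LXT973-A2" },	/* revision A2 only */
	{ 0x00137a10, 0xfffffff0, "LXT973" },		/* any other revision */
};

/* Returns the name of the first entry whose masked ID matches, or NULL. */
static const char *match_phy(uint32_t id)
{
	size_t i;

	for (i = 0; i < sizeof(lxt_table) / sizeof(lxt_table[0]); i++) {
		const struct phy_entry *e = &lxt_table[i];

		if ((id & e->phy_id_mask) == (e->phy_id & e->phy_id_mask))
			return e->name;
	}
	return NULL;
}
```

Under these assumptions 0x00137a10 selects "LXT973-A2" and 0x00137a11
falls through to the generic "LXT973" entry. If the two entries were
swapped, the generic entry would also claim the A2 chip, so the result
depends on the order in which the entries are tried.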

* [PATCH net-next] net: raw: revert unrelated change
From: Eric Dumazet @ 2012-09-25  6:21 UTC
  To: David Miller; +Cc: sfr, netdev, linux-next, linux-kernel
In-Reply-To: <1348550602.26828.1918.camel@edumazet-glaptop>

From: Eric Dumazet <edumazet@google.com>

Commit 5640f7685831 ("net: use a per task frag allocator")
accidentally contained an unrelated change to net/ipv4/raw.c,
later committed (without the pr_err() debugging bits) in
net tree as commit ab43ed8b749 (ipv4: raw: fix icmp_filter())

This patch reverts this glitch, noticed by Stephen Rothwell.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/raw.c |   19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index a80740b..f242578 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -131,23 +131,18 @@ found:
  *	0 - deliver
  *	1 - block
  */
-static int icmp_filter(const struct sock *sk, const struct sk_buff *skb)
+static __inline__ int icmp_filter(struct sock *sk, struct sk_buff *skb)
 {
-	struct icmphdr _hdr;
-	const struct icmphdr *hdr;
-
-	pr_err("icmp_filter skb_transport_offset %d data-head %ld len %d/%d\n",
-		skb_transport_offset(skb), skb->data - skb->head, skb->len, skb->data_len);
-	hdr = skb_header_pointer(skb, skb_transport_offset(skb),
-				 sizeof(_hdr), &_hdr);
-	pr_err("head %p data %p hdr %p type %d\n", skb->head, skb->data, hdr, hdr ? hdr->type : -1);
-	if (!hdr)
+	int type;
+
+	if (!pskb_may_pull(skb, sizeof(struct icmphdr)))
 		return 1;
 
-	if (hdr->type < 32) {
+	type = icmp_hdr(skb)->type;
+	if (type < 32) {
 		__u32 data = raw_sk(sk)->filter.data;
 
-		return ((1U << hdr->type) & data) != 0;
+		return ((1 << type) & data) != 0;
 	}
 
 	/* Do not block unknown ICMP types */

* Re: [PATCH] team: fix return value check
From: Jiri Pirko @ 2012-09-25  5:40 UTC
  To: Wei Yongjun; +Cc: jpirko, yongjun_wei, netdev
In-Reply-To: <CAPgLHd9GYYu21FhiKAr2MvmYtWbL=85-E0EzJmiJm6cDDp=WcA@mail.gmail.com>

Tue, Sep 25, 2012 at 06:29:35AM CEST, weiyj.lk@gmail.com wrote:
>From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>
>In case of error, the function genlmsg_put() returns NULL pointer
>not ERR_PTR(). The IS_ERR() test in the return value check should
>be replaced with NULL test.
>
>dpatch engine is used to auto generate this patch.
>(https://github.com/weiyj/dpatch)
>
>Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>---
> drivers/net/team/team.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
>index 341b65d..e19da26 100644
>--- a/drivers/net/team/team.c
>+++ b/drivers/net/team/team.c
>@@ -1652,8 +1652,8 @@ static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
> 
> 	hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
> 			  &team_nl_family, 0, TEAM_CMD_NOOP);
>-	if (IS_ERR(hdr)) {
>-		err = PTR_ERR(hdr);
>+	if (!hdr) {
>+		err = -EMSGSIZE;
> 		goto err_msg_put;
> 	}
> 
>@@ -1847,8 +1847,8 @@ start_again:
> 
> 	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags | NLM_F_MULTI,
> 			  TEAM_CMD_OPTIONS_GET);
>-	if (IS_ERR(hdr))
>-		return PTR_ERR(hdr);
>+	if (!hdr)
>+		return -EMSGSIZE;
> 
> 	if (nla_put_u32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex))
> 		goto nla_put_failure;
>@@ -2067,8 +2067,8 @@ static int team_nl_fill_port_list_get(struct sk_buff *skb,
> 
> 	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
> 			  TEAM_CMD_PORT_LIST_GET);
>-	if (IS_ERR(hdr))
>-		return PTR_ERR(hdr);
>+	if (!hdr)
>+		return -EMSGSIZE;
> 
> 	if (nla_put_u32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex))
> 		goto nla_put_failure;
>
>
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html


Acked-by: Jiri Pirko <jiri@resnulli.us>

* Re: [PATCH] tcp: sysctl for initial receive window
From: Jan Engelhardt @ 2012-09-25  5:29 UTC
  To: Jesper Dangaard Brouer; +Cc: netdev, Nandita Dukkipati, Eric Dumazet
In-Reply-To: <20120921085502.4534.20232.stgit@dragon>



On Friday 2012-09-21 10:55, Jesper Dangaard Brouer wrote:
>diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
>index c7fc107..684131c 100644
>--- a/Documentation/networking/ip-sysctl.txt
>+++ b/Documentation/networking/ip-sysctl.txt
>@@ -257,6 +257,18 @@ tcp_frto_response - INTEGER
> 		  to the values prior timeout
> 	Default: 0 (rate halving based)
> 
>+tcp_init_recv_window - INTEGER
>+	Default initial advertised receive window.  Actual window size
>+	is this value multiplied by the MSS of the connection.  Its

	is this value multiplied by the MSS of the connection.  It is

>+	possible to control/override this value per route table entry
>+	via the iproute2 option initrwnd.
>+	Minimum value is 1, but 2 is the recommended minimum.
>+	The effective max value, is limited by the sockets receive

	The effective max value is limited by the sockets receive

>+	buffer size (default tcp_rmem[1], and possibly scaled by
>+	tcp_adv_win_scale), and can further be limited by window

	tcp_adv_win_scale) and can further be limited by window

>+	clamp.

	clamping.

>+	Default: 10
>+
> tcp_keepalive_time - INTEGER
> 	How often TCP sends out keepalive messages when keepalive is enabled.
> 	Default: 2hours.

The "recommended minimum" is somewhat strange from a language POV,
since the recommendation is actually to _not touch_ the option at all
(because the default works and there is potential abuse as Dave
mentions).
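
For readers trying to picture what the documented value means, the
arithmetic in the proposed text can be written down as below. This is a
deliberately simplified sketch, not the kernel's window selection code:
it only applies "value multiplied by MSS, capped by the receive buffer",
and it ignores tcp_adv_win_scale and the window clamp. The function and
parameter names are made up for the example.

```c
#include <stdint.h>

/*
 * Simplified model of the documented behaviour:
 *   window = init_rwnd_segments * mss, but never more than the space
 *   the receive buffer can offer.
 * tcp_adv_win_scale and the window clamp are intentionally left out.
 */
static uint32_t initial_rcv_window(uint32_t init_rwnd_segments,
				   uint32_t mss,
				   uint32_t rcvbuf_space)
{
	/* 64-bit product so large settings cannot overflow. */
	uint64_t wanted = (uint64_t)init_rwnd_segments * mss;

	return wanted < rcvbuf_space ? (uint32_t)wanted : rcvbuf_space;
}
```

With the proposed default of 10 and an MSS of 1460, this gives 14600
bytes. A setting of 100 with an assumed 87380 byte receive buffer is
capped at 87380.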

* Re: linux-next: manual merge of the net-next tree with the net tree
From: Eric Dumazet @ 2012-09-25  5:23 UTC
  To: David Miller; +Cc: sfr, netdev, linux-next, linux-kernel, edumazet
In-Reply-To: <20120925.011343.1495275645575636949.davem@davemloft.net>

On Tue, 2012-09-25 at 01:13 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 25 Sep 2012 07:10:42 +0200
> 
> > Oops, my bad, net/ipv4/raw.c changes in 5640f7685831 ("net: use a per
> > task frag allocator") should not be there :
> > 
> > I accidentally left a debugging version of the patch I sent to fix the
> > icmp bug.
> > 
> > Sorry David for this, I am not sure how I can help on this ?
> 
> The thing to do is send me a patch to revert the raw.c change from
> net-next, right?

Sure, I'll do that after my breakfast and some coffee ;)

* Re: [RFC] gre: conform to RFC6040 ECN progogation
From: Eric Dumazet @ 2012-09-25  5:17 UTC
  To: Stephen Hemminger; +Cc: Chris Wright, David Miller, netdev
In-Reply-To: <20120924153013.553f0b76@nehalam.linuxnetplumber.net>

On Mon, 2012-09-24 at 15:30 -0700, Stephen Hemminger wrote:

> Logging is a bad idea in this case since the tunnel might be from a remote
> host/protocol and the log would be filled with crap.
> 
> The tunnels in general do need to have rx_dropped counter, but it looks
> like that isn't being done right either.
> 

I never suggested logging _all_ messages.


Stephen, I personally was hit by some provider playing with TOS bits so
wrong that I had to patch linux to fix a minor problem. [1]

I would like to know why my tunnels are going to fail, and what I should
do to get a fallback, i.e. reverting your patches.


RFC 6040 states :

      In these cases,
      particularly the more dangerous ones, the decapsulator SHOULD log
      the event and MAY also raise an alarm.

      Just because the highlighted combinations are currently unused,
      does not mean that all the other combinations are always valid.
      Some are only valid if they have arrived from a particular type of
      legacy ingress, and dangerous otherwise.  Therefore, an
      implementation MAY allow an operator to configure logging and
      alarms for such additional header combinations known to be
      dangerous or CU for the particular configuration of tunnel
      endpoints deployed at run-time.

      Alarms SHOULD be rate-limited so that the anomalous combinations
      will not amplify into a flood of alarm messages.  It MUST be
      possible to suppress alarms or logging, e.g., if it becomes
      apparent that a combination that previously was not used has
      started to be used for legitimate purposes such as a new standards
      action.
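
The "rate-limited, and possible to suppress" requirement quoted above
could look roughly like the following user-space sketch. It is a plain
fixed-window limiter written for illustration; the struct and function
names are hypothetical, time is passed in by the caller in arbitrary
units, and it is not the kernel's ratelimit helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-tunnel alarm limiter. */
struct alarm_ratelimit {
	uint64_t interval;	/* window length, arbitrary time units */
	unsigned int burst;	/* alarms allowed per window */
	uint64_t window_start;	/* start of the current window */
	unsigned int emitted;	/* alarms emitted in this window */
	unsigned int suppressed;/* alarms dropped by the limiter */
	bool enabled;		/* operator switch: false silences all alarms */
};

/* Returns true when the caller may log an alarm at time 'now'. */
static bool alarm_allowed(struct alarm_ratelimit *rl, uint64_t now)
{
	if (!rl->enabled)
		return false;

	/* Start a new window once the current one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->emitted = 0;
	}

	if (rl->emitted < rl->burst) {
		rl->emitted++;
		return true;
	}

	rl->suppressed++;
	return false;
}
```

A remote peer sending bad combinations can then fill at most 'burst'
log lines per interval, and the operator can turn the alarm off entirely
if the combination turns out to be legitimate.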


[1]
commit 7a269ffad72f3604b8982fa09c387670e0d2ee14
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Thu Sep 22 20:02:19 2011 +0000

    tcp: ECN blackhole should not force quickack mode
    
    While playing with a new ADSL box at home, I discovered that ECN
    blackhole can trigger suboptimal quickack mode on linux : We send one
    ACK for each incoming data frame, without any delay and eventual
    piggyback.
    
    This is because TCP_ECN_check_ce() considers that if no ECT is seen on a
    segment, this is because this segment was a retransmit.
    
    Refine this heuristic and apply it only if we seen ECT in a previous
    segment, to detect ECN blackhole at IP level.

* Re: linux-next: manual merge of the net-next tree with the net tree
From: David Miller @ 2012-09-25  5:13 UTC
  To: eric.dumazet; +Cc: sfr, netdev, linux-next, linux-kernel, edumazet
In-Reply-To: <1348549842.26828.1897.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 25 Sep 2012 07:10:42 +0200

> Oops, my bad, net/ipv4/raw.c changes in 5640f7685831 ("net: use a per
> task frag allocator") should not be there :
> 
> I accidentally left a debugging version of the patch I sent to fix the
> icmp bug.
> 
> Sorry David for this, I am not sure how I can help on this ?

The thing to do is send me a patch to revert the raw.c change from
net-next, right?

* Re: linux-next: manual merge of the net-next tree with the net tree
From: Eric Dumazet @ 2012-09-25  5:10 UTC
  To: Stephen Rothwell
  Cc: David Miller, netdev, linux-next, linux-kernel, Eric Dumazet
In-Reply-To: <20120925123438.bbf3126889513ee5cc195e5c@canb.auug.org.au>

On Tue, 2012-09-25 at 12:34 +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the net-next tree got a conflict in
> net/ipv4/raw.c between commit ab43ed8b7490 ("ipv4: raw: fix icmp_filter
> ()") from the net tree and commit 5640f7685831 ("net: use a per task frag
> allocator") from the net-next tree.
> 
> They are basically the same patch (for this file) except the net-next
> version adds two pr_err() calls. I used the net-next version and can carry
> the fix as necessary (no action is required).
> 
> I do wonder if this change belongs in the net-next patch?

Oops, my bad, net/ipv4/raw.c changes in 5640f7685831 ("net: use a per
task frag allocator") should not be there :

I accidentally left a debugging version of the patch I sent to fix the
icmp bug.

Sorry David for this, I am not sure how I can help on this ?

* [PATCH] team: fix return value check
From: Wei Yongjun @ 2012-09-25  4:29 UTC
  To: jpirko; +Cc: yongjun_wei, netdev

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

In case of error, the function genlmsg_put() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
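
Why the old check could never fire can be shown with a user-space
re-creation of the error-pointer helpers. The sketch below is an
illustration only: the three helpers are simplified copies of the
well-known kernel idiom (error codes live in the top 4095 values of the
address space), and fake_msg_put() is a hypothetical stand-in for a
function that, like genlmsg_put(), reports failure by returning NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Only the last MAX_ERRNO addresses count as errors; NULL does not. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in: returns NULL when the message does not fit. */
static void *fake_msg_put(char *buf, size_t room, size_t need)
{
	return need <= room ? buf : NULL;
}

/* Old pattern: silently treats a NULL header as success. */
static int check_old(void *hdr)
{
	if (IS_ERR(hdr))
		return (int)PTR_ERR(hdr);
	return 0;
}

/* Pattern used by the patch: NULL means the header did not fit. */
static int check_new(void *hdr)
{
	if (!hdr)
		return -EMSGSIZE;
	return 0;
}
```

With a failing put, check_old() returns 0 and the caller would go on to
use a NULL header, while check_new() returns -EMSGSIZE.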

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 drivers/net/team/team.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 341b65d..e19da26 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1652,8 +1652,8 @@ static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
 
 	hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
 			  &team_nl_family, 0, TEAM_CMD_NOOP);
-	if (IS_ERR(hdr)) {
-		err = PTR_ERR(hdr);
+	if (!hdr) {
+		err = -EMSGSIZE;
 		goto err_msg_put;
 	}
 
@@ -1847,8 +1847,8 @@ start_again:
 
 	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags | NLM_F_MULTI,
 			  TEAM_CMD_OPTIONS_GET);
-	if (IS_ERR(hdr))
-		return PTR_ERR(hdr);
+	if (!hdr)
+		return -EMSGSIZE;
 
 	if (nla_put_u32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex))
 		goto nla_put_failure;
@@ -2067,8 +2067,8 @@ static int team_nl_fill_port_list_get(struct sk_buff *skb,
 
 	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
 			  TEAM_CMD_PORT_LIST_GET);
-	if (IS_ERR(hdr))
-		return PTR_ERR(hdr);
+	if (!hdr)
+		return -EMSGSIZE;
 
 	if (nla_put_u32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex))
 		goto nla_put_failure;

* [PATCH] l2tp: fix return value check
From: Wei Yongjun @ 2012-09-25  4:29 UTC
  To: davem; +Cc: yongjun_wei, netdev

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

In case of error, the function genlmsg_put() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 net/l2tp/l2tp_netlink.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index d71cd92..6f93635 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -80,8 +80,8 @@ static int l2tp_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
 
 	hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
 			  &l2tp_nl_family, 0, L2TP_CMD_NOOP);
-	if (IS_ERR(hdr)) {
-		ret = PTR_ERR(hdr);
+	if (!hdr) {
+		ret = -EMSGSIZE;
 		goto err_out;
 	}
 
@@ -250,8 +250,8 @@ static int l2tp_nl_tunnel_send(struct sk_buff *skb, u32 pid, u32 seq, int flags,
 
 	hdr = genlmsg_put(skb, pid, seq, &l2tp_nl_family, flags,
 			  L2TP_CMD_TUNNEL_GET);
-	if (IS_ERR(hdr))
-		return PTR_ERR(hdr);
+	if (!hdr)
+		return -EMSGSIZE;
 
 	if (nla_put_u8(skb, L2TP_ATTR_PROTO_VERSION, tunnel->version) ||
 	    nla_put_u32(skb, L2TP_ATTR_CONN_ID, tunnel->tunnel_id) ||
@@ -617,8 +617,8 @@ static int l2tp_nl_session_send(struct sk_buff *skb, u32 pid, u32 seq, int flags
 	sk = tunnel->sock;
 
 	hdr = genlmsg_put(skb, pid, seq, &l2tp_nl_family, flags, L2TP_CMD_SESSION_GET);
-	if (IS_ERR(hdr))
-		return PTR_ERR(hdr);
+	if (!hdr)
+		return -EMSGSIZE;
 
 	if (nla_put_u32(skb, L2TP_ATTR_CONN_ID, tunnel->tunnel_id) ||
 	    nla_put_u32(skb, L2TP_ATTR_SESSION_ID, session->session_id) ||

* [PATCH net-next 4/4] tunnel: drop packet if ECN present with not-ECT
From: Stephen Hemminger @ 2012-09-25  4:12 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20120925041222.056704869@vyatta.com>

[-- Attachment #1: gre-rfc-6040.patch --]
[-- Type: text/plain, Size: 7420 bytes --]

Linux tunnels were written before RFC6040 and therefore never
implemented the corner case of ECN getting set in the outer header
and the inner header not being ready for it.

Section 4.2.  Default Tunnel Egress Behaviour.
 The new code addresses:
 o If the inner ECN field is Not-ECT, the decapsulator MUST NOT
      propagate any other ECN codepoint onwards.  This is because the
      inner Not-ECT marking is set by transports that rely on dropped
      packets as an indication of congestion and would not understand or
      respond to any other ECN codepoint [RFC4774].  Specifically:

      *  If the inner ECN field is Not-ECT and the outer ECN field is
         CE, the decapsulator MUST drop the packet.

      *  If the inner ECN field is Not-ECT and the outer ECN field is
         Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the
         outgoing packet with the ECN field cleared to Not-ECT.

Overloads rx_frame_errors to keep track of this error.
This was caught by Chris Wright while reviewing the new VXLAN tunnel.
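
The rule quoted above can be modelled in a few lines of user-space C.
This is a sketch of the behaviour described in this message only, with
hypothetical names; it works on a bare TOS/traffic-class byte and does
not touch checksums. It covers the two quoted bullets plus the
pre-existing "outer CE marks the inner header CE" case, not the complete
RFC 6040 egress table.

```c
#include <stdbool.h>
#include <stdint.h>

/* ECN codepoints in the two low bits of the TOS / traffic class byte. */
enum {
	ECN_NOT_ECT = 0,
	ECN_ECT_1   = 1,
	ECN_ECT_0   = 2,
	ECN_CE      = 3,
	ECN_MASK    = 3,
};

/*
 * Returns false when the packet must be dropped, true when it may be
 * forwarded (with *inner_tos possibly updated).
 */
static bool ecn_decapsulate(uint8_t outer_tos, uint8_t *inner_tos)
{
	/* Outer header carries no congestion mark: inner field is kept. */
	if ((outer_tos & ECN_MASK) != ECN_CE)
		return true;

	/* Outer CE but inner Not-ECT: the transport cannot react, so drop. */
	if ((*inner_tos & ECN_MASK) == ECN_NOT_ECT)
		return false;

	/* Outer CE and inner ECN-capable: propagate the mark. */
	*inner_tos |= ECN_CE;
	return true;
}
```

Leaving the inner field untouched when the outer field is not CE also
satisfies the second bullet: an inner Not-ECT packet is forwarded as
Not-ECT whatever ECT value the outer header carried.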

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
Note: supersedes the earlier patch, which only covered IPv4 GRE.

 net/ipv4/ip_gre.c            |   23 +++++++++++------
 net/ipv4/ipip.c              |   19 ++++++++------
 net/ipv4/xfrm4_mode_tunnel.c |   12 +++++---
 net/ipv6/ip6_gre.c           |   58 +++++++++++++++++++++++--------------------
 4 files changed, 67 insertions(+), 45 deletions(-)

--- a/net/ipv4/ip_gre.c	2012-09-24 18:11:52.000000000 -0700
+++ b/net/ipv4/ip_gre.c	2012-09-24 18:13:13.098719907 -0700
@@ -204,7 +204,9 @@ static struct rtnl_link_stats64 *ipgre_g
 	tot->rx_crc_errors = dev->stats.rx_crc_errors;
 	tot->rx_fifo_errors = dev->stats.rx_fifo_errors;
 	tot->rx_length_errors = dev->stats.rx_length_errors;
+	tot->rx_frame_errors = dev->stats.rx_frame_errors;
 	tot->rx_errors = dev->stats.rx_errors;
+
 	tot->tx_fifo_errors = dev->stats.tx_fifo_errors;
 	tot->tx_carrier_errors = dev->stats.tx_carrier_errors;
 	tot->tx_dropped = dev->stats.tx_dropped;
@@ -587,15 +589,16 @@ static void ipgre_err(struct sk_buff *sk
 	t->err_time = jiffies;
 }
 
-static inline void ipgre_ecn_decapsulate(const struct iphdr *iph, struct sk_buff *skb)
+static int ipgre_ecn_decapsulate(const struct iphdr *iph, struct sk_buff *skb)
 {
 	if (INET_ECN_is_ce(iph->tos)) {
 		if (skb->protocol == htons(ETH_P_IP)) {
-			IP_ECN_set_ce(ip_hdr(skb));
+			return IP_ECN_set_ce(ip_hdr(skb));
 		} else if (skb->protocol == htons(ETH_P_IPV6)) {
-			IP6_ECN_set_ce(ipv6_hdr(skb));
+			return IP6_ECN_set_ce(ipv6_hdr(skb));
 		}
 	}
+	return 1;
 }
 
 static inline u8
@@ -723,17 +726,21 @@ static int ipgre_rcv(struct sk_buff *skb
 			skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 		}
 
+		__skb_tunnel_rx(skb, tunnel->dev);
+
+		skb_reset_network_header(skb);
+		if (!ipgre_ecn_decapsulate(iph, skb)) {
+			++tunnel->dev->stats.rx_frame_errors;
+			++tunnel->dev->stats.rx_errors;
+			goto drop;
+		}
+
 		tstats = this_cpu_ptr(tunnel->dev->tstats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->rx_packets++;
 		tstats->rx_bytes += skb->len;
 		u64_stats_update_end(&tstats->syncp);
 
-		__skb_tunnel_rx(skb, tunnel->dev);
-
-		skb_reset_network_header(skb);
-		ipgre_ecn_decapsulate(iph, skb);
-
 		netif_rx(skb);
 
 		return 0;
--- a/net/ipv4/ipip.c	2012-09-24 18:12:33.000000000 -0700
+++ b/net/ipv4/ipip.c	2012-09-24 18:13:13.098719907 -0700
@@ -400,13 +400,14 @@ out:
 	return err;
 }
 
-static inline void ipip_ecn_decapsulate(const struct iphdr *outer_iph,
-					struct sk_buff *skb)
+static int ipip_ecn_decapsulate(const struct iphdr *outer_iph,
+				struct sk_buff *skb)
 {
 	struct iphdr *inner_iph = ip_hdr(skb);
 
 	if (INET_ECN_is_ce(outer_iph->tos))
-		IP_ECN_set_ce(inner_iph);
+		return IP_ECN_set_ce(inner_iph);
+	return 1;
 }
 
 static int ipip_rcv(struct sk_buff *skb)
@@ -430,16 +431,20 @@ static int ipip_rcv(struct sk_buff *skb)
 		skb->protocol = htons(ETH_P_IP);
 		skb->pkt_type = PACKET_HOST;
 
+		__skb_tunnel_rx(skb, tunnel->dev);
+
+		if (!ipip_ecn_decapsulate(iph, skb)) {
+			++tunnel->dev->stats.rx_frame_errors;
+			++tunnel->dev->stats.rx_errors;
+			return 0;
+		}
+
 		tstats = this_cpu_ptr(tunnel->dev->tstats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->rx_packets++;
 		tstats->rx_bytes += skb->len;
 		u64_stats_update_end(&tstats->syncp);
 
-		__skb_tunnel_rx(skb, tunnel->dev);
-
-		ipip_ecn_decapsulate(iph, skb);
-
 		netif_rx(skb);
 		return 0;
 	}
--- a/net/ipv6/ip6_gre.c	2012-09-24 18:12:00.783449055 -0700
+++ b/net/ipv6/ip6_gre.c	2012-09-24 18:13:13.098719907 -0700
@@ -149,7 +149,9 @@ static struct rtnl_link_stats64 *ip6gre_
 	tot->rx_crc_errors = dev->stats.rx_crc_errors;
 	tot->rx_fifo_errors = dev->stats.rx_fifo_errors;
 	tot->rx_length_errors = dev->stats.rx_length_errors;
+	tot->rx_frame_errors = dev->stats.rx_frame_errors;
 	tot->rx_errors = dev->stats.rx_errors;
+
 	tot->tx_fifo_errors = dev->stats.tx_fifo_errors;
 	tot->tx_carrier_errors = dev->stats.tx_carrier_errors;
 	tot->tx_dropped = dev->stats.tx_dropped;
@@ -489,26 +491,28 @@ static void ip6gre_err(struct sk_buff *s
 	t->err_time = jiffies;
 }
 
-static inline void ip6gre_ecn_decapsulate_ipv4(const struct ip6_tnl *t,
-		const struct ipv6hdr *ipv6h, struct sk_buff *skb)
-{
-	__u8 dsfield = ipv6_get_dsfield(ipv6h) & ~INET_ECN_MASK;
-
-	if (t->parms.flags & IP6_TNL_F_RCV_DSCP_COPY)
-		ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dsfield);
-
-	if (INET_ECN_is_ce(dsfield))
-		IP_ECN_set_ce(ip_hdr(skb));
-}
+static int ip6gre_ecn_decapsulate(const struct ip6_tnl *t,
+				  const struct ipv6hdr *ipv6h,
+				  struct sk_buff *skb)
+{
+	__u8 dsfield = ipv6_get_dsfield(ipv6h);
+	if (skb->protocol == htons(ETH_P_IP)) {
+		dsfield  &= ~INET_ECN_MASK;
+
+		if (t->parms.flags & IP6_TNL_F_RCV_DSCP_COPY)
+			ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK,
+					    dsfield);
+		if (INET_ECN_is_ce(dsfield))
+			return IP_ECN_set_ce(ip_hdr(skb));
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		if (t->parms.flags & IP6_TNL_F_RCV_DSCP_COPY)
+			ipv6_copy_dscp(dsfield, ipv6_hdr(skb));
 
-static inline void ip6gre_ecn_decapsulate_ipv6(const struct ip6_tnl *t,
-		const struct ipv6hdr *ipv6h, struct sk_buff *skb)
-{
-	if (t->parms.flags & IP6_TNL_F_RCV_DSCP_COPY)
-		ipv6_copy_dscp(ipv6_get_dsfield(ipv6h), ipv6_hdr(skb));
+		if (INET_ECN_is_ce(ipv6_get_dsfield(ipv6h)))
+			return IP6_ECN_set_ce(ipv6_hdr(skb));
+	}
 
-	if (INET_ECN_is_ce(ipv6_get_dsfield(ipv6h)))
-		IP6_ECN_set_ce(ipv6_hdr(skb));
+	return 1;
 }
 
 static int ip6gre_rcv(struct sk_buff *skb)
@@ -625,20 +629,22 @@ static int ip6gre_rcv(struct sk_buff *sk
 			skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 		}
 
+		__skb_tunnel_rx(skb, tunnel->dev);
+
+		skb_reset_network_header(skb);
+
+		if (!ip6gre_ecn_decapsulate(tunnel, ipv6h, skb)) {
+			++tunnel->dev->stats.rx_frame_errors;
+			++tunnel->dev->stats.rx_errors;
+			goto drop;
+		}
+
 		tstats = this_cpu_ptr(tunnel->dev->tstats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->rx_packets++;
 		tstats->rx_bytes += skb->len;
 		u64_stats_update_end(&tstats->syncp);
 
-		__skb_tunnel_rx(skb, tunnel->dev);
-
-		skb_reset_network_header(skb);
-		if (skb->protocol == htons(ETH_P_IP))
-			ip6gre_ecn_decapsulate_ipv4(tunnel, ipv6h, skb);
-		else if (skb->protocol == htons(ETH_P_IPV6))
-			ip6gre_ecn_decapsulate_ipv6(tunnel, ipv6h, skb);
-
 		netif_rx(skb);
 
 		return 0;

* [PATCH net-next 3/4] xfrm: remove extranous rcu_read_lock
From: Stephen Hemminger @ 2012-09-25  4:12 UTC
  To: davem; +Cc: netdev
In-Reply-To: <20120925041222.056704869@vyatta.com>

[-- Attachment #1: ipip-no-rcu.patch --]
[-- Type: text/plain, Size: 3493 bytes --]

The handlers for xfrm_tunnel are always invoked with the RCU read lock
already held.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 net/ipv4/ip_vti.c |    5 -----
 net/ipv4/ipip.c   |    9 +--------
 net/ipv6/sit.c    |    6 ------
 3 files changed, 1 insertion(+), 19 deletions(-)

--- a/net/ipv4/ip_vti.c	2012-08-15 08:59:22.958704223 -0700
+++ b/net/ipv4/ip_vti.c	2012-09-24 18:01:20.521904773 -0700
@@ -304,7 +304,6 @@ static int vti_err(struct sk_buff *skb,
 
 	err = -ENOENT;
 
-	rcu_read_lock();
 	t = vti_tunnel_lookup(dev_net(skb->dev), iph->daddr, iph->saddr);
 	if (t == NULL)
 		goto out;
@@ -326,7 +325,6 @@ static int vti_err(struct sk_buff *skb,
 		t->err_count = 1;
 	t->err_time = jiffies;
 out:
-	rcu_read_unlock();
 	return err;
 }
 
@@ -336,7 +334,6 @@ static int vti_rcv(struct sk_buff *skb)
 	struct ip_tunnel *tunnel;
 	const struct iphdr *iph = ip_hdr(skb);
 
-	rcu_read_lock();
 	tunnel = vti_tunnel_lookup(dev_net(skb->dev), iph->saddr, iph->daddr);
 	if (tunnel != NULL) {
 		struct pcpu_tstats *tstats;
@@ -348,10 +345,8 @@ static int vti_rcv(struct sk_buff *skb)
 		u64_stats_update_end(&tstats->syncp);
 
 		skb->dev = tunnel->dev;
-		rcu_read_unlock();
 		return 1;
 	}
-	rcu_read_unlock();
 
 	return -1;
 }
--- a/net/ipv4/ipip.c	2012-09-24 17:59:26.127058210 -0700
+++ b/net/ipv4/ipip.c	2012-09-24 18:00:28.410430210 -0700
@@ -365,8 +365,6 @@ static int ipip_err(struct sk_buff *skb,
 	}
 
 	err = -ENOENT;
-
-	rcu_read_lock();
 	t = ipip_tunnel_lookup(dev_net(skb->dev), iph->daddr, iph->saddr);
 	if (t == NULL)
 		goto out;
@@ -398,7 +396,7 @@ static int ipip_err(struct sk_buff *skb,
 		t->err_count = 1;
 	t->err_time = jiffies;
 out:
-	rcu_read_unlock();
+
 	return err;
 }
 
@@ -416,13 +414,11 @@ static int ipip_rcv(struct sk_buff *skb)
 	struct ip_tunnel *tunnel;
 	const struct iphdr *iph = ip_hdr(skb);
 
-	rcu_read_lock();
 	tunnel = ipip_tunnel_lookup(dev_net(skb->dev), iph->saddr, iph->daddr);
 	if (tunnel != NULL) {
 		struct pcpu_tstats *tstats;
 
 		if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb)) {
-			rcu_read_unlock();
 			kfree_skb(skb);
 			return 0;
 		}
@@ -445,11 +441,8 @@ static int ipip_rcv(struct sk_buff *skb)
 		ipip_ecn_decapsulate(iph, skb);
 
 		netif_rx(skb);
-
-		rcu_read_unlock();
 		return 0;
 	}
-	rcu_read_unlock();
 
 	return -1;
 }
--- a/net/ipv6/sit.c	2012-08-15 08:59:22.986703941 -0700
+++ b/net/ipv6/sit.c	2012-09-24 18:02:28.753216804 -0700
@@ -545,7 +545,6 @@ static int ipip6_err(struct sk_buff *skb
 
 	err = -ENOENT;
 
-	rcu_read_lock();
 	t = ipip6_tunnel_lookup(dev_net(skb->dev),
 				skb->dev,
 				iph->daddr,
@@ -579,7 +578,6 @@ static int ipip6_err(struct sk_buff *skb
 		t->err_count = 1;
 	t->err_time = jiffies;
 out:
-	rcu_read_unlock();
 	return err;
 }
 
@@ -599,7 +597,6 @@ static int ipip6_rcv(struct sk_buff *skb
 
 	iph = ip_hdr(skb);
 
-	rcu_read_lock();
 	tunnel = ipip6_tunnel_lookup(dev_net(skb->dev), skb->dev,
 				     iph->saddr, iph->daddr);
 	if (tunnel != NULL) {
@@ -615,7 +612,6 @@ static int ipip6_rcv(struct sk_buff *skb
 		if ((tunnel->dev->priv_flags & IFF_ISATAP) &&
 		    !isatap_chksrc(skb, iph, tunnel)) {
 			tunnel->dev->stats.rx_errors++;
-			rcu_read_unlock();
 			kfree_skb(skb);
 			return 0;
 		}
@@ -630,12 +626,10 @@ static int ipip6_rcv(struct sk_buff *skb
 
 		netif_rx(skb);
 
-		rcu_read_unlock();
 		return 0;
 	}
 
 	/* no tunnel matched,  let upstream know, ipsec may handle it */
-	rcu_read_unlock();
 	return 1;
 out:
 	kfree_skb(skb);

^ permalink raw reply

* [PATCH net-next 0/4] Tunnel related patches
From: Stephen Hemminger @ 2012-09-25  4:12 UTC (permalink / raw)
  To: davem; +Cc: netdev

This is a set of small tunnel-related patches.

^ permalink raw reply

* [PATCH net-next 2/4] gre: remove unnecessary rcu_read_lock/unlock
From: Stephen Hemminger @ 2012-09-25  4:12 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <20120925041222.056704869@vyatta.com>

[-- Attachment #1: gre-rcu.patch --]
[-- Type: text/plain, Size: 4043 bytes --]

The gre function pointers for receive and error handling are
always called (from gre.c) with rcu_read_lock already held.
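
As a rough illustration of why the inner lock/unlock pairs are redundant, here is a minimal user-space sketch. Everything in it is hypothetical: the `sketch_*` names and the nesting counter merely stand in for the kernel's RCU read-side primitives and for the dispatcher in gre.c, and only model nesting depth, not real RCU semantics.

```c
/* Hypothetical nesting counter standing in for the RCU read-side depth. */
static int sketch_rcu_depth;

static void sketch_rcu_read_lock(void)   { sketch_rcu_depth++; }
static void sketch_rcu_read_unlock(void) { sketch_rcu_depth--; }

/* Handler type; seen_depth reports the depth observed inside the handler. */
typedef int (*sketch_handler_t)(int *seen_depth);

/* Handler in the style after the patch: relies on the caller's section. */
static int sketch_rcv(int *seen_depth)
{
	*seen_depth = sketch_rcu_depth;
	return 0;
}

/* Handler in the style before the patch: nests its own section. */
static int sketch_rcv_nested(int *seen_depth)
{
	sketch_rcu_read_lock();
	*seen_depth = sketch_rcu_depth;
	sketch_rcu_read_unlock();
	return 0;
}

/* Stand-in for the dispatcher: it always holds the read-side section
 * around the call through the function pointer. */
static int sketch_dispatch(sketch_handler_t handler, int *seen_depth)
{
	int ret;

	sketch_rcu_read_lock();
	ret = handler(seen_depth);
	sketch_rcu_read_unlock();
	return ret;
}
```

In this model both handlers run inside a read-side section when reached through `sketch_dispatch()`; the nested variant only raises the depth from 1 to 2 without protecting anything more.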

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


---
 net/ipv4/ip_gre.c  |   19 ++++++-------------
 net/ipv6/ip6_gre.c |   10 +---------
 2 files changed, 7 insertions(+), 22 deletions(-)

--- a/net/ipv6/ip6_gre.c	2012-09-24 17:33:22.766821472 -0700
+++ b/net/ipv6/ip6_gre.c	2012-09-24 18:12:00.783449055 -0700
@@ -437,14 +437,12 @@ static void ip6gre_err(struct sk_buff *s
 	ipv6h = (const struct ipv6hdr *)skb->data;
 	p = (__be16 *)(skb->data + offset);
 
-	rcu_read_lock();
-
 	t = ip6gre_tunnel_lookup(skb->dev, &ipv6h->daddr, &ipv6h->saddr,
 				flags & GRE_KEY ?
 				*(((__be32 *)p) + (grehlen / 4) - 1) : 0,
 				p[1]);
 	if (t == NULL)
-		goto out;
+		return;
 
 	switch (type) {
 		__u32 teli;
@@ -489,8 +487,6 @@ static void ip6gre_err(struct sk_buff *s
 	else
 		t->err_count = 1;
 	t->err_time = jiffies;
-out:
-	rcu_read_unlock();
 }
 
 static inline void ip6gre_ecn_decapsulate_ipv4(const struct ip6_tnl *t,
@@ -528,7 +524,7 @@ static int ip6gre_rcv(struct sk_buff *sk
 	__be16 gre_proto;
 
 	if (!pskb_may_pull(skb, sizeof(struct in6_addr)))
-		goto drop_nolock;
+		goto drop;
 
 	ipv6h = ipv6_hdr(skb);
 	h = skb->data;
@@ -539,7 +535,7 @@ static int ip6gre_rcv(struct sk_buff *sk
 		   - We do not support routing headers.
 		 */
 		if (flags&(GRE_VERSION|GRE_ROUTING))
-			goto drop_nolock;
+			goto drop;
 
 		if (flags&GRE_CSUM) {
 			switch (skb->ip_summed) {
@@ -567,7 +563,6 @@ static int ip6gre_rcv(struct sk_buff *sk
 
 	gre_proto = *(__be16 *)(h + 2);
 
-	rcu_read_lock();
 	tunnel = ip6gre_tunnel_lookup(skb->dev,
 					  &ipv6h->saddr, &ipv6h->daddr, key,
 					  gre_proto);
@@ -646,14 +641,11 @@ static int ip6gre_rcv(struct sk_buff *sk
 
 		netif_rx(skb);
 
-		rcu_read_unlock();
 		return 0;
 	}
 	icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0);
 
 drop:
-	rcu_read_unlock();
-drop_nolock:
 	kfree_skb(skb);
 	return 0;
 }
--- a/net/ipv4/ip_gre.c	2012-09-24 17:33:33.298715280 -0700
+++ b/net/ipv4/ip_gre.c	2012-09-24 18:11:52.383533752 -0700
@@ -557,37 +557,34 @@ static void ipgre_err(struct sk_buff *sk
 		break;
 	}
 
-	rcu_read_lock();
 	t = ipgre_tunnel_lookup(skb->dev, iph->daddr, iph->saddr,
 				flags, key, p[1]);
 
 	if (t == NULL)
-		goto out;
+		return;
 
 	if (type == ICMP_DEST_UNREACH && code == ICMP_FRAG_NEEDED) {
 		ipv4_update_pmtu(skb, dev_net(skb->dev), info,
 				 t->parms.link, 0, IPPROTO_GRE, 0);
-		goto out;
+		return;
 	}
 	if (type == ICMP_REDIRECT) {
 		ipv4_redirect(skb, dev_net(skb->dev), t->parms.link, 0,
 			      IPPROTO_GRE, 0);
-		goto out;
+		return;
 	}
 	if (t->parms.iph.daddr == 0 ||
 	    ipv4_is_multicast(t->parms.iph.daddr))
-		goto out;
+		return;
 
 	if (t->parms.iph.ttl == 0 && type == ICMP_TIME_EXCEEDED)
-		goto out;
+		return;
 
 	if (time_before(jiffies, t->err_time + IPTUNNEL_ERR_TIMEO))
 		t->err_count++;
 	else
 		t->err_count = 1;
 	t->err_time = jiffies;
-out:
-	rcu_read_unlock();
 }
 
 static inline void ipgre_ecn_decapsulate(const struct iphdr *iph, struct sk_buff *skb)
@@ -625,7 +622,7 @@ static int ipgre_rcv(struct sk_buff *skb
 	__be16 gre_proto;
 
 	if (!pskb_may_pull(skb, 16))
-		goto drop_nolock;
+		goto drop;
 
 	iph = ip_hdr(skb);
 	h = skb->data;
@@ -636,7 +633,7 @@ static int ipgre_rcv(struct sk_buff *skb
 		   - We do not support routing headers.
 		 */
 		if (flags&(GRE_VERSION|GRE_ROUTING))
-			goto drop_nolock;
+			goto drop;
 
 		if (flags&GRE_CSUM) {
 			switch (skb->ip_summed) {
@@ -664,7 +661,6 @@ static int ipgre_rcv(struct sk_buff *skb
 
 	gre_proto = *(__be16 *)(h + 2);
 
-	rcu_read_lock();
 	tunnel = ipgre_tunnel_lookup(skb->dev,
 				     iph->saddr, iph->daddr, flags, key,
 				     gre_proto);
@@ -740,14 +736,11 @@ static int ipgre_rcv(struct sk_buff *skb
 
 		netif_rx(skb);
 
-		rcu_read_unlock();
 		return 0;
 	}
 	icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
 
 drop:
-	rcu_read_unlock();
-drop_nolock:
 	kfree_skb(skb);
 	return 0;
 }

^ permalink raw reply

* [PATCH net-next 1/4] gre: fix handling of key 0
From: Stephen Hemminger @ 2012-09-25  4:12 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <20120925041222.056704869@vyatta.com>

[-- Attachment #1: gre-key0.patch --]
[-- Type: text/plain, Size: 3885 bytes --]

The GRE driver incorrectly treats a key of zero as meaning "no key". Zero is
a perfectly valid key value: packets without a key should match only tunnels
created without a key, and packets with a key should match only tunnels
created with that key.

This change is slightly visible to users, since previously it was possible to
construct a working tunnel that sent key 0 and received only because a key of
zero acted as a wildcard, i.e. the sender sent a key of zero but the
receiving tunnel was defined without a key.

Note: using gre key 0 requires iproute2 utilities v3.2 or later.
The original utility code was broken as well.
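
The matching rule described above can be sketched in user space as follows. This is only an illustration under stated assumptions: `SKETCH_GRE_KEY` and `struct sketch_tunnel_parm` are hypothetical stand-ins for the kernel's `GRE_KEY` flag and `struct ip_tunnel_parm`, and `sketch_key_match_old()` is a simplified model of the previous comparison, in which a missing key was treated as key 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the GRE_KEY flag bit. */
#define SKETCH_GRE_KEY 0x2000u

/* Hypothetical, reduced version of the tunnel parameters. */
struct sketch_tunnel_parm {
	uint16_t i_flags;	/* SKETCH_GRE_KEY set if created with a key */
	uint32_t i_key;		/* configured key; meaningful only with the flag */
};

/* New rule: presence of the key must agree, then the values must agree. */
static bool sketch_key_match(const struct sketch_tunnel_parm *p,
			     uint16_t pkt_flags, uint32_t pkt_key)
{
	if (p->i_flags & SKETCH_GRE_KEY) {
		if (pkt_flags & SKETCH_GRE_KEY)
			return pkt_key == p->i_key;
		return false;	/* key expected, none present */
	}
	return !(pkt_flags & SKETCH_GRE_KEY);
}

/* Simplified model of the old rule: "no key" collapses to key 0. */
static bool sketch_key_match_old(const struct sketch_tunnel_parm *p,
				 uint16_t pkt_flags, uint32_t pkt_key)
{
	uint32_t effective = (pkt_flags & SKETCH_GRE_KEY) ? pkt_key : 0;

	return effective == p->i_key;
}
```

With a keyless tunnel (`i_flags == 0`, `i_key == 0`), a packet carrying key 0 matches under the old model but not under the new rule, which is the user-visible change the description warns about.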

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


--- a/net/ipv4/ip_gre.c	2012-08-15 08:59:22.958704223 -0700
+++ b/net/ipv4/ip_gre.c	2012-09-12 09:40:04.420959235 -0700
@@ -214,11 +214,25 @@ static struct rtnl_link_stats64 *ipgre_g
 	return tot;
 }
 
+/* Does key in tunnel parameters match packet */
+static bool ipgre_key_match(const struct ip_tunnel_parm *p,
+			    __u32 flags, __be32 key)
+{
+	if (p->i_flags & GRE_KEY) {
+		if (flags & GRE_KEY)
+			return key == p->i_key;
+		else
+			return false;	/* key expected, none present */
+	} else
+		return !(flags & GRE_KEY);
+}
+
 /* Given src, dst and key, find appropriate for input tunnel. */
 
 static struct ip_tunnel *ipgre_tunnel_lookup(struct net_device *dev,
 					     __be32 remote, __be32 local,
-					     __be32 key, __be16 gre_proto)
+					     __u32 flags, __be32 key,
+					     __be16 gre_proto)
 {
 	struct net *net = dev_net(dev);
 	int link = dev->ifindex;
@@ -233,10 +247,12 @@ static struct ip_tunnel *ipgre_tunnel_lo
 	for_each_ip_tunnel_rcu(ign->tunnels_r_l[h0 ^ h1]) {
 		if (local != t->parms.iph.saddr ||
 		    remote != t->parms.iph.daddr ||
-		    key != t->parms.i_key ||
 		    !(t->dev->flags & IFF_UP))
 			continue;
 
+		if (!ipgre_key_match(&t->parms, flags, key))
+			continue;
+
 		if (t->dev->type != ARPHRD_IPGRE &&
 		    t->dev->type != dev_type)
 			continue;
@@ -257,10 +273,12 @@ static struct ip_tunnel *ipgre_tunnel_lo
 
 	for_each_ip_tunnel_rcu(ign->tunnels_r[h0 ^ h1]) {
 		if (remote != t->parms.iph.daddr ||
-		    key != t->parms.i_key ||
 		    !(t->dev->flags & IFF_UP))
 			continue;
 
+		if (!ipgre_key_match(&t->parms, flags, key))
+			continue;
+
 		if (t->dev->type != ARPHRD_IPGRE &&
 		    t->dev->type != dev_type)
 			continue;
@@ -283,10 +301,12 @@ static struct ip_tunnel *ipgre_tunnel_lo
 		if ((local != t->parms.iph.saddr &&
 		     (local != t->parms.iph.daddr ||
 		      !ipv4_is_multicast(local))) ||
-		    key != t->parms.i_key ||
 		    !(t->dev->flags & IFF_UP))
 			continue;
 
+		if (!ipgre_key_match(&t->parms, flags, key))
+			continue;
+
 		if (t->dev->type != ARPHRD_IPGRE &&
 		    t->dev->type != dev_type)
 			continue;
@@ -489,6 +509,7 @@ static void ipgre_err(struct sk_buff *sk
 	const int code = icmp_hdr(skb)->code;
 	struct ip_tunnel *t;
 	__be16 flags;
+	__be32 key = 0;
 
 	flags = p[0];
 	if (flags&(GRE_CSUM|GRE_KEY|GRE_SEQ|GRE_ROUTING|GRE_VERSION)) {
@@ -505,6 +526,9 @@ static void ipgre_err(struct sk_buff *sk
 	if (skb_headlen(skb) < grehlen)
 		return;
 
+	if (flags & GRE_KEY)
+		key = *(((__be32 *)p) + (grehlen / 4) - 1);
+
 	switch (type) {
 	default:
 	case ICMP_PARAMETERPROB:
@@ -535,9 +559,8 @@ static void ipgre_err(struct sk_buff *sk
 
 	rcu_read_lock();
 	t = ipgre_tunnel_lookup(skb->dev, iph->daddr, iph->saddr,
-				flags & GRE_KEY ?
-				*(((__be32 *)p) + (grehlen / 4) - 1) : 0,
-				p[1]);
+				flags, key, p[1]);
+
 	if (t == NULL)
 		goto out;
 
@@ -642,9 +665,10 @@ static int ipgre_rcv(struct sk_buff *skb
 	gre_proto = *(__be16 *)(h + 2);
 
 	rcu_read_lock();
-	if ((tunnel = ipgre_tunnel_lookup(skb->dev,
-					  iph->saddr, iph->daddr, key,
-					  gre_proto))) {
+	tunnel = ipgre_tunnel_lookup(skb->dev,
+				     iph->saddr, iph->daddr, flags, key,
+				     gre_proto);
+	if (tunnel) {
 		struct pcpu_tstats *tstats;
 
 		secpath_reset(skb);

^ permalink raw reply

* [PATCH] inet_diag: fix panic when unload inet_diag
From: Gao feng @ 2012-09-25  2:48 UTC (permalink / raw)
  To: davem; +Cc: stephen.hemminger, jengelh, eric.dumazet, kuznet, netdev,
	Gao feng

When inet_diag is compiled as a module, inet_diag_handler_dump
sets netlink_dump_control.dump to inet_diag_dump, so if the
inet_diag module is unloaded, netlink will still try to call this
function in netlink_dump. This causes a kernel panic.

Fix this by taking a reference on the inet_diag module before
setting the netlink_callback, and releasing this reference in
netlink_callback.done.
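
The reference handling in this patch can be modelled in user space roughly as below. This is a sketch under assumptions, not kernel code: `struct sketch_module` and the `sketch_*` helpers are hypothetical stand-ins for `THIS_MODULE`, `try_module_get()` and `module_put()`, and the `start_result` parameter simply injects the value that `netlink_dump_start()` would return.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loadable module and its reference count. */
struct sketch_module {
	int refcnt;
	bool unloading;		/* once set, no new references are handed out */
};

static bool sketch_try_get(struct sketch_module *m)
{
	if (m->unloading)
		return false;
	m->refcnt++;
	return true;
}

static void sketch_put(struct sketch_module *m)
{
	m->refcnt--;
}

/* Models the dump handler: take a reference, start the dump, and drop the
 * reference again unless the result says the callback was installed. */
static int sketch_handler_dump(struct sketch_module *m, int start_result)
{
	if (!sketch_try_get(m))
		return -EPROTONOSUPPORT;

	/* As in the patch, -EINTR and -ENOBUFS are taken to mean that the
	 * callback is set, so the done callback will release the reference. */
	if (start_result != -EINTR && start_result != -ENOBUFS)
		sketch_put(m);

	return start_result;
}

/* Models the done callback: releases the reference held for the dump. */
static int sketch_done(struct sketch_module *m)
{
	sketch_put(m);
	return 0;
}
```

The model makes the fragile part easy to see: the count stays balanced only as long as the set of return values meaning "callback installed" is exactly the one tested in `sketch_handler_dump()`.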

Thanks for all the help from Stephen, Jan and Eric.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 net/ipv4/inet_diag.c |   46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 570e61f..e573090 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -903,6 +903,12 @@ static int inet_diag_dump(struct sk_buff *skb, struct netlink_callback *cb)
 	return __inet_diag_dump(skb, cb, nlmsg_data(cb->nlh), bc);
 }
 
+static int inet_diag_done(struct netlink_callback *cb)
+{
+	module_put(THIS_MODULE);
+	return 0;
+}
+
 static inline int inet_diag_type2proto(int type)
 {
 	switch (type) {
@@ -972,8 +978,26 @@ static int inet_diag_rcv_msg_compat(struct sk_buff *skb, struct nlmsghdr *nlh)
 		{
 			struct netlink_dump_control c = {
 				.dump = inet_diag_dump_compat,
+				.done = inet_diag_done,
 			};
-			return netlink_dump_start(net->diag_nlsk, skb, nlh, &c);
+			int err;
+			/*
+			 * netlink_dump will call inet_diag_dump_compat,
+			 * so we need a reference of THIS_MODULE.
+			 */
+			if (!try_module_get(THIS_MODULE))
+				return -EPROTONOSUPPORT;
+
+			err = netlink_dump_start(net->diag_nlsk, skb, nlh, &c);
+
+			if ((err != -EINTR) && (err != -ENOBUFS)) {
+				/*
+				 * netlink_callback set failed, release the
+				 * reference to THIS_MODULE.
+				 */
+				module_put(THIS_MODULE);
+			}
+			return err;
 		}
 	}
 
@@ -1001,8 +1025,26 @@ static int inet_diag_handler_dump(struct sk_buff *skb, struct nlmsghdr *h)
 		{
 			struct netlink_dump_control c = {
 				.dump = inet_diag_dump,
+				.done = inet_diag_done,
 			};
-			return netlink_dump_start(net->diag_nlsk, skb, h, &c);
+			int err;
+			/*
+			 * netlink_dump will call inet_diag_dump,
+			 * so we need a reference of THIS_MODULE.
+			 */
+			if (!try_module_get(THIS_MODULE))
+				return -EPROTONOSUPPORT;
+
+			err = netlink_dump_start(net->diag_nlsk, skb, h, &c);
+
+			if ((err != -EINTR) && (err != -ENOBUFS)) {
+				/*
+				 * netlink_callback set failed, release the
+				 * reference to THIS_MODULE.
+				 */
+				module_put(THIS_MODULE);
+			}
+			return err;
 		}
 	}
 
-- 
1.7.7

^ permalink raw reply related

* IGMP snooping problem
From: Lin Ming @ 2012-09-25  2:59 UTC (permalink / raw)
  To: Herbert Xu; +Cc: networking

Hi,

I'm testing IGMP snooping on a router which has 4 LAN ports.

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.00037fbef050       yes             eth0
                                                        eth1
                                                        eth2
                                                        eth3
                                                        ath0
                                                        ath1
                                                        ath2
                                                        ath3

ath0 through ath3 are Wi-Fi interfaces.
IGMP snooping works well with this configuration.

The router also supports a "switch mode", in which the 4 LAN ports are
connected to an on-board switch.
In this mode there is only eth0; eth1, eth2 and eth3 do not exist.
One benefit of this mode is that traffic between LAN ports doesn't
need to go through the CPU.
The LAN port traffic is handled directly by the hardware switch.

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.00037fbef050       yes             eth0
                                                        ath0
                                                        ath1
                                                        ath2
                                                        ath3

In this mode, IGMP snooping among the 4 LAN ports doesn't work.

Any idea how to resolve this problem?

Thanks,
Lin Ming

^ permalink raw reply

* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2012-09-25  2:34 UTC (permalink / raw)
  To: David Miller, netdev; +Cc: linux-next, linux-kernel, Eric Dumazet

[-- Attachment #1: Type: text/plain, Size: 590 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in
net/ipv4/raw.c between commit ab43ed8b7490 ("ipv4: raw: fix
icmp_filter()") from the net tree and commit 5640f7685831 ("net: use a
per task frag allocator") from the net-next tree.

They are basically the same patch (for this file) except the net-next
version adds two pr_err() calls. I used the net-next version and can carry
the fix as necessary (no action is required).

I do wonder if this change belongs in the net-next patch?
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] inet_diag: make config INET_DIAG bool
From: Gao feng @ 2012-09-25  2:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Jan Engelhardt, Stephen Hemminger, netdev, davem, kuznet
In-Reply-To: <1348486346.26828.511.camel@edumazet-glaptop>

On 2012-09-24 19:32, Eric Dumazet wrote:
> On Mon, 2012-09-24 at 18:17 +0800, Gao feng wrote:
>> On 2012-09-24 17:42, Eric Dumazet wrote:
>>> In fact I didn't fully understand the problem you try to address.
>>>
>>> If you want to prevent module being unloaded, you need to add proper
>>> module_get()/module_put()
>>>
>>> So I would add a "struct module *module;" in struct sock_diag_handler
>>> and use it appropriately.
>>
>> Yes, I tried to add a reference on the module, but I can't find a proper
>> location to call module_get and module_put.
>>
>> module_get should be called when a userspace program uses netlink to
>> send a dump request to the kernel, and module_put should be called when
>> the dump is completed. Am I right?
>>
>> BUT the userspace program may only call netlink_sendmsg without calling
>> netlink_recvmsg, so the reference count of the module will be incorrect.
> 
> check ->dump() and ->done() methods
> 

I missed that cb->done will be called when the netlink socket is destroyed,
so taking a reference on the inet_diag module is doable.

I will send a v2 patch.

Thanks!

^ permalink raw reply

* [PATCH 2/2] Fix a typo in PTP_1588_CLOCK_PCH Kconfig help info.
From: Haicheng Li @ 2012-09-25  0:24 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: Takahiroi Shimizu, linux-kernel@vger.kernel.org, haicheng.lee
In-Reply-To: <50600A49.7040902@linux.intel.com>

 From 5911413366d37aafcc19ddfc9c0f2db31855431e Mon Sep 17 00:00:00 2001
From: Haicheng Li <haicheng.li@linux.intel.com>
Date: Mon, 24 Sep 2012 15:55:27 +0800
Subject: [PATCH 2/2] Fix a typo in PTP_1588_CLOCK_PCH Kconfig help info.

Signed-off-by: Haicheng Li <haicheng.lee@gmail.com>
---
  drivers/ptp/Kconfig |    2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index ffdf712..82c4a26 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -87,6 +87,6 @@ config PTP_1588_CLOCK_PCH
  	  SO_TIMESTAMPING API.

  	  To compile this driver as a module, choose M here: the module
-	  will be called ptp_pch.
+	  will be called by pch_ptp.

  endmenu
-- 
1.7.1

^ permalink raw reply related
