Netdev List
* [PATCH for-net] net/mlx4_en: Check device state when setting coalescing
From: Amir Vadai @ 2013-09-12 15:11 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Amir Vadai, Or Gerlitz, Gideon Naim, Eugenia Emantayev

From: Eugenia Emantayev <eugenia@mellanox.com>

When the device is down, CQs are freed. We must check the device state
to avoid issuing firmware commands on nonexistent CQs.

CC: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index a28cd80..0c75098 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -53,9 +53,11 @@ static int mlx4_en_moderation_update(struct mlx4_en_priv *priv)
 	for (i = 0; i < priv->tx_ring_num; i++) {
 		priv->tx_cq[i].moder_cnt = priv->tx_frames;
 		priv->tx_cq[i].moder_time = priv->tx_usecs;
-		err = mlx4_en_set_cq_moder(priv, &priv->tx_cq[i]);
-		if (err)
-			return err;
+		if (priv->port_up) {
+			err = mlx4_en_set_cq_moder(priv, &priv->tx_cq[i]);
+			if (err)
+				return err;
+		}
 	}
 
 	if (priv->adaptive_rx_coal)
@@ -65,9 +67,11 @@ static int mlx4_en_moderation_update(struct mlx4_en_priv *priv)
 		priv->rx_cq[i].moder_cnt = priv->rx_frames;
 		priv->rx_cq[i].moder_time = priv->rx_usecs;
 		priv->last_moder_time[i] = MLX4_EN_AUTO_CONF;
-		err = mlx4_en_set_cq_moder(priv, &priv->rx_cq[i]);
-		if (err)
-			return err;
+		if (priv->port_up) {
+			err = mlx4_en_set_cq_moder(priv, &priv->rx_cq[i]);
+			if (err)
+				return err;
+		}
 	}
 
 	return err;
-- 
1.8.3.4
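
A minimal userspace sketch of the guard added above, assuming a
simplified private structure: the requested values are always cached,
but the firmware command is only issued while the port is up. The names
struct priv, set_cq_moder(), moderation_update(), NUM_CQS and fw_cmds
are hypothetical stand-ins, not the driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CQS 4 /* hypothetical ring count for this sketch */

struct cq {
	unsigned int moder_cnt;
	unsigned int moder_time;
};

struct priv {
	bool port_up;         /* CQs only exist while this is true */
	unsigned int frames;
	unsigned int usecs;
	struct cq cq[NUM_CQS];
	unsigned int fw_cmds; /* counts simulated firmware commands */
};

/* Stand-in for the firmware call: only legal while the CQs exist. */
static int set_cq_moder(struct priv *p, struct cq *cq)
{
	(void)cq;
	assert(p->port_up);
	p->fw_cmds++;
	return 0;
}

/* Cache the settings unconditionally, push them only if the port is up. */
static int moderation_update(struct priv *p)
{
	int err = 0;

	for (size_t i = 0; i < NUM_CQS; i++) {
		p->cq[i].moder_cnt = p->frames;
		p->cq[i].moder_time = p->usecs;
		if (p->port_up) {
			err = set_cq_moder(p, &p->cq[i]);
			if (err)
				return err;
		}
	}
	return err;
}
```

With the port down, the call still records the new values, so they can
be applied when the CQs are created again.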


* Re: [PATCHv3 linux-next] hrtimer: Add notifier when clock_was_set was called
From: Thomas Gleixner @ 2013-09-12 14:43 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Fan Du, Steffen Klassert, David Miller, Daniel Borkmann, LKML,
	netdev
In-Reply-To: <20130912134409.GB21212@gondor.apana.org.au>

On Thu, 12 Sep 2013, Herbert Xu wrote:
> On Thu, Sep 12, 2013 at 03:21:24PM +0200, Thomas Gleixner wrote:
> >
> > > (3): http://www.spinics.net/lists/netdev/msg245169.html
> > 
> > Thanks for the explanation so far.
> > 
> > What's still unclear to me is why these timeouts are bound to wall
> > time in the first place.
> > 
> > Is there any real reason why the key life time can't simply be
> > expressed in monotonic time, e.g. N seconds after creation or M
> > seconds after usage? Looking at the relevant RFCs I can't find any
> > requirement for binding the life time to wall time. 
> > 
> > A life time of 10 minutes does not change when the wall clock is
> > adjusted for whatever reasons. It's still 10 minutes and not some
> > random result of the wall clock adjustments. But I might be wrong as
> > usual :)
> 
> Well we started out with straight timers.  It was changed because
> people wanted IPsec SAs to expire after a suspect/resume which

Right, suspend is the usual suspect :)

> AFAIK does not touch normal timers.
> 
> Of course, this brought with it a new set of problems when the
> system time is stepped, which now causes SAs to expire even though
> they probably shouldn't.

Right. That's what I guessed. So your problem is that the timer_list
timers which are the proper mechanism for this (the life time has a 1
second granularity, so hrtimers are complete overkill) are not
expiring after a suspend/resume cycle.

So what about going back to timer_list timers and simply using
register_pm_notifier(), which will tell you that the system resumed?

Thanks,

	tglx
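
A minimal simulation of the idea above, under stated assumptions: the
timer runs on a clock that stops during suspend, and a resume hook
re-evaluates the remaining lifetime against a clock that keeps counting
while asleep. All names here (struct sim, struct sa, sa_arm(),
sim_suspend_resume()) are hypothetical, and the sleep duration is
passed in directly, which a real PM notifier callback does not provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated clocks in seconds: 'run' stops during suspend, 'boot' does not. */
struct sim {
	int64_t run;
	int64_t boot;
};

struct sa {
	int64_t created_boot;       /* creation time on the 'boot' clock */
	int64_t lifetime;           /* lifetime in seconds */
	int64_t timer_deadline_run; /* timer expiry on the 'run' clock */
	bool expired;
};

/* (Re)arm the timer from the lifetime that is really left. */
static void sa_arm(struct sa *sa, const struct sim *s)
{
	int64_t left = sa->lifetime - (s->boot - sa->created_boot);

	if (left <= 0) {
		sa->expired = true;
		return;
	}
	sa->timer_deadline_run = s->run + left;
}

static void sa_create(struct sa *sa, const struct sim *s, int64_t lifetime)
{
	assert(lifetime > 0);
	sa->created_boot = s->boot;
	sa->lifetime = lifetime;
	sa->timer_deadline_run = 0;
	sa->expired = false;
	sa_arm(sa, s);
}

/* Normal operation: both clocks advance and the timer may fire. */
static void sim_run(struct sim *s, struct sa *sa, int64_t secs)
{
	s->run += secs;
	s->boot += secs;
	if (!sa->expired && s->run >= sa->timer_deadline_run)
		sa->expired = true;
}

/* Suspend for 'slept' seconds, then run the resume hook. */
static void sim_suspend_resume(struct sim *s, struct sa *sa, int64_t slept)
{
	s->boot += slept; /* only the boot clock saw the sleep */
	if (!sa->expired)
		sa_arm(sa, s); /* resume hook: recompute what is left */
}
```

In this model a long sleep expires the SA immediately on resume, while
a short one merely shortens the re-armed timer, and stepping the wall
clock has no effect because it is never consulted.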


* Re: [PATCH net-next] fix NULL pointer dereference in br_handle_frame
From: Vlad Yasevich @ 2013-09-12 14:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Hong zhi guo, netdev, David Miller, zhiguohong
In-Reply-To: <1378996396.24408.8.camel@edumazet-glaptop>

On 09/12/2013 10:33 AM, Eric Dumazet wrote:
> On Thu, 2013-09-12 at 22:24 +0800, Hong zhi guo wrote:
>> You mean IFF_BRIDGE_PORT should be only for admin/control path, but
>> not for data path?
>
> By definition, br_handle_frame() should be called only when the device
> is a bridge port.
>
> After the call to synchronize_net() included in
> netdev_rx_handler_unregister(), you have a guarantee that
> br_handle_frame() won't be called.
>
> Therefore, testing IFF_BRIDGE_PORT in br_handle_frame() is redundant.

Don't all tests for IFF_BRIDGE_PORT on the bridge receive path become 
redundant as well?

-vlad


* Re: [PATCH net-next] fix NULL pointer dereference in br_handle_frame
From: Eric Dumazet @ 2013-09-12 14:33 UTC (permalink / raw)
  To: Hong zhi guo; +Cc: netdev, David Miller, zhiguohong
In-Reply-To: <CAA7+ByWJLUkxy7C0frKOFUscJaQyRLsfk_N=NZfdzUrbZ=4MNA@mail.gmail.com>

On Thu, 2013-09-12 at 22:24 +0800, Hong zhi guo wrote:
> You mean IFF_BRIDGE_PORT should be only for admin/control path, but
> not for data path?

By definition, br_handle_frame() should be called only when the device
is a bridge port.

After the call to synchronize_net() included in
netdev_rx_handler_unregister(), you have a guarantee that
br_handle_frame() won't be called.

Therefore, testing IFF_BRIDGE_PORT in br_handle_frame() is redundant.


* Re: [PATCH net-next] fix NULL pointer dereference in br_handle_frame
From: Hong zhi guo @ 2013-09-12 14:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David Miller, zhiguohong
In-Reply-To: <1378995112.24408.4.camel@edumazet-glaptop>

You mean IFF_BRIDGE_PORT should be only for admin/control path, but
not for data path?

On Thu, Sep 12, 2013 at 10:11 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2013-09-12 at 20:16 +0800, Hong Zhiguo wrote:
>> From: Hong Zhiguo <zhiguohong@tencent.com>
>>
>> In function netdev_rx_handler_unregister it's said that:
>>
>> /* a reader seeing a non NULL rx_handler in a rcu_read_lock()
>>  * section has a guarantee to see a non NULL rx_handler_data
>>  * as well.
>>  */
>>
>> This is true. But br_port_get_rcu(dev) returns NULL if:
>>       !(dev->priv_flags & IFF_BRIDGE_PORT)
>>
>> And this happened on my box when br_handle_frame is called
>> between these 2 lines of del_nbp:
>>
>>       dev->priv_flags &= ~IFF_BRIDGE_PORT;
>>       /* --> br_handle_frame is called at this time */
>>       netdev_upper_dev_unlink(dev, br->dev);
>>
>> I got the Oops below (some lines omitted):
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000021
>> IP: [<ffffffff8150901d>] br_handle_frame+0xed/0x230
>> Oops: 0000 [#1] PREEMPT SMP
>> RIP: 0010:[<ffffffff8150901d>]  [<ffffffff8150901d>] br_handle_frame+0xed/0x230
>> RSP: 0018:ffff880030403c10  EFLAGS: 00010286
>> Stack:
>>  ffff88002c945700 ffffffff81508f30 0000000000000000 ffff88002d41e000
>>  ffff880030403c98 ffffffff81477acb ffffffff81477821 ffff880030403c68
>>  ffffffff81090e10 00ff88002d545c80 ffff88002c945700 ffffffff81aa50c0
>> Call Trace:
>>  <IRQ>
>>  [<ffffffff81508f30>] ? br_handle_frame_finish+0x300/0x300
>>  [<ffffffff81477acb>] __netif_receive_skb_core+0x39b/0x880
>>
>> Signed-off-by: Hong Zhiguo <zhiguohong@tencent.com>
>> ---
>>  net/bridge/br_if.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
>> index c41d5fb..bd21159 100644
>> --- a/net/bridge/br_if.c
>> +++ b/net/bridge/br_if.c
>> @@ -133,6 +133,8 @@ static void del_nbp(struct net_bridge_port *p)
>>
>>       sysfs_remove_link(br->ifobj, p->dev->name);
>>
>> +     netdev_rx_handler_unregister(dev);
>> +
>>       dev_set_promiscuity(dev, -1);
>>
>>       spin_lock_bh(&br->lock);
>> @@ -148,8 +150,6 @@ static void del_nbp(struct net_bridge_port *p)
>>
>>       dev->priv_flags &= ~IFF_BRIDGE_PORT;
>>
>> -     netdev_rx_handler_unregister(dev);
>> -
>>       netdev_upper_dev_unlink(dev, br->dev);
>>
>>       br_multicast_del_port(p);
>
> Interesting.
>
> Then br_handle_frame() should not even have to check IFF_BRIDGE_PORT
> flag.
>
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index a2fd37e..45b2568 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -173,7 +173,7 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
>         if (!skb)
>                 return RX_HANDLER_CONSUMED;
>
> -       p = br_port_get_rcu(skb->dev);
> +       p = rcu_dereference(skb->dev->rx_handler_data);
>
>         if (unlikely(is_link_local_ether_addr(dest))) {
>                 /*
>
>



-- 
best regards
Hong Zhiguo


* Re: [PATCH] sunrpc: Add missing kuids conversion for printing
From: Myklebust, Trond @ 2013-09-12 14:22 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Eric W. Biederman, J. Bruce Fields,
	linux-nfs@vger.kernel.org,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <1378991379-9106-1-git-send-email-geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org>

On Thu, 2013-09-12 at 15:09 +0200, Geert Uytterhoeven wrote:
> m68k/allmodconfig:
> 
> net/sunrpc/auth_generic.c: In function ‘generic_key_timeout’:
> net/sunrpc/auth_generic.c:241: warning: format ‘%d’ expects type ‘int’, but
> argument 2 has type ‘kuid_t’
> 
> commit cdba321e291f0fbf5abda4d88340292b858e3d4d ("sunrpc: Convert kuids and
> kgids to uids and gids for printing") forgot to convert one instance.
> 
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> ---

Thanks! Applied...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

* Re: [PATCH net-next] fix NULL pointer dereference in br_handle_frame
From: Eric Dumazet @ 2013-09-12 14:11 UTC (permalink / raw)
  To: Hong Zhiguo; +Cc: netdev, davem, zhiguohong
In-Reply-To: <1378988195-2710-1-git-send-email-zhiguohong@tencent.com>

On Thu, 2013-09-12 at 20:16 +0800, Hong Zhiguo wrote:
> From: Hong Zhiguo <zhiguohong@tencent.com>
> 
> In function netdev_rx_handler_unregister it's said that:
> 
> /* a reader seeing a non NULL rx_handler in a rcu_read_lock()
>  * section has a guarantee to see a non NULL rx_handler_data
>  * as well.
>  */
> 
> This is true. But br_port_get_rcu(dev) returns NULL if:
> 	!(dev->priv_flags & IFF_BRIDGE_PORT)
> 
> And this happened on my box when br_handle_frame is called
> between these 2 lines of del_nbp:
> 
> 	dev->priv_flags &= ~IFF_BRIDGE_PORT;
> 	/* --> br_handle_frame is called at this time */
> 	netdev_upper_dev_unlink(dev, br->dev);
> 
> I got the Oops below (some lines omitted):
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000021
> IP: [<ffffffff8150901d>] br_handle_frame+0xed/0x230
> Oops: 0000 [#1] PREEMPT SMP
> RIP: 0010:[<ffffffff8150901d>]  [<ffffffff8150901d>] br_handle_frame+0xed/0x230
> RSP: 0018:ffff880030403c10  EFLAGS: 00010286
> Stack:
>  ffff88002c945700 ffffffff81508f30 0000000000000000 ffff88002d41e000
>  ffff880030403c98 ffffffff81477acb ffffffff81477821 ffff880030403c68
>  ffffffff81090e10 00ff88002d545c80 ffff88002c945700 ffffffff81aa50c0
> Call Trace:
>  <IRQ>
>  [<ffffffff81508f30>] ? br_handle_frame_finish+0x300/0x300
>  [<ffffffff81477acb>] __netif_receive_skb_core+0x39b/0x880
> 
> Signed-off-by: Hong Zhiguo <zhiguohong@tencent.com>
> ---
>  net/bridge/br_if.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
> index c41d5fb..bd21159 100644
> --- a/net/bridge/br_if.c
> +++ b/net/bridge/br_if.c
> @@ -133,6 +133,8 @@ static void del_nbp(struct net_bridge_port *p)
>  
>  	sysfs_remove_link(br->ifobj, p->dev->name);
>  
> +	netdev_rx_handler_unregister(dev);
> +
>  	dev_set_promiscuity(dev, -1);
>  
>  	spin_lock_bh(&br->lock);
> @@ -148,8 +150,6 @@ static void del_nbp(struct net_bridge_port *p)
>  
>  	dev->priv_flags &= ~IFF_BRIDGE_PORT;
>  
> -	netdev_rx_handler_unregister(dev);
> -
>  	netdev_upper_dev_unlink(dev, br->dev);
>  
>  	br_multicast_del_port(p);

Interesting.

Then br_handle_frame() should not even have to check IFF_BRIDGE_PORT
flag.

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index a2fd37e..45b2568 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -173,7 +173,7 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
 	if (!skb)
 		return RX_HANDLER_CONSUMED;
 
-	p = br_port_get_rcu(skb->dev);
+	p = rcu_dereference(skb->dev->rx_handler_data);
 
 	if (unlikely(is_link_local_ether_addr(dest))) {
 		/*


* Re: [PATCHv3 linux-next] hrtimer: Add notifier when clock_was_set was called
From: Herbert Xu @ 2013-09-12 13:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fan Du, Steffen Klassert, David Miller, Daniel Borkmann, LKML,
	netdev
In-Reply-To: <alpine.DEB.2.02.1309121441420.4089@ionos.tec.linutronix.de>

On Thu, Sep 12, 2013 at 03:21:24PM +0200, Thomas Gleixner wrote:
>
> > (3): http://www.spinics.net/lists/netdev/msg245169.html
> 
> Thanks for the explanation so far.
> 
> What's still unclear to me is why these timeouts are bound to wall
> time in the first place.
> 
> Is there any real reason why the key life time can't simply be
> expressed in monotonic time, e.g. N seconds after creation or M
> seconds after usage? Looking at the relevant RFCs I can't find any
> requirement for binding the life time to wall time. 
> 
> A life time of 10 minutes does not change when the wall clock is
> adjusted for whatever reasons. It's still 10 minutes and not some
> random result of the wall clock adjustments. But I might be wrong as
> usual :)

Well we started out with straight timers.  It was changed because
people wanted IPsec SAs to expire after a suspect/resume which
AFAIK does not touch normal timers.

Of course, this brought with it a new set of problems when the
system time is stepped, which now causes SAs to expire even though
they probably shouldn't.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCHv3 linux-next] hrtimer: Add notifier when clock_was_set was called
From: Thomas Gleixner @ 2013-09-12 13:21 UTC (permalink / raw)
  To: Fan Du
  Cc: Steffen Klassert, David Miller, Herbert Xu, Daniel Borkmann, LKML,
	netdev
In-Reply-To: <5212CCCA.4090907@windriver.com>

On Tue, 20 Aug 2013, Fan Du wrote:
> Thanks for your patience. Please let me take a few seconds to try to
> explain this.

Sorry for the late reply.
 
> The current xfrm layer has *one* hrtimer to guard IPsec key timeouts.
> The timeout can be measured in either of the two ways below:
> 
>  (1) The timer is started once the key is created, even though the
>      key is not necessarily in use yet. In detail, get_seconds() is
>      recorded when the key is created.
> 
>  (2) The timer is started when the key is actually used, e.g. when
>      an IP packet needs to be encrypted. In detail, get_seconds() is
>      recorded when the key is first used.
> 
> So in the hrtimer handler, the code reads the current get_seconds()
> value, checks it against what was saved in (1) or (2), and reports
> the timeout up to userland.
> 
> So the pitfall is using one hrtimer for two timeout events and, most
> importantly, using get_seconds() to check the timeout: once the system
> clock is changed intentionally by the user, the key timeout can
> misbehave wildly.
> 
> A refactoring has been proposed to stop depending on the system wall
> clock by cleaning up the hrtimer handler. Unfortunately David frowned
> on it in (3) and suggested adjusting the timeout of the key once the
> system clock is changed.
> 
> 
> (3): http://www.spinics.net/lists/netdev/msg245169.html

Thanks for the explanation so far.

What's still unclear to me is why these timeouts are bound to wall
time in the first place.

Is there any real reason why the key life time can't simply be
expressed in monotonic time, e.g. N seconds after creation or M
seconds after usage? Looking at the relevant RFCs I can't find any
requirement for binding the life time to wall time. 

A life time of 10 minutes does not change when the wall clock is
adjusted for whatever reasons. It's still 10 minutes and not some
random result of the wall clock adjustments. But I might be wrong as
usual :)

Thanks,

	tglx
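
A minimal sketch of the difference discussed above, using simulated
clocks rather than real kernel time sources: a lifetime checked against
wall time is disturbed by a clock step, while the same lifetime checked
against a monotonic clock is not. The names struct clocks, struct key,
expired_wall() and expired_mono() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated clocks in seconds. */
struct clocks {
	int64_t wall; /* can be stepped by the administrator or NTP */
	int64_t mono; /* only ever moves forward with elapsed time */
};

struct key {
	int64_t created_wall;
	int64_t created_mono;
	int64_t lifetime;
};

static struct key key_create(const struct clocks *c, int64_t lifetime)
{
	struct key k;

	assert(lifetime > 0);
	k.created_wall = c->wall;
	k.created_mono = c->mono;
	k.lifetime = lifetime;
	return k;
}

/* Expiry test in the style of a wall-clock comparison. */
static bool expired_wall(const struct key *k, const struct clocks *c)
{
	return c->wall - k->created_wall >= k->lifetime;
}

/* Expiry test based on elapsed monotonic time. */
static bool expired_mono(const struct key *k, const struct clocks *c)
{
	return c->mono - k->created_mono >= k->lifetime;
}

/* Real time passes: both clocks advance. */
static void tick(struct clocks *c, int64_t secs)
{
	c->wall += secs;
	c->mono += secs;
}

/* The wall clock is stepped: only 'wall' changes. */
static void step_wall(struct clocks *c, int64_t delta)
{
	c->wall += delta;
}
```

For a 600 second lifetime, stepping the wall clock forward by an hour
after one minute makes the wall-based check report expiry, while the
monotonic check still waits for the remaining 540 seconds.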


* [PATCH] sunrpc: Add missing kuids conversion for printing
From: Geert Uytterhoeven @ 2013-09-12 13:09 UTC (permalink / raw)
  To: Eric W. Biederman, J. Bruce Fields, Trond Myklebust
  Cc: linux-nfs, netdev, linux-kernel, Geert Uytterhoeven

m68k/allmodconfig:

net/sunrpc/auth_generic.c: In function ‘generic_key_timeout’:
net/sunrpc/auth_generic.c:241: warning: format ‘%d’ expects type ‘int’, but
argument 2 has type ‘kuid_t’

commit cdba321e291f0fbf5abda4d88340292b858e3d4d ("sunrpc: Convert kuids and
kgids to uids and gids for printing") forgot to convert one instance.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 net/sunrpc/auth_generic.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/auth_generic.c b/net/sunrpc/auth_generic.c
index f6d84be..ed04869 100644
--- a/net/sunrpc/auth_generic.c
+++ b/net/sunrpc/auth_generic.c
@@ -239,7 +239,7 @@ generic_key_timeout(struct rpc_auth *auth, struct rpc_cred *cred)
 		if (test_and_clear_bit(RPC_CRED_KEY_EXPIRE_SOON,
 					&acred->ac_flags))
 			dprintk("RPC:        UID %d Credential key reset\n",
-				tcred->cr_uid);
+				from_kuid(&init_user_ns, tcred->cr_uid));
 		/* set up fasttrack for the normal case */
 		set_bit(RPC_CRED_NOTIFY_TIMEOUT, &acred->ac_flags);
 	}
-- 
1.7.9.5
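
A minimal userspace sketch of why the compiler warned, assuming the ID
is wrapped in a single-member struct: the wrapper cannot be handed to
printf-style formatting directly and has to be converted to a plain
integer first. kuid_sketch_t, from_kuid_sketch() and format_uid() are
hypothetical names; unlike the kernel's from_kuid(), this converter
takes no user namespace argument and performs no mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Wrapping the ID in a struct makes accidental integer use a type error. */
typedef struct {
	unsigned int val;
} kuid_sketch_t;

/* Explicit conversion back to a printable integer. */
static unsigned int from_kuid_sketch(kuid_sketch_t uid)
{
	return uid.val;
}

/* Format a message the way the debug print does, after conversion. */
static int format_uid(char *buf, size_t len, kuid_sketch_t uid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "UID %u Credential key reset",
			from_kuid_sketch(uid));
}
```

Passing the struct itself as the format argument is what a compiler
with format checking flags; the explicit conversion makes the intent
visible at the call site.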


* Re: [PATCH 09/11] sctp: move route updating for redirect to ndisc layer
From: Daniel Borkmann @ 2013-09-12 12:33 UTC (permalink / raw)
  To: Duan Jiong; +Cc: davem, netdev, hannes, linux-sctp@vger.kernel.org
In-Reply-To: <52319CA7.8080809@cn.fujitsu.com>

(Please also cc linux-sctp on this one.)

On 09/12/2013 12:51 PM, Duan Jiong wrote:
> From: Duan Jiong <duanj.fnst@cn.fujitsu.com>
> 
> In addition, when dealing with a redirect message, it should
> not report an error message to the user, and needs to return
> directly.
> 
> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
> ---
>   net/sctp/input.c | 12 ------------
>   net/sctp/ipv6.c  |  6 +++---
>   2 files changed, 3 insertions(+), 15 deletions(-)
> 
> diff --git a/net/sctp/input.c b/net/sctp/input.c
> index 5f20686..0d2d4b7 100644
> --- a/net/sctp/input.c
> +++ b/net/sctp/input.c
> @@ -413,18 +413,6 @@ void sctp_icmp_frag_needed(struct sock *sk, struct sctp_association *asoc,
>   	sctp_retransmit(&asoc->outqueue, t, SCTP_RTXR_PMTUD);
>   }
>   
> -void sctp_icmp_redirect(struct sock *sk, struct sctp_transport *t,
> -			struct sk_buff *skb)
> -{
> -	struct dst_entry *dst;
> -
> -	if (!t)
> -		return;
> -	dst = sctp_transport_dst_check(t);
> -	if (dst)
> -		dst->ops->redirect(dst, sk, skb);
> -}
> -
>   /*
>    * SCTP Implementer's Guide, 2.37 ICMP handling procedures
>    *
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index da613ce..ee12d87 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -151,6 +151,9 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>   	int err;
>   	struct net *net = dev_net(skb->dev);
>   
> +	if (type == NDISC_REDIRECT)
> +		return;
> +
>   	idev = in6_dev_get(skb->dev);
>   
>   	/* Fix up skb to look at the embedded net header. */
> @@ -181,9 +184,6 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>   			goto out_unlock;
>   		}
>   		break;
> -	case NDISC_REDIRECT:
> -		sctp_icmp_redirect(sk, transport, skb);
> -		break;
>   	default:
>   		break;
>   	}
> 


* [PATCH net-next] fix NULL pointer dereference in br_handle_frame
From: Hong Zhiguo @ 2013-09-12 12:16 UTC (permalink / raw)
  To: netdev; +Cc: davem, zhiguohong

From: Hong Zhiguo <zhiguohong@tencent.com>

In function netdev_rx_handler_unregister it's said that:

/* a reader seeing a non NULL rx_handler in a rcu_read_lock()
 * section has a guarantee to see a non NULL rx_handler_data
 * as well.
 */

This is true. But br_port_get_rcu(dev) returns NULL if:
	!(dev->priv_flags & IFF_BRIDGE_PORT)

And this happened on my box when br_handle_frame is called
between these 2 lines of del_nbp:

	dev->priv_flags &= ~IFF_BRIDGE_PORT;
	/* --> br_handle_frame is called at this time */
	netdev_upper_dev_unlink(dev, br->dev);

I got the Oops below (some lines omitted):
BUG: unable to handle kernel NULL pointer dereference at 0000000000000021
IP: [<ffffffff8150901d>] br_handle_frame+0xed/0x230
Oops: 0000 [#1] PREEMPT SMP
RIP: 0010:[<ffffffff8150901d>]  [<ffffffff8150901d>] br_handle_frame+0xed/0x230
RSP: 0018:ffff880030403c10  EFLAGS: 00010286
Stack:
 ffff88002c945700 ffffffff81508f30 0000000000000000 ffff88002d41e000
 ffff880030403c98 ffffffff81477acb ffffffff81477821 ffff880030403c68
 ffffffff81090e10 00ff88002d545c80 ffff88002c945700 ffffffff81aa50c0
Call Trace:
 <IRQ>
 [<ffffffff81508f30>] ? br_handle_frame_finish+0x300/0x300
 [<ffffffff81477acb>] __netif_receive_skb_core+0x39b/0x880

Signed-off-by: Hong Zhiguo <zhiguohong@tencent.com>
---
 net/bridge/br_if.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index c41d5fb..bd21159 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -133,6 +133,8 @@ static void del_nbp(struct net_bridge_port *p)
 
 	sysfs_remove_link(br->ifobj, p->dev->name);
 
+	netdev_rx_handler_unregister(dev);
+
 	dev_set_promiscuity(dev, -1);
 
 	spin_lock_bh(&br->lock);
@@ -148,8 +150,6 @@ static void del_nbp(struct net_bridge_port *p)
 
 	dev->priv_flags &= ~IFF_BRIDGE_PORT;
 
-	netdev_rx_handler_unregister(dev);
-
 	netdev_upper_dev_unlink(dev, br->dev);
 
 	br_multicast_del_port(p);
-- 
1.8.1.2
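
A minimal model of the ordering problem described above, assuming a
flag-checked lookup that returns NULL once the flag is cleared: if the
flag is cleared while the receive handler is still registered, a frame
arriving in between reaches the handler with a NULL port. The types and
helpers here (struct dev, port_get_checked(), teardown_has_window())
are hypothetical and only mirror the two steps named in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_BRIDGE_PORT 0x1u

struct port {
	int id;
};

struct dev {
	unsigned int flags;
	bool handler_registered;
	struct port *handler_data;
};

/* Flag-checked lookup: NULL as soon as the flag is gone. */
static struct port *port_get_checked(const struct dev *d)
{
	return (d->flags & FLAG_BRIDGE_PORT) ? d->handler_data : NULL;
}

/* Would a frame arriving right now hit the handler with a NULL port? */
static bool frame_would_oops(const struct dev *d)
{
	return d->handler_registered && port_get_checked(d) == NULL;
}

enum step { CLEAR_FLAG, UNREGISTER };

static void apply(struct dev *d, enum step s)
{
	if (s == CLEAR_FLAG) {
		d->flags &= ~FLAG_BRIDGE_PORT;
	} else {
		d->handler_registered = false;
		d->handler_data = NULL;
	}
}

/* Run both teardown steps and report whether any point is unsafe. */
static bool teardown_has_window(enum step first, enum step second)
{
	struct port p = { 1 };
	struct dev d = { FLAG_BRIDGE_PORT, true, &p };
	bool window;

	assert(first != second);
	apply(&d, first);
	window = frame_would_oops(&d);
	apply(&d, second);
	return window || frame_would_oops(&d);
}
```

Unregistering first leaves no point at which a registered handler can
observe a cleared flag, which is the reordering the patch makes.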


* [PATCH v2] Don't destroy the netdev until the vif is shut down
From: Paul Durrant @ 2013-09-12 12:14 UTC (permalink / raw)
  To: netdev, xen-devel; +Cc: Paul Durrant, David Vrabel, Wei Liu, Ian Campbell

Without this patch, if a frontend cycles through states Closing
and Closed (which Windows frontends need to do) then the netdev
will be destroyed and requires re-invocation of hotplug scripts
to restore state before the frontend can move to Connected. Thus
when udev is not in use the backend gets stuck in InitWait.

With this patch, the netdev is left alone whilst the backend is
still online and is only de-registered and freed just prior to
destroying the vif (which is also nicely symmetrical with the
netdev allocation and registration being done during probe) so
no re-invocation of hotplug scripts is required.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
v2:
 - Modify netback_remove() - bug only seemed to manifest with Linux guest

 drivers/net/xen-netback/common.h    |    1 +
 drivers/net/xen-netback/interface.c |   13 ++++++++-----
 drivers/net/xen-netback/xenbus.c    |   17 ++++++++++++-----
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index a197743..5715318 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -184,6 +184,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
 		   unsigned long rx_ring_ref, unsigned int tx_evtchn,
 		   unsigned int rx_evtchn);
 void xenvif_disconnect(struct xenvif *vif);
+void xenvif_free(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
 void xenvif_xenbus_fini(void);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 625c6f4..65e78f9 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -477,14 +477,17 @@ void xenvif_disconnect(struct xenvif *vif)
 	if (vif->task)
 		kthread_stop(vif->task);
 
+	xenvif_unmap_frontend_rings(vif);
+
+	if (need_module_put)
+		module_put(THIS_MODULE);
+}
+
+void xenvif_free(struct xenvif *vif)
+{
 	netif_napi_del(&vif->napi);
 
 	unregister_netdev(vif->dev);
 
-	xenvif_unmap_frontend_rings(vif);
-
 	free_netdev(vif->dev);
-
-	if (need_module_put)
-		module_put(THIS_MODULE);
 }
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 1fe48fe3..a53782e 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -42,7 +42,7 @@ static int netback_remove(struct xenbus_device *dev)
 	if (be->vif) {
 		kobject_uevent(&dev->dev.kobj, KOBJ_OFFLINE);
 		xenbus_rm(XBT_NIL, dev->nodename, "hotplug-status");
-		xenvif_disconnect(be->vif);
+		xenvif_free(be->vif);
 		be->vif = NULL;
 	}
 	kfree(be);
@@ -213,9 +213,18 @@ static void disconnect_backend(struct xenbus_device *dev)
 {
 	struct backend_info *be = dev_get_drvdata(&dev->dev);
 
+	if (be->vif)
+		xenvif_disconnect(be->vif);
+}
+
+static void destroy_backend(struct xenbus_device *dev)
+{
+	struct backend_info *be = dev_get_drvdata(&dev->dev);
+
 	if (be->vif) {
+		kobject_uevent(&dev->dev.kobj, KOBJ_OFFLINE);
 		xenbus_rm(XBT_NIL, dev->nodename, "hotplug-status");
-		xenvif_disconnect(be->vif);
+		xenvif_free(be->vif);
 		be->vif = NULL;
 	}
 }
@@ -246,14 +255,11 @@ static void frontend_changed(struct xenbus_device *dev,
 	case XenbusStateConnected:
 		if (dev->state == XenbusStateConnected)
 			break;
-		backend_create_xenvif(be);
 		if (be->vif)
 			connect(be);
 		break;
 
 	case XenbusStateClosing:
-		if (be->vif)
-			kobject_uevent(&dev->dev.kobj, KOBJ_OFFLINE);
 		disconnect_backend(dev);
 		xenbus_switch_state(dev, XenbusStateClosing);
 		break;
@@ -262,6 +268,7 @@ static void frontend_changed(struct xenbus_device *dev,
 		xenbus_switch_state(dev, XenbusStateClosed);
 		if (xenbus_dev_is_online(dev))
 			break;
+		destroy_backend(dev);
 		/* fall through if not online */
 	case XenbusStateUnknown:
 		device_unregister(&dev->dev);
-- 
1.7.10.4
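
A minimal sketch of the lifecycle split described above, assuming a
toy interface object: disconnecting only drops the connection, while
freeing is a separate final step, so a frontend can reconnect without
the setup work being repeated. struct vif, vif_alloc(), vif_connect(),
vif_disconnect(), vif_free() and hotplug_runs are hypothetical names
for this model, not the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Counts how often the (simulated) hotplug setup would have to run. */
static unsigned int hotplug_runs;

struct vif {
	bool registered; /* device exists and setup has been done */
	bool connected;  /* frontend rings are mapped */
};

/* Allocation and registration happen once, as during probe. */
static struct vif *vif_alloc(void)
{
	struct vif *v = calloc(1, sizeof(*v));

	if (!v)
		return NULL;
	v->registered = true;
	hotplug_runs++;
	return v;
}

static void vif_connect(struct vif *v)
{
	assert(v->registered);
	v->connected = true;
}

/* Disconnect leaves the device registered. */
static void vif_disconnect(struct vif *v)
{
	v->connected = false;
}

/* Final teardown, symmetrical with vif_alloc(). */
static void vif_free(struct vif *v)
{
	v->registered = false;
	free(v);
}
```

Cycling through disconnect and connect any number of times leaves the
setup counter at one; it is only the final free that undoes the
allocation.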


* Re: [PATCH] ipv6: Do route updating for redirect in ndisc layer
From: Daniel Borkmann @ 2013-09-12 12:07 UTC (permalink / raw)
  To: vyasevic; +Cc: Duan Jiong, davem, netdev, vyasevich
In-Reply-To: <52311604.6080507@redhat.com>

On 09/12/2013 03:16 AM, Vlad Yasevich wrote:
> On 09/11/2013 07:17 PM, Hannes Frederic Sowa wrote:
>> [added Cc to Daniel and Vlad because of ipv6/sctp/redirect problem]
>>
>> On Wed, Sep 11, 2013 at 03:04:35PM +0800, Duan Jiong wrote:
>>> On 2013-09-11 06:50, Hannes Frederic Sowa wrote:
>>>> On Mon, Sep 09, 2013 at 03:09:56PM +0800, Duan Jiong wrote:
>>>>> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
>>>>> index 5c71501..61fe8e5 100644
>>>>> --- a/net/ipv6/tcp_ipv6.c
>>>>> +++ b/net/ipv6/tcp_ipv6.c
>>>>> @@ -382,14 +382,6 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>>>>>
>>>>>       np = inet6_sk(sk);
>>>>>
>>>>> -    if (type == NDISC_REDIRECT) {
>>>>> -        struct dst_entry *dst = __sk_dst_check(sk, np->dst_cookie);
>>>>> -
>>>>> -        if (dst)
>>>>> -            dst->ops->redirect(dst, sk, skb);
>>>>> -        goto out;
>>>>> -    }
>>>>> -
>>>>
>>>> You dropped the "goto out" here in case of an NDISC_REDIRECT, so this sends an
>>>> EPROTO further up the socket layer. Was this intended?
>>>>
>>>
>>> I'm sorry, I didn't notice that the variable err was assigned EPROTO.
>>> I only thought that the message should be sent to the socket layer,
>>> because I found that in the function sctp_v6_err().
>>>
>>> In addition, RFC 4443 says the Redirect Message is not an ICMPv6 Error
>>> Message, so I think we shouldn't call those err_handler functions; in
>>> other words, we shouldn't call icmpv6_notify().
>>>
>>> What do you think of this?
>>
>> Hm, that's hard.
>>
>> First off, when the kernel started publishing these errors it had a
>> contract with user-space we cannot break now. This includes all error
>> handling functions which call ipv6_icmp_error. So we only have to care
>> about INET6_PROTO_FINAL protocols, because they mostly operate in socket
>> space (in this case these are the raw and the udp protocol and currently
>> sctp). Especially I do think it is important to report the redirects
>> to raw sockets. The other non-final protocols only need to be notified
>> for mtu reduction currently. Maybe we could stop notifying non-final
>> protocols for redirects, but I don't think this will improve things.
>>
>> Also we cannot know if the router sending the redirect discarded the
>> original packet or if it forwarded it just notifying us of a better route,
>> so we don't know if an actual error happened. So I would do the same thing
>> as IPv4 sockets, set sk_err to zero and queue up the icmp packet on the
>> socket's error queue (for udp and raw).
>>
>> Notifying tcp sockets about the redirect seems wrong. It would
>> generate a poll notification and I do think it could even tear down
>> the whole connection.  I guess sctp should also stop updating sk_err
>> on redirects.  But let's Cc Daniel and Vlad about this. My guess is that
>> sctp could go into some error recovery mode because of this which would
>> be wrong.
>
> You are right.  SCTP shouldn't be setting sk_err on redirects as it
> isn't an error condition.  It should be doing exactly what tcp is
> doing and leaving the error handler without touching the socket.

Yep, probably something like ...

diff --git a/net/sctp/input.c b/net/sctp/input.c
index 5f20686..98b69bb 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -634,8 +634,7 @@ void sctp_v4_err(struct sk_buff *skb, __u32 info)
                 break;
         case ICMP_REDIRECT:
                 sctp_icmp_redirect(sk, transport, skb);
-               err = 0;
-               break;
+               /* Fall through to out_unlock. */
         default:
                 goto out_unlock;
         }
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 4f52e2c..e7b2d4f 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -183,7 +183,7 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
                 break;
         case NDISC_REDIRECT:
                 sctp_icmp_redirect(sk, transport, skb);
-               break;
+               goto out_unlock;
         default:
                 break;
         }
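
A minimal sketch of the behaviour proposed in the diff above, assuming
a toy socket with an error field: a redirect updates routing state and
leaves the handler at once, so no error is stored and no waiter is
woken. The identifiers (struct sketch_sock, sketch_err_handler(),
SKETCH_REDIRECT, SKETCH_UNREACH) are hypothetical, and SKETCH_EPROTO
is a placeholder value rather than a reference to a system header.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_icmp_type { SKETCH_UNREACH, SKETCH_REDIRECT };

#define SKETCH_EPROTO 71 /* placeholder error code for this sketch */

struct sketch_sock {
	int sk_err;                 /* pending error reported to the user */
	unsigned int route_updates; /* redirects applied to routing state */
	unsigned int wakeups;       /* error notifications delivered */
};

static void sketch_err_handler(struct sketch_sock *sk,
			       enum sketch_icmp_type type)
{
	assert(sk != NULL);

	switch (type) {
	case SKETCH_REDIRECT:
		/* Not an error: update the route and leave the socket alone. */
		sk->route_updates++;
		return;
	case SKETCH_UNREACH:
		sk->sk_err = SKETCH_EPROTO;
		break;
	}

	/* Only real errors reach the notification step. */
	sk->wakeups++;
}
```

The early return is the design point: the redirect path never reaches
the code that records an error or notifies the socket's owner.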

> Thanks
> -vlad
>
>>
>> So, for this patch I would leave the logic as is and not change anything
>> at the error reporting. Maybe Daniel and Vlad could check if we should
>> suppress redirect information for ipv6 in sctp, too? But this should
>> go into another patch.  Regarding the EPROTO problem in raw and udp,
>> let's see if all the problems go away if we update icmpv6_err_convert
>> to set *err to 0.
>>
>> Greetings,
>>
>>    Hannes
>>
>

^ permalink raw reply related
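
The behaviour agreed on in the message above (a redirect refreshes the
route but leaves the socket's error state alone) can be illustrated with
a small user-space sketch. Everything here is hypothetical: the toy_*
types and names are illustrative stand-ins, not the kernel's structures,
and the real handlers take locks and look up associations that are
omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
enum toy_icmp_type { TOY_DEST_UNREACH, TOY_PKT_TOOBIG, TOY_REDIRECT };

struct toy_sock {
	int sk_err;         /* pending error that user space would see */
	bool route_updated; /* set when a redirect refreshed the route */
};

/* Handle an ICMP error for a socket. Only a real error condition
 * reaches the sk_err assignment at the bottom. */
static void toy_err_handler(struct toy_sock *sk, enum toy_icmp_type type)
{
	int err;

	switch (type) {
	case TOY_DEST_UNREACH:
		err = EHOSTUNREACH;
		break;
	case TOY_PKT_TOOBIG:
		/* Path MTU handling would go here; no socket error. */
		return;
	case TOY_REDIRECT:
		sk->route_updated = true;
		return; /* leave without touching sk_err */
	default:
		return;
	}
	sk->sk_err = err;
}

/* Usage: a redirect must not show up as a socket error. */
static void toy_err_handler_demo(void)
{
	struct toy_sock sk = { 0, false };

	toy_err_handler(&sk, TOY_REDIRECT);
	assert(sk.route_updated);
	assert(sk.sk_err == 0);
}
```

Returning from the redirect case, rather than breaking out of the
switch, is what keeps the shared error-reporting tail from running.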

* Re: [PATCH] ehea: remove deprecated IRQF_DISABLED
From: Thadeu Lima de Souza Cascardo @ 2013-09-12 12:05 UTC (permalink / raw)
  To: Michael Opdenacker; +Cc: netdev, linux-kernel
In-Reply-To: <1378957571-3574-1-git-send-email-michael.opdenacker@free-electrons.com>

On Thu, Sep 12, 2013 at 05:46:11AM +0200, Michael Opdenacker wrote:
> This patch proposes to remove the IRQF_DISABLED flag from
> drivers/net/ethernet/ibm/ehea/ehea_main.c
> 
> It's a NOOP since 2.6.35 and it will be removed one day.
> 
> Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
> ---
>  drivers/net/ethernet/ibm/ehea/ehea_main.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
> index 35853b4..04e0ef1 100644
> --- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
> +++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
> @@ -1285,7 +1285,7 @@ static int ehea_reg_interrupts(struct net_device *dev)
> 
>  	ret = ibmebus_request_irq(port->qp_eq->attr.ist1,
>  				  ehea_qp_aff_irq_handler,
> -				  IRQF_DISABLED, port->int_aff_name, port);
> +				  0, port->int_aff_name, port);
>  	if (ret) {
>  		netdev_err(dev, "failed registering irq for qp_aff_irq_handler:ist=%X\n",
>  			   port->qp_eq->attr.ist1);
> @@ -1303,8 +1303,7 @@ static int ehea_reg_interrupts(struct net_device *dev)
>  			 "%s-queue%d", dev->name, i);
>  		ret = ibmebus_request_irq(pr->eq->attr.ist1,
>  					  ehea_recv_irq_handler,
> -					  IRQF_DISABLED, pr->int_send_name,
> -					  pr);
> +					  0, pr->int_send_name, pr);
>  		if (ret) {
>  			netdev_err(dev, "failed registering irq for ehea_queue port_res_nr:%d, ist=%X\n",
>  				   i, pr->eq->attr.ist1);
> @@ -3320,7 +3319,7 @@ static int ehea_probe_adapter(struct platform_device *dev)
>  	}
> 
>  	ret = ibmebus_request_irq(adapter->neq->attr.ist1,
> -				  ehea_interrupt_neq, IRQF_DISABLED,
> +				  ehea_interrupt_neq, 0,
>  				  "ehea_neq", adapter);
>  	if (ret) {
>  		dev_err(&dev->dev, "requesting NEQ IRQ failed\n");
> -- 
> 1.8.1.2
> 

Acked-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>

^ permalink raw reply

* Re: [PATCH] Inet-hashtable: Change the range of sk->hash lock to avoid the race condition.
From: Eric Dumazet @ 2013-09-12 12:00 UTC (permalink / raw)
  To: Jun Chen; +Cc: edumazet, davem, netdev, linux-kernel
In-Reply-To: <1379003549.12328.6.camel@chenjun-workstation>

On Thu, 2013-09-12 at 12:32 -0400, Jun Chen wrote:
> When adding a node to the list in __inet_hash_nolisten(), the code first
> gets the list and then takes the lock. In an extreme case, another context
> can delete this node before the lock is taken, so the node may be NULL.
> This patch takes the lock first and then gets the list, to avoid this race.

I suspect another bug. This should not happen.

Care to describe the problem you got ?

Thanks

^ permalink raw reply

* [PATCH] Don't destroy the netdev until the vif is shut down
From: Paul Durrant @ 2013-09-12 11:08 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Ian Campbell, Wei Liu, David Vrabel

Without this patch, if a frontend cycles through states Closing
and Closed (which Windows frontends need to do) then the netdev
will be destroyed and requires re-invocation of hotplug scripts
to restore state before the frontend can move to Connected. Thus
when udev is not in use the backend gets stuck in InitWait.

With this patch, the netdev is left alone whilst the backend is
still online and is only de-registered and freed just prior to
destroying the vif (which is also nicely symmetrical with the
netdev allocation and registration being done during probe) so
no re-invocation of hotplug scripts is required.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/net/xen-netback/common.h    |    1 +
 drivers/net/xen-netback/interface.c |   13 ++++++++-----
 drivers/net/xen-netback/xenbus.c    |   15 +++++++++++----
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index a197743..5715318 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -184,6 +184,7 @@ int xenvif_connect(struct xenvif *vif, unsigned long tx_ring_ref,
 		   unsigned long rx_ring_ref, unsigned int tx_evtchn,
 		   unsigned int rx_evtchn);
 void xenvif_disconnect(struct xenvif *vif);
+void xenvif_free(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
 void xenvif_xenbus_fini(void);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 625c6f4..65e78f9 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -477,14 +477,17 @@ void xenvif_disconnect(struct xenvif *vif)
 	if (vif->task)
 		kthread_stop(vif->task);
 
+	xenvif_unmap_frontend_rings(vif);
+
+	if (need_module_put)
+		module_put(THIS_MODULE);
+}
+
+void xenvif_free(struct xenvif *vif)
+{
 	netif_napi_del(&vif->napi);
 
 	unregister_netdev(vif->dev);
 
-	xenvif_unmap_frontend_rings(vif);
-
 	free_netdev(vif->dev);
-
-	if (need_module_put)
-		module_put(THIS_MODULE);
 }
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 1fe48fe3..1a5683f 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -213,9 +213,18 @@ static void disconnect_backend(struct xenbus_device *dev)
 {
 	struct backend_info *be = dev_get_drvdata(&dev->dev);
 
+	if (be->vif)
+		xenvif_disconnect(be->vif);
+}
+
+static void destroy_backend(struct xenbus_device *dev)
+{
+	struct backend_info *be = dev_get_drvdata(&dev->dev);
+
 	if (be->vif) {
+		kobject_uevent(&dev->dev.kobj, KOBJ_OFFLINE);
 		xenbus_rm(XBT_NIL, dev->nodename, "hotplug-status");
-		xenvif_disconnect(be->vif);
+		xenvif_free(be->vif);
 		be->vif = NULL;
 	}
 }
@@ -246,14 +255,11 @@ static void frontend_changed(struct xenbus_device *dev,
 	case XenbusStateConnected:
 		if (dev->state == XenbusStateConnected)
 			break;
-		backend_create_xenvif(be);
 		if (be->vif)
 			connect(be);
 		break;
 
 	case XenbusStateClosing:
-		if (be->vif)
-			kobject_uevent(&dev->dev.kobj, KOBJ_OFFLINE);
 		disconnect_backend(dev);
 		xenbus_switch_state(dev, XenbusStateClosing);
 		break;
@@ -262,6 +268,7 @@ static void frontend_changed(struct xenbus_device *dev,
 		xenbus_switch_state(dev, XenbusStateClosed);
 		if (xenbus_dev_is_online(dev))
 			break;
+		destroy_backend(dev);
 		/* fall through if not online */
 	case XenbusStateUnknown:
 		device_unregister(&dev->dev);
-- 
1.7.10.4

^ permalink raw reply related
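
The lifecycle split described in the patch above can be modelled in a
few lines of user-space C. This is only a sketch under assumptions: the
toy_* names are hypothetical, and the flags stand in for the real ring
mappings and netdev registration. It shows that a frontend cycling
through Closing and Closed while the backend is online keeps the netdev,
so it can reconnect without hotplug scripts being re-run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the backend; names are illustrative only. */
struct toy_vif {
	bool rings_mapped;   /* frontend rings currently mapped */
	bool netdev_present; /* netdev allocated and registered */
};

/* Probe: allocate and register the netdev once. */
static void toy_probe(struct toy_vif *vif)
{
	vif->netdev_present = true;
	vif->rings_mapped = false;
}

/* Frontend Connected: only possible while the netdev exists. */
static void toy_connect(struct toy_vif *vif)
{
	if (vif->netdev_present)
		vif->rings_mapped = true;
}

/* Frontend Closing: tear down the connection, keep the netdev. */
static void toy_disconnect(struct toy_vif *vif)
{
	vif->rings_mapped = false;
}

/* Frontend Closed: free the netdev only if the backend went offline. */
static void toy_frontend_closed(struct toy_vif *vif, bool backend_online)
{
	if (backend_online)
		return;
	vif->netdev_present = false; /* mirror of toy_probe() */
}

/* Usage: Closing -> Closed -> Connected works while still online. */
static void toy_vif_demo(void)
{
	struct toy_vif vif;

	toy_probe(&vif);
	toy_connect(&vif);
	toy_disconnect(&vif);
	toy_frontend_closed(&vif, true);
	toy_connect(&vif);
	assert(vif.netdev_present && vif.rings_mapped);
}
```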

* [PATCH 11/11] ipv6: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:52 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

When dealing with a redirect message, err should be
set to 0.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/udp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f405815..d212c62 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -525,8 +525,6 @@ void __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 	if (type == ICMPV6_PKT_TOOBIG)
 		ip6_sk_update_pmtu(skb, sk, info);
-	if (type == NDISC_REDIRECT)
-		ip6_sk_redirect(skb, sk);
 
 	np = inet6_sk(sk);
 
@@ -536,6 +534,8 @@ void __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	if (sk->sk_state != TCP_ESTABLISHED && !np->recverr)
 		goto out;
 
+	if (type == NDISC_REDIRECT)
+		err = 0;
 	if (np->recverr)
 		ipv6_icmp_error(sk, skb, err, uh->dest, ntohl(info), (u8 *)(uh+1));
 
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 10/11] ipv6: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:51 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

When dealing with a redirect message, err should be
set to 0.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/raw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 58916bb..6138199 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -336,7 +336,7 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
 		harderr = (np->pmtudisc == IPV6_PMTUDISC_DO);
 	}
 	if (type == NDISC_REDIRECT)
-		ip6_sk_redirect(skb, sk);
+		err = 0;
 	if (np->recverr) {
 		u8 *payload = skb->data;
 		if (!inet->hdrincl)
-- 
1.8.3.1

^ permalink raw reply related
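
The udp and raw patches above both clear err for redirects before the
error is queued to the socket. A rough user-space sketch of why that
matters follows. It assumes a converter that falls back to EPROTO for
types it does not know, as the thread's mention of "the EPROTO problem"
suggests; toy_err_convert() and the other toy_* names are hypothetical
and are not the kernel's icmpv6_err_convert().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified ICMPv6 types for illustration. */
enum toy_icmp6_type { TOY6_DEST_UNREACH, TOY6_PKT_TOOBIG, TOY6_REDIRECT };

/* What would be queued on the socket's error queue. */
struct toy_report {
	bool queued;
	int err;
};

/* Assumed converter: unknown types (including redirect) map to EPROTO. */
static int toy_err_convert(enum toy_icmp6_type type)
{
	switch (type) {
	case TOY6_DEST_UNREACH:
		return EHOSTUNREACH;
	case TOY6_PKT_TOOBIG:
		return EMSGSIZE;
	default:
		return EPROTO;
	}
}

/* Error handler tail: override err for redirects before reporting. */
static struct toy_report toy_raw_err(enum toy_icmp6_type type, bool recverr)
{
	struct toy_report r = { false, 0 };
	int err = toy_err_convert(type);

	if (type == TOY6_REDIRECT)
		err = 0; /* a redirect is not an error condition */
	if (recverr) {
		r.queued = true;
		r.err = err;
	}
	return r;
}

/* Usage: without the override the redirect would surface as EPROTO. */
static void toy_raw_err_demo(void)
{
	assert(toy_err_convert(TOY6_REDIRECT) == EPROTO);
	assert(toy_raw_err(TOY6_REDIRECT, true).err == 0);
}
```

The alternative raised earlier in the thread, making the converter itself
return 0 for redirects, would put the same rule in one place instead of
in each protocol's handler.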

* [PATCH 09/11] sctp: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:51 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

In addition, when dealing with a redirect message, it should
not report an error to the user and needs to return
directly.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/sctp/input.c | 12 ------------
 net/sctp/ipv6.c  |  6 +++---
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index 5f20686..0d2d4b7 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -413,18 +413,6 @@ void sctp_icmp_frag_needed(struct sock *sk, struct sctp_association *asoc,
 	sctp_retransmit(&asoc->outqueue, t, SCTP_RTXR_PMTUD);
 }
 
-void sctp_icmp_redirect(struct sock *sk, struct sctp_transport *t,
-			struct sk_buff *skb)
-{
-	struct dst_entry *dst;
-
-	if (!t)
-		return;
-	dst = sctp_transport_dst_check(t);
-	if (dst)
-		dst->ops->redirect(dst, sk, skb);
-}
-
 /*
  * SCTP Implementer's Guide, 2.37 ICMP handling procedures
  *
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index da613ce..ee12d87 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -151,6 +151,9 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	int err;
 	struct net *net = dev_net(skb->dev);
 
+	if (type == NDISC_REDIRECT)
+		return;
+
 	idev = in6_dev_get(skb->dev);
 
 	/* Fix up skb to look at the embedded net header. */
@@ -181,9 +184,6 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 			goto out_unlock;
 		}
 		break;
-	case NDISC_REDIRECT:
-		sctp_icmp_redirect(sk, transport, skb);
-		break;
 	default:
 		break;
 	}
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 08/11] ipv6: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:50 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>


Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/tcp_ipv6.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 5c71501..d3ca8a4 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -346,6 +346,10 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	__u32 seq;
 	struct net *net = dev_net(skb->dev);
 
+
+	if (type == NDISC_REDIRECT)
+		return;
+
 	sk = inet6_lookup(net, &tcp_hashinfo, &hdr->daddr,
 			th->dest, &hdr->saddr, th->source, skb->dev->ifindex);
 
@@ -382,14 +386,6 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 	np = inet6_sk(sk);
 
-	if (type == NDISC_REDIRECT) {
-		struct dst_entry *dst = __sk_dst_check(sk, np->dst_cookie);
-
-		if (dst)
-			dst->ops->redirect(dst, sk, skb);
-		goto out;
-	}
-
 	if (type == ICMPV6_PKT_TOOBIG) {
 		/* We are not interested in TCP_LISTEN and open_requests
 		 * (SYN-ACKs send out by Linux are always <576bytes so
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 07/11] ipv6: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:49 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

So ipcomp6_err() only handles ICMPV6_PKT_TOOBIG.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/ipcomp6.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c
index 5636a91..e943158 100644
--- a/net/ipv6/ipcomp6.c
+++ b/net/ipv6/ipcomp6.c
@@ -64,9 +64,7 @@ static void ipcomp6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		(struct ip_comp_hdr *)(skb->data + offset);
 	struct xfrm_state *x;
 
-	if (type != ICMPV6_DEST_UNREACH &&
-	    type != ICMPV6_PKT_TOOBIG &&
-	    type != NDISC_REDIRECT)
+	if (type != ICMPV6_PKT_TOOBIG)
 		return;
 
 	spi = htonl(ntohs(ipcomph->cpi));
@@ -75,10 +73,7 @@ static void ipcomp6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	if (!x)
 		return;
 
-	if (type == NDISC_REDIRECT)
-		ip6_redirect(skb, net, skb->dev->ifindex, 0);
-	else
-		ip6_update_pmtu(skb, net, info, 0, 0);
+	ip6_update_pmtu(skb, net, info, 0, 0);
 	xfrm_state_put(x);
 }
 
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 06/11] ip6tnl: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:49 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

According to RFC 2473, a tunnel ICMP redirect message
should not be reported to the source of the original
packet, so after calling ip6_tnl_err(), rel_msg is set
to 0 in ip4ip6_err() and the redirect is never
handled.

In order to deal with this, we move route updating for
redirect to ndisc layer.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/ip6_tunnel.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 61355f7..3ea834b 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -576,9 +576,6 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		rel_type = ICMP_DEST_UNREACH;
 		rel_code = ICMP_FRAG_NEEDED;
 		break;
-	case NDISC_REDIRECT:
-		rel_type = ICMP_REDIRECT;
-		rel_code = ICMP_REDIR_HOST;
 	default:
 		return 0;
 	}
@@ -637,8 +634,6 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 		skb_dst(skb2)->ops->update_pmtu(skb_dst(skb2), NULL, skb2, rel_info);
 	}
-	if (rel_type == ICMP_REDIRECT)
-		skb_dst(skb2)->ops->redirect(skb_dst(skb2), NULL, skb2);
 
 	icmp_send(skb2, rel_type, rel_code, htonl(rel_info));
 
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH 05/11] ipv6: move route updating for redirect to ndisc layer
From: Duan Jiong @ 2013-09-12 10:47 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes
In-Reply-To: <52319A6E.6090503@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/icmp.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index eef8d94..4bde43c 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -91,8 +91,6 @@ static void icmpv6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 	if (type == ICMPV6_PKT_TOOBIG)
 		ip6_update_pmtu(skb, net, info, 0, 0);
-	else if (type == NDISC_REDIRECT)
-		ip6_redirect(skb, net, skb->dev->ifindex, 0);
 
 	if (!(type & ICMPV6_INFOMSG_MASK))
 		if (icmp6->icmp6_type == ICMPV6_ECHO_REQUEST)
-- 
1.8.3.1

^ permalink raw reply related

