Netdev List


* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Eric W. Biederman @ 2013-10-18 23:21 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: pablo-Cap9r6Oaw4JrovVCs/uTlw,
	netfilter-devel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1380910855-12325-1-git-send-email-dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Daniel Borkmann <dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> Implementation of PID-based matching would not be appropriate
> as they frequently change, and child tracking would make that
> even more complex and ugly. Cgroups would be a perfect candidate
> for accomplishing that as they associate a set of tasks with a
> set of parameters for one or more subsystems, in our case the
> netfilter subsystem, which, of course, can be combined with other
> cgroup subsystems into something more complex.

I am coming to this late.  But two concrete suggestions.

1) process groups and sessions don't change as frequently as pids.

2) It is possible to put a set of processes in their own network
   namespace and pipe just the packets you want those processes to
   use into that network namespace.  Using an ingress queueing filter
   makes that process very efficient even if you have to filter by port.
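
Point 1 can be sketched in user space, assuming a POSIX system: a forked
child gets a new pid but stays in its parent's process group and session.
The helper name below is hypothetical and only serves the illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child and let it report, through its exit status, whether it
 * has a different pid but the same process group and session as the
 * parent.  Returns 1 if so, 0 if not, and -1 if fork/wait failed.
 * (Hypothetical helper, for illustration only.)
 */
int child_keeps_group_and_session(void)
{
	pid_t parent_pid = getpid();
	pid_t parent_pgid = getpgrp();
	pid_t parent_sid = getsid(0);
	int status;
	pid_t child = fork();

	if (child < 0)
		return -1;

	if (child == 0) {
		/* Child: new pid, inherited process group and session */
		int same = getpid() != parent_pid &&
			   getpgrp() == parent_pgid &&
			   getsid(0) == parent_sid;
		_exit(same ? 0 : 1);
	}

	if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status) == 0;
}
```

A matcher keyed on the process group or session id would therefore keep
matching across fork(), unless a process explicitly calls setpgid() or
setsid().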

So I don't think you need cgroups to solve this problem at all.

Eric

^ permalink raw reply

* [PATCH] wireless: rt2800lib: Fix typo on checking
From: Felipe Pena @ 2013-10-19  0:20 UTC (permalink / raw)
  To: Ivo van Doorn, Gertjan van Wingerde, Helmut Schaa,
	John W. Linville
  Cc: linux-wireless, users, netdev, linux-kernel, Felipe Pena

In the rt2800_config_channel_rf53xx function, the member default_power1 is
checked against the bound limit, but default_power2 is the value used instead.

Signed-off-by: Felipe Pena <felipensp@gmail.com>
---
 drivers/net/wireless/rt2x00/rt2800lib.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/rt2x00/rt2800lib.c b/drivers/net/wireless/rt2x00/rt2800lib.c
index 88ce656..1c2ce93 100644
--- a/drivers/net/wireless/rt2x00/rt2800lib.c
+++ b/drivers/net/wireless/rt2x00/rt2800lib.c
@@ -2650,7 +2650,7 @@ static void rt2800_config_channel_rf53xx(struct rt2x00_dev *rt2x00dev,

 	if (rt2x00_rt(rt2x00dev, RT5392)) {
 		rt2800_rfcsr_read(rt2x00dev, 50, &rfcsr);
-		if (info->default_power1 > POWER_BOUND)
+		if (info->default_power2 > POWER_BOUND)
 			rt2x00_set_field8(&rfcsr, RFCSR50_TX, POWER_BOUND);
 		else
 			rt2x00_set_field8(&rfcsr, RFCSR50_TX,
--
1.7.10.4

^ permalink raw reply related

* [PATCH] rtlwifi: rtl8192se: Fix wrong assignment
From: Felipe Pena @ 2013-10-19  0:52 UTC (permalink / raw)
  To: Larry Finger, Chaoming Li, John W. Linville
  Cc: linux-wireless, netdev, linux-kernel, Felipe Pena

There is a typo in the struct member name used in the assignment. When
rtlphy->current_chan_bw == HT_CHANNEL_WIDTH_20_40, the check uses pwrgroup_ht40
as the bound limit, but the assignment uses pwrgroup_ht20 instead.

Signed-off-by: Felipe Pena <felipensp@gmail.com>
---
 drivers/net/wireless/rtlwifi/rtl8192se/rf.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
index 5061f1d..92d38ab 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
@@ -265,7 +265,7 @@ static void _rtl92s_get_txpower_writeval_byregulatory(struct ieee80211_hw *hw,
 				    rtlefuse->pwrgroup_ht40
 				    [RF90_PATH_A][chnl - 1]) {
 					pwrdiff_limit[i] =
-					  rtlefuse->pwrgroup_ht20
+					  rtlefuse->pwrgroup_ht40
 					  [RF90_PATH_A][chnl - 1];
 				}
 			} else {
--
1.7.10.4

^ permalink raw reply related

* Re: [PATCH] rtlwifi: rtl8192se: Fix wrong assignment
From: Joe Perches @ 2013-10-19  1:05 UTC (permalink / raw)
  To: Felipe Pena
  Cc: Larry Finger, Chaoming Li, John W. Linville, linux-wireless,
	netdev, linux-kernel
In-Reply-To: <1382143960-4594-1-git-send-email-felipensp@gmail.com>

On Fri, 2013-10-18 at 21:52 -0300, Felipe Pena wrote:
> There is a typo in the struct member name on assignment when checking
> rtlphy->current_chan_bw == HT_CHANNEL_WIDTH_20_40, the check uses pwrgroup_ht40
> for bound limit and uses pwrgroup_ht20 when assigning instead.
> 
> Signed-off-by: Felipe Pena <felipensp@gmail.com>
> ---
>  drivers/net/wireless/rtlwifi/rtl8192se/rf.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
> index 5061f1d..92d38ab 100644
> --- a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
> +++ b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
> @@ -265,7 +265,7 @@ static void _rtl92s_get_txpower_writeval_byregulatory(struct ieee80211_hw *hw,
>  				    rtlefuse->pwrgroup_ht40
>  				    [RF90_PATH_A][chnl - 1]) {
>  					pwrdiff_limit[i] =
> -					  rtlefuse->pwrgroup_ht20
> +					  rtlefuse->pwrgroup_ht40
>  					  [RF90_PATH_A][chnl - 1];
>  				}

Using min could help minimize typos

					pwrdiff_limit[i] = min(pwrdiff_limit[i],
							       rtlefuse->pwrgroup_ht40[RF90_PATH_A][chnl - 1]);
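
A minimal user-space sketch of why min() leaves less room for this kind of
typo, assuming the surrounding check is a plain "greater than the limit"
comparison. The function names and the uint8_t type are assumptions for
illustration; min_u8() is a stand-in, not the kernel's min() macro.

```c
#include <stdint.h>

/*
 * Open-coded clamp: the limit is spelled twice, once in the check and
 * once in the assignment, so the two copies can drift apart.
 */
uint8_t clamp_open_coded(uint8_t value, uint8_t limit)
{
	if (value > limit)
		value = limit;
	return value;
}

/* Stand-in for the kernel's min(), restricted to uint8_t */
uint8_t min_u8(uint8_t a, uint8_t b)
{
	return a < b ? a : b;
}

/* Clamp through min: the limit is named exactly once */
uint8_t clamp_with_min(uint8_t value, uint8_t limit)
{
	return min_u8(value, limit);
}
```

Both forms return the same result for every input; the second simply has
only one place where the wrong array could be named.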

^ permalink raw reply

* Re: [PATCH] rtlwifi: rtl8192se: Fix wrong assignment
From: Larry Finger @ 2013-10-19  1:33 UTC (permalink / raw)
  To: Joe Perches, Felipe Pena
  Cc: Chaoming Li, John W. Linville, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1382144748.2041.16.camel@joe-AO722>

On 10/18/2013 08:05 PM, Joe Perches wrote:
> On Fri, 2013-10-18 at 21:52 -0300, Felipe Pena wrote:
>> There is a typo in the struct member name on assignment when checking
>> rtlphy->current_chan_bw == HT_CHANNEL_WIDTH_20_40, the check uses pwrgroup_ht40
>> for bound limit and uses pwrgroup_ht20 when assigning instead.
>>
>> Signed-off-by: Felipe Pena <felipensp@gmail.com>
>> ---
>>   drivers/net/wireless/rtlwifi/rtl8192se/rf.c |    2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
>> index 5061f1d..92d38ab 100644
>> --- a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
>> +++ b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
>> @@ -265,7 +265,7 @@ static void _rtl92s_get_txpower_writeval_byregulatory(struct ieee80211_hw *hw,
>>   				    rtlefuse->pwrgroup_ht40
>>   				    [RF90_PATH_A][chnl - 1]) {
>>   					pwrdiff_limit[i] =
>> -					  rtlefuse->pwrgroup_ht20
>> +					  rtlefuse->pwrgroup_ht40
>>   					  [RF90_PATH_A][chnl - 1];
>>   				}
>
> Using min could help minimize typos
>
> 					pwrdiff_limit[i] = min(pwrdiff_limit[i],
> 							       rtlefuse->pwrgroup_ht40[RF90_PATH_A][chnl - 1]);

In the initial driver, that would have been true; however, at this point, I 
prefer the patch as submitted. There is less churning of the source.

Larry

^ permalink raw reply

* Re: [PATCH] rtlwifi: rtl8192se: Fix wrong assignment
From: Larry Finger @ 2013-10-19  1:35 UTC (permalink / raw)
  To: Felipe Pena, Chaoming Li, John W. Linville
  Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <1382143960-4594-1-git-send-email-felipensp@gmail.com>

On 10/18/2013 07:52 PM, Felipe Pena wrote:
> There is a typo in the struct member name on assignment when checking
> rtlphy->current_chan_bw == HT_CHANNEL_WIDTH_20_40, the check uses pwrgroup_ht40
> for bound limit and uses pwrgroup_ht20 when assigning instead.
>
> Signed-off-by: Felipe Pena <felipensp@gmail.com>
> ---
>   drivers/net/wireless/rtlwifi/rtl8192se/rf.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
> index 5061f1d..92d38ab 100644
> --- a/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
> +++ b/drivers/net/wireless/rtlwifi/rtl8192se/rf.c
> @@ -265,7 +265,7 @@ static void _rtl92s_get_txpower_writeval_byregulatory(struct ieee80211_hw *hw,
>   				    rtlefuse->pwrgroup_ht40
>   				    [RF90_PATH_A][chnl - 1]) {
>   					pwrdiff_limit[i] =
> -					  rtlefuse->pwrgroup_ht20
> +					  rtlefuse->pwrgroup_ht40
>   					  [RF90_PATH_A][chnl - 1];
>   				}
>   			} else {
> --

Acked-by: Larry Finger <Larry.Finger@lwfinger.net>

John: I forgot to mention this earlier, but when you add your s-o-b, please add 
a Cc for Stable [3.0 +].

Thanks,

Larry

^ permalink raw reply

* Re: [PATCH 3/9 RESENT] staging: r8188eu: Fix sparse warnings in ioctl_linux.c
From: Larry Finger @ 2013-10-19  1:45 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: gregkh, netdev, devel
In-Reply-To: <1382134111.3381.25.camel@bwh-desktop.uk.level5networks.com>

On 10/18/2013 05:08 PM, Ben Hutchings wrote:
> On Fri, 2013-10-18 at 16:34 -0500, Larry Finger wrote:
>> Sparse checking results in the following warnings:
>>
>>     CHECK   drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
>> drivers/staging/rtl8188eu/os_dep/ioctl_linux.c:3932:17: warning: cast removes
>> address space of expression
>> drivers/staging/rtl8188eu/os_dep/ioctl_linux.c:3933:31: warning: incorrect type
>> in argument 1 (different address spaces)
> [...]
>> -		DBG_88E("[%s] extra = %s\n", __func__, (char *)wrqu->data.pointer);
>> -	if (!memcmp(wrqu->data.pointer, "status", 6)) {
>> +		DBG_88E("[%s] extra = %s\n", __func__, (__force char *)wrqu->data.pointer);
>> +	if (!memcmp((__force const char *)wrqu->data.pointer, "status", 6)) {
> [...]
>
> This is hiding the problem, not fixing it.
>
> This private interface needs to be removed anyway.

I agree, but that is not likely to happen soon. I will just leave this set of 
Sparse warnings alone.

Larry

^ permalink raw reply

* Re: [PATCH] packet: Deliver VLAN TPID to userspace
From: Atzm Watanabe @ 2013-10-19  6:19 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20131018105655.2cdc628e@nehalam.linuxnetplumber.net>

At Fri, 18 Oct 2013 10:56:55 -0700,
Stephen Hemminger wrote:
> 
> On Sat, 19 Oct 2013 02:08:11 +0900
> Atzm Watanabe <atzm@stratosphere.co.jp> wrote:
> 
> > diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
> > index dbf0666..6e36e0a 100644
> > --- a/include/uapi/linux/if_packet.h
> > +++ b/include/uapi/linux/if_packet.h
> > @@ -83,7 +83,7 @@ struct tpacket_auxdata {
> >  	__u16		tp_mac;
> >  	__u16		tp_net;
> >  	__u16		tp_vlan_tci;
> > -	__u16		tp_padding;
> > +	__u16		tp_vlan_tpid;
> >  };
> >  
> >  /* Rx ring - header status */
> > @@ -132,12 +132,13 @@ struct tpacket2_hdr {
> >  	__u32		tp_sec;
> >  	__u32		tp_nsec;
> >  	__u16		tp_vlan_tci;
> > -	__u16		tp_padding;
> > +	__u16		tp_vlan_tpid;
> >  };
> >  
> >  struct tpacket_hdr_variant1 {
> >  	__u32	tp_rxhash;
> >  	__u32	tp_vlan_tci;
> > +	__u32	tp_vlan_tpid;
> >  };
> 
> The last change will break ABI to userspace applications.
> You can reuse padding elements; but you can't increase (or shrink)
> an existing structure.

Thank you for pointing this out.
But I have two questions:

  - Patches that increase the size of existing structures have been
    posted and accepted in the past (e.g. 393e52e33c6c26ec7db290dab803bac1bed962d4
    "packet: deliver VLAN TCI to userspace").
    What is the difference between them and my patch?

  - I tested with tools/testing/selftests/net/psock_tpacket.c built
    before applying my patch, and all test cases passed.
    I also tested with the code listed in
    Documentation/networking/packet_mmap.txt "AF_PACKET TPACKET_V3
    example".  It did not seem to cause any problem.
    What situation causes the problem that you have in mind?
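
The distinction between reusing padding and growing a structure can be
sketched in user space. The struct and function names below are
hypothetical simplifications, not the real if_packet.h definitions, and
the sketch does not model how the real structures are embedded in larger
headers, which affects whether a size change is visible to userspace.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout ending in a padding member */
struct toy_hdr_old {
	uint16_t vlan_tci;
	uint16_t padding;
};

/* Padding reused for a new field: same size, same offsets */
struct toy_hdr_reuse {
	uint16_t vlan_tci;
	uint16_t vlan_tpid;
};

/* Hypothetical "before" layout with no spare padding */
struct toy_variant_old {
	uint32_t rxhash;
	uint32_t vlan_tci;
};

/* New member appended: the structure grows */
struct toy_variant_grown {
	uint32_t rxhash;
	uint32_t vlan_tci;
	uint32_t vlan_tpid;
};

/* Size change, in bytes, when the padding member is renamed */
size_t reuse_size_delta(void)
{
	return sizeof(struct toy_hdr_reuse) - sizeof(struct toy_hdr_old);
}

/* Size change, in bytes, when a member is appended */
size_t grow_size_delta(void)
{
	return sizeof(struct toy_variant_grown) -
	       sizeof(struct toy_variant_old);
}
```

A binary built against the old headers keeps using the old sizeof values,
so only the second kind of change can shift anything it expects to find
after the structure.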

Thank you.

^ permalink raw reply

* Re: [patch net REPOST 2/3] ip6_output: do skb ufo init for peeked non ufo skb as well
From: Jiri Pirko @ 2013-10-19  6:23 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: netdev, davem, eric.dumazet, hannes, jdmason, yoshfuji, kuznet,
	jmorris, kaber, herbert
In-Reply-To: <5261819B.8020801@cogentembedded.com>

Fri, Oct 18, 2013 at 08:44:43PM CEST, sergei.shtylyov@cogentembedded.com wrote:
>Hello.
>
>On 10/18/2013 09:13 PM, Jiri Pirko wrote:
>
>>Now, if user application does:
>>sendto len<mtu flag MSG_MORE
>>sendto len>mtu flag 0
>>The skb is not treated as fragmented one because it is not initialized
>>that way. So move the initialization to fix this.
>
>>introduced by:
>>commit e89e9cf539a28df7d0eb1d0a545368e9920b34ac "[IPv4/IPv6]: UFO Scatter-gather approach"
>
>>Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>---
>>  net/ipv6/ip6_output.c | 26 ++++++++++++++------------
>>  1 file changed, 14 insertions(+), 12 deletions(-)
>
>>diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>>index a54c45c..c6cfa2f 100644
>>--- a/net/ipv6/ip6_output.c
>>+++ b/net/ipv6/ip6_output.c
>[...]
>>@@ -1036,20 +1035,23 @@ static inline int ip6_ufo_append_data(struct sock *sk,
>>  		skb->transport_header = skb->network_header + fragheaderlen;
>>
>>  		skb->protocol = htons(ETH_P_IPV6);
>>-		skb->ip_summed = CHECKSUM_PARTIAL;
>>  		skb->csum = 0;
>>
>>-		/* Specify the length of each IPv6 datagram fragment.
>>-		 * It has to be a multiple of 8.
>>-		 */
>>-		skb_shinfo(skb)->gso_size = (mtu - fragheaderlen -
>>-					     sizeof(struct frag_hdr)) & ~7;
>>-		skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
>>-		ipv6_select_ident(&fhdr, rt);
>>-		skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
>>  		__skb_queue_tail(&sk->sk_write_queue, skb);
>>-	}
>>+	} else if (skb_is_gso(skb))
>>+		goto append;
>
>   According to Documentation/CodingStyle, there should be {} in both
>arms of *if* statement if there's {} in one of them.

Yep, you are right. I wonder why scripts/checkpatch.pl does not warn about
this. I will send v2. Thanks


>
>WBR, Sergei
>

^ permalink raw reply

* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Daniel Borkmann @ 2013-10-19  7:16 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: pablo-Cap9r6Oaw4JrovVCs/uTlw,
	netfilter-devel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <87li1qp3l8.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

On 10/19/2013 01:21 AM, Eric W. Biederman wrote:

> I am coming to this late.  But two concrete suggestions.
>
> 1) process groups and sessions don't change as frequently as pids.
>
> 2) It is possible to put a set of processes in their own network
>     namespace and pipe just the packets you want those processes to
>     use into that network namespace.  Using an ingress queueing filter
>     makes that process very efficient even if you have to filter by port.

Actually in our case we're filtering outgoing traffic, based on which
local socket it originated from; so you wouldn't need all of that
construct. Also, you wouldn't even need to have a priori knowledge of
the application internals regarding their particular use of ports
or protocols. I don't think that such a setup will have the same
efficiency, ease of use, and power to distinguish the application the
traffic came from in such a lightweight, protocol independent and easy way.

^ permalink raw reply

* Re: [PATCH] wireless: rt2800lib: Fix typo on checking
From: Gertjan van Wingerde @ 2013-10-19  8:31 UTC (permalink / raw)
  To: Felipe Pena, John W. Linville
  Cc: Ivo van Doorn, Helmut Schaa, linux-wireless, users, netdev,
	linux-kernel
In-Reply-To: <1382142042-4463-1-git-send-email-felipensp@gmail.com>

On 10/19/13 02:20, Felipe Pena wrote:
> On rt2800_config_channel_rf53xx function the member default_power1 is checked
> for bound limit, but default_power2 is used instead.
> 
> Signed-off-by: Felipe Pena <felipensp@gmail.com>

Good catch.

Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>

> ---
>  drivers/net/wireless/rt2x00/rt2800lib.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/rt2x00/rt2800lib.c b/drivers/net/wireless/rt2x00/rt2800lib.c
> index 88ce656..1c2ce93 100644
> --- a/drivers/net/wireless/rt2x00/rt2800lib.c
> +++ b/drivers/net/wireless/rt2x00/rt2800lib.c
> @@ -2650,7 +2650,7 @@ static void rt2800_config_channel_rf53xx(struct rt2x00_dev *rt2x00dev,
> 
>  	if (rt2x00_rt(rt2x00dev, RT5392)) {
>  		rt2800_rfcsr_read(rt2x00dev, 50, &rfcsr);
> -		if (info->default_power1 > POWER_BOUND)
> +		if (info->default_power2 > POWER_BOUND)
>  			rt2x00_set_field8(&rfcsr, RFCSR50_TX, POWER_BOUND);
>  		else
>  			rt2x00_set_field8(&rfcsr, RFCSR50_TX,
> --
> 1.7.10.4
> 


-- 
---
Gertjan

^ permalink raw reply

* Re: IPv6 path discovery oddities - flushing the routing cache resolves
From: Hannes Frederic Sowa @ 2013-10-19  8:42 UTC (permalink / raw)
  To: Valentijn Sessink; +Cc: netdev, sgunderson
In-Reply-To: <5260D8DE.30303@blub.net>

On Fri, Oct 18, 2013 at 08:44:46AM +0200, Valentijn Sessink wrote:
> On 18-10-13 05:04, Hannes Frederic Sowa wrote:
> > Thanks, I needed this to verify I am on the right track replicating this.
> > 2001:1af8:ff03:3:219:66ff:fe26:6dd is the other end of the connection, I
> > guess?
> 
> Yes, the working connection (first example) is from
> 2001:1af8:ff03:3:219:66ff:fe26:6dd. The non-working connection should
> have an MTU of 1280 on the 2001:7b8:1529:: subnet connections (those are
> tunneled, with the tunnel restricting the MTU).

I got access to a nice test box yesterday where I could brute force the
problem in parallel (it was a PITA). This is what I found:

This first patch solves the problem of a complete lockdown of all sockets
towards one ipv6 destination. This can happen if we recheck the ipv6 fib
(expiration is ok) and we get back a rt6_info to which we apply the new
metrics information. After the check, the dst entry expires and we do a
relookup. We try to insert the same routing information into the fib, which
results only in a call to rt6_clean_expires. Because we don't reset the
dst.expires value, a later update of mtu information won't update the
expiration time, due to the strange semantics in rt6_update_expires. This
patch should fix this.

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 6738f34..3932633 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -164,6 +164,7 @@ static inline struct inet6_dev *ip6_dst_idev(struct dst_entry *dst)

 static inline void rt6_clean_expires(struct rt6_info *rt)
 {
+	rt->dst.expires = 0;
 	rt->rt6i_flags &= ~RTF_EXPIRES;
 }
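
The effect of a stale expiry value can be shown with a toy user-space
model. It assumes the update rule only replaces the stored deadline when
that deadline is unset or later than the new one; all names, the flag
value and the plain integer clock are hypothetical, and jiffies
wraparound is ignored.

```c
#include <stdbool.h>

#define TOY_RTF_EXPIRES 0x1u	/* hypothetical flag bit, not the kernel value */

struct toy_route {
	unsigned long expires;	/* 0 means "no deadline stored" */
	unsigned int flags;
};

/*
 * Assumed update rule: keep the stored deadline unless it is unset or
 * later than the new one.
 */
void toy_update_expires(struct toy_route *rt, unsigned long now,
			unsigned long timeout)
{
	unsigned long deadline = now + timeout;

	if (rt->expires == 0 || deadline < rt->expires)
		rt->expires = deadline;
	rt->flags |= TOY_RTF_EXPIRES;
}

/* Without the patch: only the flag is cleared, the old deadline stays */
void toy_clean_expires_old(struct toy_route *rt)
{
	rt->flags &= ~TOY_RTF_EXPIRES;
}

/* With the patch: the stored deadline is reset as well */
void toy_clean_expires_new(struct toy_route *rt)
{
	rt->expires = 0;
	rt->flags &= ~TOY_RTF_EXPIRES;
}

/*
 * Set a deadline at t=100, clean the route, then update it again at
 * t=1000.  Returns the deadline stored afterwards.
 */
unsigned long toy_deadline_after_reuse(bool patched)
{
	struct toy_route rt = { 0, 0 };

	toy_update_expires(&rt, 100, 50);	/* stores 150 */
	if (patched)
		toy_clean_expires_new(&rt);
	else
		toy_clean_expires_old(&rt);
	toy_update_expires(&rt, 1000, 50);	/* asks for 1050 */
	return rt.expires;
}
```

In this model the unpatched path keeps the deadline at 150, already in the
past at t=1000, while the patched path stores 1050.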

The second patch resolves the problem that the socket keeps hanging on
outdated mtu information which gets invalidated just after processing. We
need to relookup the destination entry in case the socket expires. This
helps a socket to free the cached dst before applying the mtu information
to an already expired dst which will be reinserted (see above, it will
only call rt6_clean_expires on the dst_entry). This is normally not a
problem, but in the process of the creation of the cloned dst_entry we
end up copying the metric information from the non-DST_CACHEd route to
the dst_entry (ip6_rt_copy/dst_copy_metrics). Because the information is
held in inetpeer storage and the expired dst and the new dst have the
same key, we overwrite the metrics store which is currently in use by
two rt6_infos. So we just invalidate the newly installed metrics
information and will use the interface mtu just after the PACKET_TOO_BIG
notification, which leads to hangs of the connection. A flush of the
cached routing entries causes relookups, so this is a workaround.

This patch should fix this:

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c3130ff..7629022 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1064,10 +1064,13 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
 	if (rt->rt6i_genid != rt_genid_ipv6(dev_net(rt->dst.dev)))
 		return NULL;

-	if (rt->rt6i_node && (rt->rt6i_node->fn_sernum == cookie))
-		return dst;
+	if (!rt->rt6i_node && (rt->rt6i_node->fn_sernum != cookie))
+		return NULL;

-	return NULL;
+	if (rt6_check_expired(rt))
+		return NULL;
+
+	return dst;
 }

 static struct dst_entry *ip6_negative_advice(struct dst_entry *dst)

I had the patches in test for a few hours on some VMs where I could normally
reproduce this issue within 5 minutes. They are for testing only and I don't
know if they resolve all issues. I also have to check why rt6_update_expires
has such strange expiration update logic.

Steinar and Valentijn could you give them a test drive?

Greetings,

  Hannes

^ permalink raw reply related

* Re: IPv6 path discovery oddities - flushing the routing cache resolves
From: Hannes Frederic Sowa @ 2013-10-19  8:53 UTC (permalink / raw)
  To: Valentijn Sessink, netdev, sgunderson
In-Reply-To: <20131019084225.GA31333@order.stressinduktion.org>

On Sat, Oct 19, 2013 at 10:42:25AM +0200, Hannes Frederic Sowa wrote:
> This patch should fix this:
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index c3130ff..7629022 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1064,10 +1064,13 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
>  	if (rt->rt6i_genid != rt_genid_ipv6(dev_net(rt->dst.dev)))
>  		return NULL;
>  
> -	if (rt->rt6i_node && (rt->rt6i_node->fn_sernum == cookie))
> -		return dst;
> +	if (!rt->rt6i_node && (rt->rt6i_node->fn_sernum != cookie))
> +		return NULL;

I just rewrote this patch to have a clean diff, but I forgot to change the '&&'
to '||'.
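
The inversion follows De Morgan's law: the negation of "a && b" is
"!a || !b". A small user-space sketch with a hypothetical node type:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fib node; only the serial number matters */
struct toy_node {
	unsigned int sernum;
};

/* Original form: accept only if the node exists and the serial matches */
bool accept_original(const struct toy_node *node, unsigned int cookie)
{
	return node && node->sernum == cookie;
}

/*
 * Inverted form: reject if the node is missing OR the serial differs.
 * Writing '&&' here instead would never reject an existing node with a
 * stale serial, and would dereference node when it is NULL.
 */
bool reject_with_or(const struct toy_node *node, unsigned int cookie)
{
	return !node || node->sernum != cookie;
}
```

For every input, reject_with_or() is the exact negation of
accept_original(), which is what the early-return rewrite needs.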

Correct patch is:

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c3130ff..17b3cc1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1064,10 +1064,13 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
 	if (rt->rt6i_genid != rt_genid_ipv6(dev_net(rt->dst.dev)))
 		return NULL;
 
-	if (rt->rt6i_node && (rt->rt6i_node->fn_sernum == cookie))
-		return dst;
+	if (!rt->rt6i_node || (rt->rt6i_node->fn_sernum != cookie))
+		return NULL;
 
-	return NULL;
+	if (rt6_check_expired(rt))
+		return NULL;
+
+	return dst;
 }
 
 static struct dst_entry *ip6_negative_advice(struct dst_entry *dst)

^ permalink raw reply related

* Re: IPv6 path discovery oddities - flushing the routing cache resolves
From: Steinar H. Gunderson @ 2013-10-19 10:12 UTC (permalink / raw)
  To: Valentijn Sessink, netdev
In-Reply-To: <20131019084225.GA31333@order.stressinduktion.org>

On Sat, Oct 19, 2013 at 10:42:25AM +0200, Hannes Frederic Sowa wrote:
> I got access to a nice test box yesterday where I could brute force the
> problem in parallel (it was a PITA). This is what I found:

Thanks so much for looking into this! It's great to hear that it's not
just me being weird =) I'll try rebooting the machine in question into a
kernel with these patches later today; given how intermittent it is, it can
of course only show if the patches are not fully working (or, over time,
make it likely that they are).

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply

* [patch net v2 0/3] UFO fixes
From: Jiri Pirko @ 2013-10-19 10:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, eric.dumazet, hannes, jdmason, yoshfuji, kuznet, jmorris,
	kaber, herbert

A couple of patches fixing UFO functionality in different situations.

v1->v2:
- minor if{}else{} coding style adjustment suggested by Sergei Shtylyov

Jiri Pirko (3):
  udp6: respect IPV6_DONTFRAG sockopt in case there are pending frames
  ip6_output: do skb ufo init for peeked non ufo skb as well
  ip_output: do skb ufo init for peeked non ufo skb as well

 net/ipv4/ip_output.c  | 13 +++++++++----
 net/ipv6/ip6_output.c | 25 ++++++++++++++-----------
 net/ipv6/udp.c        |  5 ++---
 3 files changed, 25 insertions(+), 18 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* [patch net v2 2/3] ip6_output: do skb ufo init for peeked non ufo skb as well
From: Jiri Pirko @ 2013-10-19 10:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, eric.dumazet, hannes, jdmason, yoshfuji, kuznet, jmorris,
	kaber, herbert
In-Reply-To: <1382178557-14737-1-git-send-email-jiri@resnulli.us>

Now, if user application does:
sendto len<mtu flag MSG_MORE
sendto len>mtu flag 0
The skb is not treated as a fragmented one because it is not initialized
that way. So move the initialization to fix this.

introduced by:
commit e89e9cf539a28df7d0eb1d0a545368e9920b34ac "[IPv4/IPv6]: UFO Scatter-gather approach"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 net/ipv6/ip6_output.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index a54c45c..975624b 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1008,6 +1008,7 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 
 {
 	struct sk_buff *skb;
+	struct frag_hdr fhdr;
 	int err;
 
 	/* There is support for UDP large send offload by network
@@ -1015,8 +1016,6 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 	 * udp datagram
 	 */
 	if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) {
-		struct frag_hdr fhdr;
-
 		skb = sock_alloc_send_skb(sk,
 			hh_len + fragheaderlen + transhdrlen + 20,
 			(flags & MSG_DONTWAIT), &err);
@@ -1036,20 +1035,24 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 		skb->transport_header = skb->network_header + fragheaderlen;
 
 		skb->protocol = htons(ETH_P_IPV6);
-		skb->ip_summed = CHECKSUM_PARTIAL;
 		skb->csum = 0;
 
-		/* Specify the length of each IPv6 datagram fragment.
-		 * It has to be a multiple of 8.
-		 */
-		skb_shinfo(skb)->gso_size = (mtu - fragheaderlen -
-					     sizeof(struct frag_hdr)) & ~7;
-		skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
-		ipv6_select_ident(&fhdr, rt);
-		skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
 		__skb_queue_tail(&sk->sk_write_queue, skb);
+	} else if (skb_is_gso(skb)) {
+		goto append;
 	}
 
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	/* Specify the length of each IPv6 datagram fragment.
+	 * It has to be a multiple of 8.
+	 */
+	skb_shinfo(skb)->gso_size = (mtu - fragheaderlen -
+				     sizeof(struct frag_hdr)) & ~7;
+	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
+	ipv6_select_ident(&fhdr, rt);
+	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
+
+append:
 	return skb_append_datato_frags(sk, skb, getfrag, from,
 				       (length - transhdrlen));
 }
-- 
1.8.3.1

^ permalink raw reply related

* [patch net v2 1/3] udp6: respect IPV6_DONTFRAG sockopt in case there are pending frames
From: Jiri Pirko @ 2013-10-19 10:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, eric.dumazet, hannes, jdmason, yoshfuji, kuznet, jmorris,
	kaber, herbert
In-Reply-To: <1382178557-14737-1-git-send-email-jiri@resnulli.us>

If up->pending != 0, dontfrag is left with its default value of -1. As a
result, an application that does:
sendto len>mtu flag MSG_MORE
sendto len>mtu flag 0
will receive EMSGSIZE errno as the result of the second sendto.

This patch fixes it by respecting IPV6_DONTFRAG socket option.

introduced by:
commit 4b340ae20d0e2366792abe70f46629e576adaf5e "IPv6: Complete IPV6_DONTFRAG support"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 net/ipv6/udp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 72b7eaa..1878609 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1225,9 +1225,6 @@ do_udp_sendmsg:
 	if (tclass < 0)
 		tclass = np->tclass;
 
-	if (dontfrag < 0)
-		dontfrag = np->dontfrag;
-
 	if (msg->msg_flags&MSG_CONFIRM)
 		goto do_confirm;
 back_from_confirm:
@@ -1246,6 +1243,8 @@ back_from_confirm:
 	up->pending = AF_INET6;
 
 do_append_data:
+	if (dontfrag < 0)
+		dontfrag = np->dontfrag;
 	up->len += ulen;
 	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
 	err = ip6_append_data(sk, getfrag, msg->msg_iov, ulen,
-- 
1.8.3.1

^ permalink raw reply related

* [patch net v2 3/3] ip_output: do skb ufo init for peeked non ufo skb as well
From: Jiri Pirko @ 2013-10-19 10:29 UTC (permalink / raw)
  To: netdev
  Cc: davem, eric.dumazet, hannes, jdmason, yoshfuji, kuznet, jmorris,
	kaber, herbert
In-Reply-To: <1382178557-14737-1-git-send-email-jiri@resnulli.us>

Now, if user application does:
sendto len<mtu flag MSG_MORE
sendto len>mtu flag 0
The skb is not treated as a fragmented one because it is not initialized
that way. So move the initialization to fix this.

introduced by:
commit e89e9cf539a28df7d0eb1d0a545368e9920b34ac "[IPv4/IPv6]: UFO Scatter-gather approach"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 net/ipv4/ip_output.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index a04d872..3982eab 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -772,15 +772,20 @@ static inline int ip_ufo_append_data(struct sock *sk,
 		/* initialize protocol header pointer */
 		skb->transport_header = skb->network_header + fragheaderlen;
 
-		skb->ip_summed = CHECKSUM_PARTIAL;
 		skb->csum = 0;
 
-		/* specify the length of each IP datagram fragment */
-		skb_shinfo(skb)->gso_size = maxfraglen - fragheaderlen;
-		skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
+
 		__skb_queue_tail(queue, skb);
+	} else if (skb_is_gso(skb)) {
+		goto append;
 	}
 
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	/* specify the length of each IP datagram fragment */
+	skb_shinfo(skb)->gso_size = maxfraglen - fragheaderlen;
+	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
+
+append:
 	return skb_append_datato_frags(sk, skb, getfrag, from,
 				       (length - transhdrlen));
 }
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH RESENT] staging: r8188eu: Move writeN buffer off stack
From: Dan Carpenter @ 2013-10-19 11:12 UTC (permalink / raw)
  To: Larry Finger; +Cc: devel, gregkh, netdev
In-Reply-To: <5261AA78.6090205@lwfinger.net>

On Fri, Oct 18, 2013 at 04:39:04PM -0500, Larry Finger wrote:
> The driver places a 254-byte buffer on the stack when writing long output.
> To reduce stack usage, a buffer of the required length is acquired using
> kmemdup().
> 

The patch is badly whitespace damaged.

regards,
dan carpenter

^ permalink raw reply

* net: sctp: possible dereference after freeing
From: Geyslan Gregório Bem @ 2013-10-19 11:38 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Neil Horman, David S. Miller, linux-sctp, netdev, LKML

Hi maintainers,

I would like to know if these are catches:

/net/sctp/endpointola.c (281)
static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
{
    struct sock *sk;
...
    kfree(ep);
    SCTP_DBG_OBJCNT_DEC(ep);
}

The 'ep' object counter is being decremented?! Is the kfree really meant to be there?
Let me know what was intended here.

Same here:
/net/sctp/endpointola.c (165)
static void sctp_transport_destroy_rcu(struct rcu_head *head)
{
    struct sctp_transport *transport;
...
    kfree(transport);
    SCTP_DBG_OBJCNT_DEC(transport);
}

Regards,

Geyslan Gregório Bem
hackingbits.com

^ permalink raw reply

* Re: net: sctp: possible dereference after freeing
From: Geyslan Gregório Bem @ 2013-10-19 11:49 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Neil Horman, David S. Miller, linux-sctp, netdev, LKML
In-Reply-To: <CAGG-pUTXC+6O-wyYp0FZxKH-XWqP++jLEn-Km5kixQFcKsWRZA@mail.gmail.com>

2013/10/19 Geyslan Gregório Bem <geyslan@gmail.com>:
> Hi maintainers,
>
> I would like to know if these are catches:
>
> /net/sctp/endpointola.c (281)
> static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
> {
>     struct sock *sk;
> ...
>     kfree(ep);
>     SCTP_DBG_OBJCNT_DEC(ep);
> }
>
> 'ep' is passed to SCTP_DBG_OBJCNT_DEC() after it has been freed. Is the
> kfree() really meant to come first? Let me know what was intended here.
>
> Same here:
> /net/sctp/transport.c (165)
> static void sctp_transport_destroy_rcu(struct rcu_head *head)
> {
>     struct sctp_transport *transport;
> ...
>     kfree(transport);
>     SCTP_DBG_OBJCNT_DEC(transport);
> }
>
> Regards,
>
> Geyslan Gregório Bem
> hackingbits.com

Oops. I got it. ;)

extern atomic_t sctp_dbg_objcnt_ep;
http://lxr.free-electrons.com/source/include/net/sctp/sctp.h#L269
extern atomic_t sctp_dbg_objcnt_transport;
http://lxr.free-electrons.com/source/arch/x86/include/asm/atomic.h#L105
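
For anyone else reading along, here is a minimal user-space sketch of why
the order is harmless, assuming the macro only pastes its argument onto a
counter name (as the extern declarations above suggest) and never
evaluates the pointer. The names dbg_objcnt_ep, DBG_OBJCNT_INC/DEC and
struct endpoint are made up for illustration; they are not the kernel's
definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-type debug object counter. */
static atomic_int dbg_objcnt_ep;

/*
 * Token pasting: 'name' is only used as an identifier suffix, so
 * DBG_OBJCNT_DEC(ep) becomes atomic_fetch_sub(&dbg_objcnt_ep, 1).
 * A variable that happens to be called 'ep' is never read.
 */
#define DBG_OBJCNT_INC(name) atomic_fetch_add(&dbg_objcnt_##name, 1)
#define DBG_OBJCNT_DEC(name) atomic_fetch_sub(&dbg_objcnt_##name, 1)

struct endpoint {
    int id;
};

static struct endpoint *endpoint_new(void)
{
    struct endpoint *ep = malloc(sizeof(*ep));

    if (ep) {
        ep->id = 0;
        DBG_OBJCNT_INC(ep);
    }
    return ep;
}

static void endpoint_destroy(struct endpoint *ep)
{
    assert(ep != NULL);
    free(ep);
    /* Touches only the global counter, not the freed memory. */
    DBG_OBJCNT_DEC(ep);
}

static int endpoint_count(void)
{
    return atomic_load(&dbg_objcnt_ep);
}
```

Under that assumption the decrement after the free is not a
use-after-free, although it reads like one at first glance.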

Cheers.

^ permalink raw reply

* Re: IPv6 path discovery oddities - flushing the routing cache resolves
From: Valentijn Sessink @ 2013-10-19 14:24 UTC (permalink / raw)
  To: netdev; +Cc: Steinar H. Gunderson
In-Reply-To: <20131019101246.GA27576@sesse.net>

On 19-10-13 12:12, Steinar H. Gunderson wrote:
> Thanks so much for looking into this! It's great to hear that it's not
> just me being weird =)

I'm not sure the patch addresses weirdness ;-P

On the more serious side: I have this problem with a 3.8.0 kernel 
(Ubuntu 12.04 with updated kernel). I saw that you (Steinar) experience 
this with 3.10 and 3.11, iirc. I'm a little bit worried by the fact that 
this has not been seen for at least six months - and now suddenly pops up.

Does anyone have an idea when this race condition was introduced?

(It could be that certain patches have been included in Ubuntu's
3.8.0-something but not in an official 3.8.)

Best regards,

Valentijn
P.S. I won't be able to test any patches until Tuesday - sorry.

^ permalink raw reply

* [PATCH RFC 0/4] MDIO bus timeout issues on Dreamplug
From: Leigh Brown @ 2013-10-19 16:23 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Thomas Petazzoni, netdev, Leigh Brown

Hello,

I just upgraded my Dreamplug to kernel 3.12-rc4 and began encountering
timeout errors in the log, like the following :-

orion-mdio f1072004.mdio-bus: Timeout when reading PHY

I discovered that the mv643xx_eth driver had been converted to use the
mvmdio driver to talk to the PHY rather than the old code that was
embedded in the driver.  The new code had some inconsistent timeouts
and always polled for a response rather than using the interrupt, where
available.  The timeout was 1ms when polling but the Dreamplug can
take up to 4.9ms to respond.

So, with reference to the old 3.9 driver, I changed the mvmdio driver
to have more consistent timeouts and to use the interrupt instead of
polling, where available. I set the timeout to 10ms.  Patches 1 and 2
do this.
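
To illustrate the timeout arithmetic, here is a small user-space
simulation. It is only a sketch, not the mvmdio code: struct fake_mdio,
fake_mdio_is_done() and fake_mdio_wait_ready() are hypothetical names,
the clock is simulated, and the 50us poll interval is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated bus: the busy bit clears once the fake clock reaches ready_at_us. */
struct fake_mdio {
    uint64_t now_us;      /* simulated clock, in microseconds */
    uint64_t ready_at_us; /* time at which the device answers */
};

static bool fake_mdio_is_done(const struct fake_mdio *dev)
{
    return dev->now_us >= dev->ready_at_us;
}

/*
 * Poll until the device is ready or timeout_us has elapsed.
 * Returns 0 on success and -ETIMEDOUT on timeout.
 */
static int fake_mdio_wait_ready(struct fake_mdio *dev,
                                uint64_t timeout_us, uint64_t poll_us)
{
    uint64_t deadline = dev->now_us + timeout_us;

    assert(poll_us > 0); /* a zero step would never make progress */

    for (;;) {
        if (fake_mdio_is_done(dev))
            return 0;
        if (dev->now_us >= deadline)
            return -ETIMEDOUT;
        dev->now_us += poll_us; /* stands in for a sleep between polls */
    }
}
```

With ready_at_us = 4900, a 1000us timeout gives -ETIMEDOUT while a
10000us timeout returns 0, which matches the behaviour described above.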

I got carried away checking the size of the code and patch 3 is a 
simple optimisation to reduce the code size. The final patch updates
the documentation in mvmdio to make it reflect the current situation.

I have tested the patches on my Dreamplug and Mirabox and have not 
encountered any issues.

One final thing I noticed when testing: you can unload the mvmdio 
driver and the machine will crash shortly after.  I tried to understand
how the module reference counting would apply to these kinds of drivers
and got a bit stuck.  I think these modules should not be unloadable
while there is an ethernet driver that uses them.
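
The behaviour I would expect can be sketched in user space as follows.
This is only an illustration of the idea, not how the kernel's module
reference counting is implemented: struct provider and the provider_*()
helpers are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical provider (e.g. an MDIO bus driver) with a user count. */
struct provider {
    bool loaded;
    int users;
};

/* A consumer takes a reference before using the provider. */
static bool provider_get(struct provider *p)
{
    if (!p->loaded)
        return false;
    p->users++;
    return true;
}

/* The consumer drops its reference when it is done. */
static void provider_put(struct provider *p)
{
    assert(p->users > 0);
    p->users--;
}

/* Unloading is refused while any consumer still holds a reference. */
static int provider_unload(struct provider *p)
{
    if (p->users > 0)
        return -EBUSY;
    p->loaded = false;
    return 0;
}
```

Here an ethernet driver would call provider_get() when it attaches to
the bus and provider_put() when it detaches, so an unload attempt in
between fails with -EBUSY instead of crashing the machine.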

Regards,

Leigh.

Leigh Brown (4):
  net: mvmdio: make orion_mdio_wait_ready consistent
  net: mvmdio: orion_mdio_ready: remove manual poll
  net: mvmdio: slight optimisation of orion_mdio_write
  net: mvmdio: doc: mvmdio now used by mv643xx_eth

 drivers/net/ethernet/marvell/mvmdio.c |   93 +++++++++++++++------------------
 1 file changed, 42 insertions(+), 51 deletions(-)

-- 
1.7.10.4

^ permalink raw reply
