Netdev List
* Re: ip6_rt_gc_elasticity and ip6_rt_gc_min_interval race in rt6_alloc_cow() ?
From: David Miller @ 2012-11-30 18:50 UTC (permalink / raw)
  To: jbohac; +Cc: netdev, mkubecek
In-Reply-To: <20121130184336.GB7113@midget.suse.cz>

From: Jiri Bohac <jbohac@suse.cz>
Date: Fri, 30 Nov 2012 19:43:36 +0100

> I believe there is a race condition in the setting of
> ip6_rt_gc_elasticity and ip6_rt_gc_min_interval. I have not seen
> it happen, though:
> 
> 1) a setting made by the user via sysctl while ip6_dst_gc() is
>    running will get lost
> 
> 2) another instance of rt6_alloc_cow() may save the temporary
>    values in the saved_* variables, making those permanent.
> 
> Am I overlooking some synchronization or should I send a
> patch to fix this?

The ipv4 routing cache had the same issue. I really and sincerely
doubt this race matters in practice.
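
For readers following the thread, the two interleavings described in the
quoted report can be replayed deterministically in userspace. This is only
a sketch under assumptions: struct gc_sysctl, force_gc_begin() and
force_gc_end() are hypothetical stand-ins for a save/override/restore
sequence around a forced GC run, and the starting values 9 and 500 are
arbitrary. It is not the kernel code.

```c
#include <assert.h>

/* Hypothetical stand-in for the two tunables discussed in the thread. */
struct gc_sysctl {
	int elasticity;
	int min_interval;
};

/* Save the current values, then install temporary aggressive ones. */
static void force_gc_begin(struct gc_sysctl *s, struct gc_sysctl *saved)
{
	*saved = *s;
	s->elasticity = 1;
	s->min_interval = 0;
}

/* Put back whatever was saved when the forced GC started. */
static void force_gc_end(struct gc_sysctl *s, const struct gc_sysctl *saved)
{
	*s = *saved;
}

/* Case 1: a sysctl write lands while the forced GC is in progress. */
static int lost_user_write(void)
{
	struct gc_sysctl s = { .elasticity = 9, .min_interval = 500 };
	struct gc_sysctl saved;

	force_gc_begin(&s, &saved);
	s.elasticity = 5;		/* the user writes a new value */
	force_gc_end(&s, &saved);	/* and the restore overwrites it */
	return s.elasticity;
}

/* Case 2: a second caller saves the temporary values as if they were real. */
static int sticky_temporary(void)
{
	struct gc_sysctl s = { .elasticity = 9, .min_interval = 500 };
	struct gc_sysctl saved_a, saved_b;

	force_gc_begin(&s, &saved_a);	/* A saves 9 and installs 1 */
	force_gc_begin(&s, &saved_b);	/* B saves the temporary 1 */
	force_gc_end(&s, &saved_a);	/* A restores 9 */
	force_gc_end(&s, &saved_b);	/* B restores 1, which now sticks */
	return s.elasticity;
}
```

In the first case the user's 5 is replaced by the saved 9; in the second
the temporary 1 becomes the lasting value. Whether either window is wide
enough to matter in practice is exactly what the reply above doubts.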


* Re: [PATCH v3 resend net-next 0/2] myri10ge: LRO to GRO conversion
From: Andrew Gallatin @ 2012-11-30 18:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20121130.134438.1284316376841704382.davem@davemloft.net>

On 11/30/12 13:44, David Miller wrote:
> From: Andrew Gallatin <gallatin@myri.com>
> Date: Fri, 30 Nov 2012 13:31:57 -0500
> 
>> I'm resending this after fixing email client issues.
> 
> Applied, thanks.
> 

Thank you.

I also want to apologize for making so many beginner mistakes,
and to thank you for your patience with me.  There has usually
been somebody more "plugged in" to the linux community that
has submitted patches for us in the past, so I'm just not used
to doing it yet.

Thanks again,

Drew


* Re: [PATCH] MAINTAINERS: fix bouncing tun/tap entries
From: Max Krasnyansky @ 2012-11-30 18:52 UTC (permalink / raw)
  To: David Miller; +Cc: jslaby, netdev, jirislaby, linux-kernel
In-Reply-To: <20121130.122813.1136966656907114890.davem@davemloft.net>

On 11/30/2012 09:28 AM, David Miller wrote:
> From: Jiri Slaby <jslaby@suse.cz>
> Date: Fri, 30 Nov 2012 18:05:40 +0100
> 
>> Delivery to the following recipient failed permanently:
>>
>>      vtun@office.satix.net
>>
>> Technical details of permanent failure:
>> DNS Error: Domain name not found
>>
>> Of course:
>> $ host office.satix.net
>> Host office.satix.net not found: 3(NXDOMAIN)
>>
>> ===========
>>
>> And "Change of Email Address Notification":
>> Old Address        New Address           Email Subject
>> ------------------------------------------------------
>> maxk@qualcomm.com  maxk@qti.qualcomm.com "tuntap: multiqueue...
>>
>> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> 
> Applied.
> 

Thanks for fixing that, guys.

Max


* Re: iputils: ping -I <iface>
From: YOSHIFUJI Hideaki @ 2012-11-30 19:04 UTC (permalink / raw)
  To: Jan Synacek; +Cc: Ben Greear, netdev, YOSHIFUJI Hideaki
In-Reply-To: <50B84CCB.7000502@redhat.com>

(2012-11-30 15:06), Jan Synacek wrote:
> On 11/29/2012 08:48 PM, Ben Greear wrote:
>> On 11/29/2012 06:12 AM, Jan Synacek wrote:
>>> Hello,
>>>
>>> There seems to be a bug(?) when calling ping with -I lo:
>>>
>>> $ ping -I lo kernel.org
>>>
>>> PING kernel.org (149.20.4.69) from 192.168.1.10 lo: 56(84) bytes of data.
>>> ^C
>>>
>>> Note that 192.168.1.10 is my primary interface's address (em1). However, no
>>> replies are coming back.
>>>
>>> $ ping -I em1 kernel.org
>>>
>>> PING kernel.org (149.20.4.69) from 192.168.1.10 em1: 56(84) bytes of data.
>>> 64 bytes from pub2.kernel.org (149.20.4.69): icmp_seq=1 ttl=42 time=202 ms
>>> 64 bytes from pub2.kernel.org (149.20.4.69): icmp_seq=2 ttl=42 time=187 ms
>>> ^C
>>>
>>> Works as expected.
>>>
>>> I know that binding to loopback probably doesn't make much sense, but I think
>>> that ping should be able to cope with that.
>>
>> I think it would be wrong if ping worked as you suggest.  Binding to an
>> interface means use that interface as the source of your packets, and having
>> it bind hard helps when using systems with multiple NICs on same subnet
>> (or possibly, same IP).
> 
> I just wanted to point out that if I call ping with -I lo, its 'from' address is
> wrong (in my case 192.168.1.10) and nothing happens (that's, I guess, expected
> if it really bound to loopback). If I call ping with the -I <the same address>
> or -I em1 (the same address again), it works as expected. I'm sorry if I wasn't
> clear enough.
> 
>>
>>> Also, it would be nice to mention the difference between -I <ip> and -I <iface>
>>> in the manpage.
>>
>> In my opinion, -I <iface> should use SO_BINDTODEVICE, but at least in
>> older versions of ping it did not.
> 
> Ping does use SO_BINDTODEVICE.

So far, -I device is related to source address selection (using
SO_BINDTODEVICE) and outgoing device (using in_pktinfo).
On the other hand, -I addr is, in fact, related only to source
address selection (and it is enforced by bind).

Something like this:

       -I interface
              interface is either an address, or an interface name.  If
              interface is an address, it sets source address to
              specified interface address.  If interface
              is an interface name, it tells the command to use that
              interface.  For ping6, when doing ping to a link-local
              scope address, link specification (by the
              '%'-notation in destination, or by this option) is
              required.

BUT, even with -I device, net/ipv4/devinet.c:inet_select_addr()
may select an address from other interfaces (AFAIK).

Should we check if the selected source address belongs to the actual
device?

--yoshfuji
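
One way the check asked about above could look in userspace is sketched
below. It is an illustration only: addr_belongs_to_iface() is a
hypothetical helper, not iputils code, and it handles IPv4 only. It walks
the list returned by getifaddrs(3) and reports whether a given address is
configured on a given interface name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Return 1 if IPv4 address `addr` is configured on interface `ifname`,
 * 0 if it is not, and -1 on error (bad address or getifaddrs failure).
 */
static int addr_belongs_to_iface(const char *ifname, const char *addr)
{
	struct ifaddrs *list, *ifa;
	struct in_addr want;
	int found = 0;

	if (inet_pton(AF_INET, addr, &want) != 1)
		return -1;
	if (getifaddrs(&list) < 0)
		return -1;

	for (ifa = list; ifa; ifa = ifa->ifa_next) {
		const struct sockaddr_in *sin;

		/* Entries without an address, or non-IPv4 ones, are skipped. */
		if (!ifa->ifa_addr || ifa->ifa_addr->sa_family != AF_INET)
			continue;
		if (strcmp(ifa->ifa_name, ifname) != 0)
			continue;
		sin = (const struct sockaddr_in *)ifa->ifa_addr;
		if (sin->sin_addr.s_addr == want.s_addr) {
			found = 1;
			break;
		}
	}
	freeifaddrs(list);
	return found;
}
```

With the addresses from the original report, a call such as
addr_belongs_to_iface("lo", "192.168.1.10") would presumably return 0,
which is the mismatch that makes the "from" line in the ping output look
wrong.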


* Re: pull request: wireless 2012-11-30
From: John W. Linville @ 2012-11-30 19:07 UTC (permalink / raw)
  To: David Miller; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20121130.134018.1859855439413083920.davem@davemloft.net>

On Fri, Nov 30, 2012 at 01:40:18PM -0500, David Miller wrote:
> From: "John W. Linville" <linville@tuxdriver.com>
> Date: Fri, 30 Nov 2012 13:25:00 -0500
> 
> > I have a couple of stragglers intended for 3.7...
> > 
> > The iwlwifi patch fixes a bug in CCK basic rate calculations.
> > 
> > The mac80211 patch removes an unnecessary function call that was
> > generating a lot of log SPAM.
> 
> Pulled, thanks John.
> 
> I want to warn you ahead of time that these might not make it.

Yes, I suspect that might be the case -- but it won't make it if I
don't send it... :-)

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.


* Re: iputils: ping -I <iface>
From: Ben Greear @ 2012-11-30 19:11 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki; +Cc: Jan Synacek, netdev
In-Reply-To: <50B90326.2010302@linux-ipv6.org>

On 11/30/2012 11:04 AM, YOSHIFUJI Hideaki wrote:
> (2012-11-30 15:06), Jan Synacek wrote:
>> On 11/29/2012 08:48 PM, Ben Greear wrote:
>>> On 11/29/2012 06:12 AM, Jan Synacek wrote:
>>>> Hello,
>>>>
>>>> There seems to be a bug(?) when calling ping with -I lo:
>>>>
>>>> $ ping -I lo kernel.org
>>>>
>>>> PING kernel.org (149.20.4.69) from 192.168.1.10 lo: 56(84) bytes of data.
>>>> ^C
>>>>
>>>> Note that 192.168.1.10 is my primary interface's address (em1). However, no
>>>> replies are coming back.
>>>>
>>>> $ ping -I em1 kernel.org
>>>>
>>>> PING kernel.org (149.20.4.69) from 192.168.1.10 em1: 56(84) bytes of data.
>>>> 64 bytes from pub2.kernel.org (149.20.4.69): icmp_seq=1 ttl=42 time=202 ms
>>>> 64 bytes from pub2.kernel.org (149.20.4.69): icmp_seq=2 ttl=42 time=187 ms
>>>> ^C
>>>>
>>>> Works as expected.
>>>>
>>>> I know that binding to loopback probably doesn't make much sense, but I think
>>>> that ping should be able to cope with that.
>>>
>>> I think it would be wrong if ping worked as you suggest.  Binding to an
>>> interface means use that interface as the source of your packets, and having
>>> it bind hard helps when using systems with multiple NICs on same subnet
>>> (or possibly, same IP).
>>
>> I just wanted to point out that if I call ping with -I lo, its 'from' address is
>> wrong (in my case 192.168.1.10) and nothing happens (that's, I guess, expected
>> if it really bound to loopback). If I call ping with the -I <the same address>
>> or -I em1 (the same address again), it works as expected. I'm sorry if I wasn't
>> clear enough.
>>
>>>
>>>> Also, it would be nice to mention the difference between -I <ip> and -I <iface>
>>>> in the manpage.
>>>
>>> In my opinion, -I <iface> should use SO_BINDTODEVICE, but at least in
>>> older versions of ping it did not.
>>
>> Ping does use SO_BINDTODEVICE.
>
> So far, -I device is related to source address selection (using
> SO_BINDTODEVICE) and outgoing device (using in_pktinfo).
> On the other hand, -I addr is, in fact, related only to source
> address selection (and it is enforced by bind).
>
> Something like this:
>
>         -I interface
>                interface is either an address, or an interface name.  If
>                interface is an address, it sets source address to
>                specified interface address.  If interface
>                is an interface name, it tells the command to use that
>                interface.  For ping6, when doing ping to a link-local
>                scope address, link specification (by the
>                '%'-notation in destination, or by this option) is
>                required.
>
> BUT, even with -I device, net/ipv4/devinet.c:inet_select_addr()
> may select an address from other interfaces (AFAIK).
>
> Should we check if the selected source address belongs to the actual
> device?

Maybe have -D <iface-name> and -P <source-address>
options?

Where -D uses SO_BINDTODEVICE, and -P binds to a source IP?

Would have to keep the -I logic similar to what it does now for
backwards compatibility...

Thanks,
Ben

>
> --yoshfuji
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: [PATCH] smsc: RFC: Workaround for problems with lan8710 phy auto MDI-X
From: Ben Hutchings @ 2012-11-30 19:14 UTC (permalink / raw)
  To: David Miller
  Cc: jkosina, pt, otavio, javier, chohnstaedt, netdev, linux-kernel
In-Reply-To: <20121130.122302.1825816242500143123.davem@davemloft.net>

On Fri, 2012-11-30 at 12:23 -0500, David Miller wrote:
> From: Jiri Kosina <jkosina@suse.cz>
> Date: Fri, 30 Nov 2012 16:23:30 +0100 (CET)
> 
> > Of course it'd be far better if faulty hardware can be autodetected in 
> > runtime.
> 
> That is how this must be handled.

Well, we also have the recent extension to ETHTOOL_SSET for forcing
MDI-X on or off.  If it's not possible to detect the faulty hardware
then the driver can implement this.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* [PATCH net-next, 1/1] hyperv: Add an error message to rndis_filter_set_device_mac()
From: Haiyang Zhang @ 2012-11-30 19:23 UTC (permalink / raw)
  To: davem, netdev; +Cc: olaf, jasowang, linux-kernel, devel, haiyangz

This message indicates an error returned from the host when changing the MAC address.

Reported-by: Michal Kubecek <mkubecek@suse.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>

---
 drivers/net/hyperv/rndis_filter.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 7fdeb52..2b657d4 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -605,8 +605,11 @@ int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac)
 		return -EBUSY;
 	} else {
 		set_complete = &request->response_msg.msg.set_complete;
-		if (set_complete->status != RNDIS_STATUS_SUCCESS)
+		if (set_complete->status != RNDIS_STATUS_SUCCESS) {
+			netdev_err(ndev, "Fail to set MAC on host side:0x%x\n",
+				   set_complete->status);
 			ret = -EINVAL;
+		}
 	}
 
 cleanup:
-- 
1.7.4.1


* RE: [PATCH] [trivial] wireless: mwifiex: Fix typo in wireless/mwifiex driver
From: Bing Zhao @ 2012-11-30 19:30 UTC (permalink / raw)
  To: Masanari Iida, linux-wireless@vger.kernel.org, trivial@kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <1354280911-4866-1-git-send-email-standby24x7@gmail.com>

Hi Masanari,

Thanks for your patch.

> Subject: [PATCH] [trivial] wireless: mwifiex: Fix typo in wireless/mwifiex driver
> 
> Correct spelling typo in wireless/mwifiex driver.
> 
> Signed-off-by: Masanari Iida <standby24x7@gmail.com>

Acked-by: Bing Zhao <bzhao@marvell.com>

Regards,
Bing

> ---
>  drivers/net/wireless/mwifiex/sta_ioctl.c | 2 +-
>  drivers/net/wireless/mwifiex/usb.c       | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/mwifiex/sta_ioctl.c b/drivers/net/wireless/mwifiex/sta_ioctl.c
> index 552d72e..21035e6 100644
> --- a/drivers/net/wireless/mwifiex/sta_ioctl.c
> +++ b/drivers/net/wireless/mwifiex/sta_ioctl.c
> @@ -463,7 +463,7 @@ int mwifiex_enable_hs(struct mwifiex_adapter *adapter)
>  	}
> 
>  	if (adapter->hs_activated) {
> -		dev_dbg(adapter->dev, "cmd: HS Already actived\n");
> +		dev_dbg(adapter->dev, "cmd: HS Already activated\n");
>  		return true;
>  	}
> 
> diff --git a/drivers/net/wireless/mwifiex/usb.c b/drivers/net/wireless/mwifiex/usb.c
> index 22a5916..5eb3b33 100644
> --- a/drivers/net/wireless/mwifiex/usb.c
> +++ b/drivers/net/wireless/mwifiex/usb.c
> @@ -351,7 +351,7 @@ static int mwifiex_usb_probe(struct usb_interface *intf,
>  	card->udev = udev;
>  	card->intf = intf;
> 
> -	pr_debug("info: bcdUSB=%#x Device Class=%#x SubClass=%#x Protocl=%#x\n",
> +	pr_debug("info: bcdUSB=%#x Device Class=%#x SubClass=%#x Protocol=%#x\n",
>  		 udev->descriptor.bcdUSB, udev->descriptor.bDeviceClass,
>  		 udev->descriptor.bDeviceSubClass,
>  		 udev->descriptor.bDeviceProtocol);
> --
> 1.8.0.1.347.gf94c325


* [PATCH v3 net-next] net: move inet_dport/inet_num in sock_common
From: Eric Dumazet @ 2012-11-30 19:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ling Ma, Ben Hutchings, Joe Perches

From: Eric Dumazet <edumazet@google.com>

commit 68835aba4d9b (net: optimize INET input path further)
moved some fields used for tcp/udp sockets lookup in the first cache
line of struct sock_common.

This patch moves inet_dport/inet_num as well, filling a 32-bit hole
on 64-bit arches and reducing the number of cache line misses in lookups.
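
The hole can be illustrated with a reduced layout. The two structs below
are assumptions made for this sketch (a simplified field list, with a
single pointer standing in for the hash list linkage); they are not
struct sock_common itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified "before" layout: on LP64 the pointer member needs 8-byte
 * alignment, so 4 bytes of padding follow `bound_dev_if`.
 */
struct common_before {
	uint32_t daddr;
	uint32_t rcv_saddr;
	uint32_t hash;
	uint16_t family;
	uint8_t  state;
	uint8_t  reuse;
	int32_t  bound_dev_if;
	void    *node;		/* stands in for the hash list linkage */
};

/* "After": the two 16-bit ports take the space that used to be padding. */
struct common_after {
	uint32_t daddr;
	uint32_t rcv_saddr;
	uint32_t hash;
	uint16_t dport;
	uint16_t num;
	uint16_t family;
	uint8_t  state;
	uint8_t  reuse;
	int32_t  bound_dev_if;
	void    *node;
};

/* Padding bytes between `bound_dev_if` and `node` in each layout. */
static size_t hole_before(void)
{
	return offsetof(struct common_before, node) -
	       (offsetof(struct common_before, bound_dev_if) + sizeof(int32_t));
}

static size_t hole_after(void)
{
	return offsetof(struct common_after, node) -
	       (offsetof(struct common_after, bound_dev_if) + sizeof(int32_t));
}
```

On a typical 64-bit ABI both structs have the same size, so the ports
move into the hot cache line at no cost in space. On 32-bit ABIs there is
no hole to fill and the second struct is simply 4 bytes larger.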

Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
before addresses match, as this check is more discriminant.

Remove the hash check from MATCH() macros because we don't need to
revalidate the hash value after taking a refcount on socket, and
use likely/unlikely compiler hints, as the sk_hash/hash check
makes the following conditional tests 100% predicted by cpu.
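
The resulting loop shape can be sketched in userspace as below. The names
(struct fake_sock, fake_match(), fake_lookup(), demo_chain) are
hypothetical, the refcount step is left out, and __builtin_expect()
assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang branch hints, spelled the way the kernel spells them. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical, heavily reduced socket: just the lookup keys. */
struct fake_sock {
	uint32_t hash;
	uint32_t portpair;
	uint64_t addrpair;
};

/* Ports first (more selective), then addresses; no hash test in here. */
static int fake_match(const struct fake_sock *sk, uint32_t ports,
		      uint64_t addrs)
{
	return sk->portpair == ports && sk->addrpair == addrs;
}

/* Walk one hash chain; returns the index of the match or -1. */
static int fake_lookup(const struct fake_sock *chain, size_t n,
		       uint32_t hash, uint32_t ports, uint64_t addrs)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Cheap filter: skip entries that carry another hash value. */
		if (chain[i].hash != hash)
			continue;
		/* An equal hash almost always means the same flow. */
		if (likely(fake_match(&chain[i], ports, addrs)))
			return (int)i;
	}
	return -1;
}

/* Small fixed chain so the behaviour can be checked by hand. */
static const struct fake_sock demo_chain[] = {
	{ .hash = 1, .portpair = 0x10, .addrpair = 0xa },
	{ .hash = 2, .portpair = 0x20, .addrpair = 0xb },
	{ .hash = 2, .portpair = 0x21, .addrpair = 0xb },
};
```

Because the hash comparison runs once per chain entry, outside the match
helper, a second match after taking the reference would only need to
re-test the keys.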

Introduce skc_addrpair/skc_portpair pair values to better
document the alignment requirements of the port/addr pairs
used in the various MATCH() macros, and remove some casts.
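
A minimal userspace sketch of the pair idea follows, assuming C11 or GNU C
for the anonymous struct member. union portpair, make_portpair() and
ports_match() are hypothetical names chosen for this illustration; only
the overlay technique is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Two 16-bit ports overlaid with one 32-bit word, in the spirit of
 * skc_portpair. The union makes the required adjacency explicit.
 */
union portpair {
	uint32_t pair;
	struct {
		uint16_t dport;	/* remote port */
		uint16_t num;	/* local port */
	};
};

/* Build the 32-bit key through the same union, so the result has the
 * right layout whatever the host byte order is.
 */
static uint32_t make_portpair(uint16_t dport, uint16_t num)
{
	union portpair p = { .pair = 0 };

	p.dport = dport;
	p.num = num;
	return p.pair;
}

/* One 32-bit compare replaces two 16-bit compares and needs no cast. */
static int ports_match(const union portpair *sk, uint32_t key)
{
	return sk->pair == key;
}
```

The same overlay applies to the address pair, where two 32-bit addresses
are compared as one 64-bit word on 64-bit arches.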

The namespace check can also be done last.

This slightly improves TCP/UDP lookup times.

IP/TCP early demux needs inet->rx_dst_ifindex and
TCP needs inet->min_ttl, so let's group them together in the same cache line.

With help from Ben Hutchings & Joe Perches.

The idea for this patch came after Ling Ma's proposal to move skc_hash
to the beginning of struct sock_common, and it should allow him
to submit a final version of his patch. My tests show an improvement
from doing so.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Joe Perches <joe@perches.com>
Cc: Ling Ma <ling.ma.program@gmail.com>
---
v3: fix a typo
    rx_dst_ifindex moves right before min_ttl

 include/linux/ipv6.h             |   32 ++++++++++---------
 include/net/inet_hashtables.h    |   48 +++++++++++++++--------------
 include/net/inet_sock.h          |    8 +++-
 include/net/inet_timewait_sock.h |    7 +++-
 include/net/sock.h               |   25 ++++++++++++---
 net/ipv4/inet_hashtables.c       |   36 +++++++++++++--------
 net/ipv6/inet6_hashtables.c      |   27 +++++++++++-----
 7 files changed, 115 insertions(+), 68 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 5e11905..12729e9 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -364,20 +364,22 @@ static inline struct raw6_sock *raw6_sk(const struct sock *sk)
 #define inet_v6_ipv6only(__sk)		0
 #endif /* IS_ENABLED(CONFIG_IPV6) */
 
-#define INET6_MATCH(__sk, __net, __hash, __saddr, __daddr, __ports, __dif)\
-	(((__sk)->sk_hash == (__hash)) && sock_net((__sk)) == (__net)	&& \
-	 ((*((__portpair *)&(inet_sk(__sk)->inet_dport))) == (__ports)) && \
-	 ((__sk)->sk_family		== AF_INET6)		&& \
-	 ipv6_addr_equal(&inet6_sk(__sk)->daddr, (__saddr))	&& \
-	 ipv6_addr_equal(&inet6_sk(__sk)->rcv_saddr, (__daddr))	&& \
-	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
-
-#define INET6_TW_MATCH(__sk, __net, __hash, __saddr, __daddr, __ports, __dif) \
-	(((__sk)->sk_hash == (__hash)) && sock_net((__sk)) == (__net)	&& \
-	 (*((__portpair *)&(inet_twsk(__sk)->tw_dport)) == (__ports))	&& \
-	 ((__sk)->sk_family	       == PF_INET6)			&& \
-	 (ipv6_addr_equal(&inet6_twsk(__sk)->tw_v6_daddr, (__saddr)))	&& \
-	 (ipv6_addr_equal(&inet6_twsk(__sk)->tw_v6_rcv_saddr, (__daddr))) && \
-	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
+#define INET6_MATCH(__sk, __net, __saddr, __daddr, __ports, __dif)	\
+	((inet_sk(__sk)->inet_portpair == (__ports))		&&	\
+	 ((__sk)->sk_family == AF_INET6)			&&	\
+	 ipv6_addr_equal(&inet6_sk(__sk)->daddr, (__saddr))	&&	\
+	 ipv6_addr_equal(&inet6_sk(__sk)->rcv_saddr, (__daddr))	&&	\
+	 (!(__sk)->sk_bound_dev_if	||				\
+	   ((__sk)->sk_bound_dev_if == (__dif))) 		&&	\
+	 net_eq(sock_net(__sk), (__net)))
+
+#define INET6_TW_MATCH(__sk, __net, __saddr, __daddr, __ports, __dif)	   \
+	((inet_twsk(__sk)->tw_portpair == (__ports))			&& \
+	 ((__sk)->sk_family == AF_INET6)				&& \
+	 ipv6_addr_equal(&inet6_twsk(__sk)->tw_v6_daddr, (__saddr))	&& \
+	 ipv6_addr_equal(&inet6_twsk(__sk)->tw_v6_rcv_saddr, (__daddr)) && \
+	 (!(__sk)->sk_bound_dev_if	||				   \
+	  ((__sk)->sk_bound_dev_if == (__dif)))				&& \
+	 net_eq(sock_net(__sk), (__net)))
 
 #endif /* _IPV6_H */
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 54be028..d1de4fb 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -299,30 +299,34 @@ typedef __u64 __bitwise __addrpair;
 				   (((__force __u64)(__be32)(__daddr)) << 32) | \
 				   ((__force __u64)(__be32)(__saddr)));
 #endif /* __BIG_ENDIAN */
-#define INET_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
-	(((__sk)->sk_hash == (__hash)) && net_eq(sock_net(__sk), (__net)) &&	\
-	 ((*((__addrpair *)&(inet_sk(__sk)->inet_daddr))) == (__cookie))  &&	\
-	 ((*((__portpair *)&(inet_sk(__sk)->inet_dport))) == (__ports))   &&	\
-	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
-#define INET_TW_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
-	(((__sk)->sk_hash == (__hash)) && net_eq(sock_net(__sk), (__net)) &&	\
-	 ((*((__addrpair *)&(inet_twsk(__sk)->tw_daddr))) == (__cookie)) &&	\
-	 ((*((__portpair *)&(inet_twsk(__sk)->tw_dport))) == (__ports)) &&	\
-	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
+#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif)	\
+	((inet_sk(__sk)->inet_portpair == (__ports))		&&	\
+	 (inet_sk(__sk)->inet_addrpair == (__cookie))		&&	\
+	 (!(__sk)->sk_bound_dev_if	||				\
+	   ((__sk)->sk_bound_dev_if == (__dif))) 		&& 	\
+	 net_eq(sock_net(__sk), (__net)))
+#define INET_TW_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif)\
+	((inet_twsk(__sk)->tw_portpair == (__ports))	&&		\
+	 (inet_twsk(__sk)->tw_addrpair == (__cookie))	&&		\
+	 (!(__sk)->sk_bound_dev_if	||				\
+	   ((__sk)->sk_bound_dev_if == (__dif)))	&&		\
+	 net_eq(sock_net(__sk), (__net)))
 #else /* 32-bit arch */
 #define INET_ADDR_COOKIE(__name, __saddr, __daddr)
-#define INET_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif)	\
-	(((__sk)->sk_hash == (__hash)) && net_eq(sock_net(__sk), (__net))	&&	\
-	 (inet_sk(__sk)->inet_daddr	== (__saddr))		&&	\
-	 (inet_sk(__sk)->inet_rcv_saddr	== (__daddr))		&&	\
-	 ((*((__portpair *)&(inet_sk(__sk)->inet_dport))) == (__ports))	&&	\
-	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
-#define INET_TW_MATCH(__sk, __net, __hash,__cookie, __saddr, __daddr, __ports, __dif)	\
-	(((__sk)->sk_hash == (__hash)) && net_eq(sock_net(__sk), (__net))	&&	\
-	 (inet_twsk(__sk)->tw_daddr	== (__saddr))		&&	\
-	 (inet_twsk(__sk)->tw_rcv_saddr	== (__daddr))		&&	\
-	 ((*((__portpair *)&(inet_twsk(__sk)->tw_dport))) == (__ports)) &&	\
-	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
+#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif) \
+	((inet_sk(__sk)->inet_portpair == (__ports))	&&		\
+	 (inet_sk(__sk)->inet_daddr	== (__saddr))	&&		\
+	 (inet_sk(__sk)->inet_rcv_saddr	== (__daddr))	&&		\
+	 (!(__sk)->sk_bound_dev_if	||				\
+	   ((__sk)->sk_bound_dev_if == (__dif))) 	&&		\
+	 net_eq(sock_net(__sk), (__net)))
+#define INET_TW_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif) \
+	((inet_twsk(__sk)->tw_portpair == (__ports))	&&		\
+	 (inet_twsk(__sk)->tw_daddr	== (__saddr))	&&		\
+	 (inet_twsk(__sk)->tw_rcv_saddr	== (__daddr))	&&		\
+	 (!(__sk)->sk_bound_dev_if	||				\
+	   ((__sk)->sk_bound_dev_if == (__dif))) 	&&		\
+	 net_eq(sock_net(__sk), (__net)))
 #endif /* 64-bit arch */
 
 /*
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 256c1ed..a4196cb 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -144,9 +144,11 @@ struct inet_sock {
 	/* Socket demultiplex comparisons on incoming packets. */
 #define inet_daddr		sk.__sk_common.skc_daddr
 #define inet_rcv_saddr		sk.__sk_common.skc_rcv_saddr
+#define inet_addrpair		sk.__sk_common.skc_addrpair
+#define inet_dport		sk.__sk_common.skc_dport
+#define inet_num		sk.__sk_common.skc_num
+#define inet_portpair		sk.__sk_common.skc_portpair
 
-	__be16			inet_dport;
-	__u16			inet_num;
 	__be32			inet_saddr;
 	__s16			uc_ttl;
 	__u16			cmsg_flags;
@@ -154,6 +156,7 @@ struct inet_sock {
 	__u16			inet_id;
 
 	struct ip_options_rcu __rcu	*inet_opt;
+	int			rx_dst_ifindex;
 	__u8			tos;
 	__u8			min_ttl;
 	__u8			mc_ttl;
@@ -170,7 +173,6 @@ struct inet_sock {
 	int			uc_index;
 	int			mc_index;
 	__be32			mc_addr;
-	int			rx_dst_ifindex;
 	struct ip_mc_socklist __rcu	*mc_list;
 	struct inet_cork_full	cork;
 };
diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index ba52c83..7d658d5 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -112,6 +112,11 @@ struct inet_timewait_sock {
 #define tw_net			__tw_common.skc_net
 #define tw_daddr        	__tw_common.skc_daddr
 #define tw_rcv_saddr    	__tw_common.skc_rcv_saddr
+#define tw_addrpair		__tw_common.skc_addrpair
+#define tw_dport		__tw_common.skc_dport
+#define tw_num			__tw_common.skc_num
+#define tw_portpair		__tw_common.skc_portpair
+
 	int			tw_timeout;
 	volatile unsigned char	tw_substate;
 	unsigned char		tw_rcv_wscale;
@@ -119,8 +124,6 @@ struct inet_timewait_sock {
 	/* Socket demultiplex comparisons on incoming packets. */
 	/* these three are in inet_sock */
 	__be16			tw_sport;
-	__be16			tw_dport;
-	__u16			tw_num;
 	kmemcheck_bitfield_begin(flags);
 	/* And these are ours. */
 	unsigned int		tw_ipv6only     : 1,
diff --git a/include/net/sock.h b/include/net/sock.h
index c945fba..c4132c1 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -132,6 +132,8 @@ struct net;
  *	@skc_rcv_saddr: Bound local IPv4 addr
  *	@skc_hash: hash value used with various protocol lookup tables
  *	@skc_u16hashes: two u16 hash values used by UDP lookup tables
+ *	@skc_dport: placeholder for inet_dport/tw_dport
+ *	@skc_num: placeholder for inet_num/tw_num
  *	@skc_family: network address family
  *	@skc_state: Connection state
  *	@skc_reuse: %SO_REUSEADDR setting
@@ -149,16 +151,29 @@ struct net;
  *	for struct sock and struct inet_timewait_sock.
  */
 struct sock_common {
-	/* skc_daddr and skc_rcv_saddr must be grouped :
-	 * cf INET_MATCH() and INET_TW_MATCH()
+	/* skc_daddr and skc_rcv_saddr must be grouped on a 8 bytes aligned
+	 * address on 64bit arches : cf INET_MATCH() and INET_TW_MATCH()
 	 */
-	__be32			skc_daddr;
-	__be32			skc_rcv_saddr;
-
+	union {
+		unsigned long	skc_addrpair;
+		struct {
+			__be32	skc_daddr;
+			__be32	skc_rcv_saddr;
+		};
+	};
 	union  {
 		unsigned int	skc_hash;
 		__u16		skc_u16hashes[2];
 	};
+	/* skc_dport && skc_num must be grouped as well */
+	union {
+		u32		skc_portpair;
+		struct {
+			__be16	skc_dport;
+			__u16	skc_num;
+		};
+	};
+
 	unsigned short		skc_family;
 	volatile unsigned char	skc_state;
 	unsigned char		skc_reuse;
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 7880af9..fa3ae81 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -237,12 +237,14 @@ struct sock *__inet_lookup_established(struct net *net,
 	rcu_read_lock();
 begin:
 	sk_nulls_for_each_rcu(sk, node, &head->chain) {
-		if (INET_MATCH(sk, net, hash, acookie,
-					saddr, daddr, ports, dif)) {
+		if (sk->sk_hash != hash)
+			continue;
+		if (likely(INET_MATCH(sk, net, acookie,
+				      saddr, daddr, ports, dif))) {
 			if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt)))
 				goto begintw;
-			if (unlikely(!INET_MATCH(sk, net, hash, acookie,
-				saddr, daddr, ports, dif))) {
+			if (unlikely(!INET_MATCH(sk, net, acookie,
+						 saddr, daddr, ports, dif))) {
 				sock_put(sk);
 				goto begin;
 			}
@@ -260,14 +262,18 @@ begin:
 begintw:
 	/* Must check for a TIME_WAIT'er before going to listener hash. */
 	sk_nulls_for_each_rcu(sk, node, &head->twchain) {
-		if (INET_TW_MATCH(sk, net, hash, acookie,
-					saddr, daddr, ports, dif)) {
+		if (sk->sk_hash != hash)
+			continue;
+		if (likely(INET_TW_MATCH(sk, net, acookie,
+					 saddr, daddr, ports,
+					 dif))) {
 			if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt))) {
 				sk = NULL;
 				goto out;
 			}
-			if (unlikely(!INET_TW_MATCH(sk, net, hash, acookie,
-				 saddr, daddr, ports, dif))) {
+			if (unlikely(!INET_TW_MATCH(sk, net, acookie,
+						    saddr, daddr, ports,
+						    dif))) {
 				sock_put(sk);
 				goto begintw;
 			}
@@ -314,10 +320,12 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
 
 	/* Check TIME-WAIT sockets first. */
 	sk_nulls_for_each(sk2, node, &head->twchain) {
-		tw = inet_twsk(sk2);
+		if (sk2->sk_hash != hash)
+			continue;
 
-		if (INET_TW_MATCH(sk2, net, hash, acookie,
-					saddr, daddr, ports, dif)) {
+		if (likely(INET_TW_MATCH(sk2, net, acookie,
+					 saddr, daddr, ports, dif))) {
+			tw = inet_twsk(sk2);
 			if (twsk_unique(sk, sk2, twp))
 				goto unique;
 			else
@@ -328,8 +336,10 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
 
 	/* And established part... */
 	sk_nulls_for_each(sk2, node, &head->chain) {
-		if (INET_MATCH(sk2, net, hash, acookie,
-					saddr, daddr, ports, dif))
+		if (sk2->sk_hash != hash)
+			continue;
+		if (likely(INET_MATCH(sk2, net, acookie,
+				      saddr, daddr, ports, dif)))
 			goto not_unique;
 	}
 
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 73f1a00..dea17fd 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -87,11 +87,13 @@ struct sock *__inet6_lookup_established(struct net *net,
 	rcu_read_lock();
 begin:
 	sk_nulls_for_each_rcu(sk, node, &head->chain) {
-		/* For IPV6 do the cheaper port and family tests first. */
-		if (INET6_MATCH(sk, net, hash, saddr, daddr, ports, dif)) {
+		if (sk->sk_hash != hash)
+			continue;
+		if (likely(INET6_MATCH(sk, net, saddr, daddr, ports, dif))) {
 			if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt)))
 				goto begintw;
-			if (!INET6_MATCH(sk, net, hash, saddr, daddr, ports, dif)) {
+			if (unlikely(!INET6_MATCH(sk, net, saddr, daddr,
+						  ports, dif))) {
 				sock_put(sk);
 				goto begin;
 			}
@@ -104,12 +106,16 @@ begin:
 begintw:
 	/* Must check for a TIME_WAIT'er before going to listener hash. */
 	sk_nulls_for_each_rcu(sk, node, &head->twchain) {
-		if (INET6_TW_MATCH(sk, net, hash, saddr, daddr, ports, dif)) {
+		if (sk->sk_hash != hash)
+			continue;
+		if (likely(INET6_TW_MATCH(sk, net, saddr, daddr,
+					  ports, dif))) {
 			if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt))) {
 				sk = NULL;
 				goto out;
 			}
-			if (!INET6_TW_MATCH(sk, net, hash, saddr, daddr, ports, dif)) {
+			if (unlikely(!INET6_TW_MATCH(sk, net, saddr, daddr,
+						     ports, dif))) {
 				sock_put(sk);
 				goto begintw;
 			}
@@ -236,9 +242,12 @@ static int __inet6_check_established(struct inet_timewait_death_row *death_row,
 
 	/* Check TIME-WAIT sockets first. */
 	sk_nulls_for_each(sk2, node, &head->twchain) {
-		tw = inet_twsk(sk2);
+		if (sk2->sk_hash != hash)
+			continue;
 
-		if (INET6_TW_MATCH(sk2, net, hash, saddr, daddr, ports, dif)) {
+		if (likely(INET6_TW_MATCH(sk2, net, saddr, daddr,
+					  ports, dif))) {
+			tw = inet_twsk(sk2);
 			if (twsk_unique(sk, sk2, twp))
 				goto unique;
 			else
@@ -249,7 +258,9 @@ static int __inet6_check_established(struct inet_timewait_death_row *death_row,
 
 	/* And established part... */
 	sk_nulls_for_each(sk2, node, &head->chain) {
-		if (INET6_MATCH(sk2, net, hash, saddr, daddr, ports, dif))
+		if (sk2->sk_hash != hash)
+			continue;
+		if (likely(INET6_MATCH(sk2, net, saddr, daddr, ports, dif)))
 			goto not_unique;
 	}
 


* Re: [PATCH] ipv6: unify logic evaluating inet6_dev's accept_ra property
From: Shmulik Ladkani @ 2012-11-30 19:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, yoshfuji, tgraf, tore
In-Reply-To: <20121130.120943.1397811156693287841.davem@davemloft.net>

Hi,

On Fri, 30 Nov 2012 12:09:43 -0500 (EST) David Miller <davem@davemloft.net> wrote:
> From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Date: Thu, 29 Nov 2012 11:26:19 +0200
> 
> > +	/*
> > +	 * If forwarding is enabled, RA are not accepted unless the special
> > +	 * hybrid mode (accept_ra=2) is enabled.
> > +	 */
> 
> Please format this comment correctly, in the networking code we use
> the style:
> 
> 	/* That looks
> 	 * like this.
> 	 */
> 
> 	/*
> 	 * Not
> 	 * like this.
> 	 */
> 
> Thanks.

Ok. Will amend shortly.

BTW the "Not like this" style has many occurrences in net/ipv6.
One of which was the comment relocated by my patch :-)

Regards,
Shmulik


* Re: [PATCH] smsc: RFC: Workaround for problems with lan8710 phy auto MDI-X
From: David Miller @ 2012-11-30 20:00 UTC (permalink / raw)
  To: bhutchings; +Cc: jkosina, pt, otavio, javier, chohnstaedt, netdev, linux-kernel
In-Reply-To: <1354302880.2640.1.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Fri, 30 Nov 2012 19:14:40 +0000

> On Fri, 2012-11-30 at 12:23 -0500, David Miller wrote:
>> From: Jiri Kosina <jkosina@suse.cz>
>> Date: Fri, 30 Nov 2012 16:23:30 +0100 (CET)
>> 
>> > Of course it'd be far better if faulty hardware can be autodetected in 
>> > runtime.
>> 
>> That is how this must be handled.
> 
> Well, we also have the recent extension to ETHTOOL_SSET for forcing
> MDI-X on or off.  If it's not possible to detect the faulty hardware
> then the driver can implement this.

Agreed.


* Re: [PATCH] ipv6: unify logic evaluating inet6_dev's accept_ra property
From: David Miller @ 2012-11-30 20:02 UTC (permalink / raw)
  To: shmulik.ladkani; +Cc: netdev, yoshfuji, tgraf, tore
In-Reply-To: <20121130215113.7a81bc16.shmulik.ladkani@gmail.com>

From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Date: Fri, 30 Nov 2012 21:51:13 +0200

> BTW the "Not like this" style has many occurrences in net/ipv6.
> One of which was the comment relocated by my patch :-)

Patches to fix this are more than welcome :-)

^ permalink raw reply

* Re: [PATCH net-next,1/1] hyperv: Add an error message to rndis_filter_set_device_mac()
From: David Miller @ 2012-11-30 20:03 UTC (permalink / raw)
  To: haiyangz; +Cc: netdev, kys, olaf, jasowang, linux-kernel, devel
In-Reply-To: <1354303421-26149-1-git-send-email-haiyangz@microsoft.com>

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Fri, 30 Nov 2012 11:23:41 -0800

> This message indicates an error returned from the host when changing MAC address.
> 
> Reported-by: Michal Kubecek <mkubecek@suse.com>
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>

Applied.

^ permalink raw reply

* Re: [PATCH v3 net-next] net: move inet_dport/inet_num in sock_common
From: David Miller @ 2012-11-30 20:03 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, ling.ma.program, bhutchings, joe
In-Reply-To: <1354304967.20109.10.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 30 Nov 2012 11:49:27 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> commit 68835aba4d9b (net: optimize INET input path further)
> moved some fields used for tcp/udp sockets lookup in the first cache
> line of struct sock_common.
> 
> This patch moves inet_dport/inet_num as well, filling a 32bit hole
> on 64 bit arches and reducing number of cache line misses in lookups.
> 
> Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
> before addresses match, as this check is more discriminant.
> 
> Remove the hash check from MATCH() macros because we dont need to
> re validate the hash value after taking a refcount on socket, and
> use likely/unlikely compiler hints, as the sk_hash/hash check
> makes the following conditional tests 100% predicted by cpu.
> 
> Introduce skc_addrpair/skc_portpair pair values to better
> document the alignment requirements of the port/addr pairs
> used in the various MATCH() macros, and remove some casts.
> 
> The namespace check can also be done at last.
> 
> This slightly improves TCP/UDP lookup times.
> 
> IP/TCP early demux needs inet->rx_dst_ifindex and
> TCP needs inet->min_ttl, lets group them together in same cache line.
> 
> With help from Ben Hutchings & Joe Perches.
> 
> Idea of this patch came after Ling Ma proposal to move skc_hash
> to the beginning of struct sock_common, and should allow him
> to submit a final version of his patch. My tests show an improvement
> doing so.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks for fixing this up.

^ permalink raw reply

* [PATCH net-next] tcp: change default tcp hash size
From: Eric Dumazet @ 2012-11-30 20:08 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

As time passed, available memory increased faster than the number of
concurrent tcp sockets.

As a result, a machine with 4GB of ram gets a hash table
with 524288 slots, using 8388608 bytes of memory.

Let's reduce that by a factor of 16 (one slot per 128 KB of RAM).

Even if a small machine needs a _lot_ of sockets, tcp lookups are now
very efficient, using one cache line per socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e6eace1..1aca02c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3590,8 +3590,7 @@ void __init tcp_init(void)
 		alloc_large_system_hash("TCP established",
 					sizeof(struct inet_ehash_bucket),
 					thash_entries,
-					(totalram_pages >= 128 * 1024) ?
-					13 : 15,
+					17, /* one slot per 128 KB of memory */
 					0,
 					NULL,
 					&tcp_hashinfo.ehash_mask,
@@ -3607,8 +3606,7 @@ void __init tcp_init(void)
 		alloc_large_system_hash("TCP bind",
 					sizeof(struct inet_bind_hashbucket),
 					tcp_hashinfo.ehash_mask + 1,
-					(totalram_pages >= 128 * 1024) ?
-					13 : 15,
+					17, /* one slot per 128 KB of memory */
 					0,
 					&tcp_hashinfo.bhash_size,
 					NULL,
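
The sizing arithmetic in the commit message can be checked with a small
userspace sketch. It assumes the scale argument means "one slot per
2^scale bytes of RAM", as the patch comment suggests, and it ignores the
rounding and upper limits that the real allocator applies. The bucket
size of 16 bytes is inferred from the figures in the message
(8388608 / 524288); the function and macro names are made up for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bucket size, inferred from 8388608 bytes / 524288 slots. */
#define EHASH_BUCKET_BYTES 16ULL

/* Number of slots when one slot is allotted per 2^scale bytes of RAM.
 * Rounding and the caps of the real allocator are deliberately ignored.
 */
static uint64_t hash_slots(uint64_t ram_bytes, unsigned int scale)
{
	assert(scale < 64);	/* a larger shift would be undefined */
	return ram_bytes >> scale;
}

/* Memory consumed by the table itself. */
static uint64_t hash_bytes(uint64_t ram_bytes, unsigned int scale)
{
	return hash_slots(ram_bytes, scale) * EHASH_BUCKET_BYTES;
}
```

Under these assumptions, hash_slots(4ULL << 30, 13) gives the 524288
slots quoted for a 4GB machine, and raising the scale to 17 divides
both the slot count and the table size by 16.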

^ permalink raw reply related

* Re: [PATCH 00/17] ATM fixes for pppoatm/br2684
From: David Woodhouse @ 2012-11-30 20:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, chas, Krzysztof Mazur
In-Reply-To: <20121130104411.GA16410@shrek.podlesie.net>

On Fri, 2012-11-30 at 11:44 +0100, Krzysztof Mazur wrote:
> > The patch series can be pulled from
> >       git://git.infradead.org/users/dwmw2/atm.git
> > or viewed at 
> >       http://git.infradead.org/users/dwmw2/atm.git
> > 
> > DaveM, please wait for an ack from Krzysztof and Chas before pulling this.
> 
> looks good to me, except [<fixed>]

On Fri, 2012-11-30 at 12:12 -0500, chas williams - CONTRACTOR wrote:
> no objections.  i think this deals with my concerns. 

Dave, if you're not now ignoring this thread entirely, please pull into
net-next from
	git://git.infradead.org/users/dwmw2/atm.git

David Woodhouse (9):
      solos-pci: wait for pending TX to complete when releasing vcc
      atm: add release_cb() callback to vcc
      br2684: don't send frames on not-ready vcc
      pppoatm: fix missing wakeup in pppoatm_send()
      br2684: fix module_put() race
      pppoatm: optimise PPP channel wakeups after sock_owned_by_user()
      solos-pci: clean up pclose() function
      solos-pci: use GFP_KERNEL where possible, not GFP_ATOMIC
      solos-pci: remove list_vccs() debugging function

Krzysztof Mazur (7):
      atm: add owner of push() callback to atmvcc
      pppoatm: allow assign only on a connected socket
      pppoatm: fix module_put() race
      pppoatm: take ATM socket lock in pppoatm_send()
      pppoatm: drop frames to not-ready vcc
      pppoatm: do not inline pppoatm_may_send()
      br2684: allow assign only on a connected socket

Nathan Williams (1):
      solos-pci: Fix leak of skb received for unknown vcc

 drivers/atm/solos-pci.c | 85 ++++++++++++++++++++-----------------------------
 include/linux/atmdev.h  |  2 ++
 net/atm/br2684.c        | 55 ++++++++++++++++++++++++++++----
 net/atm/common.c        | 12 +++++++
 net/atm/pppoatm.c       | 68 ++++++++++++++++++++++++++++++++++++---
 5 files changed, 160 insertions(+), 62 deletions(-)

-- 
dwmw2


^ permalink raw reply

* [PATCH v2] ipv6: unify logic evaluating inet6_dev's accept_ra property
From: Shmulik Ladkani @ 2012-11-30 20:25 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Hideaki YOSHIFUJI, Thomas Graf, Tore Anderson,
	shmulik.ladkani

As of 026359b [ipv6: Send ICMPv6 RSes only when RAs are accepted], the
logic determining whether to send Router Solicitations is identical
to the logic determining whether kernel accepts Router Advertisements.

However the condition itself is repeated in several code locations.

Unify it by introducing 'ipv6_accept_ra()' accessor.

Also, simplify the condition expression, making it more readable.
No semantic change.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---

v2: proper comment styling

 include/net/ipv6.h  |    9 +++++++++
 net/ipv6/addrconf.c |    3 +--
 net/ipv6/ndisc.c    |   16 ++--------------
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 979bf6c..985c6fa 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -271,6 +271,15 @@ struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 
 extern bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb);
 
+static inline bool ipv6_accept_ra(struct inet6_dev *idev)
+{
+	/* If forwarding is enabled, RA are not accepted unless the special
+	 * hybrid mode (accept_ra=2) is enabled.
+	 */
+	return idev->cnf.forwarding ? idev->cnf.accept_ra == 2 :
+	    idev->cnf.accept_ra;
+}
+
 #if IS_ENABLED(CONFIG_IPV6)
 static inline int ip6_frag_nqueues(struct net *net)
 {
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 0424e4e..ca1ed8a 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3005,8 +3005,7 @@ static void addrconf_dad_completed(struct inet6_ifaddr *ifp)
 	   router advertisements, start sending router solicitations.
 	 */
 
-	if (((ifp->idev->cnf.accept_ra == 1 && !ifp->idev->cnf.forwarding) ||
-	     ifp->idev->cnf.accept_ra == 2) &&
+	if (ipv6_accept_ra(ifp->idev) &&
 	    ifp->idev->cnf.rtr_solicits > 0 &&
 	    (dev->flags&IFF_LOOPBACK) == 0 &&
 	    (ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL)) {
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 2edce30..980cdc3 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1033,18 +1033,6 @@ errout:
 	rtnl_set_sk_err(net, RTNLGRP_ND_USEROPT, err);
 }
 
-static inline int accept_ra(struct inet6_dev *in6_dev)
-{
-	/*
-	 * If forwarding is enabled, RA are not accepted unless the special
-	 * hybrid mode (accept_ra=2) is enabled.
-	 */
-	if (in6_dev->cnf.forwarding && in6_dev->cnf.accept_ra < 2)
-		return 0;
-
-	return in6_dev->cnf.accept_ra;
-}
-
 static void ndisc_router_discovery(struct sk_buff *skb)
 {
 	struct ra_msg *ra_msg = (struct ra_msg *)skb_transport_header(skb);
@@ -1092,7 +1080,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 		return;
 	}
 
-	if (!accept_ra(in6_dev))
+	if (!ipv6_accept_ra(in6_dev))
 		goto skip_linkparms;
 
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
@@ -1248,7 +1236,7 @@ skip_linkparms:
 			     NEIGH_UPDATE_F_ISROUTER);
 	}
 
-	if (!accept_ra(in6_dev))
+	if (!ipv6_accept_ra(in6_dev))
 		goto out;
 
 #ifdef CONFIG_IPV6_ROUTE_INFO
-- 
1.7.9
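
The "no semantic change" claim can be checked outside the kernel with a
small sketch. It copies the three conditions from the diff into
userspace functions and compares their truth values for forwarding in
{0, 1} and accept_ra in {0, 1, 2}. struct ra_cnf and the function names
are stand-ins invented for this illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the two idev->cnf fields the patch reads. */
struct ra_cnf {
	int forwarding;
	int accept_ra;
};

/* New helper, as in the include/net/ipv6.h hunk. */
static bool new_accept_ra(const struct ra_cnf *c)
{
	return c->forwarding ? c->accept_ra == 2 : c->accept_ra;
}

/* Open-coded test removed from addrconf_dad_completed(). */
static bool old_addrconf(const struct ra_cnf *c)
{
	return (c->accept_ra == 1 && !c->forwarding) || c->accept_ra == 2;
}

/* accept_ra() removed from ndisc.c, reduced to its truth value. */
static bool old_ndisc(const struct ra_cnf *c)
{
	if (c->forwarding && c->accept_ra < 2)
		return false;
	return c->accept_ra != 0;
}

/* Asserts that all three forms agree on the values 0, 1 and 2. */
static void check_ra_forms(void)
{
	for (int fwd = 0; fwd <= 1; fwd++) {
		for (int ra = 0; ra <= 2; ra++) {
			struct ra_cnf c = { .forwarding = fwd, .accept_ra = ra };

			assert(new_accept_ra(&c) == old_addrconf(&c));
			assert(new_accept_ra(&c) == old_ndisc(&c));
		}
	}
}
```

The check only covers accept_ra values 0, 1 and 2. Outside that range
the forms can disagree: with forwarding enabled and accept_ra set to 3,
the old ndisc.c helper is true while the new helper is false.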

^ permalink raw reply related

* [RFT PATCH] 8139cp: properly support change of MTU values
From: John Greene @ 2012-11-30 20:51 UTC (permalink / raw)
  To: netdev; +Cc: John Greene, David S. Miller

The 8139cp driver has a change_mtu function that has not been
enabled since the dawn of the git repository. However, the
generic eth_change_mtu is not used in its place, so that
invalid MTU values can be set on the interface.

Original patch salvages the broken code for the single case of
setting the MTU while the interface is down, which is safe
and also includes the range check.  This version extends that to
change the MTU whether the interface is up or down.

Original patch from
http://lkml.indiana.edu/hypermail/linux/kernel/1202.2/00770.html

Testing: this has been tested on a virtual 8139cp setup without issue.
I have no access to real 8139cp hardware and need help testing there.

Signed-off-by: "John Greene" <jogreene@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
---
 drivers/net/ethernet/realtek/8139cp.c | 22 +++-------------------
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 6cb96b4..7847c83 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -1226,12 +1226,9 @@ static void cp_tx_timeout(struct net_device *dev)
 	spin_unlock_irqrestore(&cp->lock, flags);
 }
 
-#ifdef BROKEN
 static int cp_change_mtu(struct net_device *dev, int new_mtu)
 {
 	struct cp_private *cp = netdev_priv(dev);
-	int rc;
-	unsigned long flags;
 
 	/* check for invalid MTU, according to hardware limits */
 	if (new_mtu < CP_MIN_MTU || new_mtu > CP_MAX_MTU)
@@ -1244,22 +1241,11 @@ static int cp_change_mtu(struct net_device *dev, int new_mtu)
 		return 0;
 	}
 
-	spin_lock_irqsave(&cp->lock, flags);
-
-	cp_stop_hw(cp);			/* stop h/w and free rings */
-	cp_clean_rings(cp);
-
+	/* network IS up, close it, reset MTU, and come up again. */
+	cp_close(dev);
 	dev->mtu = new_mtu;
-	cp_set_rxbufsize(cp);		/* set new rx buf size */
-
-	rc = cp_init_rings(cp);		/* realloc and restart h/w */
-	cp_start_hw(cp);
-
-	spin_unlock_irqrestore(&cp->lock, flags);
-
-	return rc;
+	return cp_open(dev);
 }
-#endif /* BROKEN */
 
 static const char mii_2_8139_map[8] = {
 	BasicModeCtrl,
@@ -1835,9 +1821,7 @@ static const struct net_device_ops cp_netdev_ops = {
 	.ndo_start_xmit		= cp_start_xmit,
 	.ndo_tx_timeout		= cp_tx_timeout,
 	.ndo_set_features	= cp_set_features,
-#ifdef BROKEN
 	.ndo_change_mtu		= cp_change_mtu,
-#endif
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= cp_poll_controller,
-- 
1.7.11.7
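
The control flow of the reworked cp_change_mtu() can be modelled in
userspace. This is only a sketch: struct mock_dev, the mock_* functions
and the MOCK_* limits are invented placeholders, the -EINVAL return for
an out-of-range value is an assumption (that line is not visible in the
diff), and no hardware behavior is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder limits, not taken from the driver. */
#define MOCK_MIN_MTU 60
#define MOCK_MAX_MTU 4096

struct mock_dev {
	int mtu;
	bool running;
	int reopen_rc;	/* value the mock "open" should return */
};

static void mock_close(struct mock_dev *dev)
{
	dev->running = false;
}

static int mock_open(struct mock_dev *dev)
{
	if (dev->reopen_rc)
		return dev->reopen_rc;
	dev->running = true;
	return 0;
}

static int mock_change_mtu(struct mock_dev *dev, int new_mtu)
{
	assert(dev);

	/* reject values outside the (assumed) hardware limits */
	if (new_mtu < MOCK_MIN_MTU || new_mtu > MOCK_MAX_MTU)
		return -EINVAL;

	/* interface down: just record the new value */
	if (!dev->running) {
		dev->mtu = new_mtu;
		return 0;
	}

	/* interface up: close, set the MTU, reopen */
	mock_close(dev);
	dev->mtu = new_mtu;
	return mock_open(dev);
}
```

One property the model makes visible: if the reopen step fails, the
caller gets the error back and the device is left down with the new
MTU already recorded.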

^ permalink raw reply related

* [net-next:master 98/98] drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
From: kbuild test robot @ 2012-11-30 21:02 UTC (permalink / raw)
  To: Andrew Gallatin; +Cc: netdev

tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2
commit: 1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2 [98/98] myri10ge: Add vlan rx for better GRO perf.


sparse warnings:

+ drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
+ drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
+ drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
+ drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
+ drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:16: sparse: restricted __be16 degrades to integer
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1888:16: sparse: incorrect type in argument 1 (different base types)
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1888:16:    expected unsigned int [unsigned] val
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1888:16:    got restricted __be32 [usertype] <noident>
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2334:24: sparse: incorrect type in assignment (different address spaces)
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2334:24:    expected unsigned char [usertype] *itable
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2334:24:    got unsigned char [noderef] [usertype] <asn:2>*
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2336:48: sparse: incorrect type in argument 2 (different address spaces)
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2336:48:    expected void volatile [noderef] <asn:2>*addr
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2336:48:    got unsigned char [usertype] *
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:2760:60: sparse: dubious: x & !y
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3839:13: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3839:13: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3839:13: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3839:13: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3839:13: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3839:13: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3841:26: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3841:26: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3841:26: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3841:26: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3841:26: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:3841:26: sparse: cast to restricted __be32
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1434:35: sparse: context imbalance in 'myri10ge_intr' - different lock contexts for basic block

vim +1286 drivers/net/ethernet/myricom/myri10ge/myri10ge.c

1b4c44e6 Andrew Gallatin 2012-11-30  1270   * the tag ourselves to be able to achieve GRO performance that
1b4c44e6 Andrew Gallatin 2012-11-30  1271   * is comparable to LRO.
1b4c44e6 Andrew Gallatin 2012-11-30  1272   */
1b4c44e6 Andrew Gallatin 2012-11-30  1273  
1b4c44e6 Andrew Gallatin 2012-11-30  1274  static inline void
1b4c44e6 Andrew Gallatin 2012-11-30  1275  myri10ge_vlan_rx(struct net_device *dev, void *addr, struct sk_buff *skb)
1b4c44e6 Andrew Gallatin 2012-11-30  1276  {
1b4c44e6 Andrew Gallatin 2012-11-30  1277  	u8 *va;
1b4c44e6 Andrew Gallatin 2012-11-30  1278  	struct vlan_ethhdr *veh;
1b4c44e6 Andrew Gallatin 2012-11-30  1279  	struct skb_frag_struct *frag;
1b4c44e6 Andrew Gallatin 2012-11-30  1280  	__wsum vsum;
1b4c44e6 Andrew Gallatin 2012-11-30  1281  
1b4c44e6 Andrew Gallatin 2012-11-30  1282  	va = addr;
1b4c44e6 Andrew Gallatin 2012-11-30  1283  	va += MXGEFW_PAD;
1b4c44e6 Andrew Gallatin 2012-11-30  1284  	veh = (struct vlan_ethhdr *)va;
1b4c44e6 Andrew Gallatin 2012-11-30  1285  	if ((dev->features & NETIF_F_HW_VLAN_RX) == NETIF_F_HW_VLAN_RX &&
1b4c44e6 Andrew Gallatin 2012-11-30 @1286  	    veh->h_vlan_proto == ntohs(ETH_P_8021Q)) {
1b4c44e6 Andrew Gallatin 2012-11-30  1287  		/* fixup csum if needed */
1b4c44e6 Andrew Gallatin 2012-11-30  1288  		if (skb->ip_summed == CHECKSUM_COMPLETE) {
1b4c44e6 Andrew Gallatin 2012-11-30  1289  			vsum = csum_partial(va + ETH_HLEN, VLAN_HLEN, 0);
1b4c44e6 Andrew Gallatin 2012-11-30  1290  			skb->csum = csum_sub(skb->csum, vsum);
1b4c44e6 Andrew Gallatin 2012-11-30  1291  		}
1b4c44e6 Andrew Gallatin 2012-11-30  1292  		/* pop tag */
1b4c44e6 Andrew Gallatin 2012-11-30  1293  		__vlan_hwaccel_put_tag(skb, ntohs(veh->h_vlan_TCI));
1b4c44e6 Andrew Gallatin 2012-11-30  1294  		memmove(va + VLAN_HLEN, va, 2 * ETH_ALEN);

---
0-DAY kernel build testing backend         Open Source Technology Center
Fengguang Wu, Yuanhan Liu                              Intel Corporation

^ permalink raw reply

* Re: Wireless regression in workqueue: use mod_delayed_work() instead of __cancel + queue
From: Tejun Heo @ 2012-11-30 21:14 UTC (permalink / raw)
  To: Anders Kaseorg
  Cc: Herbert Xu, John W. Linville, netdev, linux-wireless,
	linux-kernel
In-Reply-To: <alpine.DEB.2.00.1211281016320.26602@dr-wily.mit.edu>

Hello, Anders.

Sorry about the delay.

On Wed, Nov 28, 2012 at 10:17:28AM -0500, Anders Kaseorg wrote:
> On Wed, 28 Nov 2012, Anders Kaseorg wrote:
> > My Intel 6250 wireless card (iwldvm) can no longer associate with a 
> > WPA-Enterprise network (PEAP-MSCHAPv2).  To my surprise, I bisected this 
> > regression to commit e7c2f967445dd2041f0f8e3179cca22bb8bb7f79, 
> > workqueue: use mod_delayed_work() instead of __cancel + queue.

I see.

> > A bunch of logs collected by Ubuntu apport are in this bug report: 
> >   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1083980
> > 
> > How can I help to debug this?
> > 
> > I see that someone else reported another regression with the same commit 
> > last week, although this looks unrelated at first glance: 
> >   http://thread.gmane.org/gmane.linux.kernel/1395938

Urgh... that one was in my spam folder probably due to the mimed
content.  Nothing rings a bell yet.  Will keep looking into it.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [net-next PATCH V2 1/9] net: frag evictor, avoid killing warm frag queues
From: Jesper Dangaard Brouer @ 2012-11-30 21:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, fw, netdev, pablo, tgraf, amwang, kaber, paulmck,
	herbert
In-Reply-To: <1354293469.3299.81.camel@edumazet-glaptop>

On Fri, 2012-11-30 at 08:37 -0800, Eric Dumazet wrote:
> On Fri, 2012-11-30 at 16:45 +0100, Jesper Dangaard Brouer wrote:
> > On Fri, 2012-11-30 at 06:52 -0800, Eric Dumazet wrote:
> 
> > 
> > > I dont know how you expect that many
> > > datagrams being correctly reassembled with ipfrag_high_thresh=262144 
> > 
> > That's my point... I'm showing that its not possible, with out current
> > implementation!
> 
> What I was saying is that the limits are too small, and we should
> increase them for this particular need.
> 
> This has little to do with the underlying algo.

Actual data is an engineer's best friend.

[root@dragon ~]# sysctl -w net/ipv4/ipfrag_high_thresh=$((4<<20))
net.ipv4.ipfrag_high_thresh = 4194304
[root@dragon ~]# sysctl -w net/ipv4/ipfrag_low_thresh=$((3<<20))
net.ipv4.ipfrag_low_thresh = 3145728


[jbrouer@firesoul ~]$ netperf -H 192.168.51.2 -T0,0 -t UDP_STREAM -l 20 &\
 netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l 20
[1] 18573
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.51.2 (192.168.51.2) port 0 AF_INET : cpu bind
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.31.2 (192.168.31.2) port 0 AF_INET : cpu bind
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

229376   65507   20.00      363315      0    9519.86
212992           20.00        7297            191.20

Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

229376   65507   20.00      366927      0    9614.48
212992           20.00       10437            273.48


This test is 2x10G with straight NUMA nodes (meaning optimal NUMA
allocation where the incoming netperf packets are received by kernel and
delivered to netserver on the same NUMA node).


Come on Eric, you are smarter than this.  When will you realize that
dropping partly completed fragment queues is bad for performance? (And
thus a bad algorithmic choice in the evictor.)


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [net-next:master 98/98] drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
From: Andrew Gallatin @ 2012-11-30 21:51 UTC (permalink / raw)
  To: kbuild test robot; +Cc: netdev
In-Reply-To: <50b91efa.B0WbOtcWMs7eOSaC%fengguang.wu@intel.com>

On 11/30/12 16:02, kbuild test robot wrote:
> tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> head:   1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2
> commit: 1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2 [98/98] myri10ge: Add vlan rx for better GRO perf.
> 
> 
> sparse warnings:
> 
> + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:16: sparse: restricted __be16 degrades to integer


OK, maybe a dumb question again, but how do I get sparse to produce
the 'cast to restricted' warnings?  I ran sparse before submission,
but it only showed the pre-existing, non "cast to restricted"
warnings, so I did not know I was introducing a new warning.
Do I need to use a different architecture? (I was using x86_64).

Also, the line it is warning about is this:

> 1b4c44e6 Andrew Gallatin 2012-11-30 @1286  	    veh->h_vlan_proto == ntohs(ETH_P_8021Q)) {


Which seems to be nearly identical to the usage in
if_vlan.h:__vlan_get_tag, which I was treating as canonical.
So I'm a bit confused as to how to fix it.


Thanks,

Drew

^ permalink raw reply

* Re: [net-next:master 98/98] drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
From: Stephen Hemminger @ 2012-11-30 21:53 UTC (permalink / raw)
  To: Andrew Gallatin; +Cc: kbuild test robot, netdev
In-Reply-To: <50B92A6D.8000600@myri.com>

On Fri, 30 Nov 2012 16:51:41 -0500
Andrew Gallatin <gallatin@myri.com> wrote:

> On 11/30/12 16:02, kbuild test robot wrote:
> > tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> > head:   1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2
> > commit: 1b4c44e6369dbbafd113f1e00b406f1eda5ab5b2 [98/98] myri10ge: Add vlan rx for better GRO perf.
> > 
> > 
> > sparse warnings:
> > 
> > + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> > + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> > + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> > + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:34: sparse: cast to restricted __be16
> > + drivers/net/ethernet/myricom/myri10ge/myri10ge.c:1286:16: sparse: restricted __be16 degrades to integer
> 
> 
> OK, maybe a dumb question again, but how do I get sparse to produce
> the 'cast to restricted' warnings?  I ran sparse before submission,
> but it only showed the pre-existing, non "cast to restricted"
> warnings, so I did not know I was introducing a new warning.
> Do I need to use a different architecture? (I was using x86_64).

See Documentation/sparse.txt
  
  The optional make variable CF can be used to pass arguments to sparse.  The
  build system passes -Wbitwise to sparse automatically.  To perform endianness
  checks, you may define __CHECK_ENDIAN__:

        make C=2 CF="-D__CHECK_ENDIAN__"

  These checks are disabled by default as they generate a host of warnings.

^ permalink raw reply

* Re: [PATCH] Smack: Add missing depends on INET in Kconfig
From: Eric Paris @ 2012-11-30 22:01 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Randy Dunlap, Paul Moore, Stephen Rothwell, linux-next,
	Linux Kernel Mailing List, netdev@vger.kernel.org, LSM List
In-Reply-To: <50B8ECB3.2090801@schaufler-ca.com>

Do other LSMs need this too Casey?  I remember we mentioned how select
was dangerous  :-(

On Fri, Nov 30, 2012 at 12:28 PM, Casey Schaufler
<casey@schaufler-ca.com> wrote:
> Because NETLABEL depends on INET SECURITY_SMACK
> has to explicitly call out the dependency.
>
> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
> ---
>  security/smack/Kconfig |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/security/smack/Kconfig b/security/smack/Kconfig
> index 9fb14ef..1be1088 100644
> --- a/security/smack/Kconfig
> +++ b/security/smack/Kconfig
> @@ -1,5 +1,6 @@
>  config SECURITY_SMACK
>         bool "Simplified Mandatory Access Control Kernel Support"
> +       depends on INET
>         depends on NET
>         depends on SECURITY
>         select NETLABEL
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply

