Netdev List

Netdev List
 help / color / mirror / Atom feed

* RE: rx_dropped count for USB ethernet interfaces
From: David Laight @ 2013-09-27  8:39 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20130926.132125.1343558938307288175.davem@redhat.com>

> From: "David Laight" <David.Laight@ACULAB.COM>
> Date: Thu, 26 Sep 2013 10:32:24 +0100
> 
> > It isn't exactly useful behaviour though.
> 
> In your opinion.
> 
> We have tracepoints for people who want more fine grained drops
> in these kinds of situations.
> 
> Also, the behavior of this statistic has existed for more than a
> decade so changing it is really not in the cards.  Therefore any
> discussion about what would have been the best samentic to choose from
> the beginning is moot.

I suspect that not many people have networks with broadcast packets
for non-IP protocols - so don't ever see rx_dropped being incremented
because there wasn't a protocol stack looking for the packet.

A similar problem will arise if promiscuous mode is enabled
by someone who only wants a specific protocol - eg a specific
ethertype or a single LLC SAP value.

	David

^ permalink raw reply

* Re: ipv6: strange routing behaviors on a multi-interfaces setup
From: Hannes Frederic Sowa @ 2013-09-27  8:50 UTC (permalink / raw)
  To: Emmanuel Thierry; +Cc: netdev
In-Reply-To: <374AB8DA-F970-4652-B147-2094B56BE5F0@sekil.fr>

On Wed, Sep 25, 2013 at 06:53:07PM +0200, Emmanuel Thierry wrote:
> I experienced these problems on a 3.11.1 kernel but they look to be quite recurrent in the past versions as well. 

A current solution would be to bind the socket to an interface
(SO_BINDTODEVICE) from user-space. I'll look if we can make the next hop
selection more fine grained.

Greetings,

  Hannes

^ permalink raw reply

* Re: Bug - regression - Via velocity interface coming up freezes kernel
From: Dirk Kraft @ 2013-09-27  9:10 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev, Julia Lawall
In-Reply-To: <20130922221109.GA14246@electric-eye.fr.zoreil.com>

Hi,

My machine has now been running for multiple days with the proposed
patch. I was not able to see any warnings/error messages or any bad
behavior.

For me the proposed patch fixes the problem and I could not see any
new problems being introduced.

Thanks,
Dirk

^ permalink raw reply

* Re: [PATCH 1/4] [RFC] net: Explicitly initialize u64_stats_sync structures for lockdep
From: Eric Dumazet @ 2013-09-27  9:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: John Stultz, LKML, Thomas Petazzoni, Mirko Lindner,
	Stephen Hemminger, Roger Luethi, Patrick McHardy, Rusty Russell,
	Michael S. Tsirkin, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Wensong Zhang, Simon Horman, Julian Anastasov,
	Jesse Gross, Mathieu Desnoyers, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, David S. Miller, netdev, netfilter-devel
In-Reply-To: <20130927054455.GA6597@gmail.com>

On Fri, 2013-09-27 at 07:44 +0200, Ingo Molnar wrote:

> C cannot pass along symbolic names, unfortunately, so we are stuck with 
> 1970's tech and the C preprocessor.
> 

Yes, I realized that a bit too late.

> There's a way to make such macros look a tiny bit more structured and thus 
> be more palatable:
> 
> #if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
> # define u64_stats_init(syncp) seqcount_init(syncp.seq)
> #else
> # define u64_stats_init(syncp)
> #endif
> 
> Note, the 'else' branch should probably be:
> 
> # define u64_stats_init(syncp) do { } while (0)

Yes, this is better ;)

Thanks !



^ permalink raw reply

* Re: [PATCH net-next 1/5] tipc: silence sparse warnings
From: Ying Xue @ 2013-09-27  9:48 UTC (permalink / raw)
  To: Andreas Bofjäll; +Cc: jon.maloy, netdev, tipc-discussion, David Miller
In-Reply-To: <52453E8D.2020705@ericsson.com>

On 09/27/2013 04:15 PM, Andreas Bofjäll wrote:
> On 09/27/2013 10:01 AM, Ying Xue wrote:
>> Good point!
>> It's better for us to use memcpy_fromiovecend() instead of
>> skb_copy_to_linear_data() and its friends.
>>
>> We will submit another version to correct this error soon.
> 
> I could be wrong here, but I think you should also remove the entire
> cast on line 1210 in link.c:
> 
> sect_crs = msg_sect[curr_sect].iov_base;
> 
> There should be no reason to cast there since you made sect_crs into
> const unchar* __user and msg_sect[].iov_base is void* __user.
> 

You are right, thank you for your nice reminder!

Regards,
Ying

> /Andreas
> 
> 


------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk

^ permalink raw reply

* Re: [PATCHv2] ipvs: improved SH fallback strategy
From: Alexander Frolkin @ 2013-09-27 10:06 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Simon Horman, Sergei Shtylyov, lvs-devel, Wensong Zhang, netdev,
	linux-kernel
In-Reply-To: <alpine.LFD.2.00.1309262255400.1949@ja.ssi.bg>

Improve the SH fallback realserver selection strategy.

With sh and sh-fallback, if a realserver is down, this attempts to
distribute the traffic that would have gone to that server evenly
among the remaining servers.

Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>

--
diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
index 3588fae..cc65b2f 100644
--- a/net/netfilter/ipvs/ip_vs_sh.c
+++ b/net/netfilter/ipvs/ip_vs_sh.c
@@ -115,27 +115,46 @@ ip_vs_sh_get(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
 }
 
 
-/* As ip_vs_sh_get, but with fallback if selected server is unavailable */
+/* As ip_vs_sh_get, but with fallback if selected server is unavailable
+ *
+ * The fallback strategy loops around the table starting from a "random"
+ * point (in fact, it is chosen to be the original hash value to make the
+ * algorithm deterministic) to find a new server.
+ */
 static inline struct ip_vs_dest *
 ip_vs_sh_get_fallback(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
 		      const union nf_inet_addr *addr, __be16 port)
 {
-	unsigned int offset;
-	unsigned int hash;
+	unsigned int offset, roffset;
+	unsigned int hash, ihash;
 	struct ip_vs_dest *dest;
 
+	/* first try the dest it's supposed to go to */
+	ihash = ip_vs_sh_hashkey(svc->af, addr, port, 0);
+	dest = rcu_dereference(s->buckets[ihash].dest);
+	if (!dest)
+		return NULL;
+	if (!is_unavailable(dest))
+		return dest;
+
+	IP_VS_DBG_BUF(6, "SH: selected unavailable server %s:%d, reselecting",
+		      IP_VS_DBG_ADDR(svc->af, &dest->addr), ntohs(dest->port));
+
+	/* if the original dest is unavailable, loop around the table
+	 * starting from ihash to find a new dest
+	 */
 	for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
-		hash = ip_vs_sh_hashkey(svc->af, addr, port, offset);
+		roffset = (offset + ihash) % IP_VS_SH_TAB_SIZE;
+		hash = ip_vs_sh_hashkey(svc->af, addr, port, roffset);
 		dest = rcu_dereference(s->buckets[hash].dest);
 		if (!dest)
 			break;
-		if (is_unavailable(dest))
-			IP_VS_DBG_BUF(6, "SH: selected unavailable server "
-				      "%s:%d (offset %d)",
-				      IP_VS_DBG_ADDR(svc->af, &dest->addr),
-				      ntohs(dest->port), offset);
-		else
+		if (!is_unavailable(dest))
 			return dest;
+		IP_VS_DBG_BUF(6, "SH: selected unavailable "
+			      "server %s:%d (offset %d), reselecting",
+			      IP_VS_DBG_ADDR(svc->af, &dest->addr),
+			      ntohs(dest->port), roffset);
 	}
 
 	return NULL;


^ permalink raw reply related

* [PATCH] tcp: TSQ can use a dynamic limit
From: Eric Dumazet @ 2013-09-27 10:28 UTC (permalink / raw)
  To: Cong Wang, David Miller
  Cc: Wei Liu, Linux Kernel Network Developers, Yuchung Cheng,
	Neal Cardwell
In-Reply-To: <1379861902.3431.12.camel@edumazet-glaptop>

From: Eric Dumazet <edumazet@google.com>

When TCP Small Queues was added, we used a sysctl to limit amount of
packets queues on Qdisc/device queues for a given TCP flow.

Problem is this limit is either too big for low rates, or too small
for high rates.

Now TCP stack has rate estimation in sk->sk_pacing_rate, and TSO 
auto sizing, it can better control number of packets in Qdisc/device
queues.

New limit is two packets or at least 1 to 2 ms worth of packets.

Low rates flows benefit from this patch by having even smaller
number of packets in queues, allowing for faster recovery,
better RTT estimations.

High rates flows benefit from this patch by allowing more than 2 packets
in flight as we had reports this was a limiting factor to reach line
rate. [ In particular if TX completion is delayed because of coalescing
parameters ]

Example for a single flow on 10Gbp link controlled by FQ/pacing

14 packets in flight instead of 2

$ tc -s -d qd
qdisc fq 8001: dev eth0 root refcnt 32 limit 10000p flow_limit 100p
buckets 1024 quantum 3028 initial_quantum 15140 
 Sent 1168459366606 bytes 771822841 pkt (dropped 0, overlimits 0
requeues 6822476) 
 rate 9346Mbit 771713pps backlog 953820b 14p requeues 6822476 
  2047 flow, 2046 inactive, 1 throttled, delay 15673 ns
  2372 gc, 0 highprio, 0 retrans, 9739249 throttled, 0 flows_plimit

Note that sk_pacing_rate is currently set to twice the actual rate, but
this might be refined in the future when a flow is in congestion
avoidance.

Additional change : skb->destructor should be set to tcp_wfree().

A future patch (for linux 3.13+) might remove tcp_limit_output_bytes

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
 net/ipv4/tcp_output.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 7c83cb8..c20e406 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -895,8 +895,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 
 	skb_orphan(skb);
 	skb->sk = sk;
-	skb->destructor = (sysctl_tcp_limit_output_bytes > 0) ?
-			  tcp_wfree : sock_wfree;
+	skb->destructor = tcp_wfree;
 	atomic_add(skb->truesize, &sk->sk_wmem_alloc);
 
 	/* Build TCP header and checksum it. */
@@ -1840,7 +1839,6 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 	while ((skb = tcp_send_head(sk))) {
 		unsigned int limit;
 
-
 		tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
 		BUG_ON(!tso_segs);
 
@@ -1869,13 +1867,20 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 				break;
 		}
 
-		/* TSQ : sk_wmem_alloc accounts skb truesize,
-		 * including skb overhead. But thats OK.
+		/* TCP Small Queues :
+		 * Control number of packets in qdisc/devices to two packets / or ~1 ms.
+		 * This allows for :
+		 *  - better RTT estimation and ACK scheduling
+		 *  - faster recovery
+		 *  - high rates
 		 */
-		if (atomic_read(&sk->sk_wmem_alloc) >= sysctl_tcp_limit_output_bytes) {
+		limit = max(skb->truesize, sk->sk_pacing_rate >> 10);
+
+		if (atomic_read(&sk->sk_wmem_alloc) > limit) {
 			set_bit(TSQ_THROTTLED, &tp->tsq_flags);
 			break;
 		}
+
 		limit = mss_now;
 		if (tso_segs > 1 && !tcp_urg_mode(tp))
 			limit = tcp_mss_split_point(sk, skb, mss_now,

^ permalink raw reply related

* Re: [PATCH net-next 1/5] tipc: silence sparse warnings
From: Jon Maloy @ 2013-09-27 10:25 UTC (permalink / raw)
  To: Ying Xue
  Cc: Andreas Bofjäll, jon.maloy, netdev, paul.gortmaker,
	erik.hugne, tipc-discussion
In-Reply-To: <52455471.8060102@windriver.com>

On 09/27/2013 05:48 AM, Ying Xue wrote:
> On 09/27/2013 04:15 PM, Andreas Bofjäll wrote:
>> On 09/27/2013 10:01 AM, Ying Xue wrote:
>>> Good point!
>>> It's better for us to use memcpy_fromiovecend() instead of
>>> skb_copy_to_linear_data() and its friends.
>>>
>>> We will submit another version to correct this error soon.
Can you fix this asap, so I can re-post it? And, let's continue the thread
in tipc-discussion only until we (incl Paul) are happy with this patch.

///jon

>> I could be wrong here, but I think you should also remove the entire
>> cast on line 1210 in link.c:
>>
>> sect_crs = msg_sect[curr_sect].iov_base;
>>
>> There should be no reason to cast there since you made sect_crs into
>> const unchar* __user and msg_sect[].iov_base is void* __user.
>>
> You are right, thank you for your nice reminder!
>
> Regards,
> Ying
>
>> /Andreas
>>
>>

^ permalink raw reply

* Re: [PATCH] IPv6: Allow the MTU of ipip6 tunnel to be set below 1280
From: Oussama Ghorbel @ 2013-09-27 10:45 UTC (permalink / raw)
  To: Oussama Ghorbel, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel
In-Reply-To: <20130927083730.GC28287@order.stressinduktion.org>

The ip6_tunnel.c module would be then dependent on ip_tunnel.c and may
be it would not be good thing?
As I have check in v3.10 there is no call from ip6_tunnel to ip_tunnel...

For information, there is no check for the maximum MTU for ipv4 in the
patch as this is not done for ipv6.

Regards,
Oussama

On Fri, Sep 27, 2013 at 9:37 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Thu, Sep 26, 2013 at 03:51:48PM +0100, Oussama Ghorbel wrote:
>> The (inner) MTU of a ipip6 (IPv4-in-IPv6) tunnel cannot be set below 1280, which is the minimum MTU in IPv6.
>> However, there should be no IPv6 on the tunnel interface at all, so the IPv6 rules should not apply.
>> More info at https://bugzilla.kernel.org/show_bug.cgi?id=15530
>>
>> This patch allows to check the minimum MTU for ipv6 tunnel according to these rules:
>>  -In case the tunnel is configured with ipip6 mode the minimum MTU is 68.
>>  -In case the tunnel is configured with ip6ip6 or any mode the minimum MTU is 1280.
>>
>> Signed-off-by: Oussama Ghorbel <oghorbell@gmail.com>
>> ---
>>  net/ipv6/ip6_tunnel.c |   10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
>> index 1e55866..a66ead2 100644
>> --- a/net/ipv6/ip6_tunnel.c
>> +++ b/net/ipv6/ip6_tunnel.c
>> @@ -1423,8 +1423,14 @@ ip6_tnl_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
>>  static int
>>  ip6_tnl_change_mtu(struct net_device *dev, int new_mtu)
>>  {
>> -     if (new_mtu < IPV6_MIN_MTU) {
>> -             return -EINVAL;
>> +     struct ip6_tnl *t = netdev_priv(dev);
>> +
>> +     if (t->parms.proto == IPPROTO_IPIP) {
>> +             if (new_mtu < 68)
>> +                     return -EINVAL;
>
> Maybe you could have a look at ip_tunnel_change_mtu in ipv4/ip_tunnel.c,
> generalize this check as e.g. ip_tunnel_valid_mtu or something and use it
> here? Maybe an af-independent ip_tunnel_max_mtu()?
>
>> +     } else {
>> +             if (new_mtu < IPV6_MIN_MTU)
>> +                     return -EINVAL;
>
> This check could also be used here, then.
>
>>       }
>>       dev->mtu = new_mtu;
>>       return 0;
>
> Thanks,
>
>   Hannes
>

^ permalink raw reply

* Re: [PATCH] IPv6: Allow the MTU of ipip6 tunnel to be set below 1280
From: Hannes Frederic Sowa @ 2013-09-27 10:58 UTC (permalink / raw)
  To: Oussama Ghorbel
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel
In-Reply-To: <CA+ev270=P9txSvo7AZN0YcsopwhCYtJCgcZCPsdofEJ+BX+EUQ@mail.gmail.com>

On Fri, Sep 27, 2013 at 11:45:48AM +0100, Oussama Ghorbel wrote:
> The ip6_tunnel.c module would be then dependent on ip_tunnel.c and may
> be it would not be good thing?

It could just be a static inline in some shared header. So there would
be no compile-time dependency.

> As I have check in v3.10 there is no call from ip6_tunnel to ip_tunnel...
> 
> For information, there is no check for the maximum MTU for ipv4 in the
> patch as this is not done for ipv6.

I understand, but it would be better to limit the MTU here. There is a
(non-jumo) IPV6_MAXPLEN constant.

Looking through the source it seems grev6 does actually check this,
so it would not hurt adding them here, too.

Otherwise, I think your patch is fine.

Greetings,

  Hannes

^ permalink raw reply

* [PATCH next] l2tp: add support for IPv4-mapped IPv6 addresses
From: François Cachereul @ 2013-09-27 11:17 UTC (permalink / raw)
  To: James Chapman; +Cc: David S. Miller, netdev

IPv4 mapped addresses cause kernel panic.
The patch juste check whether the IPv6 address is an IPv4 mapped
address. If so, use IPv4 API instead of IPv6.

Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
---
 net/l2tp/l2tp_core.c |   28 ++++++++++++++++++++++++----
 net/l2tp/l2tp_core.h |    3 +++
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index feae495..731aa40 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -496,6 +496,7 @@ out:
 static inline int l2tp_verify_udp_checksum(struct sock *sk,
 					   struct sk_buff *skb)
 {
+	struct l2tp_tunnel *tunnel = (struct l2tp_tunnel *)sk->sk_user_data;
 	struct udphdr *uh = udp_hdr(skb);
 	u16 ulen = ntohs(uh->len);
 	__wsum psum;
@@ -504,7 +505,7 @@ static inline int l2tp_verify_udp_checksum(struct sock *sk,
 		return 0;
 
 #if IS_ENABLED(CONFIG_IPV6)
-	if (sk->sk_family == PF_INET6) {
+	if (sk->sk_family == PF_INET6 && !tunnel->v4mapped) {
 		if (!uh->check) {
 			LIMIT_NETDEBUG(KERN_INFO "L2TP: IPv6: checksum is 0\n");
 			return 1;
@@ -1128,7 +1129,7 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 	/* Queue the packet to IP for output */
 	skb->local_df = 1;
 #if IS_ENABLED(CONFIG_IPV6)
-	if (skb->sk->sk_family == PF_INET6)
+	if (skb->sk->sk_family == PF_INET6 && !tunnel->v4mapped)
 		error = inet6_csk_xmit(skb, NULL);
 	else
 #endif
@@ -1255,7 +1256,7 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len
 
 		/* Calculate UDP checksum if configured to do so */
 #if IS_ENABLED(CONFIG_IPV6)
-		if (sk->sk_family == PF_INET6)
+		if (sk->sk_family == PF_INET6 && !tunnel->v4mapped)
 			l2tp_xmit_ipv6_csum(sk, skb, udp_len);
 		else
 #endif
@@ -1620,6 +1621,10 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 	int err;
 	struct socket *sock = NULL;
 	struct sock *sk = NULL;
+#if IS_ENABLED(CONFIG_IPV6)
+	struct inet_sock *inet = NULL;
+	struct ipv6_pinfo *np = NULL;
+#endif
 	struct l2tp_net *pn;
 	enum l2tp_encap_type encap = L2TP_ENCAPTYPE_UDP;
 
@@ -1704,6 +1709,21 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 	if (cfg != NULL)
 		tunnel->debug = cfg->debug;
 
+#if IS_ENABLED(CONFIG_IPV6)
+	np = inet6_sk(sk);
+	if (sk->sk_family == PF_INET6 &&
+	    ipv6_addr_v4mapped(&np->saddr) &&
+	    ipv6_addr_v4mapped(&np->daddr)) {
+		tunnel->v4mapped = true;
+		inet = inet_sk(sk);
+		inet->inet_saddr = np->saddr.s6_addr32[3];
+		inet->inet_rcv_saddr = np->rcv_saddr.s6_addr32[3];
+		inet->inet_daddr = np->daddr.s6_addr32[3];
+	} else {
+		tunnel->v4mapped = false;
+	}
+#endif
+
 	/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
 	tunnel->encap = encap;
 	if (encap == L2TP_ENCAPTYPE_UDP) {
@@ -1712,7 +1732,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 		udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
 		udp_sk(sk)->encap_destroy = l2tp_udp_encap_destroy;
 #if IS_ENABLED(CONFIG_IPV6)
-		if (sk->sk_family == PF_INET6)
+		if (sk->sk_family == PF_INET6 && !tunnel->v4mapped)
 			udpv6_encap_enable();
 		else
 #endif
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 66a559b..6f251cb 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -194,6 +194,9 @@ struct l2tp_tunnel {
 	struct sock		*sock;		/* Parent socket */
 	int			fd;		/* Parent fd, if tunnel socket
 						 * was created by userspace */
+#if IS_ENABLED(CONFIG_IPV6)
+	bool			v4mapped;
+#endif
 
 	struct work_struct	del_work;
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH] ll_temac: Reset dma descriptors on ndo_open
From: Ricardo Ribalda Delgado @ 2013-09-27 11:24 UTC (permalink / raw)
  To: David S. Miller, Joe Perches, Jingoo Han, Greg Kroah-Hartman,
	Bill Pemberton, netdev, linux-kernel
  Cc: Ricardo Ribalda Delgado

The dma descriptors are only initialized on the probe function.

If a packet is on the buffer when temac_stop is called, the dma
descriptors can be left on a incorrect status where no other package can
be sent.

So an interface could be left in an usable state after ifdow/ifup.

This patch makes sure that the descriptors are in a proper status when
the device is started.

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
---
 drivers/net/ethernet/xilinx/ll_temac_main.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index b88121f..8bae87f 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -229,6 +229,25 @@ static void temac_dma_bd_release(struct net_device *ndev)
 }
 
 /**
+ * temac_dma_bd_reset - Reset buffer descriptor rings
+ */
+static void temac_dma_bd_reset(struct net_device *ndev)
+{
+	struct temac_local *lp = netdev_priv(ndev);
+	int i;
+
+	for (i = 0; i < TX_BD_NUM; i++)
+		lp->tx_bd_v[i].app0 = 0;
+
+	lp->tx_bd_ci = 0;
+	lp->tx_bd_next = 0;
+	lp->tx_bd_tail = 0;
+	lp->rx_bd_ci = 0;
+
+	return;
+}
+
+/**
  * temac_dma_bd_init - Setup buffer descriptor rings
  */
 static int temac_dma_bd_init(struct net_device *ndev)
@@ -864,6 +883,8 @@ static int temac_open(struct net_device *ndev)
 	if (rc)
 		goto err_rx_irq;
 
+	temac_dma_bd_reset(ndev);
+
 	return 0;
 
  err_rx_irq:
-- 
1.8.4.rc3

^ permalink raw reply related

* Re: question about map_read_chunks()
From: Dan Carpenter @ 2013-09-27 12:21 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, netdev, linux-nfs
In-Reply-To: <20120220095019.GA21338@elgon.mountain>

I have looked at this again, and I still worry that it looks like a bug.
(remote security related blah blah blah).

regards,
dan carpenter

On Mon, Feb 20, 2012 at 12:50:19PM +0300, Dan Carpenter wrote:
> I had a couple questions about some map_read_chunks().
> 
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> 
>    150          ch_bytes = ntohl(ch->rc_target.rs_length);
>                 ^^^^^^^^
> It look like this is 32 bits from the network?
> 
>    151          head->arg.head[0] = rqstp->rq_arg.head[0];
>    152          head->arg.tail[0] = rqstp->rq_arg.tail[0];
>    153          head->arg.pages = &head->pages[head->count];
>    154          head->hdr_count = head->count; /* save count of hdr pages */
>    155          head->arg.page_base = 0;
>    156          head->arg.page_len = ch_bytes;
>    157          head->arg.len = rqstp->rq_arg.len + ch_bytes;
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Can overflow.
>    158          head->arg.buflen = rqstp->rq_arg.buflen + ch_bytes;
>                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Same.  I didn't follow it through to see if an overflow matters.  Does
> it?
> 
>    159          head->count++;
>    160          chl_map->ch[0].start = 0;
>    161          while (byte_count) {
>    162                  rpl_map->sge[sge_no].iov_base =
>    163                          page_address(rqstp->rq_arg.pages[page_no]) + page_off;
>    164                  sge_bytes = min_t(int, PAGE_SIZE-page_off, ch_bytes);
>                                           ^^^
> This is the wrong cast to use.  A large ch_bytes would be counted as a
> negative value and get around the cap here.
> 
>    165                  rpl_map->sge[sge_no].iov_len = sge_bytes;
> 
> regards,
> dan carpenter

^ permalink raw reply

* [net v2 0/6][pull request] Intel Wired LAN Driver Updates
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann

This series contains updates to igb and i40e.

Todd provides a fix for 82580 devices in igb, where the ethtool
loopback test was missing 82580 copper devices.

Jesse provides five fixes/cleanups to i40e based on feedback from
Joe Perches and the community.

v2: fixed up patch 5 in the series based on feedback from Joe Perches
    and David Miller

The following are changes since commit f875691640cd3fa67f7ad1d3130ff9fa7fdca042:
  Merge branch 'fixes-for-3.12' of git://gitorious.org/linux-can/linux-can
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net master

Jesse Brandeburg (5):
  i40e: use common failure flow
  i40e: small clean ups from review
  i40e: convert ret to aq_ret
  i40e: better return values
  i40e: clean up coccicheck reported errors

Todd Fujinaka (1):
  igb: Fix ethtool loopback test for 82580 copper

 drivers/net/ethernet/intel/i40e/i40e_adminq.c |   7 +-
 drivers/net/ethernet/intel/i40e/i40e_common.c |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c   | 162 +++++++++++++-------------
 drivers/net/ethernet/intel/igb/igb_ethtool.c  |   3 +
 4 files changed, 89 insertions(+), 85 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* [net v2 3/6] i40e: small clean ups from review
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Jesse Brandeburg, netdev, gospo, sassmann, Joe Perches,
	Jeff Kirsher
In-Reply-To: <1380285358-15685-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

As mentioned by Joe Perches clean up a loop flow.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Joe Perches <joe@perches.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 67f8fd5..865bc6b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -174,8 +174,7 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
 			 u16 needed, u16 id)
 {
 	int ret = -ENOMEM;
-	int i = 0;
-	int j = 0;
+	int i, j;
 
 	if (!pile || needed == 0 || id >= I40E_PILE_VALID_BIT) {
 		dev_info(&pf->pdev->dev,
@@ -186,7 +185,7 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
 
 	/* start the linear search with an imperfect hint */
 	i = pile->search_hint;
-	while (i < pile->num_entries && ret < 0) {
+	while (i < pile->num_entries) {
 		/* skip already allocated entries */
 		if (pile->list[i] & I40E_PILE_VALID_BIT) {
 			i++;
@@ -205,6 +204,7 @@ static int i40e_get_lump(struct i40e_pf *pf, struct i40e_lump_tracking *pile,
 				pile->list[i+j] = id | I40E_PILE_VALID_BIT;
 			ret = i;
 			pile->search_hint = i + j;
+			break;
 		} else {
 			/* not enough, so skip over it and continue looking */
 			i += j;
-- 
1.8.3.1

^ permalink raw reply related

* [net v2 2/6] i40e: use common failure flow
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Jesse Brandeburg, netdev, gospo, sassmann, Joe Perches,
	Jeff Kirsher
In-Reply-To: <1380285358-15685-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

As mentioned by Joe Perches, we should be using
foo = alloc(...)
if (!foo)
	return -ENOMEM;

return 0;

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Joe Perches <joe@perches.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 601d482..67f8fd5 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -101,10 +101,10 @@ int i40e_allocate_dma_mem_d(struct i40e_hw *hw, struct i40e_dma_mem *mem,
 	mem->size = ALIGN(size, alignment);
 	mem->va = dma_zalloc_coherent(&pf->pdev->dev, mem->size,
 				      &mem->pa, GFP_KERNEL);
-	if (mem->va)
-		return 0;
+	if (!mem->va)
+		return -ENOMEM;
 
-	return -ENOMEM;
+	return 0;
 }
 
 /**
@@ -136,10 +136,10 @@ int i40e_allocate_virt_mem_d(struct i40e_hw *hw, struct i40e_virt_mem *mem,
 	mem->size = size;
 	mem->va = kzalloc(size, GFP_KERNEL);
 
-	if (mem->va)
-		return 0;
+	if (!mem->va)
+		return -ENOMEM;
 
-	return -ENOMEM;
+	return 0;
 }
 
 /**
-- 
1.8.3.1

^ permalink raw reply related

* [net v2 1/6] igb: Fix ethtool loopback test for 82580 copper
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Todd Fujinaka, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1380285358-15685-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Todd Fujinaka <todd.fujinaka@intel.com>

Add back 82580 loopback tests to ethtool.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 48cbc83..86d5142 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -1607,6 +1607,9 @@ static int igb_integrated_phy_loopback(struct igb_adapter *adapter)
 			igb_write_phy_reg(hw, I347AT4_PAGE_SELECT, 0);
 			igb_write_phy_reg(hw, PHY_CONTROL, 0x4140);
 		}
+	} else if (hw->phy.type == e1000_phy_82580) {
+		/* enable MII loopback */
+		igb_write_phy_reg(hw, I82580_PHY_LBK_CTRL, 0x8041);
 	}
 
 	/* add small delay to avoid loopback test failure */
-- 
1.8.3.1

^ permalink raw reply related

* [net v2 4/6] i40e: convert ret to aq_ret
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Jesse Brandeburg, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1380285358-15685-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

When calling admin queue functions the driver should use aq_ret
variable to help make clear that the return value is not a regular
return variable.

This allows for clean up of the return types that were previously
converted to int.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 107 ++++++++++++++--------------
 1 file changed, 52 insertions(+), 55 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 865bc6b..60c7152 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1388,7 +1388,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
 	bool add_happened = false;
 	int filter_list_len = 0;
 	u32 changed_flags = 0;
-	i40e_status ret = 0;
+	i40e_status aq_ret = 0;
 	struct i40e_pf *pf;
 	int num_add = 0;
 	int num_del = 0;
@@ -1449,28 +1449,28 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
 
 			/* flush a full buffer */
 			if (num_del == filter_list_len) {
-				ret = i40e_aq_remove_macvlan(&pf->hw,
+				aq_ret = i40e_aq_remove_macvlan(&pf->hw,
 					    vsi->seid, del_list, num_del,
 					    NULL);
 				num_del = 0;
 				memset(del_list, 0, sizeof(*del_list));
 
-				if (ret)
+				if (aq_ret)
 					dev_info(&pf->pdev->dev,
 						 "ignoring delete macvlan error, err %d, aq_err %d while flushing a full buffer\n",
-						 ret,
+						 aq_ret,
 						 pf->hw.aq.asq_last_status);
 			}
 		}
 		if (num_del) {
-			ret = i40e_aq_remove_macvlan(&pf->hw, vsi->seid,
+			aq_ret = i40e_aq_remove_macvlan(&pf->hw, vsi->seid,
 						     del_list, num_del, NULL);
 			num_del = 0;
 
-			if (ret)
+			if (aq_ret)
 				dev_info(&pf->pdev->dev,
 					 "ignoring delete macvlan error, err %d, aq_err %d\n",
-					 ret, pf->hw.aq.asq_last_status);
+					 aq_ret, pf->hw.aq.asq_last_status);
 		}
 
 		kfree(del_list);
@@ -1515,32 +1515,30 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
 
 			/* flush a full buffer */
 			if (num_add == filter_list_len) {
-				ret = i40e_aq_add_macvlan(&pf->hw,
-							  vsi->seid,
-							  add_list,
-							  num_add,
-							  NULL);
+				aq_ret = i40e_aq_add_macvlan(&pf->hw, vsi->seid,
+							     add_list, num_add,
+							     NULL);
 				num_add = 0;
 
-				if (ret)
+				if (aq_ret)
 					break;
 				memset(add_list, 0, sizeof(*add_list));
 			}
 		}
 		if (num_add) {
-			ret = i40e_aq_add_macvlan(&pf->hw, vsi->seid,
-						  add_list, num_add, NULL);
+			aq_ret = i40e_aq_add_macvlan(&pf->hw, vsi->seid,
+						     add_list, num_add, NULL);
 			num_add = 0;
 		}
 		kfree(add_list);
 		add_list = NULL;
 
-		if (add_happened && (!ret)) {
+		if (add_happened && (!aq_ret)) {
 			/* do nothing */;
-		} else if (add_happened && (ret)) {
+		} else if (add_happened && (aq_ret)) {
 			dev_info(&pf->pdev->dev,
 				 "add filter failed, err %d, aq_err %d\n",
-				 ret, pf->hw.aq.asq_last_status);
+				 aq_ret, pf->hw.aq.asq_last_status);
 			if ((pf->hw.aq.asq_last_status == I40E_AQ_RC_ENOSPC) &&
 			    !test_bit(__I40E_FILTER_OVERFLOW_PROMISC,
 				      &vsi->state)) {
@@ -1556,28 +1554,27 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
 	if (changed_flags & IFF_ALLMULTI) {
 		bool cur_multipromisc;
 		cur_multipromisc = !!(vsi->current_netdev_flags & IFF_ALLMULTI);
-		ret = i40e_aq_set_vsi_multicast_promiscuous(&vsi->back->hw,
-							    vsi->seid,
-							    cur_multipromisc,
-							    NULL);
-		if (ret)
+		aq_ret = i40e_aq_set_vsi_multicast_promiscuous(&vsi->back->hw,
+							       vsi->seid,
+							       cur_multipromisc,
+							       NULL);
+		if (aq_ret)
 			dev_info(&pf->pdev->dev,
 				 "set multi promisc failed, err %d, aq_err %d\n",
-				 ret, pf->hw.aq.asq_last_status);
+				 aq_ret, pf->hw.aq.asq_last_status);
 	}
 	if ((changed_flags & IFF_PROMISC) || promisc_forced_on) {
 		bool cur_promisc;
 		cur_promisc = (!!(vsi->current_netdev_flags & IFF_PROMISC) ||
 			       test_bit(__I40E_FILTER_OVERFLOW_PROMISC,
 					&vsi->state));
-		ret = i40e_aq_set_vsi_unicast_promiscuous(&vsi->back->hw,
-							  vsi->seid,
-							  cur_promisc,
-							  NULL);
-		if (ret)
+		aq_ret = i40e_aq_set_vsi_unicast_promiscuous(&vsi->back->hw,
+							     vsi->seid,
+							     cur_promisc, NULL);
+		if (aq_ret)
 			dev_info(&pf->pdev->dev,
 				 "set uni promisc failed, err %d, aq_err %d\n",
-				 ret, pf->hw.aq.asq_last_status);
+				 aq_ret, pf->hw.aq.asq_last_status);
 	}
 
 	clear_bit(__I40E_CONFIG_BUSY, &vsi->state);
@@ -1936,10 +1933,10 @@ static void i40e_restore_vlan(struct i40e_vsi *vsi)
  * @vsi: the vsi being adjusted
  * @vid: the vlan id to set as a PVID
  **/
-i40e_status i40e_vsi_add_pvid(struct i40e_vsi *vsi, u16 vid)
+int i40e_vsi_add_pvid(struct i40e_vsi *vsi, u16 vid)
 {
 	struct i40e_vsi_context ctxt;
-	i40e_status ret;
+	i40e_status aq_ret;
 
 	vsi->info.valid_sections = cpu_to_le16(I40E_AQ_VSI_PROP_VLAN_VALID);
 	vsi->info.pvid = cpu_to_le16(vid);
@@ -1948,14 +1945,15 @@ i40e_status i40e_vsi_add_pvid(struct i40e_vsi *vsi, u16 vid)
 
 	ctxt.seid = vsi->seid;
 	memcpy(&ctxt.info, &vsi->info, sizeof(vsi->info));
-	ret = i40e_aq_update_vsi_params(&vsi->back->hw, &ctxt, NULL);
-	if (ret) {
+	aq_ret = i40e_aq_update_vsi_params(&vsi->back->hw, &ctxt, NULL);
+	if (aq_ret) {
 		dev_info(&vsi->back->pdev->dev,
 			 "%s: update vsi failed, aq_err=%d\n",
 			 __func__, vsi->back->hw.aq.asq_last_status);
+		return -ENOENT;
 	}
 
-	return ret;
+	return 0;
 }
 
 /**
@@ -3451,28 +3449,27 @@ static int i40e_vsi_get_bw_info(struct i40e_vsi *vsi)
 	struct i40e_aqc_query_vsi_bw_config_resp bw_config = {0};
 	struct i40e_pf *pf = vsi->back;
 	struct i40e_hw *hw = &pf->hw;
+	i40e_status aq_ret;
 	u32 tc_bw_max;
-	int ret;
 	int i;
 
 	/* Get the VSI level BW configuration */
-	ret = i40e_aq_query_vsi_bw_config(hw, vsi->seid, &bw_config, NULL);
-	if (ret) {
+	aq_ret = i40e_aq_query_vsi_bw_config(hw, vsi->seid, &bw_config, NULL);
+	if (aq_ret) {
 		dev_info(&pf->pdev->dev,
 			 "couldn't get pf vsi bw config, err %d, aq_err %d\n",
-			 ret, pf->hw.aq.asq_last_status);
-		return ret;
+			 aq_ret, pf->hw.aq.asq_last_status);
+		return -EINVAL;
 	}
 
 	/* Get the VSI level BW configuration per TC */
-	ret = i40e_aq_query_vsi_ets_sla_config(hw, vsi->seid,
-					       &bw_ets_config,
-					       NULL);
-	if (ret) {
+	aq_ret = i40e_aq_query_vsi_ets_sla_config(hw, vsi->seid, &bw_ets_config,
+					          NULL);
+	if (aq_ret) {
 		dev_info(&pf->pdev->dev,
 			 "couldn't get pf vsi ets bw config, err %d, aq_err %d\n",
-			 ret, pf->hw.aq.asq_last_status);
-		return ret;
+			 aq_ret, pf->hw.aq.asq_last_status);
+		return -EINVAL;
 	}
 
 	if (bw_config.tc_valid_bits != bw_ets_config.tc_valid_bits) {
@@ -3494,7 +3491,7 @@ static int i40e_vsi_get_bw_info(struct i40e_vsi *vsi)
 		/* 3 bits out of 4 for each TC */
 		vsi->bw_ets_max_quanta[i] = (u8)((tc_bw_max >> (i*4)) & 0x7);
 	}
-	return ret;
+	return 0;
 }
 
 /**
@@ -3505,30 +3502,30 @@ static int i40e_vsi_get_bw_info(struct i40e_vsi *vsi)
  *
  * Returns 0 on success, negative value on failure
  **/
-static int i40e_vsi_configure_bw_alloc(struct i40e_vsi *vsi,
-				       u8 enabled_tc,
+static int i40e_vsi_configure_bw_alloc(struct i40e_vsi *vsi, u8 enabled_tc,
 				       u8 *bw_share)
 {
 	struct i40e_aqc_configure_vsi_tc_bw_data bw_data;
-	int i, ret = 0;
+	i40e_status aq_ret;
+	int i;
 
 	bw_data.tc_valid_bits = enabled_tc;
 	for (i = 0; i < I40E_MAX_TRAFFIC_CLASS; i++)
 		bw_data.tc_bw_credits[i] = bw_share[i];
 
-	ret = i40e_aq_config_vsi_tc_bw(&vsi->back->hw, vsi->seid,
-				       &bw_data, NULL);
-	if (ret) {
+	aq_ret = i40e_aq_config_vsi_tc_bw(&vsi->back->hw, vsi->seid, &bw_data,
+					  NULL);
+	if (aq_ret) {
 		dev_info(&vsi->back->pdev->dev,
 			 "%s: AQ command Config VSI BW allocation per TC failed = %d\n",
 			 __func__, vsi->back->hw.aq.asq_last_status);
-		return ret;
+		return -EINVAL;
 	}
 
 	for (i = 0; i < I40E_MAX_TRAFFIC_CLASS; i++)
 		vsi->info.qs_handle[i] = bw_data.qs_handles[i];
 
-	return ret;
+	return 0;
 }
 
 /**
-- 
1.8.3.1

^ permalink raw reply related

* [net v2 6/6] i40e: clean up coccicheck reported errors
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Jesse Brandeburg, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1380285358-15685-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

coccicheck shows:

drivers/net/ethernet/intel/i40e/i40e_adminq.c:704:2-8: Replace memcpy
with struct assignment
drivers/net/ethernet/intel/i40e/i40e_adminq.c:763:1-7: Replace memcpy
with struct assignment
drivers/net/ethernet/intel/i40e/i40e_adminq.c:810:2-8: Replace memcpy
with struct assignment
drivers/net/ethernet/intel/i40e/i40e_common.c:510:2-8: Replace memcpy
with struct assignment

Fix each of them with a *a = *b;

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_adminq.c | 7 +++----
 drivers/net/ethernet/intel/i40e/i40e_common.c | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index 0c524fa..cfef7fc 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -701,8 +701,7 @@ i40e_status i40e_asq_send_command(struct i40e_hw *hw,
 
 	details = I40E_ADMINQ_DETAILS(hw->aq.asq, hw->aq.asq.next_to_use);
 	if (cmd_details) {
-		memcpy(details, cmd_details,
-		       sizeof(struct i40e_asq_cmd_details));
+		*details = *cmd_details;
 
 		/* If the cmd_details are defined copy the cookie.  The
 		 * cpu_to_le32 is not needed here because the data is ignored
@@ -760,7 +759,7 @@ i40e_status i40e_asq_send_command(struct i40e_hw *hw,
 	desc_on_ring = I40E_ADMINQ_DESC(hw->aq.asq, hw->aq.asq.next_to_use);
 
 	/* if the desc is available copy the temp desc to the right place */
-	memcpy(desc_on_ring, desc, sizeof(struct i40e_aq_desc));
+	*desc_on_ring = *desc;
 
 	/* if buff is not NULL assume indirect command */
 	if (buff != NULL) {
@@ -807,7 +806,7 @@ i40e_status i40e_asq_send_command(struct i40e_hw *hw,
 
 	/* if ready, copy the desc back to temp */
 	if (i40e_asq_done(hw)) {
-		memcpy(desc, desc_on_ring, sizeof(struct i40e_aq_desc));
+		*desc = *desc_on_ring;
 		if (buff != NULL)
 			memcpy(buff, dma_buff->va, buff_size);
 		retval = le16_to_cpu(desc->retval);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index c21df7b..1e4ea13 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -507,7 +507,7 @@ i40e_status i40e_aq_get_link_info(struct i40e_hw *hw,
 
 	/* save link status information */
 	if (link)
-		memcpy(link, hw_link_info, sizeof(struct i40e_link_status));
+		*link = *hw_link_info;
 
 	/* flag cleared so helper functions don't call AQ again */
 	hw->phy.get_link_info = false;
-- 
1.8.3.1

^ permalink raw reply related

* [net v2 5/6] i40e: better return values
From: Jeff Kirsher @ 2013-09-27 12:35 UTC (permalink / raw)
  To: davem; +Cc: Jesse Brandeburg, netdev, gospo, sassmann, Joe Perches,
	Jeff Kirsher
In-Reply-To: <1380285358-15685-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

As mentioned by Joe Perches, clean up return values in some functions
making sure to have consistent return types, not mixing types.

A couple of Joe's comments suggested returning void, but since
the functions in question are ndo defined, the return values are fixed.
So make a comment in the header that notes this is a function called by
net_device_ops.

v2: fix post increment bug in return

CC: Joe Perches <joe@perches.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 37 ++++++++++++++++-------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 60c7152..221aa47 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1787,6 +1787,8 @@ int i40e_vsi_add_vlan(struct i40e_vsi *vsi, s16 vid)
  * i40e_vsi_kill_vlan - Remove vsi membership for given vlan
  * @vsi: the vsi being configured
  * @vid: vlan id to be removed (0 = untagged only , -1 = any)
+ *
+ * Return: 0 on success or negative otherwise
  **/
 int i40e_vsi_kill_vlan(struct i40e_vsi *vsi, s16 vid)
 {
@@ -1860,37 +1862,39 @@ int i40e_vsi_kill_vlan(struct i40e_vsi *vsi, s16 vid)
  * i40e_vlan_rx_add_vid - Add a vlan id filter to HW offload
  * @netdev: network interface to be adjusted
  * @vid: vlan id to be added
+ *
+ * net_device_ops implementation for adding vlan ids
  **/
 static int i40e_vlan_rx_add_vid(struct net_device *netdev,
 				__always_unused __be16 proto, u16 vid)
 {
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
-	int ret;
+	int ret = 0;
 
 	if (vid > 4095)
-		return 0;
+		return -EINVAL;
+
+	netdev_info(netdev, "adding %pM vid=%d\n", netdev->dev_addr, vid);
 
-	netdev_info(vsi->netdev, "adding %pM vid=%d\n",
-		    netdev->dev_addr, vid);
 	/* If the network stack called us with vid = 0, we should
 	 * indicate to i40e_vsi_add_vlan() that we want to receive
 	 * any traffic (i.e. with any vlan tag, or untagged)
 	 */
 	ret = i40e_vsi_add_vlan(vsi, vid ? vid : I40E_VLAN_ANY);
 
-	if (!ret) {
-		if (vid < VLAN_N_VID)
-			set_bit(vid, vsi->active_vlans);
-	}
+	if (!ret && (vid < VLAN_N_VID))
+		set_bit(vid, vsi->active_vlans);
 
-	return 0;
+	return ret;
 }
 
 /**
  * i40e_vlan_rx_kill_vid - Remove a vlan id filter from HW offload
  * @netdev: network interface to be adjusted
  * @vid: vlan id to be removed
+ *
+ * net_device_ops implementation for adding vlan ids
  **/
 static int i40e_vlan_rx_kill_vid(struct net_device *netdev,
 				 __always_unused __be16 proto, u16 vid)
@@ -1898,15 +1902,16 @@ static int i40e_vlan_rx_kill_vid(struct net_device *netdev,
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 
-	netdev_info(vsi->netdev, "removing %pM vid=%d\n",
-		    netdev->dev_addr, vid);
+	netdev_info(netdev, "removing %pM vid=%d\n", netdev->dev_addr, vid);
+
 	/* return code is ignored as there is nothing a user
 	 * can do about failure to remove and a log message was
-	 * already printed from another function
+	 * already printed from the other function
 	 */
 	i40e_vsi_kill_vlan(vsi, vid);
 
 	clear_bit(vid, vsi->active_vlans);
+
 	return 0;
 }
 
@@ -3324,7 +3329,8 @@ static void i40e_pf_unquiesce_all_vsi(struct i40e_pf *pf)
  **/
 static u8 i40e_dcb_get_num_tc(struct i40e_dcbx_config *dcbcfg)
 {
-	int num_tc = 0, i;
+	u8 num_tc = 0;
+	int i;
 
 	/* Scan the ETS Config Priority Table to find
 	 * traffic class enabled for a given priority
@@ -3339,9 +3345,7 @@ static u8 i40e_dcb_get_num_tc(struct i40e_dcbx_config *dcbcfg)
 	/* Traffic class index starts from zero so
 	 * increment to return the actual count
 	 */
-	num_tc++;
-
-	return num_tc;
+	return num_tc + 1;
 }
 
 /**
@@ -3491,6 +3495,7 @@ static int i40e_vsi_get_bw_info(struct i40e_vsi *vsi)
 		/* 3 bits out of 4 for each TC */
 		vsi->bw_ets_max_quanta[i] = (u8)((tc_bw_max >> (i*4)) & 0x7);
 	}
+
 	return 0;
 }
 
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 1/2] bonding: correctly verify for the first slave in bond_enslave
From: Veaceslav Falico @ 2013-09-27 13:10 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380287458-3488-1-git-send-email-vfalico@redhat.com>

After commit 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave"), we've moved the actual 'linking' in the
end of the function - so that, once linked, the slave is ready to be used,
and is not still in the process of enslaving.

However, 802.3ad verified if it's the first slave by looking at the

if (bond_first_slave(bond) == new_slave)

which, because we've moved the linking to the end, became broken - on the
first slave bond_first_slave(bond) returns NULL.

Fix this by verifying if the prev_slave, that equals bond_last_slave(), is
actually populated - if it is - then it's not the first slave, and vice
versa.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d5c3153..0367f80 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1557,7 +1557,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 		 */
 		bond_set_slave_inactive_flags(new_slave);
 		/* if this is the first slave */
-		if (bond_first_slave(bond) == new_slave) {
+		if (!prev_slave) {
 			SLAVE_AD_INFO(new_slave).id = 1;
 			/* Initialize AD with the number of times that the AD timer is called in 1 second
 			 * can be called only after the mac address of the bond is set
-- 
1.8.4

^ permalink raw reply related

* [PATCH net-next 0/2] bonding: fix 3ad slave (de)init
From: Veaceslav Falico @ 2013-09-27 13:10 UTC (permalink / raw)
  To: netdev; +Cc: Jay Vosburgh, Andy Gospodarek, Veaceslav Falico

After 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave") the (de)linking of slaves in
bond_enslave/bond_release_one happens in the correct places - after we've
completely initialized the slave (for bond_enslave) and before we've even
began to de-init the slave (for bond_release_one, respectively).

This was done to prevent any RCU readers to see the half-initialized slave
(because the RCU readers aren't blocked by bond->lock or rtnl_lock
usually).

However, 802.3ad logic, in several places, relied on the fact that the
slave is still linked to the bond.

Fix it by correctly handling these cases - we shouldn't rely that the slave
is linked before fully initialized and, respectively, that the slave is
still linked while it's being removed.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

---
 drivers/net/bonding/bond_3ad.c  | 4 +++-
 drivers/net/bonding/bond_main.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

^ permalink raw reply

* [PATCH net-next 2/2] bonding: verify if we still have slaves in bond_3ad_unbind_slave()
From: Veaceslav Falico @ 2013-09-27 13:10 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380287458-3488-1-git-send-email-vfalico@redhat.com>

After commit 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave"), we've moved the unlinking of the slave
to the earliest position possible - so that nobody will see an
half-uninited slave.

However, bond_3ad_unbind_slave() relied that, even while removing the last
slave, it is still accessible - via __get_first_agg() (and, eventually,
bond_first_slave()).

Fix that by verifying if the aggregator return is an actual aggregator, but
not NULL.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_3ad.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 1337eaf..2715ea8 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2056,7 +2056,9 @@ void bond_3ad_unbind_slave(struct slave *slave)
 				pr_info("%s: Removing an active aggregator\n",
 					slave->bond->dev->name);
 				// select new active aggregator
-				ad_agg_selection_logic(__get_first_agg(port));
+				temp_aggregator = __get_first_agg(port);
+				if (temp_aggregator)
+					ad_agg_selection_logic(temp_aggregator);
 			}
 		}
 	}
-- 
1.8.4

^ permalink raw reply related

* [PATCH v3] powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file
From: Aida Mynzhasova @ 2013-09-27 13:40 UTC (permalink / raw)
  To: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	richardcochran-Re5JQEeQqe8AvxtiuMwx3w,
	galak-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r

Currently IEEE 1588 timer reference clock source is determined through
hard-coded value in gianfar_ptp driver. This patch allows to select ptp
clock source by means of device tree file node.

For instance:

	fsl,cksel = <0>;

for using external (TSEC_TMR_CLK input) high precision timer
reference clock.

Other acceptable values:

	<1> : eTSEC system clock
	<2> : eTSEC1 transmit clock
	<3> : RTC clock input

When this attribute isn't used, eTSEC system clock will serve as
IEEE 1588 timer reference clock.

Signed-off-by: Aida Mynzhasova <aida.mynzhasova-Fmhy8gsqeTEvJsYlp49lxw@public.gmane.org>
---
 Documentation/devicetree/bindings/net/fsl-tsec-phy.txt | 18 +++++++++++++++++-
 drivers/net/ethernet/freescale/gianfar_ptp.c           |  4 +++-
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
index 2c6be03..d2ea460 100644
--- a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
+++ b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
@@ -86,6 +86,7 @@ General Properties:
 
 Clock Properties:
 
+  - fsl,cksel        Timer reference clock source.
   - fsl,tclk-period  Timer reference clock period in nanoseconds.
   - fsl,tmr-prsc     Prescaler, divides the output clock.
   - fsl,tmr-add      Frequency compensation value.
@@ -97,7 +98,7 @@ Clock Properties:
   clock. You must choose these carefully for the clock to work right.
   Here is how to figure good values:
 
-  TimerOsc     = system clock               MHz
+  TimerOsc     = selected reference clock   MHz
   tclk_period  = desired clock period       nanoseconds
   NominalFreq  = 1000 / tclk_period         MHz
   FreqDivRatio = TimerOsc / NominalFreq     (must be greater that 1.0)
@@ -114,6 +115,20 @@ Clock Properties:
   Pulse Per Second (PPS) signal, since this will be offered to the PPS
   subsystem to synchronize the Linux clock.
 
+  Reference clock source is determined by the value, which is holded
+  in CKSEL bits in TMR_CTRL register. "fsl,cksel" property keeps the
+  value, which will be directly written in those bits, that is why,
+  according to reference manual, the next clock sources can be used:
+
+  <0> - external high precision timer reference clock (TSEC_TMR_CLK
+        input is used for this purpose);
+  <1> - eTSEC system clock;
+  <2> - eTSEC1 transmit clock;
+  <3> - RTC clock input.
+
+  When this attribute is not used, eTSEC system clock will serve as
+  IEEE 1588 timer reference clock.
+
 Example:
 
 	ptp_clock@24E00 {
@@ -121,6 +136,7 @@ Example:
 		reg = <0x24E00 0xB0>;
 		interrupts = <12 0x8 13 0x8>;
 		interrupt-parent = < &ipic >;
+		fsl,cksel       = <1>;
 		fsl,tclk-period = <10>;
 		fsl,tmr-prsc    = <100>;
 		fsl,tmr-add     = <0x999999A4>;
diff --git a/drivers/net/ethernet/freescale/gianfar_ptp.c b/drivers/net/ethernet/freescale/gianfar_ptp.c
index 098f133..e006a09 100644
--- a/drivers/net/ethernet/freescale/gianfar_ptp.c
+++ b/drivers/net/ethernet/freescale/gianfar_ptp.c
@@ -452,7 +452,9 @@ static int gianfar_ptp_probe(struct platform_device *dev)
 	err = -ENODEV;
 
 	etsects->caps = ptp_gianfar_caps;
-	etsects->cksel = DEFAULT_CKSEL;
+
+	if (get_of_u32(node, "fsl,cksel", &etsects->cksel))
+		etsects->cksel = DEFAULT_CKSEL;
 
 	if (get_of_u32(node, "fsl,tclk-period", &etsects->tclk_period) ||
 	    get_of_u32(node, "fsl,tmr-prsc", &etsects->tmr_prsc) ||
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH net-next 0/9] bonding: remove bond_next_slave()
From: Veaceslav Falico @ 2013-09-27 14:11 UTC (permalink / raw)
  To: netdev; +Cc: nikolay, bhutchings, Jay Vosburgh, Andy Gospodarek,
	Veaceslav Falico

(on top of "[PATCH net-next 0/2] bonding: fix 3ad slave (de)init" - the
patchset is essential)

Hi,

As Ben Hutchings and Nikolay Aleksandrov correctly noted -
bond_next_slave() already is not O(1), but rather O(n), so it's only useful
for one-off operations and shouldn't be used widely - for example, for list
traversal, which will take O(n^2) time, which will be disastrous for any
hot path with a large number of slaves.

Currently, bond_next_slave() is used several times in 802.3ad and for
seq_read - bond_info_seq_next().

The 802.3ad part uses it only in constructs like:

for (X = __get_first_X(); X; X = __get_next_X()) {

where __get_next_X() uses bond_next_slave().

This for can (and should) actually be replaced by the standard

bond_for_each_slave(bond, slave, iter) {
	X = __get_X_by_slave(slave);

it's faster, easier to read, debug and maintain. Also, removes a lot of
code lines.

After replacing it, the only user of bond_next_slave() is
bond_info_seq_next() - which can, actually, implement it by itself, and not
call another function.

So, that way, we can completely remove the bond_next_slave(), cleanup and
optimize a bit.

p.s. the 802.3ad code is horrible, both style-wise and from the logical
point of view - so I've decided to *not* change anything except that what
this patch is intended to provide. The cleanup and some refactoring should
be done in another patchset, which I've began to work on already.

Thank you!

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

---
 drivers/net/bonding/bond_3ad.c    | 137 ++++++++++++++------------------------
 drivers/net/bonding/bond_main.c   |   2 +-
 drivers/net/bonding/bond_procfs.c |  16 +++--
 drivers/net/bonding/bonding.h     |  31 ---------
 4 files changed, 62 insertions(+), 124 deletions(-)

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox