Netdev List
* Re: [RFC] tcp demux used to signal ip_route_input_noref to not cache dst
From: David Miller @ 2012-06-27  8:19 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1340785104.26242.9.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 Jun 2012 10:18:24 +0200

> On Wed, 2012-06-27 at 09:52 +0200, Eric Dumazet wrote:
> 
>> I'll test the following patch in a moment.
>> 
>> For the moment, set nocache to true for all frames not associated to an
>> ESTABLISHED socket. Not sure we want to test SYN flag after all.
>> 
>>  include/net/protocol.h |    2 +-
>>  include/net/route.h    |    8 ++++----
>>  include/net/tcp.h      |    2 +-
>>  net/ipv4/arp.c         |    2 +-
>>  net/ipv4/ip_fragment.c |    2 +-
>>  net/ipv4/ip_input.c    |    5 +++--
>>  net/ipv4/route.c       |    8 +++++---
>>  net/ipv4/tcp_ipv4.c    |    4 +++-
>>  net/ipv4/xfrm4_input.c |    2 +-
>>  9 files changed, 20 insertions(+), 15 deletions(-)
> 
> Excellent results.
> 
> I am now able to resist DDoS synflood attacks, with no route cache
> pollution, and no more rt_garbage_collect() hits.

Sweet.

^ permalink raw reply

* Re: [RFC] tcp demux used to signal ip_route_input_noref to not cache dst
From: Eric Dumazet @ 2012-06-27  8:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <1340783533.26242.2.camel@edumazet-glaptop>

On Wed, 2012-06-27 at 09:52 +0200, Eric Dumazet wrote:

> I'll test the following patch in a moment.
> 
> For the moment, set nocache to true for all frames not associated to an
> ESTABLISHED socket. Not sure we want to test SYN flag after all.
> 
>  include/net/protocol.h |    2 +-
>  include/net/route.h    |    8 ++++----
>  include/net/tcp.h      |    2 +-
>  net/ipv4/arp.c         |    2 +-
>  net/ipv4/ip_fragment.c |    2 +-
>  net/ipv4/ip_input.c    |    5 +++--
>  net/ipv4/route.c       |    8 +++++---
>  net/ipv4/tcp_ipv4.c    |    4 +++-
>  net/ipv4/xfrm4_input.c |    2 +-
>  9 files changed, 20 insertions(+), 15 deletions(-)

Excellent results.

I am now able to resist DDoS synflood attacks, with no route cache
pollution, and no more rt_garbage_collect() hits.

^ permalink raw reply

* Re: [PATCH net-next] be2net: Fix to trim skb for padded vlan packets to workaround an ASIC Bug
From: David Miller @ 2012-06-27  8:18 UTC (permalink / raw)
  To: somnath.kotur; +Cc: netdev
In-Reply-To: <2931b75a-cf52-48cd-8639-388538fe9c26@exht1.ad.emulex.com>

From: Somnath Kotur <somnath.kotur@emulex.com>
Date: Wed, 27 Jun 2012 11:34:29 +0530

> +	/* HW has a bug whicn considers padding bytes as legal
                        ^^^^^

Please fix that spelling error.

^ permalink raw reply

* Re: [patch -next] 6lowpan: double unlock on an error path
From: David Miller @ 2012-06-27  8:17 UTC (permalink / raw)
  To: alex.bluesman.smirnov
  Cc: dan.carpenter, dbaryshkov, slapin, linux-zigbee-devel, netdev,
	kernel-janitors
In-Reply-To: <CAJmB2rCwYo1KR_gOQFrvMn2A2px1mPEN5DV+KGtjV914p-o7NA@mail.gmail.com>

From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Date: Wed, 27 Jun 2012 10:57:18 +0400

> 2012/6/27 Dan Carpenter <dan.carpenter@oracle.com>:
>> We already unlocked a few lines earlier here, so we can go directly to
>> drop without passing through unlock.  This was introduced recently in
>> c5d3687f6c ('6lowpan: read data from skb safely').
>>
>> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
 ...
> Acked-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [RFC] tcp demux used to signal ip_route_input_noref to not cache dst
From: David Miller @ 2012-06-27  8:15 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1340783533.26242.2.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 Jun 2012 09:52:13 +0200

> I'll test the following patch in a moment.
> 
> For the moment, set nocache to true for all frames not associated to an
> ESTABLISHED socket. Not sure we want to test SYN flag after all.

Looks good.

After this change goes in I'm going to change the calling
convention, especially since I really hate functions that
return multiple values using pass-by-reference to accomplish
this.

What I plan to do is move the early socket demux before the
skb_dst()==NULL check, then we don't need the error return.

Subsequently we can return a bool which is your new nocache
value.
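
The two calling conventions discussed here can be contrasted with a
small userspace sketch. This is only an illustration under stated
assumptions: struct fake_skb, early_demux_outparam() and
early_demux_bool() are hypothetical stand-ins, not the kernel's
types or functions, and the real demux does far more than test a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only models the demux outcome. */
struct fake_skb {
	bool has_established_sock; /* an ESTABLISHED socket matches this frame */
};

/* RFC shape: status code returned, nocache hint passed back by reference. */
static int early_demux_outparam(const struct fake_skb *skb, bool *nocache)
{
	assert(skb != NULL && nocache != NULL);
	if (skb->has_established_sock)
		return 0;
	*nocache = true;	/* no ESTABLISHED socket: do not cache the dst */
	return -ENOENT;
}

/* Planned shape (assumed): one bool return, which is the nocache hint. */
static bool early_demux_bool(const struct fake_skb *skb)
{
	assert(skb != NULL);
	return !skb->has_established_sock;
}
```

With the second form the caller needs no local "err" and no extra
pointer argument; it simply passes the returned bool on to the route
lookup.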

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: David Miller @ 2012-06-27  8:13 UTC (permalink / raw)
  To: eric.dumazet
  Cc: brouer, hans.schillstrom, subramanian.vijay, dave.taht, netdev,
	ncardwell, therbert, mph
In-Reply-To: <1340782216.10893.427.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 Jun 2012 09:30:16 +0200

> Many linux servers in colocations are still using a mono queue NIC, and
> default linux configuration is to use a single cpu to handle all
> incoming frames (no RPS/RFS).

Even worse, many are virtualized guests with a single virtual netdev
queue :-)

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: Eric Dumazet @ 2012-06-27  8:02 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, hans.schillstrom, subramanian.vijay, dave.taht,
	netdev, ncardwell, therbert, mph
In-Reply-To: <1340783670.2028.141.camel@localhost>

On Wed, 2012-06-27 at 09:54 +0200, Jesper Dangaard Brouer wrote:

> Well, that would lead to parallel SYN processing, wouldn't it?

I think we already discussed the issues with the current code.

Telling people to spread SYN packets across several cpus is a good way
to get a freeze in case of synflood, because 15 cpus are busy looping
while one is making progress.

That's why Intel felt the need for a hardware filter to direct all SYN
packets to a single queue.

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: Jesper Dangaard Brouer @ 2012-06-27  7:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, hans.schillstrom, subramanian.vijay, dave.taht,
	netdev, ncardwell, therbert, mph
In-Reply-To: <1340782216.10893.427.camel@edumazet-glaptop>

On Wed, 2012-06-27 at 09:30 +0200, Eric Dumazet wrote:
> On Wed, 2012-06-27 at 09:24 +0200, Jesper Dangaard Brouer wrote:
> 
> > But I still believe that we need to solve these SYN issues by parallel
> > processing of packets. (It seems Eric and Hans are looking at a single
> > core SYN processing scheme, but I might have missed their point).
> 
> Yep
> 
> Parallel processing will only benefit multiqueue setups.
> 
> Many linux servers in colocations are still using a mono queue NIC, and
> default linux configuration is to use a single cpu to handle all
> incoming frames (no RPS/RFS).

I see, your target is different from mine (now I understand your
motivation).  It's good, as optimizing the single-queue case would also
be a benefit once we implement parallel processing / take advantage of
the multi-queue devices.


> Sometimes the hw IRQ itself is distributed among several cpus, but at one
> single moment, only one cpu is serving the NAPI poll.
> 
> As long as the LISTEN processing is locking the socket, there is no
> point distributing SYN packets to multiple cpus, this only adds
> contention and poor performance because of false sharing.
> 
> My plan is to get rid of the socket lock for LISTEN and use RCU instead.

Well, that would lead to parallel SYN processing, wouldn't it?

^ permalink raw reply

* Re: [RFC] tcp demux used to signal ip_route_input_noref to not cache dst
From: Eric Dumazet @ 2012-06-27  7:52 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <1340781553.10893.414.camel@edumazet-glaptop>

On Wed, 2012-06-27 at 09:19 +0200, Eric Dumazet wrote:
> In case tcp_v{4|6}_early_demux() doesn't find an ESTABLISHED socket, and
> SYN flag is set, and an "atomic_t listener_under_synflood" counter is
> not 0, we could:
> 
> - instruct ip_rcv_finish() not to cache the input dst in the route
> cache (if the dst is not found in the hash table)
> 
> This would make synflood attacks have minimal impact on the route cache
> 
> (We did this for the output dst of SYN-cookie-ACK messages)
> 
> 

I'll test the following patch in a moment.

For the moment, set nocache to true for all frames not associated to an
ESTABLISHED socket. Not sure we want to test SYN flag after all.

 include/net/protocol.h |    2 +-
 include/net/route.h    |    8 ++++----
 include/net/tcp.h      |    2 +-
 net/ipv4/arp.c         |    2 +-
 net/ipv4/ip_fragment.c |    2 +-
 net/ipv4/ip_input.c    |    5 +++--
 net/ipv4/route.c       |    8 +++++---
 net/ipv4/tcp_ipv4.c    |    4 +++-
 net/ipv4/xfrm4_input.c |    2 +-
 9 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/include/net/protocol.h b/include/net/protocol.h
index 967b926..7cfc8f7 100644
--- a/include/net/protocol.h
+++ b/include/net/protocol.h
@@ -37,7 +37,7 @@
 
 /* This is used to register protocols. */
 struct net_protocol {
-	int			(*early_demux)(struct sk_buff *skb);
+	int			(*early_demux)(struct sk_buff *skb, bool *nocache);
 	int			(*handler)(struct sk_buff *skb);
 	void			(*err_handler)(struct sk_buff *skb, u32 info);
 	int			(*gso_send_check)(struct sk_buff *skb);
diff --git a/include/net/route.h b/include/net/route.h
index 47eb25a..6361f93 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -201,18 +201,18 @@ static inline struct rtable *ip_route_output_gre(struct net *net, struct flowi4
 }
 
 extern int ip_route_input_common(struct sk_buff *skb, __be32 dst, __be32 src,
-				 u8 tos, struct net_device *devin, bool noref);
+				 u8 tos, struct net_device *devin, bool noref, bool nocache);
 
 static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
 				 u8 tos, struct net_device *devin)
 {
-	return ip_route_input_common(skb, dst, src, tos, devin, false);
+	return ip_route_input_common(skb, dst, src, tos, devin, false, false);
 }
 
 static inline int ip_route_input_noref(struct sk_buff *skb, __be32 dst, __be32 src,
-				       u8 tos, struct net_device *devin)
+				       u8 tos, struct net_device *devin, bool nocache)
 {
-	return ip_route_input_common(skb, dst, src, tos, devin, true);
+	return ip_route_input_common(skb, dst, src, tos, devin, true, nocache);
 }
 
 extern void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6660ffc..917ed2e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -325,7 +325,7 @@ extern void tcp_v4_err(struct sk_buff *skb, u32);
 
 extern void tcp_shutdown (struct sock *sk, int how);
 
-extern int tcp_v4_early_demux(struct sk_buff *skb);
+extern int tcp_v4_early_demux(struct sk_buff *skb, bool *nocache);
 extern int tcp_v4_rcv(struct sk_buff *skb);
 
 extern struct inet_peer *tcp_v4_get_peer(struct sock *sk);
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 2e560f0..6a97959 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -828,7 +828,7 @@ static int arp_process(struct sk_buff *skb)
 	}
 
 	if (arp->ar_op == htons(ARPOP_REQUEST) &&
-	    ip_route_input_noref(skb, tip, sip, 0, dev) == 0) {
+	    ip_route_input_noref(skb, tip, sip, 0, dev, false) == 0) {
 
 		rt = skb_rtable(skb);
 		addr_type = rt->rt_type;
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 8d07c97..978d55f 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -259,7 +259,7 @@ static void ip_expire(unsigned long arg)
 		skb_dst_drop(head);
 		iph = ip_hdr(head);
 		err = ip_route_input_noref(head, iph->daddr, iph->saddr,
-					   iph->tos, head->dev);
+					   iph->tos, head->dev, false);
 		if (err)
 			goto out_rcu_unlock;
 
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 2a39204..7be54c8 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -326,6 +326,7 @@ static int ip_rcv_finish(struct sk_buff *skb)
 	 */
 	if (skb_dst(skb) == NULL) {
 		int err = -ENOENT;
+		bool nocache = false;
 
 		if (sysctl_ip_early_demux) {
 			const struct net_protocol *ipprot;
@@ -334,13 +335,13 @@ static int ip_rcv_finish(struct sk_buff *skb)
 			rcu_read_lock();
 			ipprot = rcu_dereference(inet_protos[protocol]);
 			if (ipprot && ipprot->early_demux)
-				err = ipprot->early_demux(skb);
+				err = ipprot->early_demux(skb, &nocache);
 			rcu_read_unlock();
 		}
 
 		if (err) {
 			err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
-						   iph->tos, skb->dev);
+						   iph->tos, skb->dev, nocache);
 			if (unlikely(err)) {
 				if (err == -EXDEV)
 					NET_INC_STATS_BH(dev_net(skb->dev),
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 81533e3..fdc7900 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2214,7 +2214,7 @@ static int ip_mkroute_input(struct sk_buff *skb,
  */
 
 static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
-			       u8 tos, struct net_device *dev)
+			       u8 tos, struct net_device *dev, bool nocache)
 {
 	struct fib_result res;
 	struct in_device *in_dev = __in_dev_get_rcu(dev);
@@ -2353,6 +2353,8 @@ local_input:
 		rth->dst.error= -err;
 		rth->rt_flags 	&= ~RTCF_LOCAL;
 	}
+	if (nocache)
+		rth->dst.flags |= DST_NOCACHE;
 	hash = rt_hash(daddr, saddr, fl4.flowi4_iif, rt_genid(net));
 	rth = rt_intern_hash(hash, rth, skb, fl4.flowi4_iif);
 	err = 0;
@@ -2395,7 +2397,7 @@ martian_source_keep_err:
 }
 
 int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr,
-			   u8 tos, struct net_device *dev, bool noref)
+			   u8 tos, struct net_device *dev, bool noref, bool nocache)
 {
 	struct rtable	*rth;
 	unsigned int	hash;
@@ -2471,7 +2473,7 @@ skip_cache:
 		rcu_read_unlock();
 		return -EINVAL;
 	}
-	res = ip_route_input_slow(skb, daddr, saddr, tos, dev);
+	res = ip_route_input_slow(skb, daddr, saddr, tos, dev, nocache);
 	rcu_read_unlock();
 	return res;
 }
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1781dc6..33aabd4 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1673,7 +1673,7 @@ csum_err:
 }
 EXPORT_SYMBOL(tcp_v4_do_rcv);
 
-int tcp_v4_early_demux(struct sk_buff *skb)
+int tcp_v4_early_demux(struct sk_buff *skb, bool *no_dst_cache)
 {
 	struct net *net = dev_net(skb->dev);
 	const struct iphdr *iph;
@@ -1719,6 +1719,8 @@ int tcp_v4_early_demux(struct sk_buff *skb)
 				}
 			}
 		}
+	} else {
+		*no_dst_cache = true;
 	}
 
 out_err:
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index 06814b6..eee636b 100644
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -28,7 +28,7 @@ static inline int xfrm4_rcv_encap_finish(struct sk_buff *skb)
 		const struct iphdr *iph = ip_hdr(skb);
 
 		if (ip_route_input_noref(skb, iph->daddr, iph->saddr,
-					 iph->tos, skb->dev))
+					 iph->tos, skb->dev, false))
 			goto drop;
 	}
 	return dst_input(skb);

^ permalink raw reply related

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: Eric Dumazet @ 2012-06-27  7:30 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, hans.schillstrom, subramanian.vijay, dave.taht,
	netdev, ncardwell, therbert, mph
In-Reply-To: <1340781845.2028.133.camel@localhost>

On Wed, 2012-06-27 at 09:24 +0200, Jesper Dangaard Brouer wrote:

> But I still believe that we need to solve these SYN issues by parallel
> processing of packets. (It seems Eric and Hans are looking at a single
> core SYN processing scheme, but I might have missed their point).

Yep

Parallel processing will only benefit multiqueue setups.

Many linux servers in colocations are still using a mono queue NIC, and
default linux configuration is to use a single cpu to handle all
incoming frames (no RPS/RFS).

Sometimes the hw IRQ itself is distributed among several cpus, but at one
single moment, only one cpu is serving the NAPI poll.

As long as the LISTEN processing is locking the socket, there is no
point distributing SYN packets to multiple cpus, this only adds
contention and poor performance because of false sharing.

My plan is to get rid of the socket lock for LISTEN and use RCU instead.

^ permalink raw reply

* Re: [PATCH 0/18] Kill off NLMSG_NEW and NLMSG_PUT
From: David Miller @ 2012-06-27  7:25 UTC (permalink / raw)
  To: tgraf; +Cc: netdev
In-Reply-To: <20120627071042.GD31808@canuck.infradead.org>

From: Thomas Graf <tgraf@suug.ch>
Date: Wed, 27 Jun 2012 03:10:42 -0400

> I have a patchset that does just that that I was about to submit today.
> I'll rebase on top of these patches.

Thanks a lot.

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: Jesper Dangaard Brouer @ 2012-06-27  7:24 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, hans.schillstrom, subramanian.vijay, dave.taht,
	netdev, ncardwell, therbert, mph
In-Reply-To: <20120626.235423.588696200884989114.davem@davemloft.net>

On Tue, 2012-06-26 at 23:54 -0700, David Miller wrote:
> From: Jesper Dangaard Brouer <brouer@redhat.com>
> Date: Wed, 27 Jun 2012 08:32:13 +0200
> 
> > Using it as default, might be "dangerous" and open an attack vector
> > on SYN cookies in Linux.
> 
> If it's dangerous for syncookies then it's just as dangerous for
> the routing hash and the socket hashes where we use it already.
> 
> Therefore, this sounds like a baseless claim to me.

Yes, you are right. Looking at your patch again, you also use
syncookie_secret[c] as initval.  So, it should be safe.

But I still believe that we need to solve these SYN issues by parallel
processing of packets. (It seems Eric and Hans are looking at a single
core SYN processing scheme, but I might have missed their point).

^ permalink raw reply

* [RFC] tcp demux used to signal ip_route_input_noref to not cache dst
From: Eric Dumazet @ 2012-06-27  7:19 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

In case tcp_v{4|6}_early_demux() doesn't find an ESTABLISHED socket, and
SYN flag is set, and an "atomic_t listener_under_synflood" counter is
not 0, we could:

- instruct ip_rcv_finish() not to cache the input dst in the route
cache (if the dst is not found in the hash table)

This would make synflood attacks have minimal impact on the route cache

(We did this for the output dst of SYN-cookie-ACK messages)
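
The condition proposed above can be written out as a small userspace
sketch. Everything here is a hypothetical model: struct pkt_info and
should_skip_route_cache() do not exist in the kernel, and a plain int
stands in for the suggested "atomic_t listener_under_synflood"
counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-packet facts that early demux would know. */
struct pkt_info {
	bool established_sock_found; /* demux matched an ESTABLISHED socket */
	bool syn_flag;               /* TCP SYN bit is set */
};

/* Plain int standing in for the proposed atomic_t counter (assumption). */
static int listener_under_synflood;

/* Returns true when the input dst should NOT be added to the route cache. */
static bool should_skip_route_cache(const struct pkt_info *p)
{
	assert(p != NULL);
	if (p->established_sock_found)
		return false;	/* known flow: caching the dst is useful */
	return p->syn_flag && listener_under_synflood != 0;
}
```

A SYN for an unknown flow then skips the cache only while some listener
reports a flood; every other frame keeps the existing behaviour.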

^ permalink raw reply

* Re: [PATCH 0/18] Kill off NLMSG_NEW and NLMSG_PUT
From: Thomas Graf @ 2012-06-27  7:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120626.220155.1298849942906073647.davem@davemloft.net>

On Tue, Jun 26, 2012 at 10:01:55PM -0700, David Miller wrote:
> 
> Bad API, embedded gotos, error prone, etc.
> 
> Next pass we'll have to deal with the RTA_PUT*() macros
> too.

I have a patchset that does just that that I was about to submit today.
I'll rebase on top of these patches.

Reviewed-by: Thomas Graf <tgraf@suug.ch>

^ permalink raw reply

* Re: [patch -next] 6lowpan: double unlock on an error path
From: Alexander Smirnov @ 2012-06-27  6:57 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Dmitry Eremin-Solenikov, Sergey Lapin, David S. Miller,
	linux-zigbee-devel, netdev, kernel-janitors
In-Reply-To: <20120627065309.GA25774@elgon.mountain>

2012/6/27 Dan Carpenter <dan.carpenter@oracle.com>:
> We already unlocked a few lines earlier here, so we can go directly to
> drop without passing through unlock.  This was introduced recently in
> c5d3687f6c ('6lowpan: read data from skb safely').
>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
> index ad0c226..cd5007f 100644
> --- a/net/ieee802154/6lowpan.c
> +++ b/net/ieee802154/6lowpan.c
> @@ -771,7 +771,7 @@ lowpan_process_data(struct sk_buff *skb)
>                        kfree(frame);
>
>                        if (lowpan_fetch_skb_u8(skb, &iphc0))
> -                               goto unlock_and_drop;
> +                               goto drop;
>
>                        break;
>                }

Acked-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>

^ permalink raw reply

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: David Miller @ 2012-06-27  6:54 UTC (permalink / raw)
  To: brouer
  Cc: eric.dumazet, hans.schillstrom, subramanian.vijay, dave.taht,
	netdev, ncardwell, therbert, mph
In-Reply-To: <1340778733.2028.110.camel@localhost>

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Wed, 27 Jun 2012 08:32:13 +0200

> Using it as default, might be "dangerous" and open an attack vector
> on SYN cookies in Linux.

If it's dangerous for syncookies then it's just as dangerous for
the routing hash and the socket hashes where we use it already.

Therefore, this sounds like a baseless claim to me.

^ permalink raw reply

* [patch -next] 6lowpan: double unlock on an error path
From: Dan Carpenter @ 2012-06-27  6:53 UTC (permalink / raw)
  To: Dmitry Eremin-Solenikov
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA,
	linux-zigbee-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	David S. Miller

We already unlocked a few lines earlier here, so we can go directly to
drop without passing through unlock.  This was introduced recently in
c5d3687f6c ('6lowpan: read data from skb safely').

Signed-off-by: Dan Carpenter <dan.carpenter-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index ad0c226..cd5007f 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -771,7 +771,7 @@ lowpan_process_data(struct sk_buff *skb)
 			kfree(frame);
 
 			if (lowpan_fetch_skb_u8(skb, &iphc0))
-				goto unlock_and_drop;
+				goto drop;
 
 			break;
 		}
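
The class of bug fixed above can be reproduced in a few lines of
userspace C. This is a toy model, not the 6lowpan code: struct
toy_lock, its helpers and process() are hypothetical, and the "fixed"
flag only selects which label the error path jumps to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock that counts releases made while it is not held. */
struct toy_lock {
	int held;
	int bad_unlocks;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	l->held++;
}

static void toy_lock_release(struct toy_lock *l)
{
	if (l->held == 0)
		l->bad_unlocks++;	/* release without a matching acquire */
	else
		l->held--;
}

static int process(struct toy_lock *l, bool fetch_fails, bool fixed)
{
	assert(l != NULL);
	toy_lock_acquire(l);
	/* ... work done under the lock ... */
	toy_lock_release(l);		/* the lock is already dropped here */

	if (fetch_fails) {
		if (fixed)
			goto drop;		/* patched path */
		goto unlock_and_drop;		/* old path: releases again */
	}
	return 0;

unlock_and_drop:
	toy_lock_release(l);
drop:
	return -1;
}
```

Jumping to unlock_and_drop after the lock has been released is what
produces the second, unbalanced unlock; jumping straight to drop
avoids it.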


^ permalink raw reply related

* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: Jesper Dangaard Brouer @ 2012-06-27  6:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hans Schillstrom, David Miller, subramanian.vijay@gmail.com,
	dave.taht@gmail.com, netdev@vger.kernel.org, ncardwell@google.com,
	therbert@google.com, Martin Topholm
In-Reply-To: <1340730156.10893.359.camel@edumazet-glaptop>

On Tue, 2012-06-26 at 19:02 +0200, Eric Dumazet wrote:
> On Tue, 2012-06-26 at 07:34 +0200, Hans Schillstrom wrote:
> 
> > This patch didn't give much in gain actually.
> 
> With a 100Mbps link it does.
> 
> With a 1Gbps link we are CPU bound for sure.

I'm using a 10G link

> > The big cycle consumer during a syn attack is SHA sum right now, 
> > so from that perspective it's better to add aes crypto (by using AES-NI) 
> > to the syn cookies instead of SHA sum. Even if only newer x86_64 can use it.

How are you avoiding the lock bh_lock_sock_nested(sk) in tcp_v4_rcv()?


> My dev machine is able to process ~280.000 SYN (and synack) per second
> (tg3, mono queue), and sha_transform() takes ~10 % of the time according
> to perf.

With my parallel SYN cookie/brownies patches, I could easily process 750
Kpps (limited by the generator, think the owners of the big machine did
a test where they reached 1400 Kpps).

I also had ~10% CPU usage from sha_transform() but across all cores...


> With David's patch using jhash instead of SHA, I reach ~315.000 SYN per
> second.

IMHO a faster hash is not the answer... parallel processing of SYN
packets is a better answer.  But I do think, adding this faster hash as
a sysctl switch might be a good idea, for people with smaller embedded
hardware.  Using it as default, might be "dangerous" and open an attack
vector on SYN cookies in Linux.


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [net-next.git 0/4] EEE for PAL and stmmac (V6)
From: Giuseppe CAVALLARO @ 2012-06-27  6:07 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, eric.dumazet, rayagond, davem, yuvalmin
In-Reply-To: <1340258599-3083-1-git-send-email-peppe.cavallaro@st.com>

On 6/21/2012 8:03 AM, Giuseppe CAVALLARO wrote:
> These patches add the EEE support in the stmmac device driver
> restoring an old work I had done some months ago and not
> completed in time.
> 
> I've tested all on ST STB with the IC+ 101G PHY device that has
> this feature.
> 
> The initial EEE support for the stmmac has been written by Rayagond
> but I have reworked all his code adding new parts and especially
> performing tests on a real hardware. Thx Rayagond!
> 
> In these patches, we can see that the stmmac supports the EEE
> only if the DMA HW capability register says that this
> feature is actually available. In that case, the driver can enter
> in the Tx LPI mode by using a timer as recommended by Synopsys.
> Note that EEE is supported in new chip generations; in particular
> I used the 3.61a.
> 
> At any rate, further information about how the driver treats the EEE
> can be found in the stmmac.txt file (there is a patch for that).
> 
> Another patch is for Physical Abstraction Layer now able to
> manage the MMD registers (clause 45); it also provides the ethtool
> support to manage supported/advertisement/lp adv features.
> 
> v3: fixed the "stmmac: do not use strict_strtoul but kstrtoint"
>     to use the kstrtoint.
> v4: fixed the function to enable the EEE and add a check that verifies
>     if the link auto-negotiated matches with the bits in the adv and lp
>     registers.
> v5: reviewed the way to get the negotiated settings
> v6: fixed a broken return value in the phy_eee_init function
> Giuseppe Cavallaro (4):
>   stmmac: do not use strict_strtoul but kstrtoint
>   stmmac: update the driver Documentation and add EEE
>   stmmac: add the Energy Efficient Ethernet support
>   phy: add the EEE support and the way to access to the MMD registers.

Hello Ben

sorry if I stress you but I wonder if the v8 version of the latest patch
([net-next.git 4/4 (v8)] phy: add the EEE support and the way to access
to the MMD registers.) is ok for you or if I have to fix something else.

Many thanks for your effort.

Let me know.

Best Regards
Peppe

>  Documentation/networking/stmmac.txt                |   36 ++-
>  drivers/net/ethernet/stmicro/stmmac/common.h       |   31 ++-
>  drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |   20 ++
>  .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   |  101 +++++++-
>  .../net/ethernet/stmicro/stmmac/dwmac100_core.c    |    4 +-
>  drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h    |    1 +
>  drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    8 +
>  .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |   57 ++++
>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  193 ++++++++++++--
>  .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |    2 +
>  drivers/net/phy/phy.c                              |  281 ++++++++++++++++++++
>  include/linux/mdio.h                               |   21 ++-
>  include/linux/mii.h                                |    9 +
>  include/linux/phy.h                                |    5 +
>  14 files changed, 724 insertions(+), 45 deletions(-)
> 

^ permalink raw reply

* [PATCH net-next] be2net: Fix to trim skb for padded vlan packets to workaround an ASIC Bug
From: Somnath Kotur @ 2012-06-27  6:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, Somnath Kotur

Also refactored existing code a bit to provide placeholders for another ASIC Bug
workaround that will be checked-in soon after this.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be.h      |    5 ++
 drivers/net/ethernet/emulex/benet/be_main.c |   56 ++++++++++++++++++++-------
 2 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index 7b5cc2b..7a71fb6 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -573,6 +573,11 @@ static inline u8 is_udp_pkt(struct sk_buff *skb)
 	return val;
 }
 
+static inline bool is_ipv4_pkt(struct sk_buff *skb)
+{
+	return skb->protocol == htons(ETH_P_IP) && ip_hdr(skb)->version == 4;
+}
+
 static inline void be_vf_eth_addr_generate(struct be_adapter *adapter, u8 *mac)
 {
 	u32 addr;
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index a28896d..22e2c04 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -577,6 +577,11 @@ static inline u16 be_get_tx_vlan_tag(struct be_adapter *adapter,
 	return vlan_tag;
 }
 
+static int be_vlan_tag_chk(struct be_adapter *adapter, struct sk_buff *skb)
+{
+	return vlan_tx_tag_present(skb) || adapter->pvid;
+}
+
 static void wrb_fill_hdr(struct be_adapter *adapter, struct be_eth_hdr_wrb *hdr,
 		struct sk_buff *skb, u32 wrb_cnt, u32 len)
 {
@@ -704,33 +709,56 @@ dma_err:
 	return 0;
 }
 
+static struct sk_buff *be_insert_vlan_in_pkt(struct be_adapter *adapter,
+					     struct sk_buff *skb)
+{
+	u16 vlan_tag = 0;
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (unlikely(!skb))
+		return skb;
+
+	if (vlan_tx_tag_present(skb)) {
+		vlan_tag = be_get_tx_vlan_tag(adapter, skb);
+		__vlan_put_tag(skb, vlan_tag);
+		skb->vlan_tci = 0;
+	}
+
+	return skb;
+}
+
 static netdev_tx_t be_xmit(struct sk_buff *skb,
 			struct net_device *netdev)
 {
 	struct be_adapter *adapter = netdev_priv(netdev);
 	struct be_tx_obj *txo = &adapter->tx_obj[skb_get_queue_mapping(skb)];
 	struct be_queue_info *txq = &txo->q;
+	struct iphdr *ip = NULL;
 	u32 wrb_cnt = 0, copied = 0;
-	u32 start = txq->head;
+	u32 start = txq->head, eth_hdr_len;
 	bool dummy_wrb, stopped = false;
 
-	/* For vlan tagged pkts, BE
-	 * 1) calculates checksum even when CSO is not requested
-	 * 2) calculates checksum wrongly for padded pkt less than
-	 * 60 bytes long.
-	 * As a workaround disable TX vlan offloading in such cases.
+	eth_hdr_len = ntohs(skb->protocol) == ETH_P_8021Q ?
+		VLAN_ETH_HLEN : ETH_HLEN;
+
+	/* HW has a bug whicn considers padding bytes as legal
+	 * and modifies the IPv4 hdr's 'tot_len' field
 	 */
-	if (vlan_tx_tag_present(skb) &&
-	    (skb->ip_summed != CHECKSUM_PARTIAL || skb->len <= 60)) {
-		skb = skb_share_check(skb, GFP_ATOMIC);
-		if (unlikely(!skb))
-			goto tx_drop;
+	if (skb->len <= 60 && be_vlan_tag_chk(adapter, skb) &&
+			is_ipv4_pkt(skb)) {
+		ip = (struct iphdr *)ip_hdr(skb);
+		pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len));
+	}
 
-		skb = __vlan_put_tag(skb, be_get_tx_vlan_tag(adapter, skb));
+	/* HW has a bug wherein it will calculate CSUM for VLAN
+	 * pkts even though it is disabled.
+	 * Manually insert VLAN in pkt.
+	 */
+	if (skb->ip_summed != CHECKSUM_PARTIAL &&
+			be_vlan_tag_chk(adapter, skb)) {
+		skb = be_insert_vlan_in_pkt(adapter, skb);
 		if (unlikely(!skb))
 			goto tx_drop;
-
-		skb->vlan_tci = 0;
 	}
 
 	wrb_cnt = wrb_cnt_for_skb(adapter, skb, &dummy_wrb);
-- 
1.5.6.1

^ permalink raw reply related

* [PATCH net-next 3/4 v2] net: sh_eth: fix up the buffer pointers
From: Shimoda, Yoshihiro @ 2012-06-27  6:00 UTC (permalink / raw)
  To: netdev; +Cc: SH-Linux

After freeing the buffer, the driver should change the value of
the pointer to NULL.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 about v2:
  - rebased onto the latest net-next.git

 drivers/net/ethernet/renesas/sh_eth.c |   31 ++++++++++++++++++++++++-------
 1 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 43e76d2..2dd2ff5 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -788,6 +788,7 @@ static void sh_eth_ring_free(struct net_device *ndev)
 		}
 	}
 	kfree(mdp->rx_skbuff);
+	mdp->rx_skbuff = NULL;

 	/* Free Tx skb ringbuffer */
 	if (mdp->tx_skbuff) {
@@ -797,6 +798,7 @@ static void sh_eth_ring_free(struct net_device *ndev)
 		}
 	}
 	kfree(mdp->tx_skbuff);
+	mdp->tx_skbuff = NULL;
 }

 /* format skb and descriptor buffer */
@@ -933,10 +935,31 @@ desc_ring_free:
 skb_ring_free:
 	/* Free Rx and Tx skb ring buffer */
 	sh_eth_ring_free(ndev);
+	mdp->tx_ring = NULL;
+	mdp->rx_ring = NULL;

 	return ret;
 }

+static void sh_eth_free_dma_buffer(struct sh_eth_private *mdp)
+{
+	int ringsize;
+
+	if (mdp->rx_ring) {
+		ringsize = sizeof(struct sh_eth_rxdesc) * RX_RING_SIZE;
+		dma_free_coherent(NULL, ringsize, mdp->rx_ring,
+				  mdp->rx_desc_dma);
+		mdp->rx_ring = NULL;
+	}
+
+	if (mdp->tx_ring) {
+		ringsize = sizeof(struct sh_eth_txdesc) * TX_RING_SIZE;
+		dma_free_coherent(NULL, ringsize, mdp->tx_ring,
+				  mdp->tx_desc_dma);
+		mdp->tx_ring = NULL;
+	}
+}
+
 static int sh_eth_dev_init(struct net_device *ndev)
 {
 	int ret = 0;
@@ -1677,7 +1700,6 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 static int sh_eth_close(struct net_device *ndev)
 {
 	struct sh_eth_private *mdp = netdev_priv(ndev);
-	int ringsize;

 	netif_stop_queue(ndev);

@@ -1700,12 +1722,7 @@ static int sh_eth_close(struct net_device *ndev)
 	sh_eth_ring_free(ndev);

 	/* free DMA buffer */
-	ringsize = sizeof(struct sh_eth_rxdesc) * RX_RING_SIZE;
-	dma_free_coherent(NULL, ringsize, mdp->rx_ring, mdp->rx_desc_dma);
-
-	/* free DMA buffer */
-	ringsize = sizeof(struct sh_eth_txdesc) * TX_RING_SIZE;
-	dma_free_coherent(NULL, ringsize, mdp->tx_ring, mdp->tx_desc_dma);
+	sh_eth_free_dma_buffer(mdp);

 	pm_runtime_put_sync(&mdp->pdev->dev);

-- 
1.7.1
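Note on the pattern this patch applies: a minimal userspace sketch, assuming hypothetical names (struct ring_state, ring_alloc(), ring_free()) that do not exist in sh_eth.c, with free() standing in for kfree() and dma_free_coherent(). Clearing each pointer after release means a second teardown call finds nothing to free instead of freeing the same memory twice.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the ring-related part of the private data. */
struct ring_state {
	void **skbuff;	/* array of per-slot buffers */
	size_t nslots;	/* number of entries in skbuff */
};

/* Allocate a zeroed slot array; returns 0 on success, -1 on failure. */
static int ring_alloc(struct ring_state *rs, size_t nslots)
{
	rs->skbuff = calloc(nslots, sizeof(*rs->skbuff));
	if (!rs->skbuff)
		return -1;
	rs->nslots = nslots;
	return 0;
}

/* Release every slot and the array, then clear the pointers. */
static void ring_free(struct ring_state *rs)
{
	size_t i;

	if (rs->skbuff) {
		for (i = 0; i < rs->nslots; i++) {
			free(rs->skbuff[i]);
			rs->skbuff[i] = NULL;
		}
	}
	free(rs->skbuff);	/* free(NULL) is a no-op, as is kfree(NULL) */
	rs->skbuff = NULL;
	rs->nslots = 0;
}
```

With the pointers cleared, calling ring_free() from both an error path and a later close path is harmless, which is the property the patch relies on.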

* [PATCH net-next 4/4 v2] net: sh_eth: add support for set_ringparam/get_ringparam
From: Shimoda, Yoshihiro @ 2012-06-27  6:00 UTC (permalink / raw)
  To: netdev; +Cc: SH-Linux

This patch adds support for ethtool's set_ringparam() and get_ringparam()
operations.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 about v2:
  - rebased onto the latest net-next.git

 drivers/net/ethernet/renesas/sh_eth.c |  139 +++++++++++++++++++++++++--------
 drivers/net/ethernet/renesas/sh_eth.h |    6 ++
 2 files changed, 112 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 2dd2ff5..af0b867 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -782,7 +782,7 @@ static void sh_eth_ring_free(struct net_device *ndev)

 	/* Free Rx skb ringbuffer */
 	if (mdp->rx_skbuff) {
-		for (i = 0; i < RX_RING_SIZE; i++) {
+		for (i = 0; i < mdp->num_rx_ring; i++) {
 			if (mdp->rx_skbuff[i])
 				dev_kfree_skb(mdp->rx_skbuff[i]);
 		}
@@ -792,7 +792,7 @@ static void sh_eth_ring_free(struct net_device *ndev)

 	/* Free Tx skb ringbuffer */
 	if (mdp->tx_skbuff) {
-		for (i = 0; i < TX_RING_SIZE; i++) {
+		for (i = 0; i < mdp->num_tx_ring; i++) {
 			if (mdp->tx_skbuff[i])
 				dev_kfree_skb(mdp->tx_skbuff[i]);
 		}
@@ -809,8 +809,8 @@ static void sh_eth_ring_format(struct net_device *ndev)
 	struct sk_buff *skb;
 	struct sh_eth_rxdesc *rxdesc = NULL;
 	struct sh_eth_txdesc *txdesc = NULL;
-	int rx_ringsize = sizeof(*rxdesc) * RX_RING_SIZE;
-	int tx_ringsize = sizeof(*txdesc) * TX_RING_SIZE;
+	int rx_ringsize = sizeof(*rxdesc) * mdp->num_rx_ring;
+	int tx_ringsize = sizeof(*txdesc) * mdp->num_tx_ring;

 	mdp->cur_rx = mdp->cur_tx = 0;
 	mdp->dirty_rx = mdp->dirty_tx = 0;
@@ -818,7 +818,7 @@ static void sh_eth_ring_format(struct net_device *ndev)
 	memset(mdp->rx_ring, 0, rx_ringsize);

 	/* build Rx ring buffer */
-	for (i = 0; i < RX_RING_SIZE; i++) {
+	for (i = 0; i < mdp->num_rx_ring; i++) {
 		/* skb */
 		mdp->rx_skbuff[i] = NULL;
 		skb = netdev_alloc_skb(ndev, mdp->rx_buf_sz);
@@ -844,7 +844,7 @@ static void sh_eth_ring_format(struct net_device *ndev)
 		}
 	}

-	mdp->dirty_rx = (u32) (i - RX_RING_SIZE);
+	mdp->dirty_rx = (u32) (i - mdp->num_rx_ring);

 	/* Mark the last entry as wrapping the ring. */
 	rxdesc->status |= cpu_to_edmac(mdp, RD_RDEL);
@@ -852,7 +852,7 @@ static void sh_eth_ring_format(struct net_device *ndev)
 	memset(mdp->tx_ring, 0, tx_ringsize);

 	/* build Tx ring buffer */
-	for (i = 0; i < TX_RING_SIZE; i++) {
+	for (i = 0; i < mdp->num_tx_ring; i++) {
 		mdp->tx_skbuff[i] = NULL;
 		txdesc = &mdp->tx_ring[i];
 		txdesc->status = cpu_to_edmac(mdp, TD_TFP);
@@ -886,7 +886,7 @@ static int sh_eth_ring_init(struct net_device *ndev)
 		mdp->rx_buf_sz += NET_IP_ALIGN;

 	/* Allocate RX and TX skb rings */
-	mdp->rx_skbuff = kmalloc(sizeof(*mdp->rx_skbuff) * RX_RING_SIZE,
+	mdp->rx_skbuff = kmalloc(sizeof(*mdp->rx_skbuff) * mdp->num_rx_ring,
 				GFP_KERNEL);
 	if (!mdp->rx_skbuff) {
 		dev_err(&ndev->dev, "Cannot allocate Rx skb\n");
@@ -894,7 +894,7 @@ static int sh_eth_ring_init(struct net_device *ndev)
 		return ret;
 	}

-	mdp->tx_skbuff = kmalloc(sizeof(*mdp->tx_skbuff) * TX_RING_SIZE,
+	mdp->tx_skbuff = kmalloc(sizeof(*mdp->tx_skbuff) * mdp->num_tx_ring,
 				GFP_KERNEL);
 	if (!mdp->tx_skbuff) {
 		dev_err(&ndev->dev, "Cannot allocate Tx skb\n");
@@ -903,7 +903,7 @@ static int sh_eth_ring_init(struct net_device *ndev)
 	}

 	/* Allocate all Rx descriptors. */
-	rx_ringsize = sizeof(struct sh_eth_rxdesc) * RX_RING_SIZE;
+	rx_ringsize = sizeof(struct sh_eth_rxdesc) * mdp->num_rx_ring;
 	mdp->rx_ring = dma_alloc_coherent(NULL, rx_ringsize, &mdp->rx_desc_dma,
 			GFP_KERNEL);

@@ -917,7 +917,7 @@ static int sh_eth_ring_init(struct net_device *ndev)
 	mdp->dirty_rx = 0;

 	/* Allocate all Tx descriptors. */
-	tx_ringsize = sizeof(struct sh_eth_txdesc) * TX_RING_SIZE;
+	tx_ringsize = sizeof(struct sh_eth_txdesc) * mdp->num_tx_ring;
 	mdp->tx_ring = dma_alloc_coherent(NULL, tx_ringsize, &mdp->tx_desc_dma,
 			GFP_KERNEL);
 	if (!mdp->tx_ring) {
@@ -946,21 +946,21 @@ static void sh_eth_free_dma_buffer(struct sh_eth_private *mdp)
 	int ringsize;

 	if (mdp->rx_ring) {
-		ringsize = sizeof(struct sh_eth_rxdesc) * RX_RING_SIZE;
+		ringsize = sizeof(struct sh_eth_rxdesc) * mdp->num_rx_ring;
 		dma_free_coherent(NULL, ringsize, mdp->rx_ring,
 				  mdp->rx_desc_dma);
 		mdp->rx_ring = NULL;
 	}

 	if (mdp->tx_ring) {
-		ringsize = sizeof(struct sh_eth_txdesc) * TX_RING_SIZE;
+		ringsize = sizeof(struct sh_eth_txdesc) * mdp->num_tx_ring;
 		dma_free_coherent(NULL, ringsize, mdp->tx_ring,
 				  mdp->tx_desc_dma);
 		mdp->tx_ring = NULL;
 	}
 }

-static int sh_eth_dev_init(struct net_device *ndev)
+static int sh_eth_dev_init(struct net_device *ndev, bool start)
 {
 	int ret = 0;
 	struct sh_eth_private *mdp = netdev_priv(ndev);
@@ -1008,7 +1008,8 @@ static int sh_eth_dev_init(struct net_device *ndev)
 		     RFLR);

 	sh_eth_write(ndev, sh_eth_read(ndev, EESR), EESR);
-	sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
+	if (start)
+		sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);

 	/* PAUSE Prohibition */
 	val = (sh_eth_read(ndev, ECMR) & ECMR_DM) |
@@ -1023,7 +1024,8 @@ static int sh_eth_dev_init(struct net_device *ndev)
 	sh_eth_write(ndev, mdp->cd->ecsr_value, ECSR);

 	/* E-MAC Interrupt Enable register */
-	sh_eth_write(ndev, mdp->cd->ecsipr_value, ECSIPR);
+	if (start)
+		sh_eth_write(ndev, mdp->cd->ecsipr_value, ECSIPR);

 	/* Set MAC address */
 	update_mac_address(ndev);
@@ -1036,10 +1038,12 @@ static int sh_eth_dev_init(struct net_device *ndev)
 	if (mdp->cd->tpauser)
 		sh_eth_write(ndev, TPAUSER_UNLIMITED, TPAUSER);

-	/* Setting the Rx mode will start the Rx process. */
-	sh_eth_write(ndev, EDRRR_R, EDRRR);
+	if (start) {
+		/* Setting the Rx mode will start the Rx process. */
+		sh_eth_write(ndev, EDRRR_R, EDRRR);

-	netif_start_queue(ndev);
+		netif_start_queue(ndev);
+	}

 out:
 	return ret;
@@ -1054,7 +1058,7 @@ static int sh_eth_txfree(struct net_device *ndev)
 	int entry = 0;

 	for (; mdp->cur_tx - mdp->dirty_tx > 0; mdp->dirty_tx++) {
-		entry = mdp->dirty_tx % TX_RING_SIZE;
+		entry = mdp->dirty_tx % mdp->num_tx_ring;
 		txdesc = &mdp->tx_ring[entry];
 		if (txdesc->status & cpu_to_edmac(mdp, TD_TACT))
 			break;
@@ -1067,7 +1071,7 @@ static int sh_eth_txfree(struct net_device *ndev)
 			freeNum++;
 		}
 		txdesc->status = cpu_to_edmac(mdp, TD_TFP);
-		if (entry >= TX_RING_SIZE - 1)
+		if (entry >= mdp->num_tx_ring - 1)
 			txdesc->status |= cpu_to_edmac(mdp, TD_TDLE);

 		ndev->stats.tx_packets++;
@@ -1082,8 +1086,8 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status)
 	struct sh_eth_private *mdp = netdev_priv(ndev);
 	struct sh_eth_rxdesc *rxdesc;

-	int entry = mdp->cur_rx % RX_RING_SIZE;
-	int boguscnt = (mdp->dirty_rx + RX_RING_SIZE) - mdp->cur_rx;
+	int entry = mdp->cur_rx % mdp->num_rx_ring;
+	int boguscnt = (mdp->dirty_rx + mdp->num_rx_ring) - mdp->cur_rx;
 	struct sk_buff *skb;
 	u16 pkt_len = 0;
 	u32 desc_status;
@@ -1134,13 +1138,13 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status)
 			ndev->stats.rx_bytes += pkt_len;
 		}
 		rxdesc->status |= cpu_to_edmac(mdp, RD_RACT);
-		entry = (++mdp->cur_rx) % RX_RING_SIZE;
+		entry = (++mdp->cur_rx) % mdp->num_rx_ring;
 		rxdesc = &mdp->rx_ring[entry];
 	}

 	/* Refill the Rx ring buffers. */
 	for (; mdp->cur_rx - mdp->dirty_rx > 0; mdp->dirty_rx++) {
-		entry = mdp->dirty_rx % RX_RING_SIZE;
+		entry = mdp->dirty_rx % mdp->num_rx_ring;
 		rxdesc = &mdp->rx_ring[entry];
 		/* The size of the buffer is 16 byte boundary. */
 		rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16);
@@ -1157,7 +1161,7 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status)
 			skb_checksum_none_assert(skb);
 			rxdesc->addr = virt_to_phys(PTR_ALIGN(skb->data, 4));
 		}
-		if (entry >= RX_RING_SIZE - 1)
+		if (entry >= mdp->num_rx_ring - 1)
 			rxdesc->status |=
 				cpu_to_edmac(mdp, RD_RACT | RD_RFP | RD_RDEL);
 		else
@@ -1557,6 +1561,71 @@ static void sh_eth_get_strings(struct net_device *ndev, u32 stringset, u8 *data)
 	}
 }

+static void sh_eth_get_ringparam(struct net_device *ndev,
+				 struct ethtool_ringparam *ring)
+{
+	struct sh_eth_private *mdp = netdev_priv(ndev);
+
+	ring->rx_max_pending = RX_RING_MAX;
+	ring->tx_max_pending = TX_RING_MAX;
+	ring->rx_pending = mdp->num_rx_ring;
+	ring->tx_pending = mdp->num_tx_ring;
+}
+
+static int sh_eth_set_ringparam(struct net_device *ndev,
+				struct ethtool_ringparam *ring)
+{
+	struct sh_eth_private *mdp = netdev_priv(ndev);
+	int ret;
+
+	if (ring->tx_pending > TX_RING_MAX ||
+	    ring->rx_pending > RX_RING_MAX ||
+	    ring->tx_pending < TX_RING_MIN ||
+	    ring->rx_pending < RX_RING_MIN)
+		return -EINVAL;
+	if (ring->rx_mini_pending || ring->rx_jumbo_pending)
+		return -EINVAL;
+
+	if (netif_running(ndev)) {
+		netif_tx_disable(ndev);
+		/* Disable interrupts by clearing the interrupt mask. */
+		sh_eth_write(ndev, 0x0000, EESIPR);
+		/* Stop the chip's Tx and Rx processes. */
+		sh_eth_write(ndev, 0, EDTRR);
+		sh_eth_write(ndev, 0, EDRRR);
+		synchronize_irq(ndev->irq);
+	}
+
+	/* Free all the skbuffs in the Rx queue. */
+	sh_eth_ring_free(ndev);
+	/* Free DMA buffer */
+	sh_eth_free_dma_buffer(mdp);
+
+	/* Set new parameters */
+	mdp->num_rx_ring = ring->rx_pending;
+	mdp->num_tx_ring = ring->tx_pending;
+
+	ret = sh_eth_ring_init(ndev);
+	if (ret < 0) {
+		dev_err(&ndev->dev, "%s: sh_eth_ring_init failed.\n", __func__);
+		return ret;
+	}
+	ret = sh_eth_dev_init(ndev, false);
+	if (ret < 0) {
+		dev_err(&ndev->dev, "%s: sh_eth_dev_init failed.\n", __func__);
+		return ret;
+	}
+
+	if (netif_running(ndev)) {
+		sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
+		/* Setting the Rx mode will start the Rx process. */
+		sh_eth_write(ndev, EDRRR_R, EDRRR);
+		netif_wake_queue(ndev);
+	}
+
+	return 0;
+}
+
 static const struct ethtool_ops sh_eth_ethtool_ops = {
 	.get_settings	= sh_eth_get_settings,
 	.set_settings	= sh_eth_set_settings,
@@ -1567,6 +1636,8 @@ static const struct ethtool_ops sh_eth_ethtool_ops = {
 	.get_strings	= sh_eth_get_strings,
 	.get_ethtool_stats  = sh_eth_get_ethtool_stats,
 	.get_sset_count     = sh_eth_get_sset_count,
+	.get_ringparam	= sh_eth_get_ringparam,
+	.set_ringparam	= sh_eth_set_ringparam,
 };

 /* network device open function */
@@ -1597,7 +1668,7 @@ static int sh_eth_open(struct net_device *ndev)
 		goto out_free_irq;

 	/* device init */
-	ret = sh_eth_dev_init(ndev);
+	ret = sh_eth_dev_init(ndev, true);
 	if (ret)
 		goto out_free_irq;

@@ -1631,7 +1702,7 @@ static void sh_eth_tx_timeout(struct net_device *ndev)
 	ndev->stats.tx_errors++;

 	/* Free all the skbuffs in the Rx queue. */
-	for (i = 0; i < RX_RING_SIZE; i++) {
+	for (i = 0; i < mdp->num_rx_ring; i++) {
 		rxdesc = &mdp->rx_ring[i];
 		rxdesc->status = 0;
 		rxdesc->addr = 0xBADF00D0;
@@ -1639,14 +1710,14 @@ static void sh_eth_tx_timeout(struct net_device *ndev)
 			dev_kfree_skb(mdp->rx_skbuff[i]);
 		mdp->rx_skbuff[i] = NULL;
 	}
-	for (i = 0; i < TX_RING_SIZE; i++) {
+	for (i = 0; i < mdp->num_tx_ring; i++) {
 		if (mdp->tx_skbuff[i])
 			dev_kfree_skb(mdp->tx_skbuff[i]);
 		mdp->tx_skbuff[i] = NULL;
 	}

 	/* device init */
-	sh_eth_dev_init(ndev);
+	sh_eth_dev_init(ndev, true);
 }

 /* Packet transmit function */
@@ -1658,7 +1729,7 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	unsigned long flags;

 	spin_lock_irqsave(&mdp->lock, flags);
-	if ((mdp->cur_tx - mdp->dirty_tx) >= (TX_RING_SIZE - 4)) {
+	if ((mdp->cur_tx - mdp->dirty_tx) >= (mdp->num_tx_ring - 4)) {
 		if (!sh_eth_txfree(ndev)) {
 			if (netif_msg_tx_queued(mdp))
 				dev_warn(&ndev->dev, "TxFD exhausted.\n");
@@ -1669,7 +1740,7 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	}
 	spin_unlock_irqrestore(&mdp->lock, flags);

-	entry = mdp->cur_tx % TX_RING_SIZE;
+	entry = mdp->cur_tx % mdp->num_tx_ring;
 	mdp->tx_skbuff[entry] = skb;
 	txdesc = &mdp->tx_ring[entry];
 	/* soft swap. */
@@ -1683,7 +1754,7 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	else
 		txdesc->buffer_length = skb->len;

-	if (entry >= TX_RING_SIZE - 1)
+	if (entry >= mdp->num_tx_ring - 1)
 		txdesc->status |= cpu_to_edmac(mdp, TD_TACT | TD_TDLE);
 	else
 		txdesc->status |= cpu_to_edmac(mdp, TD_TACT);
@@ -2313,6 +2384,8 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
 	ether_setup(ndev);

 	mdp = netdev_priv(ndev);
+	mdp->num_tx_ring = TX_RING_SIZE;
+	mdp->num_rx_ring = RX_RING_SIZE;
 	mdp->addr = ioremap(res->start, resource_size(res));
 	if (mdp->addr == NULL) {
 		ret = -ENOMEM;
diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index 37a0702..bae84fd 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -27,6 +27,10 @@
 #define TX_TIMEOUT	(5*HZ)
 #define TX_RING_SIZE	64	/* Tx ring size */
 #define RX_RING_SIZE	64	/* Rx ring size */
+#define TX_RING_MIN	64
+#define RX_RING_MIN	64
+#define TX_RING_MAX	1024
+#define RX_RING_MAX	1024
 #define ETHERSMALL		60
 #define PKT_BUF_SZ		1538
 #define SH_ETH_TSU_TIMEOUT_MS	500
@@ -701,6 +705,8 @@ struct sh_eth_private {
 	const u16 *reg_offset;
 	void __iomem *addr;
 	void __iomem *tsu_addr;
+	u32 num_rx_ring;
+	u32 num_tx_ring;
 	dma_addr_t rx_desc_dma;
 	dma_addr_t tx_desc_dma;
 	struct sh_eth_rxdesc *rx_ring;
-- 
1.7.1
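Note on the ring arithmetic this patch makes size-dependent: a minimal userspace sketch, assuming hypothetical helper names (ring_size_check(), ring_entry(), ring_is_last()) that are not in the driver. RING_MIN and RING_MAX mirror the TX_RING_MIN/RX_RING_MIN and TX_RING_MAX/RX_RING_MAX values added to sh_eth.h.

```c
#include <errno.h>
#include <stdint.h>

#define RING_MIN 64u	/* mirrors TX_RING_MIN / RX_RING_MIN */
#define RING_MAX 1024u	/* mirrors TX_RING_MAX / RX_RING_MAX */

/* Reject requested sizes outside [RING_MIN, RING_MAX]. */
static int ring_size_check(uint32_t pending)
{
	if (pending < RING_MIN || pending > RING_MAX)
		return -EINVAL;
	return 0;
}

/* Map a free-running counter (like cur_tx or dirty_rx) to a slot index. */
static uint32_t ring_entry(uint32_t counter, uint32_t num_ring)
{
	return counter % num_ring;
}

/* Nonzero when the slot is the last one, which carries the wrap flag. */
static int ring_is_last(uint32_t entry, uint32_t num_ring)
{
	return entry >= num_ring - 1;
}
```

One point that may be worth checking in review: because the counters are free-running u32 values, the modulo mapping stays continuous across counter overflow only when the ring size divides 2^32, that is, when it is a power of two. The range check in the patch accepts any value from 64 to 1024.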

* [PATCH net-next 2/4 v2] net: sh_eth: remove unnecessary members/definitions
From: Shimoda, Yoshihiro @ 2012-06-27  5:59 UTC (permalink / raw)
  To: netdev; +Cc: SH-Linux

This patch removes unnecessary members from sh_eth_private.
It also removes unnecessary definitions from sh_eth.h.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 about v2:
  - rebased onto the latest net-next.git

 drivers/net/ethernet/renesas/sh_eth.c |    7 +---
 drivers/net/ethernet/renesas/sh_eth.h |   69 ---------------------------------
 2 files changed, 1 insertions(+), 75 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index cf0bc31..43e76d2 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -941,7 +941,6 @@ static int sh_eth_dev_init(struct net_device *ndev)
 {
 	int ret = 0;
 	struct sh_eth_private *mdp = netdev_priv(ndev);
-	u_int32_t rx_int_var, tx_int_var;
 	u32 val;

 	/* Soft Reset */
@@ -971,9 +970,7 @@ static int sh_eth_dev_init(struct net_device *ndev)
 	/* Frame recv control */
 	sh_eth_write(ndev, mdp->cd->rmcr_value, RMCR);

-	rx_int_var = mdp->rx_int_var = DESC_I_RINT8 | DESC_I_RINT5;
-	tx_int_var = mdp->tx_int_var = DESC_I_TINT2;
-	sh_eth_write(ndev, rx_int_var | tx_int_var, TRSCER);
+	sh_eth_write(ndev, DESC_I_RINT8 | DESC_I_RINT5 | DESC_I_TINT2, TRSCER);

 	if (mdp->cd->bculr)
 		sh_eth_write(ndev, 0x800, BCULR);	/* Burst sycle set */
@@ -2336,8 +2333,6 @@ static int sh_eth_drv_probe(struct platform_device *pdev)

 	/* debug message level */
 	mdp->msg_enable = SH_ETH_DEF_MSG_ENABLE;
-	mdp->post_rx = POST_RX >> (devno << 1);
-	mdp->post_fw = POST_FW >> (devno << 1);

 	/* read and set MAC address */
 	read_mac_address(ndev, pd->mac_addr);
diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index 5af3f2a..37a0702 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -585,71 +585,6 @@ enum RPADIR_BIT {
 /* FDR */
 #define DEFAULT_FDR_INIT	0x00000707

-enum phy_offsets {
-	PHY_CTRL = 0, PHY_STAT = 1, PHY_IDT1 = 2, PHY_IDT2 = 3,
-	PHY_ANA = 4, PHY_ANL = 5, PHY_ANE = 6,
-	PHY_16 = 16,
-};
-
-/* PHY_CTRL */
-enum PHY_CTRL_BIT {
-	PHY_C_RESET = 0x8000, PHY_C_LOOPBK = 0x4000, PHY_C_SPEEDSL = 0x2000,
-	PHY_C_ANEGEN = 0x1000, PHY_C_PWRDN = 0x0800, PHY_C_ISO = 0x0400,
-	PHY_C_RANEG = 0x0200, PHY_C_DUPLEX = 0x0100, PHY_C_COLT = 0x0080,
-};
-#define DM9161_PHY_C_ANEGEN 0	/* auto nego special */
-
-/* PHY_STAT */
-enum PHY_STAT_BIT {
-	PHY_S_100T4 = 0x8000, PHY_S_100X_F = 0x4000, PHY_S_100X_H = 0x2000,
-	PHY_S_10T_F = 0x1000, PHY_S_10T_H = 0x0800, PHY_S_ANEGC = 0x0020,
-	PHY_S_RFAULT = 0x0010, PHY_S_ANEGA = 0x0008, PHY_S_LINK = 0x0004,
-	PHY_S_JAB = 0x0002, PHY_S_EXTD = 0x0001,
-};
-
-/* PHY_ANA */
-enum PHY_ANA_BIT {
-	PHY_A_NP = 0x8000, PHY_A_ACK = 0x4000, PHY_A_RF = 0x2000,
-	PHY_A_FCS = 0x0400, PHY_A_T4 = 0x0200, PHY_A_FDX = 0x0100,
-	PHY_A_HDX = 0x0080, PHY_A_10FDX = 0x0040, PHY_A_10HDX = 0x0020,
-	PHY_A_SEL = 0x001e,
-};
-/* PHY_ANL */
-enum PHY_ANL_BIT {
-	PHY_L_NP = 0x8000, PHY_L_ACK = 0x4000, PHY_L_RF = 0x2000,
-	PHY_L_FCS = 0x0400, PHY_L_T4 = 0x0200, PHY_L_FDX = 0x0100,
-	PHY_L_HDX = 0x0080, PHY_L_10FDX = 0x0040, PHY_L_10HDX = 0x0020,
-	PHY_L_SEL = 0x001f,
-};
-
-/* PHY_ANE */
-enum PHY_ANE_BIT {
-	PHY_E_PDF = 0x0010, PHY_E_LPNPA = 0x0008, PHY_E_NPA = 0x0004,
-	PHY_E_PRX = 0x0002, PHY_E_LPANEGA = 0x0001,
-};
-
-/* DM9161 */
-enum PHY_16_BIT {
-	PHY_16_BP4B45 = 0x8000, PHY_16_BPSCR = 0x4000, PHY_16_BPALIGN = 0x2000,
-	PHY_16_BP_ADPOK = 0x1000, PHY_16_Repeatmode = 0x0800,
-	PHY_16_TXselect = 0x0400,
-	PHY_16_Rsvd = 0x0200, PHY_16_RMIIEnable = 0x0100,
-	PHY_16_Force100LNK = 0x0080,
-	PHY_16_APDLED_CTL = 0x0040, PHY_16_COLLED_CTL = 0x0020,
-	PHY_16_RPDCTR_EN = 0x0010,
-	PHY_16_ResetStMch = 0x0008, PHY_16_PreamSupr = 0x0004,
-	PHY_16_Sleepmode = 0x0002,
-	PHY_16_RemoteLoopOut = 0x0001,
-};
-
-#define POST_RX		0x08
-#define POST_FW		0x04
-#define POST0_RX	(POST_RX)
-#define POST0_FW	(POST_FW)
-#define POST1_RX	(POST_RX >> 2)
-#define POST1_FW	(POST_FW >> 2)
-#define POST_ALL	(POST0_RX | POST0_FW | POST1_RX | POST1_FW)
-
 /* ARSTR */
 enum ARSTR_BIT { ARSTR_ARSTR = 0x00000001, };

@@ -786,10 +721,6 @@ struct sh_eth_private {
 	int msg_enable;
 	int speed;
 	int duplex;
-	u32 rx_int_var, tx_int_var;	/* interrupt control variables */
-	char post_rx;		/* POST receive */
-	char post_fw;		/* POST forward */
-	struct net_device_stats tsu_stats;	/* TSU forward status */
 	int port;		/* for TSU */
 	int vlan_num_ids;	/* for VLAN tag filter */

-- 
1.7.1

* [PATCH net-next 1/4 v2] net: sh_eth: remove unnecessary function
From: Shimoda, Yoshihiro @ 2012-06-27  5:59 UTC (permalink / raw)
  To: netdev; +Cc: SH-Linux

sh_eth_timer() did nothing except call mod_timer() to re-arm itself.
So, this patch removes the function and the timer.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 about v2:
  - rebased onto the latest net-next.git
  - the NAPI patch is not included this time

 drivers/net/ethernet/renesas/sh_eth.c |   22 ----------------------
 drivers/net/ethernet/renesas/sh_eth.h |    1 -
 2 files changed, 0 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 326cb91..cf0bc31 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1339,14 +1339,6 @@ other_irq:
 	return ret;
 }

-static void sh_eth_timer(unsigned long data)
-{
-	struct net_device *ndev = (struct net_device *)data;
-	struct sh_eth_private *mdp = netdev_priv(ndev);
-
-	mod_timer(&mdp->timer, jiffies + (10 * HZ));
-}
-
 /* PHY state control function */
 static void sh_eth_adjust_link(struct net_device *ndev)
 {
@@ -1594,11 +1586,6 @@ static int sh_eth_open(struct net_device *ndev)
 	if (ret)
 		goto out_free_irq;

-	/* Set the timer to check for link beat. */
-	init_timer(&mdp->timer);
-	mdp->timer.expires = (jiffies + (24 * HZ)) / 10;/* 2.4 sec. */
-	setup_timer(&mdp->timer, sh_eth_timer, (unsigned long)ndev);
-
 	return ret;

 out_free_irq:
@@ -1623,9 +1610,6 @@ static void sh_eth_tx_timeout(struct net_device *ndev)
 	/* tx_errors count up */
 	ndev->stats.tx_errors++;

-	/* timer off */
-	del_timer_sync(&mdp->timer);
-
 	/* Free all the skbuffs in the Rx queue. */
 	for (i = 0; i < RX_RING_SIZE; i++) {
 		rxdesc = &mdp->rx_ring[i];
@@ -1643,10 +1627,6 @@ static void sh_eth_tx_timeout(struct net_device *ndev)

 	/* device init */
 	sh_eth_dev_init(ndev);
-
-	/* timer on */
-	mdp->timer.expires = (jiffies + (24 * HZ)) / 10;/* 2.4 sec. */
-	add_timer(&mdp->timer);
 }

 /* Packet transmit function */
@@ -1719,8 +1699,6 @@ static int sh_eth_close(struct net_device *ndev)

 	free_irq(ndev->irq, ndev);

-	del_timer_sync(&mdp->timer);
-
 	/* Free all the skbuffs in the Rx queue. */
 	sh_eth_ring_free(ndev);

diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index d6763b1..5af3f2a 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -772,7 +772,6 @@ struct sh_eth_private {
 	struct sh_eth_txdesc *tx_ring;
 	struct sk_buff **rx_skbuff;
 	struct sk_buff **tx_skbuff;
-	struct timer_list timer;
 	spinlock_t lock;
 	u32 cur_rx, dirty_rx;	/* Producer/consumer ring indices */
 	u32 cur_tx, dirty_tx;
-- 
1.7.1

