Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 1/2] Remove unnecessary locks from rtnetlink
From: David Miller @ 2008-02-06  1:53 UTC (permalink / raw)
  To: panther; +Cc: netdev
In-Reply-To: <12018820542721-git-send-email-panther@balabit.hu>

From: Laszlo Attila Toth <panther@balabit.hu>
Date: Fri,  1 Feb 2008 17:07:33 +0100

> The do_setlink() function is protected by rtnl, additional locks are unnecessary.
> and the set_operstate() function is called from protected parts. Locks removed
> from both functions.
> 
> The set_operstate() is also called from rtnl_create_link() and from no other places.
> In rtnl_create_link() none of the changes is protected by set_lock_bh() except
> inside set_operstate(), different locking scheme is not necessary
> for the operstate.
> 
> Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>

The protection using dev_base_lock() is needed.

When analyzing cases like this you need to also look at other code
paths outside of rtnetlink that access ->operstate and ->link_mode,
you obviously didn't do this.

For example, net/core/net-sysfs.c takes a read lock on dev_base_lock
in order to fetch a stable copy of both netif_running() and
dev->operstate at the same time.

Similar write locking to protect dev->operstate is made by
net/core/link_watch.c:rfc2863_policy(), for the same reason rtnetlink
has to make this locking.

You therefore cannot remove it.

This invalidates your second patch so I'm dropping that as well.

^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Glenn Griffin @ 2008-02-06  1:52 UTC (permalink / raw)
  To: Alan Cox
  Cc: Evgeniy Polyakov, Andi Kleen, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205220510.09258826@core>

I realized an earlier email I sent had an incorrect timestamp and wasn't
associated with the thread, so I thought it would be better to resend.
I apologize if this is duplicated for anyone.

Here is a reworked patch that moves the IPv6 syncookie support out of
the ipv4/syncookies.c file and into it's own ipv6/syncookies.c.  The
same CONFIG options and sysctl variables as ipv4, but this way the code
is isolated to the ipv6 module.


Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
---
 include/net/tcp.h     |    6 +
 net/ipv6/Makefile     |    1 +
 net/ipv6/syncookies.c |  273 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/tcp_ipv6.c   |   77 ++++++++++----
 4 files changed, 335 insertions(+), 22 deletions(-)
 create mode 100644 net/ipv6/syncookies.c

diff --git a/include/net/tcp.h b/include/net/tcp.h
index cb5b033..d7f620c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -436,6 +436,11 @@ extern struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 extern __u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, 
 				     __u16 *mss);
 
+/* From net/ipv6/syncookies.c */
+extern struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb);
+extern __u32 cookie_v6_init_sequence(struct sock *sk, struct sk_buff *skb,
+				     __u16 *mss);
+
 /* tcp_output.c */
 
 extern void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss,
@@ -1337,6 +1342,7 @@ extern int tcp_proc_register(struct tcp_seq_afinfo *afinfo);
 extern void tcp_proc_unregister(struct tcp_seq_afinfo *afinfo);
 
 extern struct request_sock_ops tcp_request_sock_ops;
+extern struct request_sock_ops tcp6_request_sock_ops;
 
 extern int tcp_v4_destroy_sock(struct sock *sk);
 
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 87c23a7..d1a1056 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -15,6 +15,7 @@ ipv6-$(CONFIG_XFRM) += xfrm6_policy.o xfrm6_state.o xfrm6_input.o \
 ipv6-$(CONFIG_NETFILTER) += netfilter.o
 ipv6-$(CONFIG_IPV6_MULTIPLE_TABLES) += fib6_rules.o
 ipv6-$(CONFIG_PROC_FS) += proc.o
+ipv6-$(CONFIG_SYN_COOKIES) += syncookies.o
 
 ipv6-objs += $(ipv6-y)
 
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
new file mode 100644
index 0000000..521c9da
--- /dev/null
+++ b/net/ipv6/syncookies.c
@@ -0,0 +1,273 @@
+/*
+ *  IPv6 Syncookies implementation for the Linux kernel
+ *
+ *  Authors:
+ *  Glenn Griffin	<ggriffin.kernel@gmail.com>
+ *
+ *  Based on IPv4 implementation by Andi Kleen
+ *  linux/net/ipv4/syncookies.c
+ *
+ *	This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include <linux/tcp.h>
+#include <linux/random.h>
+#include <linux/cryptohash.h>
+#include <linux/kernel.h>
+#include <net/ipv6.h>
+#include <net/tcp.h>
+
+extern int sysctl_tcp_syncookies;
+
+static __u32 syncookie_secret[2][16-10+SHA_DIGEST_WORDS];
+
+static __init int init_syncookies(void)
+{
+	get_random_bytes(syncookie_secret, sizeof(syncookie_secret));
+	return 0;
+}
+module_init(init_syncookies);
+
+#define COOKIEBITS 24	/* Upper bits store count */
+#define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
+
+/*
+ * This table has to be sorted and terminated with (__u16)-1.
+ * XXX generate a better table.
+ * Unresolved Issues: HIPPI with a 64k MSS is not well supported.
+ * 
+ * Taken directly from ipv4 implementation. 
+ * Should this list be modified for ipv6 use or is it close enough?
+ * rfc 2460 8.3 suggests mss values 20 bytes less than ipv4 counterpart
+ */
+static __u16 const msstab[] = {
+	64 - 1,
+	256 - 1,
+	512 - 1,
+	536 - 1,
+	1024 - 1,
+	1440 - 1,
+	1460 - 1,
+	4312 - 1,
+	(__u16)-1
+};
+/* The number doesn't include the -1 terminator */
+#define NUM_MSS (ARRAY_SIZE(msstab) - 1)
+
+/*
+ * This (misnamed) value is the age of syncookie which is permitted.
+ * Its ideal value should be dependent on TCP_TIMEOUT_INIT and
+ * sysctl_tcp_retries1. It's a rather complicated formula (exponential
+ * backoff) to compute at runtime so it's currently hardcoded here.
+ */
+#define COUNTER_TRIES 4
+
+static inline struct sock *get_cookie_sock(struct sock *sk, struct sk_buff *skb,
+					   struct request_sock *req,
+					   struct dst_entry *dst)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	struct sock *child;
+
+	child = icsk->icsk_af_ops->syn_recv_sock(sk, skb, req, dst);
+	if (child)
+		inet_csk_reqsk_queue_add(sk, req, child);
+	else
+		reqsk_free(req);
+
+	return child;
+}
+
+static u32 cookie_hash(struct in6_addr *saddr, struct in6_addr *daddr,
+		       __be16 sport, __be16 dport, u32 count, int c)
+{
+	__u32 tmp[16 + 5 + SHA_WORKSPACE_WORDS];
+
+	/*
+	 * we have 320 bits of information to hash, copy in the remaining
+	 * 192 bits required for sha_transform, from the syncookie_secret
+	 * and overwrite the digest with the secret
+	 */
+	memcpy(tmp + 10, syncookie_secret[c], sizeof(syncookie_secret[c]));
+	memcpy(tmp, saddr, 16);
+	memcpy(tmp + 4, daddr, 16);
+	tmp[8] = ((__force u32)sport << 16) + (__force u32)dport;
+	tmp[9] = count;
+	sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5);
+
+	return tmp[17];
+}
+
+static __u32 secure_tcp_syn_cookie(struct in6_addr *saddr, struct in6_addr *daddr,
+				   __be16 sport, __be16 dport, __u32 sseq,
+				   __u32 count, __u32 data)
+{
+	return (cookie_hash(saddr, daddr, sport, dport, 0, 0) +
+		sseq + (count << COOKIEBITS) +
+		((cookie_hash(saddr, daddr, sport, dport, count, 1) + data)
+		& COOKIEMASK));
+}
+
+static __u32 check_tcp_syn_cookie(__u32 cookie, struct in6_addr *saddr,
+				  struct in6_addr *daddr, __be16 sport,
+				  __be16 dport, __u32 sseq, __u32 count,
+				  __u32 maxdiff)
+{
+	__u32 diff;
+
+	cookie -= cookie_hash(saddr, daddr, sport, dport, 0, 0) + sseq;
+
+	diff = (count - (cookie >> COOKIEBITS)) & ((__u32) -1 >> COOKIEBITS);
+	if (diff >= maxdiff)
+		return (__u32)-1;
+
+	return (cookie -
+		cookie_hash(saddr, daddr, sport, dport, count - diff, 1))
+		& COOKIEMASK;
+}
+
+__u32 cookie_v6_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp)
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+	const struct tcphdr *th = tcp_hdr(skb);
+	int mssind;
+	const __u16 mss = *mssp;
+
+	tcp_sk(sk)->last_synq_overflow = jiffies;
+
+	for (mssind = 0; mss > msstab[mssind + 1]; mssind++)
+		;
+	*mssp = msstab[mssind] + 1;
+
+	NET_INC_STATS_BH(LINUX_MIB_SYNCOOKIESSENT);
+
+	return secure_tcp_syn_cookie(&iph->saddr, &iph->daddr, th->source,
+				     th->dest, ntohl(th->seq),
+				     jiffies / (HZ * 60), mssind);
+}
+
+static inline int cookie_check(struct sk_buff *skb, __u32 cookie)
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+	const struct tcphdr *th = tcp_hdr(skb);
+	__u32 seq = ntohl(th->seq) - 1;
+	__u32 mssind = check_tcp_syn_cookie(cookie, &iph->saddr, &iph->daddr,
+					    th->source, th->dest, seq,
+					    jiffies / (HZ * 60), COUNTER_TRIES);
+
+	return mssind < NUM_MSS ? msstab[mssind] + 1 : 0;
+}
+
+struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
+{
+	struct inet_request_sock *ireq;
+	struct inet6_request_sock *ireq6;
+	struct tcp_request_sock *treq;
+	struct ipv6_pinfo *np = inet6_sk(sk);
+	struct tcp_sock *tp = tcp_sk(sk);
+	const struct tcphdr *th = tcp_hdr(skb);
+	__u32 cookie = ntohl(th->ack_seq) - 1;
+	struct sock *ret = sk;
+	struct request_sock *req;
+	int mss;
+	struct dst_entry *dst;
+	__u8 rcv_wscale;
+
+	if (!sysctl_tcp_syncookies || !th->ack)
+		goto out;
+
+	if (time_after(jiffies, tp->last_synq_overflow + TCP_TIMEOUT_INIT) ||
+		(mss = cookie_check(skb, cookie)) == 0) {
+		NET_INC_STATS_BH(LINUX_MIB_SYNCOOKIESFAILED);
+		goto out;
+	}
+
+	NET_INC_STATS_BH(LINUX_MIB_SYNCOOKIESRECV);
+
+	ret = NULL;
+	req = inet6_reqsk_alloc(&tcp6_request_sock_ops);
+	if (!req)
+		goto out;
+
+	ireq = inet_rsk(req);
+	ireq6 = inet6_rsk(req);
+	treq = tcp_rsk(req);
+	ireq6->pktopts = NULL;
+
+	if (security_inet_conn_request(sk, skb, req)) {
+		reqsk_free(req);
+		goto out;
+	}
+
+	req->mss = mss;
+	ireq->rmt_port = th->source;
+	ipv6_addr_copy(&ireq6->rmt_addr, &ipv6_hdr(skb)->saddr);
+	ipv6_addr_copy(&ireq6->loc_addr, &ipv6_hdr(skb)->daddr);
+	if (ipv6_opt_accepted(sk, skb) ||
+	    np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo ||
+	    np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim) {
+		atomic_inc(&skb->users);
+		ireq6->pktopts = skb;
+	}
+
+	ireq6->iif = sk->sk_bound_dev_if;
+	/* So that link locals have meaning */
+	if (!sk->sk_bound_dev_if &&
+	    ipv6_addr_type(&ireq6->rmt_addr) & IPV6_ADDR_LINKLOCAL)
+		ireq6->iif = inet6_iif(skb);
+
+	req->expires = 0UL;
+	req->retrans = 0;
+	ireq->snd_wscale = ireq->rcv_wscale = ireq->tstamp_ok = 0;
+	ireq->wscale_ok = ireq->sack_ok = 0;
+	treq->rcv_isn = ntohl(th->seq) - 1;
+	treq->snt_isn = cookie;
+
+	/*
+	 * We need to lookup the dst_entry to get the correct window size.
+	 * This is taken from tcp_v6_syn_recv_sock.  Somebody please enlighten
+	 * me if there is a preferred way.
+	 */
+	{
+		struct in6_addr *final_p = NULL, final;
+		struct flowi fl;
+		memset(&fl, 0, sizeof(fl));
+		fl.proto = IPPROTO_TCP;
+		ipv6_addr_copy(&fl.fl6_dst, &ireq6->rmt_addr);
+		if (np->opt && np->opt->srcrt) {
+			struct rt0_hdr *rt0 = (struct rt0_hdr *) np->opt->srcrt;
+			ipv6_addr_copy(&final, &fl.fl6_dst);
+			ipv6_addr_copy(&fl.fl6_dst, rt0->addr);
+			final_p = &final;
+		}
+		ipv6_addr_copy(&fl.fl6_src, &ireq6->loc_addr);
+		fl.oif = sk->sk_bound_dev_if;
+		fl.fl_ip_dport = inet_rsk(req)->rmt_port;
+		fl.fl_ip_sport = inet_sk(sk)->sport;
+		security_req_classify_flow(req, &fl);
+		if (ip6_dst_lookup(sk, &dst, &fl)) {
+			reqsk_free(req);
+			goto out;
+		}
+		if (final_p)
+			ipv6_addr_copy(&fl.fl6_dst, final_p);
+		if ((xfrm_lookup(&dst, &fl, sk, 0)) < 0)
+			goto out;
+	}
+
+	req->window_clamp = dst_metric(dst, RTAX_WINDOW);
+	tcp_select_initial_window(tcp_full_space(sk), req->mss,
+				  &req->rcv_wnd, &req->window_clamp,
+				  0, &rcv_wscale);
+
+	ireq->rcv_wscale = rcv_wscale;
+
+	ret = get_cookie_sock(sk, skb, req, dst);
+
+out:	return ret;
+}
+
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 93980c3..ad39bd1 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -520,6 +520,20 @@ done:
 	return err;
 }
 
+static inline void syn_flood_warning(struct sk_buff *skb)
+{
+#ifdef CONFIG_SYN_COOKIES
+	if (sysctl_tcp_syncookies)
+		printk(KERN_INFO
+		       "TCPv6: Possible SYN flooding on port %d. "
+		       "Sending cookies.\n", ntohs(tcp_hdr(skb)->dest));
+	else
+#endif
+		printk(KERN_INFO
+		       "TCPv6: Possible SYN flooding on port %d. "
+		       "Dropping request.\n", ntohs(tcp_hdr(skb)->dest));
+}
+
 static void tcp_v6_reqsk_destructor(struct request_sock *req)
 {
 	if (inet6_rsk(req)->pktopts)
@@ -923,7 +937,7 @@ done_opts:
 }
 #endif
 
-static struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
+struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
 	.family		=	AF_INET6,
 	.obj_size	=	sizeof(struct tcp6_request_sock),
 	.rtx_syn_ack	=	tcp_v6_send_synack,
@@ -1221,9 +1235,9 @@ static struct sock *tcp_v6_hnd_req(struct sock *sk,struct sk_buff *skb)
 		return NULL;
 	}
 
-#if 0 /*def CONFIG_SYN_COOKIES*/
+#ifdef CONFIG_SYN_COOKIES
 	if (!th->rst && !th->syn && th->ack)
-		sk = cookie_v6_check(sk, skb, &(IPCB(skb)->opt));
+		sk = cookie_v6_check(sk, skb);
 #endif
 	return sk;
 }
@@ -1239,6 +1253,11 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct request_sock *req = NULL;
 	__u32 isn = TCP_SKB_CB(skb)->when;
+#ifdef CONFIG_SYN_COOKIES
+	int want_cookie = 0;
+#else
+#define want_cookie 0
+#endif
 
 	if (skb->protocol == htons(ETH_P_IP))
 		return tcp_v4_conn_request(sk, skb);
@@ -1246,12 +1265,14 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	if (!ipv6_unicast_destination(skb))
 		goto drop;
 
-	/*
-	 *	There are no SYN attacks on IPv6, yet...
-	 */
 	if (inet_csk_reqsk_queue_is_full(sk) && !isn) {
 		if (net_ratelimit())
-			printk(KERN_INFO "TCPv6: dropping request, synflood is possible\n");
+			syn_flood_warning(skb);
+#ifdef CONFIG_SYN_COOKIES
+		if (sysctl_tcp_syncookies)
+			want_cookie = 1;
+		else
+#endif
 		goto drop;
 	}
 
@@ -1272,29 +1293,39 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 
 	tcp_parse_options(skb, &tmp_opt, 0);
 
+	if (want_cookie) {
+		tcp_clear_options(&tmp_opt);
+		tmp_opt.saw_tstamp = 0;
+	}
+
 	tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
 	tcp_openreq_init(req, &tmp_opt, skb);
 
 	treq = inet6_rsk(req);
 	ipv6_addr_copy(&treq->rmt_addr, &ipv6_hdr(skb)->saddr);
 	ipv6_addr_copy(&treq->loc_addr, &ipv6_hdr(skb)->daddr);
-	TCP_ECN_create_request(req, tcp_hdr(skb));
 	treq->pktopts = NULL;
-	if (ipv6_opt_accepted(sk, skb) ||
-	    np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo ||
-	    np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim) {
-		atomic_inc(&skb->users);
-		treq->pktopts = skb;
-	}
-	treq->iif = sk->sk_bound_dev_if;
+	if (!want_cookie)
+		TCP_ECN_create_request(req, tcp_hdr(skb));
+
+	if (want_cookie) {
+		isn = cookie_v6_init_sequence(sk, skb, &req->mss);
+	} else if (!isn) {
+		if (ipv6_opt_accepted(sk, skb) ||
+		    np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo ||
+		    np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim) {
+			atomic_inc(&skb->users);
+			treq->pktopts = skb;
+		}
+		treq->iif = sk->sk_bound_dev_if;
 
-	/* So that link locals have meaning */
-	if (!sk->sk_bound_dev_if &&
-	    ipv6_addr_type(&treq->rmt_addr) & IPV6_ADDR_LINKLOCAL)
-		treq->iif = inet6_iif(skb);
+		/* So that link locals have meaning */
+		if (!sk->sk_bound_dev_if &&
+		    ipv6_addr_type(&treq->rmt_addr) & IPV6_ADDR_LINKLOCAL)
+			treq->iif = inet6_iif(skb);
 
-	if (isn == 0)
 		isn = tcp_v6_init_sequence(skb);
+	}
 
 	tcp_rsk(req)->snt_isn = isn;
 
@@ -1303,8 +1334,10 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	if (tcp_v6_send_synack(sk, req, NULL))
 		goto drop;
 
-	inet6_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
-	return 0;
+	if (!want_cookie) {
+		inet6_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
+		return 0;
+	}
 
 drop:
 	if (req)
-- 
1.5.3.4


^ permalink raw reply related

* Re: [PATCH] [IPV4]: Fix compiler error with CONFIG_PROC_FS=n
From: David Miller @ 2008-02-06  0:34 UTC (permalink / raw)
  To: johfel; +Cc: netdev, den
In-Reply-To: <1202237034.9289.8.camel@localhost>

From: Johann Felix Soden <johfel@gmx.de>
Date: Tue, 05 Feb 2008 19:43:54 +0100

> From: Johann Felix Soden <johfel@users.sourceforge.net>
> 
> Handle CONFIG_PROC_FS=n in net/ipv4/fib_frontend.c because:
> 
> net/ipv4/fib_frontend.c: In function 'fib_net_init':
> net/ipv4/fib_frontend.c:1032: error: implicit declaration of function 'fib_proc_init'
> net/ipv4/fib_frontend.c: In function 'fib_net_exit':
> net/ipv4/fib_frontend.c:1047: error: implicit declaration of function 'fib_proc_exit'
> 
> Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>

I'm pretty sure I merged in a change this morning which fixes
this.

BTW, in general we do not put ifdefs into *.c code to fix stuff
like this, instead we add empty implementations into a header
file which is a better place for ifdef tests.

And that's how the patch which was merged handles this problem.

^ permalink raw reply

* Re: [PATCH] [PPPOL2TP] Label unused warning when CONFIG_PROC_FS is not set.
From: David Miller @ 2008-02-06  0:32 UTC (permalink / raw)
  To: jeff; +Cc: ramirose, netdev, linux-kernel, paulus
In-Reply-To: <47A8A95B.4060905@garzik.org>

From: Jeff Garzik <jeff@garzik.org>
Date: Tue, 05 Feb 2008 13:22:19 -0500

> David or Paul, wanna pick this up?

I took it, no worries.

^ permalink raw reply

* Re: [PATCH] [PPPOL2TP] Label unused warning when CONFIG_PROC_FS is not set.
From: David Miller @ 2008-02-06  0:31 UTC (permalink / raw)
  To: jchapman; +Cc: ramirose, netdev, jeff, linux-kernel
In-Reply-To: <47A89F02.8010403@katalix.com>

From: James Chapman <jchapman@katalix.com>
Date: Tue, 05 Feb 2008 17:38:10 +0000

> Acked-by: James Chapman <jchapman@katalix.com>

Applied, thanks everyone.

^ permalink raw reply

* Re: [NET_SCHED 04/04]: cls_flow: support classification based on VLAN tag
From: David Miller @ 2008-02-06  0:21 UTC (permalink / raw)
  To: kaber; +Cc: netdev
In-Reply-To: <20080205142907.13543.75941.sendpatchset@localhost.localdomain>

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Feb 2008 15:29:44 +0100 (MET)

> [NET_SCHED]: cls_flow: support classification based on VLAN tag
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Also applied, thanks a lot.

^ permalink raw reply

* Re: [VLAN 03/04]: Constify skb argument to vlan_get_tag()
From: David Miller @ 2008-02-06  0:20 UTC (permalink / raw)
  To: kaber; +Cc: netdev
In-Reply-To: <20080205142906.13543.98167.sendpatchset@localhost.localdomain>

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Feb 2008 15:29:43 +0100 (MET)

> [VLAN]: Constify skb argument to vlan_get_tag()
> 
> Required by next patch to use it from the flow classifier.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply

* Re: [NET_SCHED 02/04]: cls_flow: fix key mask validity check
From: David Miller @ 2008-02-06  0:20 UTC (permalink / raw)
  To: kaber; +Cc: netdev
In-Reply-To: <20080205142904.13543.34586.sendpatchset@localhost.localdomain>

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Feb 2008 15:29:41 +0100 (MET)

> [NET_SCHED]: cls_flow: fix key mask validity check
> 
> Since we're using fls(), we need to check whether the value is non-zero first.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply

* Re: [NET_SCHED 01/04]: em_meta: fix compile warning
From: David Miller @ 2008-02-06  0:19 UTC (permalink / raw)
  To: kaber; +Cc: netdev
In-Reply-To: <20080205142903.13543.6864.sendpatchset@localhost.localdomain>

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Feb 2008 15:29:40 +0100 (MET)

> [NET_SCHED]: em_meta: fix compile warning
> 
> net/sched/em_meta.c: In function 'meta_int_vlan_tag':
> net/sched/em_meta.c:179: warning: 'tag' may be used uninitialized in this function
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Applied.

^ permalink raw reply

* Re: [PATCH 01/24 for-2.6.25] DM9000: Fix endian-ness of data accesses. Patch from: Laurent Pinchart <laurentp@cse-semaphore.com>
From: Francois Romieu @ 2008-02-05 22:57 UTC (permalink / raw)
  To: Ben Dooks; +Cc: netdev, jeff, akpm, daniel, laurentp
In-Reply-To: <20080205000814.539308209@fluff.org.uk>

Ben Dooks <ben-linux@fluff.org> :
> This patch splits the receive status in 8bit wide fields and convert the
> packet length from little endian to CPU byte order.
> 
> Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
> Signed-off-by: Ben Dooks <ben-linux@fluff.org>
> 
> Index: linux-2.6.24-git5-dm9k/drivers/net/dm9000.c
> ===================================================================
> --- linux-2.6.24-git5-dm9k.orig/drivers/net/dm9000.c
> +++ linux-2.6.24-git5-dm9k/drivers/net/dm9000.c
> @@ -867,7 +867,8 @@ dm9000_timer(unsigned long data)
>  }
>  
>  struct dm9000_rxhdr {
> -	u16	RxStatus;
> +	u8	RxPktReady;
> +	u8	RxStatus;
>  	u16	RxLen;

Given the description of the patch, you could s/u16/__le16/

Al regularly pushes endianness fixes. You can save him some work.

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH 07/24 for-2.6.25] DM9000: Add initial ethtool support
From: Francois Romieu @ 2008-02-05 22:50 UTC (permalink / raw)
  To: Ben Dooks; +Cc: netdev, jeff, akpm, daniel, laurentp
In-Reply-To: <20080205000815.759244139@fluff.org.uk>

Ben Dooks <ben-linux@fluff.org> :
> Add support for ethtool operations for the DM9000.
> 
> Signed-off-by: Ben Dooks <ben-linux@fluff.org>
> 
> Index: linux-2.6.24-quilt3/drivers/net/dm9000.c
> ===================================================================
> --- linux-2.6.24-quilt3.orig/drivers/net/dm9000.c
> +++ linux-2.6.24-quilt3/drivers/net/dm9000.c
[...]
> +static int dm9000_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
> +{
> +	board_info_t *dm = to_dm9000_board(dev);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&dm->lock, flags);
> +	mii_ethtool_gset(&dm->mii, cmd);

drivers/net/mii.c::mii_ethtool_gset
[...]
        advert = mii->mdio_read(dev, mii->phy_id, MII_ADVERTISE);

drivers/net/dm9000.c::dm9000_probe
[...]
        db->mii.mdio_read    = dm9000_phy_read;

drivers/net/dm9000.c::dm9000_phy_read
[...]
        board_info_t *db = (board_info_t *) dev->priv;
        unsigned long flags;
        unsigned int reg_save;
        int ret;

        spin_lock_irqsave(&db->lock,flags);

-> deadlock

-- 
Ueimor

^ permalink raw reply

* Pull request for 'r6040' branch
From: Francois Romieu @ 2008-02-05 22:34 UTC (permalink / raw)
  To: jeff; +Cc: netdev, Florian Fainelli, Sten Wang, Andrew Morton

Please pull from branch 'r6040' in repository

git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git r6040

to get the changes below.

I have simply rebased the r6040 branch from december on top of
Linus's latest head and given each patch a compile test.
The content is identical to Florian's initial work (minus the
removal of the unused TIMER_WUT and a duplicate update of an
error counter).

Distance from 'master' (21511abd0a248a3f225d3b611cfabb93124605a7)
-----------------------------------------------------------------

ec6d2d453a932fd50c5fd95d5aac633b4e5f241d
106adf3c84d081776a1d1fbb8a047cad12af2bb9
b4f1255d6839bd970d5ff20a9c3d73f73c9adaa3
d248fd77902fcf33b0bc49ab521930877d94890f

Diffstat
--------

 drivers/net/r6040.c |  233 +++++++++++++++++++++++++++++---------------------
 1 files changed, 135 insertions(+), 98 deletions(-)

Shortlog
--------

Florian Fainelli (4):
      r6040: do not use a private stats structure to store statistics
      r6040: add helpers to allocate and free the Tx/Rx buffers
      r6040: recover from transmit timeout
      r6040: cleanups

Patch
-----

diff --git a/drivers/net/r6040.c b/drivers/net/r6040.c
index 2334f4e..19184e4 100644
--- a/drivers/net/r6040.c
+++ b/drivers/net/r6040.c
@@ -61,7 +61,6 @@
 
 /* Time in jiffies before concluding the transmitter is hung. */
 #define TX_TIMEOUT	(6000 * HZ / 1000)
-#define TIMER_WUT	(jiffies + HZ * 1)/* timer wakeup time : 1 second */
 
 /* RDC MAC I/O Size */
 #define R6040_IO_SIZE	256
@@ -174,8 +173,6 @@ struct r6040_private {
 	struct net_device *dev;
 	struct mii_if_info mii_if;
 	struct napi_struct napi;
-	struct net_device_stats stats;
-	u16	napi_rx_running;
 	void __iomem *base;
 };
 
@@ -235,17 +232,53 @@ static void mdio_write(struct net_device *dev, int mii_id, int reg, int val)
 	phy_write(ioaddr, lp->phy_addr, reg, val);
 }
 
-static void r6040_tx_timeout(struct net_device *dev)
+static void r6040_free_txbufs(struct net_device *dev)
 {
-	struct r6040_private *priv = netdev_priv(dev);
+	struct r6040_private *lp = netdev_priv(dev);
+	int i;
 
-	disable_irq(dev->irq);
-	napi_disable(&priv->napi);
-	spin_lock(&priv->lock);
-	dev->stats.tx_errors++;
-	spin_unlock(&priv->lock);
+	for (i = 0; i < TX_DCNT; i++) {
+		if (lp->tx_insert_ptr->skb_ptr) {
+			pci_unmap_single(lp->pdev, lp->tx_insert_ptr->buf,
+				MAX_BUF_SIZE, PCI_DMA_TODEVICE);
+			dev_kfree_skb(lp->tx_insert_ptr->skb_ptr);
+			lp->rx_insert_ptr->skb_ptr = NULL;
+		}
+		lp->tx_insert_ptr = lp->tx_insert_ptr->vndescp;
+	}
+}
 
-	netif_stop_queue(dev);
+static void r6040_free_rxbufs(struct net_device *dev)
+{
+	struct r6040_private *lp = netdev_priv(dev);
+	int i;
+
+	for (i = 0; i < RX_DCNT; i++) {
+		if (lp->rx_insert_ptr->skb_ptr) {
+			pci_unmap_single(lp->pdev, lp->rx_insert_ptr->buf,
+				MAX_BUF_SIZE, PCI_DMA_FROMDEVICE);
+			dev_kfree_skb(lp->rx_insert_ptr->skb_ptr);
+			lp->rx_insert_ptr->skb_ptr = NULL;
+		}
+		lp->rx_insert_ptr = lp->rx_insert_ptr->vndescp;
+	}
+}
+
+static void r6040_init_ring_desc(struct r6040_descriptor *desc_ring,
+				 dma_addr_t desc_dma, int size)
+{
+	struct r6040_descriptor *desc = desc_ring;
+	dma_addr_t mapping = desc_dma;
+
+	while (size-- > 0) {
+		mapping += sizeof(sizeof(*desc));
+		desc->ndesc = cpu_to_le32(mapping);
+		desc->vndescp = desc + 1;
+		desc++;
+	}
+	desc--;
+	desc->ndesc = cpu_to_le32(desc_dma);
+	desc->vndescp = desc_ring;
 }
 
 /* Allocate skb buffer for rx descriptor */
@@ -256,7 +289,7 @@ static void rx_buf_alloc(struct r6040_private *lp, struct net_device *dev)
 
 	descptr = lp->rx_insert_ptr;
 	while (lp->rx_free_desc < RX_DCNT) {
-		descptr->skb_ptr = dev_alloc_skb(MAX_BUF_SIZE);
+		descptr->skb_ptr = netdev_alloc_skb(dev, MAX_BUF_SIZE);
 
 		if (!descptr->skb_ptr)
 			break;
@@ -272,6 +305,63 @@ static void rx_buf_alloc(struct r6040_private *lp, struct net_device *dev)
 	lp->rx_insert_ptr = descptr;
 }
 
+static void r6040_alloc_txbufs(struct net_device *dev)
+{
+	struct r6040_private *lp = netdev_priv(dev);
+	void __iomem *ioaddr = lp->base;
+
+	lp->tx_free_desc = TX_DCNT;
+
+	lp->tx_remove_ptr = lp->tx_insert_ptr = lp->tx_ring;
+	r6040_init_ring_desc(lp->tx_ring, lp->tx_ring_dma, TX_DCNT);
+
+	iowrite16(lp->tx_ring_dma, ioaddr + MTD_SA0);
+	iowrite16(lp->tx_ring_dma >> 16, ioaddr + MTD_SA1);
+}
+
+static void r6040_alloc_rxbufs(struct net_device *dev)
+{
+	struct r6040_private *lp = netdev_priv(dev);
+	void __iomem *ioaddr = lp->base;
+
+	lp->rx_free_desc = 0;
+
+	lp->rx_remove_ptr = lp->rx_insert_ptr = lp->rx_ring;
+	r6040_init_ring_desc(lp->rx_ring, lp->rx_ring_dma, RX_DCNT);
+
+	rx_buf_alloc(lp, dev);
+
+	iowrite16(lp->rx_ring_dma, ioaddr + MRD_SA0);
+	iowrite16(lp->rx_ring_dma >> 16, ioaddr + MRD_SA1);
+}
+
+static void r6040_tx_timeout(struct net_device *dev)
+{
+	struct r6040_private *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+
+	printk(KERN_WARNING "%s: transmit timed out, status %4.4x, PHY status "
+		"%4.4x\n",
+		dev->name, ioread16(ioaddr + MIER),
+		mdio_read(dev, priv->mii_if.phy_id, MII_BMSR));
+
+	disable_irq(dev->irq);
+	napi_disable(&priv->napi);
+	spin_lock(&priv->lock);
+	/* Clear all descriptors */
+	r6040_free_txbufs(dev);
+	r6040_free_rxbufs(dev);
+	r6040_alloc_txbufs(dev);
+	r6040_alloc_rxbufs(dev);
+
+	/* Reset MAC */
+	iowrite16(MAC_RST, ioaddr + MCR1);
+	spin_unlock(&priv->lock);
+	enable_irq(dev->irq);
+
+	dev->stats.tx_errors++;
+	netif_wake_queue(dev);
+}
 
 static struct net_device_stats *r6040_get_stats(struct net_device *dev)
 {
@@ -280,11 +370,11 @@ static struct net_device_stats *r6040_get_stats(struct net_device *dev)
 	unsigned long flags;
 
 	spin_lock_irqsave(&priv->lock, flags);
-	priv->stats.rx_crc_errors += ioread8(ioaddr + ME_CNT1);
-	priv->stats.multicast += ioread8(ioaddr + ME_CNT0);
+	dev->stats.rx_crc_errors += ioread8(ioaddr + ME_CNT1);
+	dev->stats.multicast += ioread8(ioaddr + ME_CNT0);
 	spin_unlock_irqrestore(&priv->lock, flags);
 
-	return &priv->stats;
+	return &dev->stats;
 }
 
 /* Stop RDC MAC and Free the allocated resource */
@@ -293,7 +383,6 @@ static void r6040_down(struct net_device *dev)
 	struct r6040_private *lp = netdev_priv(dev);
 	void __iomem *ioaddr = lp->base;
 	struct pci_dev *pdev = lp->pdev;
-	int i;
 	int limit = 2048;
 	u16 *adrp;
 	u16 cmd;
@@ -313,27 +402,12 @@ static void r6040_down(struct net_device *dev)
 	iowrite16(adrp[1], ioaddr + MID_0M);
 	iowrite16(adrp[2], ioaddr + MID_0H);
 	free_irq(dev->irq, dev);
+
 	/* Free RX buffer */
-	for (i = 0; i < RX_DCNT; i++) {
-		if (lp->rx_insert_ptr->skb_ptr) {
-			pci_unmap_single(lp->pdev, lp->rx_insert_ptr->buf,
-				MAX_BUF_SIZE, PCI_DMA_FROMDEVICE);
-			dev_kfree_skb(lp->rx_insert_ptr->skb_ptr);
-			lp->rx_insert_ptr->skb_ptr = NULL;
-		}
-		lp->rx_insert_ptr = lp->rx_insert_ptr->vndescp;
-	}
+	r6040_free_rxbufs(dev);
 
 	/* Free TX buffer */
-	for (i = 0; i < TX_DCNT; i++) {
-		if (lp->tx_insert_ptr->skb_ptr) {
-			pci_unmap_single(lp->pdev, lp->tx_insert_ptr->buf,
-				MAX_BUF_SIZE, PCI_DMA_TODEVICE);
-			dev_kfree_skb(lp->tx_insert_ptr->skb_ptr);
-			lp->rx_insert_ptr->skb_ptr = NULL;
-		}
-		lp->tx_insert_ptr = lp->tx_insert_ptr->vndescp;
-	}
+	r6040_free_txbufs(dev);
 
 	/* Free Descriptor memory */
 	pci_free_consistent(pdev, RX_DESC_SIZE, lp->rx_ring, lp->rx_ring_dma);
@@ -432,19 +506,24 @@ static int r6040_rx(struct net_device *dev, int limit)
 
 		/* Check for errors */
 		err = ioread16(ioaddr + MLSR);
-		if (err & 0x0400) priv->stats.rx_errors++;
+		if (err & 0x0400)
+			dev->stats.rx_errors++;
 		/* RX FIFO over-run */
-		if (err & 0x8000) priv->stats.rx_fifo_errors++;
+		if (err & 0x8000)
+			dev->stats.rx_fifo_errors++;
 		/* RX descriptor unavailable */
-		if (err & 0x0080) priv->stats.rx_frame_errors++;
+		if (err & 0x0080)
+			dev->stats.rx_frame_errors++;
 		/* Received packet with length over buffer lenght */
-		if (err & 0x0020) priv->stats.rx_over_errors++;
+		if (err & 0x0020)
+			dev->stats.rx_over_errors++;
 		/* Received packet with too long or short */
-		if (err & (0x0010|0x0008)) priv->stats.rx_length_errors++;
+		if (err & (0x0010 | 0x0008))
+			dev->stats.rx_length_errors++;
 		/* Received packet with CRC errors */
 		if (err & 0x0004) {
 			spin_lock(&priv->lock);
-			priv->stats.rx_crc_errors++;
+			dev->stats.rx_crc_errors++;
 			spin_unlock(&priv->lock);
 		}
 
@@ -469,8 +548,8 @@ static int r6040_rx(struct net_device *dev, int limit)
 			/* Send to upper layer */
 			netif_receive_skb(skb_ptr);
 			dev->last_rx = jiffies;
-			priv->dev->stats.rx_packets++;
-			priv->dev->stats.rx_bytes += descptr->len;
+			dev->stats.rx_packets++;
+			dev->stats.rx_bytes += descptr->len;
 			/* To next descriptor */
 			descptr = descptr->vndescp;
 			priv->rx_free_desc--;
@@ -498,11 +577,13 @@ static void r6040_tx(struct net_device *dev)
 		/* Check for errors */
 		err = ioread16(ioaddr + MLSR);
 
-		if (err & 0x0200) priv->stats.rx_fifo_errors++;
-		if (err & (0x2000 | 0x4000)) priv->stats.tx_carrier_errors++;
+		if (err & 0x0200)
+			dev->stats.rx_fifo_errors++;
+		if (err & (0x2000 | 0x4000))
+			dev->stats.tx_carrier_errors++;
 
 		if (descptr->status & 0x8000)
-			break; /* Not complte */
+			break; /* Not complete */
 		skb_ptr = descptr->skb_ptr;
 		pci_unmap_single(priv->pdev, descptr->buf,
 			skb_ptr->len, PCI_DMA_TODEVICE);
@@ -545,7 +626,6 @@ static irqreturn_t r6040_interrupt(int irq, void *dev_id)
 	struct r6040_private *lp = netdev_priv(dev);
 	void __iomem *ioaddr = lp->base;
 	u16 status;
-	int handled = 1;
 
 	/* Mask off RDC MAC interrupt */
 	iowrite16(MSK_INT, ioaddr + MIER);
@@ -565,7 +645,7 @@ static irqreturn_t r6040_interrupt(int irq, void *dev_id)
 	if (status & 0x10)
 		r6040_tx(dev);
 
-	return IRQ_RETVAL(handled);
+	return IRQ_HANDLED;
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
@@ -577,53 +657,15 @@ static void r6040_poll_controller(struct net_device *dev)
 }
 #endif
 
-static void r6040_init_ring_desc(struct r6040_descriptor *desc_ring,
-				 dma_addr_t desc_dma, int size)
-{
-	struct r6040_descriptor *desc = desc_ring;
-	dma_addr_t mapping = desc_dma;
-
-	while (size-- > 0) {
-		mapping += sizeof(sizeof(*desc));
-		desc->ndesc = cpu_to_le32(mapping);
-		desc->vndescp = desc + 1;
-		desc++;
-	}
-	desc--;
-	desc->ndesc = cpu_to_le32(desc_dma);
-	desc->vndescp = desc_ring;
-}
-
 /* Init RDC MAC */
 static void r6040_up(struct net_device *dev)
 {
 	struct r6040_private *lp = netdev_priv(dev);
 	void __iomem *ioaddr = lp->base;
 
-	/* Initialize */
-	lp->tx_free_desc = TX_DCNT;
-	lp->rx_free_desc = 0;
-	/* Init descriptor */
-	lp->tx_remove_ptr = lp->tx_insert_ptr = lp->tx_ring;
-	lp->rx_remove_ptr = lp->rx_insert_ptr = lp->rx_ring;
-	/* Init TX descriptor */
-	r6040_init_ring_desc(lp->tx_ring, lp->tx_ring_dma, TX_DCNT);
-
-	/* Init RX descriptor */
-	r6040_init_ring_desc(lp->rx_ring, lp->rx_ring_dma, RX_DCNT);
-
-	/* Allocate buffer for RX descriptor */
-	rx_buf_alloc(lp, dev);
-
-	/*
-	 * TX and RX descriptor start registers.
-	 * Lower 16-bits to MxD_SA0. Higher 16-bits to MxD_SA1.
-	 */
-	iowrite16(lp->tx_ring_dma, ioaddr + MTD_SA0);
-	iowrite16(lp->tx_ring_dma >> 16, ioaddr + MTD_SA1);
-
-	iowrite16(lp->rx_ring_dma, ioaddr + MRD_SA0);
-	iowrite16(lp->rx_ring_dma >> 16, ioaddr + MRD_SA1);
+	/* Initialise and alloc RX/TX buffers */
+	r6040_alloc_txbufs(dev);
+	r6040_alloc_rxbufs(dev);
 
 	/* Buffer Size Register */
 	iowrite16(MAX_BUF_SIZE, ioaddr + MR_BSR);
@@ -689,8 +731,7 @@ static void r6040_timer(unsigned long data)
 	}
 
 	/* Timer active again */
-	lp->timer.expires = TIMER_WUT;
-	add_timer(&lp->timer);
+	mod_timer(&lp->timer, jiffies + round_jiffies(HZ));
 }
 
 /* Read/set MAC address routines */
@@ -746,14 +787,10 @@ static int r6040_open(struct net_device *dev)
 	napi_enable(&lp->napi);
 	netif_start_queue(dev);
 
-	if (lp->switch_sig != ICPLUS_PHY_ID) {
-		/* set and active a timer process */
-		init_timer(&lp->timer);
-		lp->timer.expires = TIMER_WUT;
-		lp->timer.data = (unsigned long)dev;
-		lp->timer.function = &r6040_timer;
-		add_timer(&lp->timer);
-	}
+	/* set and active a timer process */
+	setup_timer(&lp->timer, r6040_timer, (unsigned long) dev);
+	if (lp->switch_sig != ICPLUS_PHY_ID)
+		mod_timer(&lp->timer, jiffies + HZ);
 	return 0;
 }
 
-- 
Ueimor

^ permalink raw reply related

* [PATCH] Fix PHY Lib support for gianfar and ucc_geth
From: Andy Fleming @ 2008-02-05 22:35 UTC (permalink / raw)
  To: jeff, netdev; +Cc: Andy Fleming

The PHY Lib now uses mutexes instead of spin_locks.  ucc_geth
and gianfar both grab the locks in their mdio_reset functions,
so they need to use mutex_(un)lock instead.  This was not caught
until someone tested it on an SMP system.

Signed-off-by: Andy Fleming <afleming@freescale.com>
---
 drivers/net/gianfar_mii.c  |    4 ++--
 drivers/net/ucc_geth_mii.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/gianfar_mii.c b/drivers/net/gianfar_mii.c
index 100bf41..6a647d9 100644
--- a/drivers/net/gianfar_mii.c
+++ b/drivers/net/gianfar_mii.c
@@ -127,7 +127,7 @@ int gfar_mdio_reset(struct mii_bus *bus)
 	struct gfar_mii __iomem *regs = (void __iomem *)bus->priv;
 	unsigned int timeout = PHY_INIT_TIMEOUT;
 
-	spin_lock_bh(&bus->mdio_lock);
+	mutex_lock(&bus->mdio_lock);
 
 	/* Reset the management interface */
 	gfar_write(&regs->miimcfg, MIIMCFG_RESET);
@@ -140,7 +140,7 @@ int gfar_mdio_reset(struct mii_bus *bus)
 			timeout--)
 		cpu_relax();
 
-	spin_unlock_bh(&bus->mdio_lock);
+	mutex_unlock(&bus->mdio_lock);
 
 	if(timeout <= 0) {
 		printk(KERN_ERR "%s: The MII Bus is stuck!\n",
diff --git a/drivers/net/ucc_geth_mii.c b/drivers/net/ucc_geth_mii.c
index e3ba14a..c69e654 100644
--- a/drivers/net/ucc_geth_mii.c
+++ b/drivers/net/ucc_geth_mii.c
@@ -109,7 +109,7 @@ int uec_mdio_reset(struct mii_bus *bus)
 	struct ucc_mii_mng __iomem *regs = (void __iomem *)bus->priv;
 	unsigned int timeout = PHY_INIT_TIMEOUT;
 
-	spin_lock_bh(&bus->mdio_lock);
+	mutex_lock(&bus->mdio_lock);
 
 	/* Reset the management interface */
 	out_be32(&regs->miimcfg, MIIMCFG_RESET_MANAGEMENT);
@@ -121,7 +121,7 @@ int uec_mdio_reset(struct mii_bus *bus)
 	while ((in_be32(&regs->miimind) & MIIMIND_BUSY) && timeout--)
 		cpu_relax();
 
-	spin_unlock_bh(&bus->mdio_lock);
+	mutex_unlock(&bus->mdio_lock);
 
 	if (timeout <= 0) {
 		printk(KERN_ERR "%s: The MII Bus is stuck!\n", bus->name);
-- 
1.5.0.2.230.gfbe3d-dirty


^ permalink raw reply related

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Alan Cox @ 2008-02-05 22:05 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: Andi Kleen, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205215216.GB24275@2ka.mipt.ru>

> SACK is actually a good idea for mobile devices, so preventing
> syncookies from not getting into account some options (btw, does it work
> with timestamps and PAWS?) is not a solution.

Syncookies only get used at the point where the alternative is failure.
No SACK beats a DoS situation most days

^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Willy Tarreau @ 2008-02-05 21:20 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: Alan Cox, Andi Kleen, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205215216.GB24275@2ka.mipt.ru>

On Wed, Feb 06, 2008 at 12:52:17AM +0300, Evgeniy Polyakov wrote:
> Hi Alan.
> 
> On Tue, Feb 05, 2008 at 09:20:17PM +0000, Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> > > Most (if not all) distributions have them enabled and window growing
> > > works just fine. Actually I do not see any reason why connection
> > > establishment handshake should prevent any run-time operations at all,
> > > even if it was setup during handshake.
> > 
> > Syncookies are only triggered if the system is under a load where it
> > would begin to lose connections otherwise. So they merely turn a DoS into
> > a working if slightly slower setup (and > 64K windows don't matter for
> > most normal users, especially on mobile devices).
> 
> SACK is actually a good idea for mobile devices, so preventing
> syncookies from not getting into account some options (btw, does it work
> with timestamps and PAWS?) is not a solution.

All TCP options negociated during session setup are lost. In fact, some
bits (3) are still reserved for the best known value of the MSS, but
that's all. The principle of SYN cookies is that the server does not
create any session upon the SYN, but builds a sequence number constitued
from a hash and the values it absolutely needs to know when the client
validates the session with an ACK.

I've seen some firewalls acting as SYN gateways which send the options
from the server to the client in the first ACK packet from the server.
This is normally not allowed, but it seems to work with some TCP stacks
(at least for the MSS). One solution would be to extend TCP to officially
support this behaviour and to optionally use it along with SYN cookies,
but there will always be old clients not compatible with the extension.

Regards,
Willy


^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Evgeniy Polyakov @ 2008-02-05 21:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: Andi Kleen, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205212017.5db50114@core>

Hi Alan.

On Tue, Feb 05, 2008 at 09:20:17PM +0000, Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> > Most (if not all) distributions have them enabled and window growing
> > works just fine. Actually I do not see any reason why connection
> > establishment handshake should prevent any run-time operations at all,
> > even if it was setup during handshake.
> 
> Syncookies are only triggered if the system is under a load where it
> would begin to lose connections otherwise. So they merely turn a DoS into
> a working if slightly slower setup (and > 64K windows don't matter for
> most normal users, especially on mobile devices).

SACK is actually a good idea for mobile devices, so preventing
syncookies from not getting into account some options (btw, does it work
with timestamps and PAWS?) is not a solution.

-- 
	Evgeniy Polyakov

^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Evgeniy Polyakov @ 2008-02-05 21:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205205345.GA20920@basil.nowhere.org>

On Tue, Feb 05, 2008 at 09:53:45PM +0100, Andi Kleen (andi@firstfloor.org) wrote:
> > How does syncookies prevent windows from growing?
> 
> Syncookies do not allow window scaling so you can't have any windows >64k

Then you meant not windows change, but the fact, that option is ignored
as long as sack enable one?

> > Most (if not all) distributions have them enabled and window growing
> > works just fine. Actually I do not see any reason why connection
> > establishment handshake should prevent any run-time operations at all,
> > even if it was setup during handshake.
> 
> TCP only uses options negotiated during the hand shake and syncookies
> is incapable to do this.

What about fixing the implementation, so that it could get into account
different options too?

> -Andi

-- 
	Evgeniy Polyakov

^ permalink raw reply

* [PATCH 5/5] forcedeth: preserve registers
From: Ayaz Abdulla @ 2008-02-04 20:14 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev; +Cc: Ayaz Abdulla

[-- Attachment #1: Type: text/plain, Size: 120 bytes --]

Various registers need to be preserved before resetting the device.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-reg-reset --]
[-- Type: text/plain, Size: 1065 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-17 16:51:40.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-20 11:56:45.000000000 -0500
@@ -1443,16 +1443,30 @@
 {
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
+	u32 temp1, temp2, temp3;
 
 	dprintk(KERN_DEBUG "%s: nv_mac_reset\n", dev->name);
+
 	writel(NVREG_TXRXCTL_BIT2 | NVREG_TXRXCTL_RESET | np->txrxctl_bits, base + NvRegTxRxControl);
 	pci_push(base);
+
+	/* save registers since they will be cleared on reset */
+	temp1 = readl(base + NvRegMacAddrA);
+	temp2 = readl(base + NvRegMacAddrB);
+	temp3 = readl(base + NvRegTransmitPoll);
+
 	writel(NVREG_MAC_RESET_ASSERT, base + NvRegMacReset);
 	pci_push(base);
 	udelay(NV_MAC_RESET_DELAY);
 	writel(0, base + NvRegMacReset);
 	pci_push(base);
 	udelay(NV_MAC_RESET_DELAY);
+
+	/* restore saved registers */
+	writel(temp1, base + NvRegMacAddrA);
+	writel(temp2, base + NvRegMacAddrB);
+	writel(temp3, base + NvRegTransmitPoll);
+
 	writel(NVREG_TXRXCTL_BIT2 | np->txrxctl_bits, base + NvRegTxRxControl);
 	pci_push(base);
 }

^ permalink raw reply

* [PATCH 4/5] forcedeth: tx pause watermarks
From: Ayaz Abdulla @ 2008-02-04 20:14 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 282 bytes --]

New chipsets introduced variant Rx FIFO sizes that need to be taken into 
account when setting up the tx pause watermarks. This patch introduces 
the new device feature flags based on a version and implements the new 
watermarks.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-txpause --]
[-- Type: text/plain, Size: 19517 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-17 16:50:29.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-17 16:51:02.000000000 -0500
@@ -166,22 +166,24 @@
  * Hardware access:
  */
 
-#define DEV_NEED_TIMERIRQ	0x0001  /* set the timer irq flag in the irq mask */
-#define DEV_NEED_LINKTIMER	0x0002	/* poll link settings. Relies on the timer irq */
-#define DEV_HAS_LARGEDESC	0x0004	/* device supports jumbo frames and needs packet format 2 */
-#define DEV_HAS_HIGH_DMA        0x0008  /* device supports 64bit dma */
-#define DEV_HAS_CHECKSUM        0x0010  /* device supports tx and rx checksum offloads */
-#define DEV_HAS_VLAN            0x0020  /* device supports vlan tagging and striping */
-#define DEV_HAS_MSI             0x0040  /* device supports MSI */
-#define DEV_HAS_MSI_X           0x0080  /* device supports MSI-X */
-#define DEV_HAS_POWER_CNTRL     0x0100  /* device supports power savings */
-#define DEV_HAS_PAUSEFRAME_TX   0x0200  /* device supports tx pause frames */
-#define DEV_HAS_STATISTICS_V1   0x0400  /* device supports hw statistics version 1 */
-#define DEV_HAS_STATISTICS_V2   0x0800  /* device supports hw statistics version 2 */
-#define DEV_HAS_TEST_EXTENDED   0x1000  /* device supports extended diagnostic test */
-#define DEV_HAS_MGMT_UNIT       0x2000  /* device supports management unit */
-#define DEV_HAS_CORRECT_MACADDR 0x4000  /* device supports correct mac address order */
-#define DEV_HAS_COLLISION_FIX   0x8000  /* device supports tx collision fix */
+#define DEV_NEED_TIMERIRQ          0x00001  /* set the timer irq flag in the irq mask */
+#define DEV_NEED_LINKTIMER         0x00002  /* poll link settings. Relies on the timer irq */
+#define DEV_HAS_LARGEDESC          0x00004  /* device supports jumbo frames and needs packet format 2 */
+#define DEV_HAS_HIGH_DMA           0x00008  /* device supports 64bit dma */
+#define DEV_HAS_CHECKSUM           0x00010  /* device supports tx and rx checksum offloads */
+#define DEV_HAS_VLAN               0x00020  /* device supports vlan tagging and striping */
+#define DEV_HAS_MSI                0x00040  /* device supports MSI */
+#define DEV_HAS_MSI_X              0x00080  /* device supports MSI-X */
+#define DEV_HAS_POWER_CNTRL        0x00100  /* device supports power savings */
+#define DEV_HAS_STATISTICS_V1      0x00200  /* device supports hw statistics version 1 */
+#define DEV_HAS_STATISTICS_V2      0x00400  /* device supports hw statistics version 2 */
+#define DEV_HAS_TEST_EXTENDED      0x00800  /* device supports extended diagnostic test */
+#define DEV_HAS_MGMT_UNIT          0x01000  /* device supports management unit */
+#define DEV_HAS_CORRECT_MACADDR    0x02000  /* device supports correct mac address order */
+#define DEV_HAS_COLLISION_FIX      0x04000  /* device supports tx collision fix */
+#define DEV_HAS_PAUSEFRAME_TX_V1   0x08000  /* device supports tx pause frames version 1 */
+#define DEV_HAS_PAUSEFRAME_TX_V2   0x10000  /* device supports tx pause frames version 2 */
+#define DEV_HAS_PAUSEFRAME_TX_V3   0x20000  /* device supports tx pause frames version 3 */
 
 enum {
 	NvRegIrqStatus = 0x000,
@@ -322,8 +324,10 @@
 	NvRegTxRingPhysAddrHigh = 0x148,
 	NvRegRxRingPhysAddrHigh = 0x14C,
 	NvRegTxPauseFrame = 0x170,
-#define NVREG_TX_PAUSEFRAME_DISABLE	0x01ff0080
-#define NVREG_TX_PAUSEFRAME_ENABLE	0x01800010
+#define NVREG_TX_PAUSEFRAME_DISABLE	0x0fff0080
+#define NVREG_TX_PAUSEFRAME_ENABLE_V1	0x01800010
+#define NVREG_TX_PAUSEFRAME_ENABLE_V2	0x056003f0
+#define NVREG_TX_PAUSEFRAME_ENABLE_V3	0x09f00880
 	NvRegMIIStatus = 0x180,
 #define NVREG_MIISTAT_ERROR		0x0001
 #define NVREG_MIISTAT_LINKCHANGE	0x0008
@@ -2741,7 +2745,12 @@
 	if (np->pause_flags & NV_PAUSEFRAME_TX_CAPABLE) {
 		u32 regmisc = readl(base + NvRegMisc1) & ~NVREG_MISC1_PAUSE_TX;
 		if (pause_flags & NV_PAUSEFRAME_TX_ENABLE) {
-			writel(NVREG_TX_PAUSEFRAME_ENABLE,  base + NvRegTxPauseFrame);
+			u32 pause_enable = NVREG_TX_PAUSEFRAME_ENABLE_V1;
+			if (np->driver_data & DEV_HAS_PAUSEFRAME_TX_V2)
+				pause_enable = NVREG_TX_PAUSEFRAME_ENABLE_V2;
+			if (np->driver_data & DEV_HAS_PAUSEFRAME_TX_V3)
+				pause_enable = NVREG_TX_PAUSEFRAME_ENABLE_V3;
+			writel(pause_enable,  base + NvRegTxPauseFrame);
 			writel(regmisc|NVREG_MISC1_PAUSE_TX, base + NvRegMisc1);
 			np->pause_flags |= NV_PAUSEFRAME_TX_ENABLE;
 		} else {
@@ -5158,7 +5167,9 @@
 	}
 
 	np->pause_flags = NV_PAUSEFRAME_RX_CAPABLE | NV_PAUSEFRAME_RX_REQ | NV_PAUSEFRAME_AUTONEG;
-	if (id->driver_data & DEV_HAS_PAUSEFRAME_TX) {
+	if ((id->driver_data & DEV_HAS_PAUSEFRAME_TX_V1) ||
+	    (id->driver_data & DEV_HAS_PAUSEFRAME_TX_V2) ||
+	    (id->driver_data & DEV_HAS_PAUSEFRAME_TX_V3)) {
 		np->pause_flags |= NV_PAUSEFRAME_TX_CAPABLE | NV_PAUSEFRAME_TX_REQ;
 	}
 
@@ -5564,107 +5575,107 @@
 	},
 	{	/* MCP55 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_VLAN|DEV_HAS_MSI|DEV_HAS_MSI_X|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_VLAN|DEV_HAS_MSI|DEV_HAS_MSI_X|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
 	},
 	{	/* MCP55 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_15),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_VLAN|DEV_HAS_MSI|DEV_HAS_MSI_X|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_VLAN|DEV_HAS_MSI|DEV_HAS_MSI_X|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
 	},
 	{	/* MCP61 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_16),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP61 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_17),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP61 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_18),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP61 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_19),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP65 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_20),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP65 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_21),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP65 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_22),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP65 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_23),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP67 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_24),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP67 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_25),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP67 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_26),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP67 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_27),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_28),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_29),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_30),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_31),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_32),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V2|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_33),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V2|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_34),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V2|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_35),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V2|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_36),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V3|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_37),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V3|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_38),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V3|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_39),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX_V3|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{0,},
 };

^ permalink raw reply

* [PATCH 3/5] forcedeth: phy status fix
From: Ayaz Abdulla @ 2008-02-04 20:14 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 334 bytes --]

The driver needs to ack only the phy status bits that it is currently 
handling and preserve the other bits for the other handlers. For 
example, when reading/writing from the phy, it should not clear the link 
change interrupt bit. This will cause a missing link change interrupt.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-mii-stat-fix --]
[-- Type: text/plain, Size: 2311 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-15 18:38:52.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-15 19:27:55.000000000 -0500
@@ -327,8 +327,8 @@
 	NvRegMIIStatus = 0x180,
 #define NVREG_MIISTAT_ERROR		0x0001
 #define NVREG_MIISTAT_LINKCHANGE	0x0008
-#define NVREG_MIISTAT_MASK		0x000f
-#define NVREG_MIISTAT_MASK2		0x000f
+#define NVREG_MIISTAT_MASK_RW		0x0007
+#define NVREG_MIISTAT_MASK_ALL		0x000f
 	NvRegMIIMask = 0x184,
 #define NVREG_MII_LINKCHANGE		0x0008
 
@@ -1068,7 +1068,7 @@
 	u32 reg;
 	int retval;
 
-	writel(NVREG_MIISTAT_MASK, base + NvRegMIIStatus);
+	writel(NVREG_MIISTAT_MASK_RW, base + NvRegMIIStatus);
 
 	reg = readl(base + NvRegMIIControl);
 	if (reg & NVREG_MIICTL_INUSE) {
@@ -3012,7 +3012,7 @@
 	u32 miistat;
 
 	miistat = readl(base + NvRegMIIStatus);
-	writel(NVREG_MIISTAT_MASK, base + NvRegMIIStatus);
+	writel(NVREG_MIISTAT_LINKCHANGE, base + NvRegMIIStatus);
 	dprintk(KERN_INFO "%s: link change irq, status 0x%x.\n", dev->name, miistat);
 
 	if (miistat & (NVREG_MIISTAT_LINKCHANGE))
@@ -4887,7 +4887,7 @@
 
 	writel(0, base + NvRegMIIMask);
 	writel(NVREG_IRQSTAT_MASK, base + NvRegIrqStatus);
-	writel(NVREG_MIISTAT_MASK2, base + NvRegMIIStatus);
+	writel(NVREG_MIISTAT_MASK_ALL, base + NvRegMIIStatus);
 
 	writel(NVREG_MISC1_FORCE | NVREG_MISC1_HD, base + NvRegMisc1);
 	writel(readl(base + NvRegTransmitterStatus), base + NvRegTransmitterStatus);
@@ -4925,7 +4925,7 @@
 
 	nv_disable_hw_interrupts(dev, np->irqmask);
 	pci_push(base);
-	writel(NVREG_MIISTAT_MASK2, base + NvRegMIIStatus);
+	writel(NVREG_MIISTAT_MASK_ALL, base + NvRegMIIStatus);
 	writel(NVREG_IRQSTAT_MASK, base + NvRegIrqStatus);
 	pci_push(base);
 
@@ -4948,7 +4948,7 @@
 	{
 		u32 miistat;
 		miistat = readl(base + NvRegMIIStatus);
-		writel(NVREG_MIISTAT_MASK, base + NvRegMIIStatus);
+		writel(NVREG_MIISTAT_MASK_ALL, base + NvRegMIIStatus);
 		dprintk(KERN_INFO "startup: got 0x%08x.\n", miistat);
 	}
 	/* set linkspeed to invalid value, thus force nv_update_linkspeed
@@ -5320,7 +5320,7 @@
 		phystate &= ~NVREG_ADAPTCTL_RUNNING;
 		writel(phystate, base + NvRegAdapterControl);
 	}
-	writel(NVREG_MIISTAT_MASK, base + NvRegMIIStatus);
+	writel(NVREG_MIISTAT_MASK_ALL, base + NvRegMIIStatus);
 
 	if (id->driver_data & DEV_HAS_MGMT_UNIT) {
 		/* management unit running on the mac? */

^ permalink raw reply

* [PATCH 2/5] forcedeth: tx collision fix
From: Ayaz Abdulla @ 2008-02-04 20:14 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 301 bytes --]

This patch supports a new fix in hardware regarding tx collisions. In 
the cases where we are in autoneg mode and the link partner is in forced 
mode, we need to setup the tx deferral register differently in order to 
reduce collisions on the wire.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-collision-fix --]
[-- Type: text/plain, Size: 9051 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-15 17:44:38.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-15 18:28:08.000000000 -0500
@@ -181,6 +181,7 @@
 #define DEV_HAS_TEST_EXTENDED   0x1000  /* device supports extended diagnostic test */
 #define DEV_HAS_MGMT_UNIT       0x2000  /* device supports management unit */
 #define DEV_HAS_CORRECT_MACADDR 0x4000  /* device supports correct mac address order */
+#define DEV_HAS_COLLISION_FIX   0x8000  /* device supports tx collision fix */
 
 enum {
 	NvRegIrqStatus = 0x000,
@@ -266,9 +267,12 @@
 #define NVREG_RNDSEED_FORCE3	0x7400
 
 	NvRegTxDeferral = 0xA0,
-#define NVREG_TX_DEFERRAL_DEFAULT	0x15050f
-#define NVREG_TX_DEFERRAL_RGMII_10_100	0x16070f
-#define NVREG_TX_DEFERRAL_RGMII_1000	0x14050f
+#define NVREG_TX_DEFERRAL_DEFAULT		0x15050f
+#define NVREG_TX_DEFERRAL_RGMII_10_100		0x16070f
+#define NVREG_TX_DEFERRAL_RGMII_1000		0x14050f
+#define NVREG_TX_DEFERRAL_RGMII_STRETCH_10	0x16190f
+#define NVREG_TX_DEFERRAL_RGMII_STRETCH_100	0x16300f
+#define NVREG_TX_DEFERRAL_MII_STRETCH		0x152000
 	NvRegRxDeferral = 0xA4,
 #define NVREG_RX_DEFERRAL_DEFAULT	0x16
 	NvRegMacAddrA = 0xA8,
@@ -2771,6 +2775,7 @@
 	int retval = 0;
 	u32 control_1000, status_1000, phyreg, pause_flags, txreg;
 	u32 txrxFlags = 0;
+	u32 phy_exp;
 
 	/* BMSR_LSTATUS is latched, read it twice:
 	 * we want the current value.
@@ -2898,13 +2903,25 @@
 		phyreg |= PHY_1000;
 	writel(phyreg, base + NvRegPhyInterface);
 
+	phy_exp = mii_rw(dev, np->phyaddr, MII_EXPANSION, MII_READ) & EXPANSION_NWAY; /* autoneg capable */
 	if (phyreg & PHY_RGMII) {
-		if ((np->linkspeed & NVREG_LINKSPEED_MASK) == NVREG_LINKSPEED_1000)
+		if ((np->linkspeed & NVREG_LINKSPEED_MASK) == NVREG_LINKSPEED_1000) {
 			txreg = NVREG_TX_DEFERRAL_RGMII_1000;
-		else
-			txreg = NVREG_TX_DEFERRAL_RGMII_10_100;
+		} else {
+			if (!phy_exp && !np->duplex && (np->driver_data & DEV_HAS_COLLISION_FIX)) {
+				if ((np->linkspeed & NVREG_LINKSPEED_MASK) == NVREG_LINKSPEED_10)
+					txreg = NVREG_TX_DEFERRAL_RGMII_STRETCH_10;
+				else
+					txreg = NVREG_TX_DEFERRAL_RGMII_STRETCH_100;
+			} else {
+				txreg = NVREG_TX_DEFERRAL_RGMII_10_100;
+			}
+		}
 	} else {
-		txreg = NVREG_TX_DEFERRAL_DEFAULT;
+		if (!phy_exp && !np->duplex && (np->driver_data & DEV_HAS_COLLISION_FIX))
+			txreg = NVREG_TX_DEFERRAL_MII_STRETCH;
+		else
+			txreg = NVREG_TX_DEFERRAL_DEFAULT;
 	}
 	writel(txreg, base + NvRegTxDeferral);
 
@@ -5603,51 +5620,51 @@
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_28),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_29),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_30),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP73 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_31),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_CORRECT_MACADDR|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_32),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_33),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_34),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP77 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_35),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_36),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_37),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_38),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{	/* MCP79 Ethernet Controller */
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_39),
-		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_CHECKSUM|DEV_HAS_HIGH_DMA|DEV_HAS_MSI|DEV_HAS_POWER_CNTRL|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT|DEV_HAS_COLLISION_FIX,
 	},
 	{0,},
 };

^ permalink raw reply

* [PATCH 1/5] forcedeth: restart tx/rx
From: Ayaz Abdulla @ 2008-02-04 20:13 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 181 bytes --]

This patch fixes the issue where the transmitter and receiver must be 
restarted when applying new changes to certain registers.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-restart-txrx --]
[-- Type: text/plain, Size: 1284 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-15 17:41:00.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-15 17:41:02.000000000 -0500
@@ -624,6 +624,9 @@
 #define NV_MSI_X_VECTOR_TX    0x1
 #define NV_MSI_X_VECTOR_OTHER 0x2
 
+#define NV_RESTART_TX         0x1
+#define NV_RESTART_RX         0x2
+
 /* statistics */
 struct nv_ethtool_str {
 	char name[ETH_GSTRING_LEN];
@@ -2767,6 +2770,7 @@
 	int mii_status;
 	int retval = 0;
 	u32 control_1000, status_1000, phyreg, pause_flags, txreg;
+	u32 txrxFlags = 0;
 
 	/* BMSR_LSTATUS is latched, read it twice:
 	 * we want the current value.
@@ -2862,6 +2866,16 @@
 	np->duplex = newdup;
 	np->linkspeed = newls;
 
+	/* The transmitter and receiver must be restarted for safe update */
+	if (readl(base + NvRegTransmitterControl) & NVREG_XMITCTL_START) {
+		txrxFlags |= NV_RESTART_TX;
+		nv_stop_tx(dev);
+	}
+	if (readl(base + NvRegReceiverControl) & NVREG_RCVCTL_START) {
+		txrxFlags |= NV_RESTART_RX;
+		nv_stop_rx(dev);
+	}
+
 	if (np->gigabit == PHY_GIGABIT) {
 		phyreg = readl(base + NvRegRandomSeed);
 		phyreg &= ~(0x3FF00);
@@ -2950,6 +2964,11 @@
 	}
 	nv_update_pause(dev, pause_flags);
 
+	if (txrxFlags & NV_RESTART_TX)
+		nv_start_tx(dev);
+	if (txrxFlags & NV_RESTART_RX)
+		nv_start_rx(dev);
+
 	return retval;
 }
 

^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Willy Tarreau @ 2008-02-05 20:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Alan Cox, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205163912.GB23145@one.firstfloor.org>

Hi Andi, Alan,

I've run extensive tests with/without syn cookies recently.

On Tue, Feb 05, 2008 at 05:39:12PM +0100, Andi Kleen wrote:
> On Tue, Feb 05, 2008 at 03:42:13PM +0000, Alan Cox wrote:
> > > Syncookies are discouraged these days. They disable too many 
> > > valuable TCP features (window scaling, SACK) and even without them 
> > > the kernel is usually strong enough to defend against syn floods
> > > and systems have much more memory than they used to be.
> > 
> > Somewhat untrue. Network speeds have risen dramatically, the number of
> 
> With strong I meant Linux has much better algorithms to handle the standard
> syn queue (syncookies was originally added when it had only dumb head drop) 
> and there are minisocks which also require significantly less overhead
> to manage than full sockets (less memory etc.) 

That's true, but not enough, see below.

> When I wrote syncookies originally that all was not true.
> 
> > appliances running Linux that are not PC class means memory has fallen
> > not risen and CPU has been pretty level.
> > 
> > > So I don't think it makes much sense to add more code to it, sorry.
> > 
> > I think it makes a lot of sense 
> 
> I have my doubts. It would be probably better to recheck everything
> and then remove syncookies.
> 
> Also your sub PC class appliances rarely run LISTEN servers anyways
> that are open to the world.

In my tests, I discovered that in fact SYN cookies more benefit high
end machines than low-end ones. Let me explain.

I noticed that computing the cookie consumes a lot of CPU, which is a
real problem on low-end machines. But on the other end, it helps the
system continue to respond when otherwise it would not. My tests on
an AMD LX800 with max_syn_backlog at 63000 on an HTTP reverse proxy
consisted in injecting 250 hits/s of legitimate traffic with 8000 SYN/s
of noise.

Without SYN cookies, the average response time was about 1.5 second and
unstable (due to retransmits), and the CPU was set to 60%. With SYN
cookies enabled, the response time dropped to 12-15ms only, but CPU
usage jumped to 70%. The difference appears at a higher legitimate
traffic rate. At 500 hits/s + 7800 SYN/s, the CPU is just saturated
with correct response time (SYN backlog almost full but never full),
and the performance slightly goes down with SYN cookies enabled, inducing
a drop of the hit rate due to the increased CPU consumption.

Till there, one would conclude that SYN cookies are bad. BUT! this was
with tcp_synack_retries = 1, which is the optimal situation without
SYN cookies under an attack and which is pretty bad for normal usage.

The real problem without SYN cookies is that you are forced to support
a huge SYN backlog (eg: 2 million entries to sustain 100 Mbps of SYN).
And what happens with a large backlog ? You send a lot of retries for
each SYN. 5 by default, meaning 6 SYN-ACKs for 1 SYN. Thus, you become
a SYN amplifier and the guy in front of you just has to send you 20 Mbps
of traffic for you to saturate your 100 Mbps uplink.

Also, sending all those SYN-ACKs takes a huge amount of CPU time. With
tcp_synack_retries at 0, my machine received 26600 SYN/s, and returned
26600 SYN-ACK/s at 100% CPU. With tcp_synack_retries set to 4, it could
only accept 12900 SYN/s, replying with 51200 SYN-ACK/s. So the input
load was halfed and the output was doubled. I did not bother going higher.

The only solution against this is then to reduce tcp_synack_retries to
very low values (0 ideally, to match SYN cookies behaviour), but in this
case, you degrade normal traffic 100% of the time, while SYN cookies would
only trigger while you're already under attack.

My conclusions after those tests was to set tcp_synack_retries to a reasonable
value (1 to 3), and set the backlog to the number of half-open sessions your
machine can accumulate under a SYN attack without collapsing. You then enable
SYN cookies, and they will only trigger when you know that your machine will
not be able to sustain the increased load.

This solution permits you to accept normal connections when not under attack,
with an acceptable number of retransmits and with TCP options working well.
Under a moderate attack, the large backlog will still help you accept
legitimate connections with all comfort (sack, wscale, ...). Under a massive
attack, you will not send more than tcp_synack_retries*backlog packets per
tcp_synack_retries period, thus limiting the outbound traffic, plus 1 SYN-ACK
per incoming SYN, legitimate or not. At this stage, if your users have a
castrated TCP stack in front of them, that's not a problem because you know
that otherwise they would not even have been able to connect.

So in this regard, SYN cookies are really needed.

Last, I've read on DJB's page that SYN cookies do not break TCP beahaviour.
Yes they do. If the client waits for the server to talk first, you'd better
not lose the first ACK from the client because it will not get retransmitted,
and the client will see an ESTABLISHED connection while the server will have
nothing. Fortunately, most attack targets are HTTP and do not have this
problem.

For this reason, I think that SYN cookies should be activable by port or
simply by a setsockopt() from the application itself. Having them enabled
by default on the whole system with small backlogs is bad, having large
backlogs to cover attacks is bad, but having medium backlogs with SYN
cookies per application would be very useful.

Best regards,
Willy


^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Alan Cox @ 2008-02-05 21:25 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Andi Kleen, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <Pine.LNX.4.64.0802052056120.16441@fbirervta.pbzchgretzou.qr>

> >So I don't think it makes much sense to add more code to it, sorry.
> 
> Distributions should then probably deactivate it by default.
> SUSE 10.3 for example still has it enabled on default installs.

Even though I work the loyal opposition to SuSE I'd say SuSE 10.3 is
correct in having it enabled in the build.

Alan

^ permalink raw reply

* Re: [PATCH] Add IPv6 support to TCP SYN cookies
From: Alan Cox @ 2008-02-05 21:20 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: Andi Kleen, Glenn Griffin, netdev, linux-kernel
In-Reply-To: <20080205203911.GA9891@2ka.mipt.ru>

> How does syncookies prevent windows from growing?

Enabling them doesn't.

> Most (if not all) distributions have them enabled and window growing
> works just fine. Actually I do not see any reason why connection
> establishment handshake should prevent any run-time operations at all,
> even if it was setup during handshake.

Syncookies are only triggered if the system is under a load where it
would begin to lose connections otherwise. So they merely turn a DoS into
a working if slightly slower setup (and > 64K windows don't matter for
most normal users, especially on mobile devices).


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox