Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [net-next-2.6 PATCH] ixgbevf: Fix link speed display
From: David Miller @ 2010-04-27 21:36 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, gregory.v.rose
In-Reply-To: <20100427213143.25913.83381.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 14:31:45 -0700

> From: Greg Rose <gregory.v.rose@intel.com>
> 
> The ixgbevf driver would always report 10Gig speeds even when the link
> speed is downshifted to 1Gig.  This patch fixes that problem.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/3] bnx2: Fix lost MSI-X problem on 5709 NICs.
From: David Miller @ 2010-04-27 21:38 UTC (permalink / raw)
  To: mchan; +Cc: netdev, gospo, jfeeney
In-Reply-To: <1272403691-2934-1-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Tue, 27 Apr 2010 14:28:09 -0700

> It has been reported that under certain heavy traffic conditions in MSI-X
> mode, the driver can lose an MSI-X vector causing all packets in the
> associated rx/tx ring pair to be dropped.  The problem is caused by
> the chip dropping the write to unmask the MSI-X vector by the kernel
> (when migrating the IRQ for example).
> 
> This can be prevented by increasing the GRC timeout value for these
> register read and write operations.
> 
> Thanks to Dell for helping us debug this problem.
> 
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Applied to net-2.6

^ permalink raw reply

* Re: [PATCH 2/3] bnx2: Prevent "scheduling while atomic" warning with cnic, bonding and vlan.
From: David Miller @ 2010-04-27 21:38 UTC (permalink / raw)
  To: mchan; +Cc: netdev, gospo, jfeeney
In-Reply-To: <1272403691-2934-2-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Tue, 27 Apr 2010 14:28:10 -0700

> The bonding driver calls ndo_vlan_rx_register() while holding bond->lock.
> The bnx2 driver calls bnx2_netif_stop() to stop the rx handling while
> changing the vlgrp.  The call also stops the cnic driver which sleeps
> while the bond->lock is held and cause the warning.
> 
> This code path only needs to stop the NAPI rx handling while we are
> changing the vlgrp.  Since no reset is going to occur, there is no need
> to stop cnic in this case.  By adding a parameter to bnx2_netif_stop()
> to skip stopping cnic, we can avoid the warning.
> 
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Applied to net-2.6

^ permalink raw reply

* Re: [PATCH 3/3] bnx2: Update version to 2.0.9.
From: David Miller @ 2010-04-27 21:38 UTC (permalink / raw)
  To: mchan; +Cc: netdev, gospo, jfeeney
In-Reply-To: <1272403691-2934-3-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Tue, 27 Apr 2010 14:28:11 -0700

> Signed-off-by: Michael Chan <mchan@broadcom.com>

Applied to net-2.6

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: sk_add_backlog() take rmem_alloc into account
From: David Miller @ 2010-04-27 21:43 UTC (permalink / raw)
  To: eric.dumazet; +Cc: bmb, therbert, netdev, rick.jones2
In-Reply-To: <1272399662.2343.12.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 27 Apr 2010 22:21:02 +0200

> [PATCH net-next-2.6] net: sk_add_backlog() take rmem_alloc into account
> 
> Current socket backlog limit is not enough to really stop DDOS attacks,
> because user thread spend many time to process a full backlog each
> round, and user might crazy spin on socket lock.
> 
> We should add backlog size and receive_queue size (aka rmem_alloc) to
> pace writers, and let user run without being slow down too much.
> 
> Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
> stress situations.
> 
> Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
> receiver can now process ~200.000 pps (instead of ~100 pps before the
> patch) on a 8 core machine.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

This looks great, applied, thanks Eric!

^ permalink raw reply

* Re: [patch] ipheth: potential null dereferences on error path
From: L. Alberto Giménez @ 2010-04-27 21:43 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Diego Giagio, David S. Miller, netdev, kernel-janitors
In-Reply-To: <20100427092012.GA29093@bicker>

On Tue, Apr 27, 2010 at 11:20:12AM +0200, Dan Carpenter wrote:
> The calls to usb_free_buffer() dereference rx_urb and tx_urb in the
> parameter list but those could be NULL.
> 
> Signed-off-by: Dan Carpenter <error27@gmail.com>

Acked-by: L. Alberto Giménez <agimenez@sysvalve.es>

-- 
L. Alberto Giménez
JabberID agimenez@jabber.sysvalve.es
GnuPG key ID 0x3BAABDE1

^ permalink raw reply

* Re: [PATCH kernel 2.6.34-rc5] smc91c92_cs: spin_unlock_irqrestore before calling smc_interrupt()
From: David Miller @ 2010-04-27 21:47 UTC (permalink / raw)
  To: ken_kawasaki; +Cc: netdev
In-Reply-To: <20100425053709.ec182f63.ken_kawasaki@spring.nifty.jp>

From: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Date: Sun, 25 Apr 2010 05:37:09 +0900

> 
> smc91c92_cs:
>   * spin_unlock_irqrestore before calling smc_interrupt() in media_check()
>      to avoid lockup.
>   * use spin_lock_irqsave for ethtool function.
> 
> Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>

Applied, thank you.

^ permalink raw reply

* Re: [patch] ipheth: potential null dereferences on error path
From: David Miller @ 2010-04-27 21:49 UTC (permalink / raw)
  To: agimenez; +Cc: error27, diego, netdev, kernel-janitors
In-Reply-To: <20100427214347.GA2376@bart.evergreen.loc>

From: L. Alberto Giménez <agimenez@sysvalve.es>
Date: Tue, 27 Apr 2010 23:43:47 +0200

> On Tue, Apr 27, 2010 at 11:20:12AM +0200, Dan Carpenter wrote:
>> The calls to usb_free_buffer() dereference rx_urb and tx_urb in the
>> parameter list but those could be NULL.
>> 
>> Signed-off-by: Dan Carpenter <error27@gmail.com>
> 
> Acked-by: L. Alberto Giménez <agimenez@sysvalve.es>

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH v5] rfs: Receive Flow Steering
From: David Miller @ 2010-04-27 21:59 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1272271271.2346.16.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 26 Apr 2010 10:41:11 +0200

> Le lundi 19 avril 2010 à 14:19 -0700, David Miller a écrit :
> 
>> 
>> I was thinking also about how we could compute rxhash in the
>> loopback driver :-)
> 
> This would be easy if rxhash was not a "struct inet_sock" field but a
> "struct sock" one
> 
> sock_alloc_send_pskb() (or skb_set_owner_w())
> 
> skb->rxhash = sk->rxhash;

Agreed.  I'll commit the following to net-next-2.6 after some build
testing.

net: Make RFS socket operations not be inet specific.

Idea from Eric Dumazet.

As for placement inside of struct sock, I tried to choose a place
that otherwise has a 32-bit hole on 64-bit systems.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index c1d4295..1653de5 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -102,7 +102,6 @@ struct rtable;
  * @uc_ttl - Unicast TTL
  * @inet_sport - Source port
  * @inet_id - ID counter for DF pkts
- * @rxhash - flow hash received from netif layer
  * @tos - TOS
  * @mc_ttl - Multicasting TTL
  * @is_icsk - is this an inet_connection_sock?
@@ -126,9 +125,6 @@ struct inet_sock {
 	__u16			cmsg_flags;
 	__be16			inet_sport;
 	__u16			inet_id;
-#ifdef CONFIG_RPS
-	__u32			rxhash;
-#endif
 
 	struct ip_options	*opt;
 	__u8			tos;
@@ -224,37 +220,4 @@ static inline __u8 inet_sk_flowi_flags(const struct sock *sk)
 	return inet_sk(sk)->transparent ? FLOWI_FLAG_ANYSRC : 0;
 }
 
-static inline void inet_rps_record_flow(const struct sock *sk)
-{
-#ifdef CONFIG_RPS
-	struct rps_sock_flow_table *sock_flow_table;
-
-	rcu_read_lock();
-	sock_flow_table = rcu_dereference(rps_sock_flow_table);
-	rps_record_sock_flow(sock_flow_table, inet_sk(sk)->rxhash);
-	rcu_read_unlock();
-#endif
-}
-
-static inline void inet_rps_reset_flow(const struct sock *sk)
-{
-#ifdef CONFIG_RPS
-	struct rps_sock_flow_table *sock_flow_table;
-
-	rcu_read_lock();
-	sock_flow_table = rcu_dereference(rps_sock_flow_table);
-	rps_reset_sock_flow(sock_flow_table, inet_sk(sk)->rxhash);
-	rcu_read_unlock();
-#endif
-}
-
-static inline void inet_rps_save_rxhash(struct sock *sk, u32 rxhash)
-{
-#ifdef CONFIG_RPS
-	if (unlikely(inet_sk(sk)->rxhash != rxhash)) {
-		inet_rps_reset_flow(sk);
-		inet_sk(sk)->rxhash = rxhash;
-	}
-#endif
-}
 #endif	/* _INET_SOCK_H */
diff --git a/include/net/sock.h b/include/net/sock.h
index ef2f875..cf12b1e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -198,6 +198,7 @@ struct sock_common {
   *	@sk_rcvlowat: %SO_RCVLOWAT setting
   *	@sk_rcvtimeo: %SO_RCVTIMEO setting
   *	@sk_sndtimeo: %SO_SNDTIMEO setting
+  *	@sk_rxhash: flow hash received from netif layer
   *	@sk_filter: socket filtering instructions
   *	@sk_protinfo: private area, net family specific, when not using slab
   *	@sk_timer: sock cleanup timer
@@ -278,6 +279,9 @@ struct sock {
 	int			sk_gso_type;
 	unsigned int		sk_gso_max_size;
 	int			sk_rcvlowat;
+#ifdef CONFIG_RPS
+	__u32			sk_rxhash;
+#endif
 	unsigned long 		sk_flags;
 	unsigned long	        sk_lingertime;
 	struct sk_buff_head	sk_error_queue;
@@ -629,6 +633,40 @@ static inline int sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
 	return sk->sk_backlog_rcv(sk, skb);
 }
 
+static inline void sock_rps_record_flow(const struct sock *sk)
+{
+#ifdef CONFIG_RPS
+	struct rps_sock_flow_table *sock_flow_table;
+
+	rcu_read_lock();
+	sock_flow_table = rcu_dereference(rps_sock_flow_table);
+	rps_record_sock_flow(sock_flow_table, sk->sk_rxhash);
+	rcu_read_unlock();
+#endif
+}
+
+static inline void sock_rps_reset_flow(const struct sock *sk)
+{
+#ifdef CONFIG_RPS
+	struct rps_sock_flow_table *sock_flow_table;
+
+	rcu_read_lock();
+	sock_flow_table = rcu_dereference(rps_sock_flow_table);
+	rps_reset_sock_flow(sock_flow_table, sk->sk_rxhash);
+	rcu_read_unlock();
+#endif
+}
+
+static inline void sock_rps_save_rxhash(struct sock *sk, u32 rxhash)
+{
+#ifdef CONFIG_RPS
+	if (unlikely(sk->sk_rxhash != rxhash)) {
+		sock_rps_reset_flow(sk);
+		sk->sk_rxhash = rxhash;
+	}
+#endif
+}
+
 #define sk_wait_event(__sk, __timeo, __condition)			\
 	({	int __rc;						\
 		release_sock(__sk);					\
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 9f52880..c6c43bc 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -419,7 +419,7 @@ int inet_release(struct socket *sock)
 	if (sk) {
 		long timeout;
 
-		inet_rps_reset_flow(sk);
+		sock_rps_reset_flow(sk);
 
 		/* Applications forget to leave groups before exiting */
 		ip_mc_drop_socket(sk);
@@ -722,7 +722,7 @@ int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 {
 	struct sock *sk = sock->sk;
 
-	inet_rps_record_flow(sk);
+	sock_rps_record_flow(sk);
 
 	/* We may need to bind the socket. */
 	if (!inet_sk(sk)->inet_num && inet_autobind(sk))
@@ -737,7 +737,7 @@ static ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 {
 	struct sock *sk = sock->sk;
 
-	inet_rps_record_flow(sk);
+	sock_rps_record_flow(sk);
 
 	/* We may need to bind the socket. */
 	if (!inet_sk(sk)->inet_num && inet_autobind(sk))
@@ -755,7 +755,7 @@ int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int addr_len = 0;
 	int err;
 
-	inet_rps_record_flow(sk);
+	sock_rps_record_flow(sk);
 
 	err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
 				   flags & ~MSG_DONTWAIT, &addr_len);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 4d6717d..771f814 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1672,7 +1672,7 @@ process:
 
 	skb->dev = NULL;
 
-	inet_rps_save_rxhash(sk, skb->rxhash);
+	sock_rps_save_rxhash(sk, skb->rxhash);
 
 	bh_lock_sock_nested(sk);
 	ret = 0;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 776c844..63eb56b 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1217,7 +1217,7 @@ int udp_disconnect(struct sock *sk, int flags)
 	sk->sk_state = TCP_CLOSE;
 	inet->inet_daddr = 0;
 	inet->inet_dport = 0;
-	inet_rps_save_rxhash(sk, 0);
+	sock_rps_save_rxhash(sk, 0);
 	sk->sk_bound_dev_if = 0;
 	if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
 		inet_reset_saddr(sk);
@@ -1262,7 +1262,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	int rc;
 
 	if (inet_sk(sk)->inet_daddr)
-		inet_rps_save_rxhash(sk, skb->rxhash);
+		sock_rps_save_rxhash(sk, skb->rxhash);
 
 	rc = sock_queue_rcv_skb(sk, skb);
 	if (rc < 0) {

^ permalink raw reply related

* Re: [PATCH v6] net: batch skb dequeueing from softnet input_pkt_queue
From: David Miller @ 2010-04-27 22:08 UTC (permalink / raw)
  To: eric.dumazet; +Cc: xiaosuo, hadi, therbert, shemminger, netdev
In-Reply-To: <1272018366.7895.7930.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 23 Apr 2010 12:26:06 +0200

> Le vendredi 23 avril 2010 à 16:12 +0800, Changli Gao a écrit :
>> batch skb dequeueing from softnet input_pkt_queue.
>> 
>> batch skb dequeueing from softnet input_pkt_queue to reduce potential lock
>> contention when RPS is enabled.
>> 
>> Note: in the worst case, the number of packets in a softnet_data may be double
>> of netdev_max_backlog.
>> 
>> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
>> ----
> 
> Oops, reading it again, I found process_backlog() was still taking the
> lock twice, if only one packet is waiting in input_pkt_queue.
> 
> Possible fix, on top of your patch :

I've applied Changli's patch with this fixup added to it.

If there are any follow-on changes necessary after further analysis,
please send patches on top of this work.

Thanks.

^ permalink raw reply

* Re: [PATCH v5] rfs: Receive Flow Steering
From: Eric Dumazet @ 2010-04-27 22:08 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, netdev
In-Reply-To: <20100427.145904.67907652.davem@davemloft.net>

Le mardi 27 avril 2010 à 14:59 -0700, David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 26 Apr 2010 10:41:11 +0200
> 
> > Le lundi 19 avril 2010 à 14:19 -0700, David Miller a écrit :
> > 
> >> 
> >> I was thinking also about how we could compute rxhash in the
> >> loopback driver :-)
> > 
> > This would be easy if rxhash was not a "struct inet_sock" field but a
> > "struct sock" one
> > 
> > sock_alloc_send_pskb() (or skb_set_owner_w())
> > 
> > skb->rxhash = sk->rxhash;
> 
> Agreed.  I'll commit the following to net-next-2.6 after some build
> testing.
> 
> net: Make RFS socket operations not be inet specific.
> 
> Idea from Eric Dumazet.
> 
> As for placement inside of struct sock, I tried to choose a place
> that otherwise has a 32-bit hole on 64-bit systems.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

I tested same patch today (plus the skb->rxhash = sk->sk_rxhash) and got
a very small speedup on my Nehalem machine, where get_rps_cpus() was
using 1 % of cpu, now 0.25 %, on a tbench.




^ permalink raw reply

* Re: [PATCH v5] rfs: Receive Flow Steering
From: David Miller @ 2010-04-27 22:10 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1272406126.2343.17.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Apr 2010 00:08:46 +0200

> Le mardi 27 avril 2010 à 14:59 -0700, David Miller a écrit :
>> net: Make RFS socket operations not be inet specific.
>> 
>> Idea from Eric Dumazet.
>> 
>> As for placement inside of struct sock, I tried to choose a place
>> that otherwise has a 32-bit hole on 64-bit systems.
>> 
>> Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> I tested same patch today (plus the skb->rxhash = sk->sk_rxhash) and got
> a very small speedup on my Nehalem machine, where get_rps_cpus() was
> using 1 % of cpu, now 0.25 %, on a tbench.

Great, I've added your ACK.

Thanks!

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: sk_add_backlog() take rmem_alloc into account
From: David Miller @ 2010-04-27 22:11 UTC (permalink / raw)
  To: eric.dumazet; +Cc: bmb, therbert, netdev, rick.jones2
In-Reply-To: <20100427.144307.22037530.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Tue, 27 Apr 2010 14:43:07 -0700 (PDT)

> This looks great, applied, thanks Eric!

I have to fix this one up, for example SCTP stopped building because it
still refers to the ->limit member you removed.

^ permalink raw reply

* [PATCH net-next-2.6] bnx2x: Remove two prefetch()
From: Eric Dumazet @ 2010-04-27 22:18 UTC (permalink / raw)
  To: David Miller
  Cc: xiaosuo, hadi, therbert, shemminger, netdev, Eilon Greenstein
In-Reply-To: <20100427.150817.84390202.davem@davemloft.net>

Le mardi 27 avril 2010 à 15:08 -0700, David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 23 Apr 2010 12:26:06 +0200
> 
> > Le vendredi 23 avril 2010 à 16:12 +0800, Changli Gao a écrit :
> >> batch skb dequeueing from softnet input_pkt_queue.
> >> 
> >> batch skb dequeueing from softnet input_pkt_queue to reduce potential lock
> >> contention when RPS is enabled.
> >> 
> >> Note: in the worst case, the number of packets in a softnet_data may be double
> >> of netdev_max_backlog.
> >> 
> >> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> >> ----
> > 
> > Oops, reading it again, I found process_backlog() was still taking the
> > lock twice, if only one packet is waiting in input_pkt_queue.
> > 
> > Possible fix, on top of your patch :
> 
> I've applied Changli's patch with this fixup added to it.
> 
> If there are any follow-on changes necessary after further analysis,
> please send patches on top of this work.
> 

Thanks David, I was about to resubmit the cumulative patch ;)

On my 'old' dev machine (two quad core), RPS is able to get a 300%
increase on udpsink test on 20 flows.

I yet have to make routing/firewalling tests as well.

I also noticed bnx2x driver has some strange prefetch() calls.

[PATCH net-next-2.6] bnx2x: Remove two prefetch()

1) Even on 64bit arches, sizeof(struct sk_buff) < 256
2) No need to prefetch same pointer twice.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Eilon Greenstein <eilong@broadcom.com>
---

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 613f727..f706ed1 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -1617,7 +1617,6 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 			rx_buf = &fp->rx_buf_ring[bd_cons];
 			skb = rx_buf->skb;
 			prefetch(skb);
-			prefetch((u8 *)skb + 256);
 			len = le16_to_cpu(cqe->fast_path_cqe.pkt_len);
 			pad = cqe->fast_path_cqe.placement_offset;
 
@@ -1668,7 +1667,6 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 					dma_unmap_addr(rx_buf, mapping),
 						   pad + RX_COPY_THRESH,
 						   DMA_FROM_DEVICE);
-			prefetch(skb);
 			prefetch(((char *)(skb)) + 128);
 
 			/* is this an error packet? */




^ permalink raw reply related

* Re: [PATCH net-next-2.6] bnx2x: Remove two prefetch()
From: David Miller @ 2010-04-27 22:19 UTC (permalink / raw)
  To: eric.dumazet; +Cc: xiaosuo, hadi, therbert, shemminger, netdev, eilong
In-Reply-To: <1272406693.2343.26.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Apr 2010 00:18:13 +0200

> [PATCH net-next-2.6] bnx2x: Remove two prefetch()
> 
> 1) Even on 64bit arches, sizeof(struct sk_buff) < 256
> 2) No need to prefetch same pointer twice.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Eilon Greenstein <eilong@broadcom.com>

Eilon please review and ACK/NACK

Thanks.

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: sk_add_backlog() take rmem_alloc into account
From: Eric Dumazet @ 2010-04-27 22:19 UTC (permalink / raw)
  To: David Miller; +Cc: bmb, therbert, netdev, rick.jones2
In-Reply-To: <20100427.151103.146348463.davem@davemloft.net>

Le mardi 27 avril 2010 à 15:11 -0700, David Miller a écrit :
> From: David Miller <davem@davemloft.net>
> Date: Tue, 27 Apr 2010 14:43:07 -0700 (PDT)
> 
> > This looks great, applied, thanks Eric!
> 
> I have to fix this one up, for example SCTP stopped building because it
> still refers to the ->limit member you removed.
> --

Hmm, sorry for this, thanks !



^ permalink raw reply

* Re: [v4 Patch 1/3] netpoll: add generic support for bridge and bonding devices
From: David Miller @ 2010-04-27 22:22 UTC (permalink / raw)
  To: amwang
  Cc: linux-kernel, mpm, netdev, bridge, gospo, nhorman, jmoyer,
	shemminger, bonding-devel, fubar
In-Reply-To: <20100427075937.4908.18468.sendpatchset@localhost.localdomain>

From: Amerigo Wang <amwang@redhat.com>
Date: Tue, 27 Apr 2010 03:55:41 -0400

> +	if (ndev->priv_flags & IFF_DISABLE_NETPOLL
> +			|| !ndev->netdev_ops->ndo_poll_controller) {

" ||" goes on first line, not second, and second line needs to be indented
properly so that "!ndev->..." matches up with "ndev->priv_flags ..." on
the previous line.

^ permalink raw reply

* Re: [v4 Patch 2/3] bridge: make bridge support netpoll
From: David Miller @ 2010-04-27 22:23 UTC (permalink / raw)
  To: amwang
  Cc: linux-kernel, shemminger, netdev, bridge, gospo, nhorman, jmoyer,
	mpm, bonding-devel, fubar
In-Reply-To: <20100427075952.4908.60515.sendpatchset@localhost.localdomain>

From: Amerigo Wang <amwang@redhat.com>
Date: Tue, 27 Apr 2010 03:55:56 -0400

> +		if (p->dev->priv_flags & IFF_DISABLE_NETPOLL
> +				|| !p->dev->netdev_ops->ndo_poll_controller)

"||" goes on first line, and indentation on second line is incorrect.
See my comments from patch #1.

^ permalink raw reply

* Re: [v4 Patch 3/3] bonding: make bonding support netpoll
From: David Miller @ 2010-04-27 22:24 UTC (permalink / raw)
  To: amwang
  Cc: linux-kernel, mpm, netdev, bridge, gospo, nhorman, jmoyer,
	shemminger, bonding-devel, fubar
In-Reply-To: <20100427080004.4908.27339.sendpatchset@localhost.localdomain>

From: Amerigo Wang <amwang@redhat.com>
Date: Tue, 27 Apr 2010 03:56:09 -0400

> +		if ((slave->dev->priv_flags & IFF_DISABLE_NETPOLL)
> +				|| !slave->dev->netdev_ops->ndo_poll_controller)

"|| on first line please, plus fix second line's indentation as per
comments given in patch #1 and #2

^ permalink raw reply

* [PATCH net-next] cxgb4: set skb->rxhash
From: Dimitris Michailidis @ 2010-04-27 22:20 UTC (permalink / raw)
  To: netdev; +Cc: Dimitris Michailidis

Implement the ->set_flags ethtool method to control NETIF_F_RXHASH and
set skb->rxhash to the HW calculated hash accordingly.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
---
 drivers/net/cxgb4/cxgb4_main.c |   15 ++++++++++++++-
 drivers/net/cxgb4/sge.c        |    7 ++++++-
 drivers/net/cxgb4/t4_msg.h     |    1 +
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb4/cxgb4_main.c b/drivers/net/cxgb4/cxgb4_main.c
index 5f582db..1bad500 100644
--- a/drivers/net/cxgb4/cxgb4_main.c
+++ b/drivers/net/cxgb4/cxgb4_main.c
@@ -1711,6 +1711,18 @@ static int set_tso(struct net_device *dev, u32 value)
 	return 0;
 }
 
+static int set_flags(struct net_device *dev, u32 flags)
+{
+	if (flags & ~ETH_FLAG_RXHASH)
+		return -EOPNOTSUPP;
+
+	if (flags & ETH_FLAG_RXHASH)
+		dev->features |= NETIF_F_RXHASH;
+	else
+		dev->features &= ~NETIF_F_RXHASH;
+	return 0;
+}
+
 static struct ethtool_ops cxgb_ethtool_ops = {
 	.get_settings      = get_settings,
 	.set_settings      = set_settings,
@@ -1741,6 +1753,7 @@ static struct ethtool_ops cxgb_ethtool_ops = {
 	.get_wol           = get_wol,
 	.set_wol           = set_wol,
 	.set_tso           = set_tso,
+	.set_flags         = set_flags,
 	.flash_device      = set_flash,
 };
 
@@ -3203,7 +3216,7 @@ static int __devinit init_one(struct pci_dev *pdev,
 
 		netdev->features |= NETIF_F_SG | NETIF_F_TSO | NETIF_F_TSO6;
 		netdev->features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
-		netdev->features |= NETIF_F_GRO | highdma;
+		netdev->features |= NETIF_F_GRO | NETIF_F_RXHASH | highdma;
 		netdev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
 		netdev->vlan_features = netdev->features & VLAN_FEAT;
 
diff --git a/drivers/net/cxgb4/sge.c b/drivers/net/cxgb4/sge.c
index 65d91c4..e804ffc 100644
--- a/drivers/net/cxgb4/sge.c
+++ b/drivers/net/cxgb4/sge.c
@@ -1524,6 +1524,8 @@ static void do_gro(struct sge_eth_rxq *rxq, const struct pkt_gl *gl,
 	skb->truesize += skb->data_len;
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
 	skb_record_rx_queue(skb, rxq->rspq.idx);
+	if (rxq->rspq.netdev->features & NETIF_F_RXHASH)
+		skb->rxhash = ntohl(pkt->rsshdr.hash_val);
 
 	if (unlikely(pkt->vlan_ex)) {
 		struct port_info *pi = netdev_priv(rxq->rspq.netdev);
@@ -1565,7 +1567,7 @@ int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 	if (unlikely(*(u8 *)rsp == CPL_TRACE_PKT))
 		return handle_trace_pkt(q->adap, si);
 
-	pkt = (void *)&rsp[1];
+	pkt = (const struct cpl_rx_pkt *)rsp;
 	csum_ok = pkt->csum_calc && !pkt->err_vec;
 	if ((pkt->l2info & htonl(RXF_TCP)) &&
 	    (q->netdev->features & NETIF_F_GRO) && csum_ok && !pkt->ip_frag) {
@@ -1583,6 +1585,9 @@ int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 	__skb_pull(skb, RX_PKT_PAD);      /* remove ethernet header padding */
 	skb->protocol = eth_type_trans(skb, q->netdev);
 	skb_record_rx_queue(skb, q->idx);
+	if (skb->dev->features & NETIF_F_RXHASH)
+		skb->rxhash = ntohl(pkt->rsshdr.hash_val);
+
 	pi = netdev_priv(skb->dev);
 	rxq->stats.pkts++;
 
diff --git a/drivers/net/cxgb4/t4_msg.h b/drivers/net/cxgb4/t4_msg.h
index fdb1174..7a981b8 100644
--- a/drivers/net/cxgb4/t4_msg.h
+++ b/drivers/net/cxgb4/t4_msg.h
@@ -503,6 +503,7 @@ struct cpl_rx_data_ack {
 };
 
 struct cpl_rx_pkt {
+	struct rss_header rsshdr;
 	u8 opcode;
 #if defined(__LITTLE_ENDIAN_BITFIELD)
 	u8 iff:4;
-- 
1.5.4


^ permalink raw reply related

* [PATCH net-next 1/2] cxgb4: parse the VPD instead of relying on a static VPD layout
From: Dimitris Michailidis @ 2010-04-27 22:24 UTC (permalink / raw)
  To: netdev; +Cc: Dimitris Michailidis

Some boards' VPDs contain additional keywords or have longer serial numbers,
meaning the keyword locations are variable.  Ditch the static layout and
use the pci_vpd_* family of functions to parse the VPD instead.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
---
 drivers/net/cxgb4/t4_hw.c |   64 +++++++++++++++++++++++++++-----------------
 1 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/drivers/net/cxgb4/t4_hw.c b/drivers/net/cxgb4/t4_hw.c
index cadead5..2923dd4 100644
--- a/drivers/net/cxgb4/t4_hw.c
+++ b/drivers/net/cxgb4/t4_hw.c
@@ -347,33 +347,21 @@ int t4_edc_read(struct adapter *adap, int idx, u32 addr, __be32 *data, u64 *ecc)
 	return 0;
 }
 
-#define VPD_ENTRY(name, len) \
-	u8 name##_kword[2]; u8 name##_len; u8 name##_data[len]
-
 /*
  * Partial EEPROM Vital Product Data structure.  Includes only the ID and
- * VPD-R sections.
+ * VPD-R header.
  */
-struct t4_vpd {
+struct t4_vpd_hdr {
 	u8  id_tag;
 	u8  id_len[2];
 	u8  id_data[ID_LEN];
 	u8  vpdr_tag;
 	u8  vpdr_len[2];
-	VPD_ENTRY(pn, 16);                     /* part number */
-	VPD_ENTRY(ec, EC_LEN);                 /* EC level */
-	VPD_ENTRY(sn, SERNUM_LEN);             /* serial number */
-	VPD_ENTRY(na, 12);                     /* MAC address base */
-	VPD_ENTRY(port_type, 8);               /* port types */
-	VPD_ENTRY(gpio, 14);                   /* GPIO usage */
-	VPD_ENTRY(cclk, 6);                    /* core clock */
-	VPD_ENTRY(port_addr, 8);               /* port MDIO addresses */
-	VPD_ENTRY(rv, 1);                      /* csum */
-	u32 pad;                  /* for multiple-of-4 sizing and alignment */
 };
 
 #define EEPROM_STAT_ADDR   0x7bfc
 #define VPD_BASE           0
+#define VPD_LEN            512
 
 /**
  *	t4_seeprom_wp - enable/disable EEPROM write protection
@@ -398,16 +386,36 @@ int t4_seeprom_wp(struct adapter *adapter, bool enable)
  */
 static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
 {
-	int ret;
-	struct t4_vpd vpd;
-	u8 *q = (u8 *)&vpd, csum;
+	int i, ret;
+	int ec, sn, v2;
+	u8 vpd[VPD_LEN], csum;
+	unsigned int vpdr_len;
+	const struct t4_vpd_hdr *v;
 
-	ret = pci_read_vpd(adapter->pdev, VPD_BASE, sizeof(vpd), &vpd);
+	ret = pci_read_vpd(adapter->pdev, VPD_BASE, sizeof(vpd), vpd);
 	if (ret < 0)
 		return ret;
 
-	for (csum = 0; q <= vpd.rv_data; q++)
-		csum += *q;
+	v = (const struct t4_vpd_hdr *)vpd;
+	vpdr_len = pci_vpd_lrdt_size(&v->vpdr_tag);
+	if (vpdr_len + sizeof(struct t4_vpd_hdr) > VPD_LEN) {
+		dev_err(adapter->pdev_dev, "bad VPD-R length %u\n", vpdr_len);
+		return -EINVAL;
+	}
+
+#define FIND_VPD_KW(var, name) do { \
+	var = pci_vpd_find_info_keyword(&v->id_tag, sizeof(struct t4_vpd_hdr), \
+					vpdr_len, name); \
+	if (var < 0) { \
+		dev_err(adapter->pdev_dev, "missing VPD keyword " name "\n"); \
+		return -EINVAL; \
+	} \
+	var += PCI_VPD_INFO_FLD_HDR_SIZE; \
+} while (0)
+
+	FIND_VPD_KW(i, "RV");
+	for (csum = 0; i >= 0; i--)
+		csum += vpd[i];
 
 	if (csum) {
 		dev_err(adapter->pdev_dev,
@@ -415,12 +423,18 @@ static int get_vpd_params(struct adapter *adapter, struct vpd_params *p)
 		return -EINVAL;
 	}
 
-	p->cclk = simple_strtoul(vpd.cclk_data, NULL, 10);
-	memcpy(p->id, vpd.id_data, sizeof(vpd.id_data));
+	FIND_VPD_KW(ec, "EC");
+	FIND_VPD_KW(sn, "SN");
+	FIND_VPD_KW(v2, "V2");
+#undef FIND_VPD_KW
+
+	p->cclk = simple_strtoul(vpd + v2, NULL, 10);
+	memcpy(p->id, v->id_data, ID_LEN);
 	strim(p->id);
-	memcpy(p->ec, vpd.ec_data, sizeof(vpd.ec_data));
+	memcpy(p->ec, vpd + ec, EC_LEN);
 	strim(p->ec);
-	memcpy(p->sn, vpd.sn_data, sizeof(vpd.sn_data));
+	i = pci_vpd_info_field_size(vpd + sn - PCI_VPD_INFO_FLD_HDR_SIZE);
+	memcpy(p->sn, vpd + sn, min(i, SERNUM_LEN));
 	strim(p->sn);
 	return 0;
 }
-- 
1.5.4


^ permalink raw reply related

* [PATCH net-next 2/2] cxgb4: increase serial number length
From: Dimitris Michailidis @ 2010-04-27 22:24 UTC (permalink / raw)
  To: netdev; +Cc: Dimitris Michailidis
In-Reply-To: <1272407056-11699-1-git-send-email-dm@chelsio.com>

Some boards have longer serial numbers in their VPD, up to 24 bytes.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
---
 drivers/net/cxgb4/cxgb4.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb4/cxgb4.h b/drivers/net/cxgb4/cxgb4.h
index 4b35dc7..8856a75 100644
--- a/drivers/net/cxgb4/cxgb4.h
+++ b/drivers/net/cxgb4/cxgb4.h
@@ -53,7 +53,7 @@
 
 enum {
 	MAX_NPORTS = 4,     /* max # of ports */
-	SERNUM_LEN = 16,    /* Serial # length */
+	SERNUM_LEN = 24,    /* Serial # length */
 	EC_LEN     = 16,    /* E/C length */
 	ID_LEN     = 16,    /* ID length */
 };
-- 
1.5.4


^ permalink raw reply related

* Re: [PATCH 1/3] ptp: Added a brand new class driver for ptp clocks.
From: Randy Dunlap @ 2010-04-27 22:32 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev
In-Reply-To: <20100427091405.GA5098@riccoc20.at.omicron.at>

On Tue, 27 Apr 2010 11:14:05 +0200 Richard Cochran wrote:

Hi,

How do I use the testptp.mk file?

$ cd Documentation/ptp
$ make -f *.mk
gcc -Wall -I/usr/include   -c -o testptp.o testptp.c
testptp.c:33:29: error: linux/ptp_clock.h: No such file or directory



> diff --git a/Documentation/ptp/testptp.mk b/Documentation/ptp/testptp.mk
> new file mode 100644
> index 0000000..4ef2d97
> --- /dev/null
> +++ b/Documentation/ptp/testptp.mk
> @@ -0,0 +1,33 @@
> +# PTP 1588 clock support - User space test program
> +#
> +# Copyright (C) 2010 OMICRON electronics GmbH
> +#
> +#  This program is free software; you can redistribute it and/or modify
> +#  it under the terms of the GNU General Public License as published by
> +#  the Free Software Foundation; either version 2 of the License, or
> +#  (at your option) any later version.
> +#
> +#  This program is distributed in the hope that it will be useful,
> +#  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +#  GNU General Public License for more details.
> +#
> +#  You should have received a copy of the GNU General Public License
> +#  along with this program; if not, write to the Free Software
> +#  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> +
> +CC        = $(CROSS_COMPILE)gcc
> +INC       = -I$(KBUILD_OUTPUT)/usr/include
> +CFLAGS    = -Wall $(INC)
> +LDLIBS    = -lrt
> +PROGS     = testptp
> +
> +all: $(PROGS)
> +
> +testptp: testptp.o
> +
> +clean:
> +	rm -f testptp.o
> +
> +distclean: clean
> +	rm -f $(PROGS)
> diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
> new file mode 100644
> index 0000000..458d0fe
> --- /dev/null
> +++ b/drivers/ptp/Kconfig
> @@ -0,0 +1,26 @@
> +#
> +# PTP clock support configuration
> +#
> +
> +menu "PTP clock support"
> +
> +config PTP_1588_CLOCK
> +	tristate "PTP clock support"
> +	depends on EXPERIMENTAL
> +	help
> +	  The IEEE 1588 standard defines a method to precisely
> +	  synchronize distributed clocks over Ethernet networks. The
> +	  standard defines a Precision Time Protocol (PTP), which can
> +	  be used to achieve synchronization within a few dozen
> +	  microseconds. In addition, with the help of special hardware
> +	  time stamping units, it can be possible to achieve
> +	  synchronization to within a few hundred nanoseconds.
> +
> +	  This driver adds support for PTP clocks as character
> +	  devices. If you want to use a PTP clock, then you should
> +	  also enable at least one clock driver as well.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called ptp_clock.ko.

Drop the ".ko".  We normally don't include that part of the module name.

> +
> +endmenu

> diff --git a/include/linux/Kbuild b/include/linux/Kbuild
> index 2fc8e14..2d616cb 100644
> --- a/include/linux/Kbuild
> +++ b/include/linux/Kbuild
> @@ -318,6 +318,7 @@ unifdef-y += poll.h
>  unifdef-y += ppp_defs.h
>  unifdef-y += ppp-comp.h
>  unifdef-y += pps.h
> +unifdef-y += ptp_clock.h
>  unifdef-y += ptrace.h
>  unifdef-y += quota.h
>  unifdef-y += random.h

I think that the Kbuild file also needs this line:
header-y += ptp_clock.h

so that builds that use O=objdir will work, but even with that
change, I couldn't get it to work.  (?)

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: set skb->rxhash
From: Stephen Hemminger @ 2010-04-27 22:36 UTC (permalink / raw)
  To: Dimitris Michailidis; +Cc: netdev
In-Reply-To: <1272406825-11615-1-git-send-email-dm@chelsio.com>

On Tue, 27 Apr 2010 15:20:25 -0700
Dimitris Michailidis <dm@chelsio.com> wrote:

> Implement the ->set_flags ethtool method to control NETIF_F_RXHASH and
> set skb->rxhash to the HW calculated hash accordingly.
> 
> Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
> ---
>  drivers/net/cxgb4/cxgb4_main.c |   15 ++++++++++++++-
>  drivers/net/cxgb4/sge.c        |    7 ++++++-
>  drivers/net/cxgb4/t4_msg.h     |    1 +
>  3 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/cxgb4/cxgb4_main.c b/drivers/net/cxgb4/cxgb4_main.c
> index 5f582db..1bad500 100644
> --- a/drivers/net/cxgb4/cxgb4_main.c
> +++ b/drivers/net/cxgb4/cxgb4_main.c
> @@ -1711,6 +1711,18 @@ static int set_tso(struct net_device *dev, u32 value)
>  	return 0;
>  }
>  
> +static int set_flags(struct net_device *dev, u32 flags)
> +{
> +	if (flags & ~ETH_FLAG_RXHASH)
> +		return -EOPNOTSUPP;
> +
> +	if (flags & ETH_FLAG_RXHASH)
> +		dev->features |= NETIF_F_RXHASH;
> +	else
> +		dev->features &= ~NETIF_F_RXHASH;
> +	return 0;
> +}

You need to check for and reject other values NTUPLE and LRO

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: set skb->rxhash
From: Dimitris Michailidis @ 2010-04-27 22:39 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20100427153656.7bcaba83@nehalam>

Stephen Hemminger wrote:
> On Tue, 27 Apr 2010 15:20:25 -0700
> Dimitris Michailidis <dm@chelsio.com> wrote:
> 
>> Implement the ->set_flags ethtool method to control NETIF_F_RXHASH and
>> set skb->rxhash to the HW calculated hash accordingly.
>>
>> Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
>> ---
>>  drivers/net/cxgb4/cxgb4_main.c |   15 ++++++++++++++-
>>  drivers/net/cxgb4/sge.c        |    7 ++++++-
>>  drivers/net/cxgb4/t4_msg.h     |    1 +
>>  3 files changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/cxgb4/cxgb4_main.c b/drivers/net/cxgb4/cxgb4_main.c
>> index 5f582db..1bad500 100644
>> --- a/drivers/net/cxgb4/cxgb4_main.c
>> +++ b/drivers/net/cxgb4/cxgb4_main.c
>> @@ -1711,6 +1711,18 @@ static int set_tso(struct net_device *dev, u32 value)
>>  	return 0;
>>  }
>>  
>> +static int set_flags(struct net_device *dev, u32 flags)
>> +{
>> +	if (flags & ~ETH_FLAG_RXHASH)
>> +		return -EOPNOTSUPP;
>> +
>> +	if (flags & ETH_FLAG_RXHASH)
>> +		dev->features |= NETIF_F_RXHASH;
>> +	else
>> +		dev->features &= ~NETIF_F_RXHASH;
>> +	return 0;
>> +}
> 
> You need to check for and reject other values NTUPLE and LRO

The first if does that, doesn't it?

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox