netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum
@ 2007-11-30  3:09 Wang Chen
  2007-11-30  3:15 ` [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro Wang Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Wang Chen @ 2007-11-30  3:09 UTC (permalink / raw)
  To: herbert; +Cc: davem, andi, Wang Chen, netdev, gerrit

From:  Wang Chen <wangchen@cn.fujitsu.com>

Thanks dave, herbert, gerrit, andi and other people for your
discussion about this problem.

UdpInDatagrams can be confusing because it counts packets that 
might be dropped later.
Move UdpInDatagrams into recvmsg() as allowed by the RFC.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
---
 Documentation/networking/udplite.txt |    2 +-
 net/ipv4/udp.c                       |    7 +++----
 net/ipv6/udp.c                       |    4 +++-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff -Nurp linux-2.6.24.rc3.org/Documentation/networking/udplite.txt linux-2.6.24.rc3/Documentation/networking/udplite.txt
--- linux-2.6.24.rc3.org/Documentation/networking/udplite.txt	2007-11-19 12:37:40.000000000 +0800
+++ linux-2.6.24.rc3/Documentation/networking/udplite.txt	2007-11-30 09:43:08.000000000 +0800
@@ -236,7 +236,7 @@
 
   This displays UDP-Lite statistics variables, whose meaning is as follows.
 
-   InDatagrams:     Total number of received datagrams.
+   InDatagrams:     The total number of datagrams delivered to users.
 
    NoPorts:         Number of packets received to an unknown port.
                     These cases are counted separately (not as InErrors).
diff -Nurp linux-2.6.24.rc3.org/net/ipv4/udp.c linux-2.6.24.rc3/net/ipv4/udp.c
--- linux-2.6.24.rc3.org/net/ipv4/udp.c	2007-11-19 12:38:14.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv4/udp.c	2007-11-30 09:48:21.000000000 +0800
@@ -873,6 +873,8 @@ try_again:
 	if (err)
 		goto out_free;
 
+	UDP_INC_STATS_BH(UDP_MIB_INDATAGRAMS, is_udplite);
+
 	sock_recv_timestamp(msg, sk, skb);
 
 	/* Copy the address. */
@@ -966,10 +968,8 @@ int udp_queue_rcv_skb(struct sock * sk, 
 			int ret;
 
 			ret = (*up->encap_rcv)(sk, skb);
-			if (ret <= 0) {
-				UDP_INC_STATS_BH(UDP_MIB_INDATAGRAMS, up->pcflag);
+			if (ret <= 0)
 				return -ret;
-			}
 		}
 
 		/* FALLTHROUGH -- it's a UDP Packet */
@@ -1023,7 +1023,6 @@ int udp_queue_rcv_skb(struct sock * sk, 
 		goto drop;
 	}
 
-	UDP_INC_STATS_BH(UDP_MIB_INDATAGRAMS, up->pcflag);
 	return 0;
 
 drop:
diff -Nurp linux-2.6.24.rc3.org/net/ipv6/udp.c linux-2.6.24.rc3/net/ipv6/udp.c
--- linux-2.6.24.rc3.org/net/ipv6/udp.c	2007-11-19 12:38:14.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv6/udp.c	2007-11-30 09:52:52.000000000 +0800
@@ -164,6 +164,8 @@ try_again:
 	if (err)
 		goto out_free;
 
+	UDP6_INC_STATS_BH(UDP_MIB_INDATAGRAMS, is_udplite);
+
 	sock_recv_timestamp(msg, sk, skb);
 
 	/* Copy the address. */
@@ -292,7 +294,7 @@ int udpv6_queue_rcv_skb(struct sock * sk
 			UDP6_INC_STATS_BH(UDP_MIB_RCVBUFERRORS, up->pcflag);
 		goto drop;
 	}
-	UDP6_INC_STATS_BH(UDP_MIB_INDATAGRAMS, up->pcflag);
+
 	return 0;
 drop:
 	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, up->pcflag);


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro
  2007-11-30  3:09 [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Wang Chen
@ 2007-11-30  3:15 ` Wang Chen
  2007-11-30 11:05   ` Gerrit Renker
  2007-11-30  3:24 ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Wang Chen
  2007-11-30 10:57 ` [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Gerrit Renker
  2 siblings, 1 reply; 26+ messages in thread
From: Wang Chen @ 2007-11-30  3:15 UTC (permalink / raw)
  To: herbert; +Cc: davem, andi, netdev, gerrit, Wang Chen

(This patch base on "PATCH 1/3".)

Since we have macro IS_UDPLITE, we can use it.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
---
 ipv4/udp.c |   19 +++++++++++--------
 ipv6/udp.c |   14 ++++++++------
 2 files changed, 19 insertions(+), 14 deletions(-)

diff -Nurp linux-2.6.24.rc3.org/net/ipv4/udp.c linux-2.6.24.rc3/net/ipv4/udp.c
--- linux-2.6.24.rc3.org/net/ipv4/udp.c	2007-11-30 10:07:48.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv4/udp.c	2007-11-30 10:18:49.000000000 +0800
@@ -471,6 +471,7 @@ static int udp_push_pending_frames(struc
 	struct sk_buff *skb;
 	struct udphdr *uh;
 	int err = 0;
+	int is_udplite = IS_UDPLITE(sk);
 	__wsum csum = 0;
 
 	/* Grab the skbuff where UDP header space exists. */
@@ -486,7 +487,7 @@ static int udp_push_pending_frames(struc
 	uh->len = htons(up->len);
 	uh->check = 0;
 
-	if (up->pcflag)  				 /*     UDP-Lite      */
+	if (is_udplite)  				 /*     UDP-Lite      */
 		csum  = udplite_csum_outgoing(sk, skb);
 
 	else if (sk->sk_no_check == UDP_CSUM_NOXMIT) {   /* UDP csum disabled */
@@ -514,7 +515,7 @@ out:
 	up->len = 0;
 	up->pending = 0;
 	if (!err)
-		UDP_INC_STATS_USER(UDP_MIB_OUTDATAGRAMS, up->pcflag);
+		UDP_INC_STATS_USER(UDP_MIB_OUTDATAGRAMS, is_udplite);
 	return err;
 }
 
@@ -531,7 +532,7 @@ int udp_sendmsg(struct kiocb *iocb, stru
 	__be32 daddr, faddr, saddr;
 	__be16 dport;
 	u8  tos;
-	int err, is_udplite = up->pcflag;
+	int err, is_udplite = IS_UDPLITE(sk);
 	int corkreq = up->corkflag || msg->msg_flags&MSG_MORE;
 	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
 
@@ -942,6 +943,7 @@ int udp_queue_rcv_skb(struct sock * sk, 
 {
 	struct udp_sock *up = udp_sk(sk);
 	int rc;
+	int is_udplite = IS_UDPLITE(sk);
 
 	/*
 	 *	Charge it to the socket, dropping if the queue is full.
@@ -978,7 +980,7 @@ int udp_queue_rcv_skb(struct sock * sk, 
 	/*
 	 * 	UDP-Lite specific tests, ignored on UDP sockets
 	 */
-	if ((up->pcflag & UDPLITE_RECV_CC)  &&  UDP_SKB_CB(skb)->partial_cov) {
+	if ((is_udplite & UDPLITE_RECV_CC)  &&  UDP_SKB_CB(skb)->partial_cov) {
 
 		/*
 		 * MIB statistics other than incrementing the error count are
@@ -1019,14 +1021,14 @@ int udp_queue_rcv_skb(struct sock * sk, 
 	if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) {
 		/* Note that an ENOMEM error is charged twice */
 		if (rc == -ENOMEM)
-			UDP_INC_STATS_BH(UDP_MIB_RCVBUFERRORS, up->pcflag);
+			UDP_INC_STATS_BH(UDP_MIB_RCVBUFERRORS, is_udplite);
 		goto drop;
 	}
 
 	return 0;
 
 drop:
-	UDP_INC_STATS_BH(UDP_MIB_INERRORS, up->pcflag);
+	UDP_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
 	kfree_skb(skb);
 	return -1;
 }
@@ -1235,6 +1237,7 @@ int udp_lib_setsockopt(struct sock *sk, 
 	struct udp_sock *up = udp_sk(sk);
 	int val;
 	int err = 0;
+	int is_udplite = IS_UDPLITE(sk);
 
 	if (optlen<sizeof(int))
 		return -EINVAL;
@@ -1276,7 +1279,7 @@ int udp_lib_setsockopt(struct sock *sk, 
 	/* The sender sets actual checksum coverage length via this option.
 	 * The case coverage > packet length is handled by send module. */
 	case UDPLITE_SEND_CSCOV:
-		if (!up->pcflag)         /* Disable the option on UDP sockets */
+		if (!is_udplite)         /* Disable the option on UDP sockets */
 			return -ENOPROTOOPT;
 		if (val != 0 && val < 8) /* Illegal coverage: use default (8) */
 			val = 8;
@@ -1288,7 +1291,7 @@ int udp_lib_setsockopt(struct sock *sk, 
 	 * sense, this should be set to at least 8 (as done below). If zero is
 	 * used, this again means full checksum coverage.                     */
 	case UDPLITE_RECV_CSCOV:
-		if (!up->pcflag)         /* Disable the option on UDP sockets */
+		if (!is_udplite)         /* Disable the option on UDP sockets */
 			return -ENOPROTOOPT;
 		if (val != 0 && val < 8) /* Avoid silly minimal values.       */
 			val = 8;
diff -Nurp linux-2.6.24.rc3.org/net/ipv6/udp.c linux-2.6.24.rc3/net/ipv6/udp.c
--- linux-2.6.24.rc3.org/net/ipv6/udp.c	2007-11-30 10:07:48.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv6/udp.c	2007-11-30 10:22:25.000000000 +0800
@@ -260,6 +260,7 @@ int udpv6_queue_rcv_skb(struct sock * sk
 {
 	struct udp_sock *up = udp_sk(sk);
 	int rc;
+	int is_udplite = IS_UDPLITE(sk);
 
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto drop;
@@ -267,7 +268,7 @@ int udpv6_queue_rcv_skb(struct sock * sk
 	/*
 	 * UDP-Lite specific tests, ignored on UDP sockets (see net/ipv4/udp.c).
 	 */
-	if ((up->pcflag & UDPLITE_RECV_CC)  &&  UDP_SKB_CB(skb)->partial_cov) {
+	if ((is_udplite & UDPLITE_RECV_CC)  &&  UDP_SKB_CB(skb)->partial_cov) {
 
 		if (up->pcrlen == 0) {          /* full coverage was set  */
 			LIMIT_NETDEBUG(KERN_WARNING "UDPLITE6: partial coverage"
@@ -291,13 +292,13 @@ int udpv6_queue_rcv_skb(struct sock * sk
 	if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) {
 		/* Note that an ENOMEM error is charged twice */
 		if (rc == -ENOMEM)
-			UDP6_INC_STATS_BH(UDP_MIB_RCVBUFERRORS, up->pcflag);
+			UDP6_INC_STATS_BH(UDP_MIB_RCVBUFERRORS, is_udplite);
 		goto drop;
 	}
 
 	return 0;
 drop:
-	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, up->pcflag);
+	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
 	kfree_skb(skb);
 	return -1;
 }
@@ -525,6 +526,7 @@ static int udp_v6_push_pending_frames(st
 	struct inet_sock *inet = inet_sk(sk);
 	struct flowi *fl = &inet->cork.fl;
 	int err = 0;
+	int is_udplite = IS_UDPLITE(sk);
 	__wsum csum = 0;
 
 	/* Grab the skbuff where UDP header space exists. */
@@ -540,7 +542,7 @@ static int udp_v6_push_pending_frames(st
 	uh->len = htons(up->len);
 	uh->check = 0;
 
-	if (up->pcflag)
+	if (is_udplite)
 		csum = udplite_csum_outgoing(sk, skb);
 	 else
 		csum = udp_csum_outgoing(sk, skb);
@@ -556,7 +558,7 @@ out:
 	up->len = 0;
 	up->pending = 0;
 	if (!err)
-		UDP6_INC_STATS_USER(UDP_MIB_OUTDATAGRAMS, up->pcflag);
+		UDP6_INC_STATS_USER(UDP_MIB_OUTDATAGRAMS, is_udplite);
 	return err;
 }
 
@@ -580,7 +582,7 @@ int udpv6_sendmsg(struct kiocb *iocb, st
 	int corkreq = up->corkflag || msg->msg_flags&MSG_MORE;
 	int err;
 	int connected = 0;
-	int is_udplite = up->pcflag;
+	int is_udplite = IS_UDPLITE(sk);
 	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
 
 	/* destination address check */


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-11-30  3:09 [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Wang Chen
  2007-11-30  3:15 ` [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro Wang Chen
@ 2007-11-30  3:24 ` Wang Chen
  2007-11-30 11:19   ` Gerrit Renker
  2007-11-30 10:57 ` [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Gerrit Renker
  2 siblings, 1 reply; 26+ messages in thread
From: Wang Chen @ 2007-11-30  3:24 UTC (permalink / raw)
  To: herbert; +Cc: davem, andi, netdev, gerrit, Wang Chen

(This patch base on "PATCH 2/3".)

UDP_MIB_INERRORS increment is in different way for IPv4
and IPv6. For the consistence, change UDP6_INC_STATS_USER
to UDP6_INC_STATS_BH.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
---
 udp.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.24.rc3.org/net/ipv6/udp.c	2007-11-30 10:38:36.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv6/udp.c	2007-11-30 10:39:01.000000000 +0800
@@ -207,7 +207,7 @@ out:
 	return err;
 
 csum_copy_err:
-	UDP6_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
+	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
 	skb_kill_datagram(sk, skb, flags);
 
 	if (flags & MSG_DONTWAIT)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum
  2007-11-30  3:09 [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Wang Chen
  2007-11-30  3:15 ` [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro Wang Chen
  2007-11-30  3:24 ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Wang Chen
@ 2007-11-30 10:57 ` Gerrit Renker
  2 siblings, 0 replies; 26+ messages in thread
From: Gerrit Renker @ 2007-11-30 10:57 UTC (permalink / raw)
  To: Wang Chen; +Cc: herbert, davem, andi, netdev

Thanks again for all the work.
| UdpInDatagrams can be confusing because it counts packets that 
| might be dropped later.
| Move UdpInDatagrams into recvmsg() as allowed by the RFC.
| 
| Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
| ---
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro
  2007-11-30  3:15 ` [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro Wang Chen
@ 2007-11-30 11:05   ` Gerrit Renker
  0 siblings, 0 replies; 26+ messages in thread
From: Gerrit Renker @ 2007-11-30 11:05 UTC (permalink / raw)
  To: Wang Chen; +Cc: herbert, davem, andi, netdev

| (This patch base on "PATCH 1/3".)
| 
| Since we have macro IS_UDPLITE, we can use it.
| 
| Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
| ---
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-11-30  3:24 ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Wang Chen
@ 2007-11-30 11:19   ` Gerrit Renker
  2007-12-01  1:54     ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Gerrit Renker @ 2007-11-30 11:19 UTC (permalink / raw)
  To: Wang Chen; +Cc: herbert, davem, andi, netdev

Quoting Wang Chen:
| (This patch base on "PATCH 2/3".)
| 
| UDP_MIB_INERRORS increment is in different way for IPv4
| and IPv6. For the consistence, change UDP6_INC_STATS_USER
| to UDP6_INC_STATS_BH.
| 
| Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
| ---
|  udp.c |    2 +-
|  1 files changed, 1 insertion(+), 1 deletion(-)
| 
| --- linux-2.6.24.rc3.org/net/ipv6/udp.c	2007-11-30 10:38:36.000000000 +0800
| +++ linux-2.6.24.rc3/net/ipv6/udp.c	2007-11-30 10:39:01.000000000 +0800
| @@ -207,7 +207,7 @@ out:
|  	return err;
|  
|  csum_copy_err:
| -	UDP6_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
| +	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
|  	skb_kill_datagram(sk, skb, flags);
|  
|  	if (flags & MSG_DONTWAIT)
| 
Is it not the other way round ?? :- 

	struct proto udp{,v6}_prot = {
		// ...
		.recvmsg           = udp{,v6}_recvmsg,
	};


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-11-30 11:19   ` Gerrit Renker
@ 2007-12-01  1:54     ` Herbert Xu
  2007-12-03  7:19       ` Wang Chen
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2007-12-01  1:54 UTC (permalink / raw)
  To: Gerrit Renker; +Cc: Wang Chen, davem, andi, netdev

On Fri, Nov 30, 2007 at 11:19:49AM +0000, Gerrit Renker wrote:
>
> |  csum_copy_err:
> | -	UDP6_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
> | +	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
> |  	skb_kill_datagram(sk, skb, flags);
> |  
> |  	if (flags & MSG_DONTWAIT)
> | 
> Is it not the other way round ?? :- 

I agree.  Wang Chen, please change this and other appropriate
BH calls to USER.  Basically recvmsg should always be USER while
rcv is BH.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-01  1:54     ` Herbert Xu
@ 2007-12-03  7:19       ` Wang Chen
  2007-12-03 11:39         ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Wang Chen @ 2007-12-03  7:19 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Gerrit Renker, davem, andi, netdev

Herbert Xu said the following on 2007-12-1 9:54:
> On Fri, Nov 30, 2007 at 11:19:49AM +0000, Gerrit Renker wrote:
>> |  csum_copy_err:
>> | -	UDP6_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
>> | +	UDP6_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
>> |  	skb_kill_datagram(sk, skb, flags);
>> |  
>> |  	if (flags & MSG_DONTWAIT)
>> | 
>> Is it not the other way round ?? :- 
> 
> I agree.  Wang Chen, please change this and other appropriate
> BH calls to USER.  Basically recvmsg should always be USER while
> rcv is BH.
> 

Resubmit the patch.

System calls should be USER. So change the BH to USER for
UDP*_INC_STATS_BH().

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
---
 ipv4/udp.c |    4 ++--
 ipv6/udp.c |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.24.rc3.org/net/ipv4/udp.c	2007-12-03 15:10:49.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv4/udp.c	2007-12-03 14:52:54.000000000 +0800
@@ -874,7 +874,7 @@ try_again:
 	if (err)
 		goto out_free;
 
-	UDP_INC_STATS_BH(UDP_MIB_INDATAGRAMS, is_udplite);
+	UDP_INC_STATS_USER(UDP_MIB_INDATAGRAMS, is_udplite);
 
 	sock_recv_timestamp(msg, sk, skb);
 
@@ -899,7 +899,7 @@ out:
 	return err;
 
 csum_copy_err:
-	UDP_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
+	UDP_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
 
 	skb_kill_datagram(sk, skb, flags);
 
--- linux-2.6.24.rc3.org/net/ipv6/udp.c	2007-12-03 15:10:49.000000000 +0800
+++ linux-2.6.24.rc3/net/ipv6/udp.c	2007-12-03 15:09:55.000000000 +0800
@@ -164,7 +164,7 @@ try_again:
 	if (err)
 		goto out_free;
 
-	UDP6_INC_STATS_BH(UDP_MIB_INDATAGRAMS, is_udplite);
+	UDP6_INC_STATS_USER(UDP_MIB_INDATAGRAMS, is_udplite);
 
 	sock_recv_timestamp(msg, sk, skb);


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-03  7:19       ` Wang Chen
@ 2007-12-03 11:39         ` Herbert Xu
  2007-12-03 11:49           ` Alexey Kuznetsov
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2007-12-03 11:39 UTC (permalink / raw)
  To: Wang Chen; +Cc: Gerrit Renker, davem, andi, netdev, Alexey Kuznetsov

On Mon, Dec 03, 2007 at 03:19:35PM +0800, Wang Chen wrote:
> 
> System calls should be USER. So change the BH to USER for
> UDP*_INC_STATS_BH().
> 
> Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>

I've rearragned it so that we the new INC_STATS call is USER in the
first patch.  Otherwise it's all in net-2.6.25 now.

BTW, thanks to Gerrit's note I took a look at the underlying code
for the _BH/_USER stuff.  It's in fact totally broken when PREEMPT
is on.  It relies on the fact that process-context kernel code does
not get preempted by other process-context kernel code or for that
matter migrate to other CPUs, neither of which is true with PREEMPT
on.

So we need to fix this, and whatever the fix is will probably render
the BH/USER distinction obsolete.

Any takers?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-03 11:39         ` Herbert Xu
@ 2007-12-03 11:49           ` Alexey Kuznetsov
  2007-12-03 11:54             ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Alexey Kuznetsov @ 2007-12-03 11:49 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Wang Chen, Gerrit Renker, davem, andi, netdev

On Mon, Dec 03, 2007 at 10:39:35PM +1100, Herbert Xu wrote:
> So we need to fix this, and whatever the fix is will probably render
> the BH/USER distinction obsolete.

Hmm, I would think opposite. USER (or generic) is expensive variant,
BH is lite. No?

Alexey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-03 11:49           ` Alexey Kuznetsov
@ 2007-12-03 11:54             ` Herbert Xu
  2007-12-03 13:17               ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2007-12-03 11:54 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

On Mon, Dec 03, 2007 at 02:49:12PM +0300, Alexey Kuznetsov wrote:
> On Mon, Dec 03, 2007 at 10:39:35PM +1100, Herbert Xu wrote:
> > So we need to fix this, and whatever the fix is will probably render
> > the BH/USER distinction obsolete.
> 
> Hmm, I would think opposite. USER (or generic) is expensive variant,
> BH is lite. No?

Yes that would certainly be the obvious fix.  In other words, just use
smp_processor_id instead of the raw version.  I suppose the only issue
would be that disabling/enabling preemption isn't exactly cost-free and
we're going to be doing that for every single increment.

Hmm, wasn't someone else talking about a non-atomic version of atomic
ops lately (i.e., atomic with respect to the local CPU only)? Perhaps
this is the killer app for it :)

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-03 11:54             ` Herbert Xu
@ 2007-12-03 13:17               ` Herbert Xu
  2007-12-04  3:50                 ` Wang Chen
  2007-12-15 13:58                 ` Herbert Xu
  0 siblings, 2 replies; 26+ messages in thread
From: Herbert Xu @ 2007-12-03 13:17 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

On Mon, Dec 03, 2007 at 10:54:35PM +1100, Herbert Xu wrote:
>
> Hmm, wasn't someone else talking about a non-atomic version of atomic
> ops lately (i.e., atomic with respect to the local CPU only)? Perhaps
> this is the killer app for it :)

Never mind, we already have that in local_t and as Alexey correctly
points out, USER is still going to be the expensive variant with the
preempt_disable (well until BH gets threaded).  So how about this patch?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/net/snmp.h b/include/net/snmp.h
index ea206bf..9444b54 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -23,6 +23,7 @@
 
 #include <linux/cache.h>
 #include <linux/snmp.h>
+#include <linux/smp.h>
 
 /*
  * Mibs are stored in array of unsigned long.
@@ -137,7 +138,10 @@ struct linux_mib {
 #define SNMP_INC_STATS_OFFSET_BH(mib, field, offset)	\
 	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field + (offset)]++)
 #define SNMP_INC_STATS_USER(mib, field) \
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field]++)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field]++; \
+		put_cpu(); \
+	} while (0)
 #define SNMP_INC_STATS(mib, field) 	\
 	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]++)
 #define SNMP_DEC_STATS(mib, field) 	\
@@ -145,6 +149,9 @@ struct linux_mib {
 #define SNMP_ADD_STATS_BH(mib, field, addend) 	\
 	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field] += addend)
 #define SNMP_ADD_STATS_USER(mib, field, addend) 	\
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field] += addend)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field] += addend; \
+		put_cpu(); \
+	} while (0)
 
 #endif

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-03 13:17               ` Herbert Xu
@ 2007-12-04  3:50                 ` Wang Chen
  2007-12-04  3:57                   ` Herbert Xu
  2007-12-15 13:58                 ` Herbert Xu
  1 sibling, 1 reply; 26+ messages in thread
From: Wang Chen @ 2007-12-04  3:50 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alexey Kuznetsov, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

Herbert Xu said the following on 2007-12-3 21:17:
> On Mon, Dec 03, 2007 at 10:54:35PM +1100, Herbert Xu wrote:
> diff --git a/include/net/snmp.h b/include/net/snmp.h
> index ea206bf..9444b54 100644
> --- a/include/net/snmp.h
> +++ b/include/net/snmp.h
> @@ -23,6 +23,7 @@
>  
>  #include <linux/cache.h>
>  #include <linux/snmp.h>
> +#include <linux/smp.h>

It's no need to include smp.h?

--
WCN


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-04  3:50                 ` Wang Chen
@ 2007-12-04  3:57                   ` Herbert Xu
  0 siblings, 0 replies; 26+ messages in thread
From: Herbert Xu @ 2007-12-04  3:57 UTC (permalink / raw)
  To: Wang Chen
  Cc: Alexey Kuznetsov, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

On Tue, Dec 04, 2007 at 11:50:55AM +0800, Wang Chen wrote:
>
> >  #include <linux/cache.h>
> >  #include <linux/snmp.h>
> > +#include <linux/smp.h>
> 
> It's no need to include smp.h?

We need it for smp_processor_id.  Well we needed it before too but
there's probably some implicit inclusion which has made it work.
It's better to declare these inclusions explicitly as otherwise
they may break on less common architectures later.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-03 13:17               ` Herbert Xu
  2007-12-04  3:50                 ` Wang Chen
@ 2007-12-15 13:58                 ` Herbert Xu
  2007-12-15 17:03                   ` Eric Dumazet
                                     ` (3 more replies)
  1 sibling, 4 replies; 26+ messages in thread
From: Herbert Xu @ 2007-12-15 13:58 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

Ob Tue, Dec 04, 2007 at 12:17:23AM +1100, Herbert Xu wrote:
> 
> Never mind, we already have that in local_t and as Alexey correctly
> points out, USER is still going to be the expensive variant with the
> preempt_disable (well until BH gets threaded).  So how about this patch?

I didn't hear any objections so here is the patch again.

[SNMP]: Fix SNMP counters with PREEMPT

The SNMP macros use raw_smp_processor_id() in process context
which is illegal because the process may be preempted and then
migrated to another CPU.

This patch makes it use get_cpu/put_cpu to disable preemption.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/net/snmp.h b/include/net/snmp.h
index ea206bf..9444b54 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -23,6 +23,7 @@
 
 #include <linux/cache.h>
 #include <linux/snmp.h>
+#include <linux/smp.h>
 
 /*
  * Mibs are stored in array of unsigned long.
@@ -137,7 +138,10 @@ struct linux_mib {
 #define SNMP_INC_STATS_OFFSET_BH(mib, field, offset)	\
 	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field + (offset)]++)
 #define SNMP_INC_STATS_USER(mib, field) \
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field]++)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field]++; \
+		put_cpu(); \
+	} while (0)
 #define SNMP_INC_STATS(mib, field) 	\
 	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]++)
 #define SNMP_DEC_STATS(mib, field) 	\
@@ -145,6 +149,9 @@ struct linux_mib {
 #define SNMP_ADD_STATS_BH(mib, field, addend) 	\
 	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field] += addend)
 #define SNMP_ADD_STATS_USER(mib, field, addend) 	\
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field] += addend)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field] += addend; \
+		put_cpu(); \
+	} while (0)
 
 #endif

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-15 13:58                 ` Herbert Xu
@ 2007-12-15 17:03                   ` Eric Dumazet
  2007-12-16  2:30                     ` [SNMP]: Fix SNMP counters with PREEMPT Herbert Xu
  2007-12-15 18:43                   ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Ingo Molnar
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2007-12-15 17:03 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

Herbert Xu a écrit :
> Ob Tue, Dec 04, 2007 at 12:17:23AM +1100, Herbert Xu wrote:
>> Never mind, we already have that in local_t and as Alexey correctly
>> points out, USER is still going to be the expensive variant with the
>> preempt_disable (well until BH gets threaded).  So how about this patch?
> 
> I didn't hear any objections so here is the patch again.
> 
> [SNMP]: Fix SNMP counters with PREEMPT
> 
> The SNMP macros use raw_smp_processor_id() in process context
> which is illegal because the process may be preempted and then
> migrated to another CPU.
> 
> This patch makes it use get_cpu/put_cpu to disable preemption.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> Cheers,
  #define SNMP_INC_STATS_USER(mib, field) \
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field]++)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field]++; \
+		put_cpu(); \
+	} while (0)
  #define SNMP_INC_STATS(mib, field) 	\
  	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]++)


How come you change SNMP_INC_STATS_USER() but not SNMP_INC_STATS() ?



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-15 13:58                 ` Herbert Xu
  2007-12-15 17:03                   ` Eric Dumazet
@ 2007-12-15 18:43                   ` Ingo Molnar
  2007-12-16  2:36                     ` Herbert Xu
  2007-12-17 19:40                     ` Christoph Lameter
  2007-12-18 11:43                   ` Gerrit Renker
  2007-12-20 12:12                   ` David Miller
  3 siblings, 2 replies; 26+ messages in thread
From: Ingo Molnar @ 2007-12-15 18:43 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter


* Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Ob Tue, Dec 04, 2007 at 12:17:23AM +1100, Herbert Xu wrote:
> > 
> > Never mind, we already have that in local_t and as Alexey correctly
> > points out, USER is still going to be the expensive variant with the
> > preempt_disable (well until BH gets threaded).  So how about this patch?
> 
> I didn't hear any objections so here is the patch again.
> 
> [SNMP]: Fix SNMP counters with PREEMPT
> 
> The SNMP macros use raw_smp_processor_id() in process context which is 
> illegal because the process may be preempted and then migrated to 
> another CPU.

nit: please use 'invalid' instead of 'illegal'.

> This patch makes it use get_cpu/put_cpu to disable preemption.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

> -	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field]++)
> +	do { \
> +		per_cpu_ptr(mib[1], get_cpu())->mibs[field]++; \
> +		put_cpu(); \
> +	} while (0)

> -	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field] += addend)
> +	do { \
> +		per_cpu_ptr(mib[1], get_cpu())->mibs[field] += addend; \
> +		put_cpu(); \
> +	} while (0)

we could perhaps introduce stat_smp_processor_id(), which signals that 
the CPU id is used for statistical purposes and does not have to be 
exact? In any case, your patch looks good too.

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [SNMP]: Fix SNMP counters with PREEMPT
  2007-12-15 17:03                   ` Eric Dumazet
@ 2007-12-16  2:30                     ` Herbert Xu
  2007-12-20 12:13                       ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2007-12-16  2:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter, Ingo Molnar

On Sat, Dec 15, 2007 at 06:03:19PM +0100, Eric Dumazet wrote:
>
> How come you change SNMP_INC_STATS_USER() but not SNMP_INC_STATS() ?

Heh, my brain must have blocked me from seeing it because it's
too hard :)

Let's fix it the stupid way first and I'll do a local_t conversion
later.

[SNMP]: Fix SNMP counters with PREEMPT

The SNMP macros use raw_smp_processor_id() in process context
which is illegal because the process may be preempted and then
migrated to another CPU.

This patch makes it use get_cpu/put_cpu to disable preemption.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/net/snmp.h b/include/net/snmp.h
index 9c5793d..fbb6666 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -23,6 +23,7 @@
 
 #include <linux/cache.h>
 #include <linux/snmp.h>
+#include <linux/smp.h>
 
 /*
  * Mibs are stored in array of unsigned long.
@@ -135,14 +136,26 @@ struct linux_mib {
 #define SNMP_INC_STATS_BH(mib, field) 	\
 	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field]++)
 #define SNMP_INC_STATS_USER(mib, field) \
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field]++)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field]++; \
+		put_cpu(); \
+	} while (0)
 #define SNMP_INC_STATS(mib, field) 	\
-	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]++)
+	do { \
+		per_cpu_ptr(mib[!in_softirq()], get_cpu())->mibs[field]++; \
+		put_cpu(); \
+	} while (0)
 #define SNMP_DEC_STATS(mib, field) 	\
-	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]--)
+	do { \
+		per_cpu_ptr(mib[!in_softirq()], get_cpu())->mibs[field]--; \
+		put_cpu(); \
+	} while (0)
 #define SNMP_ADD_STATS_BH(mib, field, addend) 	\
 	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field] += addend)
 #define SNMP_ADD_STATS_USER(mib, field, addend) 	\
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field] += addend)
+	do { \
+		per_cpu_ptr(mib[1], get_cpu())->mibs[field] += addend; \
+		put_cpu(); \
+	} while (0)
 
 #endif

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-15 18:43                   ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Ingo Molnar
@ 2007-12-16  2:36                     ` Herbert Xu
  2007-12-16  8:58                       ` Ingo Molnar
  2007-12-17 19:42                       ` Christoph Lameter
  2007-12-17 19:40                     ` Christoph Lameter
  1 sibling, 2 replies; 26+ messages in thread
From: Herbert Xu @ 2007-12-16  2:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter

On Sat, Dec 15, 2007 at 07:43:28PM +0100, Ingo Molnar wrote:
>
> we could perhaps introduce stat_smp_processor_id(), which signals that 
> the CPU id is used for statistical purposes and does not have to be 
> exact? In any case, your patch looks good too.

Unfortunately that doesn't work because we can then have two
CPUs trying to update the same counter which may corrupt it.

However, an optimisation is indeed possible with some work.
If we can get the address of the per-cpu counter against
some sort of a per-cpu base pointer, e.g., %gs on x86, then
we can do

	incq	%gs:(%rax)

where %rax would be the offset with %gs as the base.  This would
obviate the need for the CPU ID and therefore avoid disabling
preemption.

Hmm, wasn't Christoph working on something like that?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-16  2:36                     ` Herbert Xu
@ 2007-12-16  8:58                       ` Ingo Molnar
  2007-12-17 19:42                       ` Christoph Lameter
  1 sibling, 0 replies; 26+ messages in thread
From: Ingo Molnar @ 2007-12-16  8:58 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem, andi, netdev,
	Linux Kernel Mailing List, Christoph Lameter


* Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Sat, Dec 15, 2007 at 07:43:28PM +0100, Ingo Molnar wrote:
> >
> > we could perhaps introduce stat_smp_processor_id(), which signals that 
> > the CPU id is used for statistical purposes and does not have to be 
> > exact? In any case, your patch looks good too.
> 
> Unfortunately that doesn't work because we can then have two CPUs 
> trying to update the same counter which may corrupt it.

ah, indeed. I missed that correctness aspect of your patch. Good catch!

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-15 18:43                   ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Ingo Molnar
  2007-12-16  2:36                     ` Herbert Xu
@ 2007-12-17 19:40                     ` Christoph Lameter
  1 sibling, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2007-12-17 19:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Herbert Xu, Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem,
	andi, netdev, Linux Kernel Mailing List

The cpu alloc patches also fix this issue one way (disabling preempt) or 
the other (atomic instruction that does not need disabling of 
preeemption).


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-16  2:36                     ` Herbert Xu
  2007-12-16  8:58                       ` Ingo Molnar
@ 2007-12-17 19:42                       ` Christoph Lameter
  1 sibling, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2007-12-17 19:42 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Ingo Molnar, Alexey Kuznetsov, Wang Chen, Gerrit Renker, davem,
	andi, netdev, Linux Kernel Mailing List

On Sun, 16 Dec 2007, Herbert Xu wrote:

> If we can get the address of the per-cpu counter against
> some sort of a per-cpu base pointer, e.g., %gs on x86, then
> we can do
> 
> 	incq	%gs:(%rax)
> 
> where %rax would be the offset with %gs as the base.  This would
> obviate the need for the CPU ID and therefore avoid disabling
> preemption.
> 
> Hmm, wasn't Christoph working on something like that?

Yes that is what the cpu alloc patchset implements.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-15 13:58                 ` Herbert Xu
  2007-12-15 17:03                   ` Eric Dumazet
  2007-12-15 18:43                   ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Ingo Molnar
@ 2007-12-18 11:43                   ` Gerrit Renker
  2007-12-18 12:49                     ` Herbert Xu
  2007-12-20 12:12                   ` David Miller
  3 siblings, 1 reply; 26+ messages in thread
From: Gerrit Renker @ 2007-12-18 11:43 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev

| 
| I didn't hear any objections so here is the patch again.
| 
| [SNMP]: Fix SNMP counters with PREEMPT
| 
I don't feel competent to say whether this is correct, but it seems that
this is going a long way towards resolving older problems with the SNMP
counters.

What I can say is that a year ago, when doing the UDP/-Lite work, I found
that the counters were not accurate, which was due to the double-counting
problem that Wang Chen brought up again.

Resolving this issue was stalled at that time due to the fact that a
counter may be incremented on one CPU and then later decremented on
another.

It looks as if this work is reaching the cause of the problem, which would
be good since the problem propagates to all users of such counters (UDP,
UDP-Lite, and similar MIB counters). So thank you for taking this issue up,
I hope that this will lead to a patch set resolving the counter issues.

Best regards,
Gerrit

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-18 11:43                   ` Gerrit Renker
@ 2007-12-18 12:49                     ` Herbert Xu
  0 siblings, 0 replies; 26+ messages in thread
From: Herbert Xu @ 2007-12-18 12:49 UTC (permalink / raw)
  To: Gerrit Renker, netdev

On Tue, Dec 18, 2007 at 11:43:44AM +0000, Gerrit Renker wrote:
>
> It looks as if this work is reaching the cause of the problem, which would
> be good since the problem propagates to all users of such counters (UDP,
> UDP-Lite, and similar MIB counters). So thank you for taking this issue up,
> I hope that this will lead to a patch set resolving the counter issues.

Heh the counter issue is what led to this patch.  I think the
UDP counters should be fixed already.  Please have a look at
net-2.6.25 and let us know if it's still broken.

As to the SNMP counter issue in general, Christoph Lameter's
new percpu code will greatly simplify this and we won't need
to disable preemption or do the fancy softirq/user splitting
that we have right now.  I eagerly await its acceptance into
the Linux kernel.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
  2007-12-15 13:58                 ` Herbert Xu
                                     ` (2 preceding siblings ...)
  2007-12-18 11:43                   ` Gerrit Renker
@ 2007-12-20 12:12                   ` David Miller
  3 siblings, 0 replies; 26+ messages in thread
From: David Miller @ 2007-12-20 12:12 UTC (permalink / raw)
  To: herbert
  Cc: kuznet, wangchen, gerrit, andi, netdev, linux-kernel, clameter,
	mingo

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 15 Dec 2007 21:58:52 +0800

> [SNMP]: Fix SNMP counters with PREEMPT
> 
> The SNMP macros use raw_smp_processor_id() in process context
> which is illegal because the process may be preempted and then
> migrated to another CPU.
> 
> This patch makes it use get_cpu/put_cpu to disable preemption.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied to net-2.6.25, thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [SNMP]: Fix SNMP counters with PREEMPT
  2007-12-16  2:30                     ` [SNMP]: Fix SNMP counters with PREEMPT Herbert Xu
@ 2007-12-20 12:13                       ` David Miller
  0 siblings, 0 replies; 26+ messages in thread
From: David Miller @ 2007-12-20 12:13 UTC (permalink / raw)
  To: herbert
  Cc: dada1, kuznet, wangchen, gerrit, andi, netdev, linux-kernel,
	clameter, mingo

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun, 16 Dec 2007 10:30:25 +0800

> On Sat, Dec 15, 2007 at 06:03:19PM +0100, Eric Dumazet wrote:
> >
> > How come you change SNMP_INC_STATS_USER() but not SNMP_INC_STATS() ?
> 
> Heh, my brain must have blocked me from seeing it because it's
> too hard :)
> 
> Let's fix it the stupid way first and I'll do a local_t conversion
> later.
> 
> [SNMP]: Fix SNMP counters with PREEMPT
> 
> The SNMP macros use raw_smp_processor_id() in process context
> which is illegal because the process may be preempted and then
> migrated to another CPU.
> 
> This patch makes it use get_cpu/put_cpu to disable preemption.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

I just noticed this and replaced the other SNMP fix patch
with this one.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2007-12-20 12:13 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-30  3:09 [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Wang Chen
2007-11-30  3:15 ` [PATCH 2/3] [UDP]: Clean up for IS_UDPLITE macro Wang Chen
2007-11-30 11:05   ` Gerrit Renker
2007-11-30  3:24 ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Wang Chen
2007-11-30 11:19   ` Gerrit Renker
2007-12-01  1:54     ` Herbert Xu
2007-12-03  7:19       ` Wang Chen
2007-12-03 11:39         ` Herbert Xu
2007-12-03 11:49           ` Alexey Kuznetsov
2007-12-03 11:54             ` Herbert Xu
2007-12-03 13:17               ` Herbert Xu
2007-12-04  3:50                 ` Wang Chen
2007-12-04  3:57                   ` Herbert Xu
2007-12-15 13:58                 ` Herbert Xu
2007-12-15 17:03                   ` Eric Dumazet
2007-12-16  2:30                     ` [SNMP]: Fix SNMP counters with PREEMPT Herbert Xu
2007-12-20 12:13                       ` David Miller
2007-12-15 18:43                   ` [PATCH 3/3] [UDP6]: Counter increment on BH mode Ingo Molnar
2007-12-16  2:36                     ` Herbert Xu
2007-12-16  8:58                       ` Ingo Molnar
2007-12-17 19:42                       ` Christoph Lameter
2007-12-17 19:40                     ` Christoph Lameter
2007-12-18 11:43                   ` Gerrit Renker
2007-12-18 12:49                     ` Herbert Xu
2007-12-20 12:12                   ` David Miller
2007-11-30 10:57 ` [PATCH 1/3] [SNMP]: Defer InDataGrams increment until recvmsg() does checksum Gerrit Renker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).