All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Horman <horms@verge.net.au>
To: Patrick McHardy <kaber@trash.net>
Cc: "홍신 shin hong" <hongshin@gmail.com>,
	netfilter-devel@vger.kernel.org,
	"Wensong Zhang" <wensong@linux-vs.org>
Subject: Re: [bug?] netfilter/ipvs : suspected race bugs related to atomic operations
Date: Fri, 14 Aug 2009 08:56:03 +1000	[thread overview]
Message-ID: <20090813225603.GG3873@verge.net.au> (raw)
In-Reply-To: <20090813224316.GF3873@verge.net.au>

On Fri, Aug 14, 2009 at 08:43:18AM +1000, Simon Horman wrote:
> On Thu, Aug 13, 2009 at 03:44:27PM +0200, Patrick McHardy wrote:
> > 홍신 shin hong wrote:
> > > Hi. I am reporting two suspected race bug related to atomic operations
> > > while I read net/netfilter/ipvs of Linux 2.6.30.4.
> > > 
> > > (1) In net/netfilter/ipvs/ip_vs_core.c, ip_vs_in() first increments &cp->in_pkts
> > >      and then reads variable for condition checking at line 1346-1351.
> > > 
> > >      However, these two atomic operations may not be executed atomically.
> > >      For this reason, it may result race with other concurrent executions
> > >      which manipulates &cp->in_pkts.
> > > 
> > > (2) In net/netfilter/ipvs/ip_vs_wrr.c, ip_vs_wrr_max_weight() first
> > > checks &dest->weight
> > >      and then reads the variable again to assign its value to a local variable.
> > >      For the similar reason above, it seems that two atomic_read() operations
> > >      may result different values so that it may result race condition.
> > > 
> > > Please examine the code and let me know your opinion. Thanks.
> > 
> > I'm not sure whether the IPVS guys are following the netfilter-devel
> > list, so I've CCed Wensong and Simon.
> 
> Hi,
> 
> my feeling is that with both of these is that while there are certainly
> correctness problems they shouldn't manifest in anything terribly bad
> happening. In both cases I feel that we are essentially dealing with
> inputs to a heuristic, and thus some noise in the input really
> ought not make much difference in the long run, although and individual packet
> or connection may be miss-classified.

However, having said that, it seems worthwhile to fix these problems.

[IPVS] Use atomic operations atomicly

A pointed out by Shin Hong, IPVS doesn't always use atomic operations
in an atomic manner. While this seems unlikely to be manifest in
strange behaviour, it seems appropriate to clean this up.

Cc: 홍신 shin hong <hongshin@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

Index: linux-2.6/net/netfilter/ipvs/ip_vs_core.c
===================================================================
--- linux-2.6.orig/net/netfilter/ipvs/ip_vs_core.c	2009-08-14 08:31:48.000000000 +1000
+++ linux-2.6/net/netfilter/ipvs/ip_vs_core.c	2009-08-14 08:51:56.000000000 +1000
@@ -1256,7 +1256,7 @@ ip_vs_in(unsigned int hooknum, struct sk
 	struct ip_vs_iphdr iph;
 	struct ip_vs_protocol *pp;
 	struct ip_vs_conn *cp;
-	int ret, restart, af;
+	int ret, restart, af, pkts;
 
 	af = (skb->protocol == htons(ETH_P_IP)) ? AF_INET : AF_INET6;
 
@@ -1343,12 +1343,12 @@ ip_vs_in(unsigned int hooknum, struct sk
 	 * Sync connection if it is about to close to
 	 * encorage the standby servers to update the connections timeout
 	 */
-	atomic_inc(&cp->in_pkts);
+	pkts = atomic_add_return(1, &cp->in_pkts);
 	if (af == AF_INET &&
 	    (ip_vs_sync_state & IP_VS_STATE_MASTER) &&
 	    (((cp->protocol != IPPROTO_TCP ||
 	       cp->state == IP_VS_TCP_S_ESTABLISHED) &&
-	      (atomic_read(&cp->in_pkts) % sysctl_ip_vs_sync_threshold[1]
+	      (pkts % sysctl_ip_vs_sync_threshold[1]
 	       == sysctl_ip_vs_sync_threshold[0])) ||
 	     ((cp->protocol == IPPROTO_TCP) && (cp->old_state != cp->state) &&
 	      ((cp->state == IP_VS_TCP_S_FIN_WAIT) ||
Index: linux-2.6/net/netfilter/ipvs/ip_vs_wrr.c
===================================================================
--- linux-2.6.orig/net/netfilter/ipvs/ip_vs_wrr.c	2009-08-14 08:31:48.000000000 +1000
+++ linux-2.6/net/netfilter/ipvs/ip_vs_wrr.c	2009-08-14 08:31:50.000000000 +1000
@@ -74,11 +74,12 @@ static int ip_vs_wrr_gcd_weight(struct i
 static int ip_vs_wrr_max_weight(struct ip_vs_service *svc)
 {
 	struct ip_vs_dest *dest;
-	int weight = 0;
+	int new_weight, weight = 0;
 
 	list_for_each_entry(dest, &svc->destinations, n_list) {
-		if (atomic_read(&dest->weight) > weight)
-			weight = atomic_read(&dest->weight);
+		new_weight = atomic_read(&dest->weight);
+		if (new_weight > weight)
+			weight = new_weight;
 	}
 
 	return weight;
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

      reply	other threads:[~2009-08-13 22:56 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-12 14:43 [bug?] netfilter/ipvs : suspected race bugs related to atomic operations 홍신 shin hong
2009-08-13 13:44 ` Patrick McHardy
2009-08-13 22:43   ` Simon Horman
2009-08-13 22:56     ` Simon Horman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090813225603.GG3873@verge.net.au \
    --to=horms@verge.net.au \
    --cc=hongshin@gmail.com \
    --cc=kaber@trash.net \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=wensong@linux-vs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.