netfilter 03/03: nf_conntrack: nf_conntrack_alloc() fixes

From: Patrick McHardy <kaber@trash.net>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, Patrick McHardy <kaber@trash.net>,
	netfilter-devel@vger.kernel.org
Subject: netfilter 03/03: nf_conntrack: nf_conntrack_alloc() fixes
Date: Thu, 16 Jul 2009 14:26:47 +0200 (MEST)
Message-ID: <20090716122642.23343.83669.sendpatchset@x2.localnet>
In-Reply-To: <20090716122638.23343.77832.sendpatchset@x2.localnet>

commit 941297f443f871b8c3372feccf27a8733f6ce9e9
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Thu Jul 16 14:03:40 2009 +0200

    netfilter: nf_conntrack: nf_conntrack_alloc() fixes
    
    When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when allocating
    objects, since the slab allocator could hand out a freed object that is still
    used by lockless readers.
    
    In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next
    always being valid (i.e. containing a valid 'nulls' value, or a valid pointer to
    the next object in the hash chain).
    
    kmem_cache_zalloc() sets up the object with NULL values, but NULL is not a valid
    value for ct->tuplehash[xxx].hnnode.next.
    
    The fix is to call kmem_cache_alloc() and do the zeroing ourselves.
    
    As spotted by Patrick, we also need to make sure lookup keys are committed to
    memory before setting refcount to 1, or a lockless reader could get a reference
    on the old version of the object. Its key re-check could then pass the barrier.
    
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
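
The ordering rule described above can be sketched in userspace C11. This is only an illustration, not kernel code: struct obj, obj_publish(), obj_get_if_match() and obj_put() are hypothetical names, atomic_thread_fence(memory_order_release) stands in for smp_wmb(), and the compare-and-swap loop is an assumed stand-in for a "take a reference unless the count is zero" primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; the names are illustrative, not kernel types. */
struct obj {
    int key;            /* lookup key, written before publication */
    atomic_int refcnt;  /* 0 means "free or being recycled" */
};

/* Writer: initialise a (possibly recycled) object and publish it. */
static void obj_publish(struct obj *o, int key)
{
    o->key = key;
    /* Stand-in for smp_wmb(): the key store is ordered before the refcnt store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Drop one reference; the count must be positive. */
static void obj_put(struct obj *o)
{
    int old = atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release);
    assert(old > 0);
    (void)old;  /* silence the unused warning when NDEBUG is defined */
}

/* Reader: take a reference only if refcnt is non-zero, then re-check the key. */
static bool obj_get_if_match(struct obj *o, int key)
{
    int old = atomic_load_explicit(&o->refcnt, memory_order_relaxed);

    do {
        if (old == 0)
            return false;   /* object is not live, do not touch it */
    } while (!atomic_compare_exchange_weak_explicit(&o->refcnt, &old, old + 1,
                                                    memory_order_acquire,
                                                    memory_order_relaxed));

    if (o->key != key) {
        /* The object was recycled for a different key: drop the reference. */
        obj_put(o);
        return false;
    }
    return true;
}
```

If the refcnt store could become visible before the key store, a reader might take a reference on the recycled object while still seeing the old key, which is the case the added barrier is meant to rule out.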

diff --git a/Documentation/RCU/rculist_nulls.txt b/Documentation/RCU/rculist_nulls.txt
index 93cb28d..18f9651 100644
--- a/Documentation/RCU/rculist_nulls.txt
+++ b/Documentation/RCU/rculist_nulls.txt
@@ -83,11 +83,12 @@ not detect it missed following items in original chain.
 obj = kmem_cache_alloc(...);
 lock_chain(); // typically a spin_lock()
 obj->key = key;
-atomic_inc(&obj->refcnt);
 /*
  * we need to make sure obj->key is updated before obj->next
+ * or obj->refcnt
  */
 smp_wmb();
+atomic_set(&obj->refcnt, 1);
 hlist_add_head_rcu(&obj->obj_node, list);
 unlock_chain(); // typically a spin_unlock()
 
@@ -159,6 +160,10 @@ out:
 obj = kmem_cache_alloc(cachep);
 lock_chain(); // typically a spin_lock()
 obj->key = key;
+/*
+ * changes to obj->key must be visible before refcnt one
+ */
+smp_wmb();
 atomic_set(&obj->refcnt, 1);
 /*
  * insert obj in RCU way (readers might be traversing chain)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 7508f11..b5869b9 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -561,23 +561,38 @@ struct nf_conn *nf_conntrack_alloc(struct net *net,
 		}
 	}
 
-	ct = kmem_cache_zalloc(nf_conntrack_cachep, gfp);
+	/*
+	 * Do not use kmem_cache_zalloc(), as this cache uses
+	 * SLAB_DESTROY_BY_RCU.
+	 */
+	ct = kmem_cache_alloc(nf_conntrack_cachep, gfp);
 	if (ct == NULL) {
 		pr_debug("nf_conntrack_alloc: Can't alloc conntrack.\n");
 		atomic_dec(&net->ct.count);
 		return ERR_PTR(-ENOMEM);
 	}
-
+	/*
+	 * Let ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.next
+	 * and ct->tuplehash[IP_CT_DIR_REPLY].hnnode.next unchanged.
+	 */
+	memset(&ct->tuplehash[IP_CT_DIR_MAX], 0,
+	       sizeof(*ct) - offsetof(struct nf_conn, tuplehash[IP_CT_DIR_MAX]));
 	spin_lock_init(&ct->lock);
-	atomic_set(&ct->ct_general.use, 1);
 	ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple = *orig;
+	ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.pprev = NULL;
 	ct->tuplehash[IP_CT_DIR_REPLY].tuple = *repl;
+	ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev = NULL;
 	/* Don't set timer yet: wait for confirmation */
 	setup_timer(&ct->timeout, death_by_timeout, (unsigned long)ct);
 #ifdef CONFIG_NET_NS
 	ct->ct_net = net;
 #endif
 
+	/*
+	 * changes to lookup keys must be done before setting refcnt to 1
+	 */
+	smp_wmb();
+	atomic_set(&ct->ct_general.use, 1);
 	return ct;
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_alloc);
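
The partial zeroing used in nf_conntrack_alloc() above (memset() from an offsetof() boundary to the end of the structure) can be shown in isolation. This is a minimal userspace sketch under assumed names: struct item, its fields and item_reinit() are hypothetical and do not mirror the layout of struct nf_conn.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the field that readers may still follow comes first. */
struct item {
    void *next;       /* must keep its old value when the object is reused */
    int   key;
    long  payload[4];
};

/* Zero everything from 'key' to the end of the object, leaving 'next' untouched. */
static void item_reinit(struct item *it)
{
    assert(it != NULL);
    memset((char *)it + offsetof(struct item, key), 0,
           sizeof(*it) - offsetof(struct item, key));
}
```

Fields that sit before the boundary are not cleared by this approach, so any of them that need a defined value have to be initialised explicitly, as the patch does for the lock, the tuples and hnnode.pprev.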
