All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, David Miller <davem@davemloft.net>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Arnd Bergmann <arnd@arndb.de>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Julian Anastasov <ja@ssi.bg>,
	Hans Schillstrom <hans@schillstrom.com>
Subject: Re: [slab poison overwritten] Re: [GIT] Networking
Date: Tue, 22 Mar 2011 10:18:01 +0900	[thread overview]
Message-ID: <20110322011801.GE27019@verge.net.au> (raw)
In-Reply-To: <20110322001706.GD27019@verge.net.au>

On Tue, Mar 22, 2011 at 09:17:07AM +0900, Simon Horman wrote:
> On Tue, Mar 22, 2011 at 09:01:33AM +0900, Simon Horman wrote:
> > On Tue, Mar 22, 2011 at 08:29:21AM +0900, Simon Horman wrote:
> > > On Tue, Mar 22, 2011 at 07:13:58AM +0900, Simon Horman wrote:
> > > > On Mon, Mar 21, 2011 at 09:15:40PM +0100, Eric Dumazet wrote:
> > > > > Le lundi 21 mars 2011 à 19:07 +0100, Eric Dumazet a écrit :
> > > > > > Le lundi 21 mars 2011 à 18:39 +0100, Ingo Molnar a écrit :
> > > > > > > here's the same but with kallsyms enabled.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > 
> > > > > > > 	Ingo
> > > > > > > 
> > > > > > > [    9.585627] initcall 0xffffffff81d5b806 returned 0 after 0 usecs
> > > > > > > [    9.588960] calling  0xffffffff81d5b9da @ 1
> > > > > > > [    9.592303] IPVS: Creating netns size=1272 id=0
> > > > > > > [    9.595646] IPVS: __ip_vs_control_init(): alloc_percpu.
> > > > > > > [    9.602298] IPVS: cannot register namespace.
> > > > > > > [    9.605627] IPVS: can't setup control
> > > > > > 
> > > > > > It seems IPVS is busted in case of memory allocation error in 
> > > > > > __ip_vs_control_init()
> > > > > > 
> > > > > > IPVS deinits its "struct netns_ipvs" space, but something (in IPVS) uses
> > > > > > it after free.
> > > > > > 
> > > > > > __ip_vs_init() seems to be called before ip_vs_init() completes
> > > > > > correctly. We then keep in net->ipvs a pointer to some freed memory.
> > > > > > 
> > > > > > Commit 14e405461e664b7 did some changes in this area
> > > > > > 
> > > > > > Simon, any idea ?
> > > > > > 
> > > > > > 
> > > > > 
> > > > > For the time being, we can avoid the false memory allocation error (and
> > > > > leak)
> > > > 
> > > > Sorry, that typo is my work.
> > > 
> > > With your patch applied I now see the following
> > > 
> > > ffff880003bbf1a0 corresponds to &ipvs->app_key in __ip_vs_app_init().
> > > I'll continue looking into this.
> > > 
> > > [   12.610000] IPVS: Creating netns size=2456 id=0
> > > [   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> > > [   12.640000] BUG: key ffff880003bbf1a0 not in .data!
> > > [   12.640000] ------------[ cut here ]------------
> > > [   12.640000] WARNING: at kernel/lockdep.c:2701
> > > lockdep_init_map+0x37b/0x570()
> > > [   12.640000] Hardware name: Bochs
> > > [   12.640000] Pid: 1, comm: swapper Tainted: G        W
> > > 2.6.38-kexec-06330-g69b7efe-dirty #122
> > > [   12.650000] Call Trace:
> > > [   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
> > > [   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
> > > [   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
> > > [   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
> > > [   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
> > > [   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
> > > [   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
> > > [   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
> > > [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
> > > [   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
> > > [   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
> > > [   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
> > > [   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
> > > [   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
> > > [   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
> > > [   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
> > > [   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x10
> > > [   12.680000] ---[ end trace 4eaa2a86a8e2da23 ]---
> > 
> > It seems that the problem above was introduced by
> > ab8a5e8408c3 ("IPVS: netns awareness to ip_vs_app").
> > I assume the hungs are the cause:
> 
> s/hungs/hunks below/
> 
> I am a little unsure of what to do about this.
> 
> The problem seems to be that ipvs->app_key is not in static storage.
> But I'm not sure how to resolve that given that the struct netns_ipvs is
> per-network namespace. So I guess that the locking needs to be re-worked.
> Again, I'm a little unsure of what the best way forward is.

I had an idea for a fix over breakfast.

IPVS: Use global mutex in ip_vs_app.c

As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.

Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.

This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.

[   12.610000] IPVS: Creating netns size=2456 id=0
[   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[   12.640000] BUG: key ffff880003bbf1a0 not in .data!
[   12.640000] ------------[ cut here ]------------
[   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[   12.640000] Hardware name: Bochs
[   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
[   12.650000] Call Trace:
[   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
[   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
---
 include/net/ip_vs.h            |    2 --
 net/netfilter/ipvs/ip_vs_app.c |   23 ++++++++++-------------
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 272f593..30b49ed 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -801,8 +801,6 @@ struct netns_ipvs {
 	struct list_head	rs_table[IP_VS_RTAB_SIZE];
 	/* ip_vs_app */
 	struct list_head	app_list;
-	struct mutex		app_mutex;
-	struct lock_class_key	app_key;	/* mutex debuging */
 
 	/* ip_vs_proto */
 	#define IP_VS_PROTO_TAB_SIZE	32	/* must be power of 2 */
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 5c48ffb..2dc6de1 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -43,6 +43,8 @@ EXPORT_SYMBOL(register_ip_vs_app);
 EXPORT_SYMBOL(unregister_ip_vs_app);
 EXPORT_SYMBOL(register_ip_vs_app_inc);
 
+static DEFINE_MUTEX(__ip_vs_app_mutex);
+
 /*
  *	Get an ip_vs_app object
  */
@@ -167,14 +169,13 @@ int
 register_ip_vs_app_inc(struct net *net, struct ip_vs_app *app, __u16 proto,
 		       __u16 port)
 {
-	struct netns_ipvs *ipvs = net_ipvs(net);
 	int result;
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	result = ip_vs_app_inc_new(net, app, proto, port);
 
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 
 	return result;
 }
@@ -189,11 +190,11 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
 	/* increase the module use count */
 	ip_vs_use_count_inc();
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	list_add(&app->a_list, &ipvs->app_list);
 
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 
 	return 0;
 }
@@ -205,10 +206,9 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
  */
 void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
 {
-	struct netns_ipvs *ipvs = net_ipvs(net);
 	struct ip_vs_app *inc, *nxt;
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	list_for_each_entry_safe(inc, nxt, &app->incs_list, a_list) {
 		ip_vs_app_inc_release(net, inc);
@@ -216,7 +216,7 @@ void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
 
 	list_del(&app->a_list);
 
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 
 	/* decrease the module use count */
 	ip_vs_use_count_dec();
@@ -501,7 +501,7 @@ static void *ip_vs_app_seq_start(struct seq_file *seq, loff_t *pos)
 	struct net *net = seq_file_net(seq);
 	struct netns_ipvs *ipvs = net_ipvs(net);
 
-	mutex_lock(&ipvs->app_mutex);
+	mutex_lock(&__ip_vs_app_mutex);
 
 	return *pos ? ip_vs_app_idx(ipvs, *pos - 1) : SEQ_START_TOKEN;
 }
@@ -535,9 +535,7 @@ static void *ip_vs_app_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 
 static void ip_vs_app_seq_stop(struct seq_file *seq, void *v)
 {
-	struct netns_ipvs *ipvs = net_ipvs(seq_file_net(seq));
-
-	mutex_unlock(&ipvs->app_mutex);
+	mutex_unlock(&__ip_vs_app_mutex);
 }
 
 static int ip_vs_app_seq_show(struct seq_file *seq, void *v)
@@ -583,7 +581,6 @@ static int __net_init __ip_vs_app_init(struct net *net)
 	struct netns_ipvs *ipvs = net_ipvs(net);
 
 	INIT_LIST_HEAD(&ipvs->app_list);
-	__mutex_init(&ipvs->app_mutex, "ipvs->app_mutex", &ipvs->app_key);
 	proc_net_fops_create(net, "ip_vs_app", 0, &ip_vs_app_fops);
 	return 0;
 }
-- 
1.7.2.3


  reply	other threads:[~2011-03-22  1:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-21  2:51 [GIT] Networking David Miller
2011-03-21 12:53 ` [held lock freed] " Ingo Molnar
2011-03-21 13:32   ` Eric Dumazet
2011-03-21 14:50     ` Arnd Bergmann
2011-03-21 14:55       ` Eric Dumazet
2011-03-21 15:22         ` Arnd Bergmann
2011-03-21 16:16           ` Ingo Molnar
2011-03-22  1:18           ` David Miller
2011-03-21 16:15     ` Ingo Molnar
2011-03-21 16:42       ` [slab poison overwritten] " Ingo Molnar
2011-03-21 17:37         ` Ingo Molnar
2011-03-21 17:39         ` Ingo Molnar
2011-03-21 18:07           ` Eric Dumazet
2011-03-21 20:15             ` Eric Dumazet
2011-03-21 22:13               ` Simon Horman
2011-03-21 23:29                 ` Simon Horman
2011-03-22  0:01                   ` Simon Horman
2011-03-22  0:17                     ` Simon Horman
2011-03-22  1:18                       ` Simon Horman [this message]
2011-03-22  3:40                         ` David Miller
2011-03-22  3:39                 ` David Miller
2011-03-22  9:56               ` Ingo Molnar
2011-03-22 10:00                 ` Eric Dumazet
2011-03-22 21:52                   ` Simon Horman
2011-03-22  9:07           ` Ingo Molnar
2011-03-22  1:16       ` [held lock freed] " David Miller
2011-03-21 19:24 ` Linus Torvalds
2011-03-21 20:10   ` Eric Dumazet
2011-03-22  4:09   ` David Miller
2011-03-22 10:00   ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110322011801.GE27019@verge.net.au \
    --to=horms@verge.net.au \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=hans@schillstrom.com \
    --cc=ja@ssi.bg \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=netdev@vger.kernel.org \
    --cc=penberg@cs.helsinki.fi \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.