From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, David Miller <davem@davemloft.net>,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
Arnd Bergmann <arnd@arndb.de>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Julian Anastasov <ja@ssi.bg>,
Hans Schillstrom <hans@schillstrom.com>
Subject: Re: [slab poison overwritten] Re: [GIT] Networking
Date: Tue, 22 Mar 2011 10:18:01 +0900 [thread overview]
Message-ID: <20110322011801.GE27019@verge.net.au> (raw)
In-Reply-To: <20110322001706.GD27019@verge.net.au>
On Tue, Mar 22, 2011 at 09:17:07AM +0900, Simon Horman wrote:
> On Tue, Mar 22, 2011 at 09:01:33AM +0900, Simon Horman wrote:
> > On Tue, Mar 22, 2011 at 08:29:21AM +0900, Simon Horman wrote:
> > > On Tue, Mar 22, 2011 at 07:13:58AM +0900, Simon Horman wrote:
> > > > On Mon, Mar 21, 2011 at 09:15:40PM +0100, Eric Dumazet wrote:
> > > > > Le lundi 21 mars 2011 à 19:07 +0100, Eric Dumazet a écrit :
> > > > > > Le lundi 21 mars 2011 à 18:39 +0100, Ingo Molnar a écrit :
> > > > > > > here's the same but with kallsyms enabled.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Ingo
> > > > > > >
> > > > > > > [ 9.585627] initcall 0xffffffff81d5b806 returned 0 after 0 usecs
> > > > > > > [ 9.588960] calling 0xffffffff81d5b9da @ 1
> > > > > > > [ 9.592303] IPVS: Creating netns size=1272 id=0
> > > > > > > [ 9.595646] IPVS: __ip_vs_control_init(): alloc_percpu.
> > > > > > > [ 9.602298] IPVS: cannot register namespace.
> > > > > > > [ 9.605627] IPVS: can't setup control
> > > > > >
> > > > > > It seems IPVS is busted in case of memory allocation error in
> > > > > > __ip_vs_control_init()
> > > > > >
> > > > > > IPVS deinits its "struct netns_ipvs" space, but something (in IPVS) uses
> > > > > > it after free.
> > > > > >
> > > > > > __ip_vs_init() seems to be called before ip_vs_init() completes
> > > > > > correctly. We then keep in net->ipvs a pointer to some freed memory.
> > > > > >
> > > > > > Commit 14e405461e664b7 did some changes in this area
> > > > > >
> > > > > > Simon, any idea ?
> > > > > >
> > > > > >
> > > > >
> > > > > For the time being, we can avoid the false memory allocation error (and
> > > > > leak)
> > > >
> > > > Sorry, that typo is my work.
> > >
> > > With your patch applied I now see the following
> > >
> > > ffff880003bbf1a0 corresponds to &ipvs->app_key in __ip_vs_app_init().
> > > I'll continue looking into this.
> > >
> > > [ 12.610000] IPVS: Creating netns size=2456 id=0
> > > [ 12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> > > [ 12.640000] BUG: key ffff880003bbf1a0 not in .data!
> > > [ 12.640000] ------------[ cut here ]------------
> > > [ 12.640000] WARNING: at kernel/lockdep.c:2701
> > > lockdep_init_map+0x37b/0x570()
> > > [ 12.640000] Hardware name: Bochs
> > > [ 12.640000] Pid: 1, comm: swapper Tainted: G W
> > > 2.6.38-kexec-06330-g69b7efe-dirty #122
> > > [ 12.650000] Call Trace:
> > > [ 12.650000] [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
> > > [ 12.650000] [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
> > > [ 12.650000] [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
> > > [ 12.650000] [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
> > > [ 12.650000] [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
> > > [ 12.650000] [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
> > > [ 12.650000] [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
> > > [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [ 12.660000] [<ffffffff811b1c33>] T.620+0x43/0x170
> > > [ 12.660000] [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
> > > [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [ 12.660000] [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
> > > [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
> > > [ 12.670000] [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
> > > [ 12.670000] [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
> > > [ 12.670000] [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
> > > [ 12.670000] [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
> > > [ 12.670000] [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
> > > [ 12.670000] [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
> > > [ 12.670000] [<ffffffff8128ad40>] ? restore_args+0x0/0x30
> > > [ 12.680000] [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
> > > [ 12.680000] [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x10
> > > [ 12.680000] ---[ end trace 4eaa2a86a8e2da23 ]---
> >
> > It seems that the problem above was introduced by
> > ab8a5e8408c3 ("IPVS: netns awareness to ip_vs_app").
> > I assume the hungs are the cause:
>
> s/hungs/hunks below/
>
> I am a little unsure of what to do about this.
>
> The problem seems to be that ipvs->app_key is not in static storage.
> But I'm not sure how to resolve that given that the struct netns_ipvs is
> per-network namespace. So I guess that the locking needs to be re-worked.
> Again, I'm a little unsure of what the best way forward is.
I had an idea for a fix over breakfast.
IPVS: Use global mutex in ip_vs_app.c
As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.
Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.
This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.
[ 12.610000] IPVS: Creating netns size=2456 id=0
[ 12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[ 12.640000] BUG: key ffff880003bbf1a0 not in .data!
[ 12.640000] ------------[ cut here ]------------
[ 12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[ 12.640000] Hardware name: Bochs
[ 12.640000] Pid: 1, comm: swapper Tainted: G W 2.6.38-kexec-06330-g69b7efe-dirty #122
[ 12.650000] Call Trace:
[ 12.650000] [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[ 12.650000] [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[ 12.650000] [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[ 12.650000] [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[ 12.650000] [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[ 12.650000] [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[ 12.650000] [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.660000] [<ffffffff811b1c33>] T.620+0x43/0x170
[ 12.660000] [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.660000] [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.670000] [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[ 12.670000] [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[ 12.670000] [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[ 12.670000] [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[ 12.670000] [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[ 12.670000] [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[ 12.670000] [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[ 12.680000] [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[ 12.680000] [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0
Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
---
include/net/ip_vs.h | 2 --
net/netfilter/ipvs/ip_vs_app.c | 23 ++++++++++-------------
2 files changed, 10 insertions(+), 15 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 272f593..30b49ed 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -801,8 +801,6 @@ struct netns_ipvs {
struct list_head rs_table[IP_VS_RTAB_SIZE];
/* ip_vs_app */
struct list_head app_list;
- struct mutex app_mutex;
- struct lock_class_key app_key; /* mutex debuging */
/* ip_vs_proto */
#define IP_VS_PROTO_TAB_SIZE 32 /* must be power of 2 */
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 5c48ffb..2dc6de1 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -43,6 +43,8 @@ EXPORT_SYMBOL(register_ip_vs_app);
EXPORT_SYMBOL(unregister_ip_vs_app);
EXPORT_SYMBOL(register_ip_vs_app_inc);
+static DEFINE_MUTEX(__ip_vs_app_mutex);
+
/*
* Get an ip_vs_app object
*/
@@ -167,14 +169,13 @@ int
register_ip_vs_app_inc(struct net *net, struct ip_vs_app *app, __u16 proto,
__u16 port)
{
- struct netns_ipvs *ipvs = net_ipvs(net);
int result;
- mutex_lock(&ipvs->app_mutex);
+ mutex_lock(&__ip_vs_app_mutex);
result = ip_vs_app_inc_new(net, app, proto, port);
- mutex_unlock(&ipvs->app_mutex);
+ mutex_unlock(&__ip_vs_app_mutex);
return result;
}
@@ -189,11 +190,11 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
/* increase the module use count */
ip_vs_use_count_inc();
- mutex_lock(&ipvs->app_mutex);
+ mutex_lock(&__ip_vs_app_mutex);
list_add(&app->a_list, &ipvs->app_list);
- mutex_unlock(&ipvs->app_mutex);
+ mutex_unlock(&__ip_vs_app_mutex);
return 0;
}
@@ -205,10 +206,9 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
*/
void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
{
- struct netns_ipvs *ipvs = net_ipvs(net);
struct ip_vs_app *inc, *nxt;
- mutex_lock(&ipvs->app_mutex);
+ mutex_lock(&__ip_vs_app_mutex);
list_for_each_entry_safe(inc, nxt, &app->incs_list, a_list) {
ip_vs_app_inc_release(net, inc);
@@ -216,7 +216,7 @@ void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
list_del(&app->a_list);
- mutex_unlock(&ipvs->app_mutex);
+ mutex_unlock(&__ip_vs_app_mutex);
/* decrease the module use count */
ip_vs_use_count_dec();
@@ -501,7 +501,7 @@ static void *ip_vs_app_seq_start(struct seq_file *seq, loff_t *pos)
struct net *net = seq_file_net(seq);
struct netns_ipvs *ipvs = net_ipvs(net);
- mutex_lock(&ipvs->app_mutex);
+ mutex_lock(&__ip_vs_app_mutex);
return *pos ? ip_vs_app_idx(ipvs, *pos - 1) : SEQ_START_TOKEN;
}
@@ -535,9 +535,7 @@ static void *ip_vs_app_seq_next(struct seq_file *seq, void *v, loff_t *pos)
static void ip_vs_app_seq_stop(struct seq_file *seq, void *v)
{
- struct netns_ipvs *ipvs = net_ipvs(seq_file_net(seq));
-
- mutex_unlock(&ipvs->app_mutex);
+ mutex_unlock(&__ip_vs_app_mutex);
}
static int ip_vs_app_seq_show(struct seq_file *seq, void *v)
@@ -583,7 +581,6 @@ static int __net_init __ip_vs_app_init(struct net *net)
struct netns_ipvs *ipvs = net_ipvs(net);
INIT_LIST_HEAD(&ipvs->app_list);
- __mutex_init(&ipvs->app_mutex, "ipvs->app_mutex", &ipvs->app_key);
proc_net_fops_create(net, "ip_vs_app", 0, &ip_vs_app_fops);
return 0;
}
--
1.7.2.3
next prev parent reply other threads:[~2011-03-22 1:18 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-21 2:51 [GIT] Networking David Miller
2011-03-21 12:53 ` [held lock freed] " Ingo Molnar
2011-03-21 13:32 ` Eric Dumazet
2011-03-21 14:50 ` Arnd Bergmann
2011-03-21 14:55 ` Eric Dumazet
2011-03-21 15:22 ` Arnd Bergmann
2011-03-21 16:16 ` Ingo Molnar
2011-03-22 1:18 ` David Miller
2011-03-21 16:15 ` Ingo Molnar
2011-03-21 16:42 ` [slab poison overwritten] " Ingo Molnar
2011-03-21 17:37 ` Ingo Molnar
2011-03-21 17:39 ` Ingo Molnar
2011-03-21 18:07 ` Eric Dumazet
2011-03-21 20:15 ` Eric Dumazet
2011-03-21 22:13 ` Simon Horman
2011-03-21 23:29 ` Simon Horman
2011-03-22 0:01 ` Simon Horman
2011-03-22 0:17 ` Simon Horman
2011-03-22 1:18 ` Simon Horman [this message]
2011-03-22 3:40 ` David Miller
2011-03-22 3:39 ` David Miller
2011-03-22 9:56 ` Ingo Molnar
2011-03-22 10:00 ` Eric Dumazet
2011-03-22 21:52 ` Simon Horman
2011-03-22 9:07 ` Ingo Molnar
2011-03-22 1:16 ` [held lock freed] " David Miller
2011-03-21 19:24 ` Linus Torvalds
2011-03-21 20:10 ` Eric Dumazet
2011-03-22 4:09 ` David Miller
2011-03-22 10:00 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110322011801.GE27019@verge.net.au \
--to=horms@verge.net.au \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=hans@schillstrom.com \
--cc=ja@ssi.bg \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=netdev@vger.kernel.org \
--cc=penberg@cs.helsinki.fi \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.