* kernel panic when running /etc/init.d/iptables restart
@ 2012-12-24 5:51 canqun zhang
2012-12-25 5:36 ` Gao feng
0 siblings, 1 reply; 8+ messages in thread
From: canqun zhang @ 2012-12-24 5:51 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel, netfilter, linux-kernel
Hi Patrick,
If I start one lxc container instance, the system has two net
namespaces: init_net and the one created by lxc. Running
"/etc/init.d/iptables restart" then panics the system. I found that
restarting iptables cleans up the init_net namespace first and then
the net namespace created by lxc, but the functions that clean up
init_net reset global variables such as nf_ct_destroy and
ip_ct_attach, so the functions that clean up the other net namespace
panic.
I fixed it up (see below). If the system needs to clean up the
init_net namespace, the conntrack entries belonging to the other
namespaces are cleaned up first.
diff -r 7884e663ef6f -r 57fd45b8a144 net/netfilter/nf_conntrack_core.c
--- a/net/netfilter/nf_conntrack_core.c Sun Dec 09 21:41:08 2012 +0800
+++ b/net/netfilter/nf_conntrack_core.c Sun Dec 23 16:28:15 2012 +0800
@@ -1122,7 +1122,22 @@
 static void nf_conntrack_cleanup_net(struct net *net)
 {
- i_see_dead_people:
+    if (net == &init_net) {
+        struct net *net_poll;
+        rcu_read_lock();
+        for_each_net_rcu(net_poll) {
+            synchronize_net();
+ again:
+            nf_ct_iterate_cleanup(net_poll, kill_all, NULL);
+            nf_ct_release_dying_list(net_poll);
+            if (atomic_read(&net_poll->ct.count) != 0) {
+                schedule();
+                goto again;
+            }
+        }
+        rcu_read_unlock();
+    }
+i_see_dead_people:
     nf_ct_iterate_cleanup(net, kill_all, NULL);
     nf_ct_release_dying_list(net);
     if (atomic_read(&net->ct.count) != 0) {
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-24 5:51 kernel panic when running /etc/init.d/iptables restart canqun zhang
@ 2012-12-25 5:36 ` Gao feng
2012-12-25 7:25 ` canqun zhang
0 siblings, 1 reply; 8+ messages in thread
From: Gao feng @ 2012-12-25 5:36 UTC (permalink / raw)
To: canqun zhang
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
cc netdev
Hi canqun:
On 2012/12/24 13:51, canqun zhang wrote:
> Hi Patrick,
> If I start one lxc container instance, the system has two net
> namespaces: init_net and the one created by lxc. Running
> "/etc/init.d/iptables restart" then panics the system. I found that
> restarting iptables cleans up the init_net namespace first and then
> the net namespace created by lxc, but the functions that clean up
> init_net reset global variables such as nf_ct_destroy and
> ip_ct_attach, so the functions that clean up the other net namespace
> panic.
>
I doubt that the system panics for this reason.
When we rmmod nf_conntrack_ipv[4,6], we already call nf_ct_iterate_cleanup
to destroy the nf_conns that belong to the l[3,4]proto protocols. At that
time nf_ct_destroy still points to destroy_conntrack, because the
nf_conntrack module is held by the l3proto and l4proto modules.
You can check the function nf_conntrack_l[3,4]proto_unregister.
Could you describe the problem a little more clearly?
Reproduction steps and the oops stack dump would be useful.
Thanks!
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-25 5:36 ` Gao feng
@ 2012-12-25 7:25 ` canqun zhang
2012-12-25 8:38 ` Gao feng
2012-12-25 8:50 ` Gao feng
0 siblings, 2 replies; 8+ messages in thread
From: canqun zhang @ 2012-12-25 7:25 UTC (permalink / raw)
To: Gao feng
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
Hi Gao feng,
The stack information is as follows. The kernel panics because
nf_ct_destroy is NULL.
Reproduction:
(1) Start an lxc container.
(2) iptables -t nat -A POSTROUTING -s 10.48.254.18 -o eth1 -j MASQUERADE
    (run it on the host machine)
(3) /etc/init.d/iptables save (run it on the host machine)
(4) /etc/init.d/iptables restart (run it on the host machine)
Stack:
Pid: 0, comm: swapper Not tainted 2.6.32-279.14.1.rc3.el6.x86_64 #1 IBM System x3650 M4 -[7915IA4]-/00J6528
RIP: 0010:[<ffffffff81466949>] [<ffffffff81466949>] nf_conntrack_destroy+0x19/0x30
RSP: 0018:ffff880028303ab0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff881051b237c0 RCX: 00000000000000d4
RDX: 0000000000000500 RSI: 0000000000000000 RDI: ffff881056bc0528
RBP: ffff880028303ab0 R08: ffff8810514fc020 R09: ffff8810574b6110
R10: 0000000000000000 R11: 0000000000000004 R12: ffffffff814445fe
R13: ffff88105327fba8 R14: ffff882059ed6e00 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f3484002098 CR3: 0000002056792000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8810590b2000, task ffff88205954c040)
Stack:
ffff880028303ad0 ffffffff8142febd ffff880028303b30 ffff881051b237c0
<d> ffff880028303af0 ffffffff8142fc36 0000000200000004 ffff881051b237c0
<d> ffff880028303b20 ffffffff8142fd82 ffff880028303b20 ffff882059ed6d80
Call Trace:
<IRQ>
[<ffffffff8142febd>] skb_release_head_state+0xed/0x110
[<ffffffff8142fc36>] __kfree_skb+0x16/0xa0
[<ffffffff8142fd82>] kfree_skb+0x42/0x90
[<ffffffff814445fe>] __neigh_event_send+0x11e/0x1e0
[<ffffffff814447f3>] neigh_resolve_output+0x133/0x370
[<ffffffff81054d55>] ? select_idle_sibling+0x95/0x150
[<ffffffff814774b7>] ip_finish_output+0x237/0x310
[<ffffffff81477648>] ip_output+0xb8/0xc0
[<ffffffff81476945>] ip_local_out+0x25/0x30
[<ffffffff81476e20>] ip_queue_xmit+0x190/0x420
[<ffffffff8106012c>] ? try_to_wake_up+0x24c/0x3e0
[<ffffffff8148bbae>] tcp_transmit_skb+0x3fe/0x7b0
[<ffffffff8148cfda>] tcp_retransmit_skb+0x1ba/0x5f0
[<ffffffff81053463>] ? __wake_up+0x53/0x70
[<ffffffff8148fd00>] ? tcp_write_timer+0x0/0x200
[<ffffffff8148f85f>] tcp_retransmit_timer+0x1df/0x680
[<ffffffff8148fe98>] tcp_write_timer+0x198/0x200
[<ffffffff8107e907>] run_timer_softirq+0x197/0x340
[<ffffffff810a2350>] ? tick_sched_timer+0x0/0xc0
[<ffffffff8102b40d>] ? lapic_next_event+0x1d/0x30
[<ffffffff81073f31>] __do_softirq+0xc1/0x1e0
[<ffffffff81096d30>] ? hrtimer_interrupt+0x140/0x250
[<ffffffff8100c24c>] call_softirq+0x1c/0x30
[<ffffffff8100de85>] do_softirq+0x65/0xa0
[<ffffffff81073d15>] irq_exit+0x85/0x90
[<ffffffff81506050>] smp_apic_timer_interrupt+0x70/0x9b
[<ffffffff8100bc13>] apic_timer_interrupt+0x13/0x20
<EOI>
[<ffffffff812cd9ae>] ? intel_idle+0xde/0x170
[<ffffffff812cd991>] ? intel_idle+0xc1/0x170
[<ffffffff8109922d>] ? sched_clock_cpu+0xcd/0x110
[<ffffffff81407827>] cpuidle_idle_call+0xa7/0x140
[<ffffffff81009e06>] cpu_idle+0xb6/0x110
[<ffffffff814f714f>] start_secondary+0x22a/0x26d
Code: 02 ff d0 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 8b 05 90 7c ba 00 48 85 c0 74 04 ff d0 c9 c3 <0f> 0b 0f 1f 44 00 00 eb f9 66 66 66 66 66 2e 0f 1f 84 00 00 00
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-25 7:25 ` canqun zhang
@ 2012-12-25 8:38 ` Gao feng
2012-12-25 10:45 ` canqun zhang
2012-12-28 3:39 ` canqun zhang
2012-12-25 8:50 ` Gao feng
1 sibling, 2 replies; 8+ messages in thread
From: Gao feng @ 2012-12-25 8:38 UTC (permalink / raw)
To: canqun zhang
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
On 2012/12/25 15:25, canqun zhang wrote:
> Hi Gao feng,
> The stack information is as follows. The kernel panics because
> nf_ct_destroy is NULL.
>
> Reproduction:
> (1) Start an lxc container.
> (2) iptables -t nat -A POSTROUTING -s 10.48.254.18 -o eth1 -j MASQUERADE
>     (run it on the host machine)
> (3) /etc/init.d/iptables save (run it on the host machine)
> (4) /etc/init.d/iptables restart (run it on the host machine)
>
Thanks!
It seems that nf_conntrack_l[3,4]proto_unregister doesn't make sure
that the nf_conns of the proto are destroyed.
If I'm right, there is another problem even if you fix this panic:
the l3/l4proto will be unregistered before all of its nf_conns are
destroyed. So even if nf_ct_destroy is not NULL, destroy_conntrack
cannot find the right l4proto, l4proto->destroy will be wrong, and
resources will not be released correctly.
So I think the root problem is that we do both register/unregister and
set/unset on the first net (init_net). Maybe it's better to do
register/set on the first net and unregister/unset on the last net.
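
[Editor's sketch, not part of the original message.] The suggestion above, set the global hook when the first net comes up and clear it only when the last net goes away, can be illustrated in userspace roughly as follows. Everything here is hypothetical scaffolding: destroy_hook stands in for nf_ct_destroy, and fake_net_init/fake_net_exit stand in for the per-namespace init and cleanup paths. It is not the kernel code and not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked connection. */
struct fake_conn {
    int destroyed;
};

/* Global hook shared by every namespace, standing in for nf_ct_destroy. */
static void (*destroy_hook)(struct fake_conn *);
/* Number of namespaces that currently rely on the hook. */
static int hook_users;

static void destroy_conn(struct fake_conn *ct)
{
    ct->destroyed = 1;
}

/* Per-namespace init: only the first namespace installs the hook. */
void fake_net_init(void)
{
    if (hook_users++ == 0)
        destroy_hook = destroy_conn;
}

/* Per-namespace exit: only the last namespace clears the hook. */
void fake_net_exit(void)
{
    assert(hook_users > 0);
    if (--hook_users == 0)
        destroy_hook = NULL;
}

/*
 * Returns 0 on success and -1 when the hook is already gone, which is
 * the situation that panics in the report above.
 */
int fake_conntrack_destroy(struct fake_conn *ct)
{
    if (destroy_hook == NULL)
        return -1;
    destroy_hook(ct);
    return 0;
}
```

With two calls to fake_net_init() (init_net plus one container), a single fake_net_exit() leaves the hook in place, so a connection owned by the remaining namespace can still be destroyed. Only the second fake_net_exit() clears it, regardless of which namespace exits first.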
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-25 8:38 ` Gao feng
@ 2012-12-25 10:45 ` canqun zhang
2012-12-28 3:27 ` canqun zhang
2012-12-28 3:39 ` canqun zhang
1 sibling, 1 reply; 8+ messages in thread
From: canqun zhang @ 2012-12-25 10:45 UTC (permalink / raw)
To: Gao feng
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
Thanks for your suggestion. I will modify the patch and test it.
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-25 10:45 ` canqun zhang
@ 2012-12-28 3:27 ` canqun zhang
0 siblings, 0 replies; 8+ messages in thread
From: canqun zhang @ 2012-12-28 3:27 UTC (permalink / raw)
To: Gao feng
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
Hi all,
As discussed above, if the host machine creates several Linux
containers, there will be several net namespaces. The "nf conntrack"
resources are registered and unregistered on the first net namespace
(init_net), but init_net is not the last namespace to be cleaned up,
so cleaning up the other net namespaces triggers a panic.
If net namespaces are created in the order 1,2,...,n, they should be
cleaned up in the order n,...,2,1, so that init_net is cleaned up
last.
I fixed it up (see below).
diff -r 6a1a258923f5 -r 2667e89e6f50 net/core/net_namespace.c
--- a/net/core/net_namespace.c Fri Dec 28 11:01:17 2012 +0800
+++ b/net/core/net_namespace.c Fri Dec 28 11:05:12 2012 +0800
@@ -450,7 +450,7 @@
     list_del(&ops->list);
     for_each_net(net)
-        list_add_tail(&net->exit_list, &net_exit_list);
+        list_add(&net->exit_list, &net_exit_list);
     ops_exit_list(ops, &net_exit_list);
     ops_free_list(ops, &net_exit_list);
 }
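
[Editor's sketch, not part of the original message.] The effect of swapping list_add_tail for list_add can be checked with a small userspace model. It assumes, as the message does, that for_each_net visits namespaces in creation order with init_net first. The list helpers below mimic the kernel's head/tail insertion semantics, while struct fake_net, MAX_NETS and exit_order() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry between two known neighbours. */
static void list_link(struct list_head *entry,
                      struct list_head *prev, struct list_head *next)
{
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
}

/* Insert right after the head, so the newest entry is walked first. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    list_link(entry, head, head->next);
}

/* Insert right before the head, so the newest entry is walked last. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    list_link(entry, head->prev, head);
}

/* Hypothetical miniature of struct net: id 1 plays the role of init_net. */
struct fake_net {
    int id;
    struct list_head exit_list;
};

#define MAX_NETS 16

/*
 * Queue namespaces 1..n in creation order, then record the order in which
 * the exit list would be walked. Returns the number of ids written to out.
 */
int exit_order(int use_tail, int n, int *out)
{
    struct fake_net nets[MAX_NETS];
    struct list_head exit_list;
    struct list_head *pos;
    int i, count = 0;

    assert(n >= 0 && n <= MAX_NETS);
    list_init(&exit_list);
    for (i = 0; i < n; i++) {
        nets[i].id = i + 1;
        if (use_tail)
            list_add_tail(&nets[i].exit_list, &exit_list);
        else
            list_add(&nets[i].exit_list, &exit_list);
    }
    for (pos = exit_list.next; pos != &exit_list; pos = pos->next) {
        /* Recover the containing fake_net from its embedded list node. */
        struct fake_net *net = (struct fake_net *)
            ((char *)pos - offsetof(struct fake_net, exit_list));
        out[count++] = net->id;
    }
    return count;
}
```

Calling exit_order(1, 3, order) models the old list_add_tail behaviour and puts id 1 (init_net) first in the exit list. exit_order(0, 3, order) models the patched list_add behaviour and puts it last.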
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-25 8:38 ` Gao feng
2012-12-25 10:45 ` canqun zhang
@ 2012-12-28 3:39 ` canqun zhang
1 sibling, 0 replies; 8+ messages in thread
From: canqun zhang @ 2012-12-28 3:39 UTC (permalink / raw)
To: Gao feng
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
Hi all,
As discussed above, if the host machine creates several Linux
containers, there will be several net namespaces. The "nf conntrack"
resources are registered and unregistered on the first net namespace
(init_net), but init_net is not the last namespace to be cleaned up,
so cleaning up the other net namespaces triggers a panic.
If net namespaces are created in the order 1,2,...,n, they should be
cleaned up in the order n,...,2,1, so that init_net is cleaned up
last.
I fixed it up (see below).
diff -r 6a1a258923f5 -r 2667e89e6f50 net/core/net_namespace.c
--- a/net/core/net_namespace.c Fri Dec 28 11:01:17 2012 +0800
+++ b/net/core/net_namespace.c Fri Dec 28 11:05:12 2012 +0800
@@ -450,7 +450,7 @@
     list_del(&ops->list);
     for_each_net(net)
-        list_add_tail(&net->exit_list, &net_exit_list);
+        list_add(&net->exit_list, &net_exit_list);
     ops_exit_list(ops, &net_exit_list);
     ops_free_list(ops, &net_exit_list);
 }
* Re: kernel panic when running /etc/init.d/iptables restart
2012-12-25 7:25 ` canqun zhang
2012-12-25 8:38 ` Gao feng
@ 2012-12-25 8:50 ` Gao feng
1 sibling, 0 replies; 8+ messages in thread
From: Gao feng @ 2012-12-25 8:50 UTC (permalink / raw)
To: canqun zhang
Cc: Patrick McHardy, netfilter-devel, netfilter, linux-kernel,
netdev@vger.kernel.org
On 2012/12/25 15:25, canqun zhang wrote:
> Hi Gao feng,
> The stack information is as follows. The kernel panics because
> nf_ct_destroy is NULL.
Thanks!
It seems that nf_conntrack_l[3,4]proto_unregister doesn't make sure
that the nf_conns of the proto are destroyed.
If I'm right, there is another problem even if you fix this panic:
the l3/l4proto will be unregistered before all of its nf_conns are
destroyed. So even if nf_ct_destroy is not NULL, destroy_conntrack
cannot find the right l4proto, l4proto->destroy will be wrong, and
resources will not be released correctly.
So I think the root problem is that we do both register/unregister and
set/unset on the first net (init_net). Maybe it's better to do
register/set on the first net and unregister/unset on the last net.