* Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
@ 2012-09-11 9:51 Jesper Dangaard Brouer
2012-09-12 21:36 ` Florian Westphal
0 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2012-09-11 9:51 UTC
To: Pablo Neira Ayuso, netfilter-devel; +Cc: netdev, Florian Westphal, yongjun_wei
Hi Pablo,
I'm hitting this general protection fault when unloading iptable_nat.
Git tree: git://1984.lsi.us.es/nf-next
At commit 0edd94887d19 (ipvs: use list_del_init instead of list_del/INIT_LIST_HEAD)
Note that I'm not seeing this with net-next (at commit 9f00d9776bc5b).
[ 524.590171] general protection fault: 0000 [#1] SMP
[ 524.591067] Modules linked in: netconsole ip_vs_lblc ip_vs_lc ip_vs_rr ip_vs libcrc32c ipt_MASQUERADE nf_nat_ipv4(-) nf_nat iptable_mangle xt_mark ip6table_mangle xt_LOG ip6table_filter ip6_tables virtio_balloon virtio_net [last unloaded: iptable_nat]
[ 524.591067] CPU 0
[ 524.591067] Pid: 5842, comm: modprobe Not tainted 3.6.0-rc3-pablo-nf-next+ #1 Red Hat KVM
[ 524.591067] RIP: 0010:[<ffffffffa002c2fd>] [<ffffffffa002c2fd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
[ 524.591067] RSP: 0018:ffff880073203e18 EFLAGS: 00010246
[ 524.591067] RAX: 0000000000000000 RBX: ffff880077dff2c8 RCX: ffff8800797fab70
[ 524.591067] RDX: dead000000200200 RSI: ffff880073203e88 RDI: ffffffffa002f208
[ 524.591067] RBP: ffff880073203e28 R08: ffff880073202000 R09: 0000000000000000
[ 524.591067] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
list corruption? ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
[ 524.591067] R13: ffff880078671640 R14: ffff880078671638 R15: ffff880073203e88
[ 524.591067] FS: 00007f04dc38b700(0000) GS:ffff88007cc00000(0000) knlGS:0000000000000000
[ 524.591067] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 524.591067] CR2: 00007f04dc398000 CR3: 0000000072238000 CR4: 00000000000006f0
[ 524.591067] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 524.591067] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 524.591067] Process modprobe (pid: 5842, threadinfo ffff880073202000, task ffff8800797fab70)
[ 524.591067] Stack:
[ 524.591067] ffff880073203e68 ffffffffa002c290 ffff880073203e78 ffffffff815614e3
[ 524.591067] ffffffff00000000 0000258d00000246 ffff880073203e68 ffffffff81c6dc00
[ 524.591067] ffff880073203e88 ffffffffa00358a0 0000000000000000 000000000040f5b0
[ 524.591067] Call Trace:
[ 524.591067] [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
[ 524.591067] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
[ 524.591067] [<ffffffffa002c54a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
[ 524.591067] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
[ 524.591067] [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
[ 524.591067] [<ffffffff8109f4a5>] sys_delete_module+0x235/0x2b0
[ 524.591067] [<ffffffff810b8193>] ? __audit_syscall_entry+0x1b3/0x1f0
[ 524.591067] [<ffffffff810b8776>] ? __audit_syscall_exit+0x3e6/0x410
[ 524.591067] [<ffffffff816679e2>] system_call_fastpath+0x16/0x1b
[ 524.591067] Code: 75 6c 0f b6 46 01 84 c0 74 05 3a 42 3e 75 5f 80 7e 02 00 74 41 48 c7 c7 08 f2 02 a0 e8 3d 3b 63 e1 48 8b 03 48 8b 53 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 be 00 02 20 00 00 00 ad de 48 c7
[ 524.591067] RIP [<ffffffffa002c2fd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
[ 524.591067] RSP <ffff880073203e18>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-11 9:51 Oops with latest (netfilter) nf-next tree, when unloading iptable_nat Jesper Dangaard Brouer
@ 2012-09-12 21:36 ` Florian Westphal
2012-09-14 12:07 ` Pablo Neira Ayuso
2012-09-19 19:14 ` Jesper Dangaard Brouer
0 siblings, 2 replies; 16+ messages in thread
From: Florian Westphal @ 2012-09-12 21:36 UTC
To: Jesper Dangaard Brouer
Cc: Pablo Neira Ayuso, netfilter-devel, netdev, Florian Westphal,
yongjun_wei, kaber
Jesper Dangaard Brouer <brouer@redhat.com> wrote:
[ CC'd Patrick ]
> I'm hitting this general protection fault when unloading iptable_nat.
> [ 524.591067] Pid: 5842, comm: modprobe Not tainted 3.6.0-rc3-pablo-nf-next+ #1 Red Hat KVM
> [ 524.591067] RIP: 0010:[<ffffffffa002c2fd>] [<ffffffffa002c2fd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
> [ 524.591067] RSP: 0018:ffff880073203e18 EFLAGS: 00010246
> [ 524.591067] RAX: 0000000000000000 RBX: ffff880077dff2c8 RCX: ffff8800797fab70
> [ 524.591067] RDX: dead000000200200 RSI: ffff880073203e88 RDI: ffffffffa002f208
> [ 524.591067] RBP: ffff880073203e28 R08: ffff880073202000 R09: 0000000000000000
> [ 524.591067] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
> list corruption? ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
Yep, looks like it.
> [ 524.591067] [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
> [ 524.591067] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
> [ 524.591067] [<ffffffffa002c54a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
> [ 524.591067] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
> [ 524.591067] [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
On module removal, nf_nat_ipv4 calls nf_ct_iterate_cleanup(), which invokes
nf_nat_proto_clean() for each conntrack. That then calls
hlist_del_rcu(&nat->bysource) on each conntrack's NAT extension area.
Problem is that nf_nat_proto_clean() is called multiple times for the same
conntrack:
a) nf_ct_iterate_cleanup() returns each ct twice (origin, reply)
b) we call it both for l3 and for l4 protocol ids
We barf in hlist_del_rcu the 2nd time because ->pprev is poisoned.
This was introduced with the ipv6 nat patches.
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -487,7 +487,7 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
if (clean->hash) {
spin_lock_bh(&nf_nat_lock);
- hlist_del_rcu(&nat->bysource);
+ hlist_del_init_rcu(&nat->bysource);
spin_unlock_bh(&nf_nat_lock);
} else {
Would probably avoid it. I guess it would be nicer to only call this
once for each ct.
Patrick, any other idea?
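
A minimal userspace sketch of the failure mode described above, not the kernel's <linux/list.h>. The toy_* names and the poison constant are assumptions made for illustration (the constant is modelled on the dead000000200200 value visible in RDX/R10 in the oops). It shows why a plain delete cannot safely run twice on the same node, while a delete-and-reinit variant can:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for hlist_node / hlist_head (hypothetical names). */
struct toy_hnode {
    struct toy_hnode *next;
    struct toy_hnode **pprev;   /* address of the pointer that points at us */
};

struct toy_hhead {
    struct toy_hnode *first;
};

/* Non-dereferenceable marker written into pprev after a plain delete. */
#define TOY_POISON2 ((struct toy_hnode **)(uintptr_t)0xdead000000200200ULL)

static void toy_add_head(struct toy_hnode *n, struct toy_hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

static int toy_unhashed(const struct toy_hnode *n)
{
    return n->pprev == NULL;
}

/*
 * Plain delete: unlink, then poison pprev. Calling this a second time on
 * the same node would store through TOY_POISON2, i.e. fault.
 */
static void toy_del(struct toy_hnode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->pprev = TOY_POISON2;
}

/*
 * Delete-and-reinit: a node that is already unhashed is left alone, and a
 * freshly unlinked node is marked unhashed, so repeated calls are no-ops.
 */
static void toy_del_init(struct toy_hnode *n)
{
    if (toy_unhashed(n))
        return;
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->pprev = NULL;
}
```

With these helpers, calling toy_del_init() twice on one node is harmless, whereas a second toy_del() would dereference the poison pointer, which is the pattern the register dump suggests.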
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-12 21:36 ` Florian Westphal
@ 2012-09-14 12:07 ` Pablo Neira Ayuso
2012-09-14 13:15 ` Patrick McHardy
2012-09-19 19:14 ` Jesper Dangaard Brouer
1 sibling, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2012-09-14 12:07 UTC
To: Florian Westphal
Cc: Jesper Dangaard Brouer, netfilter-devel, netdev, yongjun_wei,
kaber
On Wed, Sep 12, 2012 at 11:36:27PM +0200, Florian Westphal wrote:
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> [ CC'd Patrick ]
>
> > I'm hitting this general protection fault when unloading iptable_nat.
> > [ 524.591067] Pid: 5842, comm: modprobe Not tainted 3.6.0-rc3-pablo-nf-next+ #1 Red Hat KVM
> > [ 524.591067] RIP: 0010:[<ffffffffa002c2fd>] [<ffffffffa002c2fd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
> > [ 524.591067] RSP: 0018:ffff880073203e18 EFLAGS: 00010246
> > [ 524.591067] RAX: 0000000000000000 RBX: ffff880077dff2c8 RCX: ffff8800797fab70
> > [ 524.591067] RDX: dead000000200200 RSI: ffff880073203e88 RDI: ffffffffa002f208
> > [ 524.591067] RBP: ffff880073203e28 R08: ffff880073202000 R09: 0000000000000000
> > [ 524.591067] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
> > list corruption? ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
>
> Yep, looks like it.
>
> > [ 524.591067] [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
> > [ 524.591067] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
> > [ 524.591067] [<ffffffffa002c54a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
> > [ 524.591067] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
> > [ 524.591067] [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
>
> On module removal, nf_nat_ipv4 calls nf_ct_iterate_cleanup(), which invokes
> nf_nat_proto_clean() for each conntrack. That then calls
> hlist_del_rcu(&nat->bysource) on each conntrack's NAT extension area.
>
> Problem is that nf_nat_proto_clean() is called multiple times for the same
> conntrack:
> a) nf_ct_iterate_cleanup() returns each ct twice (origin, reply)
> b) we call it both for l3 and for l4 protocol ids
>
> We barf in hlist_del_rcu the 2nd time because ->pprev is poisoned.
>
> This was introduced with the ipv6 nat patches.
>
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -487,7 +487,7 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>
> if (clean->hash) {
> spin_lock_bh(&nf_nat_lock);
> - hlist_del_rcu(&nat->bysource);
> + hlist_del_init_rcu(&nat->bysource);
> spin_unlock_bh(&nf_nat_lock);
> } else {
>
> Would probably avoid it. I guess it would be nicer to only call this
> once for each ct.
>
> Patrick, any other idea?
I already discussed this with Florian (I've been having problems with
two out of three of my email accounts this week... so I couldn't reply
to this email in the mailing list).
We can add nf_nat_iterate_cleanup that can iterate over the NAT
hashtable to replace current usage of nf_ct_iterate_cleanup.
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-14 12:07 ` Pablo Neira Ayuso
@ 2012-09-14 13:15 ` Patrick McHardy
2012-09-19 12:46 ` Jesper Dangaard Brouer
0 siblings, 1 reply; 16+ messages in thread
From: Patrick McHardy @ 2012-09-14 13:15 UTC
To: Pablo Neira Ayuso
Cc: Florian Westphal, Jesper Dangaard Brouer, netfilter-devel, netdev,
yongjun_wei
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2900 bytes --]
On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
> On Wed, Sep 12, 2012 at 11:36:27PM +0200, Florian Westphal wrote:
>> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>>
>> [ CC'd Patrick ]
>>
>>> I'm hitting this general protection fault when unloading iptable_nat.
>>> [ 524.591067] Pid: 5842, comm: modprobe Not tainted 3.6.0-rc3-pablo-nf-next+ #1 Red Hat KVM
>>> [ 524.591067] RIP: 0010:[<ffffffffa002c2fd>] [<ffffffffa002c2fd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
>>> [ 524.591067] RSP: 0018:ffff880073203e18 EFLAGS: 00010246
>>> [ 524.591067] RAX: 0000000000000000 RBX: ffff880077dff2c8 RCX: ffff8800797fab70
>>> [ 524.591067] RDX: dead000000200200 RSI: ffff880073203e88 RDI: ffffffffa002f208
>>> [ 524.591067] RBP: ffff880073203e28 R08: ffff880073202000 R09: 0000000000000000
>>> [ 524.591067] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
>>> list corruption? ^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^
>>
>> Yep, looks like it.
>>
>>> [ 524.591067] [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
>>> [ 524.591067] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
>>> [ 524.591067] [<ffffffffa002c54a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
>>> [ 524.591067] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
>>> [ 524.591067] [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
>>
>> On module removal, nf_nat_ipv4 calls nf_ct_iterate_cleanup(), which invokes
>> nf_nat_proto_clean() for each conntrack. That then calls
>> hlist_del_rcu(&nat->bysource) on each conntrack's NAT extension area.
>>
>> Problem is that nf_nat_proto_clean() is called multiple times for the same
>> conntrack:
>> a) nf_ct_iterate_cleanup() returns each ct twice (origin, reply)
>> b) we call it both for l3 and for l4 protocol ids
>>
>> We barf in hlist_del_rcu the 2nd time because ->pprev is poisoned.
>>
>> This was introduced with the ipv6 nat patches.
>>
>> --- a/net/netfilter/nf_nat_core.c
>> +++ b/net/netfilter/nf_nat_core.c
>> @@ -487,7 +487,7 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>>
>> if (clean->hash) {
>> spin_lock_bh(&nf_nat_lock);
>> - hlist_del_rcu(&nat->bysource);
>> + hlist_del_init_rcu(&nat->bysource);
>> spin_unlock_bh(&nf_nat_lock);
>> } else {
>>
>> Would probably avoid it. I guess it would be nicer to only call this
>> once for each ct.
>>
>> Patrick, any other idea?
>
> I already discussed this with Florian (I've been having problems with
> two out of three of my email accounts this week... so I couldn't reply
> to this email in the mailing list).
>
> We can add nf_nat_iterate_cleanup that can iterate over the NAT
> hashtable to replace current usage of nf_ct_iterate_cleanup.
Let's just bail out when IPS_SRC_NAT_DONE is not set; that should also fix
it. Could you try this patch, please?
[-- Attachment #2: Type: TEXT/PLAIN, Size: 480 bytes --]
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 29d4452..8b5d220 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
if (!nat)
return 0;
+ if (!(i->status & IPS_SRC_NAT_DONE))
+ return 0;
if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
(clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
return 0;
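
A toy model of a flag guard like the one in this patch, not the kernel code. All toy_* names are hypothetical, and the two-mode behaviour (an unlink-only pass that leaves the flag set, and a wipe pass that clears it) is an assumption made for illustration. Under that assumption, the guard only stops repeated work after the wipe pass has run:

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for IPS_SRC_NAT_DONE. */
#define TOY_SRC_NAT_DONE 0x1u

struct toy_ct {
    unsigned int status;
    int unlink_calls;   /* how many times the unlink step ran */
};

/*
 * hash == true:  unlink only; the flag is left set (assumption).
 * hash == false: wipe the state and clear the flag.
 */
static void toy_proto_clean(struct toy_ct *ct, bool hash)
{
    if (!(ct->status & TOY_SRC_NAT_DONE))
        return;                         /* guard: already cleaned up */

    if (hash)
        ct->unlink_calls++;             /* would be the list unlink */
    else
        ct->status &= ~TOY_SRC_NAT_DONE;
}
```

In this model, two unlink passes over the same entry both get past the guard, while any call made after a wipe pass is skipped.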
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-14 13:15 ` Patrick McHardy
@ 2012-09-19 12:46 ` Jesper Dangaard Brouer
2012-09-20 6:57 ` Patrick McHardy
0 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2012-09-19 12:46 UTC
To: Patrick McHardy
Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
>
[...cut...]
> >> Patrick, any other idea?
> >
[...cut...]
> > >
> > We can add nf_nat_iterate_cleanup that can iterate over the NAT
> > hashtable to replace current usage of nf_ct_iterate_cleanup.
>
> Let's just bail out when IPS_SRC_NAT_DONE is not set; that should also fix
> it. Could you try this patch, please?
On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> index 29d4452..8b5d220 100644
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>
> if (!nat)
> return 0;
> + if (!(i->status & IPS_SRC_NAT_DONE))
> + return 0;
> if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
> (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
> return 0;
>
No it does not work :-(
[ 1216.310146] general protection fault: 0000 [#1] SMP
[ 1216.311046] Modules linked in: netconsole ip_vs_lblc ip_vs_lc ip_vs_rr ip_vs libcrc32c ipt_MASQUERADE nf_nat_ipv4(-) nf_nat iptable_mangle xt_mark ip6table_mangle xt_LOG ip6table_filter ip6_tables virtio_balloon virtio_net [last unloaded: iptable_nat]
[ 1216.311046] CPU 1
[ 1216.311046] Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
[ 1216.311046] RIP: 0010:[<ffffffffa002c303>] [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
[ 1216.311046] RSP: 0018:ffff88007808fe18 EFLAGS: 00010246
[ 1216.311046] RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
[ 1216.311046] RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
[ 1216.311046] RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
[ 1216.311046] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
[ 1216.311046] R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
[ 1216.311046] FS: 00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
[ 1216.311046] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1216.311046] CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
[ 1216.311046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1216.311046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1216.311046] Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
[ 1216.311046] Stack:
[ 1216.311046] ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
[ 1216.311046] ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
[ 1216.311046] ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
[ 1216.311046] Call Trace:
[ 1216.311046] [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
[ 1216.311046] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
[ 1216.311046] [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
[ 1216.311046] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
[ 1216.311046] [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
[ 1216.311046] [<ffffffff8109f4a5>] sys_delete_module+0x235/0x2b0
[ 1216.311046] [<ffffffff810b8193>] ? __audit_syscall_entry+0x1b3/0x1f0
[ 1216.311046] [<ffffffff810b8776>] ? __audit_syscall_exit+0x3e6/0x410
[ 1216.311046] [<ffffffff816679e2>] system_call_fastpath+0x16/0x1b
[ 1216.311046] Code: 75 6e 0f b6 46 01 84 c0 74 05 3a 42 3e 75 61 80 7e 02 00 74 43 48 c7 c7 08 f2 02 a0 e8 37 3b 63 e1 48 8b 03 48 8b 53 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 be 00 02 20 00 00 00 ad de 48 c7
[ 1216.311046] RIP [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
[ 1216.311046] RSP <ffff88007808fe18>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-12 21:36 ` Florian Westphal
2012-09-14 12:07 ` Pablo Neira Ayuso
@ 2012-09-19 19:14 ` Jesper Dangaard Brouer
1 sibling, 0 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2012-09-19 19:14 UTC
To: Florian Westphal
Cc: Pablo Neira Ayuso, netfilter-devel, netdev, yongjun_wei, kaber
On Wed, 2012-09-12 at 23:36 +0200, Florian Westphal wrote:
[...cut...]
> On module removal, nf_nat_ipv4 calls nf_ct_iterate_cleanup(), which invokes
> nf_nat_proto_clean() for each conntrack. That then calls
> hlist_del_rcu(&nat->bysource) on each conntrack's NAT extension area.
>
> Problem is that nf_nat_proto_clean() is called multiple times for the same
> conntrack:
> a) nf_ct_iterate_cleanup() returns each ct twice (origin, reply)
> b) we call it both for l3 and for l4 protocol ids
>
> We barf in hlist_del_rcu the 2nd time because ->pprev is poisoned.
>
> This was introduced with the ipv6 nat patches.
>
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -487,7 +487,7 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>
> if (clean->hash) {
> spin_lock_bh(&nf_nat_lock);
> - hlist_del_rcu(&nat->bysource);
> + hlist_del_init_rcu(&nat->bysource);
> spin_unlock_bh(&nf_nat_lock);
> } else {
>
> Would probably avoid it. I guess it would be nicer to only call this
> once for each ct.
Florian's patch fixes the Oops :-)
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-19 12:46 ` Jesper Dangaard Brouer
@ 2012-09-20 6:57 ` Patrick McHardy
2012-09-20 7:29 ` Jesper Dangaard Brouer
2012-09-20 10:08 ` Pablo Neira Ayuso
0 siblings, 2 replies; 16+ messages in thread
From: Patrick McHardy @ 2012-09-20 6:57 UTC
To: Jesper Dangaard Brouer
Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1333 bytes --]
On Wed, 19 Sep 2012, Jesper Dangaard Brouer wrote:
> On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
>> On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
>>
> [...cut...]
>>>> Patrick, any other idea?
>>>
> [...cut...]
>>>>
>>> We can add nf_nat_iterate_cleanup that can iterate over the NAT
>>> hashtable to replace current usage of nf_ct_iterate_cleanup.
>>
>> Let's just bail out when IPS_SRC_NAT_DONE is not set; that should also fix
>> it. Could you try this patch, please?
>
> On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
>> index 29d4452..8b5d220 100644
>> --- a/net/netfilter/nf_nat_core.c
>> +++ b/net/netfilter/nf_nat_core.c
>> @@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i,
> void *data)
>>
>> if (!nat)
>> return 0;
>> + if (!(i->status & IPS_SRC_NAT_DONE))
>> + return 0;
>> if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
>> (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
>> return 0;
>>
>
> No it does not work :-(
Ok I think I understand the problem now, we're invoking the NAT cleanup
callback twice with clean->hash = true, once for each direction of the
conntrack.
Does this patch fix the problem?
[-- Attachment #2: Type: TEXT/PLAIN, Size: 4200 bytes --]
commit 6c46a3bfb2776ca098565daf7e872a3283d14e0d
Author: Patrick McHardy <kaber@trash.net>
Date: Thu Sep 20 08:43:02 2012 +0200
netfilter: nf_nat: fix oops when unloading protocol modules
When unloading a protocol module nf_ct_iterate_cleanup() is used to
remove all conntracks using the protocol from the bysource hash and
clean their NAT sections. Since the conntrack isn't actually killed,
the NAT callback is invoked twice, once for each direction, which
causes an oops when trying to delete it from the bysource hash for
the second time.
The same oops can also happen when removing both an L3 and L4 protocol
since the cleanup function doesn't check whether the conntrack has
already been cleaned up.
Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
RIP: 0010:[<ffffffffa002c303>] [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
RSP: 0018:ffff88007808fe18 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
FS: 00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
Stack:
ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
Call Trace:
[<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
[<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
[<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
[<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
[<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
...
To fix this,
- check whether the conntrack has already been cleaned up in
nf_nat_proto_clean
- change nf_ct_iterate_cleanup() to only invoke the callback function
once for each conntrack (IP_CT_DIR_ORIGINAL).
The second change doesn't affect other callers since when conntracks are
actually killed, both directions are removed from the hash immediately
and the callback is already only invoked once. If it is not killed, the
second callback invocation will always return the same decision not to
kill it.
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index dcb2791..0f241be 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1224,6 +1224,8 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
spin_lock_bh(&nf_conntrack_lock);
for (; *bucket < net->ct.htable_size; (*bucket)++) {
hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket], hnnode) {
+ if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
+ continue;
ct = nf_ct_tuplehash_to_ctrack(h);
if (iter(ct, data))
goto found;
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 1816ad3..65cf694 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
if (!nat)
return 0;
+ if (!(i->status & IPS_SRC_NAT_DONE))
+ return 0;
if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
(clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
return 0;
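
A small userspace sketch of the iteration change in this patch, under stated assumptions: the toy_* types are hypothetical stand-ins, and the hash table is flattened into a plain array. Each connection contributes two hash entries (original and reply), so a walk over all entries invokes the callback twice per connection unless reply-direction entries are skipped:

```c
#include <stddef.h>

enum toy_dir { TOY_DIR_ORIGINAL = 0, TOY_DIR_REPLY = 1, TOY_DIR_MAX = 2 };

struct toy_conn;

/* One hash entry; every connection owns one per direction. */
struct toy_tuplehash {
    enum toy_dir dir;
    struct toy_conn *ct;
};

struct toy_conn {
    struct toy_tuplehash th[TOY_DIR_MAX];
    int visits;                 /* number of callback invocations seen */
};

static void toy_conn_init(struct toy_conn *ct)
{
    for (int d = 0; d < TOY_DIR_MAX; d++) {
        ct->th[d].dir = (enum toy_dir)d;
        ct->th[d].ct = ct;
    }
    ct->visits = 0;
}

typedef int (*toy_iter_fn)(struct toy_conn *ct, void *data);

/*
 * Walk all hash entries and call iter() on the owning connection.
 * With original_only set, reply-direction entries are skipped, so each
 * connection is visited exactly once. Returns the number of callbacks made.
 */
static int toy_iterate(struct toy_tuplehash **table, size_t n,
                       toy_iter_fn iter, void *data, int original_only)
{
    int calls = 0;

    for (size_t i = 0; i < n; i++) {
        if (original_only && table[i]->dir != TOY_DIR_ORIGINAL)
            continue;
        iter(table[i]->ct, data);
        calls++;
    }
    return calls;
}

/* Example callback: count visits and never ask for the entry to be killed. */
static int toy_count_visit(struct toy_conn *ct, void *data)
{
    (void)data;
    ct->visits++;
    return 0;
}
```

For two connections (four hash entries), the unfiltered walk makes four callbacks and the filtered walk makes two, one per connection.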
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-20 6:57 ` Patrick McHardy
@ 2012-09-20 7:29 ` Jesper Dangaard Brouer
2012-09-20 7:31 ` Patrick McHardy
2012-09-20 10:08 ` Pablo Neira Ayuso
1 sibling, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2012-09-20 7:29 UTC
To: Patrick McHardy
Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
On Thu, 2012-09-20 at 08:57 +0200, Patrick McHardy wrote:
> On Wed, 19 Sep 2012, Jesper Dangaard Brouer wrote:
>
> > On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> >> On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
> >>
> > [...cut...]
> >>>> Patrick, any other idea?
> >>>
> > [...cut...]
[... (hair)cut(?)...]
> > No it does not work :-(
>
> Ok I think I understand the problem now, we're invoking the NAT cleanup
> callback twice with clean->hash = true, once for each direction of the
> conntrack.
>
> Does this patch fix the problem?
Yes, it fixes the problem :-)
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-20 7:29 ` Jesper Dangaard Brouer
@ 2012-09-20 7:31 ` Patrick McHardy
0 siblings, 0 replies; 16+ messages in thread
From: Patrick McHardy @ 2012-09-20 7:31 UTC
To: Jesper Dangaard Brouer
Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
On Thu, 20 Sep 2012, Jesper Dangaard Brouer wrote:
> On Thu, 2012-09-20 at 08:57 +0200, Patrick McHardy wrote:
>> On Wed, 19 Sep 2012, Jesper Dangaard Brouer wrote:
>>
>>> On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
>>>> On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
>>>>
>>> [...cut...]
>>>>>> Patrick, any other idea?
>>>>>
>>> [...cut...]
> [... (hair)cut(?)...]
>
>>> No it does not work :-(
>>
>> Ok I think I understand the problem now, we're invoking the NAT cleanup
>> callback twice with clean->hash = true, once for each direction of the
>> conntrack.
>>
>> Does this patch fix the problem?
>
> Yes, it fixes the problem :-)
>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Great, thanks for testing.
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-20 6:57 ` Patrick McHardy
2012-09-20 7:29 ` Jesper Dangaard Brouer
@ 2012-09-20 10:08 ` Pablo Neira Ayuso
2012-09-20 10:31 ` Patrick McHardy
1 sibling, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2012-09-20 10:08 UTC
To: Patrick McHardy
Cc: Jesper Dangaard Brouer, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
On Thu, Sep 20, 2012 at 08:57:04AM +0200, Patrick McHardy wrote:
> On Wed, 19 Sep 2012, Jesper Dangaard Brouer wrote:
>
> >On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> >>On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
> >>
> >[...cut...]
> >>>>Patrick, any other idea?
> >>>
> >[...cut...]
> >>>>
> >>>We can add nf_nat_iterate_cleanup that can iterate over the NAT
> >>>hashtable to replace current usage of nf_ct_iterate_cleanup.
> >>
> >>Let's just bail out when IPS_SRC_NAT_DONE is not set; that should also fix
> >>it. Could you try this patch, please?
> >
> >On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> >diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> >>index 29d4452..8b5d220 100644
> >>--- a/net/netfilter/nf_nat_core.c
> >>+++ b/net/netfilter/nf_nat_core.c
> >>@@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i,
> >void *data)
> >>
> >> if (!nat)
> >> return 0;
> >>+ if (!(i->status & IPS_SRC_NAT_DONE))
> >>+ return 0;
> >> if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
> >> (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
> >> return 0;
> >>
> >
> >No it does not work :-(
>
> Ok I think I understand the problem now, we're invoking the NAT cleanup
> callback twice with clean->hash = true, once for each direction of the
> conntrack.
>
> Does this patch fix the problem?
> commit 6c46a3bfb2776ca098565daf7e872a3283d14e0d
> Author: Patrick McHardy <kaber@trash.net>
> Date: Thu Sep 20 08:43:02 2012 +0200
>
> netfilter: nf_nat: fix oops when unloading protocol modules
>
> When unloading a protocol module nf_ct_iterate_cleanup() is used to
> remove all conntracks using the protocol from the bysource hash and
> clean their NAT sections. Since the conntrack isn't actually killed,
> the NAT callback is invoked twice, once for each direction, which
> causes an oops when trying to delete it from the bysource hash for
> the second time.
>
> The same oops can also happen when removing both an L3 and L4 protocol
> since the cleanup function doesn't check whether the conntrack has
> already been cleaned up.
>
> Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
> RIP: 0010:[<ffffffffa002c303>] [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
> RSP: 0018:ffff88007808fe18 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
> RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
> RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
> R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
> R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
> FS: 00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
> Stack:
> ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
> ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
> ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
> Call Trace:
> [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
> [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
> [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
> [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
> [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
> ...
>
> To fix this,
>
> - check whether the conntrack has already been cleaned up in
> nf_nat_proto_clean
>
> - change nf_ct_iterate_cleanup() to only invoke the callback function
> once for each conntrack (IP_CT_DIR_ORIGINAL).
>
> The second change doesn't affect other callers since when conntracks are
> actually killed, both directions are removed from the hash immediately
> and the callback is already only invoked once. If it is not killed, the
> second callback invocation will always return the same decision not to
> kill it.
>
> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Patrick McHardy <kaber@trash.net>
>
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index dcb2791..0f241be 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1224,6 +1224,8 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
> spin_lock_bh(&nf_conntrack_lock);
> for (; *bucket < net->ct.htable_size; (*bucket)++) {
> hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket], hnnode) {
> + if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
> + continue;
I think this will make the deletion of entries via `conntrack -F'
slower, as we'll have to iterate over more entries (we won't delete
entries for the reply tuple).
I think I prefer Florian's patch: it's fairly small and it does not
change the current nf_ct_iterate behaviour or add some
nf_nat_iterate cleanup.
> ct = nf_ct_tuplehash_to_ctrack(h);
> if (iter(ct, data))
> goto found;
> diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> index 1816ad3..65cf694 100644
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>
> if (!nat)
> return 0;
> + if (!(i->status & IPS_SRC_NAT_DONE))
> + return 0;
> if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
> (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
> return 0;
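The first bullet of the fix (the IPS_SRC_NAT_DONE check in the
nf_nat_core.c hunk above) can be pictured with a small user-space model.
This is only a sketch, not kernel code: struct toy_conn, TOY_NAT_DONE,
toy_nat_clean() and cleanups_after_calls() are made-up names, and the
model assumes that the cleanup path clears the flag it tests, which the
quoted hunk does not show.

```c
#include <string.h>

/* Hypothetical stand-in for a status bit such as IPS_SRC_NAT_DONE. */
#define TOY_NAT_DONE 0x1u

struct toy_conn {
	unsigned int status;       /* flag bits */
	unsigned char nat[16];     /* stand-in for the NAT extension area */
	int cleanups;              /* how often real cleanup work ran */
};

/*
 * Cleanup callback with an early-return guard. It always returns 0,
 * meaning "do not kill this connection", like the callback in the patch.
 */
int toy_nat_clean(struct toy_conn *ct)
{
	if (!(ct->status & TOY_NAT_DONE))
		return 0;                  /* never set up, or already cleaned */

	ct->cleanups++;
	memset(ct->nat, 0, sizeof(ct->nat));
	ct->status &= ~TOY_NAT_DONE;       /* assumption made by this model */
	return 0;
}

/* Run the callback 'calls' times on one connection; report real cleanups. */
int cleanups_after_calls(int calls)
{
	struct toy_conn ct;
	int i;

	memset(&ct, 0, sizeof(ct));
	ct.status = TOY_NAT_DONE;
	for (i = 0; i < calls; i++)
		toy_nat_clean(&ct);
	return ct.cleanups;
}
```

In this model the cleanup body runs at most once per connection, no
matter how many times the iterator hands the same connection to the
callback.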
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-20 10:08 ` Pablo Neira Ayuso
@ 2012-09-20 10:31 ` Patrick McHardy
2012-09-20 17:06 ` Patrick McHardy
0 siblings, 1 reply; 16+ messages in thread
From: Patrick McHardy @ 2012-09-20 10:31 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Jesper Dangaard Brouer, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
>> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
>> index dcb2791..0f241be 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -1224,6 +1224,8 @@ get_next_corpse(struct net *net, int (*iter)(struct nf_conn *i, void *data),
>> spin_lock_bh(&nf_conntrack_lock);
>> for (; *bucket < net->ct.htable_size; (*bucket)++) {
>> hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket], hnnode) {
>> + if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
>> + continue;
>
> I think this will make the deletion of entries via `conntrack -F'
> slower as we'll have to iterate over more entries (we won't delete
> entries for the reply tuple).
Slightly maybe, but I doubt it makes much of a difference.
> I think I prefer Florian's patch, it's fairly small and it does not
> change the current nf_ct_iterate behaviour or adding some
> nf_nat_iterate cleanup.
I don't think I've received it. Could you forward it to me please?
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-20 10:31 ` Patrick McHardy
@ 2012-09-20 17:06 ` Patrick McHardy
2012-09-21 1:00 ` Pablo Neira Ayuso
0 siblings, 1 reply; 16+ messages in thread
From: Patrick McHardy @ 2012-09-20 17:06 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Jesper Dangaard Brouer, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
On Thu, 20 Sep 2012, Patrick McHardy wrote:
>>> diff --git a/net/netfilter/nf_conntrack_core.c
>>> b/net/netfilter/nf_conntrack_core.c
>>> index dcb2791..0f241be 100644
>>> --- a/net/netfilter/nf_conntrack_core.c
>>> +++ b/net/netfilter/nf_conntrack_core.c
>>> @@ -1224,6 +1224,8 @@ get_next_corpse(struct net *net, int (*iter)(struct
>>> nf_conn *i, void *data),
>>> spin_lock_bh(&nf_conntrack_lock);
>>> for (; *bucket < net->ct.htable_size; (*bucket)++) {
>>> hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket],
>>> hnnode) {
>>> + if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
>>> + continue;
>>
>> I think this will make the deletion of entries via `conntrack -F'
>> slower as we'll have to iterate over more entries (we won't delete
>> entries for the reply tuple).
>
> Slightly maybe, but I doubt it makes much of a difference.
>
>> I think I prefer Florian's patch, it's fairly small and it does not
>> change the current nf_ct_iterate behaviour or adding some
>> nf_nat_iterate cleanup.
>
> I don't think I've received it. Could you forward it to me please?
Florian forwarded the patch to me. While it fixes the problem, it
is a workaround and it certainly is inelegant to do the
list_del_rcu_init() and memset up to *four* times for a single conntrack.
The correct thing IMO is to invoke the callbacks exactly once per
conntrack, either through my nf_ct_iterate_cleanup() change or through
a new iteration function for callers that don't kill conntracks. As
soon as we start generating events for NAT section cleanup this will be
needed in any case.
Unless I'm missing something, conntrack flushing is also a really rare
operation, and for large tables, where this might make a small
difference, it will take quite a long time anyway.
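The "exactly once per conntrack" point can be illustrated with a toy
user-space model. This is a sketch under assumptions, not the kernel
implementation: struct toy_conn, struct toy_tuplehash, toy_iterate() and
visits_for_one_conn() are invented for illustration. The only thing
taken from the thread is that each conntrack sits in the hash once per
direction and that the proposed change skips non-original entries.

```c
#include <stddef.h>

/* Every connection is hashed twice, once per direction. */
enum toy_dir { TOY_DIR_ORIGINAL = 0, TOY_DIR_REPLY = 1, TOY_DIR_MAX = 2 };

struct toy_conn;

struct toy_tuplehash {
	struct toy_conn *ct;       /* owning connection */
	enum toy_dir dir;          /* direction this hash entry stands for */
};

struct toy_conn {
	struct toy_tuplehash th[TOY_DIR_MAX];
	int visits;                /* how often the callback saw this conn */
};

/* Callback that does per-connection work and never asks for a kill. */
static int toy_count_visit(struct toy_conn *ct, void *data)
{
	(void)data;
	ct->visits++;
	return 0;
}

static void toy_conn_init(struct toy_conn *ct)
{
	int d;

	ct->visits = 0;
	for (d = 0; d < TOY_DIR_MAX; d++) {
		ct->th[d].ct = ct;
		ct->th[d].dir = (enum toy_dir)d;
	}
}

/*
 * Walk a flat "table" of hash entries. With original_only set, reply
 * entries are skipped, in the spirit of the get_next_corpse() hunk.
 */
static void toy_iterate(struct toy_tuplehash **table, size_t n,
			int original_only,
			int (*iter)(struct toy_conn *, void *), void *data)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (original_only && table[i]->dir != TOY_DIR_ORIGINAL)
			continue;
		iter(table[i]->ct, data);
	}
}

/* Number of callback invocations for a single connection. */
int visits_for_one_conn(int original_only)
{
	struct toy_conn ct;
	struct toy_tuplehash *table[TOY_DIR_MAX];

	toy_conn_init(&ct);
	table[0] = &ct.th[TOY_DIR_ORIGINAL];
	table[1] = &ct.th[TOY_DIR_REPLY];
	toy_iterate(table, TOY_DIR_MAX, original_only, toy_count_visit, NULL);
	return ct.visits;
}
```

Without the direction check the callback sees the same connection once
per hash entry; with it, once per connection.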
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-20 17:06 ` Patrick McHardy
@ 2012-09-21 1:00 ` Pablo Neira Ayuso
2012-09-21 9:47 ` Jesper Dangaard Brouer
0 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2012-09-21 1:00 UTC (permalink / raw)
To: Patrick McHardy
Cc: Jesper Dangaard Brouer, Florian Westphal, netfilter-devel, netdev,
yongjun_wei
On Thu, Sep 20, 2012 at 07:06:52PM +0200, Patrick McHardy wrote:
> On Thu, 20 Sep 2012, Patrick McHardy wrote:
>
> >>>diff --git a/net/netfilter/nf_conntrack_core.c
> >>>b/net/netfilter/nf_conntrack_core.c
> >>>index dcb2791..0f241be 100644
> >>>--- a/net/netfilter/nf_conntrack_core.c
> >>>+++ b/net/netfilter/nf_conntrack_core.c
> >>>@@ -1224,6 +1224,8 @@ get_next_corpse(struct net *net, int
> >>>(*iter)(struct nf_conn *i, void *data),
> >>> spin_lock_bh(&nf_conntrack_lock);
> >>> for (; *bucket < net->ct.htable_size; (*bucket)++) {
> >>> hlist_nulls_for_each_entry(h, n, &net->ct.hash[*bucket],
> >>>hnnode) {
> >>>+ if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
> >>>+ continue;
> >>
> >>I think this will make the deletion of entries via `conntrack -F'
> >>slower as we'll have to iterate over more entries (we won't delete
> >>entries for the reply tuple).
> >
> >Slightly maybe, but I doubt it makes much of a difference.
> >
> >>I think I prefer Florian's patch, it's fairly small and it does not
> >>change the current nf_ct_iterate behaviour or adding some
> >>nf_nat_iterate cleanup.
> >
> >I don't think I've received it. Could you forward it to me please?
>
> Florian forwarded the patch to me. While it fixes the problem, it
> is a workaround and it certainly is inelegant to do the
> list_del_rcu_init() and memset up to *four* times for a single conntrack.
>
> The correct thing IMO is to invoke the callbacks exactly once per
> conntrack, either through my nf_ct_iterate_cleanup() change or through
> a new iteration function for callers that don't kill conntracks. As
> soon as we start generating events for NAT section cleanup this will be
> needed in any case.
>
> Unless I'm missing something, conntrack flushing is also a really
> rare operation anyways and for large tables where this might make a
> small difference will take a quite large time anyway.
Makes sense. And we can revisit this to improve it later.
I'll take this patch. I'll send a batch with updates for the nf-nat
thing asap.
Thanks a lot Patrick.
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-21 1:00 ` Pablo Neira Ayuso
@ 2012-09-21 9:47 ` Jesper Dangaard Brouer
2012-09-21 10:03 ` Pablo Neira Ayuso
0 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2012-09-21 9:47 UTC (permalink / raw)
To: Pablo Neira Ayuso, David Miller
Cc: Patrick McHardy, Florian Westphal, netfilter-devel, netdev
On Fri, 2012-09-21 at 03:00 +0200, Pablo Neira Ayuso wrote:
> On Thu, Sep 20, 2012 at 07:06:52PM +0200, Patrick McHardy wrote:
> > On Thu, 20 Sep 2012, Patrick McHardy wrote:
[cut]
(discussion of fixes by Patrick and Florian)
(...settling on Patrick's second patch)
> Makes sense. And we can revisit this to improve it later.
>
> I'll take this patch. I'll send a batch with updates for the nf-nat
> thing asap.
What git tree is that?
I'm trying to work off Pablo's nf-next tree (for my IPVS changes):
git://1984.lsi.us.es/nf-next
But I don't see the patch in that tree ...yet.
Notice, the bug is also present in DaveM's net-next tree.
(I know I stated earlier that it didn't affect net-next, but I just
forgot to select the new netfilter .config options for nat)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
Oops output, for the sake of completeness:
------------------------------------------
[ 866.878092] general protection fault: 0000 [#1] SMP
[ 866.878986] Modules linked in: netconsole ip_vs_lblc ip_vs_lc ip_vs_rr ip_vs libcrc32c ipt_MASQUERADE nf_nat_ipv4(-) nf_nat iptable_mangle xt_mark ip6table_mangle xt_LOG ip6table_filter ip6_tables virtio_net virtio_balloon [last unloaded: iptable_nat]
[ 866.879045] CPU 0
[ 866.879045] Pid: 4053, comm: modprobe Not tainted 3.6.0-rc5-net-next-sysctl-tcp+ #13 Red Hat KVM
[ 866.879045] RIP: 0010:[<ffffffffa002c2dd>] [<ffffffffa002c2dd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
[ 866.879045] RSP: 0018:ffff880078a41e18 EFLAGS: 00010246
[ 866.879045] RAX: 0000000000000000 RBX: ffff880079142500 RCX: dead000000200200
[ 866.879045] RDX: dead000000200200 RSI: ffff880078a41e88 RDI: ffffffffa002f268
[ 866.879045] RBP: ffff880078a41e28 R08: ffff880078a40000 R09: 0000000000000002
[ 866.879045] R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff81c6db40
[ 866.879045] R13: ffff880037d8f008 R14: ffff880037d8f000 R15: ffff880078a41e88
[ 866.879045] FS: 00007fc30c11a700(0000) GS:ffff88007cc00000(0000) knlGS:0000000000000000
[ 866.879045] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 866.879045] CR2: 00007fc30c127000 CR3: 00000000780ef000 CR4: 00000000000006f0
[ 866.879045] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 866.879045] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 866.879045] Process modprobe (pid: 4053, threadinfo ffff880078a40000, task ffff88007a379650)
[ 866.879045] Stack:
[ 866.879045] ffff880078a41e68 ffffffffa002c270 ffff880078a41e78 ffffffff81541413
[ 866.879045] ffffffff00000000 0000147578a40303 ffff880078a41e68 ffffffff81c6db40
[ 866.879045] ffff880078a41e88 ffffffffa00358a0 0000000000000000 000000000040f5b0
[ 866.879045] Call Trace:
[ 866.879045] [<ffffffffa002c270>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
[ 866.879045] [<ffffffff81541413>] nf_ct_iterate_cleanup+0xc3/0x170
[ 866.879045] [<ffffffffa002c50a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
[ 866.879045] [<ffffffff81290303>] ? proc_keys_next+0x23/0x60
[ 866.879045] [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
[ 866.879045] [<ffffffff810988b5>] sys_delete_module+0x235/0x2b0
[ 866.879045] [<ffffffff810b38c3>] ? __audit_syscall_entry+0x1b3/0x1f0
[ 866.879045] [<ffffffff810b36e6>] ? __audit_syscall_exit+0x3e6/0x410
[ 866.879045] [<ffffffff81640122>] system_call_fastpath+0x16/0x1b
[ 866.879045] Code: 75 6c 0f b6 46 01 84 c0 74 05 3a 42 3e 75 5f 80 7e 02 00 74 41 48 c7 c7 68 f2 02 a0 e8 1d c5 60 e1 48 8b 03 48 8b 53 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 be 00 02 20 00 00 00 ad de 48 c7
[ 866.879045] RIP [<ffffffffa002c2dd>] nf_nat_proto_clean+0x6d/0xc0 [nf_nat]
[ 866.879045] RSP <ffff880078a41e18>
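A note on reading the dump above: RDX holds dead000000200200, which
matches the kernel's LIST_POISON2 value on x86-64, so the fault is
consistent with unlinking a hash-list node that had already been
unlinked. The following user-space sketch models that pattern. The
names toy_hnode, toy_hhead, TOY_POISON2 and the helper functions are
made up for illustration; the unlink helper only imitates the style of
an RCU hlist delete, which poisons pprev and leaves next alone.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_hnode {
	struct toy_hnode *next;
	struct toy_hnode **pprev;  /* address of the pointer that points at us */
};

struct toy_hhead {
	struct toy_hnode *first;
};

/* Same bit pattern as the RDX value in the register dump. */
#define TOY_POISON2 ((struct toy_hnode **)(uintptr_t)0xdead000000200200ULL)

static void toy_hnode_add_head(struct toy_hnode *n, struct toy_hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n and poison its back pointer. */
static void toy_hnode_del(struct toy_hnode *n)
{
	struct toy_hnode *next = n->next;
	struct toy_hnode **pprev = n->pprev;

	*pprev = next;             /* a second call would store through the poison */
	if (next)
		next->pprev = pprev;
	n->pprev = TOY_POISON2;
}

/*
 * Returns 1 if, after one unlink, a second unlink of the same node would
 * have to write through the poison pointer. The second unlink itself is
 * deliberately not performed here.
 */
int second_del_would_fault(void)
{
	struct toy_hhead head = { NULL };
	struct toy_hnode n;

	toy_hnode_add_head(&n, &head);
	toy_hnode_del(&n);
	return n.pprev == TOY_POISON2 && n.next == NULL;
}
```

After the first unlink the node's next pointer is NULL and its pprev is
the poison value, which lines up with RAX being zero and RDX being
dead000000200200 at the faulting store.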
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-21 9:47 ` Jesper Dangaard Brouer
@ 2012-09-21 10:03 ` Pablo Neira Ayuso
2012-09-21 10:17 ` Pablo Neira Ayuso
0 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2012-09-21 10:03 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: David Miller, Patrick McHardy, Florian Westphal, netfilter-devel,
netdev
On Fri, Sep 21, 2012 at 11:47:22AM +0200, Jesper Dangaard Brouer wrote:
> On Fri, 2012-09-21 at 03:00 +0200, Pablo Neira Ayuso wrote:
> > On Thu, Sep 20, 2012 at 07:06:52PM +0200, Patrick McHardy wrote:
> > > On Thu, 20 Sep 2012, Patrick McHardy wrote:
> [cut]
> (discussion of fixes by Patrick and Florian)
> (...settling on Patrick's second patch)
>
> > Makes sense. And we can revisit this to improve it later.
> >
> > I'll take this patch. I'll send a batch with updates for the nf-nat
> > thing asap.
>
> What git tree is that?
>
> I'm trying to work off Pablo's nf-next tree (for my IPVS changes):
> git://1984.lsi.us.es/nf-next
>
> But I don't see the patch in that tree ...yet.
I didn't push it yet, will do asap.
> Notice, the bug is also present in DaveM's net-next tree.
> (I know I stated earlier that it didn't affect net-next, but I just
> forgot to select the new netfilter .config options for nat)
* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
2012-09-21 10:03 ` Pablo Neira Ayuso
@ 2012-09-21 10:17 ` Pablo Neira Ayuso
0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2012-09-21 10:17 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: David Miller, Patrick McHardy, Florian Westphal, netfilter-devel,
netdev
On Fri, Sep 21, 2012 at 12:03:08PM +0200, Pablo Neira Ayuso wrote:
> On Fri, Sep 21, 2012 at 11:47:22AM +0200, Jesper Dangaard Brouer wrote:
> > On Fri, 2012-09-21 at 03:00 +0200, Pablo Neira Ayuso wrote:
> > > On Thu, Sep 20, 2012 at 07:06:52PM +0200, Patrick McHardy wrote:
> > > > On Thu, 20 Sep 2012, Patrick McHardy wrote:
> > [cut]
> > (discussion of fixes by Patrick and Florian)
> > (...settling on Patrick's second patch)
> >
> > > Makes sense. And we can revisit this to improve it later.
> > >
> > > I'll take this patch. I'll send a batch with updates for the nf-nat
> > > thing asap.
> >
> > What git tree is that?
> >
> > I'm trying to work off Pablo's nf-next tree (for my IPVS changes):
> > git://1984.lsi.us.es/nf-next
> >
> > But I don't see the patch in that tree ...yet.
>
> I didn't push it yet, will do asap.
Done.
You may need `git pull --rebase' to get your patches on top of the git
head.
Thread overview: 16+ messages
2012-09-11 9:51 Oops with latest (netfilter) nf-next tree, when unloading iptable_nat Jesper Dangaard Brouer
2012-09-12 21:36 ` Florian Westphal
2012-09-14 12:07 ` Pablo Neira Ayuso
2012-09-14 13:15 ` Patrick McHardy
2012-09-19 12:46 ` Jesper Dangaard Brouer
2012-09-20 6:57 ` Patrick McHardy
2012-09-20 7:29 ` Jesper Dangaard Brouer
2012-09-20 7:31 ` Patrick McHardy
2012-09-20 10:08 ` Pablo Neira Ayuso
2012-09-20 10:31 ` Patrick McHardy
2012-09-20 17:06 ` Patrick McHardy
2012-09-21 1:00 ` Pablo Neira Ayuso
2012-09-21 9:47 ` Jesper Dangaard Brouer
2012-09-21 10:03 ` Pablo Neira Ayuso
2012-09-21 10:17 ` Pablo Neira Ayuso
2012-09-19 19:14 ` Jesper Dangaard Brouer