* crash in death_by_timeout() @ 2008-11-17 22:18 BORBELY Zoltan 2008-11-18 11:07 ` Patrick McHardy 0 siblings, 1 reply; 13+ messages in thread From: BORBELY Zoltan @ 2008-11-17 22:18 UTC (permalink / raw) To: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 362 bytes --] Hi, There's a race in the nfct netlink code, the result is a crash in the death_by_timeout() function. When a timer interrupt occures during a new entry addition, the kernel will crash due to a NULL deref. The attached patch has solved the problem for us. I haven't tested it on the latest kernels, but the problem still seems to be there. Bye, Zoltan Borbely [-- Attachment #2: nf_conntrack_netlink.patch --] [-- Type: text/plain, Size: 317 bytes --] --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 23:28:55.000000000 +0200 +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 @@ -1177,8 +1177,8 @@ ct->master = master_ct; } - add_timer(&ct->timeout); nf_conntrack_hash_insert(ct); + add_timer(&ct->timeout); rcu_read_unlock(); return 0; ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-17 22:18 crash in death_by_timeout() BORBELY Zoltan @ 2008-11-18 11:07 ` Patrick McHardy 2008-11-18 12:38 ` BORBELY Zoltan 0 siblings, 1 reply; 13+ messages in thread From: Patrick McHardy @ 2008-11-18 11:07 UTC (permalink / raw) To: BORBELY Zoltan; +Cc: Netfilter Development Mailinglist BORBELY Zoltan wrote: > Hi, > > There's a race in the nfct netlink code, the result is a crash in the > death_by_timeout() function. When a timer interrupt occures during a > new entry addition, the kernel will crash due to a NULL deref. The > attached patch has solved the problem for us. I haven't tested it on > the latest kernels, but the problem still seems to be there. > --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 23:28:55.000000000 +0200 > +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 > @@ -1177,8 +1177,8 @@ > ct->master = master_ct; > } > > - add_timer(&ct->timeout); > nf_conntrack_hash_insert(ct); > + add_timer(&ct->timeout); > rcu_read_unlock(); That code looks very fishy. We should be holding the conntrack lock, otherwise the addition is not only racy against the timer, but also against addition of identical conntracks. Let me look into what happened here. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-18 11:07 ` Patrick McHardy @ 2008-11-18 12:38 ` BORBELY Zoltan 2008-11-18 13:19 ` Patrick McHardy 0 siblings, 1 reply; 13+ messages in thread From: BORBELY Zoltan @ 2008-11-18 12:38 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 1229 bytes --] Hi, On Tue, Nov 18, 2008 at 12:07:20PM +0100, Patrick McHardy wrote: >> --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 23:28:55.000000000 +0200 >> +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 >> @@ -1177,8 +1177,8 @@ >> ct->master = master_ct; >> } >> - add_timer(&ct->timeout); >> nf_conntrack_hash_insert(ct); >> + add_timer(&ct->timeout); >> rcu_read_unlock(); > > That code looks very fishy. We should be holding the conntrack lock, > otherwise the addition is not only racy against the timer, but also > against addition of identical conntracks. Let me look into what > happened here. We have experienced a lot of kernel crashes, _every time_ in the death_by_timeout() function while we were trying to add a new conntrack entry from userspace via netlink (attached the disassembled version of the function, ===> points to the EIP upon the crash). There was a possibility, that we tried to add conntrack entries with zero timeout value, maybe it's necessary to trigger this crash. The previous patch has definitly solved the problem for us. I've got photos from various crashes, but it takes a little time to find them. Please let me know if you want to see them. Thanks, Zoltan Borbely [-- Attachment #2: 2.6.26.3-death_by_timeout.txt --] [-- Type: text/plain, Size: 4551 bytes --] 00000350 <death_by_timeout>: 350: 55 push %ebp 351: 89 e5 mov %esp,%ebp 353: 56 push %esi 354: 53 push %ebx 355: 89 c3 mov %eax,%ebx 357: 83 ec 0c sub $0xc,%esp 35a: 8b 90 cc 00 00 00 mov 0xcc(%eax),%edx 360: 85 d2 test %edx,%edx 362: 74 08 je 36c <death_by_timeout+0x1c> 364: 0f b6 42 08 movzbl 0x8(%edx),%eax 368: 84 c0 test %al,%al 36a: 75 74 jne 3e0 <death_by_timeout+0x90> 36c: b8 00 00 00 00 mov <nf_conntrack_lock>,%eax 371: e8 fc ff ff ff call <_spin_lock_bh> 376: 8d 4b 04 lea 0x4(%ebx),%ecx 379: ff 05 18 00 00 00 incl <per_cpu__nf_conntrack_stat> // hlist_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnode) 37f: 8b 43 04 mov 0x4(%ebx),%eax 382: 8b 51 04 mov 0x4(%ecx),%edx 385: 85 c0 test %eax,%eax ===> 387: 89 02 mov %eax,(%edx) 389: 74 03 je 38e <death_by_timeout+0x3e> 38b: 89 50 04 mov %edx,0x4(%eax) 38e: c7 41 04 00 02 20 00 movl $0x200200,0x4(%ecx) // LIST_POISON2 // hlist_del_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnode) 395: 8b 43 34 mov 0x34(%ebx),%eax 398: 8d 4b 34 lea 0x34(%ebx),%ecx 39b: 8b 51 04 mov 0x4(%ecx),%edx 39e: 85 c0 test %eax,%eax 3a0: 89 02 mov %eax,(%edx) 3a2: 74 03 je 3a7 <death_by_timeout+0x57> 3a4: 89 50 04 mov %edx,0x4(%eax) 3a7: c7 41 04 00 02 20 00 movl $0x200200,0x4(%ecx) // LIST_POISON2 3ae: 89 d8 mov %ebx,%eax 3b0: e8 fc ff ff ff call <nf_ct_remove_expectations> 3b5: b8 00 00 00 00 mov <nf_conntrack_lock>,%eax 3ba: e8 fc ff ff ff call <_spin_unlock_bh> 3bf: 85 db test %ebx,%ebx 3c1: 74 77 je 43a <death_by_timeout+0xea> 3c3: ff 0b decl (%ebx) 3c5: 0f 94 c0 sete %al 3c8: 84 c0 test %al,%al 3ca: 74 07 je 3d3 <death_by_timeout+0x83> 3cc: 89 d8 mov %ebx,%eax 3ce: e8 fc ff ff ff call <nf_conntrack_destroy> 3d3: 83 c4 0c add $0xc,%esp 3d6: 5b pop %ebx 3d7: 5e pop %esi 3d8: 5d pop %ebp 3d9: c3 ret 3da: 8d b6 00 00 00 00 lea 0x0(%esi),%esi 3e0: 0f b6 c0 movzbl %al,%eax 3e3: 89 d6 mov %edx,%esi 3e5: 01 c6 add %eax,%esi 3e7: 74 83 je 36c <death_by_timeout+0x1c> 3e9: b8 e9 03 00 00 mov $0x3e9,%eax 3ee: 31 c9 xor %ecx,%ecx 3f0: 89 44 24 08 mov %eax,0x8(%esp) 3f4: b8 01 00 00 00 mov $0x1,%eax 3f9: 31 d2 xor %edx,%edx 3fb: 89 44 24 04 mov %eax,0x4(%esp) 3ff: b8 00 00 00 00 mov <rcu_lock_map>,%eax 404: c7 04 24 02 00 00 00 movl $0x2,(%esp) 40b: e8 fc ff ff ff call <lock_acquire> 410: 8b 06 mov (%esi),%eax 412: 85 c0 test %eax,%eax 414: 74 0b je 421 <death_by_timeout+0xd1> 416: 8b 50 40 mov 0x40(%eax),%edx 419: 85 d2 test %edx,%edx 41b: 74 04 je 421 <death_by_timeout+0xd1> 41d: 89 d8 mov %ebx,%eax 41f: ff d2 call *%edx 421: b9 21 04 00 00 mov $0x421,%ecx 426: ba 01 00 00 00 mov $0x1,%edx 42b: b8 00 00 00 00 mov <rcu_lock_map>,%eax 430: e8 fc ff ff ff call <lock_release> 435: e9 32 ff ff ff jmp 36c <death_by_timeout+0x1c> 43a: ba b2 00 00 00 mov $0xb2,%edx 43f: b8 00 00 00 00 mov $0x0,%eax 444: e8 fc ff ff ff call <warn_on_slowpath> 449: eb 88 jmp 3d3 <death_by_timeout+0x83> 44b: 90 nop 44c: 8d 74 26 00 lea 0x0(%esi),%esi ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-18 12:38 ` BORBELY Zoltan @ 2008-11-18 13:19 ` Patrick McHardy 2008-11-18 13:27 ` Patrick McHardy 0 siblings, 1 reply; 13+ messages in thread From: Patrick McHardy @ 2008-11-18 13:19 UTC (permalink / raw) To: BORBELY Zoltan; +Cc: Netfilter Development Mailinglist BORBELY Zoltan wrote: > Hi, > > On Tue, Nov 18, 2008 at 12:07:20PM +0100, Patrick McHardy wrote: >>> --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 23:28:55.000000000 +0200 >>> +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 >>> @@ -1177,8 +1177,8 @@ >>> ct->master = master_ct; >>> } >>> - add_timer(&ct->timeout); >>> nf_conntrack_hash_insert(ct); >>> + add_timer(&ct->timeout); >>> rcu_read_unlock(); >> That code looks very fishy. We should be holding the conntrack lock, >> otherwise the addition is not only racy against the timer, but also >> against addition of identical conntracks. Let me look into what >> happened here. > > We have experienced a lot of kernel crashes, _every time_ in the > death_by_timeout() function while we were trying to add a new conntrack > entry from userspace via netlink (attached the disassembled version > of the function, ===> points to the EIP upon the crash). There was a > possibility, that we tried to add conntrack entries with zero timeout > value, maybe it's necessary to trigger this crash. The previous patch > has definitly solved the problem for us. > > I've got photos from various crashes, but it takes a little time to > find them. Please let me know if you want to see them. Thats not necessary, the problem is pretty obvious, I was mainly wondering at what point we broke it. I'll send you a patch soon. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-18 13:19 ` Patrick McHardy @ 2008-11-18 13:27 ` Patrick McHardy 2008-11-18 22:25 ` Pablo Neira Ayuso 2008-11-25 8:09 ` BORBELY Zoltan 0 siblings, 2 replies; 13+ messages in thread From: Patrick McHardy @ 2008-11-18 13:27 UTC (permalink / raw) To: BORBELY Zoltan; +Cc: Netfilter Development Mailinglist, Pablo Neira Ayuso [-- Attachment #1: Type: text/plain, Size: 1623 bytes --] Patrick McHardy wrote: > BORBELY Zoltan wrote: >> On Tue, Nov 18, 2008 at 12:07:20PM +0100, Patrick McHardy wrote: >>>> --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 >>>> 23:28:55.000000000 +0200 >>>> +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 >>>> @@ -1177,8 +1177,8 @@ >>>> ct->master = master_ct; >>>> } >>>> - add_timer(&ct->timeout); >>>> nf_conntrack_hash_insert(ct); >>>> + add_timer(&ct->timeout); >>>> rcu_read_unlock(); >>> That code looks very fishy. We should be holding the conntrack lock, >>> otherwise the addition is not only racy against the timer, but also >>> against addition of identical conntracks. Let me look into what >>> happened here. >> >> We have experienced a lot of kernel crashes, _every time_ in the >> death_by_timeout() function while we were trying to add a new conntrack >> entry from userspace via netlink (attached the disassembled version >> of the function, ===> points to the EIP upon the crash). There was a >> possibility, that we tried to add conntrack entries with zero timeout >> value, maybe it's necessary to trigger this crash. The previous patch >> has definitly solved the problem for us. >> >> I've got photos from various crashes, but it takes a little time to >> find them. Please let me know if you want to see them. > > Thats not necessary, the problem is pretty obvious, I was mainly > wondering at what point we broke it. > > I'll send you a patch soon. Could you try whether this patch fixes the problem? Pablo, do you recall the reason why the lock isn't held in ctnetlink_create_conntrack()? [-- Attachment #2: x --] [-- Type: text/plain, Size: 1540 bytes --] diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 622d7c6..233fdd2 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -305,9 +305,7 @@ void nf_conntrack_hash_insert(struct nf_conn *ct) hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); repl_hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple); - spin_lock_bh(&nf_conntrack_lock); __nf_conntrack_hash_insert(ct, hash, repl_hash); - spin_unlock_bh(&nf_conntrack_lock); } EXPORT_SYMBOL_GPL(nf_conntrack_hash_insert); diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index a040d46..3b009a3 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1090,7 +1090,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[], struct nf_conn_help *help; struct nf_conntrack_helper *helper; - ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_KERNEL); + ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_ATOMIC); if (ct == NULL || IS_ERR(ct)) return -ENOMEM; @@ -1212,13 +1212,14 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb, atomic_inc(&master_ct->ct_general.use); } - spin_unlock_bh(&nf_conntrack_lock); err = -ENOENT; if (nlh->nlmsg_flags & NLM_F_CREATE) err = ctnetlink_create_conntrack(cda, &otuple, &rtuple, master_ct); + spin_unlock_bh(&nf_conntrack_lock); + if (err < 0 && master_ct) nf_ct_put(master_ct); ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-18 13:27 ` Patrick McHardy @ 2008-11-18 22:25 ` Pablo Neira Ayuso 2008-11-19 12:04 ` Patrick McHardy 2008-11-25 8:09 ` BORBELY Zoltan 1 sibling, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2008-11-18 22:25 UTC (permalink / raw) To: Patrick McHardy; +Cc: BORBELY Zoltan, Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 2351 bytes --] Patrick McHardy wrote: > Patrick McHardy wrote: >> BORBELY Zoltan wrote: >>> On Tue, Nov 18, 2008 at 12:07:20PM +0100, Patrick McHardy wrote: >>>>> --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 >>>>> 23:28:55.000000000 +0200 >>>>> +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 >>>>> @@ -1177,8 +1177,8 @@ >>>>> ct->master = master_ct; >>>>> } >>>>> - add_timer(&ct->timeout); >>>>> nf_conntrack_hash_insert(ct); >>>>> + add_timer(&ct->timeout); >>>>> rcu_read_unlock(); >>>> That code looks very fishy. We should be holding the conntrack lock, >>>> otherwise the addition is not only racy against the timer, but also >>>> against addition of identical conntracks. Let me look into what >>>> happened here. >>> >>> We have experienced a lot of kernel crashes, _every time_ in the >>> death_by_timeout() function while we were trying to add a new conntrack >>> entry from userspace via netlink (attached the disassembled version >>> of the function, ===> points to the EIP upon the crash). There was a >>> possibility, that we tried to add conntrack entries with zero timeout >>> value, maybe it's necessary to trigger this crash. The previous patch >>> has definitly solved the problem for us. >>> >>> I've got photos from various crashes, but it takes a little time to >>> find them. Please let me know if you want to see them. >> >> Thats not necessary, the problem is pretty obvious, I was mainly >> wondering at what point we broke it. >> >> I'll send you a patch soon. > > Could you try whether this patch fixes the problem? > > Pablo, do you recall the reason why the lock isn't held in > ctnetlink_create_conntrack()? The creation is done under the nfnl_mutex so that requests to create identical entries cannot race. Of course, this is not enough to avoid the race with the timer if we set a very small timer for a conntrack :(. AFAICS, we don't need to enclose the whole conntrack creation path. Would you prefer the patch attached? This patch should apply fine to 2.6.28-rc. I can send this patch to -stable. BTW, this patch may conflict with my patch enqueued for 2.6.29 that adds userspace reporting, let me know if I can give you a hand in some way (like sending it on top of this one or yours again, whatever). -- "Los honestos son inadaptados sociales" -- Les Luthiers [-- Attachment #2: fix-creation.patch --] [-- Type: text/x-diff, Size: 2555 bytes --] netfilter: ctnetlink: fix racy timer in the creation path If we set a very small timer for new conntrack created via ctnetlink, the entry may expire before it is in the hashes, resulting in a crash. To fix this problem, the timer addition and the insertion into the hashes is done holding the nf_conntrack spin lock bh. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- include/net/netfilter/nf_conntrack.h | 2 +- net/netfilter/nf_conntrack_core.c | 7 ++----- net/netfilter/nf_conntrack_netlink.c | 4 +++- 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h index b76a868..a05e2e2 100644 --- a/include/net/netfilter/nf_conntrack.h +++ b/include/net/netfilter/nf_conntrack.h @@ -197,7 +197,7 @@ extern void nf_ct_free_hashtable(struct hlist_head *hash, int vmalloced, extern struct nf_conntrack_tuple_hash * __nf_conntrack_find(struct net *net, const struct nf_conntrack_tuple *tuple); -extern void nf_conntrack_hash_insert(struct nf_conn *ct); +extern void __nf_conntrack_insert(struct nf_conn *ct); extern void nf_conntrack_flush(struct net *net); diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 622d7c6..22246ec 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -298,18 +298,15 @@ static void __nf_conntrack_hash_insert(struct nf_conn *ct, &net->ct.hash[repl_hash]); } -void nf_conntrack_hash_insert(struct nf_conn *ct) +void __nf_conntrack_insert(struct nf_conn *ct) { unsigned int hash, repl_hash; hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); repl_hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple); - - spin_lock_bh(&nf_conntrack_lock); __nf_conntrack_hash_insert(ct, hash, repl_hash); - spin_unlock_bh(&nf_conntrack_lock); } -EXPORT_SYMBOL_GPL(nf_conntrack_hash_insert); +EXPORT_SYMBOL_GPL(__nf_conntrack_insert); /* Confirm a connection given skb; places it in hash table */ int diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index b132124..2eb8fd3 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1151,8 +1151,10 @@ ctnetlink_create_conntrack(struct nlattr *cda[], ct->master = master_ct; } + spin_lock_bh(&nf_conntrack_lock); add_timer(&ct->timeout); - nf_conntrack_hash_insert(ct); + __nf_conntrack_insert(ct); + spin_unlock_bh(&nf_conntrack_lock); rcu_read_unlock(); return 0; ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-18 22:25 ` Pablo Neira Ayuso @ 2008-11-19 12:04 ` Patrick McHardy 2008-11-19 12:37 ` Pablo Neira Ayuso 0 siblings, 1 reply; 13+ messages in thread From: Patrick McHardy @ 2008-11-19 12:04 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: BORBELY Zoltan, Netfilter Development Mailinglist Pablo Neira Ayuso wrote: > Patrick McHardy wrote: >> Pablo, do you recall the reason why the lock isn't held in >> ctnetlink_create_conntrack()? > > The creation is done under the nfnl_mutex so that requests to create > identical entries cannot race. Of course, this is not enough to avoid > the race with the timer if we set a very small timer for a conntrack :(. Its also not enough to avoid the race against packet processing, which takes nf_conntrack_lock. > AFAICS, we don't need to enclose the whole conntrack creation path. > Would you prefer the patch attached? This patch should apply fine to > 2.6.28-rc. That fixes the timer race, but the race between lookup and creation remains. We really need to either hold the lock the entire time or redo the lookup before inserting the entry into the hash tables. > I can send this patch to -stable. BTW, this patch may conflict with my > patch enqueued for 2.6.29 that adds userspace reporting, let me know if > I can give you a hand in some way (like sending it on top of this one or > yours again, whatever). I'll take care of any merge issues. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-19 12:04 ` Patrick McHardy @ 2008-11-19 12:37 ` Pablo Neira Ayuso 2008-11-19 12:47 ` Patrick McHardy 0 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2008-11-19 12:37 UTC (permalink / raw) To: Patrick McHardy; +Cc: BORBELY Zoltan, Netfilter Development Mailinglist Patrick McHardy wrote: > Pablo Neira Ayuso wrote: >> Patrick McHardy wrote: >>> Pablo, do you recall the reason why the lock isn't held in >>> ctnetlink_create_conntrack()? >> >> The creation is done under the nfnl_mutex so that requests to create >> identical entries cannot race. Of course, this is not enough to avoid >> the race with the timer if we set a very small timer for a conntrack :(. > > Its also not enough to avoid the race against packet processing, > which takes nf_conntrack_lock. > >> AFAICS, we don't need to enclose the whole conntrack creation path. >> Would you prefer the patch attached? This patch should apply fine to >> 2.6.28-rc. > > That fixes the timer race, but the race between lookup and creation > remains. We really need to either hold the lock the entire time or > redo the lookup before inserting the entry into the hash tables. I see, I forgot about that case. Your patch should be fine then. -- "Los honestos son inadaptados sociales" -- Les Luthiers ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-19 12:37 ` Pablo Neira Ayuso @ 2008-11-19 12:47 ` Patrick McHardy 0 siblings, 0 replies; 13+ messages in thread From: Patrick McHardy @ 2008-11-19 12:47 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: BORBELY Zoltan, Netfilter Development Mailinglist Pablo Neira Ayuso wrote: > Patrick McHardy wrote: >> Pablo Neira Ayuso wrote: >>> Patrick McHardy wrote: >>>> Pablo, do you recall the reason why the lock isn't held in >>>> ctnetlink_create_conntrack()? >>> The creation is done under the nfnl_mutex so that requests to create >>> identical entries cannot race. Of course, this is not enough to avoid >>> the race with the timer if we set a very small timer for a conntrack :(. >> Its also not enough to avoid the race against packet processing, >> which takes nf_conntrack_lock. >> >>> AFAICS, we don't need to enclose the whole conntrack creation path. >>> Would you prefer the patch attached? This patch should apply fine to >>> 2.6.28-rc. >> That fixes the timer race, but the race between lookup and creation >> remains. We really need to either hold the lock the entire time or >> redo the lookup before inserting the entry into the hash tables. > > I see, I forgot about that case. Your patch should be fine then. Thanks, I've queued it up, but will wait for confirmation from Zoltan. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-18 13:27 ` Patrick McHardy 2008-11-18 22:25 ` Pablo Neira Ayuso @ 2008-11-25 8:09 ` BORBELY Zoltan 2008-11-25 11:11 ` Patrick McHardy 1 sibling, 1 reply; 13+ messages in thread From: BORBELY Zoltan @ 2008-11-25 8:09 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist Hi, On Tue, Nov 18, 2008 at 02:27:44PM +0100, Patrick McHardy wrote: > Could you try whether this patch fixes the problem? > > Pablo, do you recall the reason why the lock isn't held in > ctnetlink_create_conntrack()? > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c > index 622d7c6..233fdd2 100644 > --- a/net/netfilter/nf_conntrack_core.c > +++ b/net/netfilter/nf_conntrack_core.c > @@ -305,9 +305,7 @@ void nf_conntrack_hash_insert(struct nf_conn *ct) > hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); > repl_hash = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple); > > - spin_lock_bh(&nf_conntrack_lock); > __nf_conntrack_hash_insert(ct, hash, repl_hash); > - spin_unlock_bh(&nf_conntrack_lock); > } > EXPORT_SYMBOL_GPL(nf_conntrack_hash_insert); > > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c > index a040d46..3b009a3 100644 > --- a/net/netfilter/nf_conntrack_netlink.c > +++ b/net/netfilter/nf_conntrack_netlink.c > @@ -1090,7 +1090,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[], > struct nf_conn_help *help; > struct nf_conntrack_helper *helper; > > - ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_KERNEL); > + ct = nf_conntrack_alloc(&init_net, otuple, rtuple, GFP_ATOMIC); > if (ct == NULL || IS_ERR(ct)) > return -ENOMEM; > > @@ -1212,13 +1212,14 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb, > atomic_inc(&master_ct->ct_general.use); > } > > - spin_unlock_bh(&nf_conntrack_lock); > err = -ENOENT; > if (nlh->nlmsg_flags & NLM_F_CREATE) > err = ctnetlink_create_conntrack(cda, > &otuple, > &rtuple, > master_ct); > + spin_unlock_bh(&nf_conntrack_lock); > + > if (err < 0 && master_ct) > nf_ct_put(master_ct); > We didn't see any kernel crashes during a half day heavy work (without the patch the kernel crashed in 3-4 hours every time), but we found a lot of BUG messages in the log (maybe for every new entry): Nov 24 14:45:43 test kernel: BUG: sleeping function called from invalid context at mm/slab.c:3043 Nov 24 14:45:43 test kernel: in_atomic():1, irqs_disabled():0 Nov 24 14:45:43 test kernel: 3 locks held by test/3586: Nov 24 14:45:43 test kernel: #0: (nfnl_mutex){--..}, at: [<d081500f>] nfnetlink_rcv+0xf/0x30 [nfnetlink] Nov 24 14:45:43 test kernel: #1: (nf_conntrack_lock){-+..}, at: [<d08c979f>] ctnetlink_new_conntrack+0x7f/0x770 [nf_conntrack_netlink] Nov 24 14:45:43 test kernel: #2: (rcu_read_lock){..--}, at: [<d08c98ee>] ctnetlink_new_conntrack+0x1ce/0x770 [nf_conntrack_netlink] Nov 24 14:45:43 test kernel: Pid: 3586, comm: test Not tainted 2.6.27.6bozotest #1 Nov 24 14:45:43 test kernel: [<c027a566>] __kmalloc_track_caller+0x126/0x160 Nov 24 14:45:43 test kernel: [<c052a7a5>] __nf_ct_ext_add+0xb5/0x290 Nov 24 14:45:43 test kernel: [<c026411d>] __krealloc+0x5d/0x80 Nov 24 14:45:44 test kernel: [<c052a7a5>] __nf_ct_ext_add+0xb5/0x290 Nov 24 14:45:44 test kernel: [<c052a71d>] __nf_ct_ext_add+0x2d/0x290 Nov 24 14:45:44 test kernel: [<d08c9af8>] ctnetlink_new_conntrack+0x3d8/0x770 [nf_conntrack_netlink] Nov 24 14:45:44 test kernel: [<d08c98ee>] ctnetlink_new_conntrack+0x1ce/0x770 [nf_conntrack_netlink] Nov 24 14:45:44 test kernel: [<c0248910>] validate_chain+0x380/0xed0 Nov 24 14:45:44 test kernel: [<d0815220>] nfnetlink_rcv_msg+0xf0/0x180 [nfnetlink] Nov 24 14:45:44 test kernel: [<d0815130>] nfnetlink_rcv_msg+0x0/0x180 [nfnetlink] Nov 24 14:45:44 test kernel: [<c0520ebc>] netlink_rcv_skb+0x7c/0xa0 Nov 24 14:45:44 test kernel: [<d081501b>] nfnetlink_rcv+0x1b/0x30 [nfnetlink] Nov 24 14:45:44 test kernel: [<c0520c50>] netlink_unicast+0x250/0x280 Nov 24 14:45:44 test kernel: [<c052145e>] netlink_sendmsg+0x1ee/0x2c0 Nov 24 14:45:44 test kernel: [<c04fad7f>] sock_sendmsg+0xbf/0xf0 Nov 24 14:45:44 test kernel: [<c02496e5>] __lock_acquire+0x285/0x9e0 Nov 24 14:45:44 test kernel: [<c0239790>] autoremove_wake_function+0x0/0x50 Nov 24 14:45:44 test kernel: [<c0248910>] validate_chain+0x380/0xed0 Nov 24 14:45:44 test kernel: [<c027ee33>] fget_light+0xd3/0xf0 Nov 24 14:45:44 test kernel: [<c031bea8>] copy_from_user+0x38/0x80 Nov 24 14:45:44 test kernel: [<c031bea8>] copy_from_user+0x38/0x80 Nov 24 14:45:44 test kernel: [<c0502e2a>] verify_iovec+0x2a/0x90 Nov 24 14:45:44 test kernel: [<c04faf14>] sys_sendmsg+0x164/0x280 Nov 24 14:45:44 test kernel: [<c027ee33>] fget_light+0xd3/0xf0 Nov 24 14:45:44 test kernel: [<c031c16a>] copy_to_user+0x3a/0x70 Nov 24 14:45:44 test kernel: [<c04fb98f>] move_addr_to_user+0x5f/0x70 Nov 24 14:45:44 test kernel: [<c04fbf0d>] sys_getsockname+0xcd/0xd0 Nov 24 14:45:44 test kernel: [<c022ad6c>] local_bh_enable_ip+0x7c/0xc0 Nov 24 14:45:44 test kernel: [<c0247e64>] trace_hardirqs_on_caller+0xc4/0x140 Nov 24 14:45:44 test kernel: [<c022ad6c>] local_bh_enable_ip+0x7c/0xc0 Nov 24 14:45:44 test kernel: [<c04fe578>] sock_setsockopt+0x128/0x590 Nov 24 14:45:44 test kernel: [<c027edb3>] fget_light+0x53/0xf0 Nov 24 14:45:44 test kernel: [<c04fa552>] sockfd_lookup_light+0x32/0x60 Nov 24 14:45:44 test kernel: [<c04fc39b>] sys_socketcall+0x25b/0x2b0 Nov 24 14:45:44 test kernel: [<c031ba44>] trace_hardirqs_on_thunk+0xc/0x10 Nov 24 14:45:44 test kernel: [<c031ba44>] trace_hardirqs_on_thunk+0xc/0x10 Nov 24 14:45:44 test kernel: [<c0203029>] sysenter_do_call+0x12/0x35 Nov 24 14:45:44 test kernel: ======================= Bye, Zoltan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-25 8:09 ` BORBELY Zoltan @ 2008-11-25 11:11 ` Patrick McHardy 2008-11-25 22:48 ` BORBELY Zoltan 0 siblings, 1 reply; 13+ messages in thread From: Patrick McHardy @ 2008-11-25 11:11 UTC (permalink / raw) To: BORBELY Zoltan; +Cc: Netfilter Development Mailinglist [-- Attachment #1: Type: text/plain, Size: 1044 bytes --] BORBELY Zoltan wrote: > Hi, > > On Tue, Nov 18, 2008 at 02:27:44PM +0100, Patrick McHardy wrote: >> Could you try whether this patch fixes the problem? >> > We didn't see any kernel crashes during a half day heavy work (without the > patch the kernel crashed in 3-4 hours every time), but we found a lot of > BUG messages in the log (maybe for every new entry): > > Nov 24 14:45:43 test kernel: BUG: sleeping function called from invalid context at mm/slab.c:3043 > Nov 24 14:45:43 test kernel: in_atomic():1, irqs_disabled():0 > Nov 24 14:45:43 test kernel: 3 locks held by test/3586: > Nov 24 14:45:43 test kernel: #0: (nfnl_mutex){--..}, at: [<d081500f>] nfnetlink_rcv+0xf/0x30 [nfnetlink] > Nov 24 14:45:43 test kernel: #1: (nf_conntrack_lock){-+..}, at: [<d08c979f>] ctnetlink_new_conntrack+0x7f/0x770 [nf_conntrack_netlink] > Nov 24 14:45:43 test kernel: #2: (rcu_read_lock){..--}, at: [<d08c98ee>] ctnetlink_new_conntrack+0x1ce/0x770 [nf_conntrack_netlink] Crap, I missed a spot. Does this patch help? [-- Attachment #2: x --] [-- Type: text/plain, Size: 425 bytes --] diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 3b009a3..5f4a651 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1138,7 +1138,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[], } } - nf_ct_acct_ext_add(ct, GFP_KERNEL); + nf_ct_acct_ext_add(ct, GFP_ATOMIC); #if defined(CONFIG_NF_CONNTRACK_MARK) if (cda[CTA_MARK]) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-25 11:11 ` Patrick McHardy @ 2008-11-25 22:48 ` BORBELY Zoltan 2008-11-26 11:16 ` Patrick McHardy 0 siblings, 1 reply; 13+ messages in thread From: BORBELY Zoltan @ 2008-11-25 22:48 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist Hi, On Tue, Nov 25, 2008 at 12:11:13PM +0100, Patrick McHardy wrote: > Crap, I missed a spot. Does this patch help? > > > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c > index 3b009a3..5f4a651 100644 > --- a/net/netfilter/nf_conntrack_netlink.c > +++ b/net/netfilter/nf_conntrack_netlink.c > @@ -1138,7 +1138,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[], > } > } > > - nf_ct_acct_ext_add(ct, GFP_KERNEL); > + nf_ct_acct_ext_add(ct, GFP_ATOMIC); > > #if defined(CONFIG_NF_CONNTRACK_MARK) > if (cda[CTA_MARK]) Thanks, the BUG messages have disappeared. Bye, Zoltan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: crash in death_by_timeout() 2008-11-25 22:48 ` BORBELY Zoltan @ 2008-11-26 11:16 ` Patrick McHardy 0 siblings, 0 replies; 13+ messages in thread From: Patrick McHardy @ 2008-11-26 11:16 UTC (permalink / raw) To: BORBELY Zoltan; +Cc: Netfilter Development Mailinglist BORBELY Zoltan wrote: > Hi, > > On Tue, Nov 25, 2008 at 12:11:13PM +0100, Patrick McHardy wrote: >> Crap, I missed a spot. Does this patch help? >> >> > >> diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c >> index 3b009a3..5f4a651 100644 >> --- a/net/netfilter/nf_conntrack_netlink.c >> +++ b/net/netfilter/nf_conntrack_netlink.c >> @@ -1138,7 +1138,7 @@ ctnetlink_create_conntrack(struct nlattr *cda[], >> } >> } >> >> - nf_ct_acct_ext_add(ct, GFP_KERNEL); >> + nf_ct_acct_ext_add(ct, GFP_ATOMIC); >> >> #if defined(CONFIG_NF_CONNTRACK_MARK) >> if (cda[CTA_MARK]) > > Thanks, the BUG messages have disappeared. Great, thanks for testing. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2008-11-26 11:16 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-11-17 22:18 crash in death_by_timeout() BORBELY Zoltan 2008-11-18 11:07 ` Patrick McHardy 2008-11-18 12:38 ` BORBELY Zoltan 2008-11-18 13:19 ` Patrick McHardy 2008-11-18 13:27 ` Patrick McHardy 2008-11-18 22:25 ` Pablo Neira Ayuso 2008-11-19 12:04 ` Patrick McHardy 2008-11-19 12:37 ` Pablo Neira Ayuso 2008-11-19 12:47 ` Patrick McHardy 2008-11-25 8:09 ` BORBELY Zoltan 2008-11-25 11:11 ` Patrick McHardy 2008-11-25 22:48 ` BORBELY Zoltan 2008-11-26 11:16 ` Patrick McHardy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).