* conntrackd causes kernel panic
@ 2008-06-10 11:43 Rainer Sabelka
2008-06-10 14:29 ` Patrick McHardy
0 siblings, 1 reply; 11+ messages in thread
From: Rainer Sabelka @ 2008-06-10 11:43 UTC (permalink / raw)
To: netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 5191 bytes --]
Hi,
I've posted the message below to the netfilter list yesterday.
Since Patrick asked to send crash reports also to netfilter-devel, I'm also
posting it here now.
Please let me know if you need additional information.
-Rainer
-------------------------------------------------------------------
Hi,
I'm using conntrackd and keepalived (for a pair of redundant firewalls in
active/backup configuration), and from time to time I experience kernel
panics or other random system crashes.
I'm new to conntrackd, so it's likely that I just made some mistakes in my
configuration.
I'm getting the crashes when keepalived switches the backup host to
active.
Manually I can trigger the kernel panic when I execute "conntrackd -c" on the
backup host (sometimes "conntrackd -c" executes successfully, but it crashes
at the latest when I repeat the command a few times).
This is my setup:
* Ubuntu Linux with the distribution's kernel 2.6.24-18-server
* libnfnetlink 0.0.38 (compiled from sources)
* libnetfilter-conntrack 0.0.94 (compiled from sources)
* conntrack-tools 0.9.7 (compiled from sources)
My conntrackd.conf is attached below.
Does anybody have an idea why I get these crashes and what I could do to avoid
them?
Best regards,
-Rainer
---- /etc/conntrackd.conf -----
Sync {
Mode FTFW {
ResendBufferSize 262144
CommitTimeout 180
ACKWindowSize 20
}
Multicast {
IPv4_address 225.0.0.50
IPv4_interface 10.0.1.204 # IP of dedicated link
Interface eth0
Group 3780
}
Checksum on
}
General {
HashSize 8192
HashLimit 65535
LockFile /var/lock/conntrack.lock
UNIX {
Path /tmp/sync.sock
Backlog 20
}
SocketBufferSize 262142
SocketBufferSizeMaxGrown 655355
}
IgnoreTrafficFor {
IPv4_address 127.0.0.1 # loopback
IPv4_address 10.0.1.203
IPv4_address 10.0.1.204
IPv4_address 10.0.0.1
IPv4_address 10.9.62.1
IPv4_address 10.9.62.203
IPv4_address 10.9.62.204
}
IgnoreProtocol {
ICMP
IGMP
VRRP
}
---------------------------------------------
Some additional information:
I've now turned on logging to syslog in conntrackd.conf to see if I can get
some more information on my problem.
1.) Now, I can see lots of the following messages in the syslog:
Jun 9 18:52:49 fw1b conntrack-tools[7385]: Received seq=1213034051 before
expected seq=1213034052
2.) When I do "conntrackd -c" I get:
Jun 9 18:52:50 fw1b conntrack-tools[10678]: committing external cache
Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Invalid argument
[...]
Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Cannot allocate memory
[...]
Jun 9 18:52:50 cfw1b conntrack-tools[10678]: Committed 2 new entries
Jun 9 18:52:50 cfw1b conntrack-tools[10678]: 89 entries can't be committed
3.) Since I turned on logging, "conntrackd -c" now seems to be more stable.
At first I thought my problem was fixed. But then I started a script
which executed this command repeatedly in a loop. It eventually triggered
a kernel oops:
# while sleep 1 ; do conntrackd -c ; done
fw1b kernel: [ 6714.379206] ------------[ cut here ]------------
fw1b kernel: [ 6714.381285] invalid opcode: 0000 [#1] SMP
fw1b kernel: [ 6714.388793] Process kjournald (pid: 2267, ti=c79ac000
task=c5121140 task.ti=c79ac000)
fw1b kernel: [ 6714.388824] Stack: c5c40a80 00000000 c5c40a80 00000000
c1422000 c5121140 c50e0000 00000002
fw1b kernel: [ 6714.389418] c79adf84 c79adf7c 00000000 c04980e0
c049b480 c049b480 c049b480 c79adf88
fw1b kernel: [ 6714.390152] 00000286 c013b547 c79882ec ffffffff
c79882ec 00000286 c013b5c5 00000286
fw1b kernel: [ 6714.391701] Call Trace:
fw1b kernel: [ 6714.392871] [<c013b547>] lock_timer_base+0x27/0x60
fw1b kernel: [ 6714.393652] [<c013b5c5>] try_to_del_timer_sync+0x45/0x50
fw1b kernel: [ 6714.394210] [<c8ace740>] kjournald+0xa0/0x200 [jbd]
fw1b kernel: [ 6714.394780] [<c0145fc0>] autoremove_wake_function+0x0/0x40
fw1b kernel: [ 6714.395354] [<c8ace6a0>] kjournald+0x0/0x200 [jbd]
fw1b kernel: [ 6714.395907] [<c0145d02>] kthread+0x42/0x70
fw1b kernel: [ 6714.396440] [<c0145cc0>] kthread+0x0/0x70
fw1b kernel: [ 6714.396994] [<c010900b>] kernel_thread_helper+0x7/0x10
fw1b kernel: [ 6714.397580] =======================
fw1b kernel: [ 6714.398150] Code: ff f3 90 8b 03 a9 00 00 20 00 0f 84 1f f5
ff ff eb ef 0f 0b eb fe f3 90 8b 03 a9 00 00 20 00 0f 84 af f3 ff ff eb ef 0f
0b eb fe <0f> 0b eb fe 0f 0b eb fe 56 53 89 d3 8d 34 90 eb 16 8d b4 26 00
fw1b kernel: [ 6714.399468] EIP: [<c8acb958>]
journal_commit_transaction+0xd88/0xd90 [jbd] SS:ESP 0068:c79adf2c
At first glance this oops seems to be unrelated because it happens within
kjournald. But it is triggered by the conntrackd -c command, so I suspect
(rather naively) that conntrackd calls some kernel function which mixes up
some kernel memory (stack?), causing a crash later on.
Does anybody have a hint what could be wrong with my setup?
Best regards,
-Rainer
[-- Attachment #2: kernel-panic-conntrackd.png --]
[-- Type: image/png, Size: 40365 bytes --]
* Re: conntrackd causes kernel panic
2008-06-10 11:43 conntrackd causes kernel panic Rainer Sabelka
@ 2008-06-10 14:29 ` Patrick McHardy
2008-06-10 15:06 ` [SPAM?] " Rainer Sabelka
0 siblings, 1 reply; 11+ messages in thread
From: Patrick McHardy @ 2008-06-10 14:29 UTC (permalink / raw)
To: Rainer Sabelka; +Cc: netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 2000 bytes --]
Rainer Sabelka wrote:
>
> fw1b kernel: [ 6714.379206] ------------[ cut here ]------------
> fw1b kernel: [ 6714.381285] invalid opcode: 0000 [#1] SMP
> fw1b kernel: [ 6714.388793] Process kjournald (pid: 2267, ti=c79ac000
> task=c5121140 task.ti=c79ac000)
> fw1b kernel: [ 6714.388824] Stack: c5c40a80 00000000 c5c40a80 00000000
> c1422000 c5121140 c50e0000 00000002
> fw1b kernel: [ 6714.389418] c79adf84 c79adf7c 00000000 c04980e0
> c049b480 c049b480 c049b480 c79adf88
> fw1b kernel: [ 6714.390152] 00000286 c013b547 c79882ec ffffffff
> c79882ec 00000286 c013b5c5 00000286
> fw1b kernel: [ 6714.391701] Call Trace:
> fw1b kernel: [ 6714.392871] [<c013b547>] lock_timer_base+0x27/0x60
> fw1b kernel: [ 6714.393652] [<c013b5c5>] try_to_del_timer_sync+0x45/0x50
> fw1b kernel: [ 6714.394210] [<c8ace740>] kjournald+0xa0/0x200 [jbd]
> fw1b kernel: [ 6714.394780] [<c0145fc0>] autoremove_wake_function+0x0/0x40
> fw1b kernel: [ 6714.395354] [<c8ace6a0>] kjournald+0x0/0x200 [jbd]
> fw1b kernel: [ 6714.395907] [<c0145d02>] kthread+0x42/0x70
> fw1b kernel: [ 6714.396440] [<c0145cc0>] kthread+0x0/0x70
> fw1b kernel: [ 6714.396994] [<c010900b>] kernel_thread_helper+0x7/0x10
> fw1b kernel: [ 6714.397580] =======================
> fw1b kernel: [ 6714.398150] Code: ff f3 90 8b 03 a9 00 00 20 00 0f 84 1f f5
> ff ff eb ef 0f 0b eb fe f3 90 8b 03 a9 00 00 20 00 0f 84 af f3 ff ff eb ef 0f
> 0b eb fe <0f> 0b eb fe 0f 0b eb fe 56 53 89 d3 8d 34 90 eb 16 8d b4 26 00
> fw1b kernel: [ 6714.399468] EIP: [<c8acb958>]
> journal_commit_transaction+0xd88/0xd90 [jbd] SS:ESP 0068:c79adf2c
>
> At first glance this oops seems to unrelated because it happens within
> kjournald. But is triggered by the conntrackd -c command, so I suspect
> (rather naively) that conntrackd calls some kernel function which mixes up
> some kenel memory (stack?) causing a crash later on.
>
> Does anybody have a hint what could be wrong with my setup?

Do these two patches help?

[-- Attachment #2: 02.diff --]
[-- Type: text/x-diff, Size: 2905 bytes --]

netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()

When creation of a new conntrack entry in ctnetlink fails after having
set up the NAT mappings, the conntrack has an extension area allocated
that is not getting properly destroyed when freeing the conntrack again.
This means the NAT extension is still in the bysource hash, causing a
crash when walking over the hash chain the next time:

BUG: unable to handle kernel paging request at 00120fbd
IP: [<c03d394b>] nf_nat_setup_info+0x221/0x58a
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP

Pid: 2795, comm: conntrackd Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<c03d394b>] EFLAGS: 00010206 CPU: 1
EIP is at nf_nat_setup_info+0x221/0x58a
EAX: 00120fbd EBX: 00120fbd ECX: 00000001 EDX: 00000000
ESI: 0000019e EDI: e853bbb4 EBP: e853bbc8 ESP: e853bb78
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process conntrackd (pid: 2795, ti=e853a000 task=f7de10f0 task.ti=e853a000)
Stack: 00000000 e853bc2c e85672ec 00000008 c0561084 63c1db4a 00000000 00000000
       00000000 0002e109 61d2b1c3 00000000 00000000 00000000 01114e22 61d2b1c3
       00000000 00000000 f7444674 e853bc04 00000008 c038e728 0000000a f7444674
Call Trace:
 [<c038e728>] nla_parse+0x5c/0xb0
 [<c0397c1b>] ctnetlink_change_status+0x190/0x1c6
 [<c0397eec>] ctnetlink_new_conntrack+0x189/0x61f
 [<c0119aee>] update_curr+0x3d/0x52
 [<c03902d1>] nfnetlink_rcv_msg+0xc1/0xd8
 [<c0390228>] nfnetlink_rcv_msg+0x18/0xd8
 [<c0390210>] nfnetlink_rcv_msg+0x0/0xd8
 [<c038d2ce>] netlink_rcv_skb+0x2d/0x71
 [<c0390205>] nfnetlink_rcv+0x19/0x24
 [<c038d0f5>] netlink_unicast+0x1b3/0x216
 ...

Move invocation of the extension destructors to nf_conntrack_free()
to fix this problem.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10875

Reported-and-Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 8a3d245effe6e699e587133a3f8ea700bd47842d
tree 38676f126c592455747598b6d56bccf9550d0214
parent 21fa91adce646ad0449e898a64edaa828ca131e7
author Patrick McHardy <kaber@trash.net> Tue, 10 Jun 2008 10:56:29 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 10 Jun 2008 10:56:29 +0200

 net/netfilter/nf_conntrack_core.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index c4b1799..662c1cc 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -196,8 +196,6 @@ destroy_conntrack(struct nf_conntrack *nfct)
 	if (l4proto && l4proto->destroy)
 		l4proto->destroy(ct);
 
-	nf_ct_ext_destroy(ct);
-
 	rcu_read_unlock();
 
 	spin_lock_bh(&nf_conntrack_lock);
@@ -520,6 +518,7 @@ static void nf_conntrack_free_rcu(struct rcu_head *head)
 
 void nf_conntrack_free(struct nf_conn *ct)
 {
+	nf_ct_ext_destroy(ct);
 	call_rcu(&ct->rcu, nf_conntrack_free_rcu);
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_free);

[-- Attachment #3: 03.diff --]
[-- Type: text/x-diff, Size: 3306 bytes --]

netfilter: nf_nat: fix RCU races

Fix three ct_extend/NAT extension related races:

- When cleaning up the extension area and removing it from the bysource hash,
  the nat->ct pointer must not be set to NULL since it may still be used in
  a RCU read side

- When replacing a NAT extension area in the bysource hash, the nat->ct
  pointer must be assigned before performing the replacement

- When reallocating extension storage in ct_extend, the old memory must
  not be freed immediately since it may still be used by a RCU read side

Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit f4efed322f3d3a30df1bb5fc33403e84aca66d8e
tree 72d463aa289ab27850f40c76b68a24e91af7a6b0
parent 8a3d245effe6e699e587133a3f8ea700bd47842d
author Patrick McHardy <kaber@trash.net> Tue, 10 Jun 2008 11:00:35 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 10 Jun 2008 11:00:35 +0200

 include/net/netfilter/nf_conntrack_extend.h |    1 +
 net/ipv4/netfilter/nf_nat_core.c            |    3 +--
 net/netfilter/nf_conntrack_extend.c         |    9 ++++++++-
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_extend.h b/include/net/netfilter/nf_conntrack_extend.h
index f736e84..f80c0ed 100644
--- a/include/net/netfilter/nf_conntrack_extend.h
+++ b/include/net/netfilter/nf_conntrack_extend.h
@@ -15,6 +15,7 @@ enum nf_ct_ext_id
 
 /* Extensions: optional stuff which isn't permanently in struct. */
 struct nf_ct_ext {
+	struct rcu_head rcu;
 	u8 offset[NF_CT_EXT_NUM];
 	u8 len;
 	char data[0];
diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index 0457859..d2a887f 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -556,7 +556,6 @@ static void nf_nat_cleanup_conntrack(struct nf_conn *ct)
 
 	spin_lock_bh(&nf_nat_lock);
 	hlist_del_rcu(&nat->bysource);
-	nat->ct = NULL;
 	spin_unlock_bh(&nf_nat_lock);
 }
 
@@ -570,8 +569,8 @@ static void nf_nat_move_storage(void *new, void *old)
 		return;
 
 	spin_lock_bh(&nf_nat_lock);
-	hlist_replace_rcu(&old_nat->bysource, &new_nat->bysource);
 	new_nat->ct = ct;
+	hlist_replace_rcu(&old_nat->bysource, &new_nat->bysource);
 	spin_unlock_bh(&nf_nat_lock);
 }
 
diff --git a/net/netfilter/nf_conntrack_extend.c b/net/netfilter/nf_conntrack_extend.c
index bcc19fa..8a3f8b3 100644
--- a/net/netfilter/nf_conntrack_extend.c
+++ b/net/netfilter/nf_conntrack_extend.c
@@ -59,12 +59,19 @@ nf_ct_ext_create(struct nf_ct_ext **ext, enum nf_ct_ext_id id, gfp_t gfp)
 	if (!*ext)
 		return NULL;
 
+	INIT_RCU_HEAD(&(*ext)->rcu);
 	(*ext)->offset[id] = off;
 	(*ext)->len = len;
 
 	return (void *)(*ext) + off;
 }
 
+static void __nf_ct_ext_free_rcu(struct rcu_head *head)
+{
+	struct nf_ct_ext *ext = container_of(head, struct nf_ct_ext, rcu);
+	kfree(ext);
+}
+
 void *__nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp)
 {
 	struct nf_ct_ext *new;
@@ -106,7 +113,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp)
 					   (void *)ct->ext + ct->ext->offset[i]);
 			rcu_read_unlock();
 		}
-		kfree(ct->ext);
+		call_rcu(&ct->ext->rcu, __nf_ct_ext_free_rcu);
 		ct->ext = new;
 	}
 
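Editorial note: the third race listed in 03.diff (old extension storage freed while a reader may still be using it) can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code and not part of the patch. All names in it are hypothetical stand-ins: `struct cb_head` for `struct rcu_head`, `defer()` for `call_rcu()`, `grace_period_end()` for the end of an RCU grace period, and `ext_grow()` for the reallocation in `__nf_ct_ext_add()`. It is single-threaded and only shows the shape of the fix: the old storage is queued for a later free through an embedded callback head and recovered with `container_of()`, instead of being freed on the spot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback head, standing in for the kernel's struct rcu_head. */
struct cb_head {
    struct cb_head *next;
    void (*func)(struct cb_head *head);
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy extension area: embedded callback head, length, payload. */
struct ext_area {
    struct cb_head cb;
    size_t len;
    char data[];
};

static struct cb_head *pending;  /* callbacks waiting for the "grace period" */
static int freed_count;          /* number of deferred frees that have run */

/* Queue a callback instead of running it now (models call_rcu()). */
static void defer(struct cb_head *head, void (*func)(struct cb_head *))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Run all queued callbacks (models the end of a grace period). */
static void grace_period_end(void)
{
    while (pending) {
        struct cb_head *head = pending;

        pending = head->next;   /* read next before the callback frees head */
        head->func(head);
    }
}

/* Same shape as __nf_ct_ext_free_rcu() in the patch. */
static void ext_free_cb(struct cb_head *head)
{
    struct ext_area *ext = container_of(head, struct ext_area, cb);

    free(ext);
    freed_count++;
}

static struct ext_area *ext_alloc(size_t len)
{
    struct ext_area *ext = calloc(1, sizeof(*ext) + len);

    if (ext)
        ext->len = len;
    return ext;
}

/* Copy into larger storage; the old storage is freed later, not now. */
static struct ext_area *ext_grow(struct ext_area *old, size_t new_len)
{
    struct ext_area *grown;

    if (new_len < old->len)
        return NULL;
    grown = ext_alloc(new_len);
    if (!grown)
        return NULL;
    memcpy(grown->data, old->data, old->len);
    defer(&old->cb, ext_free_cb);
    return grown;
}

/* Returns 0 if a "reader" can still use the old storage after the grow. */
int demo_deferred_free(void)
{
    int before = freed_count;
    struct ext_area *old = ext_alloc(4);
    struct ext_area *reader_view, *cur;
    int ok;

    if (!old)
        return -1;
    memcpy(old->data, "abc", 4);
    reader_view = old;               /* a reader picked up the old pointer */
    cur = ext_grow(old, 8);
    if (!cur) {
        free(old);
        return -1;
    }
    /* Old storage is still valid here; nothing has been freed yet. */
    ok = strcmp(reader_view->data, "abc") == 0 &&
         strcmp(cur->data, "abc") == 0 &&
         freed_count == before;
    grace_period_end();              /* readers are done; now it is freed */
    ok = ok && freed_count == before + 1;
    free(cur);
    return ok ? 0 : 1;
}
```

Calling `demo_deferred_free()` should return 0 in this model. With an immediate `free(old)` inside `ext_grow()`, the read through `reader_view` would be a use-after-free, which is the kind of bug the `kfree()` to `call_rcu()` change in the patch is aimed at.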
* Re: [SPAM?] Re: conntrackd causes kernel panic
2008-06-10 14:29 ` Patrick McHardy
@ 2008-06-10 15:06 ` Rainer Sabelka
2008-06-10 17:01 ` Krzysztof Oledzki
2008-06-11 6:17 ` Patrick McHardy
0 siblings, 2 replies; 11+ messages in thread
From: Rainer Sabelka @ 2008-06-10 15:06 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel
On Tuesday 10 June 2008 16:29, Patrick McHardy wrote:
> Do these two patches help?
Patrick, I tried to apply those patches to the Ubuntu kernel sources
(2.6.24-18) but they failed, so I guess I should try to use a vanilla kernel
instead.
Which version should I try?
-Rainer
* Re: [SPAM?] Re: conntrackd causes kernel panic
2008-06-10 15:06 ` [SPAM?] " Rainer Sabelka
@ 2008-06-10 17:01 ` Krzysztof Oledzki
2008-06-10 18:12 ` Krzysztof Oledzki
2008-06-11 6:17 ` Patrick McHardy
1 sibling, 1 reply; 11+ messages in thread
From: Krzysztof Oledzki @ 2008-06-10 17:01 UTC (permalink / raw)
To: Rainer Sabelka; +Cc: Patrick McHardy, netfilter-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 557 bytes --]
On Tue, 10 Jun 2008, Rainer Sabelka wrote:
> On Tuesday 10 June 2008 16:29, Patrick McHardy wrote:
>> Do these two patches help?
>
> Patrick, I tried to apply those patches to the Ubuntu kernel sources
> (2.6.24-18) but they failed, so I guess I should try to use a vanilla kernel
> instead.
> Which version should I try?
Those patches are for nf-next-2.6. I have a backport of these patches for
2.6.24.x and 2.6.25.x kernels, but I need to make sure there were no changes
in the latest version.
Best regards,
Krzysztof Olędzki
* Re: [SPAM?] Re: conntrackd causes kernel panic
2008-06-10 17:01 ` Krzysztof Oledzki
@ 2008-06-10 18:12 ` Krzysztof Oledzki
2008-06-11 6:41 ` Rainer Sabelka
0 siblings, 1 reply; 11+ messages in thread
From: Krzysztof Oledzki @ 2008-06-10 18:12 UTC (permalink / raw)
To: Rainer Sabelka; +Cc: Patrick McHardy, netfilter-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 936 bytes --]
On Tue, 10 Jun 2008, Krzysztof Oledzki wrote:
> On Tue, 10 Jun 2008, Rainer Sabelka wrote:
>
>> On Tuesday 10 June 2008 16:29, Patrick McHardy wrote:
>>> Do these two patches help?
>>
>> Patrick, I tried to apply those patches to the Ubuntu kernel sources
>> (2.6.24-18) but they failed, so I guess I should try to use a vanilla
>> kernel instead.
>> Which version should I try?
>
> Those patches are for nf-next-2.6. I have a backport of this patches for
> 2.6.24.x and 2.6.25.x kernels, but I need to make sure there were no changes
> in the latest version.
OK, here it is (for 2.6.24.7):
ftp://ftp.ans.pl/pub/patches/patch-ole-2.6.24-o4.gz
Not sure what changes are in ubuntu 2.6.24-18 and probably you may not
like to take all of my patches so:
ftp://ftp.ans.pl/pub/patches/broken-out/2.6.24-o4/
You need patches 0750, 0760, 0840 and 0850.
Best regards,
Krzysztof Olędzki
* Re: [SPAM?] Re: conntrackd causes kernel panic
2008-06-10 18:12 ` Krzysztof Oledzki
@ 2008-06-11 6:41 ` Rainer Sabelka
2008-06-11 6:48 ` Patrick McHardy
0 siblings, 1 reply; 11+ messages in thread
From: Rainer Sabelka @ 2008-06-11 6:41 UTC (permalink / raw)
To: Krzysztof Oledzki; +Cc: Patrick McHardy, netfilter-devel
On Tuesday 10 June 2008 20:12:48 Krzysztof Oledzki wrote:
> On Tue, 10 Jun 2008, Krzysztof Oledzki wrote:
> > On Tue, 10 Jun 2008, Rainer Sabelka wrote:
> >> On Tuesday 10 June 2008 16:29, Patrick McHardy wrote:
> >>> Do these two patches help?
> >>
> >> Patrick, I tried to apply those patches to the Ubuntu kernel sources
> >> (2.6.24-18) but they failed, so I guess I should try to use a vanilla
> >> kernel instead.
> >> Which version should I try?
> >
> > Those patches are for nf-next-2.6. I have a backport of this patches for
> > 2.6.24.x and 2.6.25.x kernels, but I need to make sure there were no
> > changes in the latest version.
>
> OK, here it is (for 2.6.24.7):
> ftp://ftp.ans.pl/pub/patches/patch-ole-2.6.24-o4.gz
>
> Not sure what changes are in ubuntu 2.6.24-18 and probably you may not
> like to take all of my patches so:
> ftp://ftp.ans.pl/pub/patches/broken-out/2.6.24-o4/
>
> You need patches 0750, 0760, 0840 and 0850.
Krzysztof, I've applied these 4 patches to ubuntu's 2.6.24-18 and things look
much better now.
"conntrackd -c" no longer causes a kernel panic or oops.
Thanks for your help!
-Rainer
* Re: [SPAM?] Re: conntrackd causes kernel panic
2008-06-11 6:41 ` Rainer Sabelka
@ 2008-06-11 6:48 ` Patrick McHardy
0 siblings, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-06-11 6:48 UTC (permalink / raw)
To: Rainer Sabelka; +Cc: Krzysztof Oledzki, netfilter-devel
Rainer Sabelka wrote:
> On Tuesday 10 June 2008 20:12:48 Krzysztof Oledzki wrote:
>> On Tue, 10 Jun 2008, Krzysztof Oledzki wrote:
>>> On Tue, 10 Jun 2008, Rainer Sabelka wrote:
>>>> On Tuesday 10 June 2008 16:29, Patrick McHardy wrote:
>>>>> Do these two patches help?
>>>> Patrick, I tried to apply those patches to the Ubuntu kernel sources
>>>> (2.6.24-18) but they failed, so I guess I should try to use a vanilla
>>>> kernel instead.
>>>> Which version should I try?
>>> Those patches are for nf-next-2.6. I have a backport of this patches for
>>> 2.6.24.x and 2.6.25.x kernels, but I need to make sure there were no
>>> changes in the latest version.
>> OK, here it is (for 2.6.24.7):
>> ftp://ftp.ans.pl/pub/patches/patch-ole-2.6.24-o4.gz
>>
>> Not sure what changes are in ubuntu 2.6.24-18 and probably you may not
>> like to take all of my patches so:
>> ftp://ftp.ans.pl/pub/patches/broken-out/2.6.24-o4/
>>
>> You need patches 0750, 0760, 0840 and 0850.
>
> Krzysztof, I've applied these 4 patches to ubuntu's 2.6.24-18 and things look
> much better now.
> "conntrackd -c" no longer causes a kernel panic or oops.
Thanks for testing.
* Re: [SPAM?] Re: conntrackd causes kernel panic
2008-06-10 15:06 ` [SPAM?] " Rainer Sabelka
2008-06-10 17:01 ` Krzysztof Oledzki
@ 2008-06-11 6:17 ` Patrick McHardy
1 sibling, 0 replies; 11+ messages in thread
From: Patrick McHardy @ 2008-06-11 6:17 UTC (permalink / raw)
To: Rainer Sabelka; +Cc: netfilter-devel
Rainer Sabelka wrote:
> On Tuesday 10 June 2008 16:29, Patrick McHardy wrote:
>> Do these two patches help?
>
> Patrick, I tried to apply those patches to the Ubuntu kernel sources
> (2.6.24-18) but they failed, so I guess I should try to use a vanilla kernel
> instead.
> Which version should I try?
They apply cleanly to the current 2.6.26-rc and probably also to
2.6.25. It should also be simple to fix up manually. With 2.6.24
you only need the first patch, but it doesn't apply because that
kernel is missing the RCU conversion. You can fix it up manually by
moving the nf_ct_ext_destroy() call from destroy_conntrack() to
nf_conntrack_free() in net/netfilter/nf_conntrack_core.c.
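Editorial note: the reason moving the destructor call helps can be shown with a toy model. The sketch below is plain userspace C written for illustration only; it is not the kernel code, and every name in it is hypothetical. `conn_destroy()` stands in for `destroy_conntrack()`, `conn_free()` for `nf_conntrack_free()`, `ext_destroy()` for `nf_ct_ext_destroy()`, and the `bysource` list for the NAT bysource hash. The assumption, taken from the commit message of the first patch, is that a failed entry creation reaches the free function without going through the normal destroy path. Objects are marked `freed` instead of being released, so the list can be inspected safely afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a conntrack's NAT extension in the bysource hash. */
struct ext {
    struct ext *next;
    bool freed;   /* set instead of calling free(), so the list stays inspectable */
};

static struct ext *bysource;   /* toy hash chain (a single list) */

static void ext_link(struct ext *e)
{
    e->next = bysource;
    bysource = e;
}

/* Models nf_ct_ext_destroy(): the destructor unlinks the extension. */
static void ext_destroy(struct ext *e)
{
    struct ext **pp;

    for (pp = &bysource; *pp; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            return;
        }
    }
}

/* Models nf_conntrack_free(); `fixed` selects the patched behaviour. */
static void conn_free(struct ext *e, bool fixed)
{
    if (fixed)
        ext_destroy(e);
    e->freed = true;
}

/* Models destroy_conntrack(), the normal teardown path. */
static void conn_destroy(struct ext *e, bool fixed)
{
    if (!fixed)
        ext_destroy(e);
    conn_free(e, fixed);
}

/* Count list entries that refer to already-freed objects. */
static int stale_entries(void)
{
    const struct ext *e;
    int n = 0;

    for (e = bysource; e; e = e->next)
        if (e->freed)
            n++;
    return n;
}

/*
 * Link one entry, tear it down through the normal path or through the
 * "creation failed" path (which calls the free function directly), and
 * report how many stale entries are left in the list.
 */
int stale_after_teardown(bool creation_failed, bool fixed)
{
    struct ext e = { NULL, false };
    int n;

    bysource = NULL;
    ext_link(&e);
    if (creation_failed)
        conn_free(&e, fixed);
    else
        conn_destroy(&e, fixed);
    n = stale_entries();
    bysource = NULL;   /* do not leave a pointer to the local object behind */
    return n;
}
```

In this model, `stale_after_teardown(true, false)` reports one stale entry, which corresponds to the freed NAT extension left in the hash chain, while every combination with `fixed` set to `true` reports none, because the destructor runs in the one function that both paths share.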
* conntrackd causes kernel panic
@ 2008-06-09 14:07 Rainer Sabelka
2008-06-09 17:24 ` Rainer Sabelka
0 siblings, 1 reply; 11+ messages in thread
From: Rainer Sabelka @ 2008-06-09 14:07 UTC (permalink / raw)
To: netfilter
[-- Attachment #1: Type: text/plain, Size: 1875 bytes --]
Hi,
I'm using conntrackd and keepalived (for a pair of redundant firewalls in
active/backup configuration) and from time to time I experience kernel
panics.
I'm new to conntrackd, so it's likely that I just made some mistakes in my
configuration.
I'm getting these kernel panics when keepalived switches the backup host to
active.
Manually I can trigger the kernel panic when I execute "conntrackd -c" on the
backup host (sometimes "conntrackd -c" executes successfully, but it crashes
at the latest when I repeat the command a few times).
This is my setup:
* Ubuntu Linux with kernel 2.6.24-18-server
* libnfnetlink 0.0.38 (compiled from sources)
* libnetfilter-conntrack 0.0.94 (compiled from sources)
* conntrack-tools 0.9.7 (compiled from sources)
My conntrackd.conf is attached below.
Does anybody have an idea why I get these crashes and what I could do to avoid
them?
Best regards,
-Rainer
---- /etc/conntrackd.conf -----
Sync {
Mode FTFW {
ResendBufferSize 262144
CommitTimeout 180
ACKWindowSize 20
}
Multicast {
IPv4_address 225.0.0.50
IPv4_interface 10.0.1.204 # IP of dedicated link
Interface eth0
Group 3780
}
Checksum on
}
General {
HashSize 8192
HashLimit 65535
LockFile /var/lock/conntrack.lock
UNIX {
Path /tmp/sync.sock
Backlog 20
}
SocketBufferSize 262142
SocketBufferSizeMaxGrown 655355
}
IgnoreTrafficFor {
IPv4_address 127.0.0.1 # loopback
IPv4_address 10.0.1.203
IPv4_address 10.0.1.204
IPv4_address 10.0.0.1
IPv4_address 10.9.62.1
IPv4_address 10.9.62.203
IPv4_address 10.9.62.204
}
IgnoreProtocol {
ICMP
IGMP
VRRP
}
[-- Attachment #2: kernel-panic-conntrackd.png --]
[-- Type: image/png, Size: 40365 bytes --]
* Re: conntrackd causes kernel panic
2008-06-09 14:07 Rainer Sabelka
@ 2008-06-09 17:24 ` Rainer Sabelka
2008-06-09 17:33 ` Marco Barbero
0 siblings, 1 reply; 11+ messages in thread
From: Rainer Sabelka @ 2008-06-09 17:24 UTC (permalink / raw)
To: netfilter
On Monday 09 June 2008 16:07, Rainer Sabelka wrote:
> Hi,
>
> I'm using conntrackd and keepalived (for a pair of redundant firewalls in
> active/backup configuration) and from time to time I experience a kernel
> panics.
> I'm new to conntrackd, so its likely that I made just some mistakes in my
> configuration.
>
> I'm getting these kernel panics when keepalived switches the backup host to
> active.
> Manually I can trigger the kernel panic when I execute "conntrackd -c" on
> the backup host (sometimes "conntrackd -c" executes sucessfully, but it
> crashes at the latest when I repeat the command a few times).
>
> This is my setup:
> * Ubuntu Linux with kernel 2.6.24-18-server
> * libnfnetlink 0.0.38 (compiled from sources)
> * libnetfilter-conntrack 0.0.94 (compiled from sources)
> * conntrack-tools 0.9.7 (compiled from sources)
Some additional information:
I've now turned on logging in conntrackd.conf to see if I can get some more
information on my problem.
1.) I can see lots of the following messages in the logfile:
Jun 9 18:52:49 fw1b conntrack-tools[7385]: Received seq=1213034051 before
expected seq=1213034052
2.) When I do "conntrackd -c" I get:
Jun 9 18:52:50 fw1b conntrack-tools[10678]: committing external cache
Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Invalid argument
[...]
Jun 9 18:52:50 fw1b conntrack-tools[10678]: commit: Cannot allocate memory
[...]
Jun 9 18:52:50 cfw1b conntrack-tools[10678]: Committed 2 new entries
Jun 9 18:52:50 cfw1b conntrack-tools[10678]: 89 entries can't be committed
3.) Since I turned on logging, "conntrackd -c" now seems to be more stable.
At first I thought my problem was fixed. But then I started a script
which executed this command repeatedly in a loop, which eventually triggered
a kernel oops:
# while sleep 1 ; do conntrackd -c ; done
fw1b kernel: [ 6714.379206] ------------[ cut here ]------------
fw1b kernel: [ 6714.381285] invalid opcode: 0000 [#1] SMP
fw1b kernel: [ 6714.388793] Process kjournald (pid: 2267, ti=c79ac000
task=c5121140 task.ti=c79ac000)
fw1b kernel: [ 6714.388824] Stack: c5c40a80 00000000 c5c40a80 00000000
c1422000 c5121140 c50e0000 00000002
fw1b kernel: [ 6714.389418] c79adf84 c79adf7c 00000000 c04980e0
c049b480 c049b480 c049b480 c79adf88
fw1b kernel: [ 6714.390152] 00000286 c013b547 c79882ec ffffffff
c79882ec 00000286 c013b5c5 00000286
fw1b kernel: [ 6714.391701] Call Trace:
fw1b kernel: [ 6714.392871] [<c013b547>] lock_timer_base+0x27/0x60
fw1b kernel: [ 6714.393652] [<c013b5c5>] try_to_del_timer_sync+0x45/0x50
fw1b kernel: [ 6714.394210] [<c8ace740>] kjournald+0xa0/0x200 [jbd]
fw1b kernel: [ 6714.394780] [<c0145fc0>] autoremove_wake_function+0x0/0x40
fw1b kernel: [ 6714.395354] [<c8ace6a0>] kjournald+0x0/0x200 [jbd]
fw1b kernel: [ 6714.395907] [<c0145d02>] kthread+0x42/0x70
fw1b kernel: [ 6714.396440] [<c0145cc0>] kthread+0x0/0x70
fw1b kernel: [ 6714.396994] [<c010900b>] kernel_thread_helper+0x7/0x10
fw1b kernel: [ 6714.397580] =======================
fw1b kernel: [ 6714.398150] Code: ff f3 90 8b 03 a9 00 00 20 00 0f 84 1f f5
ff ff eb ef 0f 0b eb fe f3 90 8b 03 a9 00 00 20 00 0f 84 af f3 ff ff eb ef 0f
0b eb fe <0f> 0b eb fe 0f 0b eb fe 56 53 89 d3 8d 34 90 eb 16 8d b4 26 00
fw1b kernel: [ 6714.399468] EIP: [<c8acb958>]
journal_commit_transaction+0xd88/0xd90 [jbd] SS:ESP 0068:c79adf2c
At first glance this oops seems to be unrelated because it happens within
kjournald. But it is triggered by the conntrackd -c command, so I suspect
(rather naively) that conntrackd calls some kernel function which mixes up
some kernel memory, causing a crash later on.
Does anybody have a hint what could be wrong with my setup?
Best regards,
-Rainer
* Re: conntrackd causes kernel panic
2008-06-09 17:24 ` Rainer Sabelka
@ 2008-06-09 17:33 ` Marco Barbero
0 siblings, 0 replies; 11+ messages in thread
From: Marco Barbero @ 2008-06-09 17:33 UTC (permalink / raw)
To: netfilter
2008/6/9 Rainer Sabelka <sabelka@iue.tuwien.ac.at>:
> On Monday 09 June 2008 16:07, Rainer Sabelka wrote:
>> I'm getting these kernel panics when keepalived switches the backup host to
>> active.
>> Manually I can trigger the kernel panic when I execute "conntrackd -c" on
>> the backup host (sometimes "conntrackd -c" executes sucessfully, but it
>> crashes at the latest when I repeat the command a few times).
I had the same issue (see my post today).
I solved it by using the latest kernel (2.6.25.5).
I'm still getting 'entries can't be committed' like you.
Hope Pablo can help.
Regards