kernel oops with NAT in 2.6.16.13 kernel

* kernel oops with NAT in 2.6.16.13 kernel
@ 2006-10-06  7:57 Nishit Shah
  2006-10-10  5:04 ` Patrick McHardy
  0 siblings, 1 reply; 8+ messages in thread
From: Nishit Shah @ 2006-10-06  7:57 UTC (permalink / raw)
  To: netfilter-devel

Hi,
        During load testing on kernel 2.6.16.13 I got the following kernel
oops. It occurs when I enable SNAT or MASQUERADE for all outgoing traffic;
with ACCEPT, load testing works fine for me.

Regards,
Nishit Shah.

Oct 06 11:33:28 1160114608 kernel: BUG: soft lockup detected on CPU#0!
Oct 06 11:33:28 1160114608 kernel:
Oct 06 11:33:28 1160114608 kernel: Pid: 1754, comm:           in.telnetd
Oct 06 11:33:28 1160114608 kernel: EIP: 0060:[pg0+944264370/1068717056] CPU: 0
Oct 06 11:33:28 1160114608 kernel: EIP is at __ip_conntrack_find+0x22/0xb0 [ip_conntrack]
Oct 06 11:33:28 1160114608 kernel:  EFLAGS: 00000292    Not tainted (2.6.16.13-1 #4)
Oct 06 11:33:28 1160114608 kernel: EAX: 00016ac5 EBX: f5e01218 ECX: f5c00000 EDX: f5cb5628
Oct 06 11:33:28 1160114608 kernel: ESI: 000046b9 EDI: f7bf1754 EBP: f5fb6400 DS: 007b ES: 007b
Oct 06 11:33:28 1160114608 kernel: CR0: 8005003b CR2: b7ee7dc4 CR3: 37a05000 CR4: 000006d0
Oct 06 11:33:28 1160114608 kernel:  [pg0+944265669/1068717056] ip_conntrack_tuple_taken+0x25/0x40 [ip_conntrack]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944312635/1068717056] ip_nat_used_tuple+0x2b/0x40 [ip_nat]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944318447/1068717056] tcp_unique_tuple+0xaf/0x120 [ip_nat]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944313566/1068717056] get_unique_tuple+0xae/0x100 [ip_nat]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944313779/1068717056] ip_nat_setup_info+0x83/0x210 [ip_nat]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944353635/1068717056] masquerade_target+0x103/0x110 [ipt_MASQUERADE]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944353376/1068717056] masquerade_target+0x0/0x110 [ipt_MASQUERADE]
Oct 06 11:33:28 1160114608 kernel:  [pg0+943993598/1068717056] ipt_do_table+0x2ce/0x360 [ip_tables]
Oct 06 11:33:28 1160114608 kernel:  [pg0+944333715/1068717056] ip_nat_rule_find+0x43/0xc0 [iptable_nat]
Oct 06 11:33:28 1160114608 kernel:  [_write_unlock_bh+11/32] _write_unlock_bh+0xb/0x20
Oct 06 11:33:28 1160114608 kernel:  [pg0+944334424/1068717056] ip_nat_fn+0xe8/0x200 [iptable_nat]
Oct 06 11:33:28 1160114608 kernel:  [_read_unlock_bh+11/32] _read_unlock_bh+0xb/0x20
Oct 06 11:33:28 1160114608 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:28 1160114608 kernel:  [pg0+944335016/1068717056] ip_nat_out+0x78/0x110 [iptable_nat]
Oct 06 11:33:28 1160114608 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:28 1160114608 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:28 1160114608 kernel:  [nf_iterate+120/144] nf_iterate+0x78/0x90
Oct 06 11:33:28 1160114608 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:28 1160114608 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:28 1160114608 kernel:  [nf_hook_slow+110/272] nf_hook_slow+0x6e/0x110
Oct 06 11:33:28 1160114608 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:28 1160114608 kernel:  [ip_output+700/720] ip_output+0x2bc/0x2d0
Oct 06 11:33:29 1160114609 kernel:  [ip_finish_output+0/528] ip_finish_output+0x0/0x210
Oct 06 11:33:29 1160114609 kernel:  [ip_forward+420/720] ip_forward+0x1a4/0x2d0
Oct 06 11:33:29 1160114609 kernel:  [ip_forward_finish+0/64] ip_forward_finish+0x0/0x40
Oct 06 11:33:29 1160114609 kernel:  [ip_rcv+624/1264] ip_rcv+0x270/0x4f0
Oct 06 11:33:29 1160114609 kernel:  [ip_rcv_finish+0/624] ip_rcv_finish+0x0/0x270
Oct 06 11:33:29 1160114609 kernel:  [netif_receive_skb+492/624] netif_receive_skb+0x1ec/0x270
Oct 06 11:33:29 1160114609 kernel:  [pg0+944029678/1068717056] e1000_clean_rx_irq+0x1ce/0x5f0 [e1000]
Oct 06 11:33:29 1160114609 kernel:  [ktime_get_ts+97/112] ktime_get_ts+0x61/0x70
Oct 06 11:33:29 1160114609 kernel:  [pg0+944028203/1068717056] e1000_clean+0xbb/0x1c0 [e1000]
Oct 06 11:33:29 1160114609 kernel:  [net_rx_action+116/256] net_rx_action+0x74/0x100
Oct 06 11:33:29 1160114609 kernel:  [__do_softirq+123/144] __do_softirq+0x7b/0x90
Oct 06 11:33:29 1160114609 kernel:  [do_softirq+38/48] do_softirq+0x26/0x30
Oct 06 11:33:29 1160114609 kernel:  [local_bh_enable+70/128] local_bh_enable+0x46/0x80
Oct 06 11:33:29 1160114609 kernel:  [_write_unlock_bh+11/32] _write_unlock_bh+0xb/0x20
Oct 06 11:33:29 1160114609 kernel:  [pg0+944276160/1068717056] tcp_packet+0x180/0x5d0 [ip_conntrack]
Oct 06 11:33:29 1160114609 kernel:  [local_bh_enable+70/128] local_bh_enable+0x46/0x80
Oct 06 11:33:29 1160114609 kernel:  [_read_unlock_bh+11/32] _read_unlock_bh+0xb/0x20
Oct 06 11:33:29 1160114609 kernel:  [pg0+944267281/1068717056] ip_conntrack_in+0xe1/0x2d0 [ip_conntrack]
Oct 06 11:33:29 1160114609 kernel:  [dst_output+0/32] dst_output+0x0/0x20
Oct 06 11:33:29 1160114609 kernel:  [nf_iterate+120/144] nf_iterate+0x78/0x90
Oct 06 11:33:29 1160114609 kernel:  [dst_output+0/32] dst_output+0x0/0x20
Oct 06 11:33:29 1160114609 kernel:  [dst_output+0/32] dst_output+0x0/0x20
Oct 06 11:33:29 1160114609 kernel:  [nf_hook_slow+110/272] nf_hook_slow+0x6e/0x110
Oct 06 11:33:29 1160114609 kernel:  [dst_output+0/32] dst_output+0x0/0x20
Oct 06 11:33:29 1160114609 kernel:  [ip_queue_xmit+1022/1360] ip_queue_xmit+0x3fe/0x550
Oct 06 11:33:29 1160114609 kernel:  [dst_output+0/32] dst_output+0x0/0x20
Oct 06 11:33:29 1160114609 kernel:  [ip_rcv+624/1264] ip_rcv+0x270/0x4f0
Oct 06 11:33:29 1160114609 kernel:  [ip_rcv_finish+0/624] ip_rcv_finish+0x0/0x270
Oct 06 11:33:29 1160114609 kernel:  [netif_receive_skb+492/624] netif_receive_skb+0x1ec/0x270
Oct 06 11:33:29 1160114609 kernel:  [pg0+944029868/1068717056] e1000_clean_rx_irq+0x28c/0x5f0 [e1000]
Oct 06 11:33:29 1160114609 kernel:  [ktime_get_ts+97/112] ktime_get_ts+0x61/0x70
Oct 06 11:33:29 1160114609 kernel:  [tcp_cwnd_restart+41/240] tcp_cwnd_restart+0x29/0xf0
Oct 06 11:33:29 1160114609 kernel:  [tcp_event_data_sent+117/128] tcp_event_data_sent+0x75/0x80
Oct 06 11:33:29 1160114609 kernel:  [tcp_transmit_skb+799/1184] tcp_transmit_skb+0x31f/0x4a0
Oct 06 11:33:29 1160114609 kernel:  [pg0+944028203/1068717056] e1000_clean+0xbb/0x1c0 [e1000]
Oct 06 11:33:29 1160114609 kernel:  [tcp_write_xmit+374/656] tcp_write_xmit+0x176/0x290
Oct 06 11:33:29 1160114609 kernel:  [__tcp_push_pending_frames+53/176] __tcp_push_pending_frames+0x35/0xb0
Oct 06 11:33:29 1160114609 kernel:  [tcp_sendmsg+873/3120] tcp_sendmsg+0x369/0xc30
Oct 06 11:33:29 1160114609 kernel:  [buffered_rmqueue+246/544] buffered_rmqueue+0xf6/0x220
Oct 06 11:33:29 1160114609 kernel:  [inet_sendmsg+74/96] inet_sendmsg+0x4a/0x60
Oct 06 11:33:29 1160114609 kernel:  [do_sock_write+161/192] do_sock_write+0xa1/0xc0
Oct 06 11:33:29 1160114609 kernel:  [sock_aio_write+148/160] sock_aio_write+0x94/0xa0
Oct 06 11:33:29 1160114609 kernel:  [do_sync_write+209/288] do_sync_write+0xd1/0x120
Oct 06 11:33:29 1160114609 kernel:  [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
Oct 06 11:33:29 1160114609 kernel:  [vfs_write+364/384] vfs_write+0x16c/0x180
Oct 06 11:33:29 1160114609 kernel:  [sys_write+81/128] sys_write+0x51/0x80
Oct 06 11:33:29 1160114609 kernel:  [syscall_call+7/11] syscall_call+0x7/0xb

* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-06  7:57 kernel oops with NAT in 2.6.16.13 kernel Nishit Shah
@ 2006-10-10  5:04 ` Patrick McHardy
  2006-10-10  5:50   ` Nishit Shah
  0 siblings, 1 reply; 8+ messages in thread
From: Patrick McHardy @ 2006-10-10  5:04 UTC (permalink / raw)
  To: Nishit Shah; +Cc: netfilter-devel

Nishit Shah wrote:
> Hi,
>         During load testing on kernel 2.6.16.13 I got the following kernel
> oops. It occurs when I enable SNAT or MASQUERADE for all outgoing traffic;
> with ACCEPT, load testing works fine for me.

How exactly did you perform this testing? Did you load and unload
modules, or just load them? How many entries did the connection
tracking table have at the time?

* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-10  5:04 ` Patrick McHardy
@ 2006-10-10  5:50   ` Nishit Shah
  2006-10-10  6:01     ` Patrick McHardy
  0 siblings, 1 reply; 8+ messages in thread
From: Nishit Shah @ 2006-10-10  5:50 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel

I performed the load testing with a Spirent Avalanche (the test specification
is connections/second).
On one side of the machine under test there are 10/11 virtual clients created
by Spirent, and on the other side 4/5 virtual servers.
Testing is over HTTP 1.0 with keep-alive and an object size of 1024 bytes.
I haven't loaded or unloaded any modules.
I loaded the conntrack module with the following parameters:
modprobe conntrack hashsize=262144
echo 1048576 > /proc/sys/net/ipv4/ip_conntrack_max
(The test machines have >= 1 GB of RAM and are plain firewall-only
machines.)
At the time of the oops the connection rate is around 4000 connections/second
and there are around 300,000 connection entries.

Some of my observations:
I don't think the problem is the connection rate; the problem is the number
of connection entries.
I have tried different machines, but every time I got the kernel oops at
300,000 entries in the conntrack table (tried with Pentium 4, Xeon, dual Xeon,
etc.).
Also, the oops occurs on both vanilla 2.6.16.13 and vanilla 2.6.16.13smp.

Maybe this information will help you.

Regards,
Nishit Shah.

* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-10  5:50   ` Nishit Shah
@ 2006-10-10  6:01     ` Patrick McHardy
  2006-10-10  6:28       ` Nishit Shah
  0 siblings, 1 reply; 8+ messages in thread
From: Patrick McHardy @ 2006-10-10  6:01 UTC (permalink / raw)
  To: Nishit Shah; +Cc: netfilter-devel

Nishit Shah wrote:
> I performed the load testing with a Spirent Avalanche (the test specification
> is connections/second).
> On one side of the machine under test there are 10/11 virtual clients created
> by Spirent, and on the other side 4/5 virtual servers.
> Testing is over HTTP 1.0 with keep-alive and an object size of 1024 bytes.
> I haven't loaded or unloaded any modules.
> I loaded the conntrack module with the following parameters:
> modprobe conntrack hashsize=262144
> echo 1048576 > /proc/sys/net/ipv4/ip_conntrack_max
> (The test machines have >= 1 GB of RAM and are plain firewall-only
> machines.)
> At the time of the oops the connection rate is around 4000 connections/second
> and there are around 300,000 connection entries.
> 
> Some of my observations:
> I don't think the problem is the connection rate; the problem is the number
> of connection entries.
> I have tried different machines, but every time I got the kernel oops at
> 300,000 entries in the conntrack table (tried with Pentium 4, Xeon, dual Xeon,
> etc.).

With many conntrack entries NAT may take considerable time to find a
free tuple (up to ~64000 quite expensive hash lookups). For optimal
performance, the hash should be twice as large as the maximum number
of entries. I assume the machine doesn't freeze completely but
just reports a softlockup? Are you running anything touching conntrack
/proc-files during this test?
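
A rough back-of-the-envelope sketch of the "~64000 lookups" figure follows. It
assumes (this is not stated in the thread) that without an explicit port range
NAT picks TCP source ports from 1024-65535, and that each destination server
gives one independent port space behind a single NAT address. The function
names are hypothetical. Under those assumptions the port space for 4/5 servers
lands near the 300,000-entry mark reported above, but the thread does not
confirm that this is the cause.

```shell
#!/bin/sh
# Assumption: NAT chooses source ports from this range when none is configured.
PORT_MIN=1024
PORT_MAX=65535

# Candidate source ports per (NAT address, destination address, destination port).
# This is also the worst-case number of lookups for one new connection.
ports_per_destination() {
    echo $(( PORT_MAX - PORT_MIN + 1 ))
}

# Upper bound on unique TCP tuples behind one NAT address for N destinations.
tuple_capacity() {
    servers=$1
    echo $(( servers * $(ports_per_destination) ))
}

echo "ports per destination: $(ports_per_destination)"
for n in 4 5; do
    echo "$n servers: $(tuple_capacity "$n") unique tuples"
done
```

As the port space for a destination fills up, each new connection has to probe
more candidates before it finds a free one, which would match a sudden jump in
CPU usage rather than a gradual one.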

* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-10  6:01     ` Patrick McHardy
@ 2006-10-10  6:28       ` Nishit Shah
  2006-10-11  5:52         ` Patrick McHardy
  0 siblings, 1 reply; 8+ messages in thread
From: Nishit Shah @ 2006-10-10  6:28 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel

Yes, the machine was not completely frozen; only the soft lockup is reported.
I haven't touched any /proc files during the test.
I am using
cat /proc/slabinfo | grep conntrack for the number of conntrack entries and
vmstat 2 for CPU usage
(one through telnet and one through the serial console).
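
A minimal sketch of the same measurement as a reusable function, assuming the
2.6-era /proc/slabinfo layout (name, active objects, total objects, ...) and
that the slab cache is named ip_conntrack. The function name conntrack_active
is hypothetical. Matching the name exactly avoids also picking up related
caches that a plain grep for "conntrack" would print.

```shell
#!/bin/sh
# Print the active-object count of the ip_conntrack slab cache.
# Assumption: columns are "<name> <active_objs> <num_objs> ...".
conntrack_active() {
    awk '$1 == "ip_conntrack" { print $2 }' "${1:-/proc/slabinfo}"
}

# Only query the live file when it is readable (this usually requires root).
if [ -r /proc/slabinfo ]; then
    conntrack_active
fi
```

The optional argument lets the function read a saved copy of slabinfo, which
is handy when comparing snapshots taken during a test run.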

One more observation from vmstat: at 4000 connections/sec the system uses
30 to 35% CPU, then it suddenly jumps to 100% and the machine hangs. If I
stop the test at that point, the machine starts responding again.
I have also tested the following parameters, but I got the oops with all of
them:

hashsize=1048576 ip_conntrack_max=1048576 (ip_conntrack_max = hashsize * 1)
hashsize=262144 ip_conntrack_max=1048576 (ip_conntrack_max = hashsize * 4)
hashsize=262144 ip_conntrack_max=2097152 (ip_conntrack_max = hashsize * 8)

I only missed your hashsize * 2 test case.
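
Read literally, the earlier advice in this thread ("the hash should be twice
as large as the maximum number of entries") means hashsize = 2 *
ip_conntrack_max, not ip_conntrack_max = hashsize * 2. A small sketch of that
arithmetic for the three configurations above; the function name
suggested_hashsize is hypothetical.

```shell
#!/bin/sh
# hashsize implied by "hash twice as large as the maximum number of entries".
suggested_hashsize() {
    echo $(( $1 * 2 ))
}

# Each entry is "<tested hashsize> <tested ip_conntrack_max>".
for cfg in "1048576 1048576" "262144 1048576" "262144 2097152"; do
    set -- $cfg
    echo "tested hashsize=$1 ip_conntrack_max=$2 -> suggested hashsize=$(suggested_hashsize "$2")"
done
```

None of the three tested configurations reaches the suggested size, so they do
not rule the hash sizing in or out.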

A few more results that may help you:

Test                                                                   Connections/Sec
Vanilla kernel                                                         24000
Only conntrack loaded                                                  22000
Conntrack + NAT module loaded (but no MASQ or SNAT rule in iptables)   20000-21000
Conntrack + NAT module loaded (MASQ or SNAT rule in iptables)          4000 (oops)

Regards,
Nishit Shah.


* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-10  6:28       ` Nishit Shah
@ 2006-10-11  5:52         ` Patrick McHardy
  2006-10-11  6:44           ` Nishit Shah
  0 siblings, 1 reply; 8+ messages in thread
From: Patrick McHardy @ 2006-10-11  5:52 UTC (permalink / raw)
  To: Nishit Shah; +Cc: netfilter-devel

Nishit Shah wrote:
> A few more results that may help you:
> 
> Test                                                                   Connections/Sec
> Vanilla kernel                                                         24000
> Only conntrack loaded                                                  22000
> Conntrack + NAT module loaded (but no MASQ or SNAT rule in iptables)   20000-21000
> Conntrack + NAT module loaded (MASQ or SNAT rule in iptables)          4000 (oops)

I'm pretty sure it's finding an unused tuple that's taking all the time.
Does the c/s rate degrade linearly with NAT?

* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-11  5:52         ` Patrick McHardy
@ 2006-10-11  6:44           ` Nishit Shah
  2006-10-11  7:04             ` Patrick McHardy
  0 siblings, 1 reply; 8+ messages in thread
From: Nishit Shah @ 2006-10-11  6:44 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel

Are you talking about the MASQ/SNAT case?

Well, in both NAT cases everything is pretty much the same up to 4000 c/s, but
at 300,000 connections the system freezes in the MASQ/SNAT case, so I have no
idea about the c/s rate degradation.

Regards,
Nishit Shah.

* Re: kernel oops with NAT in 2.6.16.13 kernel
  2006-10-11  6:44           ` Nishit Shah
@ 2006-10-11  7:04             ` Patrick McHardy
  0 siblings, 0 replies; 8+ messages in thread
From: Patrick McHardy @ 2006-10-11  7:04 UTC (permalink / raw)
  To: Nishit Shah; +Cc: netfilter-devel

Nishit Shah wrote:
> Are you talking about the MASQ/SNAT case?

Yes.

> Well, in both NAT cases everything is pretty much the same up to 4000 c/s, but
> at 300,000 connections the system freezes in the MASQ/SNAT case, so I have no
> idea about the c/s rate degradation.

Try SNATing to multiple IPs using multiple rules and some match to
distribute the traffic (like matching on even/uneven IPs or something
like that). That should show if finding a unique tuple really is the
problem.
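
One way the suggested experiment could look, as a sketch only: two SNAT rules
that split clients by the lowest bit of the source address, using an
address/mask pair as the match. The interface name and both --to-source
addresses are hypothetical placeholders, and the use of a non-contiguous mask
(0.0.0.1) with -s is an assumption about the iptables version in use. The
function only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print (do not execute) two SNAT rules: even source addresses go to one
# NAT address, odd source addresses to another.
gen_snat_rules() {
    out_if=$1
    ip_even=$2
    ip_odd=$3
    # Lowest address bit 0: even client addresses.
    echo "iptables -t nat -A POSTROUTING -o $out_if -s 0.0.0.0/0.0.0.1 -j SNAT --to-source $ip_even"
    # Lowest address bit 1: odd client addresses.
    echo "iptables -t nat -A POSTROUTING -o $out_if -s 0.0.0.1/0.0.0.1 -j SNAT --to-source $ip_odd"
}

# Hypothetical interface and documentation-range addresses.
gen_snat_rules eth1 192.0.2.10 192.0.2.11
```

If the lockup then appears at a noticeably higher number of conntrack entries,
that would support the theory that the search for a unique tuple is the
bottleneck.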
