From: Oleksandr Samoylyk <oleksandr@samoylyk.sumy.ua>
To: linux-kernel@vger.kernel.org
Subject: dst cache overflow
Date: Mon, 13 Oct 2008 16:54:29 +0300
Message-ID: <48F35315.3020705@samoylyk.sumy.ua>
Dear community,
I have a problem with a server running GNU/Linux unexpectedly losing
its network connection.
It's an 8-core / 8 GB RAM PPTP "aggregator" with about 2500 sessions and
200 Mb/s (about 30 kpps) of Internet traffic.
There are two GigE Intel NICs (Intel(R) PRO/1000 Network Driver -
version 7.3.20-k2-NAPI).
I've attached their IRQs to different CPUs with smp_affinity.
TSO is off.
Kernel version: linux-2.6.24.3 (server image from ubuntu hardy).
I'm booting with:
noapic acpi=off panic=5 rhash_entries=1048575
I got the following in the logs:
Oct 7 22:26:50 linux kernel: [ 0.000000] CPU 2:
Oct 7 22:26:50 linux kernel: [ 0.000000] Modules linked in: oprofile
af_packet xt_tcpmss act_police cls_u32 sch_sfq sch_ingress sch_htb
xt_multiport xt_TCPMSS xt_state
xt_limit xt_tcpudp iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4
nf_conntrack iptable_filter ip_tables x_tables pptp pppox ppp_generic
slhc parport_pc lp parport loop i
TCO_wdt iTCO_vendor_support pcspkr i5000_edac shpchp edac_core
pci_hotplug evdev ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_piix
pata_acpi aic94xx libsas scsi_transport_sas a
ta_generic e1000 libata scsi_mod fbcon tileblit font bitblit softcursor fuse
Oct 7 22:26:50 linux kernel: [ 0.000000] Pid: 29, comm: events/2 Not
tainted 2.6.24-19-server #1
Oct 7 22:26:50 linux kernel: [ 0.000000] RIP:
0010:[libsas:_spin_lock_irqsave+0x15/0x30]
[libsas:_spin_lock_irqsave+0x15/0x30] _spin_lock_irqsave+0x15/0x30
Oct 7 22:26:50 linux kernel: [ 0.000000] RSP: 0018:ffff81022568de18
EFLAGS: 00000282
Oct 7 22:26:50 linux kernel: [ 0.000000] RAX: 0000000000000282 RBX:
ffff810164d62400 RCX: 0000000000000000
Oct 7 22:26:50 linux kernel: [ 0.000000] RDX: 00000000000e4e67 RSI:
00000000124839fe RDI: ffff810164d626b4
Oct 7 22:26:50 linux kernel: [ 0.000000] RBP: ffffffff802345b3 R08:
0000000000000000 R09: 0000000000000000
Oct 7 22:26:50 linux kernel: [ 0.000000] R10: 0000000000000000 R11:
ffff8101f04d5d00 R12: ffff81022568ddc0
Oct 7 22:26:50 linux kernel: [ 0.000000] R13: 0000000000000003 R14:
0000000000000286 R15: 0000000000000001
Oct 7 22:26:50 linux kernel: [ 0.000000] FS: 0000000000000000(0000)
GS:ffff810228001b80(0000) knlGS:0000000000000000
Oct 7 22:26:50 linux kernel: [ 0.000000] CS: 0010 DS: 0018 ES: 0018
CR0: 000000008005003b
Oct 7 22:26:50 linux kernel: [ 0.000000] CR2: 00007f54a9cd2000 CR3:
0000000000201000 CR4: 00000000000006e0
Oct 7 22:26:50 linux kernel: [ 0.000000] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Oct 7 22:26:50 linux kernel: [ 0.000000] DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Oct 7 22:26:50 linux kernel: [ 0.000000]
Oct 7 22:26:50 linux kernel: [ 0.000000] Call Trace:
Oct 7 22:26:50 linux kernel: [ 0.000000]
[pptp:skb_dequeue+0x21/0x3d0] skb_dequeue+0x21/0x80
Oct 7 22:26:50 linux kernel: [ 0.000000]
[pptp:do_buf_work+0x38/0x150] :pptp:do_buf_work+0x38/0x150
Oct 7 22:26:50 linux kernel: [ 0.000000] [pptp:_buf_work+0x0/0x20]
:pptp:_buf_work+0x0/0x20
Oct 7 22:26:50 linux kernel: [ 0.000000] [run_workqueue+0xcc/0x170]
run_workqueue+0xcc/0x170
Oct 7 22:26:50 linux kernel: [ 0.000000] [worker_thread+0x0/0x110]
worker_thread+0x0/0x110
Oct 7 22:26:50 linux kernel: [ 0.000000] [worker_thread+0x0/0x110]
worker_thread+0x0/0x110
Oct 7 22:26:50 linux kernel: [ 0.000000] [worker_thread+0xa3/0x110]
worker_thread+0xa3/0x110
Oct 7 22:26:50 linux kernel: [ 0.000000] [<ffffffff80254260>]
autoremove_wake_function+0x0/0x30
Oct 7 22:26:50 linux kernel: [ 0.000000] [worker_thread+0x0/0x110]
worker_thread+0x0/0x110
Oct 7 22:26:50 linux kernel: [ 0.000000] [worker_thread+0x0/0x110]
worker_thread+0x0/0x110
Oct 7 22:26:50 linux kernel: [ 0.000000] [kthread+0x4b/0x80]
kthread+0x4b/0x80
Oct 7 22:26:50 linux kernel: [ 0.000000] [child_rip+0xa/0x12]
child_rip+0xa/0x12
Oct 7 22:26:50 linux kernel: [ 0.000000] [kthread+0x0/0x80]
kthread+0x0/0x80
Oct 7 22:26:50 linux kernel: [ 0.000000] [child_rip+0x0/0x12]
child_rip+0x0/0x12
Oct 7 22:26:50 linux kernel: [ 0.000000]
Oct 7 22:26:51 linux kernel: [ 0.000000] NETDEV WATCHDOG: eth0:
transmit timed out
Oct 7 22:26:53 linux kernel: [ 0.000000] printk: 19 messages suppressed.
Oct 7 22:26:53 linux kernel: [ 0.000000] dst cache overflow
Oct 7 22:26:56 linux kernel: [ 0.000000] NETDEV WATCHDOG: eth0:
transmit timed out
Oct 7 22:26:58 linux kernel: [ 0.000000] printk: 19 messages suppressed.
Oct 7 22:26:58 linux kernel: [ 0.000000] dst cache overflow
Oct 7 22:27:01 linux kernel: [ 0.000000] NETDEV WATCHDOG: eth0:
transmit timed out
Oct 7 22:27:02 linux kernel: [ 0.000000] CPU 2:
The server lost network connectivity until it was rebooted.
I guess it's due to "dst cache overflow".
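One way to test that guess is to watch the route cache counters while the box is under load. The following is only a minimal sketch: it assumes the kernel exposes /proc/net/stat/rt_cache as a header line followed by one row of hexadecimal counters per CPU, with an "entries" column and a "gc_dst_overflow" column. The SAMPLE text is made up for illustration and lists only a few columns; it is not output from this server.

```python
def parse_rt_cache_stat(text):
    """Parse rt_cache-style statistics: one header line, then hex rows."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # Every value is assumed to be hexadecimal, one row per CPU.
    return [dict(zip(header, (int(v, 16) for v in row))) for row in rows]


# Hypothetical, abbreviated sample (NOT real data from the server above).
SAMPLE = """\
entries in_hit gc_total gc_dst_overflow
0000a3f2 00012345 00000010 00000000
0000a3f2 00023456 00000020 00000003
"""

stats = parse_rt_cache_stat(SAMPLE)
# The entry count is assumed to be global and repeated on each row,
# so read it once; the overflow counter is summed across CPUs.
entries = stats[0]["entries"]
overflows = sum(row["gc_dst_overflow"] for row in stats)
print("route cache entries:", entries)
print("gc_dst_overflow total:", overflows)
```

On a live system the same function could be fed the contents of /proc/net/stat/rt_cache instead of SAMPLE. An entry count approaching net.ipv4.route.max_size, together with a growing overflow counter, would support the guess.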
Some of the custom sysctl variables:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.core.netdev_max_backlog = 4096
net.ipv4.conf.default.arp_filter = 1
net.ipv4.ip_default_ttl = 255
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
net.nf_conntrack_max = 1743087
net.ipv4.netfilter.ip_conntrack_max = 1743087
net.ipv4.tcp_max_orphans = 131072
net.ipv4.netfilter.ip_conntrack_generic_timeout = 300
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 432000
net.ipv4.netfilter.ip_conntrack_icmp_timeout = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close = 5
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_recv = 30
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait = 30
net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait = 20
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait = 20
net.ipv4.netfilter.ip_conntrack_tcp_timeout_last_ack = 30
net.ipv4.neigh.default.gc_thresh1 = 4096
net.ipv4.neigh.default.gc_thresh2 = 16384
net.ipv4.neigh.default.gc_thresh3 = 32768
net.ipv4.route.max_size = 1048576
net.ipv4.route.gc_thresh = 131072
net.ipv4.route.gc_elasticity = 4
net.ipv4.route.gc_interval = 1
net.ipv4.route.secret_interval = 3600
fs.file-max = 2097152
kernel.pid_max = 4194303
net.core.somaxconn = 640000
vm.min_free_kbytes = 65536
kernel.panic = 5
vm.swappiness = 0
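To make the route cache settings above easier to compare, here is a small sketch that parses "key = value" lines and relates net.ipv4.route.max_size to net.ipv4.route.gc_thresh and to the rhash_entries boot parameter. The function names are hypothetical helpers, and the sketch only reports the ratios; it makes no claim about which values are appropriate.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into a dict; values are kept as strings."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def route_cache_summary(settings, rhash_entries):
    """Relate the route cache limit to the GC threshold and hash size."""
    max_size = int(settings["net.ipv4.route.max_size"])
    gc_thresh = int(settings["net.ipv4.route.gc_thresh"])
    return {
        "max_size": max_size,
        "gc_thresh": gc_thresh,
        "max_size_over_gc_thresh": max_size / gc_thresh,
        "max_size_over_rhash_entries": max_size / rhash_entries,
    }


# Values copied from the sysctl list and the boot command line above.
SYSCTL_TEXT = """\
net.ipv4.route.max_size = 1048576
net.ipv4.route.gc_thresh = 131072
net.ipv4.route.gc_elasticity = 4
net.ipv4.route.gc_interval = 1
"""

summary = route_cache_summary(parse_sysctl(SYSCTL_TEXT), rhash_entries=1048575)
for name, value in summary.items():
    print(name, value)
```

With these numbers the cache may grow to eight times gc_thresh before hitting max_size, and max_size is roughly one entry per hash bucket.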
What can I do to prevent such situations?
Any advice would be appreciated. :)
Thanks!
--
Oleksandr Samoylyk
OVS-RIPE