All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kok, Auke" <auke-jan.h.kok@intel.com>
To: Oleksandr Samoylyk <oleksandr@samoylyk.sumy.ua>
Cc: linux-kernel@vger.kernel.org
Subject: Re: e1000 latency problem or what ?
Date: Mon, 03 Mar 2008 13:48:00 -0800	[thread overview]
Message-ID: <47CC7210.5050806@intel.com> (raw)
In-Reply-To: <47CC45E5.9080600@samoylyk.sumy.ua>

Oleksandr Samoylyk wrote:
> Dear community,
> 
> I'm running a busy pptp-server which serves about 1500 users on about 80
> Mbit/s (> 15000 packets/sec) through it. I use poptop, traffic shaper
> and basic firewall rules.
> 
> As for hardware it's a server on SuperMicro Server on X7DB8+ serverboard
> server with 8 GB RAM on it running Ubuntu Server Edition.
> 
> As for problem, sometimes I get the following erors in logs:
> Mar  3 15:21:14 vpn kernel: [    0.000000] BUG: soft lockup - CPU#7
> stuck for 11s! [events/7:34]
> Mar  3 15:21:14 vpn kernel: [    0.000000] CPU 7:
> Mar  3 15:21:14 vpn kernel: [    0.000000] Modules linked in: af_packet
> xt_tcpmss iptable_mangle act_police sch_ingress cls_u32 sch_sfq sch_htb
> e1000 xt_multiport xt_TCPMSS xt_state xt_limit xt_tcpudp
> nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_irc nf_conntrack_irc
> nf_nat_ftp nf_conntrack_ftp iptable_nat nf_nat nf_conntrack_ipv4
> nf_conntrack e1000e iptable_filter ip_tables x_tables tcp_cubic pptp
> pppox ppp_generic slhc parport_pc lp parport loop iTCO_wdt
> iTCO_vendor_support pcspkr shpchp i5000_edac pci_hotplug evdev edac_core
> ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_piix pata_acpi aic94xx
> libsas ata_generic libata scsi_transport_sas scsi_mod fuse
> Mar  3 15:21:14 vpn kernel: [    0.000000] Pid: 34, comm: events/7 Not
> tainted 2.6.24-11-server #1
> Mar  3 15:21:14 vpn kernel: [    0.000000] RIP:
> 0010:[libata:_spin_unlock_irqrestore+0x8/0x26222]
> [libata:_spin_unlock_irqrestore+0x8/0x26222]
> _spin_unlock_irqrestore+0x8/0x10
> Mar  3 15:21:14 vpn kernel: [    0.000000] RSP: 0018:ffff810225a779a0
> EFLAGS: 00000246
> Mar  3 15:21:14 vpn kernel: [    0.000000] RAX: 0000000000000000 RBX:
> ffffc20001a1ab80 RCX: 0000000000000000
> Mar  3 15:21:14 vpn kernel: [    0.000000] RDX: 0000000000000014 RSI:
> 0000000000000246 RDI: ffff8102249d5aa8
> Mar  3 15:21:14 vpn kernel: [    0.000000] RBP: ffff810225a77920 R08:
> 0000000000001000 R09: 0000000002000000
> Mar  3 15:21:14 vpn kernel: [    0.000000] R10: 0000000000000000 R11:
> ffff81021ac01300 R12: ffffffff8020cfb6
> Mar  3 15:21:14 vpn kernel: [    0.000000] R13: 0000000000000000 R14:
> ffff8102249d5a80 R15: 0000000000000010
> Mar  3 15:21:14 vpn kernel: [    0.000000] FS:  0000000000000000(0000)
> GS:ffff81022590cd00(0000) knlGS:0000000000000000
> Mar  3 15:21:14 vpn kernel: [    0.000000] CS:  0010 DS: 0018 ES: 0018
> CR0: 000000008005003b
> Mar  3 15:21:14 vpn kernel: [    0.000000] CR2: 00007fa2778f7315 CR3:
> 0000000000201000 CR4: 00000000000006e0
> Mar  3 15:21:14 vpn kernel: [    0.000000] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Mar  3 15:21:14 vpn kernel: [    0.000000] DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> Mar  3 15:21:14 vpn kernel: [    0.000000] Call Trace:
> Mar  3 15:21:14 vpn kernel: [    0.000000]  <IRQ>  [<ffffffff8825ab1c>]
> :e1000:e1000_xmit_frame+0x8fc/0xcb0
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [e1000:____versions+0x56f3/0x7340]
> :nf_conntrack_ipv4:ipv4_confirm+0x33/0x60
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [dev_hard_start_xmit+0x22c/0x2a0] dev_hard_start_xmit+0x22c/0x2a0
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [__qdisc_run+0x74/0x1f0]
> __qdisc_run+0x74/0x1f0
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [dev_queue_xmit+0x1e8/0x300]
> dev_queue_xmit+0x1e8/0x300
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [pptp:pptp_xmit+0x567/0x7e0]
> :pptp:pptp_xmit+0x567/0x7e0
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [ppp_generic:ppp_push+0x45c/0x620] :ppp_generic:ppp_push+0x45c/0x620
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [<ffffffff8827b235>]
> :sch_htb:htb_safe_rb_erase+0x15/0x50
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [<ffffffff8827b635>]
> :sch_htb:htb_deactivate_prios+0x1a5/0x250
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [ppp_generic:ppp_xmit_process+0x469/0x640]
> :ppp_generic:ppp_xmit_process+0x469/0x640
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [ppp_generic:ppp_start_xmit+0x151/0x1e0]
> :ppp_generic:ppp_start_xmit+0x151/0x1e0
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [dev_hard_start_xmit+0x22c/0x2a0] dev_hard_start_xmit+0x22c/0x2a0
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [__qdisc_run+0x74/0x1f0]
> __qdisc_run+0x74/0x1f0
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [dev_queue_xmit+0x1e8/0x300]
> dev_queue_xmit+0x1e8/0x300
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [ip_rcv_finish+0x114/0x3b0]
> ip_rcv_finish+0x114/0x3b0
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [ip_rcv+0x212/0x300]
> ip_rcv+0x212/0x300
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [e1000:netif_receive_skb+0x3ac/0xd80] netif_receive_skb+0x3ac/0x490
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [<ffffffff8825c7a4>]
> :e1000:e1000_clean_rx_irq+0x1c4/0x590
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [<ffffffff88259609>]
> :e1000:e1000_clean+0x69/0x250
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [net_rx_action+0x128/0x230]
> net_rx_action+0x128/0x230
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [__do_softirq+0x75/0xe0]
> __do_softirq+0x75/0xe0
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [call_softirq+0x1c/0x30]
> call_softirq+0x1c/0x30
> Mar  3 15:21:14 vpn kernel: [    0.000000]  <EOI>
> [pptp:_ack_timeout_work+0x0/0x20] :pptp:_ack_timeout_work+0x0/0x20
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [do_softirq+0x35/0x90]
> do_softirq+0x35/0x90
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> [local_bh_enable_ip+0x54/0x60] local_bh_enable_ip+0x54/0x60
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [run_workqueue+0xcc/0x170]
> run_workqueue+0xcc/0x170
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [worker_thread+0xa3/0x110]
> worker_thread+0xa3/0x110
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [<ffffffff802543d0>]
> autoremove_wake_function+0x0/0x30
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [kthread+0x4b/0x80]
> kthread+0x4b/0x80
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [child_rip+0xa/0x12]
> child_rip+0xa/0x12
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [kthread+0x0/0x80]
> kthread+0x0/0x80
> Mar  3 15:21:14 vpn kernel: [    0.000000]  [child_rip+0x0/0x12]
> child_rip+0x0/0x12
> Mar  3 15:21:14 vpn kernel: [    0.000000]
> 
> One of my CPUs goes 100% with ksoftirqd.

sysreq-t?

> Latency increases signifcantly during this this (more than >100ms).
> 
> 
> ethtool -G eth0 tx 4096 rx 4096
> ethtool -G eth1 tx 4096 rx 4096

unless you're not seeing rx_no_buffer_count increase in the `ethtool -S ethx`
increase this is unnecesary

> echo "8192" > /proc/sys/vm/min_free_kbytes

ditto

> I attached IRQ to different CPUs by:
> /bin/echo "0f" > /proc/irq/2297/smp_affinity
> /bin/echo "f0" > /proc/irq/2296/smp_affinity
> 
> root@vpn:~# cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
>       CPU6       CPU7
>   0:         72          0          0          0          0          0
>          0          0    XT-PIC-XT        timer
>   1:          2          0          0          0          0          0
>          0          0    XT-PIC-XT        i8042
>   2:          0          0          0          0          0          0
>          0          0    XT-PIC-XT        cascade
>   8:          1          0          0          0          0          0
>          0          0    XT-PIC-XT        rtc
>  11:       9117          0          0          0          0          0
>          0          0    XT-PIC-XT        aic94xx
>  14:         84          0          0          0          0          0
>          0          0    XT-PIC-XT        libata
>  15:          0          0          0          0          0          0
>          0          0    XT-PIC-XT        libata
> 2296:      76853      76976      76639      76692     248073     247957
>     248079     248084   PCI-MSI-edge      eth1
> 2297:     266795     266672     267009     266956      93920      94035
>      93914      93908   PCI-MSI-edge      eth0

your cpu is round-robinning network interrupts which is bad, turn the in-kernel
irqbalance feature OFF and use the userspace irqbalance tool instead

> ifconfig eth0 txqueuelen 1000
> ifconfig eth1 txqueuelen 1000
> 
> I wanted to switch to e1000e driver instead of e1000 as I've in kernel:
> 
> CONFIG_E1000=m
> CONFIG_E1000E=m
> # CONFIG_E1000_DISABLE_PACKET_SPLIT is not set
> CONFIG_E1000_NAPI=y
> 
> I'm trying to load it by:
> modprobe -r e1000; modprobe e1000e; /etc/init.d/networking restart
> 
> but NICs aren't detected :(
> 
> Why so?

because that device is not going to be supported by e1000e until 2.6.25. you're
using 2.6.24 so you need to use e1000 just one kernel release longer.

> I have (lspci -vvv output):
> 
> 06:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
> Ethernet Controller (Copper) (rev 01)
>         Subsystem: Super Micro Computer Inc Unknown device 0000
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR+ FastB2B-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 2297
>         Region 0: Memory at d8300000 (32-bit, non-prefetchable) [size=128K]
>         Region 2: I/O ports at 3000 [size=32]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable+
>                 Address: 00000000fee0f00c  Data: 41b9
>         Capabilities: [e0] Express Endpoint IRQ 0
>                 Device: Supported: MaxPayload 256 bytes, PhantFunc 0,
> ExtTag-
>                 Device: Latency L0s <512ns, L1 <64us
>                 Device: AtnBtn- AtnInd- PwrInd-
>                 Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                 Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                 Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 Link: Supported Speed 2.5Gb/s, Width x4, ASPM unknown,
> Port 0
>                 Link: Latency L0s <128ns, L1 <64us
>                 Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
>                 Link: Speed 2.5Gb/s, Width x4
> 
> 06:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
> Ethernet Controller (Copper) (rev 01)
>         Subsystem: Super Micro Computer Inc Unknown device 0000
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR+ FastB2B-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin B routed to IRQ 2296
>         Region 0: Memory at d8320000 (32-bit, non-prefetchable) [size=128K]
>         Region 2: I/O ports at 3020 [size=32]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable+
>                 Address: 00000000feef000c  Data: 41c1
>         Capabilities: [e0] Express Endpoint IRQ 0
>                 Device: Supported: MaxPayload 256 bytes, PhantFunc 0,
> ExtTag-
>                 Device: Latency L0s <512ns, L1 <64us
>                 Device: AtnBtn- AtnInd- PwrInd-
>                 Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
>                 Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                 Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 Link: Supported Speed 2.5Gb/s, Width x4, ASPM unknown,
> Port 0
>                 Link: Latency L0s <128ns, L1 <64us
>                 Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
>                 Link: Speed 2.5Gb/s, Width x4
> 
> So I still have to use e1000 with InterruptThrottleRate=0,0
> RxIntDelay=0,0 RxAbsIntDelay=0,0 TxIntDelay=0,0 TxAbsIntDelay=0,0:

just use the defaults for e1000 please, unless those defaults don't work - and
then it's a bug.


I don't think e1000 is the issue here, I've definately not seen any reports of
e1000+pptp giving issues. perhaps one of the pptp/pppd developers can look at the
stack trace.

Auke

  reply	other threads:[~2008-03-03 21:56 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-03 18:39 e1000 latency problem or what ? Oleksandr Samoylyk
2008-03-03 21:48 ` Kok, Auke [this message]
2008-03-03 22:21   ` Eric Dumazet
2008-03-03 22:39     ` Oleksandr Samoylyk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47CC7210.5050806@intel.com \
    --to=auke-jan.h.kok@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksandr@samoylyk.sumy.ua \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.