From: Denys Fedoryshchenko <denys@visp.net.lb>
To: netdev@vger.kernel.org
Subject: Re: thousands of classes, e1000 TX unit hang
Date: Tue, 5 Aug 2008 13:05:40 +0300 [thread overview]
Message-ID: <200808051305.41064.denys@visp.net.lb> (raw)
In-Reply-To: <200808051106.53841.denys@visp.net.lb>
[-- Attachment #1: Type: text/plain, Size: 11579 bytes --]
I found that the packet loss happens when I am deleting/adding classes.
I attach the oprofile result as a file.
On Tuesday 05 August 2008, Denys Fedoryshchenko wrote:
> A little bit more info:
>
> I also ran oprofile on another machine (which doesn't suffer as much, but I
> can also notice drops on eth0 after adding around 100 interfaces). On the
> first machine the clocksource is TSC; on the machine where I read the stats
> it is acpi_pm.
>
> CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
> Counted GLOBAL_POWER_EVENTS events (time during which processor is not
> stopped) with a unit mask of 0x01 (mandatory) count 100000
> GLOBAL_POWER_E...|
> samples| %|
> ------------------
> 973464 75.7644 vmlinux
> 97703 7.6042 libc-2.6.1.so
> 36166 2.8148 cls_fw
> 18290 1.4235 nf_conntrack
> 17946 1.3967 busybox
>
> CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
> Counted GLOBAL_POWER_EVENTS events (time during which processor is not
> stopped) with a unit mask of 0x01 (mandatory) count 100000
> samples % symbol name
> 245545 23.1963 acpi_pm_read
> 143863 13.5905 __copy_to_user_ll
> 121269 11.4561 ioread16
> 58609 5.5367 gen_kill_estimator
> 40153 3.7932 ioread32
> 33923 3.2047 ioread8
> 16491 1.5579 arch_task_cache_init
> 16067 1.5178 sysenter_past_esp
> 11604 1.0962 find_get_page
> 10631 1.0043 est_timer
> 9038 0.8538 get_page_from_freelist
> 8681 0.8201 sk_run_filter
> 8077 0.7630 irq_entries_start
> 7711 0.7284 schedule
> 6451 0.6094 copy_to_user
>
> On Tuesday 05 August 2008, Denys Fedoryshchenko wrote:
> > I made a script that looks something like this (to simulate SFQ with the
> > flow classifier):
> >
> > # $2 is the ppp interface
> > echo "qdisc del dev $2 root ">>${TEMP}
> > echo "qdisc add dev $2 root handle 1: htb ">>${TEMP}
> > echo "filter add dev $2 protocol ip pref 16 parent 1: u32 \
> > match ip dst 0.0.0.0/0 police rate 8kbit burst 2048kb \
> > peakrate 1024Kbit mtu 10000 \
> > conform-exceed continue/ok">>${TEMP}
> >
> > echo "filter add dev $2 protocol ip pref 32 parent 1: handle 1 \
> > flow hash keys nfct divisor 128 baseclass 1:2">>${TEMP}
> >
> > echo "class add dev $2 parent 1: classid 1:1 htb \
> > rate ${rate}bit ceil ${rate}Kbit quantum 1514">>${TEMP}
> >
> > #Cycle to add 128 classes
> > maxslot=130
> > for slot in `seq 2 $maxslot`; do
> > echo "class add dev $2 parent 1:1 classid 1:$slot htb \
> > rate 8Kbit ceil 256Kbit quantum 1514">>${TEMP}
> > echo "qdisc add dev $2 handle $slot: parent 1:$slot bfifo limit
> > 3000">>${TEMP} done
> >
> > After adding around 400-450 (ppp) interfaces the server starts to "crack":
> > there is packet loss on eth0 (even though there are no filters or shapers
> > on it), and even deleting all the classes becomes a challenge. After
> > deleting the root handles on all ppp interfaces it becomes ok again.
> >
> >
> > Traffic over the host is 15-20 Mbit/s at that moment. It is a single-CPU
> > Xeon 3.0 GHz on an SE7520 server motherboard with 1 GB of RAM (more than
> > 512 MB was free at the time of testing).
> >
> > The kernel is vanilla 2.6.26.1.
> > Is there anything else I should add?
> >
> > Error messages appearing in dmesg:
> > [149650.006939] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149650.006943] Tx Queue <0>
> > [149650.006944] TDH <a3>
> > [149650.006945] TDT <a3>
> > [149650.006947] next_to_use <a3>
> > [149650.006948] next_to_clean <f8>
> > [149650.006949] buffer_info[next_to_clean]
> > [149650.006951] time_stamp <8e69a7c>
> > [149650.006952] next_to_watch <f8>
> > [149650.006953] jiffies <8e6a111>
> > [149650.006954] next_to_watch.status <1>
> > [149655.964100] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149655.964104] Tx Queue <0>
> > [149655.964105] TDH <6c>
> > [149655.964107] TDT <6c>
> > [149655.964108] next_to_use <6c>
> > [149655.964109] next_to_clean <c1>
> > [149655.964111] buffer_info[next_to_clean]
> > [149655.964112] time_stamp <8e6b198>
> > [149655.964113] next_to_watch <c1>
> > [149655.964115] jiffies <8e6b853>
> > [149655.964116] next_to_watch.status <1>
> > [149666.765110] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149666.765110] Tx Queue <0>
> > [149666.765110] TDH <28>
> > [149666.765110] TDT <28>
> > [149666.765110] next_to_use <28>
> > [149666.765110] next_to_clean <7e>
> > [149666.765110] buffer_info[next_to_clean]
> > [149666.765110] time_stamp <8e6db6a>
> > [149666.765110] next_to_watch <7e>
> > [149666.765110] jiffies <8e6e27f>
> > [149666.765110] next_to_watch.status <1>
> > [149668.629051] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149668.629056] Tx Queue <0>
> > [149668.629058] TDH <1b>
> > [149668.629060] TDT <1b>
> > [149668.629062] next_to_use <1b>
> > [149668.629064] next_to_clean <f1>
> > [149668.629066] buffer_info[next_to_clean]
> > [149668.629068] time_stamp <8e6e4c3>
> > [149668.629070] next_to_watch <f1>
> > [149668.629072] jiffies <8e6e9c7>
> > [149668.629074] next_to_watch.status <1>
> > [149676.606031] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149676.606035] Tx Queue <0>
> > [149676.606037] TDH <9b>
> > [149676.606038] TDT <9b>
> > [149676.606039] next_to_use <9b>
> > [149676.606040] next_to_clean <f0>
> > [149676.606042] buffer_info[next_to_clean]
> > [149676.606043] time_stamp <8e7024c>
> > [149676.606044] next_to_watch <f0>
> > [149676.606046] jiffies <8e708eb>
> > [149676.606047] next_to_watch.status <1>
> > [149680.151750] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149680.151750] Tx Queue <0>
> > [149680.151750] TDH <84>
> > [149680.151750] TDT <84>
> > [149680.151750] next_to_use <84>
> > [149680.151750] next_to_clean <d9>
> > [149680.151750] buffer_info[next_to_clean]
> > [149680.151750] time_stamp <8e7100d>
> > [149680.151750] next_to_watch <d9>
> > [149680.151750] jiffies <8e716c3>
> > [149680.151750] next_to_watch.status <1>
> > [149680.153751] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149680.153751] Tx Queue <0>
> > [149680.153751] TDH <aa>
> > [149680.153751] TDT <d2>
> > [149680.153751] next_to_use <d2>
> > [149680.153751] next_to_clean <2d>
> > [149680.153751] buffer_info[next_to_clean]
> > [149680.153751] time_stamp <8e710db>
> > [149680.153751] next_to_watch <2d>
> > [149680.153751] jiffies <8e716c5>
> > [149680.153751] next_to_watch.status <1>
> > [149702.565549] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149702.565549] Tx Queue <0>
> > [149702.565549] TDH <3c>
> > [149702.565549] TDT <3c>
> > [149702.565549] next_to_use <3c>
> > [149702.565549] next_to_clean <91>
> > [149702.565549] buffer_info[next_to_clean]
> > [149702.565549] time_stamp <8e7676e>
> > [149702.565549] next_to_watch <91>
> > [149702.565549] jiffies <8e76e48>
> > [149702.565549] next_to_watch.status <1>
> > [149708.020581] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149708.020581] Tx Queue <0>
> > [149708.020581] TDH <4c>
> > [149708.020581] TDT <4c>
> > [149708.020581] next_to_use <4c>
> > [149708.020581] next_to_clean <a1>
> > [149708.020581] buffer_info[next_to_clean]
> > [149708.020581] time_stamp <8e77cc3>
> > [149708.020581] next_to_watch <a1>
> > [149708.020581] jiffies <8e78394>
> > [149708.020581] next_to_watch.status <1>
> > [149713.864829] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149713.864833] Tx Queue <0>
> > [149713.864835] TDH <b0>
> > [149713.864836] TDT <b0>
> > [149713.864837] next_to_use <b0>
> > [149713.864839] next_to_clean <5>
> > [149713.864840] buffer_info[next_to_clean]
> > [149713.864841] time_stamp <8e7937b>
> > [149713.864842] next_to_watch <5>
> > [149713.864844] jiffies <8e79a64>
> > [149713.864845] next_to_watch.status <1>
> > [149759.710721] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149759.710726] Tx Queue <0>
> > [149759.710729] TDH <88>
> > [149759.710730] TDT <88>
> > [149759.710732] next_to_use <88>
> > [149759.710734] next_to_clean <dd>
> > [149759.710736] buffer_info[next_to_clean]
> > [149759.710738] time_stamp <8e8465c>
> > [149759.710740] next_to_watch <dd>
> > [149759.710742] jiffies <8e84d6f>
> > [149759.710744] next_to_watch.status <1>
> > [149759.712712] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149759.712715] Tx Queue <0>
> > [149759.712717] TDH <84>
> > [149759.712719] TDT <90>
> > [149759.712721] next_to_use <90>
> > [149759.712723] next_to_clean <e5>
> > [149759.712725] buffer_info[next_to_clean]
> > [149759.712726] time_stamp <8e84782>
> > [149759.712728] next_to_watch <e5>
> > [149759.712730] jiffies <8e84d71>
> > [149759.712732] next_to_watch.status <1>
> > [149768.334753] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149768.334757] Tx Queue <0>
> > [149768.334758] TDH <92>
> > [149768.334760] TDT <92>
> > [149768.334761] next_to_use <92>
> > [149768.334762] next_to_clean <e7>
> > [149768.334764] buffer_info[next_to_clean]
> > [149768.334765] time_stamp <8e86829>
> > [149768.334766] next_to_watch <e7>
> > [149768.334767] jiffies <8e86f1c>
> > [149768.334769] next_to_watch.status <1>
> > [149776.537825] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> > [149776.537825] Tx Queue <0>
> > [149776.537825] TDH <4e>
> > [149776.537825] TDT <4e>
> > [149776.537825] next_to_use <4e>
> > [149776.537825] next_to_clean <a3>
> > [149776.537825] buffer_info[next_to_clean]
> > [149776.537825] time_stamp <8e8882b>
> > [149776.537825] next_to_watch <a3>
> > [149776.537825] jiffies <8e88f21>
> > [149776.537825] next_to_watch.status <1>
[-- Attachment #2: oprofile_result1.txt --]
[-- Type: text/plain, Size: 6113 bytes --]
CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 10000
samples % symbol name
348827 33.2260 gen_kill_estimator
124818 11.8890 acpi_pm_read
44388 4.2280 ioread16
35074 3.3408 __copy_to_user_ll
33251 3.1672 rtnl_fill_ifinfo
18708 1.7819 ioread32
18350 1.7478 netif_receive_skb
17045 1.6235 copy_to_user
15829 1.5077 __nla_reserve
15261 1.4536 page_fault
14644 1.3948 rtnl_dump_ifinfo
12348 1.1762 __nla_put
11149 1.0619 dev_queue_xmit
11077 1.0551 ioread8
8327 0.7932 sysenter_past_esp
7279 0.6933 est_timer
6631 0.6316 get_page_from_freelist
4863 0.4632 handle_mm_fault
4817 0.4588 csum_partial
4782 0.4555 strlen
4644 0.4423 find_vma
4422 0.4212 nla_put
4406 0.4197 ip_route_input
4200 0.4001 skb_put
Here are the details from opannotate:
c024dc30 <gen_kill_estimator>: /* gen_kill_estimator total: 266925 23.2740 */
:c024dc30: push %ebp
1 8.7e-05 :c024dc31: mov %esp,%ebp
:c024dc33: push %edi
:c024dc34: xor %edi,%edi
:c024dc36: push %esi
:c024dc37: push %ebx
:c024dc38: sub $0x10,%esp
:c024dc3b: mov %eax,0xffffffe8(%ebp)
:c024dc3e: mov %edx,0xffffffe4(%ebp)
:c024dc41: movl $0xc08630bc,0xfffffff0(%ebp)
:c024dc48: mov 0xfffffff0(%ebp),%eax
1 8.7e-05 :c024dc4b: cmpl $0x0,(%eax)
26 0.0023 :c024dc4e: je c024dcb6 <gen_kill_estimator+0x86>
3 2.6e-04 :c024dc50: mov 0xc(%eax),%ebx
3 2.6e-04 :c024dc53: mov %edi,%eax
:c024dc55: shl $0x5,%eax
:c024dc58: add $0xc08630c8,%eax
1 8.7e-05 :c024dc5d: mov (%ebx),%esi
4 3.5e-04 :c024dc5f: mov %eax,0xffffffec(%ebp)
:c024dc62: jmp c024dcb1 <gen_kill_estimator+0x81>
782 0.0682 :c024dc64: mov 0xffffffe4(%ebp),%eax
390 0.0340 :c024dc67: cmp %eax,0xc(%ebx)
12239 1.0672 :c024dc6a: jne c024dcad <gen_kill_estimator+0x7d>
:c024dc6c: mov 0xffffffe8(%ebp),%eax
:c024dc6f: cmp %eax,0x8(%ebx)
:c024dc72: jne c024dcad <gen_kill_estimator+0x7d>
:c024dc74: mov $0xc036d620,%eax
:c024dc79: call c02a7817 <_write_lock_bh>
3 2.6e-04 :c024dc7e: mov $0xc036d620,%eax
:c024dc83: movl $0x0,0x8(%ebx)
:c024dc8a: call c02a78ac <_write_unlock_bh>
:c024dc8f: mov 0x4(%ebx),%edx
:c024dc92: mov (%ebx),%eax
:c024dc94: mov %edx,0x4(%eax)
:c024dc97: mov %eax,(%edx)
:c024dc99: lea 0x2c(%ebx),%eax
:c024dc9c: mov $0xc024dcc8,%edx
:c024dca1: movl $0x200200,0x4(%ebx)
:c024dca8: call c014279c <call_rcu>
6905 0.6021 :c024dcad: mov %esi,%ebx
1716 0.1496 :c024dcaf: mov (%esi),%esi
232351 20.2593 :c024dcb1: cmp 0xffffffec(%ebp),%ebx
12492 1.0892 :c024dcb4: jne c024dc64 <gen_kill_estimator+0x34>
2 1.7e-04 :c024dcb6: inc %edi
5 4.4e-04 :c024dcb7: addl $0x20,0xfffffff0(%ebp)
:c024dcbb: cmp $0x6,%edi
:c024dcbe: jne c024dc48 <gen_kill_estimator+0x18>
:c024dcc0: add $0x10,%esp
:c024dcc3: pop %ebx
:c024dcc4: pop %esi
1 8.7e-05 :c024dcc5: pop %edi
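The hot cmp at c024dcb1 (20% of all samples) is the loop-termination check of
the estimator list walk. For reference, gen_kill_estimator() in 2.6.26 looks
roughly like this (a paraphrased sketch matching the disassembly above, not
the verbatim source; the six-interval outer loop, the field offsets, the
0x200200 LIST_POISON2 store and the call_rcu are all visible in the asm):

/*
 * Paraphrased sketch of net/core/gen_estimator.c:gen_kill_estimator()
 * as of 2.6.26.  elist[] holds one list of active estimators per
 * averaging interval; EST_MAX_INTERVAL is 5, hence the cmp $0x6.
 */
void gen_kill_estimator(struct gnet_stats_basic *bstats,
                        struct gnet_stats_rate_est *rate_est)
{
	int idx;
	struct gen_estimator *e, *n;

	for (idx = 0; idx <= EST_MAX_INTERVAL; idx++) {
		/* skip intervals that never had an estimator */
		if (!elist[idx].timer.function)
			continue;

		/* linear scan over every active estimator in this interval */
		list_for_each_entry_safe(e, n, &elist[idx].list, list) {
			if (e->rate_est != rate_est || e->bstats != bstats)
				continue;

			write_lock_bh(&est_lock);
			e->bstats = NULL;	/* est_timer will skip it */
			write_unlock_bh(&est_lock);

			list_del_rcu(&e->list);	/* the 0x200200 poison store */
			call_rcu(&e->e_rcu, __gen_kill_estimator);
		}
	}
}

So every class/qdisc delete rescans every active estimator. With ~450 ppp
interfaces times ~130 HTB classes there can be tens of thousands of entries
on these lists: teardown becomes O(n^2), and est_timer has to walk the same
lists from the timer softirq, which would fit both profiles above and the
stalls that show up as e1000 Tx unit hangs.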