From: Eric Dumazet <eric.dumazet@gmail.com>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jesper Dangaard Brouer <jdb@comx.dk>,
Robert Olsson <robert@herjulf.net>,
netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: [PATCH] pktgen: Avoid dirtying skb->users when txq is full
Date: Thu, 01 Oct 2009 01:03:33 +0200
Message-ID: <4AC3E3C5.1090108@gmail.com>
In-Reply-To: <20090923174141.1d350103@s6510>
Stephen Hemminger wrote:
> On Tue, 22 Sep 2009 22:49:02 -0700
> Stephen Hemminger <shemminger@vyatta.com> wrote:
>
>> I thought others want to know how to get maximum speed of pktgen.
>>
>> 1. Run nothing else (even X11), just a command line
>> 2. Make sure ethernet flow control is disabled
>> ethtool -A eth0 autoneg off rx off tx off
>> 3. Make sure clocksource is TSC. On my old SMP Opterons I
>> needed a patch, since in 2.6.30 or later the clock gurus
>> decided to remove it on all non-Intel machines. Look for the patch
>> that enables "tsc=reliable"
>> 4. Compile Ethernet drivers in, the overhead of the indirect
>> function call required for modules (or cache footprint),
>> slows things down.
>> 5. Increase transmit ring size to 1000
>> ethtool -G eth0 tx 1000
>>
Thanks a lot Stephen.
I ran some pktgen sessions tonight and found contention on the skb->users
field, which the following patch avoids.
Before patch:
------------------------------------------------------------------------------
PerfTop: 5187 irqs/sec kernel:100.0% [100000 cycles], (all, cpu: 0)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
16688.00 - 50.9% : consume_skb
6541.00 - 20.0% : skb_dma_unmap
3277.00 - 10.0% : tg3_poll
1968.00 - 6.0% : mwait_idle
651.00 - 2.0% : irq_entries_start
466.00 - 1.4% : _spin_lock
442.00 - 1.3% : mix_pool_bytes_extract
373.00 - 1.1% : tg3_msi
353.00 - 1.1% : read_tsc
177.00 - 0.5% : sched_clock_local
176.00 - 0.5% : sched_clock
137.00 - 0.4% : tick_nohz_stop_sched_tick
After patch:
------------------------------------------------------------------------------
PerfTop: 3530 irqs/sec kernel:99.9% [100000 cycles], (all, cpu: 0)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
17127.00 - 34.0% : tg3_poll
12691.00 - 25.2% : consume_skb
5299.00 - 10.5% : skb_dma_unmap
4179.00 - 8.3% : mwait_idle
1583.00 - 3.1% : irq_entries_start
1288.00 - 2.6% : mix_pool_bytes_extract
1239.00 - 2.5% : tg3_msi
1062.00 - 2.1% : read_tsc
583.00 - 1.2% : _spin_lock
432.00 - 0.9% : sched_clock
416.00 - 0.8% : sched_clock_local
360.00 - 0.7% : tick_nohz_stop_sched_tick
329.00 - 0.7% : ktime_get
263.00 - 0.5% : _spin_lock_irqsave
I believe we could go further by batching the atomic_inc(&skb->users) we do all
the time, which competes with the atomic_dec() done by the tx completion handler
(possibly running on another cpu): reserve XXXXXXX units in skb->users, decrement
a pktgen-local variable, and refill the reserve when necessary, once in a while...
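To make the batching idea concrete, here is a rough userspace sketch, not
tested in the kernel and not part of the patch below. All names (fake_skb,
ref_reserve, reserve_get, ...) are made up for illustration, and C11
<stdatomic.h> stands in for the kernel's atomic_t. The sender takes references
in bulk with one atomic add and then hands them out from a private counter,
so only the completion side touches the shared counter on every packet.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the shared refcount (skb->users). */
struct fake_skb {
	atomic_int users;
};

/* Hypothetical sender-private state, never touched by the completion side. */
struct ref_reserve {
	struct fake_skb *skb;
	int local_refs;		/* references reserved but not yet handed out */
	int batch;		/* how many references to reserve per refill */
	long atomic_ops;	/* atomic ops issued by the sender (for illustration) */
};

static void reserve_init(struct ref_reserve *r, struct fake_skb *skb, int batch)
{
	assert(batch > 0);
	atomic_init(&skb->users, 1);	/* the owner's own reference */
	r->skb = skb;
	r->local_refs = 0;
	r->batch = batch;
	r->atomic_ops = 0;
}

/* Take one reference before handing the packet to the driver. */
static void reserve_get(struct ref_reserve *r)
{
	if (r->local_refs == 0) {
		/* Refill: one atomic add covers the next 'batch' transmits. */
		atomic_fetch_add(&r->skb->users, r->batch);
		r->local_refs = r->batch;
		r->atomic_ops++;
	}
	r->local_refs--;	/* plain decrement, no shared cache line dirtied */
}

/* What the tx completion handler does, possibly on another cpu. */
static void fake_tx_complete(struct fake_skb *skb)
{
	atomic_fetch_sub(&skb->users, 1);
}

/* Give back unused reserved references before the owner drops the skb. */
static void reserve_release(struct ref_reserve *r)
{
	if (r->local_refs > 0) {
		atomic_fetch_sub(&r->skb->users, r->local_refs);
		r->local_refs = 0;
		r->atomic_ops++;
	}
}
```

With a batch of 64 and 1000 transmits, the sender issues 16 refills plus one
final release, i.e. 17 atomic operations instead of 1000 increments. The price
is that skb->users overstates the number of live references while a reserve is
outstanding, so the reserve must be returned before anything relies on the
exact count.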
[PATCH] pktgen: Avoid dirtying skb->users when txq is full
We can avoid two atomic ops on skb->users if the packet is not going to be
sent to the device (because the hardware tx queue is full).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 4d11c28..6a9ab28 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3439,12 +3439,14 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 	txq = netdev_get_tx_queue(odev, queue_map);
 
 	__netif_tx_lock_bh(txq);
-	atomic_inc(&(pkt_dev->skb->users));
 
-	if (unlikely(netif_tx_queue_stopped(txq) || netif_tx_queue_frozen(txq)))
+	if (unlikely(netif_tx_queue_stopped(txq) || netif_tx_queue_frozen(txq))) {
 		ret = NETDEV_TX_BUSY;
-	else
-		ret = (*xmit)(pkt_dev->skb, odev);
+		pkt_dev->last_ok = 0;
+		goto unlock;
+	}
+	atomic_inc(&(pkt_dev->skb->users));
+	ret = (*xmit)(pkt_dev->skb, odev);
 
 	switch (ret) {
 	case NETDEV_TX_OK:
@@ -3466,6 +3468,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		atomic_dec(&(pkt_dev->skb->users));
 		pkt_dev->last_ok = 0;
 	}
+unlock:
 	__netif_tx_unlock_bh(txq);
 
 	/* If pkt_dev->count is zero, then run forever */