All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Matt Carlson" <mcarlson@broadcom.com>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, andy@greyhouse.net,
	mcarlson@broadcom.com, "Michael Chan" <mchan@broadcom.com>
Subject: [PATCH net-next 09/14] tg3: Improve small packet performance
Date: Mon, 2 Aug 2010 14:26:03 -0700	[thread overview]
Message-ID: <1280784368-4226-9-git-send-email-mcarlson@broadcom.com> (raw)

smp_mb() inside tg3_tx_avail() is used twice in the normal
tg3_start_xmit() path (see illustration below).  The full memory
barrier is only necessary during race conditions with tx completion.
We can speed up the tx path by replacing smp_mb() in tg3_tx_avail()
with a compiler barrier.  The compiler barrier is to force the
compiler to fetch the tx_prod and tx_cons from memory.

In the race condition between tg3_start_xmit() and tg3_tx(),
we have the following situation:

tg3_start_xmit()                       tg3_tx()
    if (!tg3_tx_avail())
        BUG();

    ...

    if (!tg3_tx_avail())
        netif_tx_stop_queue();         update_tx_index();
        smp_mb();                      smp_mb();
        if (tg3_tx_avail())            if (netif_tx_queue_stopped() &&
            netif_tx_wake_queue();         tg3_tx_avail())

With smp_mb() removed from tg3_tx_avail(), we need to add smp_mb() to
tg3_start_xmit() as shown above to properly order netif_tx_stop_queue()
and tg3_tx_avail() to check the ring index.  If it is not strictly
ordered, the tx queue can be stopped forever.

This improves performance by about 3% with 2 ports running
bi-directional 64-byte packets.

Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
 drivers/net/tg3.c |   24 +++++++++++++++++++++++-
 1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 32e3a3d..820a7dd 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4386,7 +4386,8 @@ static void tg3_tx_recover(struct tg3 *tp)
 
 static inline u32 tg3_tx_avail(struct tg3_napi *tnapi)
 {
-	smp_mb();
+	/* Tell compiler to fetch tx indices from memory. */
+	barrier();
 	return tnapi->tx_pending -
 	       ((tnapi->tx_prod - tnapi->tx_cons) & (TG3_TX_RING_SIZE - 1));
 }
@@ -5670,6 +5671,13 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb,
 	tnapi->tx_prod = entry;
 	if (unlikely(tg3_tx_avail(tnapi) <= (MAX_SKB_FRAGS + 1))) {
 		netif_tx_stop_queue(txq);
+
+		/* netif_tx_stop_queue() must be done before checking
+		 * checking tx index in tg3_tx_avail() below, because in
+		 * tg3_tx(), we update tx index before checking for
+		 * netif_tx_queue_stopped().
+		 */
+		smp_mb();
 		if (tg3_tx_avail(tnapi) > TG3_TX_WAKEUP_THRESH(tnapi))
 			netif_tx_wake_queue(txq);
 	}
@@ -5715,6 +5723,13 @@ static int tg3_tso_bug(struct tg3 *tp, struct sk_buff *skb)
 	/* Estimate the number of fragments in the worst case */
 	if (unlikely(tg3_tx_avail(&tp->napi[0]) <= frag_cnt_est)) {
 		netif_stop_queue(tp->dev);
+
+		/* netif_tx_stop_queue() must be done before checking
+		 * checking tx index in tg3_tx_avail() below, because in
+		 * tg3_tx(), we update tx index before checking for
+		 * netif_tx_queue_stopped().
+		 */
+		smp_mb();
 		if (tg3_tx_avail(&tp->napi[0]) <= frag_cnt_est)
 			return NETDEV_TX_BUSY;
 
@@ -5950,6 +5965,13 @@ static netdev_tx_t tg3_start_xmit_dma_bug(struct sk_buff *skb,
 	tnapi->tx_prod = entry;
 	if (unlikely(tg3_tx_avail(tnapi) <= (MAX_SKB_FRAGS + 1))) {
 		netif_tx_stop_queue(txq);
+
+		/* netif_tx_stop_queue() must be done before checking
+		 * checking tx index in tg3_tx_avail() below, because in
+		 * tg3_tx(), we update tx index before checking for
+		 * netif_tx_queue_stopped().
+		 */
+		smp_mb();
 		if (tg3_tx_avail(tnapi) > TG3_TX_WAKEUP_THRESH(tnapi))
 			netif_tx_wake_queue(txq);
 	}
-- 
1.7.1



             reply	other threads:[~2010-08-02 21:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-02 21:26 Matt Carlson [this message]
2010-08-04 22:27 ` [PATCH net-next 09/14] tg3: Improve small packet performance Anton Blanchard
2010-08-04 22:46   ` Matt Carlson
2010-08-04 22:47   ` Michael Chan
2010-08-04 23:08     ` Anton Blanchard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1280784368-4226-9-git-send-email-mcarlson@broadcom.com \
    --to=mcarlson@broadcom.com \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=mchan@broadcom.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.