From: Alexander Duyck <alexander.duyck@gmail.com>
To: Tom Lendacky <thomas.lendacky@amd.com>, netdev@vger.kernel.org
Cc: David Miller <davem@davemloft.net>,
Christoffer Dall <christoffer.dall@linaro.org>
Subject: Re: [PATCH net] amd-xgbe: Use wmb before updating current descriptor count
Date: Fri, 23 Oct 2015 09:29:44 -0700 [thread overview]
Message-ID: <562A6078.9070709@gmail.com> (raw)
In-Reply-To: <20151021203705.29412.11836.stgit@tlendack-t1.amdoffice.net>
On 10/21/2015 01:37 PM, Tom Lendacky wrote:
> The code currently uses the lightweight dma_wmb barrier before updating
> the current descriptor count. Under heavy load, the Tx cleanup routine
> was seeing the updated current descriptor count before the updated
> descriptor information. As a result, the Tx descriptor was being cleaned
> up before it was used because it was not "owned" by the hardware yet,
> resulting in a Tx queue hang.
>
> Using the wmb barrier insures that the descriptor is updated before the
> descriptor counter preventing the Tx queue hang. For extra insurance,
> the Tx cleanup routine is changed to grab the current decriptor count on
> entry and uses that initial value in the processing loop rather than
> trying to chase the current value.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
This shouldn't be using wmb() or dma_wmb(). It looks like you are just
using the barriers to synchronize between CPUs. If that is the case you
should probably be using smp_wmb()/smp_rmb().
> ---
> drivers/net/ethernet/amd/xgbe/xgbe-dev.c | 2 +-
> drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 +++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
> index a4473d8..e9ab8b9 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
> @@ -1595,7 +1595,7 @@ static void xgbe_dev_xmit(struct xgbe_channel *channel)
> packet->rdesc_count, 1);
>
> /* Make sure ownership is written to the descriptor */
> - dma_wmb();
> + wmb();
>
> ring->cur = cur_index + 1;
> if (!packet->skb->xmit_more ||
If anything you could probably just replace it with an smp_wmb() since
the device doesn't ever read the ring->cur value. Using a wmb() here is
just doing unnecessary harm to your performance as you aren't dealing
with multiple memory domains.
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index aae9d5e..d2b77d9 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -1807,6 +1807,7 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
> struct netdev_queue *txq;
> int processed = 0;
> unsigned int tx_packets = 0, tx_bytes = 0;
> + unsigned int cur;
>
> DBGPR("-->xgbe_tx_poll\n");
>
> @@ -1814,10 +1815,11 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
> if (!ring)
> return 0;
>
> + cur = ring->cur;
> txq = netdev_get_tx_queue(netdev, channel->queue_index);
>
> while ((processed < XGBE_TX_DESC_MAX_PROC) &&
> - (ring->dirty != ring->cur)) {
> + (ring->dirty != cur)) {
> rdata = XGBE_GET_DESC_DATA(ring, ring->dirty);
> rdesc = rdata->rdesc;
>
>
I believe your bug is down here. You likely need an smp_rmb() between
the read of ring->cur and rdata. Otherwise there is nothing to prevent
the compiler from reordering this so that it reads the descriptor data
ahead of ring->cur.
Moving the read out of the loop likely resolves the issue as it becomes
harder for the compiler to optimize the read versus the code in the
loop. You may want to just try reverting this patch and starting the
loop with a smp_rmb() here and replace the dma_wmb() with an smp_wmb()
above and you should see the race between the interrupt routine and the
transmit path resolved as I believe the current fix is still a bit racy.
- Alex
next prev parent reply other threads:[~2015-10-23 16:29 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-21 20:37 [PATCH net] amd-xgbe: Use wmb before updating current descriptor count Tom Lendacky
2015-10-23 9:59 ` David Miller
2015-10-23 13:35 ` Tom Lendacky
2015-10-23 14:02 ` David Miller
2015-10-23 16:29 ` Alexander Duyck [this message]
2015-10-23 16:57 ` Tom Lendacky
2015-10-24 0:43 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=562A6078.9070709@gmail.com \
--to=alexander.duyck@gmail.com \
--cc=christoffer.dall@linaro.org \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=thomas.lendacky@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).