From: Robert Hancock <robert.hancock@calian.com>
To: "kuba@kernel.org" <kuba@kernel.org>
Cc: "pabeni@redhat.com" <pabeni@redhat.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"davem@davemloft.net" <davem@davemloft.net>,
"michal.simek@xilinx.com" <michal.simek@xilinx.com>,
"radhey.shyam.pandey@xilinx.com" <radhey.shyam.pandey@xilinx.com>,
"edumazet@google.com" <edumazet@google.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH net-next v5] net: axienet: Use NAPI for TX completion path
Date: Wed, 11 May 2022 17:16:55 +0000
Message-ID: <f3868a5f9abe263f4ebebd21382cd022afa6a029.camel@calian.com>
In-Reply-To: <20220510185639.1c6d6c8a@kernel.org>

On Tue, 2022-05-10 at 18:56 -0700, Jakub Kicinski wrote:
> On Mon, 9 May 2022 11:30:39 -0600 Robert Hancock wrote:
> > This driver was using the TX IRQ handler to perform all TX completion
> > tasks. Under heavy TX network load, this can cause significant irqs-off
> > latencies (found to be in the hundreds of microseconds using ftrace).
> > This can cause other issues, such as overrunning serial UART FIFOs when
> > using high baud rates with limited UART FIFO sizes.
> >
> > Switch to using a NAPI poll handler to perform the TX completion work
> > to get this out of hard IRQ context and avoid the IRQ latency impact.
> > A separate poll handler is used for TX and RX since they have separate
> > IRQs on this controller, so that the completion work for each of them
> > stays on the same CPU as the interrupt.
> >
> > Testing on a Xilinx MPSoC ZU9EG platform using iperf3 from a Linux PC
> > through a switch at 1G link speed showed no significant change in TX or
> > RX throughput, with approximately 941 Mbps before and after. Hard IRQ
> > time in the TX throughput test was significantly reduced from 12% to
> > below 1% on the CPU handling TX interrupts, with total hard+soft IRQ CPU
> > usage dropping from about 56% down to 48%.
> >
> > Signed-off-by: Robert Hancock <robert.hancock@calian.com>
> > ---
> >
> > Changed since v4: Added locking to protect TX ring tail pointer against
> > concurrent access by TX transmit and TX poll paths.
>
> Hi, sorry for a late reply there's just too many patches to look at
> lately.
>
> The lock is slightly concerning, the driver follows the usual wake up
> scheme based on memory barriers. If we add the lock we should probably
> take the barriers out.

So there are basically two places where there is contention: axienet_start_xmit,
where it advances the tail pointer after adding more entries to the TX ring,
and the TX poll function calling axienet_check_tx_bd_space, where it uses the
tail pointer to see whether there is enough space in the TX ring to wake the
queue. I suppose barriers are likely sufficient if the code updating the ring
pointer is more careful about how it does so - for example, in the snippet
quoted below it advances the pointer and then moves it back to 0 if it has gone
past the end of the ring. That would need to change so the pointer is updated
only once, without an intermediate state where it holds an invalid position.
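
A minimal sketch of that single-update approach, assuming the existing
lp->tx_bd_tail / lp->tx_bd_num fields and using a hypothetical helper name
(this is not code from the driver):

/*
 * Hypothetical helper: compute the next tail index locally and publish it
 * with a single WRITE_ONCE(), so a concurrent reader in the TX poll path
 * only ever sees either the old or the new in-range value.
 */
static void axienet_advance_tx_tail(struct axienet_local *lp)
{
	u32 next = lp->tx_bd_tail + 1;

	if (next >= lp->tx_bd_num)
		next = 0;

	WRITE_ONCE(lp->tx_bd_tail, next);
}

The reader side (axienet_check_tx_bd_space) would then pair this with a
READ_ONCE() of tx_bd_tail.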

I think the stability issue I saw earlier was not actually due to these
changes, however, but to similar changes in v1 of the "net: macb: use NAPI for
TX completion path" patch. That driver previously relied on the TX completion
path being protected by a spinlock taken in the IRQ handler, and that
protection was lost when the TX completion work was moved to a poll function.
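
Restoring that protection would basically mean the NAPI poll function taking
the same lock the IRQ handler used to hold. A rough sketch of the pattern, with
placeholder names rather than the actual macb structures:

/* Placeholder queue state - field names are illustrative only. */
struct example_queue {
	struct napi_struct tx_napi;
	spinlock_t tx_lock;
	/* ... descriptor ring state ... */
};

static int example_tx_complete(struct example_queue *q, int budget);

/*
 * TX reclaim used to run inside the IRQ handler under tx_lock; once it
 * moves to NAPI poll, the poll function has to take that lock itself so
 * it still serializes against the transmit path.
 */
static int example_tx_poll(struct napi_struct *napi, int budget)
{
	struct example_queue *q = container_of(napi, struct example_queue, tx_napi);
	unsigned long flags;
	int cleaned;

	spin_lock_irqsave(&q->tx_lock, flags);
	cleaned = example_tx_complete(q, budget);
	spin_unlock_irqrestore(&q->tx_lock, flags);

	if (cleaned < budget) {
		napi_complete_done(napi, cleaned);
		/* re-enable the TX completion interrupt here */
	}

	return cleaned;
}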
>
> We can also try to avoid the lock and drill into what the issue is.
> At a quick look it seems like there is a barrier missing between setup
> of the descriptors and kicking the transfer off:
>
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> index d6fc3f7acdf0..9e244b73a0ca 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> @@ -878,10 +878,11 @@ axienet_start_xmit(struct sk_buff *skb, struct
> net_device *ndev)
> cur_p->skb = skb;
>
> tail_p = lp->tx_bd_p + sizeof(*lp->tx_bd_v) * lp->tx_bd_tail;
> - /* Start the transfer */
> - axienet_dma_out_addr(lp, XAXIDMA_TX_TDESC_OFFSET, tail_p);
> if (++lp->tx_bd_tail >= lp->tx_bd_num)
> lp->tx_bd_tail = 0;
> + wmb(); // possibly dma_wmb()
I think the MMIO write in axienet_dma_out_addr is meant to act as an implicit
write barrier, so that explicit wmb() shouldn't be needed? (See the sketch
after the quoted diff below.)
> + /* Start the transfer */
> + axienet_dma_out_addr(lp, XAXIDMA_TX_TDESC_OFFSET, tail_p);
>
> /* Stop queue if next transmit may not have space */
> if (axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1)) {
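
For reference, a sketch of the ordering my comment above relies on, assuming
axienet_dma_out_addr() ends up in a non-relaxed writel()/iowrite32()-style
accessor (names and layout follow the quoted diff; this is not a proposed
patch):

	cur_p->skb = skb;	/* descriptor setup: normal memory writes */

	tail_p = lp->tx_bd_p + sizeof(*lp->tx_bd_v) * lp->tx_bd_tail;
	if (++lp->tx_bd_tail >= lp->tx_bd_num)
		lp->tx_bd_tail = 0;

	/*
	 * Non-relaxed MMIO accessors such as writel()/iowrite32() order prior
	 * normal-memory stores before the device write, so an explicit
	 * wmb()/dma_wmb() before this doorbell would be redundant; a
	 * _relaxed() accessor would not give that guarantee.
	 */
	axienet_dma_out_addr(lp, XAXIDMA_TX_TDESC_OFFSET, tail_p);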
--
Robert Hancock
Senior Hardware Designer, Calian Advanced Technologies
www.calian.com