From: Arnd Bergmann <arnd@arndb.de>
To: linux-arm-kernel@lists.infradead.org
Cc: zhangfei <zhangfei.gao@linaro.org>,
mark.rutland@arm.com, devicetree@vger.kernel.org,
f.fainelli@gmail.com, linux@arm.linux.org.uk,
eric.dumazet@gmail.com, sergei.shtylyov@cogentembedded.com,
netdev@vger.kernel.org, David.Laight@aculab.com,
davem@davemloft.net
Subject: Re: [PATCH 3/3] net: hisilicon: new hip04 ethernet driver
Date: Wed, 02 Apr 2014 17:24:24 +0200
Message-ID: <4242558.6NaQec4f7j@wuerfel>
In-Reply-To: <533BDDBA.1060800@linaro.org>
On Wednesday 02 April 2014 17:51:54 zhangfei wrote:
> Dear Arnd
>
> On 04/02/2014 05:21 PM, Arnd Bergmann wrote:
> > On Tuesday 01 April 2014 21:27:12 Zhangfei Gao wrote:
> >> +static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> >
> > While it looks like there are no serious functionality bugs left, this
> > function is rather inefficient, as has been pointed out before:
>
> Yes, still need more performance tuning in the next step.
> We need to enable the hardware feature of cache flush, under help of
> arm-smmu, as a result dma_map_single etc can be removed.
You cannot remove the dma_map_single call here, but the implementation
of that function will be different when you use the iommu_coherent_ops:
Instead of flushing the caches, it will create or remove an iommu entry
and return the bus address.

I remember you mentioned before that using the iommu on this particular
SoC actually gives you cache-coherent DMA, so you may also be able
to use arm_coherent_dma_ops if you can set up a static 1:1 mapping
between bus and phys addresses.
> >> +{
> >> + struct hip04_priv *priv = netdev_priv(ndev);
> >> + struct net_device_stats *stats = &ndev->stats;
> >> + unsigned int tx_head = priv->tx_head;
> >> + struct tx_desc *desc = &priv->tx_desc[tx_head];
> >> + dma_addr_t phys;
> >> +
> >> + hip04_tx_reclaim(ndev, false);
> >> + mod_timer(&priv->txtimer, jiffies + RECLAIM_PERIOD);
> >> +
> >> + if (priv->tx_count >= TX_DESC_NUM) {
> >> + netif_stop_queue(ndev);
> >> + return NETDEV_TX_BUSY;
> >> + }
> >
> > This is where you have two problems:
> >
> > - if the descriptor ring is full, you wait for RECLAIM_PERIOD,
> > which is far too long at 500ms, because during that time you
> > are not able to add further data to the stopped queue.
>
> Understand
> The idea here is not using the timer as much as possible.
> As experiment shows, only xmit reclaim buffers, the best throughput can
> be achieved.
I'm only talking about the case where that doesn't work: once you stop
the queue, the xmit function won't get called again until the timer
causes the reclaim to be done and restarts the queue.
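To illustrate the stopped-queue case, here is a minimal userspace model,
not driver code. struct ring_model, model_xmit() and model_reclaim() are
made-up names for this sketch; only TX_DESC_NUM and the stop/restart
behaviour come from the discussion above. Unlike the patch, the model does
not reclaim at the top of xmit, to keep the stalled state easy to see.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_DESC_NUM 64	/* ring size discussed in this thread */

/* Hypothetical model of the tx ring state, not the driver's private struct. */
struct ring_model {
	unsigned int tx_count;	/* descriptors currently owned by hardware */
	bool queue_stopped;	/* stands in for netif_stop_queue()/wake */
};

/* Model of the xmit path: returns false when the frame was not queued. */
static bool model_xmit(struct ring_model *r)
{
	assert(r->tx_count <= TX_DESC_NUM);
	if (r->queue_stopped)
		return false;	/* the stack does not call xmit on a stopped queue */
	if (r->tx_count >= TX_DESC_NUM) {
		r->queue_stopped = true;	/* ring full: stop the queue */
		return false;
	}
	r->tx_count++;
	return true;
}

/* Model of reclaim: frees up to 'done' descriptors and restarts the queue. */
static void model_reclaim(struct ring_model *r, unsigned int done)
{
	if (done > r->tx_count)
		done = r->tx_count;
	r->tx_count -= done;
	if (r->queue_stopped && r->tx_count < TX_DESC_NUM)
		r->queue_stopped = false;
}
```

In this model nothing except model_reclaim() can clear queue_stopped, so
whatever triggers the reclaim (the timer, in the patch) bounds how long a
full ring stays stalled.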
> > - As David Laight pointed out earlier, you must also ensure that
> > you don't have too much /data/ pending in the descriptor ring
> > when you stop the queue. For a 10mbit connection, you have already
> > tested (as we discussed on IRC) that 64 descriptors with 1500 byte
> > frames gives you a 68ms round-trip ping time, which is too much.
>
> When iperf & ping running together and only ping, it is 0.7 ms.
>
> > Conversely, on 1gbit, having only 64 descriptors actually seems
> > a little low, and you may be able to get better throughput if
> > you extend the ring to e.g. 512 descriptors.
>
> OK, Will check throughput of upgrade xmit descriptors.
> But is it said not using too much descriptors for xmit since no xmit
> interrupt?
The important part is to limit the time that data spends in the queue,
which is a function of the interface tx speed and the number of bytes
in the queue.
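As a rough worked example of that relationship, the sketch below computes
how long a full ring takes to drain. The helper names are hypothetical, and
the estimate assumes every descriptor holds a 1500 byte frame and ignores
preamble, inter-frame gap and any other per-frame overhead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time in microseconds to drain 'bytes' at 'link_mbps' Mbit/s.
 * 1 Mbit/s is one bit per microsecond, so this is bits / (bits per us).
 */
static uint64_t queue_drain_us(uint64_t bytes, uint64_t link_mbps)
{
	assert(link_mbps > 0);
	return bytes * 8 / link_mbps;
}

/* Largest byte count that still drains within 'budget_us' microseconds. */
static uint64_t queue_byte_limit(uint64_t budget_us, uint64_t link_mbps)
{
	return budget_us * link_mbps / 8;
}
```

With 64 descriptors of 1500 bytes, queue_drain_us(64 * 1500, 10) gives
76800us, which is in the same range as the 68ms ping time measured on
10mbit, while the same ring at 1gbit drains in 768us. The kernel's byte
queue limits (netdev_sent_queue() and netdev_completed_queue()) are one
existing way to do this kind of byte-based accounting; whether they fit
this driver is not something this sketch settles.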
> >> + phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
> >> + if (dma_mapping_error(&ndev->dev, phys)) {
> >> + dev_kfree_skb(skb);
> >> + return NETDEV_TX_OK;
> >> + }
> >> +
> >> + priv->tx_skb[tx_head] = skb;
> >> + priv->tx_phys[tx_head] = phys;
> >> + desc->send_addr = cpu_to_be32(phys);
> >> + desc->send_size = cpu_to_be16(skb->len);
> >> + desc->cfg = cpu_to_be32(DESC_DEF_CFG);
> >> + phys = priv->tx_desc_dma + tx_head * sizeof(struct tx_desc);
> >> + desc->wb_addr = cpu_to_be32(phys);
> >
> > One detail: since you don't have cache-coherent DMA, "desc" will
> > reside in uncached memory, so you try to minimize the number of accesses.
> > It's probably faster if you build the descriptor on the stack and
> > then atomically copy it over, rather than assigning each member at
> > a time.
>
> I am sorry, not quite understand, could you clarify more?
> The phys and size etc of skb->data is changing, so need to assign.
> If member contents keep constant, it can be set when initializing.
I meant you should use 64-bit accesses here instead of multiple 32 and
16 bit accesses, but as David noted, it's not as big a deal for the
writes as it is for the reads from uncached memory.
The important part is to avoid the line where you do
'if (desc->send_addr != 0)' as much as possible.
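A userspace sketch of the idea, under stated assumptions: the 16-byte
struct tx_desc layout below is hypothetical (the real one is not shown in
this mail), desc_build() and desc_publish() are made-up helper names, and
htonl()/htons() stand in for cpu_to_be32()/cpu_to_be16().

```c
#include <arpa/inet.h>	/* htonl/htons stand in for cpu_to_be32/cpu_to_be16 */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte layout, for illustration only. */
struct tx_desc {
	uint32_t send_addr;
	uint16_t send_size;
	uint16_t reserved;
	uint32_t cfg;
	uint32_t wb_addr;
};

/* Fill a descriptor in ordinary cached memory, e.g. on the stack. */
static struct tx_desc desc_build(uint32_t buf, uint16_t len,
				 uint32_t cfg, uint32_t wb)
{
	struct tx_desc d = {
		.send_addr = htonl(buf),
		.send_size = htons(len),
		.cfg = htonl(cfg),
		.wb_addr = htonl(wb),
	};
	return d;
}

/* Write it to the ring slot with two 64-bit stores instead of four narrower ones. */
static void desc_publish(volatile uint64_t *slot, const struct tx_desc *d)
{
	uint64_t words[2];

	assert(sizeof(*d) == sizeof(words));
	memcpy(words, d, sizeof(words));
	slot[0] = words[0];
	slot[1] = words[1];
}
```

The sketch only covers the write side. For the read side, the point is to
keep enough state in normal memory that the reclaim path rarely has to
look at the descriptor in uncached memory at all.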
Arnd