public inbox for linux-arm-kernel@lists.infradead.org
From: arnd@arndb.de (Arnd Bergmann)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 3/3] net: hisilicon: new hip04 ethernet driver
Date: Wed, 02 Apr 2014 17:24:24 +0200
Message-ID: <4242558.6NaQec4f7j@wuerfel>
In-Reply-To: <533BDDBA.1060800@linaro.org>

On Wednesday 02 April 2014 17:51:54 zhangfei wrote:
> Dear Arnd
> 
> On 04/02/2014 05:21 PM, Arnd Bergmann wrote:
> > On Tuesday 01 April 2014 21:27:12 Zhangfei Gao wrote:
> >> +static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> >
> > While it looks like there are no serious functionality bugs left, this
> > function is rather inefficient, as has been pointed out before:
> 
> Yes, still need more performance tuning in the next step.
> We need to enable the hardware feature of cache flush, under help of 
> arm-smmu, as a result dma_map_single etc can be removed.

You cannot remove the dma_map_single call here, but the implementation
of that function will be different when you use the iommu_coherent_ops:
Instead of flushing the caches, it will create or remove an iommu entry
and return the bus address.

I remember you mentioned before that using the iommu on this particular
SoC actually gives you cache-coherent DMA, so you may also be able
to use arm_coherent_dma_ops if you can set up a static 1:1 mapping 
between bus and phys addresses.
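
A toy userspace model of the point above, to show that the driver-side
call stays the same while the operations behind it change. Everything
named toy_* is hypothetical and is not the kernel's DMA API; the page
size and the IOVA allocator are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size */

typedef uint64_t toy_bus_addr_t;

struct toy_dev;

/* Per-device mapping operations, selected once at setup time. */
struct toy_dma_ops {
	toy_bus_addr_t (*map_single)(struct toy_dev *dev, uint64_t phys,
				     size_t len);
};

struct toy_dev {
	const struct toy_dma_ops *ops;
	unsigned int cache_flushes;	/* counts simulated cache maintenance */
	unsigned int iommu_entries;	/* counts simulated IOMMU mappings */
	toy_bus_addr_t next_iova;	/* trivial bump allocator for bus addresses */
};

/* Non-coherent case: flush the caches, bus address equals phys. */
static toy_bus_addr_t toy_noncoherent_map(struct toy_dev *dev, uint64_t phys,
					  size_t len)
{
	(void)len;
	dev->cache_flushes++;
	return phys;
}

/* Coherent IOMMU case: no flush, create an entry, return a bus address. */
static toy_bus_addr_t toy_iommu_coherent_map(struct toy_dev *dev,
					     uint64_t phys, size_t len)
{
	uint64_t offset = phys & (TOY_PAGE_SIZE - 1);
	uint64_t span = (offset + len + TOY_PAGE_SIZE - 1) &
			~(uint64_t)(TOY_PAGE_SIZE - 1);
	toy_bus_addr_t iova = dev->next_iova;

	dev->iommu_entries++;
	dev->next_iova += span;
	return iova + offset;
}

/* Coherent case with a static 1:1 mapping: nothing to do per buffer. */
static toy_bus_addr_t toy_coherent_map(struct toy_dev *dev, uint64_t phys,
				       size_t len)
{
	(void)dev;
	(void)len;
	return phys;
}

static const struct toy_dma_ops toy_noncoherent_ops = { toy_noncoherent_map };
static const struct toy_dma_ops toy_iommu_coherent_ops = { toy_iommu_coherent_map };
static const struct toy_dma_ops toy_coherent_ops = { toy_coherent_map };

/* What the driver calls; it never needs to know which ops are active. */
static toy_bus_addr_t toy_map_single(struct toy_dev *dev, uint64_t phys,
				     size_t len)
{
	return dev->ops->map_single(dev, phys, len);
}
```

In this model, switching the ops pointer changes what the mapping call
costs, while the transmit path that calls toy_map_single() is unchanged.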

> >> +{
> >> +       struct hip04_priv *priv = netdev_priv(ndev);
> >> +       struct net_device_stats *stats = &ndev->stats;
> >> +       unsigned int tx_head = priv->tx_head;
> >> +       struct tx_desc *desc = &priv->tx_desc[tx_head];
> >> +       dma_addr_t phys;
> >> +
> >> +       hip04_tx_reclaim(ndev, false);
> >> +       mod_timer(&priv->txtimer, jiffies + RECLAIM_PERIOD);
> >> +
> >> +       if (priv->tx_count >= TX_DESC_NUM) {
> >> +               netif_stop_queue(ndev);
> >> +               return NETDEV_TX_BUSY;
> >> +       }
> >
> > This is where you have two problems:
> >
> > - if the descriptor ring is full, you wait for RECLAIM_PERIOD,
> >    which is far too long at 500ms, because during that time you
> >    are not able to add further data to the stopped queue.
> 
> Understand
> The idea here is not using the timer as much as possible.
> As experiment shows, only xmit reclaim buffers, the best throughput can 
> be achieved.

I'm only talking about the case where that doesn't work: once you stop
the queue, the xmit function won't get called again until the timer
causes the reclaim to be done and restarts the queue.
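
A minimal userspace model of that stall, as a sketch only. The ring
size of 64 comes from this thread; the names ring_xmit and ring_reclaim
are hypothetical, and the assumption that the reclaim step is what
wakes the queue is mine, not taken from the patch.

```c
#include <stdbool.h>

#define TX_DESC_NUM 64	/* ring size discussed in this thread */

enum xmit_result { XMIT_OK, XMIT_BUSY };

struct tx_ring {
	unsigned int tx_count;	/* descriptors handed to hardware, not yet reclaimed */
	bool stopped;		/* models a stopped netdev queue */
};

/*
 * Release 'done' completed descriptors and restart the queue if there
 * is room again. Once the queue is stopped, only the timer reaches this.
 */
static void ring_reclaim(struct tx_ring *r, unsigned int done)
{
	if (done > r->tx_count)
		done = r->tx_count;
	r->tx_count -= done;
	if (r->stopped && r->tx_count < TX_DESC_NUM)
		r->stopped = false;
}

/*
 * Models the transmit path: reclaim first, then refuse the frame and
 * stop the queue when the ring is still full.
 */
static enum xmit_result ring_xmit(struct tx_ring *r, unsigned int done_by_hw)
{
	ring_reclaim(r, done_by_hw);
	if (r->tx_count >= TX_DESC_NUM) {
		r->stopped = true;
		return XMIT_BUSY;
	}
	r->tx_count++;
	return XMIT_OK;
}
```

After ring_xmit() returns XMIT_BUSY the stack stops calling it, so in
this model nothing progresses until something else calls ring_reclaim().
With a 500ms timer as the only such caller, that is the length of the
stall.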

> > - As David Laight pointed out earlier, you must also ensure that
> >    you don't have too much /data/ pending in the descriptor ring
> >    when you stop the queue. For a 10mbit connection, you have already
> >    tested (as we discussed on IRC) that 64 descriptors with 1500 byte
> >    frames gives you a 68ms round-trip ping time, which is too much.
> 
> When iperf & ping running together and only ping, it is 0.7 ms.
> 
> >    Conversely, on 1gbit, having only 64 descriptors actually seems
> >    a little low, and you may be able to get better throughput if
> >    you extend the ring to e.g. 512 descriptors.
> 
> OK, Will check throughput of upgrade xmit descriptors.
> But is it said not using too much descriptors for xmit since no xmit 
> interrupt?

The important part is to limit the time that data spends in the queue,
which is a function of the interface tx speed and the number of bytes
in the queue.
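
As a rough worked example of that relationship, here is a small sketch
that computes how long a full ring takes to drain and how many bytes
fit in a given latency budget. The helper names are hypothetical, and
the calculation ignores preamble, inter-frame gap and other per-frame
overhead, so it will not match measured ping times exactly.

```c
#include <stdint.h>

/*
 * Time in microseconds to transmit a full ring of 'ndesc' frames of
 * 'frame_bytes' each on a link of 'link_mbps' Mbit/s.
 * One Mbit/s is one bit per microsecond, so bits / mbps gives usec.
 */
static uint64_t queue_drain_us(uint32_t ndesc, uint32_t frame_bytes,
			       uint32_t link_mbps)
{
	return (uint64_t)ndesc * frame_bytes * 8 / link_mbps;
}

/*
 * Largest number of queued bytes that can still be sent within
 * 'budget_us' microseconds on a link of 'link_mbps' Mbit/s.
 */
static uint64_t max_queue_bytes(uint32_t link_mbps, uint32_t budget_us)
{
	return (uint64_t)link_mbps * budget_us / 8;
}
```

With these helpers, 64 descriptors of 1500 bytes take 76800 usec to
drain at 10 Mbit/s but only 768 usec at 1000 Mbit/s, and 512 such
descriptors at 1000 Mbit/s take 6144 usec. The same ring size therefore
means very different queueing delays depending on link speed, which is
why a byte-based limit is more useful than a descriptor count.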

> >> +       phys = dma_map_single(&ndev->dev, skb->data, skb->len, DMA_TO_DEVICE);
> >> +       if (dma_mapping_error(&ndev->dev, phys)) {
> >> +               dev_kfree_skb(skb);
> >> +               return NETDEV_TX_OK;
> >> +       }
> >> +
> >> +       priv->tx_skb[tx_head] = skb;
> >> +       priv->tx_phys[tx_head] = phys;
> >> +       desc->send_addr = cpu_to_be32(phys);
> >> +       desc->send_size = cpu_to_be16(skb->len);
> >> +       desc->cfg = cpu_to_be32(DESC_DEF_CFG);
> >> +       phys = priv->tx_desc_dma + tx_head * sizeof(struct tx_desc);
> >> +       desc->wb_addr = cpu_to_be32(phys);
> >
> > One detail: since you don't have cache-coherent DMA, "desc" will
> > reside in uncached memory, so you try to minimize the number of accesses.
> > It's probably faster if you build the descriptor on the stack and
> > then atomically copy it over, rather than assigning each member at
> > a time.
> 
> I am sorry, not quite understand, could you clarify more?
> The phys and size etc of skb->data is changing, so need to assign.
> If member contents keep constant, it can be set when initializing.

I meant you should use 64-bit accesses here instead of multiple 32 and
16 bit accesses, but as David noted, this matters much less for the
writes than it does for the reads from uncached memory.

The important part is to avoid the line where you do 'if (desc->send_addr
!= 0)' as much as possible.
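
To illustrate the write side, a userspace sketch that builds the
descriptor in a local variable and then copies it to the ring slot with
two 64-bit stores. The 16-byte layout below is a guess made for this
example (the real hip04 descriptor may differ), and to_be16, to_be32
and tx_desc_fill are hypothetical helpers standing in for the kernel's
byte-order macros.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte descriptor layout, all fields big-endian. */
struct tx_desc {
	uint32_t send_addr;	/* bus address of the frame */
	uint16_t send_size;	/* frame length in bytes */
	uint16_t reserved;
	uint32_t cfg;		/* configuration word */
	uint32_t wb_addr;	/* write-back address */
};

_Static_assert(sizeof(struct tx_desc) == 16, "descriptor must be two 64-bit words");

/* Store v in big-endian byte order regardless of host endianness. */
static uint32_t to_be32(uint32_t v)
{
	uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
			 (uint8_t)(v >> 8), (uint8_t)v };
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

static uint16_t to_be16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)v };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/*
 * Assemble the descriptor in cached (stack) memory, then write it to
 * the ring slot with two 64-bit stores and no reads from the slot.
 */
static void tx_desc_fill(volatile uint64_t *slot, uint32_t frame_addr,
			 uint16_t len, uint32_t cfg, uint32_t wb_addr)
{
	struct tx_desc local = {
		.send_addr = to_be32(frame_addr),
		.send_size = to_be16(len),
		.cfg = to_be32(cfg),
		.wb_addr = to_be32(wb_addr),
	};
	uint64_t words[2];

	memcpy(words, &local, sizeof(words));
	slot[0] = words[0];
	slot[1] = words[1];
}
```

The function never reads back from the slot. Under the same reasoning,
the reclaim path would ideally track completion without reading fields
of the uncached descriptor more often than necessary.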

	Arnd
