netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: David Miller <davem@davemloft.net>
Cc: David.Laight@ACULAB.COM, netdev@vger.kernel.org,
	thomas.petazzoni@free-electrons.com,
	gregory.clement@free-electrons.com
Subject: Re: [PATCH 11/13] net: mvneta: implement rx_copybreak
Date: Thu, 16 Jan 2014 21:07:57 +0100	[thread overview]
Message-ID: <20140116200757.GA19969@1wt.eu> (raw)
In-Reply-To: <20140116.114905.1582218754259744747.davem@davemloft.net>

Hi David,

On Thu, Jan 16, 2014 at 11:49:05AM -0800, David Miller wrote:
> From: Willy Tarreau <w@1wt.eu>
> Date: Thu, 16 Jan 2014 10:36:40 +0100
> 
> > On Thu, Jan 16, 2014 at 09:14:00AM +0000, David Laight wrote:
> >> From: Of Willy Tarreau
> >> > calling dma_map_single()/dma_unmap_single() is quite expensive compared
> >> > to copying a small packet. So let's copy short frames and keep the buffers
> >> > mapped. We set the limit to 256 bytes which seems to give good results both
> >> > on the XP-GP board and on the AX3/4.
> >> 
> >> Have you tried something similar for transmits?
> >> Copying short tx into a preallocated memory area that is dma_mapped
> >> at driver initialisation?
> > 
> > Not yet by lack of time, but I intend to do so. Both the NIC and the driver
> > are capable of running at line rate (1.488 Mpps) when filling descriptors
> > directly, so many improvements are still possible.
> 
> I don't know if this is relevant for your hardware, but there is a trick
> with IOMMUs that I at least at one point was doing on sparc64.
> 
> I never flushed the IOMMU mappings explicitly on a dma_unmap call.
> 
> Instead I always allocate by moving the allocation cursor to higher
> addresses in the IOMMU range looking for free space.
> 
> Then when I hit the end of the range and had to wrap the allocation
> cursor back to the beginning, I flush the entire IOMMU.
> 
> So for %99 of IOMMU mapping creation and teardown calls, I didn't even
> touch the hardware.

Ah that's an interesting trick! We don't have an IOMMU on this platform
so the call is cheap. However, it relies on an I/O barrier to wait for
a cache snoop completion. From what I read, it's not needed when going
to the device. But when going to the CPU for the Rx case, we're calling
it for every packet which is counter productive because I'd like to do
it only once when entering the rx loop and avoid any other call during
the loop. I measured that I could save 300 ns per packet by doing this
(thus slightly modifying the DMA API to add a new dma_io_barrier function).

I think it could be interesting (at least for this platform, I don't know
if other platforms work with cache coherent DMA which only require to
verify end of snooping), to have two distinct set of DMA calls, something
like this :

    rx_loop(napi, budget)
    {
         dma_wait_for_snoop_completion(dev);
         ...
         for_each_rx_packet(pkt) {
             dma_sync_single_range_for_cpu_unless_completed(dev, pkt->addr);
             /* use packet */
         }
    }

The dma_wait_for_snoop_completion() call would call this I/O barrier on
platforms which provide one, and dma_sync_single_range_unless_completed()
would do the equivalent call only when the I/O barrier is not provided.
However I don't like this principle because it requires adding many new
entries to the struct dma_ops while we could avoid this by just adding
the first entry and changing all drivers at once to add the first call
before the loop. I'm just having a hard time believing that it's possible
anymore to change all drivers at once, especially when there are out of
tree drivers... So such a change would probably require to add a number
of new calls to the DMA API. I really don't know what's the best solution
would be in fact :-/

Thanks,
Willy

  reply	other threads:[~2014-01-16 20:08 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-16  7:20 [PATCH 00/13] Assorted mvneta fixes and improvements Willy Tarreau
2014-01-16  7:20 ` [PATCH 01/13] net: mvneta: increase the 64-bit rx/tx stats out of the hot path Willy Tarreau
2014-01-16  7:20 ` [PATCH 02/13] net: mvneta: use per_cpu stats to fix an SMP lock up Willy Tarreau
2014-01-16  7:20 ` [PATCH 03/13] net: mvneta: do not schedule in mvneta_tx_timeout Willy Tarreau
2014-01-16  7:20 ` [PATCH 04/13] net: mvneta: add missing bit descriptions for interrupt masks and causes Willy Tarreau
2014-01-16  7:20 ` [PATCH 05/13] net: mvneta: replace Tx timer with a real interrupt Willy Tarreau
2014-01-16  7:20 ` [PATCH 06/13] net: mvneta: remove tests for impossible cases in the tx_done path Willy Tarreau
2014-01-16  7:20 ` [PATCH 07/13] net: mvneta: factor rx refilling code Willy Tarreau
2014-01-16  7:20 ` [PATCH 08/13] net: mvneta: simplify access to the rx descriptor status Willy Tarreau
2014-01-16  7:20 ` [PATCH 09/13] net: mvneta: prefetch next rx descriptor instead of current one Willy Tarreau
2014-01-16  7:20 ` [PATCH 10/13] net: mvneta: convert to build_skb() Willy Tarreau
2014-01-16  7:20 ` [PATCH 11/13] net: mvneta: implement rx_copybreak Willy Tarreau
2014-01-16  9:14   ` David Laight
2014-01-16  9:36     ` Willy Tarreau
2014-01-16 19:49       ` David Miller
2014-01-16 20:07         ` Willy Tarreau [this message]
2014-01-16 20:11           ` David Miller
2014-01-17  9:28           ` David Laight
2014-01-17  9:48             ` Willy Tarreau
2014-01-17  9:32         ` David Laight
2014-01-16  7:20 ` [PATCH 12/13] net: mvneta: mvneta_tx_done_gbe() cleanups Willy Tarreau
2014-01-16  7:20 ` [PATCH 13/13] net: mvneta: make mvneta_txq_done() return void Willy Tarreau
2014-01-16  7:22 ` [PATCH net-next 00/13] Assorted mvneta fixes and improvements Willy Tarreau
2014-01-16 23:21 ` [PATCH " David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140116200757.GA19969@1wt.eu \
    --to=w@1wt.eu \
    --cc=David.Laight@ACULAB.COM \
    --cc=davem@davemloft.net \
    --cc=gregory.clement@free-electrons.com \
    --cc=netdev@vger.kernel.org \
    --cc=thomas.petazzoni@free-electrons.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).