From: mark.rutland@arm.com (Mark Rutland)
To: linux-arm-kernel@lists.infradead.org
Subject: v7_dma_inv_range performance/high expense
Date: Fri, 27 May 2016 17:14:33 +0100
Message-ID: <20160527161432.GI24469@leverpostej>
In-Reply-To: <20160527144045.GB20214@lunn.ch>

On Fri, May 27, 2016 at 04:40:45PM +0200, Andrew Lunn wrote:
> Hi folks
> 
> I have an imx6q, which is a quad core v7 processor. Attached to it via
> pcie i have an intel i210 Ethernet controller.
> 
> When the ethernet is transmitting, i can get gigabit line rate, and
> use one core to about 35% of one core. When receiving, i get around
> 700Mbps and ksoftirqd/0 is 98% loading a core.
> 
> Using perf to profile the ksoftirqd/0 pid I see:
> 
>   46.38%  [kernel]  [k] v7_dma_inv_range
>   21.25%  [kernel]  [k] l2c210_inv_range
>   10.90%  [kernel]  [k] igb_poll
>    1.69%  [kernel]  [k] dma_cache_maint_page
>    1.27%  [kernel]  [k] eth_type_trans
>    1.20%  [kernel]  [k] skb_add_rx_frag
> 
> Digging deeper into v7_dma_inv_range i see:
> 
>            801182c0 <v7_dma_inv_range>:
>            v7_dma_inv_range():
>   0.26       mrc    15, 0, r3, cr0, cr0, {1}
>   0.07       lsr    r3, r3, #16
>              and    r3, r3, #15
>   0.04       mov    r2, #4
>              lsl    r2, r2, r3
>   0.04       sub    r3, r2, #1
>              tst    r0, r3
>   0.02       bic    r0, r0, r3
>   0.03       dsb    sy
>   3.01       mcrne  15, 0, r0, cr7, cr14, {1}
>   0.54       tst    r1, r3
>              bic    r1, r1, r3
>   0.08       mcrne  15, 0, r1, cr7, cr14, {1}
>   3.82 34:   mcr    15, 0, r0, cr7, cr6, {1}
>  88.32       add    r0, r0, r2
>              cmp    r0, r1
>   1.97       bcc    34
>   0.43       dsb    st
>   1.37       bx     lr
> 
> I'm assuming perf is off by one here, and the add is not taking 88.32%
> of the load, rather it is the mcr instruction before it.

The address perf reports is the PC at the moment the PMU overflow
interrupt was architecturally taken by the core. Reporting anything else
would require us to make up bogus PC values (e.g. if a branch was just
taken, you can't reconstruct the previous PC).

If the PMU overflow interrupt comes in (asynchronously) while an
expensive instruction is in progress, the CPU will likely have to wait
for that to complete before it can handle the interrupt.

So yes, the MCR is very likely to be the expensive instruction here.
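
To make the attribution effect concrete, here is a toy model in C of the
four-instruction loop from the annotation quoted above. It is only a
sketch under stated assumptions: the cycle costs are made up for
illustration (they are not measurements), and the rule "an interrupt that
arrives during instruction i is reported at the PC of the instruction
after i" is a simplification of real hardware behaviour. The names
attribute_samples, loop_cost and print_attribution are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum { LOOP_LEN = 4 };

/* The loop body from the quoted annotation; bcc branches back to mcr. */
static const char *const loop_insn[LOOP_LEN] = { "mcr", "add", "cmp", "bcc" };

/* Hypothetical cycle costs, chosen only to make the MCR dominant. */
static const unsigned loop_cost[LOOP_LEN] = { 90, 1, 1, 2 };

/*
 * Toy attribution rule (an assumption, not the PMU's real behaviour):
 * cycles spent executing instruction i are charged to its successor,
 * because the interrupt is only taken once instruction i has completed.
 * The successor of the last instruction is the loop head.
 */
void attribute_samples(const unsigned *cost, unsigned *charged, size_t n)
{
	assert(n > 0);

	for (size_t i = 0; i < n; i++)
		charged[i] = 0;

	for (size_t i = 0; i < n; i++)
		charged[(i + 1) % n] += cost[i];
}

/* Print the real cost next to the cost a profile would show. */
void print_attribution(void)
{
	unsigned charged[LOOP_LEN];

	attribute_samples(loop_cost, charged, LOOP_LEN);
	for (size_t i = 0; i < LOOP_LEN; i++)
		printf("%-4s spent %3u, charged %3u\n",
		       loop_insn[i], loop_cost[i], charged[i]);
}
```

In this model the add is charged with the 90 cycles the MCR actually
consumed, and the MCR is only charged with the cost of the branch that
precedes it in loop order.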

> The original code in arch/arm/mm/cache-v7.S  says:
> 
>         mcr     p15, 0, r0, c7, c6, 1           @ invalidate D / U line
> 
> I don't get why a cache invalidate instruction should be so expensive.
> It is just throwing away the contents of the cache line, not flushing
> it out to DRAM. 

This really depends on the microarchitecture and integration.

The cache maintenance operations likely have to synchronise with some
logic in other cores to safely invalidate all copies of the line. There
may also be some limit on the number of outstanding operations, etc.
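
Whatever each operation costs, the routine issues one per cache line, so
the count adds up quickly. The sketch below mirrors the arithmetic of the
quoted disassembly in plain C: it derives the line size from a Cache Type
Register value (4 << CTR[19:16], as the mrc/lsr/and/mov/lsl sequence
does) and counts the operations issued for a range. It only counts; it
performs no cache maintenance. The function and struct names are
hypothetical, and the CTR value and 2048-byte buffer in the usage note
are assumed example inputs, not values read from the i.MX6Q.

```c
#include <assert.h>
#include <stdint.h>

/* Minimum D-cache line size in bytes, from CTR bits [19:16]. */
uint32_t dcache_line_size(uint32_t ctr)
{
	uint32_t dminline = (ctr >> 16) & 0xf;

	return 4u << dminline;
}

struct inv_ops {
	uint32_t clean_inv;	/* mcrne p15, 0, rX, c7, c14, 1 */
	uint32_t inv;		/* mcr   p15, 0, r0, c7, c6, 1  */
};

/*
 * Count the operations the quoted loop would issue for [start, end).
 * A partial line at either edge gets a clean+invalidate first; every
 * line from the aligned start up to the aligned end then gets an
 * invalidate. Like the assembly, the loop body runs at least once.
 */
struct inv_ops count_inv_ops(uint32_t start, uint32_t end, uint32_t line)
{
	struct inv_ops ops = { 0, 0 };
	uint32_t mask = line - 1;

	assert(line != 0 && (line & mask) == 0);	/* power of two */
	assert(start < end);

	if (start & mask)
		ops.clean_inv++;
	start &= ~mask;

	if (end & mask)
		ops.clean_inv++;
	end &= ~mask;

	do {
		ops.inv++;
		start += line;
	} while (start < end);

	return ops;
}
```

For example, with an assumed CTR value of 3 << 16 the line size comes out
as 32 bytes, and a line-aligned 2048-byte buffer needs 64 invalidates. If
32-byte lines are the case here and every received byte were invalidated
once, 700Mbps (87.5 MB/s) would mean roughly 2.7 million of these
operations per second.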

> Should i trust perf?

I don't see a reason not to. Nothing above implies that perf is
providing erroneous information.

Thanks,
Mark.
