[PATCH v2] ARM: l2x0: make background cache ops optional for clean and flush range

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

From: robherring2@gmail.com (Rob Herring)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] ARM: l2x0: make background cache ops optional for clean and flush range
Date: Mon, 17 Sep 2012 21:43:55 -0500	[thread overview]
Message-ID: <5057DFEB.40101@gmail.com> (raw)
In-Reply-To: <20120917214732.GR12245@n2100.arm.linux.org.uk>

On 09/17/2012 04:47 PM, Russell King - ARM Linux wrote:
> On Mon, Sep 17, 2012 at 03:29:14PM -0500, Rob Herring wrote:
>> On 09/17/2012 02:47 PM, Russell King - ARM Linux wrote:
>>> (a) you're not describing hardware, you're describing a feature of the
>>>     Linux kernel's implementation - no, you're describing a configuration
>>>     feature of the kernel.  This should not be in DT.
>>> (b) you're disabling the optimization to avoid doing a lengthy line-by-line
>>>     cache operation when the size is larger than the cache size, and trading
>>>     it for possibly slightly quicker accesses.
>>>
>>> The problem I have with this is it's likely that you've only looked at
>>> "oh, this makes IO accesses faster" and not the total picture wrt the
>>> effect on time taken by the DMA API.
>>
>> I could easily "fix" this in the xgmac driver by using __raw_writel or
>> writel_relaxed, but that would break non-coherent operation which does
>> need the L2 cache barriers.
> 
> Note that in many cases, drivers do not need a barrier before every write
> or after every read operation.
> 
> In terms of L2 coherency, the only time that L2 actually needs to be
> touched is when your MMIO access provokes the device to read from memory.
> Most accesses do not.

Agreed. The current approach is a bit of a sledge hammer for relatively
few places that need it.

> The solution - using the relaxed MMIO accessors, not fiddling around
> with the L2 cache code.

I have done that in critical paths as well. As I explained in my
response to Catalin, too many barriers is not my problem. The network
transmit path is a single writel and one explicit wmb already. Using
relaxed accessors will only make both wmb's explicit. I still need them
at least for the L1 in the coherent case. Even if I'm coherent, I still
have to ensure the order the DMA descriptors are written and the order
of DMA descriptor writes with the DMA poll register write. But there are
not any universal L1 only barriers.

The sequence is like this:
Write out N DMA descriptors
barrier
Write 1st DMA descriptor's ready bit
barrier
Write DMA poll register

For non-coherent, the barrier needs to be a dsb and L2 sync. For
coherent, this only really needs to be a dsb. Removal of the spinlock is
enough to make the L2 sync a don't care.

> However, there's an in-built problem here - which I'm told of by PowerPC
> folk.  Apparantly, Linus refuses to accept that anyone needs relaxed MMIO
> accessors and essentially says "all platforms must behave like x86 does".
> I don't actually know Linus' real position, or what his position would be
> if he new that without relaxed accessors, we have to do a very time
> consuming operation...  It might be worth broaching the subject of true
> relaxed MMIO accessors again.
> 
> The only issue with that approach is getting people who don't have the
> problem (iow, x86 driver authors) to understand this issue - and that's
> where I absolutely agree with Linus' apparant position...

Also, how much of this knowledge should be in the driver? While I really
don't care if the xgmac driver works in non-coherent mode, that's
clearly not going to be acceptable. Differences between coherent or
non-coherent need to stay abstracted out of the drivers.

> A better solution than relaxed IO accessors maybe a barrier, but don't
> call it that.  Maybe call it "pre_dma();" and "post_dma();" - that way
> people won't get bogged down in barrier terminology or whatever.  They
> just call "pre_dma();" before they write to a register which provokes
> DMA, and "post_dma();" before they read from a register to get DMA
> status.  What platforms do for each of those calls is up to them, and
> whether platforms use relaxed MMIO accessors becomes their choice too.

If pre_dma() still calls l2x0 cache sync which takes a spinlock, then
I'm back to the same problem.

These would need to be per device as well. Perhaps it could be part of
the DMA mapping API ops.

Rob

> Another but - we have a hell of a lot of drivers... and existing
> accessors must work as per x86...
> 
>> It's not without precedence, but you are right that it is a questionable
>> use of DT. I'm open to suggestions of how you would expose this
>> configuration to platforms. We obviously don't want a compile time
>> option. Some options:
>>
>> - Add a flags param to l2x0_of_init to configure it
>> - Add non-background version of functions and override the function ptrs
>> in platform code.
>> - Add a new call to configure l2 settings like this.
>> - Expose the use_background_ops variable directly
> 
> - command line setting which is available to everyone whether they're using
> DT or not.
>

next prev parent reply	other threads:[~2012-09-18  2:43 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-10 19:45 [RFC PATCH] ARM: l2x0: avoid spinlock for sync op on pl310 Rob Herring
2012-09-10 23:13 ` Russell King - ARM Linux
2012-09-17 13:59 ` [PATCH v2] ARM: l2x0: make background cache ops optional for clean and flush range Rob Herring
2012-09-17 19:47   ` Russell King - ARM Linux
2012-09-17 20:29     ` Rob Herring
2012-09-17 20:45       ` Catalin Marinas
2012-09-17 21:36         ` Rob Herring
2012-09-17 21:47       ` Russell King - ARM Linux
2012-09-18  2:43         ` Rob Herring [this message]
2012-09-18  8:21           ` Catalin Marinas
2012-09-18 12:00             ` Rob Herring
2012-09-17 22:31       ` Nicolas Pitre
2012-09-18 12:50         ` Rob Herring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5057DFEB.40101@gmail.com \
    --to=robherring2@gmail.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).