linux-arm-kernel.lists.infradead.org archive mirror
From: Catalin Marinas <catalin.marinas@arm.com>
To: David Clear <dclear@amd.com>
Cc: ardb@kernel.org, linux-arm-kernel@lists.infradead.org,
	mark.rutland@arm.com, maz@kernel.org, will@kernel.org
Subject: Re: ARM64: Question: How to map non-shareable memory
Date: Thu, 25 May 2023 09:30:27 +0100
Message-ID: <ZG8co3v/TYllfHAv@arm.com>
In-Reply-To: <20230525003359.3690-1-dclear@amd.com>

Hi David,

On Wed, May 24, 2023 at 05:33:59PM -0700, David Clear wrote:
> On Wed, 24 May 2023 at 23:59, Ard Biesheuvel <ardb@kernel.org> wrote:
> > Non-shareable cacheable mappings are problematic because they are not
> > covered by the hardware coherency protocol that keeps caches
> > synchronized between CPUs and cluster-level and system-level caches.
> > (IOW, accesses to non-shareable mappings will have snooping disabled).
> >
> > This means that, unless your system only has a single CPU and does not
> > support cache coherent DMA at all, the cached view of those RAM
> > regions will go out of sync between CPUs and wrt other coherent
> > masters, which is probably not what you're after.
> 
> Hi Ard. Thanks for the quick reply.
> 
> I understand your concerns. The general Linux memory within the
> (multi-cluster) system is fully coherent, and there are no surprises
> w.r.t normal SMP system operation and device DMA.
> 
> The non-coherent memories are outside of the general Linux pool, owned
> by autonomous hardware units, and are used for product-specific purposes.
> These memories are either internal to the units (far away from coherence
> machinery) or purposefully avoid the system coherency controllers so as
> to not incur the latency tax in back-to-back dependent transactions. In
> this product it would be a significant performance burden to maintain
> coherence with ARM caches that are essentially nothing to do with these
> units' operations.

Are these memories bus masters themselves? I doubt it. My guess is that
such memory is also accessed by a device that cannot maintain coherency
with the CPU caches. So IIUC you want a cached mapping from the CPU side
for performance reasons but treat it as non-coherent from a DMA
perspective. For some hardware reason, shareable cacheable transactions
to such memory trigger SErrors. Do you know why this is the case? I ask
because any other non-cacheable transactions are considered shareable
anyway. Or is it that outer shareable is fine but inner shareable is
not? The Arm CPUs don't really distinguish between these AFAIK.

> For the userspace software that needs to access this memory, the current
> non-cached mapping is obtained via a device driver and the goal is
> to minimize the number of discrete memory transactions by supporting
> cached burst-reads and burst-writes, bracketed with appropriate cache
> maintenance ops. There are already private caches within the hardware
> pipelines that software needs to explicitly flush or invalidate,
> so this is just one more thing.

I agree with Ard, such a mapping won't work. When you mark it as
non-shareable, it tells the CPU that the cache lines for that mapping
are not shared with other CPUs, so they don't participate in the cache
coherency protocols. Any cache maintenance to PoC is also limited to
that CPU. See "Effects of instructions that operate by VA to the PoC" in
the latest Arm ARM (page D7-5784).

So let's say that your user process starts reading from such a mapping
(potentially speculatively) after doing some DC IVAC first (this has to
be done in the kernel). The process is then migrated by the kernel to
another CPU which has stale cache lines for that range because the DC
IVAC only affected the first CPU. Similarly with writes, you can't
guarantee that the write and the DC CVAC happen on the same CPU. I also
have no idea how some "transparent" system caches behave here, whether
they do anything on the DC instructions and how shareability changes
their behaviour.

Your best bet is Normal Non-cacheable here. On newer architecture
versions Arm introduced ST64B/LD64B for similar performance reasons
(FEAT_LS64 in Armv8.7) but I don't think there's hardware yet.

-- 
Catalin



Thread overview: 5+ messages
2023-05-24 21:07 ARM64: Question: How to map non-shareable memory David Clear
2023-05-24 21:59 ` Ard Biesheuvel
2023-05-25  0:33   ` David Clear
2023-05-25  8:30     ` Catalin Marinas [this message]
2023-05-25 23:47       ` David Clear
