From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Andrea Parri <parri.andrea@gmail.com>
Cc: paulmck@kernel.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org, mmaas@google.com,
	hboehm@google.com, striker@us.ibm.com
Subject: Re: [RFC PATCH] membarrier: riscv: Provide core serializing command
Date: Fri, 4 Aug 2023 14:05:55 -0400
Message-ID: <ab562167-e4a5-4a7d-7722-a4f99848d63e@efficios.com>
In-Reply-To: <ZM0STfpkRSfNQBt8@andrea>

On 8/4/23 10:59, Andrea Parri wrote:
>> What is the relationship between FENCE.I and instruction cache flush on
>> RISC-V ?
> 
> The exact nature of this relationship is implementation-dependent.  From
> commentary included in the ISA portion referred to in the changelog:
> 
>    A simple implementation can flush the local instruction cache and
>    the instruction pipeline when the FENCE.I is executed.  A more
>    complex implementation might snoop the instruction (data) cache on
>    every data (instruction) cache miss, or use an inclusive unified
>    private L2 cache to invalidate lines from the primary instruction
>    cache when they are being written by a local store instruction.  If
>    instruction and data caches are kept coherent in this way, or if
>    the memory system consists of only uncached RAMs, then just the
>    fetch pipeline needs to be flushed at a FENCE.I.  [..]
> 
> Mmh, does this help?

Quoting

https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf

Chapter 3 "“Zifencei” Instruction-Fetch Fence, Version 2.0"

"First, it has been recognized that on some systems, FENCE.I will be expensive to implement
and alternate mechanisms are being discussed in the memory model task group. In particular,
for designs that have an incoherent instruction cache and an incoherent data cache, or where
the instruction cache refill does not snoop a coherent data cache, both caches must be completely
flushed when a FENCE.I instruction is encountered. This problem is exacerbated when there are
multiple levels of I and D cache in front of a unified cache or outer memory system.

Second, the instruction is not powerful enough to make available at user level in a Unix-like
operating system environment. The FENCE.I only synchronizes the local hart, and the OS can
reschedule the user hart to a different physical hart after the FENCE.I. This would require the
OS to execute an additional FENCE.I as part of every context migration. For this reason, the
standard Linux ABI has removed FENCE.I from user-level and now requires a system call to
maintain instruction-fetch coherence, which allows the OS to minimize the number of FENCE.I
executions required on current systems and provides forward-compatibility with future improved
instruction-fetch coherence mechanisms.

Future approaches to instruction-fetch coherence under discussion include providing more
restricted versions of FENCE.I that only target a given address specified in rs1, and/or allowing
software to use an ABI that relies on machine-mode cache-maintenance operations."

I am starting to suspect that even the people working on the RISC-V memory model have
noticed that letting a single instruction such as FENCE.I both take care of cache coherency
*and* flush the instruction pipeline will be a performance bottleneck, because it
can only clear the whole instruction cache.

Other architectures are either cache-coherent, or have cache flushing which can be
performed on a range of addresses. This is kept apart from whatever instruction
flushes the instruction pipeline of the processor.

By keeping instruction cache flushing separate from instruction pipeline flush, we can
let membarrier (and context switches, including thread migration) only care about the
instruction pipeline part, and leave instruction cache flush to either a dedicated
system call, or to specialized instructions which are available from user-mode.
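
For reference, this is roughly how a JIT asks for the instruction pipeline part
from user space today with the existing membarrier sync-core commands. It is a
minimal sketch, assuming the Linux UAPI header <linux/membarrier.h> is available.
The helper names sys_membarrier(), sync_core_supported() and
jit_sync_core_all_threads() are made up for illustration, and any i-cache
maintenance is assumed to have been done separately beforehand.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw system call (illustrative helper name). */
int sys_membarrier(int cmd, unsigned int flags, int cpu_id)
{
	return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/*
 * Returns 1 if the kernel reports support for the private expedited
 * sync-core command, 0 if it does not, and -1 if the membarrier system
 * call itself is unavailable.
 */
int sync_core_supported(void)
{
	int mask = sys_membarrier(MEMBARRIER_CMD_QUERY, 0, 0);

	if (mask < 0)
		return -1;
	return (mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE) ? 1 : 0;
}

/*
 * Hypothetical JIT-side helper: after new code has been written (and made
 * visible to instruction fetch by whatever mechanism the architecture
 * needs), ask the kernel to core-serialize every thread of this process.
 * Registration is required once per process before the command is used.
 * Returns 0 on success, -1 on error with errno set.
 */
int jit_sync_core_all_threads(void)
{
	static int registered;

	if (!registered) {
		if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE,
				   0, 0))
			return -1;
		registered = 1;
	}
	return sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0, 0);
}
```

A caller would check sync_core_supported() first and fall back to some other
scheme when it returns 0 or -1. Note that the command only promises core
serialization of the process' threads; it says nothing about i-cache contents.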

Considering that FENCE.I is forced to invalidate the whole i-cache, I don't think you
will get away with executing it from switch_mm without making performance go down the
drain on cache-incoherent implementations.

In my opinion, what we would need from RISC-V for membarrier (and context switch) is a
lightweight version of FENCE.I which only flushes the instruction pipeline of the local
processor. This should ideally come with a way for architectures with incoherent caches
to flush the relevant address ranges of the i-cache which are modified by a JIT. This
i-cache flush would not be required to flush the instruction pipeline, as it is typical
to batch invalidation of various address ranges together and issue a single instruction
pipeline flush on each CPU at the end. The i-cache flush could either be done by new
instructions available from user-space (similar to aarch64), or through privileged
instructions available through system calls (similar to arm cacheflush).
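
The batching pattern described above can be sketched as follows. This is a
simulation only: icache_invalidate_range() and pipeline_flush_local() are
hypothetical stand-ins that just bump counters, since RISC-V does not provide
such primitives today, and struct jit_batch with its helpers is made up for
illustration. Only the local pipeline flush is modelled; reaching the other
CPUs would be the job of something like membarrier.

```c
#include <assert.h>
#include <stddef.h>

/* Counters standing in for the real hardware operations (simulation). */
static unsigned long icache_range_invalidations;
static unsigned long pipeline_flushes;

/* Hypothetical: invalidate the i-cache lines covering [start, start + len). */
static void icache_invalidate_range(const void *start, size_t len)
{
	(void)start;
	(void)len;
	icache_range_invalidations++;
}

/* Hypothetical: flush only the instruction pipeline of the local CPU. */
static void pipeline_flush_local(void)
{
	pipeline_flushes++;
}

struct code_range {
	const void *start;
	size_t len;
};

#define JIT_MAX_PENDING 16

/* Ranges modified by the JIT that have not been published yet. */
struct jit_batch {
	struct code_range pending[JIT_MAX_PENDING];
	size_t count;
};

/* Record a modified range. Returns 0 on success, -1 if the batch is full. */
int jit_batch_add(struct jit_batch *b, const void *start, size_t len)
{
	if (b->count == JIT_MAX_PENDING)
		return -1;
	b->pending[b->count].start = start;
	b->pending[b->count].len = len;
	b->count++;
	return 0;
}

/*
 * Publish the batch: one i-cache invalidation per recorded range, then a
 * single pipeline flush for the whole batch. An empty batch does nothing.
 */
void jit_batch_commit(struct jit_batch *b)
{
	size_t i;

	if (b->count == 0)
		return;
	for (i = 0; i < b->count; i++)
		icache_invalidate_range(b->pending[i].start, b->pending[i].len);
	pipeline_flush_local();
	b->count = 0;
}
```

The point of the split is visible in the counters: the number of range
invalidations grows with the number of modified ranges, while the pipeline
flush count grows only with the number of commits.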

Thanks,

Mathieu


> 
>    Andrea

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


