public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX
@ 2024-08-13 23:02 Charlie Jenkins
  2024-08-13 23:02 ` [PATCH 1/2] riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF Charlie Jenkins
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Charlie Jenkins @ 2024-08-13 23:02 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Atish Patra, Samuel Holland, Andrea Parri
  Cc: Palmer Dabbelt, linux-riscv, linux-kernel, Charlie Jenkins

There are two race conditions possible with
PR_RISCV_SET_ICACHE_FLUSH_CTX. The first one can be seen by enabling
DEBUG_PREEMPT and using this prctl, which will warn with "BUG: using
smp_processor_id() in preemptible". This can be fixed by disabling
preemption during this prctl handling. Another race condition is present
when the mm->context.icache_stale_mask is changed by a thread while a
different thread in the same mm context is between switch_mm() and
switch_to() during a context switch.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
Charlie Jenkins (2):
      riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
      riscv: Eagerly flush in flush_icache_deferred()

 arch/riscv/include/asm/switch_to.h | 19 ++++++++++++++++---
 arch/riscv/mm/cacheflush.c         | 13 +++++++------
 arch/riscv/mm/context.c            |  6 +-----
 3 files changed, 24 insertions(+), 14 deletions(-)
---
base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
change-id: 20240812-fix_fencei_optimization-3f81ac200505
-- 
- Charlie


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
  2024-08-13 23:02 [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX Charlie Jenkins
@ 2024-08-13 23:02 ` Charlie Jenkins
  2024-08-13 23:02 ` [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred() Charlie Jenkins
  2024-09-11 15:30 ` [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX patchwork-bot+linux-riscv
  2 siblings, 0 replies; 8+ messages in thread
From: Charlie Jenkins @ 2024-08-13 23:02 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Atish Patra, Samuel Holland, Andrea Parri
  Cc: Palmer Dabbelt, linux-riscv, linux-kernel, Charlie Jenkins

The icache will be flushed in switch_to() if force_icache_flush is true,
or in flush_icache_deferred() if icache_stale_mask is set. Between
setting force_icache_flush to false and calculating the new
icache_stale_mask, preemption needs to be disabled. There are two
reasons for this:

1. If CPU migration happens between setting force_icache_flush = false
   and setting the icache_stale_mask, an icache flush will not be emitted.
2. smp_processor_id() is used in set_icache_stale_mask() to mark the
   current CPU as not needing another flush since a flush will have
   happened either by userspace or by the kernel when performing the
   migration. smp_processor_id() is currently called twice with preemption
   enabled, which causes a race condition: it allows icache_stale_mask to
   be populated with inconsistent CPU ids.

Resolve these two issues by setting the icache_stale_mask before setting
force_icache_flush to false, and using get_cpu()/put_cpu() to obtain the
smp_processor_id().

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 6b9391b581fd ("riscv: Include riscv_set_icache_flush_ctx prctl")
---
 arch/riscv/mm/cacheflush.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index a03c994eed3b..b81672729887 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -158,6 +158,7 @@ void __init riscv_init_cbo_blocksizes(void)
 #ifdef CONFIG_SMP
 static void set_icache_stale_mask(void)
 {
+	int cpu = get_cpu();
 	cpumask_t *mask;
 	bool stale_cpu;
 
@@ -168,10 +169,11 @@ static void set_icache_stale_mask(void)
 	 * concurrently on different harts.
 	 */
 	mask = &current->mm->context.icache_stale_mask;
-	stale_cpu = cpumask_test_cpu(smp_processor_id(), mask);
+	stale_cpu = cpumask_test_cpu(cpu, mask);
 
 	cpumask_setall(mask);
-	cpumask_assign_cpu(smp_processor_id(), mask, stale_cpu);
+	cpumask_assign_cpu(cpu, mask, stale_cpu);
+	put_cpu();
 }
 #endif
 
@@ -239,14 +241,12 @@ int riscv_set_icache_flush_ctx(unsigned long ctx, unsigned long scope)
 	case PR_RISCV_CTX_SW_FENCEI_OFF:
 		switch (scope) {
 		case PR_RISCV_SCOPE_PER_PROCESS:
-			current->mm->context.force_icache_flush = false;
-
 			set_icache_stale_mask();
+			current->mm->context.force_icache_flush = false;
 			break;
 		case PR_RISCV_SCOPE_PER_THREAD:
-			current->thread.force_icache_flush = false;
-
 			set_icache_stale_mask();
+			current->thread.force_icache_flush = false;
 			break;
 		default:
 			return -EINVAL;

-- 
2.45.0



* [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred()
  2024-08-13 23:02 [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX Charlie Jenkins
  2024-08-13 23:02 ` [PATCH 1/2] riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF Charlie Jenkins
@ 2024-08-13 23:02 ` Charlie Jenkins
  2024-08-15  0:34   ` Andrea Parri
  2024-09-11 15:30 ` [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX patchwork-bot+linux-riscv
  2 siblings, 1 reply; 8+ messages in thread
From: Charlie Jenkins @ 2024-08-13 23:02 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Atish Patra, Samuel Holland, Andrea Parri
  Cc: Palmer Dabbelt, linux-riscv, linux-kernel, Charlie Jenkins

It is possible for mm->context.icache_stale_mask to be set between
switch_mm() and switch_to() when there are two threads on different CPUs
executing in the same mm:

<---- Thread 1 starts running here on CPU1

<---- Thread 2 starts running here with same mm

T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
T2: <-- kernel sets current->mm->context.force_icache_flush to true
T2: <modification of instructions>
T2: fence.i

T1: fence.i (to synchronize with other thread, has some logic to
             determine when to do this)
T1: <-- thread 1 is preempted
T1: <-- thread 1 is placed onto CPU3 and starts context switch sequence
T1 (kernel): flush_icache_deferred() -> skips flush because switch_to_should_flush_icache() returns true
				     -> thread has migrated and task->mm->context.force_icache_flush is true

T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);
T2 (kernel): kernel sets current->mm->context.force_icache_flush = false

T1 (kernel): switch_to() calls switch_to_should_flush_icache() which now
	     returns false because task->mm->context.force_icache_flush
	     is false due to the other thread emitting
	     PR_RISCV_CTX_SW_FENCEI_OFF.
T1 (back in userspace): Instruction cache was never flushed on context
			switch to CPU3, and thus may execute incorrect
			instructions.

This commit fixes this issue by also flushing in switch_to() when
task->mm->context.icache_stale_mask is set, and always flushing in
flush_icache_deferred().

It is possible for switch_mm() to be called without being followed by
switch_to() such as with the riscv efi driver in efi_virtmap_load(), so
we cannot do all of the flushing in switch_to().

To avoid flushing multiple times in a single context switch, clear
mm->context.icache_stale_mask in switch_mm() and only flush in
switch_to() if it was set again in the meantime. Set icache_stale_mask
when handling prctl and in switch_to() if
task->mm->context.force_icache_flush is set to prepare for next context
switch.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 6b9391b581fd ("riscv: Include riscv_set_icache_flush_ctx prctl")
---
 arch/riscv/include/asm/switch_to.h | 19 ++++++++++++++++---
 arch/riscv/mm/cacheflush.c         |  1 +
 arch/riscv/mm/context.c            |  6 +-----
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index 7594df37cc9f..5701cfcf2b68 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -76,11 +76,24 @@ extern struct task_struct *__switch_to(struct task_struct *,
 static inline bool switch_to_should_flush_icache(struct task_struct *task)
 {
 #ifdef CONFIG_SMP
-	bool stale_mm = task->mm && task->mm->context.force_icache_flush;
-	bool stale_thread = task->thread.force_icache_flush;
+	bool stale_mm = false;
 	bool thread_migrated = smp_processor_id() != task->thread.prev_cpu;
+	bool stale_thread = thread_migrated && task->thread.force_icache_flush;
+
+	if (task->mm) {
+		/*
+		 * The mm is only stale if the respective CPU bit in
+		 * icache_stale_mask is set. force_icache_flush indicates that
+		 * icache_stale_mask should be set again before returning to userspace.
+		 */
+		stale_mm = cpumask_test_cpu(smp_processor_id(),
+					    &task->mm->context.icache_stale_mask);
+		cpumask_assign_cpu(smp_processor_id(),
+				   &task->mm->context.icache_stale_mask,
+				   task->mm->context.force_icache_flush);
+	}
 
-	return thread_migrated && (stale_mm || stale_thread);
+	return stale_mm || stale_thread;
 #else
 	return false;
 #endif
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index b81672729887..c4a9ac2ad7ab 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -230,6 +230,7 @@ int riscv_set_icache_flush_ctx(unsigned long ctx, unsigned long scope)
 		switch (scope) {
 		case PR_RISCV_SCOPE_PER_PROCESS:
 			current->mm->context.force_icache_flush = true;
+			cpumask_setall(&current->mm->context.icache_stale_mask);
 			break;
 		case PR_RISCV_SCOPE_PER_THREAD:
 			current->thread.force_icache_flush = true;
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 4abe3de23225..9e82e2a99441 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -306,11 +306,7 @@ static inline void flush_icache_deferred(struct mm_struct *mm, unsigned int cpu,
 		 */
 		smp_mb();
 
-		/*
-		 * If cache will be flushed in switch_to, no need to flush here.
-		 */
-		if (!(task && switch_to_should_flush_icache(task)))
-			local_flush_icache_all();
+		local_flush_icache_all();
 	}
 #endif
 }

-- 
2.45.0



* Re: [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred()
  2024-08-13 23:02 ` [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred() Charlie Jenkins
@ 2024-08-15  0:34   ` Andrea Parri
  2024-08-15 23:17     ` Charlie Jenkins
  0 siblings, 1 reply; 8+ messages in thread
From: Andrea Parri @ 2024-08-15  0:34 UTC (permalink / raw)
  To: Charlie Jenkins
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Atish Patra, Samuel Holland, Palmer Dabbelt, linux-riscv,
	linux-kernel

> <---- Thread 1 starts running here on CPU1
> 
> <---- Thread 2 starts running here with same mm
> 
> T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
> T2: <-- kernel sets current->mm->context.force_icache_flush to true

Mmh, TBH, I'm not sure how this patch is supposed to fix the race in
question:

For one, AFAIU, the operation

T2: cpumask_setall(&current->mm->context.icache_stale_mask)

(on CPU2?) you've added with this patch...


> T2: <modification of instructions>
> T2: fence.i
> 
> T1: fence.i (to synchronize with other thread, has some logic to
>              determine when to do this)
> T1: <-- thread 1 is preempted
> T1: <-- thread 1 is placed onto CPU3 and starts context switch sequence
> T1 (kernel): flush_icache_deferred() -> skips flush because switch_to_should_flush_icache() returns true

... does _not_ ensure that T1: flush_icache_deferred() on CPU3 will
observe/read from that operation: IOW,

T1: cpumask_test_and_clear_cpu(cpu, &mm->context.icache_stale_mask)

may still evaluate to FALSE, thus preventing the FENCE.I execution.

Moreover, AFAIU, ...


> 				     -> thread has migrated and task->mm->context.force_icache_flush is true
> 
> T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);

... moving the operation(s)

T2: set_icache_stale_mask()
	T2: cpumask_setall(&current->mm->context.icache_stale_mask)

before the following operation...  (per patch #1)


> T2 (kernel): kernel sets current->mm->context.force_icache_flush = false
> 
> T1 (kernel): switch_to() calls switch_to_should_flush_icache() which now
> 	     returns false because task->mm->context.force_icache_flush
> 	     is false due to the other thread emitting
> 	     PR_RISCV_CTX_SW_FENCEI_OFF.

... does not ensure that T1: switch_to_should_flush_icache() on CPU3
will observe

T1: cpumask_test_cpu(<cpu3>, &task->mm->context.icache_stale_mask) == true

in fact, T1 may evaluate the latter expression to FALSE while still
being able to observe the "later" operation, i.e.

T1: task->mm->context.force_icache_flush == false


Perhaps a simplified but useful way to look at such scenarios is as
follows:

  - CPUs are just like nodes of a distributed system, and

  - stores are like messages to be exchanged (by the memory subsystem)
    between CPUs: without some (explicit) synchronization/constraints,
    messages originating from a given CPU can propagate/be visible to
    other CPUs at any time and in any order.


IAC, can you elaborate on the solution proposed here (maybe by adding
some inline comments), keeping the above considerations in mind? what
am I missing?


> T1 (back in userspace): Instruction cache was never flushed on context
> 			switch to CPU3, and thus may execute incorrect
> 			instructions.

Mmh, flushing the I$ (or, as meant here, executing a FENCE.I) seems
to be only half of the solution: IIUC, we'd like to ensure that the
store operation

T2: <modification of instructions>

originated from CPU2 is _visible_ to CPU3 by the time that FENCE.I
instruction is executed, cf.

[from Zifencei - emphasis mine]

A FENCE.I instruction ensures that a subsequent instruction fetch on
a RISC-V hart will see any previous data stores _already visible_ to
the same RISC-V hart.


IOW (but assuming code is in coherent main memory), imagine that the
(putative) FENCE.I on CPU3 is replaced by some

T1: LOAD reg,0(insts_addr)

question is: would such a load be guaranteed to observe the store

T2: <modification of instructions>  #  STORE new_insts,0(insts_addr)

originated from CPU2? can you elaborate?

  Andrea


* Re: [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred()
  2024-08-15  0:34   ` Andrea Parri
@ 2024-08-15 23:17     ` Charlie Jenkins
  2024-08-16  9:27       ` Andrea Parri
  0 siblings, 1 reply; 8+ messages in thread
From: Charlie Jenkins @ 2024-08-15 23:17 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Atish Patra, Samuel Holland, Palmer Dabbelt, linux-riscv,
	linux-kernel

On Thu, Aug 15, 2024 at 02:34:19AM +0200, Andrea Parri wrote:
> > <---- Thread 1 starts running here on CPU1
> > 
> > <---- Thread 2 starts running here with same mm
> > 
> > T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
> > T2: <-- kernel sets current->mm->context.force_icache_flush to true
> 
> Mmh, TBH, I'm not sure how this patch is supposed to fix the race in
> question:
> 
> For once, AFAIU, the operation
> 
> T2: cpumask_setall(&current->mm->context.icache_stale_mask)
> 
> (on CPU2?) you've added with this patch...
> 
> 
> > T2: <modification of instructions>
> > T2: fence.i
> > 
> > T1: fence.i (to synchronize with other thread, has some logic to
> >              determine when to do this)
> > T1: <-- thread 1 is preempted
> > T1: <-- thread 1 is placed onto CPU3 and starts context switch sequence
> > T1 (kernel): flush_icache_deferred() -> skips flush because switch_to_should_flush_icache() returns true
> 
> ... does _not_ ensure that T1: flush_icache_deferred() on CPU3 will
> observe/read from that operation: IOW,
> 
> T1: cpumask_test_and_clear_cpu(cpu, &mm->context.icache_stale_mask)
> 
> may still evaluate to FALSE, thus preventing the FENCE.I execution.
> 
> Moreover, AFAIU, ...
> 
> 
> > 				     -> thread has migrated and task->mm->context.force_icache_flush is true
> > 
> > T2: prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);
> 
> ... moving the operation(s)
> 
> T2: set_icache_stale_mask()
> 	T2: cpumask_setall(&current->mm->context.icache_stale_mask)
> 
> before the following operation...  (per patch #1)
> 
> 
> > T2 (kernel): kernel sets current->mm->context.force_icache_flush = false
> > 
> > T1 (kernel): switch_to() calls switch_to_should_flush_icache() which now
> > 	     returns false because task->mm->context.force_icache_flush
> > 	     is false due to the other thread emitting
> > 	     PR_RISCV_CTX_SW_FENCEI_OFF.
> 
> ... does not ensure that T1: switch_to_should_flush_icache() on CPU3
> will observe
> 
> T1: cpumask_test_cpu(<cpu3>, &task->mm->context.icache_stale_mask) == true
> 
> in fact, T1 may evaluate the latter expression to FALSE while still
> being able to observe the "later" operation, i.e.
> 
> T1: task->mm->context.force_icache_flush == false
> 
> 
> Perhaps a simplified but useful way to look at such scenarios is as
> follows:
> 
>   - CPUs are just like nodes of a distributed system, and
> 
>   - stores are like messages to be exchanged (by the memory subsystem)
>     between CPUs: without some (explicit) synchronization/constraints,
>     messages originating from a given CPU can propagate/be visible to
>     other CPUs at any time and in any order.
> 
> 
> IAC, can you elaborate on the solution proposed here (maybe by adding
> some inline comments), keeping the above considerations in mind? what
> am I missing?

I should have added some memory barriers. I want to have the stores to
task->mm->context.force_icache_flush and
task->mm->context.icache_stale_mask in riscv_set_icache_flush_ctx() from
one hart to be visible by another hart that is observing the values in
flush_icache_deferred() and switch_to_should_flush_icache(). Then also
for the changes to those variables in flush_icache_deferred() and
switch_to_should_flush_icache() to be visible in future invocations of
those functions.

> 
> 
> > T1 (back in userspace): Instruction cache was never flushed on context
> > 			switch to CPU3, and thus may execute incorrect
> > 			instructions.
> 
> Mmh, flushing the I$ (or, as meant here, executing a FENCE.I) seems
> to be only half of the solution: IIUC, we'd like to ensure that the
> store operation
> 
> T2: <modification of instructions>
> 
> originated from CPU2 is _visible_ to CPU3 by the time that FENCE.I
> instruction is executed, cf.
> 
> [from Zifencei - emphasis mine]
> 
> A FENCE.I instruction ensures that a subsequent instruction fetch on
> a RISC-V hart will see any previous data stores _already visible_ to
> the same RISC-V hart.
>

Oh okay, so we will need to do a memory barrier before the fence.i in
the userspace program. I don't believe a memory barrier will be
necessary in the kernel while this prctl is active, though. Will the
kernel ensure memory coherence upon migration?

> 
> IOW (but assuming code is in coherent main memory), imagine that the
> (putative) FENCE.I on CPU3 is replaced by some
> 
> T1: LOAD reg,0(insts_addr)
> 
> question is: would such a load be guaranteed to observe the store
> 
> T2: <modification of instructions>  #  STORE new_insts,0(insts_addr)
> 
> originated from CPU2? can you elaborate?

To give a broader overview, the usecase for the per-mm mode of the prctl
is to support JIT languages that generate code sequences on one hart and
execute the generated code on another hart. In this example, thread 2 is
generating the code sequences and thread 1 is executing them.

The goal here is for Linux to guarantee that CPU migration does not
cause thread 1 to lose visibility of the instructions generated by
thread 2, provided it was able to see the generated instructions on the
CPU it was migrating from. To uphold this guarantee, when thread 1 (or any thread that is in
the mm group) is migrated, its instruction cache is synchronized.
Ideally, it would contain exactly the same contents as it did on the
previous CPU, but instead it must rely on fence.i since that is the only
option.

The stipulation of "if it was able to see the generated instructions on
the CPU it was migrating from" is there to say that the thread is
expected to emit fence.i as necessary to cover the case that migration
does not occur.

Another note is that with this prctl, fence.i is emitted by the kernel
whenever any thread in the mm group is migrated; however, the described
usecase is that there is one producer of the instructions and one
consumer. An extension to this prctl could be to specify which threads
are consumers and which are producers, and only flush the icache when
the consumers have migrated, but that optimization has not yet been
written.

- Charlie

> 
>   Andrea


* Re: [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred()
  2024-08-15 23:17     ` Charlie Jenkins
@ 2024-08-16  9:27       ` Andrea Parri
  0 siblings, 0 replies; 8+ messages in thread
From: Andrea Parri @ 2024-08-16  9:27 UTC (permalink / raw)
  To: Charlie Jenkins
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Atish Patra, Samuel Holland, Palmer Dabbelt, linux-riscv,
	linux-kernel

> I should have added some memory barriers. I want to have the stores to
> task->mm->context.force_icache_flush and
> task->mm->context.icache_stale_mask in riscv_set_icache_flush_ctx() from
> one hart to be visible to another hart that is observing the values in
> flush_icache_deferred() and switch_to_should_flush_icache(). Then also
> for the changes to those variables in flush_icache_deferred() and
> switch_to_should_flush_icache() to be visible in future invocations of
> those functions.

[...]


> Oh okay so we will need to do a memory barrier before the fence.i in the
> userspace program. I don't believe a memory barrier will be necessary in
> the kernel though while this prctl is active, will the kernel ensure
> memory coherence upon migration?

Yes, the kernel enforces coherence upon migration (*): in the example
at stake, this means that T1 will have a coherent view of memory when
migrating from CPU1 to CPU3.

To leverage this property, we (or the user application) would need to
provide some synchronization between T2 (that modifies the code) and
T1.  This typically involves some form of barrier pairing.

Feel free to reach out, here on the list or in private chats, for any
related memory-barrier doubts I might be able to clarify.

  Andrea

(*) A malicious/buggy hypervisor could migrate (virtual) CPU X, and all
its threads, at any time and in any way allowed by Murphy's law.  :-) I
take it that goes beyond the scope of this series...


* Re: [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX
  2024-08-13 23:02 [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX Charlie Jenkins
  2024-08-13 23:02 ` [PATCH 1/2] riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF Charlie Jenkins
  2024-08-13 23:02 ` [PATCH 2/2] riscv: Eagerly flush in flush_icache_deferred() Charlie Jenkins
@ 2024-09-11 15:30 ` patchwork-bot+linux-riscv
  2024-09-11 15:38   ` Palmer Dabbelt
  2 siblings, 1 reply; 8+ messages in thread
From: patchwork-bot+linux-riscv @ 2024-09-11 15:30 UTC (permalink / raw)
  To: Charlie Jenkins
  Cc: linux-riscv, paul.walmsley, palmer, aou, alexghiti, atishp,
	samuel.holland, parri.andrea, palmer, linux-kernel

Hello:

This series was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Tue, 13 Aug 2024 16:02:16 -0700 you wrote:
> There are two race conditions possible with
> PR_RISCV_SET_ICACHE_FLUSH_CTX. The first one can be seen by enabling
> DEBUG_PREEMPT and using this prctl which will warn with BUG: using
> smp_processor_id() in preemptible. This can be fixed by disabling
> preemption during this prctl handling. Another race condition is present
> when the mm->context.icache_stale_mask is changed by a thread while a
> different thread in the same mm context is between switch_mm() and
> switch_to() during a context switch.
> 
> [...]

Here is the summary with links:
  - [1/2] riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
    https://git.kernel.org/riscv/c/7c1e5b9690b0
  - [2/2] riscv: Eagerly flush in flush_icache_deferred()
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX
  2024-09-11 15:30 ` [PATCH 0/2] riscv: Fix race conditions in PR_RISCV_SET_ICACHE_FLUSH_CTX patchwork-bot+linux-riscv
@ 2024-09-11 15:38   ` Palmer Dabbelt
  0 siblings, 0 replies; 8+ messages in thread
From: Palmer Dabbelt @ 2024-09-11 15:38 UTC (permalink / raw)
  To: patchwork-bot+linux-riscv
  Cc: Charlie Jenkins, linux-riscv, Paul Walmsley, aou, alexghiti,
	Atish Patra, samuel.holland, parri.andrea, linux-kernel

On Wed, 11 Sep 2024 08:30:32 PDT (-0700), patchwork-bot+linux-riscv@kernel.org wrote:
> Hello:
>
> This series was applied to riscv/linux.git (fixes)
> by Palmer Dabbelt <palmer@rivosinc.com>:
>
> On Tue, 13 Aug 2024 16:02:16 -0700 you wrote:
>> There are two race conditions possible with
>> PR_RISCV_SET_ICACHE_FLUSH_CTX. The first one can be seen by enabling
>> DEBUG_PREEMPT and using this prctl which will warn with BUG: using
>> smp_processor_id() in preemptible. This can be fixed by disabling
>> preemption during this prctl handling. Another race condition is present
>> when the mm->context.icache_stale_mask is changed by a thread while a
>> different thread in the same mm context is between switch_mm() and
>> switch_to() during a context switch.
>>
>> [...]
>
> Here is the summary with links:
>   - [1/2] riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
>     https://git.kernel.org/riscv/c/7c1e5b9690b0
>   - [2/2] riscv: Eagerly flush in flush_icache_deferred()
>     (no matching commit)
>
> You are awesome, thank you!

I think the bot just got a little lost here, I applied the v2 from over 
here: https://lore.kernel.org/r/20240903-fix_fencei_optimization-v2-1-8025f20171fc@rivosinc.com

