linux-api.vger.kernel.org archive mirror
* [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
@ 2019-01-28 18:26 Mathieu Desnoyers
  2019-01-28 20:27 ` Linus Torvalds
  2019-01-28 21:33 ` Jann Horn
  0 siblings, 2 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-01-28 18:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, linux-api, Mathieu Desnoyers, Jann Horn,
	Thomas Gleixner, Andrea Parri, Andrew Hunter, Andy Lutomirski,
	Avi Kivity, Benjamin Herrenschmidt, Boqun Feng, Dave Watson,
	David Sehr, Greg Hackmann, H . Peter Anvin, Linus Torvalds,
	Maged Michael, Michael Ellerman, Paul E . McKenney,
	Paul Mackerras, Ru

Jann Horn identified a racy access to p->mm in the global expedited
command of the membarrier system call.

The suggested fix is to hold the task_lock() around the accesses to
p->mm and to the mm_struct membarrier_state field to guarantee the
existence of the mm_struct.

Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Jann Horn <jannh@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra (Intel) <peterz@infradead.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andrea Parri <parri.andrea@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Dave Watson <davejwatson@fb.com>
CC: David Sehr <sehr@google.com>
CC: Greg Hackmann <ghackmann@google.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Maged Michael <maged.michael@gmail.com>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Paul Mackerras <paulus@samba.org>
CC: Russell King <linux@armlinux.org.uk>
CC: Will Deacon <will.deacon@arm.com>
CC: stable@vger.kernel.org # v4.16+
CC: linux-api@vger.kernel.org
---
 kernel/sched/membarrier.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 76e0eaf4654e..305fdcc4c5f7 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
 
 		rcu_read_lock();
 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
-		if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
-				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
-			if (!fallback)
-				__cpumask_set_cpu(cpu, tmpmask);
-			else
-				smp_call_function_single(cpu, ipi_mb, NULL, 1);
+		/*
+		 * Skip this CPU if the runqueue's current task is NULL or if
+		 * it is a kernel thread.
+		 */
+		if (p && READ_ONCE(p->mm)) {
+			bool mm_match;
+
+			/*
+			 * Read p->mm and access membarrier_state while holding
+			 * the task lock to ensure existence of mm.
+			 */
+			task_lock(p);
+			mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &
+					     MEMBARRIER_STATE_GLOBAL_EXPEDITED);
+			task_unlock(p);
+			if (mm_match) {
+				if (!fallback)
+					__cpumask_set_cpu(cpu, tmpmask);
+				else
+					smp_call_function_single(cpu, ipi_mb, NULL, 1);
+			}
 		}
 		rcu_read_unlock();
 	}
-- 
2.17.1
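
As an illustration only (not part of the posted patch), the check / lock /
re-check pattern used in the hunk above can be modelled in userspace C.
Every name below (task_model, mm_model, model_task_lock() and so on) is a
hypothetical stand-in rather than a kernel API, and a small spinlock plays
the role of task_lock(). The sketch assumes that the path which clears the
mm pointer takes the same lock, which is what makes the locked re-check
meaningful.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for MEMBARRIER_STATE_GLOBAL_EXPEDITED (value is arbitrary). */
#define STATE_GLOBAL_EXPEDITED 0x1

struct mm_model {
	atomic_int membarrier_state;
};

struct task_model {
	atomic_flag lock;		/* plays the role of task_lock(p) */
	struct mm_model *_Atomic mm;	/* NULL models a kernel thread */
};

static void model_task_lock(struct task_model *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

static void model_task_unlock(struct task_model *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Returns true if the task's mm has the global-expedited bit set. */
static bool task_wants_barrier(struct task_model *p)
{
	struct mm_model *mm;
	bool match = false;

	/* Unlocked pre-check: the pointer is tested but never dereferenced. */
	if (!p || !atomic_load_explicit(&p->mm, memory_order_relaxed))
		return false;

	/* Re-read under the lock; only this copy is dereferenced. */
	model_task_lock(p);
	mm = atomic_load_explicit(&p->mm, memory_order_relaxed);
	if (mm)
		match = (atomic_load(&mm->membarrier_state) &
			 STATE_GLOBAL_EXPEDITED) != 0;
	model_task_unlock(p);

	return match;
}

/* Models an exit path that clears the mm pointer under the same lock. */
static struct mm_model *task_detach_mm(struct task_model *p)
{
	struct mm_model *old;

	model_task_lock(p);
	old = atomic_exchange(&p->mm, NULL);
	model_task_unlock(p);

	return old;	/* the caller may release it once the lock is dropped */
}
```

The unlocked pre-check only keeps kernel threads off the lock; the decision
itself is made from the value read while the lock is held.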


* Re: [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 18:26 [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited() Mathieu Desnoyers
@ 2019-01-28 20:27 ` Linus Torvalds
  2019-01-28 20:46   ` Paul E. McKenney
  2019-01-28 21:33 ` Jann Horn
  1 sibling, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2019-01-28 20:27 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, Linux List Kernel Mailing, Linux API,
	Jann Horn, Thomas Gleixner, Andrea Parri, Andrew Hunter,
	Andy Lutomirski, Avi Kivity, Benjamin Herrenschmidt, Boqun Feng,
	Dave Watson, David Sehr, Greg Hackmann, H . Peter Anvin,
	Maged Michael, Michael Ellerman, Paul E . McKenney,
	Paul Mackerras

On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> Jann Horn identified a racy access to p->mm in the global expedited
> command of the membarrier system call.
>
> The suggested fix is to hold the task_lock() around the accesses to
> p->mm and to the mm_struct membarrier_state field to guarantee the
> existence of the mm_struct.

Hmm. I think this is right. You shouldn't access another thread's mm
pointer without proper locking.

That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU,
which would allow speculatively reading data off the mm pointer under
RCU. It might not be the *right* mm if somebody just did an exit, but
for things like this it shouldn't matter.

But if this is the only case that might care, it sounds like just
doing the proper locking is the right approach.

           Linus
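
As an illustration only (not code from this thread), the type-stable idea
can be modelled in userspace C. The sketch assumes a fixed pool whose slots
are only ever reused as the same struct type and are never handed back to
the system, which is a stronger guarantee than SLAB_TYPESAFE_BY_RCU gives
(that flag only keeps the memory type-stable across an RCU grace period).
All names (mm_model, mm_model_alloc(), speculative_expedited()) are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SLOTS 4
/* Stand-in for MEMBARRIER_STATE_GLOBAL_EXPEDITED (value is arbitrary). */
#define STATE_GLOBAL_EXPEDITED 0x1

struct mm_model {
	atomic_int membarrier_state;
	atomic_bool in_use;
};

/* Slots are recycled, but always as struct mm_model. */
static struct mm_model pool[POOL_SLOTS];

static struct mm_model *mm_model_alloc(int state)
{
	for (size_t i = 0; i < POOL_SLOTS; i++) {
		bool expected = false;

		/* Claim the first free slot. */
		if (atomic_compare_exchange_strong(&pool[i].in_use,
						   &expected, true)) {
			atomic_store(&pool[i].membarrier_state, state);
			return &pool[i];
		}
	}
	return NULL;	/* pool exhausted */
}

static void mm_model_free(struct mm_model *mm)
{
	/* The slot becomes reusable; its contents are left as they are. */
	atomic_store(&mm->in_use, false);
}

/*
 * Speculative read through a possibly stale pointer. The access is always
 * memory-safe in this model, but the answer may describe a different owner
 * of the slot than the caller had in mind.
 */
static bool speculative_expedited(struct mm_model *maybe_stale)
{
	return (atomic_load(&maybe_stale->membarrier_state) &
		STATE_GLOBAL_EXPEDITED) != 0;
}
```

A reader holding a stale pointer may therefore act on the wrong mm's state,
which is the "might not be the *right* mm" caveat above.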


* Re: [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 20:27 ` Linus Torvalds
@ 2019-01-28 20:46   ` Paul E. McKenney
  2019-01-28 21:07     ` Mathieu Desnoyers
  0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2019-01-28 20:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra,
	Linux List Kernel Mailing, Linux API, Jann Horn, Thomas Gleixner,
	Andrea Parri, Andrew Hunter, Andy Lutomirski, Avi Kivity,
	Benjamin Herrenschmidt, Boqun Feng, Dave Watson, David Sehr,
	Greg Hackmann, H . Peter Anvin, Maged Michael, Michael Ellerman,
	Paul Mackerras

On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
> On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> >
> > Jann Horn identified a racy access to p->mm in the global expedited
> > command of the membarrier system call.
> >
> > The suggested fix is to hold the task_lock() around the accesses to
> > p->mm and to the mm_struct membarrier_state field to guarantee the
> > existence of the mm_struct.
> 
> Hmm. I think this is right. You shouldn't access another thread's mm
> pointer without proper locking.
> 
> That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU,
> which would allow speculatively reading data off the mm pointer under
> RCU. It might not be the *right* mm if somebody just did an exit, but
> for things like this it shouldn't matter.

That sounds much simpler and more effective than the contention-reduction
approach that I suggested.  ;-)

							Thanx, Paul

> But if this is the only case that might care, it sounds like just
> doing the proper locking is the right approach.
> 
>            Linus
> 


* Re: [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 20:46   ` Paul E. McKenney
@ 2019-01-28 21:07     ` Mathieu Desnoyers
  2019-01-28 21:26       ` Paul E. McKenney
  0 siblings, 1 reply; 6+ messages in thread
From: Mathieu Desnoyers @ 2019-01-28 21:07 UTC (permalink / raw)
  To: paulmck, Linus Torvalds, Jann Horn
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, linux-api,
	Thomas Gleixner, Andrea Parri, Andrew Hunter, Andy Lutomirski,
	Avi Kivity, Benjamin Herrenschmidt, Boqun Feng, Dave Watson,
	David Sehr, Greg Hackmann, H. Peter Anvin, maged michael,
	Michael Ellerman, Paul Mackerras, Russell King, ARM Linux,
	Will Deacon

----- On Jan 28, 2019, at 3:46 PM, paulmck paulmck@linux.ibm.com wrote:

> On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
>> On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers
>> <mathieu.desnoyers@efficios.com> wrote:
>> >
>> > Jann Horn identified a racy access to p->mm in the global expedited
>> > command of the membarrier system call.
>> >
>> > The suggested fix is to hold the task_lock() around the accesses to
>> > p->mm and to the mm_struct membarrier_state field to guarantee the
>> > existence of the mm_struct.
>> 
>> Hmm. I think this is right. You shouldn't access another thread's mm
>> pointer without proper locking.
>> 
>> That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU,
>> which would allow speculatively reading data off the mm pointer under
>> RCU. It might not be the *right* mm if somebody just did an exit, but
>> for things like this it shouldn't matter.
> 
> That sounds much simpler and more effective than the contention-reduction
> approach that I suggested.  ;-)

I'd be tempted to stick to the locking approach for a fix, and implement
Linus' type-safe mm_cachep idea if anyone complains about the overhead
of membarrier GLOBAL_EXPEDITED (and submit for a future merge window).

I tested the KASAN splat reproducer from Jann locally, and confirmed that
my patch fixes the issue it reproduces.

Please let me know if the task_lock() approach is OK as a fix for now.

I'm also awaiting a Tested-by from Jann before submitting this for real.

Thanks,

Mathieu

> 
>							Thanx, Paul
> 
>> But if this is the only case that might care, it sounds like just
>> doing the proper locking is the right approach.
>> 
>>            Linus

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 21:07     ` Mathieu Desnoyers
@ 2019-01-28 21:26       ` Paul E. McKenney
  0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2019-01-28 21:26 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Linus Torvalds, Jann Horn, Ingo Molnar, Peter Zijlstra,
	linux-kernel, linux-api, Thomas Gleixner, Andrea Parri,
	Andrew Hunter, Andy Lutomirski, Avi Kivity,
	Benjamin Herrenschmidt, Boqun Feng, Dave Watson, David Sehr,
	Greg Hackmann, H. Peter Anvin, maged michael, Michael Ellerman,
	Paul Mackerras, Russel

On Mon, Jan 28, 2019 at 04:07:26PM -0500, Mathieu Desnoyers wrote:
> ----- On Jan 28, 2019, at 3:46 PM, paulmck paulmck@linux.ibm.com wrote:
> 
> > On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
> >> On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers
> >> <mathieu.desnoyers@efficios.com> wrote:
> >> >
> >> > Jann Horn identified a racy access to p->mm in the global expedited
> >> > command of the membarrier system call.
> >> >
> >> > The suggested fix is to hold the task_lock() around the accesses to
> >> > p->mm and to the mm_struct membarrier_state field to guarantee the
> >> > existence of the mm_struct.
> >> 
> >> Hmm. I think this is right. You shouldn't access another thread's mm
> >> pointer without proper locking.
> >> 
> >> That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU,
> >> which would allow speculatively reading data off the mm pointer under
> >> RCU. It might not be the *right* mm if somebody just did an exit, but
> >> for things like this it shouldn't matter.
> > 
> > That sounds much simpler and more effective than the contention-reduction
> > approach that I suggested.  ;-)
> 
> I'd be tempted to stick to the locking approach for a fix, and implement
> Linus' type-safe mm_cachep idea if anyone complains about the overhead
> of membarrier GLOBAL_EXPEDITED (and submit for a future merge window).
> 
> I tested the KASAN splat reproducer from Jann locally, and confirmed that
> my patch fixes the issue it reproduces.
> 
> Please let me know if the task_lock() approach is OK as a fix for now.

Agreed, no need for added complexity until there is a clear need.

> I'm also awaiting a Tested-by from Jann before submitting this for real.

Makes sense to me!

							Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> > 
> >							Thanx, Paul
> > 
> >> But if this is the only case that might care, it sounds like just
> >> doing the proper locking is the right approach.
> >> 
> >>            Linus
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> 


* Re: [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited()
  2019-01-28 18:26 [RFC PATCH] Fix: membarrier: racy access to p->mm in membarrier_global_expedited() Mathieu Desnoyers
  2019-01-28 20:27 ` Linus Torvalds
@ 2019-01-28 21:33 ` Jann Horn
  1 sibling, 0 replies; 6+ messages in thread
From: Jann Horn @ 2019-01-28 21:33 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, kernel list, Linux API,
	Thomas Gleixner, Andrea Parri, Andrew Hunter, Andy Lutomirski,
	Avi Kivity, Benjamin Herrenschmidt, Boqun Feng, Dave Watson,
	David Sehr, Greg Hackmann, H . Peter Anvin, Linus Torvalds,
	Maged Michael, Michael Ellerman, Paul E . McKenney,
	Paul Mackerras <paulus>

On Mon, Jan 28, 2019 at 7:27 PM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> Jann Horn identified a racy access to p->mm in the global expedited
> command of the membarrier system call.
>
> The suggested fix is to hold the task_lock() around the accesses to
> p->mm and to the mm_struct membarrier_state field to guarantee the
> existence of the mm_struct.
>
> Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Yu3cw@mail.gmail.com
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

The patch looks good to me, and to be sure, I've also given it a spin
- I can't trigger a splat anymore. You can add:

Tested-by: Jann Horn <jannh@google.com>

