All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Will Deacon <will@kernel.org>, paulmck <paulmck@kernel.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	Nicholas Piggin <npiggin@gmail.com>
Subject: Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)
Date: Mon, 24 Aug 2020 11:27:49 -0400 (EDT)	[thread overview]
Message-ID: <764014395.16126.1598282869127.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20200816152907.GB87259@debian-boqun.qqnc3lrjykvubdpftowmye0fmh.lx.internal.cloudapp.net>

----- On Aug 16, 2020, at 11:29 AM, Boqun Feng boqun.feng@gmail.com wrote:

> On Fri, Aug 14, 2020 at 12:43:57PM -0400, Mathieu Desnoyers wrote:
>> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
>> to allow the effect of membarrier(2) to apply to kthreads accessing
>> user-space memory as well.
>> 
>> Given that no prior kthread use this guarantee and that it only affects
>> kthreads, adding this guarantee does not affect user-space ABI.
>> 
>> Refine the check in membarrier_global_expedited to exclude runqueues
>> running the idle thread rather than all kthreads from the IPI cpumask.
>> 
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Paul E. McKenney <paulmck@kernel.org>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> ---
>> Changes since v1:
>> - Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ),
>> - Use smp_mb__after_spinlock rather than smp_mb after task_lock.
>> ---
>>  kernel/kthread.c          | 19 +++++++++++++++++++
>>  kernel/sched/idle.c       |  1 +
>>  kernel/sched/membarrier.c |  8 ++------
>>  3 files changed, 22 insertions(+), 6 deletions(-)
>> 
>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>> index 3edaa380dc7b..77aaaa7bc8d9 100644
>> --- a/kernel/kthread.c
>> +++ b/kernel/kthread.c
>> @@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm)
>>  	finish_arch_post_lock_switch();
>>  #endif
>>  
>> +	/*
>> +	 * When a kthread starts operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after storing to tsk->mm, before accessing
>> +	 * user-space memory. A full memory barrier for membarrier
>> +	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
>> +	 * mmdrop(), or explicitly with smp_mb().
>> +	 */
>>  	if (active_mm != mm)
>>  		mmdrop(active_mm);
>> +	else
>> +		smp_mb();
> 
> Similar question here: could smp_mb() guarantee the correctness of
> GLOBAL_EXPEDITED? Don't you need membarrier_switch_mm() here and in
> kthread_unuse_mm(), too?
> 
> Am I miss something here?

I think you have a good point there. Which brings me to wonder why we
don't have membarrier_switch_mm() when entering/leaving lazy TLB mode.
This means an IPI can be sent to a kthread even if it does not use an
mm, just because the membarrier state in the runqueue is that of the
active_mm.

Thoughts ?

Thanks,

Mathieu

> 
> Regards,
> Boqun
> 
>>  
>>  	to_kthread(tsk)->oldfs = force_uaccess_begin();
>>  }
>> @@ -1276,6 +1287,14 @@ void kthread_unuse_mm(struct mm_struct *mm)
>>  	force_uaccess_end(to_kthread(tsk)->oldfs);
>>  
>>  	task_lock(tsk);
>> +	/*
>> +	 * When a kthread stops operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after accessing user-space memory, before
>> +	 * clearing tsk->mm.
>> +	 */
>> +	smp_mb__after_spinlock();
>>  	sync_mm_rss(mm);
>>  	local_irq_disable();
>>  	tsk->mm = NULL;
>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
>> index 6bf34986f45c..3443ee8335d0 100644
>> --- a/kernel/sched/idle.c
>> +++ b/kernel/sched/idle.c
>> @@ -341,6 +341,7 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
>>  	WARN_ON_ONCE(!(current->flags & PF_KTHREAD));
>>  	WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY));
>>  	WARN_ON_ONCE(!duration_ns);
>> +	WARN_ON_ONCE(current->mm);
>>  
>>  	rcu_sleep_check();
>>  	preempt_disable();
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index 168479a7d61b..8a294483074d 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -100,13 +100,9 @@ static int membarrier_global_expedited(void)
>>  		    MEMBARRIER_STATE_GLOBAL_EXPEDITED))
>>  			continue;
>>  
>> -		/*
>> -		 * Skip the CPU if it runs a kernel thread. The scheduler
>> -		 * leaves the prior task mm in place as an optimization when
>> -		 * scheduling a kthread.
>> -		 */
>> +		/* Skip the CPU if it runs the idle thread. */
>>  		p = rcu_dereference(cpu_rq(cpu)->curr);
>> -		if (p->flags & PF_KTHREAD)
>> +		if (is_idle_task(p))
>>  			continue;
>>  
>>  		__cpumask_set_cpu(cpu, tmpmask);
>> --
>> 2.11.0

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2020-08-24 15:28 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-14 16:43 [RFC PATCH 0/3] sched: membarrier updates Mathieu Desnoyers
2020-08-14 16:43 ` [RFC PATCH 1/3] sched: fix exit_mm vs membarrier (v2) Mathieu Desnoyers
2020-08-16 15:23   ` Boqun Feng
2020-09-24 15:01     ` Mathieu Desnoyers
2020-08-14 16:43 ` [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2) Mathieu Desnoyers
2020-08-16 15:29   ` Boqun Feng
2020-08-24 15:27     ` Mathieu Desnoyers [this message]
2020-08-25  2:06       ` Boqun Feng
2020-09-24 15:26         ` Mathieu Desnoyers
2020-08-14 16:43 ` [RFC PATCH 3/3] sched: membarrier: document memory ordering scenarios Mathieu Desnoyers
     [not found] ` <20200816070932.10752-1-hdanton@sina.com>
2020-08-16 21:05   ` [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2) Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=764014395.16126.1598282869127.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=akpm@linux-foundation.org \
    --cc=boqun.feng@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=npiggin@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stern@rowland.harvard.edu \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.