From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751623AbdG0U3h (ORCPT ); Thu, 27 Jul 2017 16:29:37 -0400 Received: from mail.efficios.com ([167.114.142.141]:56156 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751518AbdG0U3d (ORCPT ); Thu, 27 Jul 2017 16:29:33 -0400 Date: Thu, 27 Jul 2017 20:31:37 +0000 (UTC) From: Mathieu Desnoyers To: Peter Zijlstra Cc: linux-kernel , "Paul E. McKenney" , Boqun Feng Message-ID: <681315624.29102.1501187497745.JavaMail.zimbra@efficios.com> In-Reply-To: <20170727195531.GE28975@worktop> References: <20170727185943.11570-1-mathieu.desnoyers@efficios.com> <20170727195531.GE28975@worktop> Subject: Re: [RFC PATCH] membarrier: expedited private command MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [167.114.142.141] X-Mailer: Zimbra 8.7.9_GA_1794 (ZimbraWebClient - FF52 (Linux)/8.7.9_GA_1794) Thread-Topic: membarrier: expedited private command Thread-Index: PAK8niyvdDQeAI2Sq4m+GTO6TytKxg== Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org ----- On Jul 27, 2017, at 3:55 PM, Peter Zijlstra peterz@infradead.org wrote: > On Thu, Jul 27, 2017 at 02:59:43PM -0400, Mathieu Desnoyers wrote: >> diff --git a/kernel/membarrier.c b/kernel/membarrier.c >> index 9f9284f37f8d..8c6c0f96f617 100644 >> --- a/kernel/membarrier.c >> +++ b/kernel/membarrier.c >> @@ -19,10 +19,81 @@ >> #include >> >> /* >> + * XXX For cpu_rq(). Should we rather move >> + * membarrier_private_expedited() to sched/core.c or create >> + * sched/membarrier.c ? > > The later perhaps. Allright, will do that in v2. > >> +static void membarrier_private_expedited(void) >> +{ >> + int cpu, this_cpu; >> + cpumask_var_t tmpmask; >> + >> + if (num_online_cpus() == 1) >> + return; >> + >> + /* >> + * Matches memory barriers around rq->curr modification in >> + * scheduler. >> + */ >> + smp_mb(); /* system call entry is not a mb. */ >> + >> + if (!alloc_cpumask_var(&tmpmask, GFP_NOWAIT)) { > > Why GFP_NOWAIT ? and falback. There seems to be a desire to make this a > nonblocking syscall. Should we document this somewhere? blocking a synchronization system call due to memory allocation pressure seemed like an unwanted effect back in 2010, so I kept the same approach. Perhaps we could state that all the "expedited" commands should be non-blocking ? > >> + /* Fallback for OOM. */ >> + membarrier_private_expedited_ipi_each(); >> + goto end; >> + } >> + >> + this_cpu = raw_smp_processor_id(); > > This is a tad dodgy, you might want to put in a comment on how migrating > this thread is ok. How about this ? /* * Skipping the current CPU is OK even through we can be * migrated at any point. The current CPU, at the point where we * read raw_smp_processor_id(), is ensured to be in program * order with respect to the caller thread. Therefore, we can * skip this CPU from the iteration. */ > >> + for_each_online_cpu(cpu) { > > One would also need cpus_read_lock() if you rely on the online mask. OK. > >> + struct task_struct *p; >> + >> + if (cpu == this_cpu) >> + continue; >> + rcu_read_lock(); >> + p = task_rcu_dereference(&cpu_rq(cpu)->curr); >> + if (p && p->mm == current->mm) >> + __cpumask_set_cpu(cpu, tmpmask); >> + rcu_read_unlock(); >> + } >> + smp_call_function_many(tmpmask, ipi_mb, NULL, 1); >> + free_cpumask_var(tmpmask); >> +end: >> + /* >> + * Memory barrier on the caller thread _after_ we finished >> + * waiting for the last IPI. Matches memory barriers around >> + * rq->curr modification in scheduler. >> + */ >> + smp_mb(); /* exit from system call is not a mb */ >> +} > >> @@ -2737,6 +2757,7 @@ context_switch(struct rq *rq, struct task_struct *prev, >> >> mm = next->mm; >> oldmm = prev->active_mm; >> + membarrier_expedited_mb_after_set_current(mm, oldmm); >> /* >> * For paravirt, this is coupled with an exit in switch_to to >> * combine the page table reload and the switch backend into > > As said on IRC, we have finish_task_switch()->if (mm) > mmdrop(mm)->atomic_dec_and_test() providing a smp_mb(). We just need to > deal with the !mm case. Yes, I have a v2 brewing that includes this change :) Thanks! Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com