From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751517AbdG0KW0 (ORCPT );
	Thu, 27 Jul 2017 06:22:26 -0400
Received: from foss.arm.com ([217.140.101.70]:44372 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751113AbdG0KWZ (ORCPT );
	Thu, 27 Jul 2017 06:22:25 -0400
Date: Thu, 27 Jul 2017 11:22:32 +0100
From: Will Deacon
To: Peter Zijlstra
Cc: Mathieu Desnoyers, "Paul E. McKenney", linux-kernel, Ingo Molnar,
	Lai Jiangshan, dipankar, Andrew Morton, Josh Triplett,
	Thomas Gleixner, rostedt, David Howells, Eric Dumazet, fweisbec,
	Oleg Nesterov
Subject: Re: [PATCH tip/core/rcu 4/5] sys_membarrier: Add expedited option
Message-ID: <20170727102231.GA20746@arm.com>
References: <20170724215758.GA12075@linux.vnet.ibm.com>
	<20170725193612.GW3730@linux.vnet.ibm.com>
	<20170725202451.GC28975@worktop>
	<20170725211926.GA3730@linux.vnet.ibm.com>
	<20170725215510.GD28975@worktop>
	<1480480872.25829.1501023013862.JavaMail.zimbra@efficios.com>
	<20170726074656.obadfdu6hdlrmy7r@hirez.programming.kicks-ass.net>
	<20170726154229.GO3730@linux.vnet.ibm.com>
	<1415161894.26359.1501092075006.JavaMail.zimbra@efficios.com>
	<20170727085312.xweucokjghay3jtx@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170727085312.xweucokjghay3jtx@hirez.programming.kicks-ass.net>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 27, 2017 at 10:53:12AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 26, 2017 at 06:01:15PM +0000, Mathieu Desnoyers wrote:
>
> > Another alternative for a MEMBARRIER_CMD_SHARED_EXPEDITED would be rate-limiting
> > per thread. For instance, we could add a new "ulimit" that would bound the
> > number of expedited membarrier per thread that can be done per millisecond,
> > and switch to synchronize_sched() whenever a thread goes beyond that limit
> > for the rest of the time-slot.
>
> You forgot to ask yourself how you could abuse this.. just spawn more
> threads.
>
> Per-thread limits are nearly useless, because spawning new threads is
> cheap.
>
> > A RT system that really cares about not having userspace sending IPIs
> > to all cpus could set the ulimit value to 0, which would always use
> > synchronize_sched().
> >
> > Thoughts ?
>
> So I really don't like SHARED_EXPEDITED, and your use-cases (from later
> emails) makes me think sys_membarrier() should have a pointer argument
> to identify the shared mapping.
>
> But even then, iterating the rmap for something that has 1000+ maps
> isn't going to be nice or fast, even in kernel space.
>
> Another crazy idea is using madvise() for this. The new MADV_MEMBAR
> could revoke PROT_WRITE and PROT_READ for all extant PTEs. Then the
> tasks attempting access will fault and the fault handler can figure out
> if it still needs to issue a MB or not before reinstating the PTE.

If you did that, wouldn't you need to leave the faulting entries intact
until all CPUs with the mm scheduled have either faulted or
context-switched? We don't have per-cpu permission bits in the page
tables, unfortunately.

Will
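
The concern in the reply above can be made concrete with a toy model. What
follows is only a sketch in plain userspace C, not kernel code: `struct toy_mm`,
`toy_membar_begin()` and `toy_cpu_serialized()` are hypothetical names invented
for illustration, and the model assumes that a fault or a context switch on a
CPU implies the barrier that CPU owes. Because the permission bit is shared by
all CPUs, the entry can only be reinstated once no CPU is still pending.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical toy model, not kernel code. */
struct toy_mm {
	bool pte_accessible;       /* one permission bit shared by every CPU */
	bool pending[TOY_NR_CPUS]; /* CPU runs this mm and has not serialized yet */
};

/* True while at least one CPU still owes a barrier. */
static bool toy_any_pending(const struct toy_mm *mm)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (mm->pending[cpu])
			return true;
	return false;
}

/* Model of the proposed MADV_MEMBAR: revoke access, mark CPUs running the mm. */
static void toy_membar_begin(struct toy_mm *mm, const bool running[TOY_NR_CPUS])
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		mm->pending[cpu] = running[cpu];
	/* With nobody running the mm there is nothing to wait for. */
	mm->pte_accessible = !toy_any_pending(mm);
}

/*
 * Called when a CPU faults on the page or context-switches; either event is
 * assumed to imply the needed barrier on that CPU. The shared entry is
 * reinstated only once no CPU is pending.
 */
static void toy_cpu_serialized(struct toy_mm *mm, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	mm->pending[cpu] = false;
	if (!toy_any_pending(mm))
		mm->pte_accessible = true;
}
```

In this model, if CPUs 0 and 1 are running the mm, a fault on CPU 0 alone
leaves the entry revoked; reinstating it at that point would let CPU 1 access
the page without ever faulting, and so without issuing its barrier.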