From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755819AbbCLWan (ORCPT ); Thu, 12 Mar 2015 18:30:43 -0400
Received: from mail.efficios.com ([78.47.125.74]:41770 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751252AbbCLWam (ORCPT ); Thu, 12 Mar 2015 18:30:42 -0400
Date: Thu, 12 Mar 2015 22:30:35 +0000 (UTC)
From: Mathieu Desnoyers
To: Linus Torvalds
Cc: Michael Sullivan, lttng-dev@lists.lttng.org, LKML, "Paul E. McKenney", Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Steven Rostedt
Message-ID: <1601505044.287659.1426199435904.JavaMail.zimbra@efficios.com>
In-Reply-To:
References: <867044376.285926.1426172227750.JavaMail.zimbra@efficios.com> <666590480.287502.1426193588471.JavaMail.zimbra@efficios.com>
Subject: Re: Alternative to signals/sys_membarrier() in liburcu
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [173.246.22.116]
X-Mailer: Zimbra 8.0.7_GA_6021 (ZimbraWebClient - FF36 (Linux)/8.0.7_GA_6021)
Thread-Topic: Alternative to signals/sys_membarrier() in liburcu
Thread-Index: +g7/EAv0mcVEuWpzV04Sbq2fhjNZUg==
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

----- Original Message -----
> From: "Linus Torvalds"
> To: "Mathieu Desnoyers"
> Cc: "Michael Sullivan", lttng-dev@lists.lttng.org, "LKML", "Paul E. McKenney", "Peter Zijlstra", "Ingo Molnar",
> "Thomas Gleixner", "Steven Rostedt"
> Sent: Thursday, March 12, 2015 5:47:05 PM
> Subject: Re: Alternative to signals/sys_membarrier() in liburcu
>
> On Thu, Mar 12, 2015 at 1:53 PM, Mathieu Desnoyers wrote:
> >
> > So the question as it stands appears to be: would you be comfortable
> > having users abuse mprotect(), relying on its side-effect of issuing
> > a smp_mb() on each targeted CPU for the TLB shootdown, as
> > an effective implementation of process-wide memory barrier ?
>
> Be *very* careful.
>
> Just yesterday, in another thread (discussing the auto-numa TLB
> performance regression), we were discussing skipping the TLB
> invalidates entirely if the mprotect relaxes the protections.
>
> Because if you *used* to be read-only, and them mprotect() something
> so that it is read-write, there really is no need to send a TLB
> invalidate, at least on x86. You can just change the page tables, and
> *if* any entries are stale in the TLB they'll take a microfault on
> access and then just reload the TLB.
>
> So mprotect() to a more permissive mode is not necessarily serializing.

The idea here is to always mprotect() to a more restrictive mode,
which should trigger the TLB shootdown.

>
> Also, you need to make sure that your page is actually in memory,
> because otherwise the kernel may end up seeing "oh, it's not even
> present", and never flush the TLB at all.
>
> So now you need to mlock that page. Which can be problematic for non-root.

I'm aware the default amount of locked memory is usually quite low
(64kB here). So we'd need to handle cases where we run out of
locked memory. We could fall back to a slower userspace RCU scheme
if this occurs.

>
> In other words, I'd be a bit leery about it. There may be other
> gotcha's about it.
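
For illustration only, here is a minimal C sketch of the scheme
discussed above. It is not code from liburcu: the helper names
mprotect_barrier_init() and mprotect_barrier() are hypothetical, and
the sketch assumes Linux with MAP_ANONYMOUS available. It locks one
page, and each barrier call relaxes the protection, dirties the page
so it is present, then restricts the protection again, which is the
step expected to trigger the TLB shootdown.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helpers for illustration; not part of liburcu. */
static char *barrier_page;      /* page used only for its side effect */
static size_t barrier_page_len;

/*
 * Map and lock one page. Returns 0 on success, -1 on failure (for
 * example when RLIMIT_MEMLOCK is exhausted), in which case the caller
 * is expected to fall back to a slower scheme.
 */
static int mprotect_barrier_init(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *p;

    if (page_size <= 0)
        return -1;
    p = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    /* Keep the page resident so the kernel cannot treat it as absent. */
    if (mlock(p, (size_t)page_size) != 0) {
        munmap(p, (size_t)page_size);
        return -1;
    }
    barrier_page = p;
    barrier_page_len = (size_t)page_size;
    return 0;
}

/*
 * Attempt a process-wide barrier as a side effect of mprotect().
 * Returns 0 if all calls succeeded, -1 otherwise. Success of the
 * calls does not by itself prove that every CPU was interrupted.
 */
static int mprotect_barrier(void)
{
    if (barrier_page == NULL)
        return -1;
    /* Relax first, so that the last change is always a restriction. */
    if (mprotect(barrier_page, barrier_page_len,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    /* Dirty the page so a writable mapping is actually in use. */
    *(volatile char *)barrier_page = 1;
    /* Restrict: the transition expected to require a TLB shootdown. */
    if (mprotect(barrier_page, barrier_page_len, PROT_READ) != 0)
        return -1;
    return 0;
}
```

A caller would run mprotect_barrier_init() once and select the slower
scheme whenever it returns -1. Whether the restricting mprotect()
really implies a full barrier on every CPU running the process is
exactly the open question in this thread, so the sketch shows the
call sequence only, not a guarantee.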

Looking again at this old proposed patch
(https://lkml.org/lkml/2010/4/18/15), which adds a few memory barriers
around updates to mm_cpumask for sys_membarrier, makes me wonder
whether mprotect() might skip some CPUs from the mask that would
actually need to be taken care of, in very narrow race scenarios.

Thanks,

Mathieu

>
> Linus
>

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com