From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753259AbbCQMqq (ORCPT ); Tue, 17 Mar 2015 08:46:46 -0400
Received: from mail.efficios.com ([78.47.125.74]:60190 "EHLO mail.efficios.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752297AbbCQMqo (ORCPT ); Tue, 17 Mar 2015 08:46:44 -0400
Date: Tue, 17 Mar 2015 12:46:41 +0000 (UTC)
From: Mathieu Desnoyers
To: Steven Rostedt
Cc: Peter Zijlstra, linux-kernel@vger.kernel.org, KOSAKI Motohiro,
	"Paul E. McKenney", Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Andrew Morton,
	Josh Triplett, Thomas Gleixner, David Howells
Message-ID: <894387964.19110.1426596401635.JavaMail.zimbra@efficios.com>
In-Reply-To: <20150316222611.782cc0e4@grimm.local.home>
References: <1426447459-28620-1-git-send-email-mathieu.desnoyers@efficios.com>
	<20150316141939.GE21418@twins.programming.kicks-ass.net>
	<1203077851.9491.1426520636551.JavaMail.zimbra@efficios.com>
	<20150316172104.GH21418@twins.programming.kicks-ass.net>
	<1003922584.10662.1426532015839.JavaMail.zimbra@efficios.com>
	<20150316205435.GJ21418@twins.programming.kicks-ass.net>
	<910572156.13900.1426556725438.JavaMail.zimbra@efficios.com>
	<20150316222611.782cc0e4@grimm.local.home>
Subject: Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier (x86) (v12)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [192.222.176.50]
X-Mailer: Zimbra 8.0.7_GA_6021 (ZimbraWebClient - FF36 (Linux)/8.0.7_GA_6021)
Thread-Topic: sys_membarrier(): system/process-wide memory barrier (x86) (v12)
Thread-Index: GWJBx9eaAa6t1ASFLt7JUO5i4rlp4g==
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

----- Original Message -----
>
> [ Removed npiggen@kernel.dk as I keep getting bounces from that addr ]

Yep, me too.
However, this is the address that shows up in the MAINTAINERS file. Weird.

>
> On Tue, 17 Mar 2015 01:45:25 +0000 (UTC)
> Mathieu Desnoyers wrote:
> [...]
>
> Can you please fix your mail client to not include the entire header in
> your replies please.

Done, thanks for pointing it out!

>
> > Let's consider the following memory barrier scenario performed in
> > user-space on an architecture with very relaxed ordering. PowerPC comes
> > to mind.
> >
> > https://lwn.net/Articles/573436/
> > scenario 12:
> >
> > CPU 0                   CPU 1
> > CAO(x) = 1;             r3 = CAO(y);
> > cmm_smp_wmb();          cmm_smp_rmb();
> > CAO(y) = 1;             r4 = CAO(x);
> >
> > BUG_ON(r3 == 1 && r4 == 0)
> >
> >
> > We tweak it to use sys_membarrier on CPU 1, and a simple compiler
> > barrier() on CPU 0:
> >
> > CPU 0                   CPU 1
> > CAO(x) = 1;             r3 = CAO(y);
> > barrier();              sys_membarrier();
> > CAO(y) = 1;             r4 = CAO(x);
> >
> > BUG_ON(r3 == 1 && r4 == 0)
> >
> > Now if CPU 1 executes sys_membarrier while CPU 0 is preempted after both
> > stores, we have:
> >
> > CPU 0                           CPU 1
> > CAO(x) = 1;
> >   [1st store is slow to
> >    reach other cores]
> > CAO(y) = 1;
> >   [2nd store reaches other
> >    cores more quickly]
> > [preempted]
> >                                 r3 = CAO(y)
> >                                   (may see y = 1)
> >                                 sys_membarrier()
> > Scheduler changes rq->curr.
> >                                 skips CPU 0, because rq->curr has
> >                                   been updated.
> >                                 [return to userspace]
> >                                 r4 = CAO(x)
> >                                   (may see x = 0)
> >                                 BUG_ON(r3 == 1 && r4 == 0) -> fails.
> > load_cr3, with implied
> >   memory barrier, comes
> >   after CPU 1 has read "x".
> >
> > The only way to make this scenario work is if a memory barrier is added
> > before updating rq->curr. (we could also do a similar scenario for the
> > needed barrier after store to rq->curr).
>
> Hmm, I wonder if anything were to break if rq->curr was updated after
> the context_switch() call?
>
> Would that help?
>
>     this_cpu_write(saved_next, next);
>     rq = context_switch(rq, prev, next);
>     rq->curr = this_cpu_read(saved_next);

Assuming there is a full memory barrier (e.g.
load_cr3) within context_switch, it would help for ordering memory accesses
that are performed prior to the preemption, but not for memory accesses
to be performed immediately after return to userspace from preemption.

Thanks,

Mathieu

>
> As I recently found out that this_cpu_read/write() is not that nice on
> all architectures, something else may need to be updated. Or we can add
> a temp variable on the rq.
>
>     rq->saved_next = next;
>     rq = context_switch(rq, prev, next);
>     rq->curr = rq->saved_next;
>
> -- Steve
>
>

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com