From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 2 Sep 2016 11:47:59 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Alan Stern
Cc: Peter Zijlstra, Ingo Molnar, Felipe Balbi, USB list,
	Kernel development list
Subject: Re: Memory barrier needed with wake_up_process()?
Message-Id: <20160902184759.GB3663@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 02, 2016 at 02:10:13PM
-0400, Alan Stern wrote:
> Paul, Peter, and Ingo:
> 
> This must have come up before, but I don't know what was decided.
> 
> Isn't it often true that a memory barrier is needed before a call to
> wake_up_process()?  A typical scenario might look like this:
> 
> 	CPU 0
> 	-----
> 	for (;;) {
> 		set_current_state(TASK_INTERRUPTIBLE);
> 		if (signal_pending(current))
> 			break;
> 		if (wakeup_flag)
> 			break;
> 		schedule();
> 	}
> 	__set_current_state(TASK_RUNNING);
> 	wakeup_flag = 0;
> 
> 	CPU 1
> 	-----
> 	wakeup_flag = 1;
> 	wake_up_process(my_task);
> 
> The underlying pattern is:
> 
> 	CPU 0				CPU 1
> 	-----				-----
> 	write current->state		write wakeup_flag
> 	smp_mb();
> 	read wakeup_flag		read my_task->state
> 
> where set_current_state() does the write to current->state and
> automatically adds the smp_mb(), and wake_up_process() reads
> my_task->state to see whether the task needs to be woken up.
> 
> The kerneldoc for wake_up_process() says that it has no implied memory
> barrier if it doesn't actually wake anything up.  And even when it
> does, the implied barrier is only smp_wmb, not smp_mb.
> 
> This is the so-called SB (Store Buffer) pattern, which is well known to
> require a full smp_mb on both sides.  Since wake_up_process() doesn't
> include smp_mb(), isn't it correct that the caller must add it
> explicitly?
> 
> In other words, shouldn't the code for CPU 1 really be:
> 
> 	wakeup_flag = 1;
> 	smp_mb();
> 	wake_up_process(task);
> 
> If my reasoning is correct, then why doesn't wake_up_process() include
> this memory barrier automatically, the way set_current_state() does?
> There could be an alternate version (__wake_up_process()) which omits
> the barrier, just like __set_current_state().

A common case uses locking, in which case additional memory barriers
inside of the wait/wakeup functions are not needed.
Any accesses made while holding the lock before invoking the wakeup
function (e.g., wake_up()) are guaranteed to be seen after acquiring
that same lock following return from the wait function (e.g.,
wait_event()).  In this case, adding barriers to the wait and wakeup
functions would just add overhead.

But yes, this decision does mean that people using the wait/wakeup
functions without locking need to be more careful.  Something like
this:

	/* prior accesses. */
	smp_mb();
	wakeup_flag = 1;
	wake_up(...);

And on the other task:

	wait_event(... wakeup_flag == 1 ...);
	smp_mb();
	/* The waker's prior accesses will be visible here. */

Or am I missing your point?

							Thanx, Paul