From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752509AbdJDORw (ORCPT ); Wed, 4 Oct 2017 10:17:52 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:46852 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752403AbdJDORt (ORCPT ); Wed, 4 Oct 2017 10:17:49 -0400 Date: Wed, 4 Oct 2017 07:17:45 -0700 From: "Paul E. McKenney" To: Steven Rostedt Cc: Peter Zijlstra , pmladek@suse.com, sergey.senozhatsky@gmail.com, linux-kernel@vger.kernel.org, mingo@kernel.org, tglx@linutronix.de Subject: Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Reply-To: paulmck@linux.vnet.ibm.com References: <20170928121823.430053219@infradead.org> <20170928122513.431444176@infradead.org> <20171003182422.025d0a67@gandalf.local.home> <20171004090830.p5kwzu6ex2wimh4v@hirez.programming.kicks-ass.net> <20171004090401.3a5123a6@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20171004090401.3a5123a6@gandalf.local.home> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 17100414-0056-0000-0000-000003D39F97 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007841; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000233; SDB=6.00926390; UDB=6.00466014; IPR=6.00706602; BA=6.00005620; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017389; XFM=3.00000015; UTC=2017-10-04 14:17:47 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17100414-0057-0000-0000-0000080AAF31 Message-Id: <20171004141745.GH3521@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-10-04_07:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1710040200 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 04, 2017 at 09:04:01AM -0400, Steven Rostedt wrote: > On Wed, 4 Oct 2017 11:08:30 +0200 > Peter Zijlstra wrote: > > > On Tue, Oct 03, 2017 at 06:24:22PM -0400, Steven Rostedt wrote: > > > On Thu, 28 Sep 2017 14:18:26 +0200 > > > Peter Zijlstra wrote: > > > > static int early_vprintk(const char *fmt, va_list args) > > > > { > > > > + int n, cpu, old; > > > > char buf[512]; > > > > + > > > > + cpu = get_cpu(); > > > > + /* > > > > + * Test-and-Set inter-cpu spinlock with recursion. > > > > + */ > > > > + for (;;) { > > > > + /* > > > > + * c-cas to avoid the exclusive bouncing on spin. > > > > + * Depends on the memory barrier implied by cmpxchg > > > > + * for ACQUIRE semantics. > > > > + */ > > > > + old = READ_ONCE(early_printk_cpu); > > > > + if (old == -1) { > > > > > > If old != -1 and old != cpu, is it possible that the CPU could have > > > fetched an old value, and never try to fetch it again? > > > > What? If old != -1 and old != cpu, we'll hit the cpu_relax() and do the > > READ_ONCE() again. The READ_ONCE() guarantees we'll do the load again, > > as does the barrier() implied by cpu_relax(). > > I'm more worried about other architectures that don't have as strong of > a cache coherency. > > [ Added Paul as he knows a lot about odd architectures ] > > Is there any architecture that we support that can have the following: > > CPU0 CPU1 > ---- ---- > early_printk_cpu = 1 > for (;;) > old = READ_ONCE(early_printk_cpu); > [ old = 1 ] > > early_printk_cpu = -1 > > [...] > cpu_relax(); > old = READ_ONCE(early_printk_cpu); > > [ but the CPU uses the cache and not the memory? ] > > old = 1; If you use READ_ONCE(), then all architectures I know of enforce full ordering for accesses to a single variable. (If you don't use READ_ONCE(), then in theory Itanium can reorder reads.) Me, I would argue for WRITE_ONCE() as well to prevent store tearing. It is only when you have at least two variables and at least two threads than things start getting really "interesting". ;-) Thanx, Paul > -- Steve > > > > > > > The cmpxchg memory barrier only happens when old == -1. > > > > Yeah, so? > > > > > > + old = cmpxchg(&early_printk_cpu, -1, cpu); > > > > + if (old == -1) > > > > + break; > > > > + } > > > > + /* > > > > + * Allow recursion for interrupts and the like. > > > > + */ > > > > + if (old == cpu) > > > > + break; > > > > + > > > > + cpu_relax(); > > > > + } > > > > > > > > n = vscnprintf(buf, sizeof(buf), fmt, args); > > > > early_console->write(early_console, buf, n); > > > > > > > > + /* > > > > + * Unlock -- in case @old == @cpu, this is a no-op. > > > > + */ > > > > + smp_store_release(&early_printk_cpu, old); > > > > + put_cpu(); > > > > + > > > > return n; > > > > } >