From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751210AbdGZPlQ (ORCPT ); Wed, 26 Jul 2017 11:41:16 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:49873 "EHLO
 mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750867AbdGZPlP (ORCPT ); Wed, 26 Jul 2017 11:41:15 -0400
Date: Wed, 26 Jul 2017 08:41:10 -0700
From: "Paul E. McKenney"
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, jiangshanlai@gmail.com,
 dipankar@in.ibm.com, akpm@linux-foundation.org,
 mathieu.desnoyers@efficios.com, josh@joshtriplett.org, tglx@linutronix.de,
 rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
 fweisbec@gmail.com, oleg@redhat.com, will.deacon@arm.com
Subject: Re: [PATCH tip/core/rcu 4/5] sys_membarrier: Add expedited option
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170725164900.GR3730@linux.vnet.ibm.com>
 <20170725165957.alykngbnrrwn3onw@hirez.programming.kicks-ass.net>
 <20170725171701.GS3730@linux.vnet.ibm.com>
 <20170725185320.uis4hxqaqlx7y7gp@hirez.programming.kicks-ass.net>
 <20170725193612.GW3730@linux.vnet.ibm.com>
 <20170725202451.GC28975@worktop>
 <20170725211926.GA3730@linux.vnet.ibm.com>
 <20170725215510.GD28975@worktop>
 <20170725235936.GC3730@linux.vnet.ibm.com>
 <20170726074128.ybb3e4flnjkrpi74@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170726074128.ybb3e4flnjkrpi74@hirez.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-GCONF: 00
x-cbid: 17072615-0040-0000-0000-00000385F043
X-IBM-SpamModules-Scores:
X-IBM-SpamModules-Versions: BY=3.00007429; HX=3.00000241; KW=3.00000007;
 PH=3.00000004; SC=3.00000214; SDB=6.00893211; UDB=6.00446525;
 IPR=6.00673375; BA=6.00005492; NDR=6.00000001; ZLA=6.00000005;
 ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000;
 ZU=6.00000002; MB=3.00016390;
 XFM=3.00000015; UTC=2017-07-26 15:41:13
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 17072615-0041-0000-0000-0000077A0F53
Message-Id: <20170726154110.GN3730@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,,
 definitions=2017-07-26_06:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 spamscore=0 suspectscore=3 malwarescore=0 phishscore=0 adultscore=0
 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.0.1-1706020000 definitions=main-1707260225
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 26, 2017 at 09:41:28AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 25, 2017 at 04:59:36PM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 25, 2017 at 11:55:10PM +0200, Peter Zijlstra wrote:
>
> > > People always do crazy stuff, but what surprised me is that such a patch
> > > got merged in urcu even though it's known broken for a number of
> > > architectures.
> >
> > It did not get merged into urcu. It is instead used directly by a
> > number of people for a number of concurrent algorithms.
>
> Yah, Mathieu also already pointed that out. It seems I really cannot
> deal with github well -- that website always terminally confuses me.
>
> > > > But it would not be hard for userspace code to force IPIs by repeatedly
> > > > awakening higher-priority threads that sleep immediately after being
> > > > awakened, right?
> > >
> > > RT tasks are not readily available to !root, and the user might have
> > > been constrained to a subset of available CPUs.
> >
> > So non-idle non-nohz CPUs never get IPIed for wakeups of SCHED_OTHER
> > threads?
>
> Sure, but SCHED_OTHER auto throttles in that if there's anything else to
> run, you get to wait. So you can't generate an IPI storm with it. Also,
> again, we can be limited to a subset of CPUs.

OK, what is its auto-throttle policy? One round of IPIs per jiffy or
some such?

Does this auto-throttling also apply if the user is running a CPU-bound
SCHED_BATCH or SCHED_IDLE task on each CPU, and periodically waking up
one of a large group of SCHED_OTHER tasks, where the SCHED_OTHER tasks
immediately sleep upon being awakened?

> > > My thinking was that if we observe '!= mm' that CPU will have to do a
> > > context switch in order to make it true. That context switch will
> > > provide the ordering we're after so all is well.
> > >
> > > Quite possible there's a hole in it, but since I'm running on fumes someone
> > > needs to spell it out for me :-)
> >
> > This would be the https://marc.info/?l=linux-kernel&m=126349766324224&w=2
> > URL below.
> >
> > Which might or might not still be applicable.
>
> I think we actually have those two smp_mb()'s around the rq->curr
> assignment.
>
> we have smp_mb__before_spinlock(), which per the argument here:
>
>   https://lkml.kernel.org/r/20170607162013.755917928@infradead.org
>
> is actually a full MB, irrespective of that weird smp_wmb() definition
> we have now. And we have switch_mm() on the other side.

OK, and the rq->curr assignment is in common code, correct? Does this
allow the IPI-only-requesting-process approach to live entirely within
common code?

The 2010 email thread ended up with sys_membarrier() acquiring the
runqueue lock for each CPU, because doing otherwise meant adding code
to the scheduler fastpath. Don't we still need to do this?

https://marc.info/?l=linux-kernel&m=126341138408407&w=2
https://marc.info/?l=linux-kernel&m=126349766324224&w=2

> > > > I was intending to base this on the last few versions of a 2010 patch,
> > > > but maybe things have changed:
> > > >
> > > >   https://marc.info/?l=linux-kernel&m=126358017229620&w=2
> > > >   https://marc.info/?l=linux-kernel&m=126436996014016&w=2
> > > >   https://marc.info/?l=linux-kernel&m=126601479802978&w=2
> > > >   https://marc.info/?l=linux-kernel&m=126970692903302&w=2
> > > >
> > > > Discussion here:
> > > >
> > > >   https://marc.info/?l=linux-kernel&m=126349766324224&w=2
> > > >
> > > > The discussion led to acquiring the runqueue locks, as there was
> > > > otherwise a need to add code to the scheduler fastpaths.
> > >
> > > TL;DR.. that's far too much to trawl through.
> >
> > So we re-derive it from first principles instead? ;-)
>
> Yep, that's what I usually do anyway, who knows what kind of crazy our
> younger selves were up to ;-)

In my experience, it ends up being a type of crazy worth ignoring only
if I don't ignore it. ;-)

Thanx, Paul