From: "Paul E. McKenney"
Subject: Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command
Date: Tue, 10 Jul 2018 10:42:25 -0700
Message-ID: <20180710174225.GA3593@linux.vnet.ibm.com>
In-Reply-To: <20180710172957.GA103636@joelaf.mtv.corp.google.com>
References: <20180707015616.25988-1-dancol@google.com>
 <20180707025426.ssxipi7hsehoiuyo@ast-mbp.dhcp.thefacebook.com>
 <20180707203340.GA74719@joelaf.mtv.corp.google.com>
 <951478560.1636.1531083278064.JavaMail.zimbra@efficios.com>
 <20180710051347.GA180724@joelaf.mtv.corp.google.com>
 <20180710164212.GY3593@linux.vnet.ibm.com>
 <20180710165744.GA99146@joelaf.mtv.corp.google.com>
 <20180710171229.GZ3593@linux.vnet.ibm.com>
 <20180710172957.GA103636@joelaf.mtv.corp.google.com>
Reply-To: paulmck@linux.vnet.ibm.com
To: Joel Fernandes
Cc: Joel Fernandes, Mathieu Desnoyers, Alexei Starovoitov,
 Daniel Colascione, Alexei Starovoitov, linux-kernel, Tim Murray,
 Daniel Borkmann, netdev, fengc@google.com

On Tue, Jul 10, 2018 at 10:29:57AM -0700, Joel Fernandes wrote:
> On Tue, Jul 10, 2018 at 10:12:29AM -0700, Paul E. McKenney wrote:
> [..]
> > > > > The other question I have is about the whole "nohz-full doesn't work" thing.
> > > > > I didn't fully understand why. RCU is already tracking the state of nohz-full
> > > > > CPUs because the rcu dynticks code in (kernel/rcu/tree.c) monitors
> > > > > transitions to and from usermode even if the timer tick is turned off. So why
> > > > > would it not work?
> > > >
> > > > In the nohz_full case, there is no need for sys_membarrier()'s call to
> > > > synchronize_sched() to interact directly with the nohz_full CPU.  It
> > > > can instead look at the target CPU's dyntick-idle state, and that state
> > > > would potentially have been set in the dim distant past, thus having
> > > > no effect on the target CPU's current execution.
> > >
> > > In nohz-idle case though, there's nothing to promote the barrier() to
> > > smp_mb() if you were to purely look at the dynticks-idle state on the
> > > nohz-full CPU executing in user mode?
> > >
> > > So then it makes sense to me now that nohz-full needs something to IPI that
> > > CPU in order to enforce the needed memory barrier and pure synchronize_sched()
> > > wouldn't work. So then makes me think the expedited versions of
> > > synchronize_sched should be able to do the job but I could be off on a different
> > > track..
> >
> > The problem is that the expedited versions also check the dyntick-idle
> > state and don't touch idle (or nohz_full usermode) CPUs.  This is by
> > design for the battery-powered embedded use case.  ;-)
>
> Oh ok! ;)
>
> I guess there's also a MEMBARRIER_CMD_GLOBAL_EXPEDITED which seems to IPI
> CPUs (I'm guessing regardless of dynticks state) and execute smp_mb within
> the IPI so userspace can fall back to using that in case MEMBARRIER_CMD_GLOBAL
> returns -EINVAL.

Yes, and this avoids IPIing idle CPUs via the ->mm checks.  But it will
IPI nohz_full CPUs in that same process, as it must for correctness.

							Thanx, Paul
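
For readers following along, the userspace fallback Joel describes might look
roughly like the sketch below.  This is only an illustration under stated
assumptions, not code from this thread: global_barrier_with_fallback(),
global_barrier(), sys_membarrier() and the fake_nohz_full_membarrier() stub
are hypothetical names.  The stub stands in for a kernel booted with
nohz_full, where MEMBARRIER_CMD_GLOBAL fails with EINVAL, so the fallback
path can be exercised on an ordinary machine.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Signature shared by the real syscall wrapper and the stub below. */
typedef int (*membarrier_fn)(int cmd, unsigned int flags);

/* Thin wrapper around the raw system call via syscall(2). */
int sys_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Hypothetical helper: try MEMBARRIER_CMD_GLOBAL first and fall back to
 * MEMBARRIER_CMD_GLOBAL_EXPEDITED only when the kernel rejects the
 * former with EINVAL.  Returns 0 on success, -1 with errno set on failure.
 */
int global_barrier_with_fallback(membarrier_fn mb)
{
	if (mb(MEMBARRIER_CMD_GLOBAL, 0) == 0)
		return 0;
	if (errno != EINVAL)
		return -1;	/* e.g. ENOSYS: no membarrier support at all */
	return mb(MEMBARRIER_CMD_GLOBAL_EXPEDITED, 0);
}

/* Convenience entry point that uses the real system call. */
int global_barrier(void)
{
	return global_barrier_with_fallback(sys_membarrier);
}

/*
 * Illustrative stub (an assumption, for demonstration only): behaves
 * like a nohz_full kernel by rejecting MEMBARRIER_CMD_GLOBAL with
 * EINVAL and accepting every other command.  Records the last command.
 */
int last_cmd_seen;

int fake_nohz_full_membarrier(int cmd, unsigned int flags)
{
	(void)flags;
	last_cmd_seen = cmd;
	if (cmd == MEMBARRIER_CMD_GLOBAL) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```

Calling global_barrier_with_fallback(fake_nohz_full_membarrier) takes the
EINVAL path and ends up issuing MEMBARRIER_CMD_GLOBAL_EXPEDITED, while
global_barrier() does the same against the running kernel.  One caveat
from the membarrier(2) man page: the expedited global command only
targets processes that have previously registered with
MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED, so the processes that need to
observe the barrier would have to register first.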