Date: Wed, 30 Jan 2019 10:36:18 -0800
From: "Paul E. McKenney"
To: Will Deacon
Cc: Alexei Starovoitov, Peter Zijlstra, Alexei Starovoitov,
	davem@davemloft.net, daniel@iogearbox.net,
	jakub.kicinski@netronome.com, netdev@vger.kernel.org,
	kernel-team@fb.com, mingo@redhat.com, jannh@google.com
Subject: Re: bpf memory model. Was: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin_lock
Reply-To: paulmck@linux.ibm.com
Message-Id: <20190130183618.GX4240@linux.ibm.com>
In-Reply-To: <20190130181100.GA18558@fuggles.cambridge.arm.com>

On Wed, Jan 30, 2019 at 06:11:00PM +0000, Will Deacon wrote:
> Hi Alexei,
>
> On Mon, Jan 28, 2019 at 01:56:24PM -0800, Alexei Starovoitov wrote:
> > On Mon, Jan 28, 2019 at 10:24:08AM +0100, Peter Zijlstra
> > wrote:
> > > On Fri, Jan 25, 2019 at 04:17:26PM -0800, Alexei Starovoitov wrote:
> > > > What I want to avoid is to define the whole execution ordering model upfront.
> > > > We cannot say that BPF ISA is weakly ordered like alpha.
> > > > Most of the bpf progs are written and running on x86. We shouldn't
> > > > twist bpf developer's arm by artificially relaxing memory model.
> > > > BPF memory model is equal to memory model of underlying architecture.
> > > > What we can do is to make bpf progs a bit more portable with
> > > > smp_rmb instructions, but we must not force weak execution on the developer.
> > > >
> > > Well, I agree with only introducing bits you actually need, and my
> > > smp_rmb() example might have been poorly chosen, smp_load_acquire() /
> > > smp_store_release() might have been a far more useful example.
> > >
> > > But I disagree with the last part; we have to pick a model now;
> > > otherwise you'll paint yourself into a corner.
> > >
> > > Also; Alpha isn't very relevant these days; however ARM64 does seem to
> > > be gaining a lot of attention and that is very much a weak architecture.
> > > Adding strongly ordered assumptions to BPF now, will penalize them in
> > > the long run.
> >
> > arm64 is gaining attention just like riscV is gaining it too.
> > BPF jit for arm64 is very solid, while BPF jit for riscV is being worked on.
> > BPF is not picking sides in CPU HW and ISA battles.
>
> It's not about picking a side, it's about providing an abstraction of the
> various CPU architectures out there so that the programmer doesn't need to
> worry about where their program may run. Hell, even if you just said "eBPF
> follows x86 semantics" that would be better than saying nothing (and then we
> could have a discussion about whether x86 semantics are really what you
> want).

To reinforce this point, the Linux-kernel memory model (tools/memory-model)
is that abstraction for the Linux kernel.  Why not just use that for BPF?
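
For concreteness, here is a minimal userspace sketch of the message-passing
pattern that smp_store_release() / smp_load_acquire() express in the kernel.
It is an illustration under stated assumptions, not a definition of BPF
semantics: C11 atomics and pthreads stand in for the kernel primitives, and
the names payload, ready, and run_message_passing() are made up for this
example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, published through the flag */
static atomic_int ready;   /* publication flag (hypothetical name) */

/* Writer: plain store to the data, then a release store to the flag. */
static void *producer(void *arg)
{
	(void)arg;
	payload = 42;
	atomic_store_explicit(&ready, 1, memory_order_release);
	return NULL;
}

/* Reader: spin on an acquire load of the flag, then read the data. */
static void *consumer(void *arg)
{
	int *out = arg;

	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;	/* wait until the flag is published */
	*out = payload;
	return NULL;
}

/* Runs one writer and one reader; returns the value the reader saw. */
int run_message_passing(void)
{
	pthread_t p, c;
	int seen = 0;
	int rc;

	payload = 0;
	atomic_store(&ready, 0);

	rc = pthread_create(&c, NULL, consumer, &seen);
	assert(rc == 0);
	rc = pthread_create(&p, NULL, producer, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

Under the C11 model, run_message_passing() returns 42 regardless of the
underlying CPU, because the acquire load that observes ready == 1 is ordered
after the release store that published payload. That architecture-independent
answer is the sort of thing an abstract model gives the programmer.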

							Thanx, Paul

> > Memory model is CPU HW design decision. BPF ISA cannot dictate HW design.
> > We're not saying today that BPF is strongly ordered.
> > BPF load/stores are behaving differently on x86 vs arm64.
> > We can add new instructions, but we cannot 'define' how load/stores behave
> > from memory model perspective.
> > For example, take atomicity of single byte load/store.
> > Not all archs have them atomic, but we cannot say to bpf programmers
> > to always assume non-atomic byte loads.
>
> Hmm, I don't think this is a good approach to take for the future of eBPF.
> Assuming that a desirable property of an eBPF program is portability between
> CPU architectures, then you're effectively forcing the programmer to "assume
> the worst", where the worst is almost certainly unusable for practical
> purposes.
>
> One easy thing you could consider would be to allow tagging of an eBPF
> program with its supported target architectures (the JIT will refuse to
> accept it for other architectures). This would at least remove the
> illusion of portability and force the programmer to be explicit.
>
> However, I think we'd be much better off if we defined some basic ordering
> primitives such as relaxed and RCpc-style acquire/release operations
> (including atomic RmW), single-copy atomic memory accesses up to the native
> machine size and a full-fence instruction. If your program uses something
> that the underlying arch doesn't support, then it is rejected (e.g. 64-bit
> READ_ONCE on a 32-bit arch).
>
> That should map straightforwardly to all modern architectures and allow for
> efficient codegen on x86 and arm64. It would probably require a bunch of new
> BPF instructions that would be defined to be atomic (you already have XADD
> as a relaxed atomic add).
>
> Apologies if this sounds patronising, but I'm keen to help figure out the
> semantics *now* so that we don't end up having to infer them later on, which
> is the usual painful case for memory models.
> I suspect Peter and Paul would
> also prefer to attack it that way around. I appreciate that the temptation
> is to avoid the problem by deferring to the underlying hardware memory
> model, but I think that will create more problems than it solves and we're
> here to help you get this right.
>
> Will
>
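
Two of the primitives Will lists, a relaxed atomic RmW (what XADD provides)
and a full fence, can be sketched in userspace C as below. This is only a
rough illustration: it assumes C11 atomics as an approximation, with
atomic_thread_fence(memory_order_seq_cst) standing in for a full-fence
instruction, and every identifier (counter, x, y, run_relaxed_counter(),
run_store_buffering()) is hypothetical rather than a proposed BPF interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define ITERS 100000

static atomic_long counter;   /* bumped with relaxed atomic adds */
static atomic_int x, y;       /* store-buffering variables */
static int r0, r1;            /* per-thread results, read after join */

/* Relaxed RmW: each add is atomic, but orders nothing else around it. */
static void *adder(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERS; i++)
		atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
	return NULL;
}

/* Store-buffering, thread 0: store x, full fence, load y. */
static void *sb0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	return NULL;
}

/* Store-buffering, thread 1: store y, full fence, load x. */
static void *sb1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/* Run two thread bodies concurrently and wait for both to finish. */
static void run_pair(void *(*f0)(void *), void *(*f1)(void *))
{
	pthread_t a, b;
	int rc;

	rc = pthread_create(&a, NULL, f0, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, f1, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_join(a, NULL);
	pthread_join(b, NULL);
}

/* No increments are lost even though the adds are unordered. */
long run_relaxed_counter(void)
{
	atomic_store(&counter, 0);
	run_pair(adder, adder);
	return atomic_load(&counter);
}

/* With both fences in place, r0 == 0 && r1 == 0 cannot happen. */
int run_store_buffering(void)
{
	atomic_store(&x, 0);
	atomic_store(&y, 0);
	r0 = r1 = 0;
	run_pair(sb0, sb1);
	return r0 + r1;
}
```

run_relaxed_counter() returns 2 * ITERS because atomicity alone is enough for
a counter. run_store_buffering() returns at least 1 because the two full
fences forbid the outcome in which both threads read 0. Remove the fences and
that outcome becomes permitted, on x86 as well as on arm64, which is why the
fence has to be part of the defined instruction set rather than inferred from
the hardware.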