From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1F42C4332F for ; Thu, 20 Oct 2022 00:45:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229644AbiJTApD (ORCPT ); Wed, 19 Oct 2022 20:45:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36792 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229731AbiJTApB (ORCPT ); Wed, 19 Oct 2022 20:45:01 -0400 Received: from mail-pl1-x641.google.com (mail-pl1-x641.google.com [IPv6:2607:f8b0:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7EB95140E68 for ; Wed, 19 Oct 2022 17:44:57 -0700 (PDT) Received: by mail-pl1-x641.google.com with SMTP id k9so18390464pll.11 for ; Wed, 19 Oct 2022 17:44:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=zPUUBhLAtaVS1RzfECTuosYpZGmg2E5OwbHWuXVh0BQ=; b=q0ogZExmhAPMxlyQhy4FG3Ddr7H+PQw3t9NlDCORQsz9CwQHGgYNvqxiPQNgljmHfL Zh6QKRCmxu64Npz1gn8oJqgK3EYJ1mEnoWD6A5KMPqqfWa8zdqqGdWON1PER0AGpMt7f cjUf2hcg7zV2yTkoilPSd4GsjEx/JvjkyqQc9p8B+m+chiuM66MovwbTht5uro9mZdkA mJHPdrEH4BTPYtVRfnqde94L98JrOMi7A8+hZ1OXy1O7NladveCSuwyzjXEu2O3/vdS3 eIVaoQHjIcRdgVLAbhG+4NOFKyU7FL72egboio1byfRU5wFP3RglAsdyNabCS0iE2DTd /iWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=zPUUBhLAtaVS1RzfECTuosYpZGmg2E5OwbHWuXVh0BQ=; b=dhUL5Mzur2Vf+EwUcItKITROnycc9tjozQBREJboxrmPJeR8073G4lo3vySLq6EEVo CS6fWSXh7p1NC0PFwYF2cxJOYSWfut3z6EDIVT0baMlrgJzDsByaVenaaZ5Fvplep3P0 xSLNi23HMC1ngNyLkUqiTJh/IQbciG6TM5h13014PjegGsao9IZ5iqHq0Ae0G54bxYL4 k3S1T4Tw7wRIyuB2jOO10Mn9X80I0w1BocOxtVYvTYrFZGZKECF+ukt3w3y8sxzWRXfe yOFhI8bGYBnoNqEoLvld1bt4EVKbz/Wi/ikOykixVPQukzPpswLItLUPsKAShzXuCqv2 0cgg== X-Gm-Message-State: ACrzQf2PYXBo2Rn1WvzXN8miUNHK4SXefS7eCsXIbHTKOYKUk6zMzlHT eptF1I4rDWFWyiFxk5zAKLE= X-Google-Smtp-Source: AMsMyM7r5D3jb2KmeCGFSWjXpGxrMpesN1nysOUxZM3YzTRI6gStoOzfhNz4cq8uSwIbd5cZtOdnog== X-Received: by 2002:a17:902:cec8:b0:185:505b:95da with SMTP id d8-20020a170902cec800b00185505b95damr11202125plg.83.1666226696889; Wed, 19 Oct 2022 17:44:56 -0700 (PDT) Received: from localhost ([14.96.13.220]) by smtp.gmail.com with ESMTPSA id b15-20020a17090a6e0f00b001ef81574355sm521914pjk.12.2022.10.19.17.44.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 19 Oct 2022 17:44:56 -0700 (PDT) Date: Thu, 20 Oct 2022 06:14:45 +0530 From: Kumar Kartikeya Dwivedi To: Alexei Starovoitov Cc: bpf , Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Dave Marchevsky , Delyan Kratunov Subject: Re: [PATCH bpf-next v2 19/25] bpf: Introduce bpf_kptr_new Message-ID: <20221020004445.zuwwtqe2qdk6ccfj@apollo> References: <20221013062303.896469-1-memxor@gmail.com> <20221013062303.896469-20-memxor@gmail.com> <20221019023124.47zzi3gs2zcdvxca@macbook-pro-4.dhcp.thefacebook.com> <20221019055834.ux5dfoot7hyuf4jk@apollo> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Wed, Oct 19, 2022 at 10:01:21PM IST, Alexei Starovoitov wrote: > On Tue, Oct 18, 2022 at 10:58 PM Kumar Kartikeya Dwivedi > wrote: > > > > On Wed, Oct 19, 2022 at 08:01:24AM IST, Alexei Starovoitov wrote: > > > On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote: > > > > +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign) > > > > +{ > > > > + struct btf_struct_meta *meta = meta__ign; > > > > + u64 size = local_type_id__k; > > > > + void *p; > > > > + > > > > + if (unlikely(flags || !bpf_global_ma_set)) > > > > + return NULL; > > > > > > Unused 'flags' looks weird in unstable api. Just drop it? > > > And keep it as: > > > void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign); > > > > > > and in bpf_experimental.h: > > > > > > extern void *bpf_kptr_new(__u64 local_type_id) __ksym; > > > > > > since __ign args are ignored during kfunc type match > > > the bpf progs can use it without #define. > > > > > > > It's ignored during check_kfunc_call, but libbpf doesn't ignore that. The > > prototypes will not be the same. I guess I'll have to teach it do that during > > type match, but IDK how you feel about that. > > libbpf does the full type match, really? > Could you point me to the code? > Not full type match, but the number of arguments must be same, so it won't allow having kfunc as: void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign); in the kernel and ksym declaration in the program as: extern void *bpf_kptr_new(__u64 local_type_id) __ksym; I get: libbpf: extern (func ksym) 'bpf_kptr_new_impl': func_proto [25] incompatible with kernel [60043] vlen of func_proto in kernel type is 2, for us it will be 1. > > Otherwise unless you want people to manually pass something to the ignored > > argument, we have to hide it behind a macro. > > > > I actually like the macro on top, then I don't even pass the type ID but the > > type. But that's a personal preference, and I don't feel strongly about it. > > > > So in C one does malloc(sizeof(*p)), here we'll just write > > bpf_kptr_new(typeof(*p)). YMMV. > > bpf_kptr_new(typeof(*p)) is cleaner. > So if we're having a macro anyway, the thing above can be hidden behind it. > > > > + p = bpf_mem_alloc(&bpf_global_ma, size); > > > > + if (!p) > > > > + return NULL; > > > > + if (meta) > > > > + bpf_obj_init(meta->off_arr, p); > > > > > > I'm starting to dislike all that _arr and _tab suffixes in the verifier code base. > > > It reminds me of programming style where people tried to add types into > > > variable names. imo dropping _arr wouldn't be just fine. > > > > Ack, I'll do it in v3. > > > > Also, I'd like to invite people to please bikeshed a bit over the naming of the > > APIs, e.g. whether it should be bpf_kptr_drop vs bpf_kptr_delete. > > bpf_kptr_drop is more precise. > delete assumes instant free which is not the case here. > > How about > extern void *__bpf_obj_new(__u64 local_type_id) __ksym; > extern void bpf_obj_drop(void *obj) __ksym; > #define bpf_obj_new(t) \ > (t *)__bpf_obj_new(bpf_core_type_id_local(t)); > > kptr means 'kernel pointer'. > Here we have program supplied object. > It feels 'obj' is better than 'kptr' in this context. > I agree, I'll rename it to bpf_obj_*. Also, that __bpf_obj_new doesn't work yet in clang [0] but I think we can rename it to that once clang is fixed. [0]: https://reviews.llvm.org/D136041 > > In the BPF list API, it's named bpf_list_del but it's actually distinct from how > > list_del in the kernel works. So it does make sense to give them a different > > name (like pop_front/pop_back and push_front/push_back)? > > > > Because even bpf_list_add takes bpf_list_head, in the kernel there's no > > distinction between node and head, so you can do list_add on a node as well, but > > it won't be possible with the kfunc (unless we overload the head argument to > > also work with nodes). > > > > Later we'll probably have to add bpf_list_node_add etc. that add before or after > > a node to make that work. > > > > The main question is whether it should closely resembly the linked list API in > > the kernel, or can it steer away considerably from that? > > If we do doubly linked list we should allow delete in > the middle with > bpf_list_del_any(head, node) > > and > bpf_list_pop_front/pop_back(head) > > bpf_list_add(node, head) would match kernel style, > but I think it's cleaner to have head as 1st arg. > In that sense new pop/push/_front/_back are cleaner. > And similar for rbtree. > > If we keep (node, head) and (rb_node, rb_root) order > we should keep kernel names. There's also tradeoffs in how various operations are done. Right now we have bpf_list_del and bpf_list_del_tail doing pop_front and pop_back. To replicate the same in kernel style API, you would do: struct foo *f = bpf_list_first_entry_or_null(head); if (!f) {} bpf_list_del(&f->node); Between those two calls, you might do bpf_list_last_entry -> bpf_list_del, the verifier is going to have a hard time proving the aliasing of the two nodes, so it will have to invalidate all pointers peeking into the list whenever something modifies it. You would still be able to access memory but cannot pass it to list ops anymore. But I think we will pay this cost eventually anyway, people will add a peek operation and such analysis would have to be done then. So I think it's unlikely we can avoid this, and it might be better to make things more consistent and end up mirroring the kernel list APIs. WDYT?