From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24FEEC0015E for ; Fri, 11 Aug 2023 23:34:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236924AbjHKXeZ (ORCPT ); Fri, 11 Aug 2023 19:34:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231685AbjHKXeU (ORCPT ); Fri, 11 Aug 2023 19:34:20 -0400 Received: from out-79.mta1.migadu.com (out-79.mta1.migadu.com [95.215.58.79]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A41710E6 for ; Fri, 11 Aug 2023 16:34:17 -0700 (PDT) Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1691796855; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XOAmUrlmN46mc8kLRjXg/L06bahnJsC47ASbtgox2n0=; b=SNHoCg5d13wmYFB3Vv9vgXrj8JvkuLaWsHbv6s1y5JjHbtBbtZKKPWBjqBxloswco9LOjr 7IndQQykzTcQQpn3uC1QjRHqlrTXkvucO5lPLo8m7b3mZ2jQMN1VzjNtU+WIrX7wbM+fQB 09xhUlEP0LxhAOA/EtAWr2ZHGMgogiM= Date: Fri, 11 Aug 2023 16:34:10 -0700 MIME-Version: 1.0 Subject: Re: [PATCH bpf-next] bpf: Support default .validate() and .update() behavior for struct_ops links Content-Language: en-US To: Kui-Feng Lee Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, tj@kernel.org, clm@meta.com, thinker.li@gmail.com, Stanislav Fomichev , David Vernet References: <20230810220456.521517-1-void@manifault.com> <20230810230141.GA529552@maniforge> <20230811201914.GD542801@maniforge> <03f9f9be-620d-a44d-d6a3-8b9084344db5@gmail.com> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Martin KaFai Lau In-Reply-To: <03f9f9be-620d-a44d-d6a3-8b9084344db5@gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Migadu-Flow: FLOW_OUT Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/11/23 4:12 PM, Kui-Feng Lee wrote: > > > On 8/11/23 15:49, Martin KaFai Lau wrote: >> On 8/11/23 1:19 PM, David Vernet wrote: >>> On Fri, Aug 11, 2023 at 10:35:03AM -0700, Martin KaFai Lau wrote: >>>> On 8/10/23 4:15 PM, Stanislav Fomichev wrote: >>>>> On 08/10, David Vernet wrote: >>>>>> On Thu, Aug 10, 2023 at 03:46:18PM -0700, Stanislav Fomichev wrote: >>>>>>> On 08/10, David Vernet wrote: >>>>>>>> Currently, if a struct_ops map is loaded with BPF_F_LINK, it must also >>>>>>>> define the .validate() and .update() callbacks in its corresponding >>>>>>>> struct bpf_struct_ops in the kernel. Enabling struct_ops link is useful >>>>>>>> in its own right to ensure that the map is unloaded if an application >>>>>>>> crashes. For example, with sched_ext, we want to automatically unload >>>>>>>> the host-wide scheduler if the application crashes. We would likely >>>>>>>> never support updating elements of a sched_ext struct_ops map, so we'd >>>>>>>> have to implement these callbacks showing that they _can't_ support >>>>>>>> element updates just to benefit from the basic lifetime management of >>>>>>>> struct_ops links. >>>>>>>> >>>>>>>> Let's enable struct_ops maps to work with BPF_F_LINK even if they >>>>>>>> haven't defined these callbacks, by assuming that a struct_ops map >>>>>>>> element cannot be updated by default. >>>>>>> >>>>>>> Any reason this is not part of sched_ext series? As you mention, >>>>>>> we don't seem to have such users in the three? >>>>>> >>>>>> Hi Stanislav, >>>>>> >>>>>> The sched_ext series [0] implements these callbacks. See >>>>>> bpf_scx_update() and bpf_scx_validate(). >>>>>> >>>>>> [0]: https://lore.kernel.org/all/20230711011412.100319-13-tj@kernel.org/ >>>>>> >>>>>> We could add this into that series and remove those callbacks, but this >>>>>> patch is fixing a UX / API issue with struct_ops links that's not really >>>>>> relevant to sched_ext. I don't think there's any reason to couple >>>>>> updating struct_ops map elements with allowing the kernel to manage the >>>>>> lifetime of struct_ops maps -- just because we only have 1 (non-test) >>>> >>>> Agree the link-update does not necessarily couple with link-creation, so >>>> removing 'link' update function enforcement is ok. The intention was to >>>> avoid the struct_ops link inconsistent experience (one struct_ops link >>>> support update and another struct_ops link does not) because consistency was >>>> one of the reason for the true kernel backed link support that Kui-Feng did. >>>> tcp-cc is the only one for now in struct_ops and it can support update, so >>>> the enforcement is here. I can see Stan's point that removing it now looks >>>> immature before a struct_ops landed in the kernel showing it does not make >>>> sense or very hard to support 'link' update. However, the scx patch set has >>>> shown this point, so I think it is good enough. >>> >>> Sorry for sending v2 of the patch a bit prematurely. Should have let you >>> weigh in first. >>> >>>> For 'validate', it is not related a 'link' update. It is for the struct_ops >>>> 'map' update. If the loaded struct_ops map is invalid, it will end up having >>>> a useless struct_ops map and no link can be created from it. I can see some >>> >>> To be honest I'm actually not sure I understand why .validate() is only >>> called for when BPF_F_LINK is specified. Is it because it could break >> >> Regardless '.validate' must be enforced or not, the ->validate() should be >> called for the non BPF_F_LINK case also during map update. This should be fixed. > > For the case of the TCP congestion control, its validation function is > called by the implementations of ->validate() and ->reg(). I mean it > expects ->reg() to do validation as well. Right, for tcp-cc, the reg is doing the validation because it is how the kernel tcp-cc module is done. For newer subsystem supporting struct_ops, it should expect the validation is done in the .validate alone.