From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 112FDC00140 for ; Wed, 10 Aug 2022 08:12:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:Content-Type: Content-Transfer-Encoding:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:From:References:CC:To:Subject: MIME-Version:Date:Message-ID:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=CgP3oRKkfNxS2Zh2j45Fy6pd1QbGRfLUYhtJq33okuw=; b=0hmqJ+GULoKpVz VvbjZ/kR1Ko4Ln89Ou9Afzu1/0DPagDSJPAK3TiO6PpZ71qXo9thhDTJNs1PankDNXxw/iuqmgLN1 lk7LIQGaxNcFMd2njooXwwhceq1EDK2K/1UnbSEUctKP9iFaYd3ZPMItM0K6XwWiWA0GIW5vbb7xn 4klSlN7HAP3/z7xrb2KTdeGB/73101z/iuqVYppgk5f82G0RFs3NXxYkS5yhcwArxKvRgAGGYLruj 0fbrmTk7A5D0Ue42MIAf5mW9t8scjeB1UtwleAc2DOPVLRfUVTwhKU76IfHQoy/TgWeAgq699153w 5zTXQXUZXJxpPh7Nch4g==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oLgns-00AdcA-6x; Wed, 10 Aug 2022 08:11:00 +0000 Received: from szxga02-in.huawei.com ([45.249.212.188]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oLgnd-00AdNF-Lq for linux-arm-kernel@lists.infradead.org; Wed, 10 Aug 2022 08:10:56 +0000 Received: from kwepemi500013.china.huawei.com (unknown [172.30.72.54]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4M2jGW6VKgzfZ4q; Wed, 10 Aug 2022 16:07:31 +0800 (CST) Received: from [10.67.111.192] (10.67.111.192) by kwepemi500013.china.huawei.com (7.221.188.120) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Wed, 10 Aug 2022 16:10:23 +0800 Message-ID: Date: Wed, 10 Aug 2022 16:10:23 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH bpf-next v5 1/6] arm64: ftrace: Add ftrace direct call support Content-Language: en-US To: Florent Revest CC: Mark Rutland , , , , , , Catalin Marinas , Will Deacon , Steven Rostedt , Ingo Molnar , Daniel Borkmann , Alexei Starovoitov , Zi Shen Lim , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , "David S . Miller" , Hideaki YOSHIFUJI , David Ahern , Thomas Gleixner , Borislav Petkov , Dave Hansen , , , Shuah Khan , Jakub Kicinski , Jesper Dangaard Brouer , Pasha Tatashin , Ard Biesheuvel , Daniel Kiss , Steven Price , Sudeep Holla , Marc Zyngier , Peter Collingbourne , Mark Brown , Delyan Kratunov , Kumar Kartikeya Dwivedi , Wang ShaoBo , , , , References: <0f8fe661-c450-ccd8-761f-dbfff449c533@huawei.com> <40fda0b0-0efc-ea1b-96d5-e51a4d1593dd@huawei.com> <55c1b9d6-1d53-9752-fb03-00f60ed15db7@huawei.com> From: Xu Kuohai In-Reply-To: X-Originating-IP: [10.67.111.192] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemi500013.china.huawei.com (7.221.188.120) X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220810_011046_166153_47A12951 X-CRM114-Status: GOOD ( 49.12 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 8/10/2022 1:03 AM, Florent Revest wrote: > On Thu, Jun 9, 2022 at 6:27 AM Xu Kuohai wrote: >> On 6/7/2022 12:35 AM, Mark Rutland wrote: >>> On Thu, May 26, 2022 at 10:48:05PM +0800, Xu Kuohai wrote: >>>> On 5/26/2022 6:06 PM, Mark Rutland wrote: >>>>> On Thu, May 26, 2022 at 05:45:03PM +0800, Xu Kuohai wrote: >>>>>> On 5/25/2022 9:38 PM, Mark Rutland wrote: >>>>>>> On Wed, May 18, 2022 at 09:16:33AM -0400, Xu Kuohai wrote: >>>>>>>> As noted in that thread, I have a few concerns which equally apply here: >>>>>>> >>>>>>> * Due to the limited range of BL instructions, it's not always possible to >>>>>>> patch an ftrace call-site to branch to an arbitrary trampoline. The way this >>>>>>> works for ftrace today relies upon knowingthe set of trampolines at >>>>>>> compile-time, and allocating module PLTs for those, and that approach cannot >>>>>>> work reliably for dynanically allocated trampolines. >>>>>> >>>>>> Currently patch 5 returns -ENOTSUPP when long jump is detected, so no >>>>>> bpf trampoline is constructed for out of range patch-site: >>>>>> >>>>>> if (is_long_jump(orig_call, image)) >>>>>> return -ENOTSUPP; >>>>> >>>>> Sure, my point is that in practice that means that (from the user's PoV) this >>>>> may randomly fail to work, and I'd like something that we can ensure works >>>>> consistently. >>>>> >>>> >>>> OK, should I suspend this work until you finish refactoring ftrace? >>> >>> Yes; I'd appreciate if we could hold on this for a bit. >>> >>> I think with some ground work we can avoid most of the painful edge cases and >>> might be able to avoid the need for custom trampolines. >>> >> >> I'v read your WIP code, but unfortunately I didn't find any mechanism to >> replace bpf trampoline in your code, sorry. >> >> It looks like bpf trampoline and ftrace works can be done at the same >> time. I think for now we can just attach bpf trampoline to bpf prog. >> Once your ftrace work is done, we can add support for attaching bpf >> trampoline to regular kernel function. Is this OK? > > Hey Mark and Xu! :) > > I'm interested in this feature too and would be happy to help. > > I've been trying to understand what you both have in mind to figure out a way > forward, please correct me if I got anything wrong! :) > > > It looks like, currently, there are three places where an indirection to BPF is > technically possible. Chronologically these are: > > - the function's patchsite (currently there are 2 nops, this could become 4 > nops with Mark's series on per call-site ops) > > - the ftrace ops (currently called by iterating over a global list but could be > called more directly with Mark's series on per-call-site ops or by > dynamically generated branches with Wang's series on dynamic trampolines) > > - a ftrace trampoline tail call (currently, this is after restoring a full > pt_regs but this could become an args only restoration with Mark's series on > DYNAMIC_FTRACE_WITH_ARGS) > > > If we first consider the situation when only a BPF program is attached to a > kernel function: > - Using the patchsite for indirection (proposed by Xu, same as on x86) > Pros: > - We have BPF trampolines anyway because they are required for orthogonal > features such as calling BPF programs as functions, so jumping into that > existing JITed code is straightforward > - This has the minimum overhead (eg: these trampolines only save the actual > number of args used by the function in ctx and avoid indirect calls) > Cons: > - If the BPF trampoline is JITed outside BL's limits, attachment can > randomly fail > > - Using a ftrace op for indirection (proposed by Mark) > Pros: > - BPF doesn't need to care about BL's range, ftrace_caller will be in range > Cons: > - The ftrace trampoline would first save all args in an ftrace_regs only for > the BPF op to then re-save them in a BPF ctx array (as per BPF calling > convention) so we'd effectively have to do the work of saving args twice > - BPF currently uses DYNAMIC_FTRACE_WITH_DIRECT_CALLS APIs. Either arm64 > should implement DIRECT_CALLS with... an indirect call :) (that is, the > arch_ftrace_set_direct_caller op would turn back its ftrace_regs into > arguments for the BPF trampoline) or BPF would need to use a different > ftrace API just on arm64 (to define new ops, which, unless if they would be > dynamically JITed, wouldn't be as performant as the existing BPF > trampolines) > > - Using a ftrace trampoline tail call for indirection (not discussed yet iiuc) > Pros: > - BPF also doesn't need to care about BL's range > - This also leverages the existing BPF trampolines > Cons: > - This also does the work of saving/restoring arguments twice > - DYNAMIC_FTRACE_WITH_DIRECT_CALLS depends on DYNAMIC_FTRACE_WITH_REGS now > although in practice the registers kept by DYNAMIC_FTRACE_WITH_ARGS > should be enough to call BPF trampolines > > If we consider the situation when both ftrace ops and BPF programs are attached > to a kernel function: > - Using the patchsite for indirection can't solve this > > - Using a ftrace op for indirection (proposed by Mark) or using a ftrace > trampoline tail call as an indirection (proposed by Xu, same as on x86) have > the same pros & cons as in the BPF only situation except that this time we > pay the cost of registers saving twice for good reasons (we need args in both > ftrace_regs and the BPF ctx array formats anyway) > > > Unless I'm missing something, it sounds like the following approach would work: > - Always patch patchsites with calls to ftrace trampolines (within BL ranges) > - Always go through ops and have arch_ftrace_set_direct_caller set > ftrace_regs->direct_call (instead of pt_regs->orig_x0 in this patch) > - If ftrace_regs->direct_call != 0 at the end of the ftrace trampoline, tail > call it > > Once Mark's series on DYNAMIC_FTRACE_WITH_ARGS is merged, we would need to have > DYNAMIC_FTRACE_WITH_DIRECT_CALLS > depend on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS > BPF trampolines (the only users of this API now) only care about args to the > attachment point anyway so I think this would work transparently ? > > Once Mark's series on per-callsite ops is merged, the second step (going > through ops) would be significantly faster in the situation where only one > program is used, therefore one arch_ftrace_set_direct_caller op. > > Once Wang's series on dynamic trampolines is merged, the second step (going > through ops) would also be significantly faster in the case when multiple ops > are attached. > > > What are your thoughts? If this sounds somewhat sane, I'm happy to help out > with the implementation as well :) > Hi Florent, I'm struggling with how to attach bpf trampoline to regular kernel functions. I think your suggestion is fine. Thanks for the help! > Thanks! > Florent > . _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel