From: Puranjay Mohan
To: Björn Töpel, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
    Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
    Kumar Kartikeya Dwivedi, bpf@vger.kernel.org
Subject: Re: On inlining more helpers in the JITs or the verifier
Date: Thu, 02 May 2024 19:19:49 +0000

Puranjay Mohan writes:

> Hi Everyone,
>
> While working on inlining bpf_get_smp_processor_id() in the ARM64 and
> RISC-V JITs, I realized that these archs allow such optimizations because
> they keep some information like the per-cpu offset or the pointer to the
> task_struct in special system registers.
>
> So, I went through the list of all BPF helpers and made a list of
> helpers that we can inline in these JITs to make their usage much more
> optimized:
>
> I. ARM64 and RISC-V specific optimizations if inlined:
>
>    A) Because pointer to task_struct is available in a register:
>       1. bpf_get_current_pid_tgid()
>       2. bpf_get_current_task()

I tried inlining bpf_get_current_task() on ARM64.

Before:

  bpf_prog_6e2672bcc4451a42_trigger_get_current_task:
  ; task = (struct task_struct *)bpf_get_current_task();
    34:   mov     x10, #0xffffffffffff9838
    38:   movk    x10, #0x8027, lsl #16
    3c:   movk    x10, #0x8000, lsl #32
    40:   blr     x10
    44:   add     x7, x0, #0x0

After:

  bpf_prog_6e2672bcc4451a42_trigger_get_current_task:
  ; task = (struct task_struct *)bpf_get_current_task();
    34:   mrs     x7, sp_el0

In the non-inlined version there is a branch [blr x10] to:

  0xffff800080279838 bpf_get_current_task:
    <+0>:   mrs     x0, sp_el0
    <+4>:   ret

So the five instructions at the call site, plus the two in the helper
itself, collapse to a single mrs after inlining!

I just don't know the best way to benchmark this. In theory it should be
a clear improvement, since the indirect call goes away entirely.

Thanks,
Puranjay
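
For readers checking the "Before" listing: the mov/movk sequence builds the branch target in x10 one 16-bit field at a time. The sketch below is only an illustration of that arithmetic, not kernel code. The movk() and helper_address() functions are hypothetical names introduced here, and the model assumes the usual AArch64 MOVK semantics (replace one 16-bit field, keep the other bits).

```python
MASK64 = (1 << 64) - 1


def movk(reg, imm16, shift):
    """Model AArch64 MOVK: overwrite one 16-bit field of reg, keep the rest."""
    assert 0 <= imm16 <= 0xFFFF and shift in (0, 16, 32, 48)
    cleared = reg & ~(0xFFFF << shift) & MASK64
    return cleared | (imm16 << shift)


def helper_address():
    """Replay the three immediate loads from the 'Before' listing."""
    x10 = 0xFFFFFFFFFFFF9838        # 34: mov  x10, #0xffffffffffff9838
    x10 = movk(x10, 0x8027, 16)     # 38: movk x10, #0x8027, lsl #16
    x10 = movk(x10, 0x8000, 32)     # 3c: movk x10, #0x8000, lsl #32
    return x10


if __name__ == "__main__":
    # Prints the value that blr x10 branches to.
    print(hex(helper_address()))
```

The result matches the address 0xffff800080279838 shown for bpf_get_current_task in the disassembly, which confirms that the three immediate loads exist only to set up the call that inlining removes.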