Date: Fri, 19 Apr 2024 20:02:17 +0300
From: Mike Rapoport <rppt@kernel.org>
To: Song Liu
Subject: Re: [PATCH v4 05/15] mm: introduce execmem_alloc() and execmem_free()
On Fri, Apr 19, 2024 at 08:54:40AM -0700, Song Liu wrote:
> On Thu, Apr 18, 2024 at 11:56 PM Mike Rapoport wrote:
> >
> > On Thu, Apr 18, 2024 at 02:01:22PM -0700, Song Liu wrote:
> > > On Thu, Apr 18, 2024 at 10:54 AM Mike Rapoport wrote:
> > > >
> > > > On Thu, Apr 18, 2024 at 09:13:27AM -0700, Song Liu wrote:
> > > > > On Thu, Apr 18, 2024 at 8:37 AM Mike Rapoport wrote:
> > > > > > > >
> > > > > > > > I'm looking at execmem_types more as a definition of the consumers, maybe I
> > > > > > > > should have named the enum execmem_consumer in the first place.
> > > > > > > >
> > > > > > > I think looking at execmem_type from the consumers' point of view adds
> > > > > > > unnecessary complexity. IIUC, for most (if not all) archs, ftrace, kprobe,
> > > > > > > and bpf (and maybe also module text) all have the same requirements.
> > > > > > > Did I miss something?
> > > > > >
> > > > > > It's enough to have one architecture with different constraints for kprobes
> > > > > > and bpf to warrant a type for each.
> > > > >
> > > > > AFAICT, some of these constraints can be changed without too much work.
> > > >
> > > > But why?
> > > > I honestly don't understand what you are trying to optimize here. A few
> > > > lines of initialization in execmem_info?
> > >
> > > IIUC, having separate EXECMEM_BPF and EXECMEM_KPROBE makes it
> > > harder for bpf and kprobe to share the same ROX page. In many use cases,
> > > a 2MiB page (assuming x86_64) is enough for all BPF, kprobe, ftrace, and
> > > module text. It is not efficient if we have to allocate separate pages for each
> > > of these use cases. If this is not a problem, the current approach works.
> >
> > The caching of large ROX pages does not need to be per type.
> >
> > In the POC I've posted for caching of large ROX pages on x86 [1], the cache is
> > global, and to make kprobes and bpf use it, it's enough to set a flag in
> > execmem_info.
> >
> > [1] https://lore.kernel.org/all/20240411160526.2093408-1-rppt@kernel.org
>
> For the ROX cache to work, we need different users (module text, kprobe, etc.) to have
> the same execmem_range. From [1]:
>
> static void *execmem_cache_alloc(struct execmem_range *range, size_t size)
> {
>         ...
>         p = __execmem_cache_alloc(size);
>         if (p)
>                 return p;
>         err = execmem_cache_populate(range, size);
>         ...
> }
>
> We are calling __execmem_cache_alloc() without a range. For this to work,
> we can only call execmem_cache_alloc() with one execmem_range.

Actually, on x86 this will "just work" because everything shares the same
address space :)

The 2M pages in the cache will be in the modules space, so
__execmem_cache_alloc() will always return memory from that address space.

For other architectures this indeed needs to be fixed, by passing the range
to __execmem_cache_alloc() and limiting the search in the cache to that
range. A rough sketch of what I mean is at the end of this mail.

> Did I miss something?
>
> Thanks,
> Song

--
Sincerely yours,
Mike.
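P.S. For the record, here is a rough, illustrative sketch of the range-aware
lookup described above. This is not the code from [1]; the execmem_range
layout, the free-chunk cache and the helper bodies below are simplified
stand-ins, only the shape of the __execmem_cache_alloc() change matters.

/*
 * Illustrative sketch only, not the code from [1]. The structures and the
 * cache below are made up for the example; the point is that
 * __execmem_cache_alloc() receives the caller's range and only returns
 * cached memory that falls inside it.
 */
#include <stddef.h>
#include <stdint.h>

struct execmem_range {
        uintptr_t start;
        uintptr_t end;
};

struct cached_chunk {
        uintptr_t addr;
        size_t size;
        int free;
};

#define CACHE_SLOTS 16
static struct cached_chunk cache[CACHE_SLOTS];

/* Range-aware lookup: skip cached chunks outside the caller's range. */
static void *__execmem_cache_alloc(struct execmem_range *range, size_t size)
{
        for (int i = 0; i < CACHE_SLOTS; i++) {
                struct cached_chunk *c = &cache[i];

                if (!c->free || c->size < size)
                        continue;
                if (c->addr < range->start || c->addr + size > range->end)
                        continue;       /* belongs to another consumer's range */

                c->free = 0;
                return (void *)c->addr;
        }
        return NULL;
}

/* Would carve a new large page inside @range and add it to the cache. */
static int execmem_cache_populate(struct execmem_range *range, size_t size)
{
        (void)range;
        (void)size;
        return -1;      /* elided in this sketch */
}

void *execmem_cache_alloc(struct execmem_range *range, size_t size)
{
        void *p = __execmem_cache_alloc(range, size);

        if (p)
                return p;
        if (execmem_cache_populate(range, size))
                return NULL;
        return __execmem_cache_alloc(range, size);
}

With something along these lines, execmem_cache_alloc() keeps its current
lookup-then-populate flow, while each consumer only ever gets memory from
its own range, so the per-type address constraints still hold.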