Date: Tue, 18 Mar 2025 13:14:18 +0100
Subject: Re: [PATCH 00/11] Always call constructor for kernel page tables
From: Kevin Brodsky
To: Ryan Roberts, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Albert Ou, Andreas Larsson,
 Andrew Morton, Catalin Marinas, Dave Hansen, "David S. Miller",
 Geert Uytterhoeven, Linus Walleij, Madhavan Srinivasan, Mark Rutland,
 Matthew Wilcox, Michael Ellerman, "Mike Rapoport (IBM)", Palmer Dabbelt,
 Paul Walmsley, Peter Zijlstra, Qi Zheng, Will Deacon, Yang Shi,
 linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-csky@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
 linux-openrisc@vger.kernel.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 sparclinux@vger.kernel.org
References: <20250317141700.3701581-1-kevin.brodsky@arm.com>
 <70349335-84ee-4bca-a3d6-d7cf3c05b92b@arm.com>
In-Reply-To: <70349335-84ee-4bca-a3d6-d7cf3c05b92b@arm.com>

On 17/03/2025 16:30, Ryan Roberts wrote:
> On 17/03/2025 14:16, Kevin Brodsky wrote:
>> The complications in those special pgtable allocators beg the question:
>> does it really make sense to treat efi_mm and init_mm differently in
>> e.g. apply_to_pte_range()? Maybe what we really need is a way to tell if
>> an mm corresponds to user memory or not, and never use split locks for
>> non-user mm's. Feedback and suggestions welcome!
>
> The difference in treatment is whether or not the ptl is taken, right? So the
> real question is when calling apply_to_pte_range() for efi_mm, is there already
> a higher level serialization mechanism that prevents racy accesses? For init_mm,
> I think this is handled implicitly because there is no way for user space to
> cause apply_to_pte_range() for an arbitrary piece of kernel memory. Although I
> can't even see where apply_to_page_range() is called for efi_mm.

The commit I mentioned above, 61444cde9170 ("ARM: 8591/1: mm: use fully
constructed struct pages for EFI pgd allocations"), shows that
apply_to_page_range() is called from efi_set_mapping_permissions(), and
this indeed hasn't changed. It is itself called from efi_virtmap_init().
I would expect that no locking at all is necessary here, since the
mapping has just been created and surely isn't used yet.

Now the question is where exactly init_mm is special-cased in this
manner. I can see that walk_page_range() does something similar, there
may be more cases. And the other question is whether those functions are
ever used on special mm's, aside from efi_set_mapping_permissions().

> FWIW, contpte.c has mm_is_user() which is used by arm64.

Interesting! But not pretty, that's basically checking that the mm is
not &init_mm or &efi_mm... which wouldn't work for a generic
implementation. It feels like adding some attribute to mm_struct
wouldn't hurt. It looks like we've run out of MMF_* flags though :/

- Kevin
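
For illustration only, the two ways of telling user and non-user mm's apart that come up above can be contrasted in a small userspace sketch. This is not kernel code and not taken from the series: struct mm_struct, init_mm and efi_mm are mock stand-ins, and the is_kernel field, both helper names and walker_takes_ptl() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-in for the kernel's struct mm_struct. */
struct mm_struct {
	bool is_kernel;	/* hypothetical attribute: set for non-user mm's */
};

/* Mock stand-ins for the kernel's init_mm and efi_mm. */
static struct mm_struct init_mm = { .is_kernel = true };
static struct mm_struct efi_mm = { .is_kernel = true };

/*
 * Approach 1: compare against the known special mm's, in the spirit of
 * the check described above. Every special mm has to be listed here.
 */
static bool mm_is_user_by_identity(const struct mm_struct *mm)
{
	return mm != &init_mm && mm != &efi_mm;
}

/*
 * Approach 2: rely on a per-mm attribute (hypothetical), so generic code
 * does not need to know which special mm's an architecture defines.
 */
static bool mm_is_user_by_attr(const struct mm_struct *mm)
{
	return !mm->is_kernel;
}

/*
 * Hypothetical policy for a page table walker: only take the page table
 * lock for user mm's, whichever special mm is being walked.
 */
static bool walker_takes_ptl(const struct mm_struct *mm)
{
	assert(mm != NULL);
	return mm_is_user_by_attr(mm);
}
```

With the attribute-based variant, a new special mm only needs to set the attribute when it is created, whereas the identity-based variant has to be extended for each one, which is why it is hard to make generic.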