Date: Fri, 8 May 2026 17:52:54 +0100
From: Lorenzo Stoakes
To: Dave Hansen
Cc: linux-kernel@vger.kernel.org, Andrew Morton, "Liam R. Howlett",
	linux-mm@kvack.org, Shakeel Butt, Suren Baghdasaryan, Vlastimil Babka
Subject: Re: [PATCH 0/6] mm: Make per-VMA locks available in all builds
References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com>
In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com>

I'm guessing this is kinda an RFC? :P

On Wed, Apr 29, 2026 at 11:19:54AM -0700, Dave Hansen wrote:
> tl;dr: I hope I'm not missing something big here. The basic
> observation here is that forcing code to account for per-VMA lock
> failure adds a lot of complexity. This series theorizes that with
> some Kconfig changes and a new helper, many callers can avoid writing
> code that falls back to mmap_lock.

In general, very much in support of this! It'd be great to just know that
this is available, and frankly I think it's a critical part of the kernel.

Obviously Suren needs to have a look through, most important of all :)

>
> --
>
> When working on some x86 shadow stack code, it was a real pain to
> avoid causing recursive locking problems with mmap_lock. One way
> to avoid those was to avoid mmap_lock and use per-VMA locks instead.
> They are great, but they are not available in all configs, which
> makes them unusable in generic code, or if you want to completely
> avoid mmap_lock.

Yeah, lock ordering is a pain.

>
> Make per-VMA locks available in all configs. Right now, they are
> only available on select architectures when SMP and MMU are enabled.
> But all of the primitives that per-VMA locks are built on (RCU, maple
> trees, refcounts) work just fine without SMP or MMU.
>
> Their only real downside is that they make VMAs a wee bit bigger
> on !MMU and !SMP builds.
>
> The upside is much cleaner code, lower complexity and less #ifdeffery.
>
> Building on top of universally-available per-VMA locks, introduce a
> new helper. Since the new API does not require callers to have a
> fallback to mmap_lock, it's much easier to use. Callers could
> potentially replace this very common kernel idiom:
>
>	mmap_read_lock(mm);
>	vma = vma_lookup()
>	// fiddle with vma
>	mmap_read_unlock(mm);
>
> with:
>
>	vma = lock_vma_under_rcu_wait(mm, address);

I will look at what you're proposing, but this seems a bit like something
I proposed at LSF (though it was probably not the right solution for what
was under discussion there).

Doing this 'right' would require quite a bit of engineering effort. The
VMA locks are pretty bloody complicated :) so we have to be careful not
to spread the complexity around too much.

But I guess you could 'wait' by doing it in the slow path and then using
vma_start_read_locked()... Of course that'd not help you with any lock
inversions though!

Anyway, need to read the code :)

>	// fiddle with vma
>	vma_end_read(vma);
>
> Which avoids mmap_lock entirely in the fast path.

Yeah, it's nice!

>
> Things I think need more discussion:
> * The new helper has a horrible name. Suggestions are very welcome.
> * I'm not very confident that this approach completely avoids the
>   deadlock issues that arise from touching userspace while holding
>   mm-related locks.

Yeah, we have to be careful...

> * Can the helper avoid the goto, maybe by taking the VMA refcount
>   while holding mmap_lock?

Surely that'd defeat the purpose of VMA locks though? You'd hold the
mmap lock for less time, but you're still contending vs. _any_ VMA write
locks whilst trying to get a VMA read lock? Unless it's on a slow path...
hmm :)

> * mm_struct and vm_area_struct "bloat"

Probably not a problem really. For any modern system you're using the
fields.

>
> I've included a couple patches where I think the new helper really
> makes the code nicer.
>
> I'm keeping the cc list on the short side for now because I'm not
> actually proposing that we go ahead and do the ipv4 changes, for
> example.

Ack!

>
> Cc: Suren Baghdasaryan
> Cc: Andrew Morton
> Cc: "Liam R. Howlett"
> Cc: Lorenzo Stoakes
> Cc: Vlastimil Babka
> Cc: Shakeel Butt
> Cc: linux-mm@kvack.org
>
>  arch/arm/Kconfig                       |    1
>  arch/arm64/Kconfig                     |    1
>  arch/loongarch/Kconfig                 |    1
>  arch/powerpc/platforms/powernv/Kconfig |    1
>  arch/powerpc/platforms/pseries/Kconfig |    1
>  arch/riscv/Kconfig                     |    1
>  arch/s390/Kconfig                      |    1
>  arch/x86/Kconfig                       |    2 -
>  arch/x86/kernel/shstk.c                |   47 +++++++++++-------------
>  drivers/android/binder_alloc.c         |   39 ++++++-------------
>  fs/proc/internal.h                     |    2 -
>  fs/proc/task_mmu.c                     |   51 ---------------------------
>  include/linux/mm.h                     |   12 -------
>  include/linux/mm_types.h               |    7 ----
>  include/linux/mmap_lock.h              |   50 +-------------------------
>  kernel/fork.c                          |    2 -
>  mm/Kconfig                             |   13 --------
>  mm/mmap_lock.c                         |   45 +++++++++++++++++++++++++++--
>  net/ipv4/tcp.c                         |   31 +++++---------------
>  19 files changed, 82 insertions(+), 226 deletions(-)