From: Lorenzo Stoakes <ljs@kernel.org>
To: Kiryl Shutsemau <kirill@shutemov.name>
Cc: akpm@linux-foundation.org, rppt@kernel.org, peterx@redhat.com,
david@kernel.org, surenb@google.com, vbabka@kernel.org,
Liam.Howlett@oracle.com, ziy@nvidia.com, corbet@lwn.net,
skhan@linuxfoundation.org, seanjc@google.com,
pbonzini@redhat.com, jthoughton@google.com, aarcange@redhat.com,
sj@kernel.org, usama.arif@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org, kvm@vger.kernel.org,
kernel-team@meta.com, "Kiryl Shutsemau (Meta)" <kas@kernel.org>,
stable@vger.kernel.org
Subject: Re: [PATCH v5 04/18] mm: skip out-of-range bits in mk_vma_flags()
Date: Fri, 29 May 2026 15:00:14 +0100 [thread overview]
Message-ID: <ahmQvfNk7S4F0LBj@lucifer> (raw)
In-Reply-To: <20260526130509.2748441-5-kirill@shutemov.name>
On Tue, May 26, 2026 at 02:04:52PM +0100, Kiryl Shutsemau wrote:
> From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
>
> vma_flags_t is one unsigned long on 32-bit -- NUM_VMA_FLAG_BITS ==
> BITS_PER_LONG by design, so VM_xxx-declared bits sit in the first
> word and hit the single-long fast path. But the bit enum declares
> some bits unconditionally above BITS_PER_LONG (VMA_UFFD_MINOR_BIT
> == 41 today, with VM_UFFD_MINOR == VM_NONE on 32-bit so no VMA
> actually carries the bit).
Yeah ugh.
>
> Passing such a bit to mk_vma_flags() goes through __set_bit(41,
> &one_long) and writes one word past the end. The compiler folds
> the OOB store with wraparound (1UL << (41 % 32) == bit 9) into
> the first word. Bit 9 is already in __VMA_UFFD_FLAGS so the mask
> happens to come out right today, but any high-numbered bit whose
That is... helpful :) but not great that this is the situation, an
oversight, clearly! How I hate 32-bit kernels :)
> mod-BITS_PER_LONG position is otherwise unused would silently OR
> an extra bit into the mask.
>
> Add VMA_NO_BIT and have DECLARE_VMA_BIT() resolve any bitnum out
> of range to it. vma_flags_set_flag() drops negative bit values.
> The ternary collapses at compile time, the runtime check folds
> away when the bit is in range, and the common path is unchanged.
Hmm are you sure it does?
A key design goal was that mk_vma_flags() generates compile-time constants
the same as if the bitmap were constructed independently.
This surely must generate code? Or at least runs a significant risk of it?
Setting a precedent here would probably lead to VMA_NO_BIT being used more
and therefore generating code in more places I think.
And I'm not sure I really want to bend over backwards here to work around
issues with 32-bit kernels when in the long run the intent is that we move
to making these values 64-bit anyway.
Perhaps it's even wise possibly to just make these values 64-bit already,
ahead of time?
(I did look at this in terms of wanting something like a VMA_NO_BIT so we
could get VM_NONE-like behaviour but nothing I tried failed to generate
code.)
A simple solution that doesn't require change to the core is to just uglify
userfaultfd_k.h a bit with:
#ifdef HAVE_ARCH_USERFAULTFD_MINOR
#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT, \
VMA_UFFD_MINOR_BIT)
#else
#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT)
#endif
But of course that becomes much more horrible with your changes...
Another alternative, which I used for VMA_DROPPABLE is to add something
like this in mm.h:
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
#define VM_UFFD_MINOR INIT_VM_FLAG(UFFD_MINOR)
+define VMA_UFFD_MINOR mk_vma_flags(VMA_UFFD_MINOR_BIT)
#else
#define VM_UFFD_MINOR VM_NONE
+define VMA_UFFD_MINOR EMPTY_VMA_FLAGS
#endif
Then we can do:
#define __VMA_UFFD_FLAGS append_vma_flags(VMA_MINOR, VMA_UFFD_MISSING_BIT, \
VMA_UFFD_WP_BIT)
With you changes in 08/18 on top it'd get hairier, but we could make our
life easier by implementing something like:
static __always_inline vma_flags_t __mk_vma_flags_from_masks(size_t count,
const vma_flags_t *masks)
{
vma_flags_t flags = EMPTY_VMA_FLAGS;
int i;
for (i = 0; i < count; i++)
mask = vma_flags_set_mask(&flags, masks[i]);
return flags;
}
#define mk_vma_flags_from_masks(...) __mk_vma_flags_from_masks(, \
COUNT_ARGS(__VA_ARGS__), (const vma_flags_t []){__VA_ARGS__})
(untested code - you would need to ensure it generated equivalent
constants as VM_xxx would now :)
Then you could get VMA_UFFD_RWP with:
#ifdef CONFIG_USERFAULTFD_RWP
#define VMA_UFFD_RWP mk_vma_flags(VMA_UFFD_RWP_BIT)
#else
#define VMA_UFFD_RWP EMPTY_VMA_FLAGS
#endif
And then:
/* Always available if CONFIG_USERFAULTFD set. */
#define __VMA_UFFD_DEFAULT_FLAGS \
mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT)
#define __VMA_UFFD_FLAGS mk_vma_flags_from_masks(__VMA_UFFD_DEFAULT_FLAGS \
VMA_MINOR, VMA_RWP)
Which is kind ok-ish right? I mean it's all a bit fugly obviously.
>
> Bits declared in the enum are now safe to pass to mk_vma_flags()
> regardless of arch.
I mean another issue here is we're then codifying behaviour that's legacy
ahead of time. I really want to avoid that.
>
> Fixes: 9ea35a25d51b ("mm: introduce VMA flags bitmap type")
> Cc: stable@vger.kernel.org
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> ---
> include/linux/mm.h | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0f2612a70fb1..71b11945e4fc 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -286,8 +286,17 @@ extern unsigned int kobjsize(const void *objp);
> */
> typedef int __bitwise vma_flag_t;
>
> -#define DECLARE_VMA_BIT(name, bitnum) \
> - VMA_ ## name ## _BIT = ((__force vma_flag_t)bitnum)
> +/*
> + * VMA_NO_BIT means "no bit"; mk_vma_flags() skips it. DECLARE_VMA_BIT()
> + * below uses it for any bit number that doesn't fit in the bitmap, so
> + * callers don't need to track which bits are valid on the current build.
> + */
> +#define VMA_NO_BIT ((__force vma_flag_t)-1)
> +
> +#define DECLARE_VMA_BIT(name, bitnum) \
> + VMA_ ## name ## _BIT = (((bitnum) < NUM_VMA_FLAG_BITS) ? \
> + ((__force vma_flag_t)(bitnum)) : \
> + VMA_NO_BIT)
> #define DECLARE_VMA_BIT_ALIAS(name, aliased) \
> VMA_ ## name ## _BIT = (VMA_ ## aliased ## _BIT)
> enum {
> @@ -1081,6 +1090,8 @@ static __always_inline void vma_flags_set_flag(vma_flags_t *flags,
> {
> unsigned long *bitmap = flags->__vma_flags;
>
> + if ((__force int)bit < 0)
> + return;
> __set_bit((__force int)bit, bitmap);
> }
>
> --
> 2.54.0
>
Either way, I think we should break out any fix like this from the series.
Andrew is I think also not a fan of fixes patches in the middle of series
anyway :P
Cheers, Lorenzo
next prev parent reply other threads:[~2026-05-29 14:00 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260526130509.2748441-1-kirill@shutemov.name>
2026-05-26 13:04 ` [PATCH v5 01/18] fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race Kiryl Shutsemau
2026-05-26 13:04 ` [PATCH v5 02/18] mm/huge_memory: preserve pmd_swp_uffd_wp on device-private PMD downgrade Kiryl Shutsemau
2026-05-26 13:04 ` [PATCH v5 03/18] userfaultfd: gate must_wait writability check on pte_present() Kiryl Shutsemau
2026-05-26 13:04 ` [PATCH v5 04/18] mm: skip out-of-range bits in mk_vma_flags() Kiryl Shutsemau
2026-05-29 14:00 ` Lorenzo Stoakes [this message]
2026-05-29 16:09 ` Kiryl Shutsemau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ahmQvfNk7S4F0LBj@lucifer \
--to=ljs@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=corbet@lwn.net \
--cc=david@kernel.org \
--cc=jthoughton@google.com \
--cc=kas@kernel.org \
--cc=kernel-team@meta.com \
--cc=kirill@shutemov.name \
--cc=kvm@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=rppt@kernel.org \
--cc=seanjc@google.com \
--cc=sj@kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=stable@vger.kernel.org \
--cc=surenb@google.com \
--cc=usama.arif@linux.dev \
--cc=vbabka@kernel.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox