Linux ARM-MSM sub-architecture
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <ljs@kernel.org>
To: Pedro Falcato <pfalcato@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 Russell King <linux@armlinux.org.uk>,
	Dinh Nguyen <dinguyen@kernel.org>,
	 Simon Schuster <schuster.simon@siemens-energy.com>,
	 "James E . J . Bottomley"
	<James.Bottomley@hansenpartnership.com>,
	Helge Deller <deller@gmx.de>,
	 Jarkko Sakkinen <jarkko@kernel.org>,
	Thomas Gleixner <tglx@kernel.org>,
	 Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, Ian Abbott <abbotti@mev.co.uk>,
	 H Hartley Sweeten <hsweeten@visionengravers.com>,
	Lucas Stach <l.stach@pengutronix.de>,
	 David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	 Patrik Jakobsson <patrik.r.jakobsson@gmail.com>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	 Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	 Rob Clark <robin.clark@oss.qualcomm.com>,
	Dmitry Baryshkov <lumag@kernel.org>,
	 Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>,
	Thierry Reding <thierry.reding@kernel.org>,
	 Mikko Perttunen <mperttunen@nvidia.com>,
	Jonathan Hunter <jonathanh@nvidia.com>,
	 Christian Koenig <christian.koenig@amd.com>,
	Huang Rui <ray.huang@amd.com>, Ankit Agrawal <ankita@nvidia.com>,
	 Alex Williamson <alex@shazbot.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner <brauner@kernel.org>,
	Dan Williams <djbw@kernel.org>,
	 Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	 David Hildenbrand <david@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	 "Liam R . Howlett" <liam@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	 Marek Szyprowski <m.szyprowski@samsung.com>,
	Peter Zijlstra <peterz@infradead.org>,
	 Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	 Masami Hiramatsu <mhiramat@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	SeongJae Park <sj@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>,
	 Hugh Dickins <hughd@google.com>, Mike Rapoport <rppt@kernel.org>,
	Kees Cook <kees@kernel.org>,  Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org,
	 linux-arm-kernel@lists.infradead.org,
	linux-parisc@vger.kernel.org, linux-sgx@vger.kernel.org,
	 etnaviv@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-arm-msm@vger.kernel.org,  freedreno@lists.freedesktop.org,
	linux-tegra@vger.kernel.org, kvm@vger.kernel.org,
	 linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-mm@kvack.org,  iommu@lists.linux.dev,
	linux-perf-users@vger.kernel.org,
	 linux-trace-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
	damon@lists.linux.dev,  Rik van Riel <riel@surriel.com>,
	Harry Yoo <harry@kernel.org>, Jann Horn <jannh@google.com>
Subject: Re: [PATCH 20/30] mm/vma: introduce vma_assert_can_modify()
Date: Thu, 2 Jul 2026 13:16:38 +0100	[thread overview]
Message-ID: <akZNWNuyBU4xDDyf@lucifer> (raw)
In-Reply-To: <akZHWwVgJfqCwA2W@pedro-suse.lan>

On Thu, Jul 02, 2026 at 12:16:32PM +0100, Pedro Falcato wrote:
> On Mon, Jun 29, 2026 at 01:23:31PM +0100, Lorenzo Stoakes wrote:
> > vma_assert_write_locked() and vma_assert_attached() are useful for their
> > own purposes, however VMA code absolutely does allow the modification of
> > non-write locked VMAs if they are at that point detached (i.e. unreachable
> > from anywhere).
> >
> > It's therefore useful to be able to assert that a VMA is either
> > detached (modification doesn't matter) or write locked (you're explicitly
> > locked for modification).
>
> Hmm, I was wondering why detached does not imply write_locked, and then

For one obviously when detaching they are also write locked (see
vma_mark_detached()) but yeah the point of this is when you have a VMA allocated
which doesn't hold the appropriate lock.

> realized that new VMAs aren't write-locked. Could we do it by default?
> Like a simple:
>
> 	vma->vm_lock_seq = __vma_raw_mm_seqnum(vma);
>
> might do the trick. I don't see why it wouldn't work? Is there some other
> case I am not considering?

Firstly, I'd rather not rework the VMA lock logic as part of this series. It's
subtle and such changes are tricky.

I'm trying to achieve the minimum changes while adding extra validation as I can
in preparation for the scalable CoW work.

And I'm not sure wanting to alter the fundamentals of VMA locks is a good reason
not to ack a patch :)

But since you bring it up...

There's subtleties here too.

- At what point do you do this assignment? If it's the VMA allocation logic,
  it's not obvious then that you even have the mmap write lock necessarily. Now
  you're making assumptions that might be broken, or broken in future.

  Now everybody who allocates a VMA has to 'just know' that they need to assign
  an mm and mark it write-locked (but in a special new VMA way) before setting
  the pgoff.

- Detached means it is not currently in any tree, nor belongs necessarily to any
  mm. So the concept of it being write locked is meaningless.

- It's broken to perform actions on a new VMA that is not yet linked into any
  tree that would require the VMA write lock. Currently the code _explicitly_
  asserts that a detached VMA is not attempted to be write locked.

- So we have a good way of catching people doing stupid or broken stuff to VMAs
  that are not in the correct state (we currently _don't_ do that for detached
  VMAs that _were_ in the tree, probably we should change that...!)

- You'd have to create a new function to do this since we explicitly disallow
  doing this right now, and that's more complexity and then you're then
  creating a whole new meaning as to what VMA write lock acquisition is,
  which is even more added complexity.

- create_init_stack_vma() would break, so would hugetlb (lol) and static gate
  VMAs explicitly do not belong to an mm, so there's simply no concept of them
  being VMA write locked anyway.

- I'm not sure violating the invariant of seqnum = 0 = no write locked VMAs is
  safe.

- Things get quite horrendous on fork (prior to us taking tmp's VMA write lock)
  - you'd have to assign this field after duplicating the VMA midway through
  attaching it to the new mm, where that mm has a new seqnum. But the new mm has
  a seqnum of 0... so now you're having to duplicate it but alter what
  vma_lock_init() does to reset the seqnum to 0 but then what if the
  'duplication' logic asserts VMA write locked?

It's very chicken and egg - on VMA duplication you want to be able to write the
very fields that you need to do anything with the VMA prior to it being linked
in anywhere.

And we probably don't want to bake in the assumption that to change fundamental
fields requires that you always hold the write lock as a consequence.

>
> >
> > Therefore introduce vma_assert_can_modify() for this purpose.
> >
> > While we're here, make vma_is_attached() available generally - if
> > !CONFIG_PER_VMA_LOCKS, then there's no sense in which a VMA is
> > detached (vma_mark_detached() is a noop), so have this default to true in
> > this case.
> >
> > Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
> > ---
> >  include/linux/mmap_lock.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> > index 04b8f61ece5d..d513286d8160 100644
> > --- a/include/linux/mmap_lock.h
> > +++ b/include/linux/mmap_lock.h
> > @@ -506,6 +506,8 @@ static inline __must_check
> >  int vma_start_write_killable(struct vm_area_struct *vma) { return 0; }
> >  static inline void vma_assert_write_locked(struct vm_area_struct *vma)
> >  		{ mmap_assert_write_locked(vma->vm_mm); }
> > +static inline bool vma_is_attached(struct vm_area_struct *vma)
> > +		{ return true; }
> >  static inline void vma_assert_attached(struct vm_area_struct *vma) {}
> >  static inline void vma_assert_detached(struct vm_area_struct *vma) {}
> >  static inline void vma_mark_attached(struct vm_area_struct *vma) {}
> > @@ -530,6 +532,12 @@ static inline void vma_assert_stabilised(struct vm_area_struct *vma)
> >
> >  #endif /* CONFIG_PER_VMA_LOCK */
> >
> > +static inline void vma_assert_can_modify(struct vm_area_struct *vma)
> > +{
> > +	if (vma_is_attached(vma))
> > +		vma_assert_write_locked(vma);
> > +}
> > +
> >  static inline void mmap_write_lock(struct mm_struct *mm)
> >  {
> >  	__mmap_lock_trace_start_locking(mm, true);
> > --
> > 2.54.0
> >
>
> --
> Pedro

Thanks, Lorenzo

  reply	other threads:[~2026-07-02 12:17 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-29 12:23 [PATCH 00/30] mm: make VMA page offset handling more consistent Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 01/30] mm: move vma_start_pgoff() into mm.h and clean up Lorenzo Stoakes
2026-06-29 15:27   ` Gregory Price
2026-06-30 16:10   ` Pedro Falcato
2026-07-01  9:42     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 02/30] mm: add kdoc comments for vma_start/last_pgoff() Lorenzo Stoakes
2026-06-29 15:31   ` Gregory Price
2026-06-30 16:11   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 03/30] tools/testing/vma: use vma_start_pgoff() in merge tests Lorenzo Stoakes
2026-06-29 15:40   ` Gregory Price
2026-06-29 16:35     ` Lorenzo Stoakes
2026-06-30 16:12   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 04/30] mm: introduce and use vma_end_pgoff() Lorenzo Stoakes
2026-06-29 15:54   ` Gregory Price
2026-06-30 16:13   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 05/30] mm/rmap: update mm/interval_tree.c comments Lorenzo Stoakes
2026-06-29 16:01   ` Gregory Price
2026-06-29 16:41     ` Lorenzo Stoakes
2026-06-29 17:11       ` Gregory Price
2026-06-29 17:40         ` Lorenzo Stoakes
2026-06-30 16:16   ` Pedro Falcato
2026-07-01  9:55     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 06/30] mm/rmap: parameterise vma_interval_tree_*() by address_space Lorenzo Stoakes
2026-06-30 16:19   ` Pedro Falcato
2026-07-01  9:56     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 07/30] mm/rmap: elide unnecessary static inline's in interval_tree.c Lorenzo Stoakes
2026-06-30 15:30   ` Gregory Price
2026-06-30 16:22   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 08/30] mm/rmap: rename vma_interval_tree_*() to mapping_interval_tree_*() Lorenzo Stoakes
2026-06-30 15:42   ` Gregory Price
2026-06-30 16:28   ` Pedro Falcato
2026-07-01 10:14     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 09/30] mm/rmap: parameterise anon_vma_interval_tree_*() by anon_vma Lorenzo Stoakes
2026-06-30 15:46   ` Gregory Price
2026-06-30 15:49     ` Lorenzo Stoakes
2026-06-30 15:55       ` Gregory Price
2026-06-30 15:59         ` Lorenzo Stoakes
2026-06-30 16:32   ` Pedro Falcato
2026-07-01 10:18     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 10/30] MAINTAINERS: Move mm/interval_tree.c to rmap section Lorenzo Stoakes
2026-06-30 16:33   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 11/30] mm/vma: introduce and use vmg_pages(), vmg_[start, end]_pgoff() Lorenzo Stoakes
2026-06-30 16:35   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 12/30] mm/vma: clean up anon_vma_compatible() Lorenzo Stoakes
2026-06-30 16:36   ` Pedro Falcato
2026-07-01 10:20     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 13/30] mm/vma: refactor vmg_adjust_set_range() for clarity Lorenzo Stoakes
2026-07-02 10:37   ` Pedro Falcato
2026-07-02 10:50     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 14/30] mm/vma: minor cleanup of expand_[upwards, downwards]() Lorenzo Stoakes
2026-07-02 10:41   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 15/30] mm: introduce and use linear_page_delta() Lorenzo Stoakes
2026-07-02 10:42   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 16/30] mm/vma: use vma_start_pgoff(), linear_page_index() in mm code Lorenzo Stoakes
2026-06-30  0:11   ` SJ Park
2026-07-02 10:47   ` Pedro Falcato
2026-07-02 10:49     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 17/30] mm: prefer vma_[start,end]_pgoff() to vma->vm_pgoff in kernel/ Lorenzo Stoakes
2026-07-02 11:01   ` Pedro Falcato
2026-07-02 11:30     ` Lorenzo Stoakes
2026-06-29 12:23 ` [PATCH 18/30] mm/vma: remove duplicative vma_pgoff_offset() helper Lorenzo Stoakes
2026-07-02 11:02   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 19/30] mm: use linear_page_[index, delta]() consistently Lorenzo Stoakes
2026-06-29 13:56   ` Thomas Zimmermann
2026-06-29 14:56     ` Lorenzo Stoakes
2026-07-02 11:04   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 20/30] mm/vma: introduce vma_assert_can_modify() Lorenzo Stoakes
2026-07-02 11:16   ` Pedro Falcato
2026-07-02 12:16     ` Lorenzo Stoakes [this message]
2026-06-29 12:23 ` [PATCH 21/30] mm/vma: add and use vma_[add/sub]_pgoff() Lorenzo Stoakes
2026-07-02 11:20   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 22/30] mm/vma: move __install_special_mapping() to vma.c Lorenzo Stoakes
2026-07-02 11:22   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 23/30] mm/vma: make vma_set_range() static, drop insert_vm_struct() decl Lorenzo Stoakes
2026-07-02 11:25   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 24/30] mm/vma: update vma_shrink() to not pass unnecessary pgoff parameter Lorenzo Stoakes
2026-07-02 11:27   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 25/30] mm/vma: update vmg_adjust_set_range() to offset pgoff instead Lorenzo Stoakes
2026-07-02 11:29   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 26/30] mm/vma: introduce and use vma_set_pgoff() Lorenzo Stoakes
2026-07-02 11:34   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 27/30] mm/vma: correct incorrect vma.h inclusion Lorenzo Stoakes
2026-07-02 11:40   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 28/30] mm/vma: use guard clauses in can_vma_merge_[before, after]() Lorenzo Stoakes
2026-07-02 11:41   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 29/30] tools/testing/vma: default VMA flag bits to 64-bit Lorenzo Stoakes
2026-07-02 11:44   ` Pedro Falcato
2026-06-29 12:23 ` [PATCH 30/30] tools/testing/vma: output compared expression on ASSERT_[EQ, NE]() Lorenzo Stoakes
2026-07-02 11:53   ` Pedro Falcato

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akZNWNuyBU4xDDyf@lucifer \
    --to=ljs@kernel.org \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=abbotti@mev.co.uk \
    --cc=acme@kernel.org \
    --cc=airlied@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex@shazbot.org \
    --cc=ankita@nvidia.com \
    --cc=bp@alien8.de \
    --cc=brauner@kernel.org \
    --cc=christian.koenig@amd.com \
    --cc=damon@lists.linux.dev \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@kernel.org \
    --cc=deller@gmx.de \
    --cc=dinguyen@kernel.org \
    --cc=djbw@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=etnaviv@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=harry@kernel.org \
    --cc=hsweeten@visionengravers.com \
    --cc=hughd@google.com \
    --cc=iommu@lists.linux.dev \
    --cc=jannh@google.com \
    --cc=jarkko@kernel.org \
    --cc=jonathanh@nvidia.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kees@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=l.stach@pengutronix.de \
    --cc=liam@infradead.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lumag@kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mperttunen@nvidia.com \
    --cc=mripard@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=namhyung@kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=oleg@redhat.com \
    --cc=osalvador@suse.de \
    --cc=patrik.r.jakobsson@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pfalcato@suse.de \
    --cc=ray.huang@amd.com \
    --cc=riel@surriel.com \
    --cc=robin.clark@oss.qualcomm.com \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=schuster.simon@siemens-energy.com \
    --cc=simona@ffwll.ch \
    --cc=sj@kernel.org \
    --cc=surenb@google.com \
    --cc=tglx@kernel.org \
    --cc=thierry.reding@kernel.org \
    --cc=tomi.valkeinen@ideasonboard.com \
    --cc=tzimmermann@suse.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox