linux-fsdevel.vger.kernel.org archive mirror
From: Suren Baghdasaryan <surenb@google.com>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Suren Baghdasaryan <surenb@google.com>,
	 Vlastimil Babka <vbabka@suse.cz>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	akpm@linux-foundation.org,  david@redhat.com, peterx@redhat.com,
	jannh@google.com, hannes@cmpxchg.org,  mhocko@kernel.org,
	paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com,
	 brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com,
	 linux@weissschuh.net, willy@infradead.org, osalvador@suse.de,
	 andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu,  tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
	 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	 linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v6 7/8] fs/proc/task_mmu: read proc/pid/maps under per-vma lock
Date: Thu, 10 Jul 2025 00:03:06 -0700
Message-ID: <CAJuCfpG_dRLVDv1DWveJWS5cQS0ADEVAeBxJ=5MaPQFNEvQ1+g@mail.gmail.com>
In-Reply-To: <CAJuCfpFKNm6CEcfkuy+0o-Qu8xXppCFbOcYVXUFLeg10ztMFPw@mail.gmail.com>

On Wed, Jul 9, 2025 at 10:47 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Wed, Jul 9, 2025 at 4:12 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Suren Baghdasaryan <surenb@google.com> [250709 11:06]:
> > > On Wed, Jul 9, 2025 at 3:03 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> > > >
> > > > On 7/9/25 16:43, Suren Baghdasaryan wrote:
> > > > > On Wed, Jul 9, 2025 at 1:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> > > > >>
> > > > >> On 7/8/25 01:10, Suren Baghdasaryan wrote:
> > > > >> >>> +     rcu_read_unlock();
> > > > >> >>> +     vma = lock_vma_under_mmap_lock(mm, iter, address);
> > > > >> >>> +     rcu_read_lock();
> > > > >> >> OK I guess we hold the RCU lock the whole time as we traverse except when
> > > > >> >> we lock under mmap lock.
> > > > >> > Correct.
> > > > >>
> > > > >> I wonder if it's really necessary? Can't it be done just inside
> > > > >> lock_next_vma()? It would also avoid the unlock/lock dance quoted above.
> > > > >>
> > > > >> Even if we later manage to extend this approach to smaps and employ rcu
> > > > >> locking to traverse the page tables, I'd think it's best to separate and
> > > > >> fine-grain the rcu lock usage for vma iterator and page tables, if only to
> > > > >> avoid too long time under the lock.
> > > > >
> > > > > I thought we would need to be in the same rcu read section while
> > > > > traversing the maple tree using vma_next() but now looking at it,
> > > > > maybe we can indeed enter only while finding and locking the next
> > > > > vma...
> > > > > Liam, would that work? I see struct ma_state containing a node field.
> > > > > Can it be freed from under us if we find a vma, exit rcu read section
> > > > > then re-enter rcu and use the same iterator to find the next vma?
> > > >
> > > > If the rcu protection needs to be contiguous, and patch 8 avoids the issue by
> > > > always doing vma_iter_init() after rcu_read_lock() (but does it really avoid
> > > > the issue or is it why we see the syzbot reports?) then I guess in the code
> > > > quoted above we also need a vma_iter_init() after the rcu_read_lock(),
> > > > because although the iterator was used briefly under mmap_lock protection,
> > > > that was then unlocked and there can be a race before the rcu_read_lock().
> > >
> > > Quite true. So, let's wait for Liam's confirmation and based on his
> > > answer I'll change the patch by either reducing the rcu read section
> > > or adding the missing vma_iter_init() after we switch to mmap_lock.
> >
> > You need to either be under rcu or mmap lock to ensure the node in the
> > maple state hasn't been freed (and potentially, reallocated).
> >
> > So in this case, at the higher level, we can hold the rcu read lock for
> > a series of walks and avoid re-walking the tree, so the performance
> > would be better.
>
> Got it. Thanks for confirming!
>
> >
> > When we return to userspace, then we should drop the rcu read lock and
> > will need to vma_iter_set()/vma_iter_invalidate() on return.  I thought
> > this was being done (through vma_iter_init()), but syzbot seems to
> > indicate a path that was missed?
>
> We do that in m_start()/m_stop() by calling
> lock_vma_range()/unlock_vma_range() but I think I have two problems
> here:
> 1. As Vlastimil mentioned I do not reset the iterator when falling
> back to mmap_lock and exiting and then re-entering rcu read section;
> 2. I do not reset the iterator after exiting rcu read section in
> m_stop() and re-entering it in m_start(), so the later call to
> lock_next_vma() might be using an iterator with a node that was freed
> (and possibly reallocated).
>
> >
> > This is the same thing that needed to be done previously with the mmap
> > lock, but now under the rcu lock.
> >
> > I'm not sure how to mitigate the issue with the page table, maybe we
> > guess on the number of vmas that we were doing for 4k blocks of output
> > and just drop/reacquire then.  Probably a problem for another day
> > anyways.
> >
> > Also, I think you can change the vma_iter_init() to vma_iter_set(),
> > which is slightly less code under the hood.  Vlastimil asked about this
> > and it's probably a better choice.
>
> Ack.
> I'll update my series with these fixes and all comments I received so
> far, will run the reproducers to confirm no issues and repost them
> later today.

I have the patchset ready but would like to test it some more. Will
post it tomorrow.

> Thanks,
> Suren.
>
> >
> > Thanks,
> > Liam
> >
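
For readers following the iterator discussion above, here is a minimal userspace sketch of the reset being proposed. It is a toy model, not kernel code: struct toy_iter, struct toy_node and the toy_* helpers are hypothetical stand-ins for the vma iterator and its maple state, and the only assumption carried over from the thread is that a cached node pointer must not be reused once the protecting rcu read section (or mmap_lock) has been dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node covering [start, end). */
struct toy_node {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical stand-in for the iterator: a position plus a cached node. */
struct toy_iter {
	unsigned long index;		/* address to resume the walk from */
	const struct toy_node *node;	/* valid only inside one protected section */
};

/*
 * Toy analogue of resetting the iterator: keep the position but forget
 * the cached node, so the next lookup has to walk from the root again.
 */
static void toy_iter_set(struct toy_iter *it, unsigned long addr)
{
	assert(it != NULL);
	it->index = addr;
	it->node = NULL;
}

/* True when the next lookup must re-walk instead of trusting the cache. */
static bool toy_iter_needs_walk(const struct toy_iter *it)
{
	return it->node == NULL;
}

/*
 * Called when leaving the protected section (the m_stop() case, or the
 * fallback to mmap_lock): the node may be freed and reallocated once
 * protection is gone, so drop it and remember only where to resume.
 */
static void toy_read_section_exit(struct toy_iter *it, unsigned long resume_addr)
{
	toy_iter_set(it, resume_addr);
}
```

Both problems listed above reduce to the same rule in this model: any path that leaves the protected section goes through toy_read_section_exit(), so a later lookup never sees a node pointer cached under an earlier section.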
