From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, shuah@kernel.org,
aarcange@redhat.com, lokeshgidra@google.com, peterx@redhat.com,
david@redhat.com, hughd@google.com, mhocko@suse.com,
axelrasmussen@google.com, rppt@kernel.org, willy@infradead.org,
Liam.Howlett@oracle.com, jannh@google.com,
zhangpeng362@huawei.com, bgeffon@google.com,
kaleshsingh@google.com, ngeoffray@google.com, jdduke@google.com,
surenb@google.com, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org, kernel-team@android.com
Subject: [PATCH v2 1/3] userfaultfd: UFFDIO_REMAP: rmap preparation
Date: Fri, 22 Sep 2023 18:31:44 -0700
Message-ID: <20230923013148.1390521-2-surenb@google.com>
In-Reply-To: <20230923013148.1390521-1-surenb@google.com>
From: Andrea Arcangeli <aarcange@redhat.com>

As far as the rmap code is concerned, UFFDIO_REMAP only alters
page->mapping and page->index, and it does so while holding the page
lock. However, folio_referenced() does rmap walks without taking the
folio lock first, so folio_lock_anon_vma_read() must be updated to
re-check that folio->mapping did not change after we obtained the
anon_vma read lock.

UFFDIO_REMAP takes the anon_vma lock for writing before altering
folio->mapping, so if folio->mapping is still the same after
obtaining the anon_vma read lock (without the folio lock), the rmap
walks can go ahead safely (and UFFDIO_REMAP will wait for the rmap walk
to complete before proceeding).

UFFDIO_REMAP serializes against itself with the folio lock.

All other places that take the anon_vma lock while holding the
mmap_lock for writing don't need to check whether folio->mapping has
changed after taking the anon_vma lock, regardless of the folio lock,
because UFFDIO_REMAP holds the mmap_lock for reading.

There's one constraint enforced to allow this simplification: the
source pages passed to UFFDIO_REMAP must be mapped in only one vma.
This constraint is an acceptable tradeoff for UFFDIO_REMAP users.

The source addresses passed to UFFDIO_REMAP can be set as
VM_DONTCOPY with MADV_DONTFORK to avoid any risk of the mapcount of
the pages increasing if some thread of the process calls fork() before
UFFDIO_REMAP runs.
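
As an illustration of the MADV_DONTFORK suggestion above, a user-space caller might prepare its source range roughly as follows. This is a minimal sketch: the helper name `map_dontfork_source` is hypothetical, and only the standard mmap(2) and madvise(2) calls are used; the UFFDIO_REMAP ioctl itself is not shown.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and MADV_DONTFORK */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of private anonymous memory and mark the range
 * MADV_DONTFORK. Returns the mapping, or NULL on failure.
 * The helper name is hypothetical.
 */
static void *map_dontfork_source(size_t len)
{
	void *src = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (src == MAP_FAILED)
		return NULL;

	/*
	 * Sets VM_DONTCOPY on the vma: a child created by fork() does
	 * not inherit this range, so its pages stay mapped in one
	 * process only.
	 */
	if (madvise(src, len, MADV_DONTFORK) != 0) {
		munmap(src, len);
		return NULL;
	}

	return src;
}
```

A child process that touches the range after fork() will fault, because the range is simply absent from its address space, so this only suits memory the children never need.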
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
mm/rmap.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/mm/rmap.c b/mm/rmap.c
index ec7f8e6c9e48..c1ebbd23fa61 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -542,6 +542,7 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio,
 	struct anon_vma *root_anon_vma;
 	unsigned long anon_mapping;
 
+repeat:
 	rcu_read_lock();
 	anon_mapping = (unsigned long)READ_ONCE(folio->mapping);
 	if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
@@ -586,6 +587,18 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio,
 	rcu_read_unlock();
 	anon_vma_lock_read(anon_vma);
 
+	/*
+	 * Check if UFFDIO_REMAP changed the anon_vma. This is needed
+	 * because we don't assume the folio was locked.
+	 */
+	if (unlikely((unsigned long) READ_ONCE(folio->mapping) !=
+		     anon_mapping)) {
+		anon_vma_unlock_read(anon_vma);
+		put_anon_vma(anon_vma);
+		anon_vma = NULL;
+		goto repeat;
+	}
+
 	if (atomic_dec_and_test(&anon_vma->refcount)) {
 		/*
 		 * Oops, we held the last refcount, release the lock
--
2.42.0.515.g380fc7ccd1-goog