From: Peter Xu <peterx@redhat.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com,
jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
michel@lespinasse.org, liam.howlett@oracle.com,
jglisse@google.com, vbabka@suse.cz, minchan@google.com,
dave@stgolabs.net, punit.agrawal@bytedance.com,
lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
ying.huang@intel.com, david@redhat.com, yuzhao@google.com,
dhowells@redhat.com, hughd@google.com, viro@zeniv.linux.org.uk,
brauner@kernel.org, pasha.tatashin@soleen.com,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v3 7/8] mm: drop VMA lock before waiting for migration
Date: Tue, 27 Jun 2023 11:49:43 -0400 [thread overview]
Message-ID: <ZJsFFzKG3W7UPCeo@x1n> (raw)
In-Reply-To: <20230627042321.1763765-8-surenb@google.com>
On Mon, Jun 26, 2023 at 09:23:20PM -0700, Suren Baghdasaryan wrote:
> migration_entry_wait does not need VMA lock, therefore it can be
> dropped before waiting.
Hmm, I'm not sure..
Note that we're still dereferencing *vmf->pmd when waiting, while *pmd is
on the page table and IIUC only be guaranteed if the vma is still there.
If without both mmap / vma lock I don't see what makes sure the pgtable is
always there. E.g. IIUC a race can happen where unmap() runs right after
vma_end_read() below but before pmdp_get_lockless() (inside
migration_entry_wait()), then pmdp_get_lockless() can read some random
things if the pgtable is freed.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
> mm/memory.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 5caaa4c66ea2..bdf46fdc58d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3715,8 +3715,18 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> entry = pte_to_swp_entry(vmf->orig_pte);
> if (unlikely(non_swap_entry(entry))) {
> if (is_migration_entry(entry)) {
> - migration_entry_wait(vma->vm_mm, vmf->pmd,
> - vmf->address);
> + /* Save mm in case VMA lock is dropped */
> + struct mm_struct *mm = vma->vm_mm;
> +
> + if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> + /*
> + * No need to hold VMA lock for migration.
> + * WARNING: vma can't be used after this!
> + */
> + vma_end_read(vma);
> + ret |= VM_FAULT_COMPLETED;
> + }
> + migration_entry_wait(mm, vmf->pmd, vmf->address);
> } else if (is_device_exclusive_entry(entry)) {
> vmf->page = pfn_swap_entry_to_page(entry);
> ret = remove_device_exclusive_entry(vmf);
> --
> 2.41.0.178.g377b9f9a00-goog
>
--
Peter Xu
next prev parent reply other threads:[~2023-06-27 15:50 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-27 4:23 [PATCH v3 0/8] Per-VMA lock support for swap and userfaults Suren Baghdasaryan
2023-06-27 4:23 ` [PATCH v3 1/8] swap: remove remnants of polling from read_swap_cache_async Suren Baghdasaryan
2023-06-27 4:23 ` [PATCH v3 2/8] mm: add missing VM_FAULT_RESULT_TRACE name for VM_FAULT_COMPLETED Suren Baghdasaryan
2023-06-27 4:23 ` [PATCH v3 3/8] mm: drop per-VMA lock in handle_mm_fault if retrying or when finished Suren Baghdasaryan
2023-06-27 15:27 ` Peter Xu
2023-06-27 16:25 ` Suren Baghdasaryan
2023-06-27 4:23 ` [PATCH v3 4/8] mm: replace folio_lock_or_retry with folio_lock_fault Suren Baghdasaryan
2023-06-27 15:22 ` Peter Xu
2023-06-27 16:27 ` Suren Baghdasaryan
2023-06-27 4:23 ` [PATCH v3 5/8] mm: make folio_lock_fault indicate the state of mmap_lock upon return Suren Baghdasaryan
2023-06-27 8:06 ` Alistair Popple
2023-06-27 16:01 ` Suren Baghdasaryan
2023-06-27 15:32 ` Peter Xu
2023-06-27 16:00 ` Suren Baghdasaryan
2023-06-27 4:23 ` [PATCH v3 6/8] mm: handle swap page faults under per-VMA lock Suren Baghdasaryan
2023-06-27 15:41 ` Peter Xu
2023-06-27 16:05 ` Suren Baghdasaryan
2023-06-27 16:24 ` Peter Xu
2023-06-27 4:23 ` [PATCH v3 7/8] mm: drop VMA lock before waiting for migration Suren Baghdasaryan
2023-06-27 8:02 ` Alistair Popple
2023-06-27 15:35 ` Suren Baghdasaryan
2023-06-27 15:49 ` Peter Xu [this message]
2023-06-27 16:23 ` Suren Baghdasaryan
2023-06-28 3:22 ` Alistair Popple
2023-06-27 4:23 ` [PATCH v3 8/8] mm: handle userfaults under VMA lock Suren Baghdasaryan
2023-06-27 15:54 ` Peter Xu
2023-06-27 16:10 ` Suren Baghdasaryan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZJsFFzKG3W7UPCeo@x1n \
--to=peterx@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=apopple@nvidia.com \
--cc=brauner@kernel.org \
--cc=dave@stgolabs.net \
--cc=david@redhat.com \
--cc=dhowells@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hdanton@sina.com \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=jglisse@google.com \
--cc=josef@toxicpanda.com \
--cc=kernel-team@android.com \
--cc=laurent.dufour@fr.ibm.com \
--cc=ldufour@linux.ibm.com \
--cc=liam.howlett@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lstoakes@gmail.com \
--cc=mhocko@suse.com \
--cc=michel@lespinasse.org \
--cc=minchan@google.com \
--cc=pasha.tatashin@soleen.com \
--cc=punit.agrawal@bytedance.com \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
--cc=ying.huang@intel.com \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.