From: Vlastimil Babka <vbabka@suse.cz>
To: Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Matthew Wilcox <willy@infradead.org>,
David Hildenbrand <david@redhat.com>,
Alistair Popple <apopple@nvidia.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@surriel.com>,
Suren Baghdasaryan <surenb@google.com>,
Yu Zhao <yuzhao@google.com>, Greg Thelen <gthelen@google.com>,
Shakeel Butt <shakeelb@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 11/13] mm/munlock: page migration needs mlock pagevec drained
Date: Fri, 11 Feb 2022 19:49:11 +0100 [thread overview]
Message-ID: <f9960727-d061-8856-45ce-4e33e8ed1de6@suse.cz> (raw)
In-Reply-To: <90c8962-d188-8687-dc70-628293316343@google.com>
On 2/6/22 22:49, Hugh Dickins wrote:
> Page migration of a VM_LOCKED page tends to fail, because when the old
> page is unmapped, it is put on the mlock pagevec with raised refcount,
> which then fails the freeze.
>
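[ Aside for readers less familiar with migration internals: the "freeze"
above is migration's attempt to take exclusive ownership of the page by
atomically swapping its refcount down to zero. A minimal sketch of that
check, assuming the existing page_ref_freeze() helper from
<linux/page_ref.h>; freeze_for_migration() and expected_refs are
illustrative names here, not the actual mm/migrate.c code: ]

	/*
	 * Migration may proceed only if nobody else holds a reference:
	 * page_ref_freeze() does an atomic cmpxchg(refcount, expected, 0).
	 */
	static int freeze_for_migration(struct page *page, int expected_refs)
	{
		/*
		 * A page parked on a per-cpu mlock pagevec carries one
		 * extra reference, so the cmpxchg sees a higher count,
		 * fails, and the migration attempt returns -EAGAIN.
		 */
		if (!page_ref_freeze(page, expected_refs))
			return -EAGAIN;
		return 0;
	}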
> At first I thought this would be fixed by a local mlock_page_drain() at
> the upper rmap_walk() level - which would have nicely batched all the
> munlocks of that page; but tests show that the task can too easily move
> to another cpu, leaving pagevec residue behind which fails the migration.
>
> So try_to_migrate_one() now drains the local pagevec after page_remove_rmap()
> from a VM_LOCKED vma; the same is done in try_to_unmap_one(), whose
> TTU_IGNORE_MLOCK users would want the same treatment; and in
> remove_migration_pte() - not important when successfully inserting
> a new page, but necessary when hoping to retry after failure.
>
> Any new pagevec runs the risk of adding a new way of stranding, and we
> might discover other corners where mlock_page_drain() or lru_add_drain()
> would now help. If the mlock pagevec raises doubts, we can easily add a
> sysctl to tune its length to 1, which reverts to synchronous operation.
Not a fan of adding new sysctls like this one, as that just pushes the
failures of kernel devs onto poor admins :)
The old pagevec usage deleted by patch 1 was limited to the naturally larger
munlock_vma_pages_range() operation. The new per-cpu based one is more
general, which obviously has its advantages, but it might also bring new
corner cases.
So if this turns out to be a big problem, I would rather go back to the
limited-scenario pagevec than add a sysctl?
> Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/migrate.c | 2 ++
> mm/rmap.c | 4 ++++
> 2 files changed, 6 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f4bcf1541b62..e7d0b68d5dcb 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -251,6 +251,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
> page_add_file_rmap(new, vma, false);
> set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
> }
> + if (vma->vm_flags & VM_LOCKED)
> + mlock_page_drain(smp_processor_id());
>
> /* No need to invalidate - it was non-present before */
> update_mmu_cache(vma, pvmw.address, pvmw.pte);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 5442a5c97a85..714bfdc72c7b 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1656,6 +1656,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> * See Documentation/vm/mmu_notifier.rst
> */
> page_remove_rmap(subpage, vma, PageHuge(page));
> + if (vma->vm_flags & VM_LOCKED)
> + mlock_page_drain(smp_processor_id());
> put_page(page);
> }
>
> @@ -1930,6 +1932,8 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
> * See Documentation/vm/mmu_notifier.rst
> */
> page_remove_rmap(subpage, vma, PageHuge(page));
> + if (vma->vm_flags & VM_LOCKED)
> + mlock_page_drain(smp_processor_id());
> put_page(page);
> }
>
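[ For context on what the added mlock_page_drain() calls flush: a minimal
sketch of such a per-cpu drain, assuming a per-cpu pagevec like the one
patch 10/13 introduces; mlock_pvec and process_mlock_pagevec() are
hypothetical names for illustration, not verbatim from the series: ]

	#include <linux/mm.h>
	#include <linux/pagevec.h>
	#include <linux/percpu.h>

	/* Per-cpu batch of pages awaiting mlock/munlock processing. */
	static DEFINE_PER_CPU(struct pagevec, mlock_pvec);

	static void process_mlock_pagevec(struct pagevec *pvec)
	{
		/*
		 * Hypothetical stand-in for the series' batch handler:
		 * it would move each page on/off the unevictable LRU,
		 * then drop the reference the pagevec held - the very
		 * reference that was failing the migration freeze.
		 */
		release_pages(pvec->pages, pagevec_count(pvec));
		pagevec_reinit(pvec);
	}

	void mlock_page_drain_sketch(int cpu)
	{
		struct pagevec *pvec = &per_cpu(mlock_pvec, cpu);

		if (pagevec_count(pvec))
			process_mlock_pagevec(pvec);
	}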