From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv3 03/12] mm: fix handling PTE-mapped THPs in page_referenced()
Date: Sat, 4 Feb 2017 13:33:53 +0300
Message-ID: <20170204103353.GA8013@node.shutemov.name>
In-Reply-To: <20170202152655.GB22823@dhcp22.suse.cz>
On Thu, Feb 02, 2017 at 04:26:56PM +0100, Michal Hocko wrote:
> On Sun 29-01-17 20:38:49, Kirill A. Shutemov wrote:
> > For PTE-mapped THP page_check_address_transhuge() is not adequate: it
> > cannot find all relevant PTEs, only the first one. It means we can miss
> > some references of the page and it can result in suboptimal decisions by
> > vmscan.
> >
> > Let's switch it to page_vma_mapped_walk().
> >
> > I don't think it's subject for stable@: it's not fatal. The only side
> > effect is that THP can be swapped out when it shouldn't.
>
> Please be more specific about the situation when this happens and how a
> user can recognize this is going on. In other words when should I
> consider backporting this series.
First you need a huge PMD to get split with split_huge_pmd(). It can
happen due to munmap(), mprotect(), mremap(), etc. After split_huge_pmd()
we have the THP mapped with a bunch of PTEs instead of a single PMD.
The bug is that the kernel only sees pte_young() on the PTE that maps the
first 4k, but not on the rest. So if your access pattern touches the THP, but
not the first 4k, the page can be reclaimed unfairly and possibly
re-faulted from swap soon after.
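
To make the failure mode concrete, here is a userspace toy model of it. This
is only a sketch and not kernel code: the struct and all toy_*/referenced_*
names are made up for illustration, and 512 PTEs per THP assumes a 2M huge
page with 4k base pages (as on x86-64). It models the accessed bits as a
plain array and compares "look at the first PTE only" with "look at every
PTE that maps the page".

```c
/* Userspace toy model, NOT kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 2M THP / 4k base pages = 512 PTEs after the PMD is split. */
#define SUBPAGES_PER_THP 512

/* One "accessed" flag per PTE that maps a 4k piece of the THP. */
struct toy_thp_mapping {
	bool young[SUBPAGES_PER_THP];
};

/* Behaviour described above: only the PTE for the first 4k is checked. */
static bool referenced_first_pte_only(const struct toy_thp_mapping *m)
{
	return m->young[0];
}

/* Intended behaviour: visit every PTE that maps a part of the page. */
static bool referenced_all_ptes(const struct toy_thp_mapping *m)
{
	for (size_t i = 0; i < SUBPAGES_PER_THP; i++)
		if (m->young[i])
			return true;
	return false;
}

/* Simulate an access to subpage idx, which sets its accessed flag. */
static void toy_touch(struct toy_thp_mapping *m, size_t idx)
{
	assert(idx < SUBPAGES_PER_THP);
	m->young[idx] = true;
}
```

With a zeroed mapping and a single toy_touch(&m, 100), the first-PTE check
still reports the page as not referenced while the full walk reports it as
referenced. That is the case where reclaim would pick the wrong page.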
I don't think it's visible to the user, except as unneeded swap-out/swap-in
on rare occasions.
> Also the interface is quite awkward imho. Why cannot we provide a
> callback into page_vma_mapped_walk and call it for each pte/pmd that
> matters to the given page? Wouldn't that be much easier than the loop
> around page_vma_mapped_walk iterator?
I don't agree that an interface with a callback would be easier. You would
also need to pass down additional context, packing and unpacking it on
both ends. I don't think that makes the interface less awkward.
But it's a matter of taste.
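
As a rough illustration of the two styles, here is a userspace sketch. It is
not the real page_vma_mapped_walk() interface: struct toy_walk,
toy_walk_next(), toy_walk_cb() and the count_young_*() helpers are all
hypothetical, and the "PTEs" are just an array of accessed flags. Both
variants count the young entries; the difference is where the caller's state
lives.

```c
/* Userspace sketch, NOT the kernel interface. All names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PTES 8

/* Iterator state, loosely modelled on a loop-style walk. */
struct toy_walk {
	const bool *young;	/* accessed flags of the PTEs mapping the page */
	size_t next;		/* index of the next PTE to visit */
	size_t cur;		/* index of the PTE returned by the last step */
};

/* Advance to the next PTE; returns false once the walk is finished. */
static bool toy_walk_next(struct toy_walk *w)
{
	if (w->next >= TOY_NR_PTES)
		return false;
	w->cur = w->next++;
	return true;
}

/* Iterator style: the caller's locals are used directly in the loop body. */
static int count_young_iter(const bool *young)
{
	struct toy_walk w = { .young = young };
	int referenced = 0;

	while (toy_walk_next(&w))
		if (w.young[w.cur])
			referenced++;
	return referenced;
}

/* Callback style: caller state has to be packed into a context struct... */
struct toy_count_ctx {
	int referenced;
};

typedef void (*toy_pte_fn)(bool young, void *ctx);

static void toy_walk_cb(const bool *young, toy_pte_fn fn, void *ctx)
{
	for (size_t i = 0; i < TOY_NR_PTES; i++)
		fn(young[i], ctx);
}

/* ...and unpacked again inside the callback. */
static void toy_count_one(bool young, void *ctx)
{
	struct toy_count_ctx *c = ctx;

	if (young)
		c->referenced++;
}

static int count_young_cb(const bool *young)
{
	struct toy_count_ctx ctx = { .referenced = 0 };

	toy_walk_cb(young, toy_count_one, &ctx);
	return ctx.referenced;
}
```

Both count_young_iter() and count_young_cb() return the same count for the
same input; the callback variant just needs the extra context struct and the
void * round trip.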
--
Kirill A. Shutemov
Thread overview: 16+ messages
2017-01-29 17:38 [PATCHv3 00/12] Fix few rmap-related THP bugs Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 01/12] uprobes: split THPs before trying replace them Kirill A. Shutemov
2017-01-31 15:44 ` Oleg Nesterov
2017-01-29 17:38 ` [PATCHv3 02/12] mm: introduce page_vma_mapped_walk() Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 03/12] mm: fix handling PTE-mapped THPs in page_referenced() Kirill A. Shutemov
2017-02-02 15:26 ` Michal Hocko
2017-02-04 10:33 ` Kirill A. Shutemov [this message]
2017-01-29 17:38 ` [PATCHv3 04/12] mm: fix handling PTE-mapped THPs in page_idle_clear_pte_refs() Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 05/12] mm, rmap: check all VMAs that PTE-mapped THP can be part of Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 06/12] mm: convert page_mkclean_one() to use page_vma_mapped_walk() Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 07/12] mm: convert try_to_unmap_one() " Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 08/12] mm, ksm: convert write_protect_page() " Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 09/12] mm, uprobes: convert __replace_page() " Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 10/12] mm: convert page_mapped_in_vma() " Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 11/12] mm: drop page_check_address{,_transhuge} Kirill A. Shutemov
2017-01-29 17:38 ` [PATCHv3 12/12] mm: convert remove_migration_pte() to use page_vma_mapped_walk() Kirill A. Shutemov