From: Michal Hocko <mhocko@kernel.org>
To: gregkh@linuxfoundation.org
Cc: keno@juliacomputing.com, akpm@linux-foundation.org,
ben@decadent.org.uk, gthelen@google.com, hughd@google.com,
keescook@chromium.org, kirill.shutemov@linux.intel.com,
luto@kernel.org, npiggin@gmail.com, oleg@redhat.com,
torvalds@linux-foundation.org, w@1wt.eu, stable@vger.kernel.org,
stable-commits@vger.kernel.org
Subject: Re: Patch "mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp" has been added to the 4.4-stable tree
Date: Mon, 10 Apr 2017 17:34:17 +0200
Message-ID: <20170410153417.GH4618@dhcp22.suse.cz>
In-Reply-To: <14918369578070@kroah.com>

On Mon 10-04-17 17:09:17, Greg KH wrote:
[...]
> From 303681d5d538d81b5e23754515202b5b9febd2e9 Mon Sep 17 00:00:00 2001
> From: Keno Fischer <keno@juliacomputing.com>
> Date: Tue, 24 Jan 2017 15:17:48 -0800
> Subject: mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp
>
> From: Keno Fischer <keno@juliacomputing.com>
>
> commit 8310d48b125d19fcd9521d83b8293e63eb1646aa upstream.
This backport is wrong. See
http://lkml.kernel.org/r/20170328131154.GH18241@dhcp22.suse.cz
>
> In commit 19be0eaffa3a ("mm: remove gup_flags FOLL_WRITE games from
> __get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE
> after a COW was resolved to setting the (newly introduced) FOLL_COW
> instead. Simultaneously, the check in gup.c was updated to still allow
> writes with FOLL_FORCE set if FOLL_COW had also been set.
>
> However, a similar check in huge_memory.c was forgotten. As a result,
> remote memory writes to ro regions of memory backed by transparent huge
> pages cause an infinite loop in the kernel (handle_mm_fault sets
> FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails
> out immediately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is
> true).
>
> While in this state the process is still SIGKILLable, but little else
> works (e.g. no ptrace attach, no other signals). This is easily
> reproduced with the following code (assuming thp are set to always):
>
> #include <assert.h>
> #include <fcntl.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> #include <unistd.h>
>
> #define TEST_SIZE 5 * 1024 * 1024
>
> int main(void) {
>   int status;
>   pid_t child;
>   int fd = open("/proc/self/mem", O_RDWR);
>   void *addr = mmap(NULL, TEST_SIZE, PROT_READ,
>                     MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
>   assert(addr != MAP_FAILED);
>   pid_t parent_pid = getpid();
>   if ((child = fork()) == 0) {
>     void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE,
>                        MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
>     assert(addr2 != MAP_FAILED);
>     memset(addr2, 'a', TEST_SIZE);
>     pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr);
>     return 0;
>   }
>   assert(child == waitpid(child, &status, 0));
>   assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
>   return 0;
> }
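
The reproducer above assumes thp are set to always. As a rough sketch of how that precondition could be checked before running it, the helper below reads the usual sysfs knob, /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the one printed in brackets. The function names thp_mode_is_always() and thp_enabled_always() are made up for this illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* The active mode is the bracketed one, e.g. "[always] madvise never". */
static bool thp_mode_is_always(const char *contents)
{
    return strstr(contents, "[always]") != NULL;
}

/* Returns 1 if THP is set to always, 0 if not, -1 if the file cannot be read. */
static int thp_enabled_always(const char *path)
{
    char buf[128];
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return thp_mode_is_always(buf) ? 1 : 0;
}
```

Calling thp_enabled_always("/sys/kernel/mm/transparent_hugepage/enabled") before the fork would let the test skip itself on systems where the mode is madvise or never, instead of passing without exercising the huge pmd path.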
>
> Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously
> to the update in gup.c in the original commit. The same pattern exists
> in follow_devmap_pmd. However, we should not be able to reach that
> check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we
> ever do.
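
To make the retry loop described in the changelog concrete, here is a small userspace model of the check before and after the fix. It is only a sketch and not kernel code: struct model_pmd, model_fault() and faults_until_followed() are hypothetical names, the flag values are chosen for the model, and the cow_done field is an assumption standing in for whatever test the real helper uses to decide that the COW cycle has completed.

```c
#include <stdbool.h>

/* Flag names mirror the kernel's; the values here are only for this model. */
#define FOLL_WRITE 0x01u
#define FOLL_FORCE 0x10u
#define FOLL_COW   0x4000u

/* Hypothetical stand-in for a huge pmd mapped in a read-only VMA. */
struct model_pmd {
    bool writable;   /* models pmd_write() */
    bool cow_done;   /* models "went through a COW cycle" */
};

/* Check as it stood before the fix: any write to a read-only pmd bails out. */
static bool old_can_follow(const struct model_pmd *pmd, unsigned int flags)
{
    return !((flags & FOLL_WRITE) && !pmd->writable);
}

/* Check after the fix: FOLL_FORCE may proceed once FOLL_COW is set. */
static bool new_can_follow(const struct model_pmd *pmd, unsigned int flags)
{
    if (!(flags & FOLL_WRITE))
        return true;
    return pmd->writable ||
           ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd->cow_done);
}

/* Models the fault path: COW is resolved, FOLL_COW is set, pmd stays read-only. */
static void model_fault(struct model_pmd *pmd, unsigned int *flags)
{
    pmd->cow_done = true;
    *flags |= FOLL_COW;
}

/* Returns the number of faults taken before the page is followed,
 * or -1 if max_retries is exhausted without making progress. */
static int faults_until_followed(bool fixed, int max_retries)
{
    struct model_pmd pmd = { .writable = false, .cow_done = false };
    unsigned int flags = FOLL_WRITE | FOLL_FORCE;

    for (int faults = 0; faults <= max_retries; faults++) {
        bool ok = fixed ? new_can_follow(&pmd, flags)
                        : old_can_follow(&pmd, flags);
        if (ok)
            return faults;
        model_fault(&pmd, &flags);
    }
    return -1;
}
```

In this model the fixed check lets the lookup succeed after a single fault, while the old check never succeeds no matter how large max_retries is, which corresponds to the loop the changelog describes.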
>
> [akpm@linux-foundation.org: coding-style fixes]
> Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.com
> Signed-off-by: Keno Fischer <keno@juliacomputing.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Willy Tarreau <w@1wt.eu>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Hugh Dickins <hughd@google.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> [bwh: Backported to 3.16:
>  - Drop change to follow_devmap_pmd()
>  - pmd_dirty() is not available; check the page flags as in older
>    backports of can_follow_write_pte()
>  - Adjust context]
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> ---
> mm/huge_memory.c | 19 ++++++++++++++++---
> 1 file changed, 16 insertions(+), 3 deletions(-)
>
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1269,6 +1269,18 @@ out_unlock:
>  	return ret;
>  }
>
> +/*
> + * FOLL_FORCE can write to even unwritable pmd's, but only
> + * after we've gone through a COW cycle and they are dirty.
> + */
> +static inline bool can_follow_write_pmd(pmd_t pmd, struct page *page,
> +					unsigned int flags)
> +{
> +	return pmd_write(pmd) ||
> +		((flags & FOLL_FORCE) && (flags & FOLL_COW) &&
> +		 page && PageAnon(page));
> +}
> +
>  struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
>  				   unsigned long addr,
>  				   pmd_t *pmd,
> @@ -1279,9 +1291,6 @@ struct page *follow_trans_huge_pmd(struc
>
>  	assert_spin_locked(pmd_lockptr(mm, pmd));
>
> -	if (flags & FOLL_WRITE && !pmd_write(*pmd))
> -		goto out;
> -
>  	/* Avoid dumping huge zero page */
>  	if ((flags & FOLL_DUMP) && is_huge_zero_pmd(*pmd))
>  		return ERR_PTR(-EFAULT);
> @@ -1292,6 +1301,10 @@ struct page *follow_trans_huge_pmd(struc
>
>  	page = pmd_page(*pmd);
>  	VM_BUG_ON_PAGE(!PageHead(page), page);
> +
> +	if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, page, flags))
> +		goto out;
> +
>  	if (flags & FOLL_TOUCH) {
>  		pmd_t _pmd;
>  		/*
>
>
> Patches currently in stable-queue which might be from keno@juliacomputing.com are
>
> queue-4.4/mm-huge_memory.c-respect-foll_force-foll_cow-for-thp.patch
--
Michal Hocko
SUSE Labs