From: "Li Xinhai" <lixinhai.lxh@gmail.com>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>,
n-horiguchi <n-horiguchi@ah.jp.nec.com>,
akpm <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux API <linux-api@vger.kernel.org>,
Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v2] mm: Fix checking unmapped holes for mbind
Date: Thu, 31 Oct 2019 11:15:09 +0800 [thread overview]
Message-ID: <2019103111150700409251@gmail.com> (raw)
In-Reply-To: 201910291756045288126@gmail.com
On 2019-10-29 at 17:56 Li Xinhai wrote:
>queue_pages_range() will check for unmapped holes besides queue pages for
>migration. The rules for checking unmapped holes are:
>1 Unmapped holes at any part of the specified range should be reported as
> EFAULT if mbind() for none MPOL_DEFAULT cases;
>2 Unmapped holes at any part of the specified range should be ignored if
> mbind() for MPOL_DEFAULT case;
>Note that the second rule is the current implementation, but it seems
>conflicts the Linux API definition.
>
>queue_pages_test_walk() is fixed by introduce new fields in struct
>queue_pages which help to check:
>1 holes at head and tail side of specified range;
>2 the whole range is in a hole;
>
>Besides, queue_pages_test_walk() must update previous vma record no matter
>the current vma should be considered for queue pages or not.
>
More details about current issue (which breaks the mbind() API definition):
1. In queue_pages_test_walk()
checking on (!vma->vm_next && vma->vm_end < end) would never success,
because 'end' passed from walk_page_test() make sure "end <= vma->vm_end". so hole
beyond the last vma can't be detected.
2. queue_pages_test_walk() only called for vma, and 'start', 'end' parameters are guaranteed
within vma. Then, if the range starting or ending in an unmapped hole,
queue_pages_test_walk() don't have chance to be called to check. In other words, the
current code can detect this case (range span over hole):
[ vma ][ hole ][ vma]
[ range ]
but cant not detect these case :
[ vma ][ hole ][ vma]
[ range ]
[ vma ][ hole ][ vma ]
[ range ]
3. a checking in mbind_range() try to recover if the hole is in head side, but can't
recover if hole is in tail side of range.
- Xinhai
>Fixes: 9d8cebd4bcd7 ("mm: fix mbind vma merge problem")
>Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
>Fixes: 48684a65b4e3 ("mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)")
>Signed-off-by: Li Xinhai <lixinhai.li@gmail.com>
>---
>Changes in v2:
> - Fix the unmapped holes checking in queue_pages_test_walk() instead of
> mbind_rnage().
>
> mm/mempolicy.c | 44 +++++++++++++++++++++++++++++---------------
> 1 file changed, 29 insertions(+), 15 deletions(-)
>
>diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>index 4ae967bcf954..24087dfa4dcd 100644
>--- a/mm/mempolicy.c
>+++ b/mm/mempolicy.c
>@@ -411,6 +411,9 @@ struct queue_pages {
> unsigned long flags;
> nodemask_t *nmask;
> struct vm_area_struct *prev;
>+ unsigned long start;
>+ unsigned long end;
>+ int in_hole;
> };
>
> /*
>@@ -618,28 +621,31 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
> unsigned long endvma = vma->vm_end;
> unsigned long flags = qp->flags;
>
>- /*
>- * Need check MPOL_MF_STRICT to return -EIO if possible
>- * regardless of vma_migratable
>- */
>- if (!vma_migratable(vma) &&
>- !(flags & MPOL_MF_STRICT))
>- return 1;
>-
>+ /* range check first */
> if (endvma > end)
> endvma = end;
>- if (vma->vm_start > start)
>- start = vma->vm_start;
>+ BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
>
>+ qp->in_hole = 0;
> if (!(flags & MPOL_MF_DISCONTIG_OK)) {
>- if (!vma->vm_next && vma->vm_end < end)
>+ if ((!vma->vm_next && vma->vm_end < qp->end) ||
>+ (vma->vm_next && qp->end < vma->vm_next->vm_start))
> return -EFAULT;
>- if (qp->prev && qp->prev->vm_end < vma->vm_start)
>+ if ((qp->prev && qp->prev->vm_end < vma->vm_start) ||
>+ (!qp->prev && qp->start < vma->vm_start))
> return -EFAULT;
> }
>
> qp->prev = vma;
>
>+ /*
>+ * Need check MPOL_MF_STRICT to return -EIO if possible
>+ * regardless of vma_migratable
>+ */
>+ if (!vma_migratable(vma) &&
>+ !(flags & MPOL_MF_STRICT))
>+ return 1;
>+
> if (flags & MPOL_MF_LAZY) {
> /* Similar to task_numa_work, skip inaccessible VMAs */
> if (!is_vm_hugetlb_page(vma) &&
>@@ -679,14 +685,23 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
> nodemask_t *nodes, unsigned long flags,
> struct list_head *pagelist)
> {
>+ int err;
> struct queue_pages qp = {
> .pagelist = pagelist,
> .flags = flags,
> .nmask = nodes,
> .prev = NULL,
>+ .start = start,
>+ .end = end,
>+ .in_hole = 1,
> };
>
>- return walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
>+ err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
>+ /* whole range in unmapped hole */
>+ if (qp->in_hole && !(flags & MPOL_MF_DISCONTIG_OK))
>+ err = -EFAULT;
>+
>+ return err;
> }
>
> /*
>@@ -738,8 +753,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
> unsigned long vmend;
>
> vma = find_vma(mm, start);
>- if (!vma || vma->vm_start > start)
>- return -EFAULT;
>+ BUG_ON(!vma);
>
> prev = vma->vm_prev;
> if (start > vma->vm_start)
>--
>2.22.0
next prev parent reply other threads:[~2019-10-31 3:15 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-29 9:56 [PATCH v2] mm: Fix checking unmapped holes for mbind Li Xinhai
2019-10-31 3:15 ` Li Xinhai [this message]
2019-10-31 9:06 ` Naoya Horiguchi
2019-10-31 12:32 ` Li Xinhai
2019-10-31 4:08 ` Andrew Morton
2019-10-31 6:01 ` Li Xinhai
2019-10-31 11:26 ` Michal Hocko
2019-11-05 2:29 ` Li Xinhai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2019103111150700409251@gmail.com \
--to=lixinhai.lxh@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.