All of lore.kernel.org
 help / color / mirror / Atom feed
From: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Hugh Dickins <hughd@google.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Shaohua Li <shaohua.li@intel.com>, Michal Hocko <mhocko@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com
Subject: Re: [PATCH] fix bug introduced in "mm: simplify find_vma_prev()"
Date: Tue, 06 Mar 2012 19:27:32 -0500	[thread overview]
Message-ID: <4F56AB74.2080301@gmail.com> (raw)
In-Reply-To: <Pine.LNX.4.64.1203061326120.22195@file.rdu.redhat.com>

(3/6/12 1:57 PM), Mikulas Patocka wrote:
>
>
> On Sun, 4 Mar 2012, Linus Torvalds wrote:
>
>> On Sun, Mar 4, 2012 at 4:52 PM, Mikulas Patocka<mpatocka@redhat.com>  wrote:
>>>
>>> This patch fixes a bug introduced in "mm: simplify find_vma_prev()". You
>>> can apply this, or alternatively revert the original patch.
>>
>> Hmm.
>>
>> I realize the patch is a bug-fix, but I do wonder if we shouldn't look
>> at alternatives.
>>
>> In particular, how about we just change the rule to be clearer, and
>> make the rule be:
>>
>>   - no more find_vma_prev() AT ALL.
>>
>>   - callers get turned into "find_vma()" and then we have a
>> "vma_prev()", which basically does your thing.
>>
>>   - many of the callers seem to be interested in "prev" *only* if they
>> found a vma. For example, the do_mlock() kind of logic seems to be
>> common:
>>
>>      vma = find_vma_prev(current->mm, start,&prev);
>>      if (!vma || vma->vm_start>  start)
>>          return -ENOMEM;
>>
>>     which implies that the current find_vma_prev() semantics are really
>> wrong, because they force us to do extra work in this case where we
>> really really don't care about the result.
>>
>> So we could do an entirely mechanical conversion of
>>
>>     vma = find_vma_prev(mm, address,&prev_vma);
>>
>> into
>>
>>     vma = find_vma(mm, address);
>>     prev_vma = vma_prev(vma);
>>
>> and then after we've done that conversion, it looks like the bulk of
>> them will not care about "prev_vma" if vma is NULL, so they can then
>> be replaced with a pure
>>
>>     prev_vma = vma->vm_prev;
>>
>> instead. Leaving just the (few) cases that may care about the
>> "previous vma may be the last vma of the VM" case.
>>
>> Hmm? And we'd get away from that horrible "find_vma_prev()" interface
>> that uses pointers to vma pointers for return values. Ugh. There only
>> seems to be nine callers in the whole kernel.
>>
>>                   Linus
>
> You can try this to remove most users of find_vma_prev (only three are
> left --- those dealing with up-growing stack). But the patch should be
> delayed until the next merge window.
>
> Mikulas
>

Thank you, Mikulas for finding and fixing the bug. And sorry for the long
delay. I was offlined a while.

Following is the replacing last three caller of find_vma_prev. oh, I use
find_last_vma() instead vma_prev(vma) because parisc need last vma search
when Milulas's original testcase. Does this make sense to you? if yes, we
can remove find_vma_prev() completely.



=========================
Subject: [PATCH] mm: introduce find_last_vma()

Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
  arch/ia64/mm/fault.c   |   20 +++++++++++++-------
  arch/parisc/mm/fault.c |    6 +++---
  include/linux/mm.h     |    3 +++
  mm/mmap.c              |   32 +++++++++++++++++++++++++++++++-
  4 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 20b3593..27ca30c 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -112,18 +112,16 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
  
  	down_read(&mm->mmap_sem);
  
-	vma = find_vma_prev(mm, address, &prev_vma);
-	if (!vma && !prev_vma )
-		goto bad_area;
+	vma = find_vma(mm, address);
  
-        /*
-         * find_vma_prev() returns vma such that address < vma->vm_end or NULL
+	/*
+         * find_vma() returns vma such that address < vma->vm_end or NULL
           *
           * May find no vma, but could be that the last vm area is the
           * register backing store that needs to expand upwards, in
-         * this case vma will be null, but prev_vma will ne non-null
+         * this case vma will be null and a stack need to be expanded.
           */
-        if (( !vma && prev_vma ) || (address < vma->vm_start) )
+	if (!vma || (address < vma->vm_start))
  		goto check_expansion;
  
    good_area:
@@ -177,6 +175,14 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
  	return;
  
    check_expansion:
+	if (vma)
+		prev_vma = vma->vm_prev;
+	else {
+		prev_vma = find_last_vma(mm);
+		if (!prev_vma)
+			goto bad_area;
+	}
+
  	if (!(prev_vma && (prev_vma->vm_flags & VM_GROWSUP) && (address == prev_vma->vm_end))) {
  		if (!vma)
  			goto bad_area;
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index 18162ce..7b5a466 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -170,7 +170,7 @@ int fixup_exception(struct pt_regs *regs)
  void do_page_fault(struct pt_regs *regs, unsigned long code,
  			      unsigned long address)
  {
-	struct vm_area_struct *vma, *prev_vma;
+	struct vm_area_struct *vma;
  	struct task_struct *tsk = current;
  	struct mm_struct *mm = tsk->mm;
  	unsigned long acc_type;
@@ -180,7 +180,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
  		goto no_context;
  
  	down_read(&mm->mmap_sem);
-	vma = find_vma_prev(mm, address, &prev_vma);
+	vma = find_vma(mm, address);
  	if (!vma || address < vma->vm_start)
  		goto check_expansion;
  /*
@@ -222,7 +222,7 @@ good_area:
  	return;
  
  check_expansion:
-	vma = prev_vma;
+	vma = vma ? vma->vm_prev : find_last_vma(mm);
  	if (vma && (expand_stack(vma, address) == 0))
  		goto good_area;
  
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 17b27cd..d758818 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1463,6 +1463,9 @@ extern int expand_upwards(struct vm_area_struct *vma, unsigned long address);
  
  /* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
  extern struct vm_area_struct * find_vma(struct mm_struct * mm, unsigned long addr);
+#if defined(CONFIG_STACK_GROWSUP) || defined(CONFIG_IA64)
+extern struct vm_area_struct *find_last_vma(struct mm_struct *mm);
+#endif
  extern struct vm_area_struct * find_vma_prev(struct mm_struct * mm, unsigned long addr,
  					     struct vm_area_struct **pprev);
  
diff --git a/mm/mmap.c b/mm/mmap.c
index 22e1a0b..3e9c186 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1605,6 +1605,32 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
  
  EXPORT_SYMBOL(find_vma);
  
+#if defined(CONFIG_STACK_GROWSUP) || defined(CONFIG_IA64)
+/* Look up the last VMA,  NULL if mm have no vma. */
+struct vm_area_struct *find_last_vma(struct mm_struct *mm)
+{
+	struct vm_area_struct *vma = NULL;
+	struct rb_node *rb_node;
+
+	BUG_ON(!mm);
+
+	rb_node = mm->mm_rb.rb_node;
+	while (rb_node) {
+		vma = rb_entry(rb_node, struct vm_area_struct, vm_rb);
+		rb_node = rb_node->rb_right;
+	}
+
+	/*
+	 * This function is mainly used for a page fault. Thus recording vma
+	 * may improve the performance.
+	 */
+	if (vma)
+		mm->mmap_cache = vma;
+
+	return vma;
+}
+#endif
+
  /*
   * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
   * Note: pprev is set to NULL when return value is NULL.
@@ -1789,9 +1815,13 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
  	struct vm_area_struct *vma, *prev;
  
  	addr &= PAGE_MASK;
-	vma = find_vma_prev(mm, addr, &prev);
+	vma = find_vma(mm, addr);
  	if (vma && (vma->vm_start <= addr))
  		return vma;
+
+	prev = vma->vm_prev;
+	if (!prev)
+		prev = find_last_vma(mm);
  	if (!prev || expand_stack(prev, addr))
  		return NULL;
  	if (prev->vm_flags & VM_LOCKED) {
-- 
1.7.1



  reply	other threads:[~2012-03-07  0:27 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-05  0:52 [PATCH] fix bug introduced in "mm: simplify find_vma_prev()" Mikulas Patocka
2012-03-05  1:22 ` Linus Torvalds
2012-03-06 18:57   ` Mikulas Patocka
2012-03-07  0:27     ` KOSAKI Motohiro [this message]
2012-03-07  0:30 ` KOSAKI Motohiro
2012-03-07  2:27   ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F56AB74.2080301@gmail.com \
    --to=kosaki.motohiro@gmail.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.cz \
    --cc=mpatocka@redhat.com \
    --cc=shaohua.li@intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.