From: Nick Piggin <npiggin@suse.de>
To: "Mika Penttilä" <mika.penttila@kolumbus.fi>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Tony Battersby <tonyb@cybernetics.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] more ZERO_PAGE handling ( was 2.6.24 regression: deadlock on coredump of big process)
Date: Wed, 30 Apr 2008 07:09:03 +0200
Message-ID: <20080430050903.GC27652@wotan.suse.de>
In-Reply-To: <4817FDA5.1040702@kolumbus.fi>

On Wed, Apr 30, 2008 at 08:03:33AM +0300, Mika Penttilä wrote:
> KAMEZAWA Hiroyuki wrote:
> >On Tue, 29 Apr 2008 10:10:58 -0400
> >Tony Battersby <tonyb@cybernetics.com> wrote:
> >  
> >>If I leave more memory free by changing the argument to
> >>malloc_all_but_x_mb(), then I have to increase the number of threads
> >>required to trigger the deadlock.  Changing the thread stack size via
> >>setrlimit(RLIMIT_STACK) also changes the number of threads that are
> >>required to trigger the deadlock.  For example, with
> >>malloc_all_but_x_mb(16) and the default stack size of 8 MB, <= 5 threads
> >>will coredump successfully, and >= 6 threads will deadlock.  With
> >>malloc_all_but_x_mb(16) and a reduced stack size of 4096 bytes, <= 8
> >>threads will coredump successfully, and >= 9 threads will deadlock.
> >>
> >>Also note that the "free" command reports 10 MB free memory while the
> >>program is running before the segfault is triggered.
> >>
> >>    
> >Hmm, my idea is below.
> >
> >Nick's remove-ZERO_PAGE patch includes the following change:
> >
> >==
> >@@ -2252,39 +2158,24 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
> >        spinlock_t *ptl;
> > {
> ><snip>
> >-               page_add_new_anon_rmap(page, vma, address);
> >-       } else {
> >-               /* Map the ZERO_PAGE - vm_page_prot is readonly */
> >-               page = ZERO_PAGE(address);
> >-               page_cache_get(page);
> >-               entry = mk_pte(page, vma->vm_page_prot);
> >+       if (unlikely(anon_vma_prepare(vma)))
> >+               goto oom;
> >+       page = alloc_zeroed_user_highpage_movable(vma, address);
> >==
> >
> >The above change avoids using ZERO_PAGE on a read page fault in an
> >anonymous vma. This is reasonable, I think. But at coredump, tons of
> >read-but-never-written pages can be allocated.
> >==
> >coredump
> >  -> get_user_pages()
> >       -> follow_page() returns NULL
> >            -> handle mm fault
> >                 -> do_anonymous page.
> >==
> >follow_page() returns ZERO_PAGE only when the page table is not available.
> >
> >So, making follow_page() return ZERO_PAGE can be a fix for the extra memory
> >consumption at core dump. (Maybe someone can think of another fix.)
> >
> >How about this patch? Could you try it?
> >
> >(I'm sorry but I'll not be active for a week because my servers are 
> >powered off.)
> >
> >-Kame
> >
> >  
> 
> 
> But surely we still have to handle the fault for, for instance, swapped
> pages, for other uses of get_user_pages();

Yeah, it does need to test for pte_none.


