From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: starlight@binnacle.cx, Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, bugzilla-daemon@bugzilla.kernel.org,
	bugme-daemon@bugzilla.kernel.org, Adam Litke <agl@us.ibm.com>,
	Eric B Munson <ebmunson@us.ibm.com>,
	riel@redhat.com
Subject: Re: [Bugme-new] [Bug 13302] New: "bad pmd" on fork() of process with hugepage shared memory segments attached
Date: Wed, 20 May 2009 11:05:15 -0400	[thread overview]
Message-ID: <1242831915.6194.15.camel@lts-notebook> (raw)
In-Reply-To: <1242831218.6194.13.camel@lts-notebook>

On Wed, 2009-05-20 at 10:53 -0400, Lee Schermerhorn wrote:
> On Wed, 2009-05-20 at 12:35 +0100, Mel Gorman wrote:
> > On Fri, May 15, 2009 at 02:53:27PM -0400, starlight@binnacle.cx wrote:
> > > Here's another possible clue:
> > > 
> > > I tried the first 'tcbm' testcase on a 2.6.27.7
> > > kernel that was hanging around from a few months
> > > ago and it breaks it 100% of the time.
> > > 
> > > Completely hoses huge memory.  Enough "bad pmd"
> > > errors to fill the kernel log.
> > > 
> > 
> > So I investigated what's wrong with 2.6.27.7. The problem is a race between
> > exec() and the handling of mlock()ed VMAs but I can't see where. The normal
> > teardown of pages is applied to a shared memory segment as if VM_HUGETLB
> > was not set.
> > 
> > This was fixed between 2.6.27 and 2.6.28 but apparently by accident during the
> > introduction of CONFIG_UNEVICTABLE_LRU. This patchset made a number of changes
> > to how mlock()ed VMAs are handled but I didn't spot which was the relevant change
> > that fixed the problem and reverse bisecting didn't help. I've added two people
> > that were working on the unevictable LRU patches to see if they spot something.
> 
> Hi, Mel:
> and still do.  With the unevictable lru, mlock()/mmap(MAP_LOCKED) now move
> the mlocked pages to the unevictable lru list and munlock, including at
> exit, must rescue them from the unevictable list.   Since hugepages are
> not maintained on the lru and don't get reclaimed, we don't want to move
> them to the unevictable list.  However, we still want to populate the
> page tables.  So, we still call [_]mlock_vma_pages_range() for hugepage
> vmas, but after making the pages present to preserve prior behavior, we
> remove the VM_LOCKED flag from the vma.

Wow!  that got garbled.  not sure how.  Message was intended to start
here:

> The basic change to hugepage handling with the unevictable lru
> patches is that we no longer keep a huge page vma marked with
> VM_LOCKED.  So, at exit time, there is no record that this is a
> VM_LOCKED vma.
> 
> A bit of context:  before the unevictable lru, mlock() or
> mmap(MAP_LOCKED) would just set the VM_LOCKED flag and
> "make_pages_present()" for all but a few vma types.  We've always
> excluded those that get_user_pages() can't handle and still do.  With
> the unevictable lru, mlock()/mmap(MAP_LOCKED) now move the mlocked pages to
> the unevictable lru list and munlock, including at exit, must rescue
> them from the unevictable list.   Since hugepages are not maintained on
> the lru and don't get reclaimed, we don't want to move them to the
> unevictable list.  However, we still want to populate the page tables.
> So, we still call [_]mlock_vma_pages_range() for hugepage vmas, but
> after making the pages present to preserve prior behavior, we remove the
> VM_LOCKED flag from the vma.
> 
> This may have resulted in the apparent fix to the subject problem in
> 2.6.28...
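
As a rough illustration of the behaviour described in the quoted text
above, here is a toy Python model.  It is not kernel code: the Vma class,
the function names and the flag bit values are all made up for this
sketch.  It only shows that a hugepage vma gets its page tables populated
but no longer carries VM_LOCKED afterwards, so nothing at exit time sees
it as a locked vma.

```python
# Toy model only; names and bit values are illustrative, not the kernel's.
VM_LOCKED = 0x1
VM_HUGETLB = 0x2


class Vma:
    def __init__(self, flags):
        self.flags = flags
        self.populated = False           # page tables filled in?
        self.on_unevictable_lru = False  # pages moved to unevictable list?


def mlock_vma(vma):
    """Model mlock()/mmap(MAP_LOCKED) with the unevictable lru in place."""
    vma.flags |= VM_LOCKED
    vma.populated = True  # pages are made present in both cases
    if vma.flags & VM_HUGETLB:
        # Hugepages are not kept on the lru, so the flag is dropped
        # again once the pages have been made present.
        vma.flags &= ~VM_LOCKED
    else:
        vma.on_unevictable_lru = True


def needs_munlock_at_exit(vma):
    """Exit-time teardown only munlocks vmas still marked VM_LOCKED."""
    return bool(vma.flags & VM_LOCKED)
```

In this model a hugepage vma ends up populated with
needs_munlock_at_exit() false, while an ordinary vma stays VM_LOCKED and
has to be rescued from the unevictable list at exit.
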
> 
> > 
> > For context, the two attached files are used to reproduce a problem
> > where bad pmd messages are scribbled all over the console on 2.6.27.7.
> > Do something like
> > 
> > echo 64 > /proc/sys/vm/nr_hugepages
> > mount -t hugetlbfs none /mnt
> > sh ./test-tcbm.sh
> > 
> > I did confirm that it didn't matter to 2.6.29.1 if CONFIG_UNEVICTABLE_LRU is
> > set or not.  It's possible the race is still there but I don't know where
> > it is.
> > 
> > Any ideas where the race might be?
> 
> No, sorry.  Haven't had time to investigate this.
> 
> Lee
> > 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


