From: Mel Gorman <mel@csn.ul.ie>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Hugh Dickins <hugh@veritas.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Greg KH <gregkh@suse.de>,
Maksim Yevmenkin <maksim.yevmenkin@gmail.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Nick Piggin <npiggin@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
will@crowder-design.com, Rik van Riel <riel@redhat.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Mikos Szeredi <miklos@szeredi.hu>,
Andy Whitcroft <apw@canonical.com>
Subject: Re: [PATCH] Fix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments
Date: Mon, 2 Feb 2009 21:50:42 +0000 [thread overview]
Message-ID: <20090202215042.GD9840@csn.ul.ie> (raw)
In-Reply-To: <alpine.LFD.2.00.0902021121120.3247@localhost.localdomain>
On Mon, Feb 02, 2009 at 11:23:33AM -0800, Linus Torvalds wrote:
>
>
> On Mon, 2 Feb 2009, Mel Gorman wrote:
> >
> > Do not account for address space usage when making hugetlbfs mappings RW
> >
> > hugetlbfs accounts for its address space usage separate from the VM
> > core. VM_ACCOUNT should not be set for its mappings but it is possible it gets
> > set if a user creates a RO hugetlbfs mapping MAP_NORESERVE and then calls
> > mprotect(). This patch stops VM_ACCOUNT being set for hugetlbfs mappings
> > during mprotect().
> >
> > Credit goes to Kosaki Motohiro for spotting this.
> >
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> >
> > diff --git a/mm/mprotect.c b/mm/mprotect.c
> > index abe2694..31ddc6a 100644
> > --- a/mm/mprotect.c
> > +++ b/mm/mprotect.c
> > @@ -153,7 +153,7 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
> > * but (without finer accounting) cannot reduce our commit if we
> > * make it unwritable again.
> > */
> > - if (newflags & VM_WRITE) {
> > + if (newflags & VM_WRITE && !(vma->vm_flags & VM_HUGETLB)) {
>
> Wouldn't it be _much_ nicer to just depend on that whole VM_NORESERVE
> thing?
>
Yeah, it would but it's not a trivial change. mm/hugetlb.c depends on
VM_NORESERVE for its own accounting but also depends on VM_ACCOUNT not being
set because counters would get mucked up when the VMAs get unmapped.
The ideal answer would be to handle VM_ACCOUNT properly but it's not
clear-cut. If it's counted towards reserves, then we are double reserving -
the huge pages already allocated and base pages that will never be used. Then
again, maybe the right thing to do is update nr_accounted when VM_HUGETLB
is not set converting things like
if (vma->vm_flags & VM_ACCOUNT)
*nr_accounted += (end - start) >> PAGE_SHIFT;
to
if (vma->vm_flags & (VM_ACCOUNT | VM_HUGETLB) == VM_ACCOUNT)
*nr_accounted += (end - start) >> PAGE_SHIFT;
?
> Those hugetlb mappings _should_ have VM_NORESERVE on them, so the
> following test:
>
> > if (!(oldflags & (VM_ACCOUNT|VM_WRITE|
> > VM_SHARED|VM_NORESERVE))) {
> > charged = nrpages;
>
> should do it all correctly.
>
Lets say someone does the following
1. mmap(PROT_READ, MAP_PRIVATE) on a hugetlbfs file
VM_ACCOUNT is not set for hugetlbfs
VM_NORESERVE is not set because MAP_NORESERVE was not there
2. mprotect(PROT_WRITE)
VM_ACCOUNT|VM_WRITE|VM_SHARE|VM_NORESERVE == 0
That check is true
newflags |= VM_ACCOUNT and hugetlbfs now has VM_ACCOUNT
3. unmap the vmas
nr_accounted gets decremented, maybe wraps negative and
unhappiness ensues
> Why make up some ad-hoc testing, when we already have a flag for _exactly_
> this issue. That's what VM_NORESERVE means: don't apply VM_ACCOUNT.
>
> IOW, I don't see the point of this patch at all.
>
> And if there is some hugetlb path that doesn't set VM_NORESERVE, then fix
> _that_ instead.
>
It gets set all right, the problem is with VM_ACCOUNT getting set.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
next prev parent reply other threads:[~2009-02-02 21:50 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <bb4a86c70901281151w4300605r3882461cd6e9774a@mail.gmail.com>
[not found] ` <alpine.LFD.2.00.0901281316450.3123@localhost.localdomain>
2009-01-29 20:03 ` [PATCH] Fix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments Lee Schermerhorn
2009-01-29 20:33 ` Linus Torvalds
2009-01-29 20:48 ` Linus Torvalds
2009-01-29 22:32 ` Hugh Dickins
2009-01-29 23:02 ` Linus Torvalds
2009-01-30 4:43 ` Lee Schermerhorn
2009-01-30 4:49 ` Linus Torvalds
2009-01-29 22:47 ` Maksim Yevmenkin
2009-01-29 22:48 ` Randy Dunlap
2009-01-29 23:31 ` Maksim Yevmenkin
2009-01-30 2:08 ` Linus Torvalds
2009-01-30 5:56 ` Greg KH
2009-01-30 16:36 ` Linus Torvalds
2009-01-30 17:40 ` Hugh Dickins
2009-01-30 18:14 ` Linus Torvalds
2009-01-30 18:30 ` Hugh Dickins
2009-01-30 19:53 ` Lee Schermerhorn
2009-01-30 20:31 ` Linus Torvalds
2009-01-30 21:12 ` Hugh Dickins
2009-01-30 21:25 ` Linus Torvalds
2009-01-30 21:36 ` Lee Schermerhorn
2009-01-30 22:27 ` Linus Torvalds
2009-01-31 12:35 ` Hugh Dickins
2009-01-31 18:34 ` Linus Torvalds
2009-02-02 11:59 ` KOSAKI Motohiro
2009-02-02 12:54 ` Hugh Dickins
2009-02-02 14:10 ` KOSAKI Motohiro
2009-02-02 18:58 ` Mel Gorman
2009-02-02 19:23 ` Linus Torvalds
2009-02-02 21:50 ` Mel Gorman [this message]
2009-02-02 22:12 ` Linus Torvalds
2009-02-02 22:35 ` Mel Gorman
2009-02-02 18:33 ` Mel Gorman
2009-02-03 16:13 ` Lee Schermerhorn
2009-02-03 16:40 ` Linus Torvalds
2009-02-03 17:10 ` Hugh Dickins
2009-02-03 21:50 ` Lee Schermerhorn
2009-01-30 21:37 ` Linus Torvalds
2009-01-31 12:16 ` Hugh Dickins
2009-01-30 20:33 ` Hugh Dickins
2009-01-30 20:53 ` Randy Dunlap
2009-01-30 20:59 ` Lee Schermerhorn
2009-01-30 21:11 ` Will Crowder
2009-01-30 23:44 ` Greg KH
2009-01-30 8:34 ` Peter Zijlstra
2009-01-30 16:45 ` Linus Torvalds
2009-01-30 16:49 ` Randy Dunlap
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090202215042.GD9840@csn.ul.ie \
--to=mel@csn.ul.ie \
--cc=Lee.Schermerhorn@hp.com \
--cc=akpm@linux-foundation.org \
--cc=apw@canonical.com \
--cc=gregkh@suse.de \
--cc=hugh@veritas.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=maksim.yevmenkin@gmail.com \
--cc=miklos@szeredi.hu \
--cc=npiggin@suse.de \
--cc=riel@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=will@crowder-design.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.