From: David Gibson <david@gibson.dropbear.id.au>
To: Andrew Barry <abarry@cray.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>, Rik van Riel <riel@redhat.com>,
Minchan Kim <minchan.kim@gmail.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Mel Gorman <mgorman@suse.de>, Hugh Dickins <hughd@google.com>,
Andrew Hastings <abh@cray.com>
Subject: Re: [PATCH v2 1/1] hugepages: Fix race between hugetlbfs umount and quota update.
Date: Tue, 23 Aug 2011 14:10:20 +1000
Message-ID: <20110823041020.GQ30097@yookeroo.fritz.box>
In-Reply-To: <4E52B71A.9030108@cray.com>
On Mon, Aug 22, 2011 at 03:07:54PM -0500, Andrew Barry wrote:
> On 08/19/2011 04:51 PM, Andrew Morton wrote:
> > What's different about hugetlbfs? Why don't other filesystems hit this?
> >
> > <investigates further>
> >
> > OK so the incorrect interaction happened in free_huge_page(), which is
> > called via the compound page destructor (this dtor is "what's different
> > about hugetlbfs"). What is incorrect about this is
> >
> > a) that we're doing fs operations in response to a
> > get_user_pages()/put_page() operation which has *nothing* to do with
> > filesystems!
> >
> > b) that we continue to try to do that fs operation against an fs
> > which was unmounted and freed three days ago. duh.
>
> Yes.
>
> > So I hereby pronounce that
> >
> > a) It was wrong to manipulate hugetlbfs quotas within
> > free_huge_page(). Because free_huge_page() is a low-level
> > page-management function which shouldn't know about one of its
> > specific clients (in this case, hugetlbfs).
> >
> > In fact it's wrong for there to be *any* mention of hugetlbfs
> > within hugetlb.c.
> >
> > b) I shouldn't have merged that hugetlbfs quota code. whodidthat.
> > Mel, Adam, Dave, at least...
> >
> > c) The proper fix here is to get that hugetlbfs quota code out of
> > free_huge_page() and do it all where it belongs: within hugetlbfs
> > code.
> >
> >
> > Regular filesystems don't need to diddle quota counts within
> > page_cache_release(). Why should hugetlbfs need to?
>
> Is there anyone, more expert in hugetlbfs code than I, who can/should/will take
> that on?

As far as I can tell, the hugetlbfs "quota" counts that are updated
here don't share much with the normal quota mechanisms. The way they
operate, they logically divide the pool of free huge pages between
different hugetlbfs instances. This means that you can give different
hugepage mounts to different applications and they won't be able to
exhaust each other's resources.

I can't see how that can be done without updating the count somewhere
at free_huge_page() time.
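
To make the accounting concrete, here is a minimal user-space sketch in
C. It is not the kernel's code: hpage_pool, mount_quota,
quota_get_page() and quota_put_page() are hypothetical names, and the
sketch ignores locking entirely. It assumes each mount has a fixed page
limit and that every page taken from the shared pool is charged to the
mount that asked for it, so the charge has to be dropped again when the
page goes back to the pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared pool of free huge pages (not a kernel structure). */
struct hpage_pool {
    long free_pages;
};

/* Hypothetical per-mount accounting: a limit and a current charge. */
struct mount_quota {
    struct hpage_pool *pool;  /* pool this mount draws from */
    long limit;               /* most pages this mount may hold */
    long used;                /* pages currently charged to this mount */
};

/* Charge one page to the mount; fails if the mount is at its limit
 * or the shared pool is empty. */
bool quota_get_page(struct mount_quota *q)
{
    if (q->used >= q->limit || q->pool->free_pages == 0)
        return false;
    q->used++;
    q->pool->free_pages--;
    return true;
}

/* Drop the charge when the page is freed. Without this step the mount
 * would stay charged for a page it no longer holds. */
void quota_put_page(struct mount_quota *q)
{
    assert(q->used > 0);
    q->used--;
    q->pool->free_pages++;
}
```

With a pool of four pages and two mounts limited to two pages each, one
mount can take its two pages and is then refused a third, while the
other mount can still allocate. The refused mount can allocate again
only after quota_put_page() runs for one of its pages, which is the
update that has to happen somewhere on the free path.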
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson