From: Nishanth Aravamudan <nacc@us.ibm.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
Dave Hansen <dave@linux.vnet.ibm.com>,
akpm@linux-foundation.org, linux-mm@kvack.org, kniht@us.ibm.com,
abh@cray.com, joachim.deguara@amd.com, lee.schermerhorn@hp.com
Subject: Re: [patch 14/21] x86: add hugepagesz option on 64-bit
Date: Thu, 5 Jun 2008 17:23:28 -0600
Message-ID: <20080605232328.GE31534@us.ibm.com>
In-Reply-To: <20080605231247.GC31534@us.ibm.com>
On 05.06.2008 [17:12:47 -0600], Nishanth Aravamudan wrote:
> On 04.06.2008 [03:10:16 +0200], Nick Piggin wrote:
> > On Tue, Jun 03, 2008 at 10:57:52PM +0200, Andi Kleen wrote:
> > > > The downside of something like this is that you have yet another data
> > > > structure to manage. Andi, do you think something like this would be
> > > > workable?
> > >
> > > The reason I don't like your proposal is that it makes only sense
> > > with a lot of hugepage sizes being active at the same time. But the
> > > API (one mount per size) doesn't really scale to that anyways.
> > > It should support two (as on x86), three if you stretch it, but
> > > anything beyond would be difficult.
> > > If you really wanted to support a zillion sizes you would at least
> > > first need a different flexible interface that completely hides page
> > > sizes.
> > > Otherwise you would drive both sysadmins and programmers crazy and
> > > overlong command lines would be the smallest of their problems
> > > With two or even three sizes only the whole thing is not needed and my original
> > > scheme works fine IMHO.
> > >
> > > That is why I was also sceptical of the newly proposed sysfs interfaces.
> > > For two or three numbers you don't need a sysfs interface.
> >
> > I do think your proc enhancements are clever, and you're right that
> > for the current setup they are pretty workable. The reason I haven't
> > submitted them in this round is because they do cause libhugetlbfs
> > failures... maybe that's just because the regression suite does
> > really dumb parsing, and nothing important will break, but it is the
> > only thing I have to go on so I have to give it some credit ;)
>
> Will chime in here that yes, regardless of anything we do here,
> libhugetlbfs will need to be updated to leverage multiple hugepage sizes
> available at run-time. And I think it is sane to make sure that the
> parser we have either is fixed if it has a bug that is causing the
> failures or assuming the failures indicate a userspace interface change
> :)
>
> > With the sysfs API, we have a way to control the other hstates, so it
> > takes a little importance off the proc interface.
> >
> > sysfs doesn't appear to give a huge improvement yet (although I still
> > think it is nicer), but I think the hugetlbfs guys want to have control
> > over which nodes things get allocated on etc. so I think proc really
> > was going to run out of steam at some point.
>
> Well, I know Lee S. really wants it and it could help on large NUMA
> systems using cpusets or other process restriction methods to be able to
> specify which nodes the hugepages get allocated on.
Oh, and I imagine the layout will be something like (on power):
/sys/kernel/hugepages/hugepages-64kB/nodeX/nr_hugepages
/sys/kernel/hugepages/hugepages-64kB/nodeX/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-64kB/nodeX/free_hugepages
/sys/kernel/hugepages/hugepages-64kB/nodeX/resv_hugepages
/sys/kernel/hugepages/hugepages-64kB/nodeX/surplus_hugepages
/sys/kernel/hugepages/hugepages-64kB/nr_hugepages
/sys/kernel/hugepages/hugepages-64kB/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-64kB/free_hugepages
/sys/kernel/hugepages/hugepages-64kB/resv_hugepages
/sys/kernel/hugepages/hugepages-64kB/surplus_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nodeX/nr_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nodeX/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nodeX/free_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nodeX/resv_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nodeX/surplus_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nr_hugepages
/sys/kernel/hugepages/hugepages-16384kB/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-16384kB/free_hugepages
/sys/kernel/hugepages/hugepages-16384kB/resv_hugepages
/sys/kernel/hugepages/hugepages-16384kB/surplus_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nodeX/nr_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nodeX/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nodeX/free_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nodeX/resv_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nodeX/surplus_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nr_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/free_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/resv_hugepages
/sys/kernel/hugepages/hugepages-16777216kB/surplus_hugepages
Where X varies over all possible nids. Does that seem reasonable? I'm
not entirely sure I like the amount of repetition of the topology, e.g.,
does it make more sense to have:
/sys/kernel/hugepages/hugepages-64kB
/sys/kernel/hugepages/hugepages-16384kB
/sys/kernel/hugepages/hugepages-16777216kB
/sys/kernel/hugepages/nodeX/hugepages-64kB
/sys/kernel/hugepages/nodeX/hugepages-16384kB
/sys/kernel/hugepages/nodeX/hugepages-16777216kB
?
Would definitely be a more compact representation...
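
To make the comparison concrete, here is a rough sketch (Python, purely
illustrative, not part of any patch) that enumerates both layouts for the
three power page sizes above. The node ids passed in are hypothetical, and
the attribute files under the second layout are an assumption, since only
its directories are listed above. It suggests that both layouts expose the
same number of attribute files and that the difference is how many times
the nodeX level is repeated.

```python
# Attribute files named in the layout above.
ATTRS = [
    "nr_hugepages",
    "nr_overcommit_hugepages",
    "free_hugepages",
    "resv_hugepages",
    "surplus_hugepages",
]
# Page sizes from the power example, in kB.
SIZES_KB = [64, 16384, 16777216]
ROOT = "/sys/kernel/hugepages"


def size_first(node_ids):
    """First layout: hugepages-<size>kB/[nodeX/]<attr>."""
    paths = []
    for size in SIZES_KB:
        base = f"{ROOT}/hugepages-{size}kB"
        for nid in node_ids:
            paths += [f"{base}/node{nid}/{attr}" for attr in ATTRS]
        paths += [f"{base}/{attr}" for attr in ATTRS]
    return paths


def node_first(node_ids):
    """Second layout: [nodeX/]hugepages-<size>kB/<attr>.

    Assumption: each directory holds the same five attribute files.
    """
    paths = []
    for size in SIZES_KB:
        paths += [f"{ROOT}/hugepages-{size}kB/{attr}" for attr in ATTRS]
    for nid in node_ids:
        for size in SIZES_KB:
            base = f"{ROOT}/node{nid}/hugepages-{size}kB"
            paths += [f"{base}/{attr}" for attr in ATTRS]
    return paths


def node_dirs(paths):
    """Distinct directories whose last component is a nodeX entry."""
    found = set()
    for path in paths:
        parts = path.split("/")
        for i, part in enumerate(parts):
            if part.startswith("node"):
                found.add("/".join(parts[: i + 1]))
    return found


if __name__ == "__main__":
    nids = [0, 1]  # hypothetical two-node system
    for name, layout in (("size-first", size_first), ("node-first", node_first)):
        paths = layout(nids)
        print(name, "files:", len(paths), "nodeX dirs:", len(node_dirs(paths)))
```

With S page sizes and N nodes, the first layout ends up with S * N nodeX
directories and the second with N, while the attribute file count is
S * (N + 1) * 5 either way.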
Thanks,
Nish