linux-mm.kvack.org archive mirror
From: Nishanth Aravamudan <nacc@us.ibm.com>
To: Greg KH <gregkh@suse.de>
Cc: Nick Piggin <npiggin@suse.de>,
	wli@holomorphy.com, clameter@sgi.com, agl@us.ibm.com,
	luick@cray.com, Lee.Schermerhorn@hp.com, linux-mm@kvack.org
Subject: Re: [RFC][PATCH 4/5] Documentation: add node files to sysfs ABI
Date: Mon, 14 Apr 2008 14:05:06 -0700
Message-ID: <20080414210506.GA6350@us.ibm.com>
In-Reply-To: <20080413034136.GA22686@suse.de>

On 12.04.2008 [20:41:36 -0700], Greg KH wrote:
> On Sat, Apr 12, 2008 at 11:41:18AM +0200, Nick Piggin wrote:
> > On Fri, Apr 11, 2008 at 04:56:48PM -0700, Greg KH wrote:
> > > On Fri, Apr 11, 2008 at 04:49:13PM -0700, Nishanth Aravamudan wrote:
> > > > /sys/devices/system/node represents the current NUMA configuration of
> > > > the machine, but is undocumented in the ABI files. Add bare-bones
> > > > documentation for these files.
> > > > 
> > > > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> > > > 
> > > > ---
> > > > Greg, is something like this what you'd want?
> > > 
> > > Yes it is, thanks for doing it.
> > 
> > Can you comment on where the various kernel hugetlb configuration
> > parameters should live? Specifically, what directory should they go in?
> > IMO it should be /sys/kernel/*
> 
> I don't really know.
> 
> > /sys/devices/system/etc should be fine eg. for showing how many pages are
> > available in a given node, or what kinds of TLBs the CPU has, but I would
> > have thought that configuring the kernel's hugetlb settings should be
> > in /sys/kernel.
> 
> /sys/devices/system is for "sysdev" devices, a breed of device
> structures that are problematic to use, and are on my TODO list to
> rework.  If you need a hugetlb parameter to be tied to a cpu or other
> system device, then it should go under here.
> 
> Otherwise, if it is just a "system wide" parameter, then put it in
> /sys/kernel/

We have both, and that's kind of where things are being discussed right
now.

Currently, we have:

/proc/sys/vm/nr_hugepages
/proc/sys/vm/nr_overcommit_hugepages

which are global sysctls.

My patchset would add:

/sys/devices/system/node/nodeX/nr_hugepages

to allow for finer-grained control of the hugetlb pool allocation.
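
As a rough illustration of the two levels of control described above, here
is a minimal Python sketch that builds and reads both kinds of file. The
global sysctl path is the existing one; the per-node path is only the one
proposed by this patchset, and the helper names (global_pool_path,
per_node_pool_path, read_count) and the root parameter are hypothetical,
introduced here for illustration.

```python
from pathlib import Path


def global_pool_path(root="/"):
    """Path of the existing global sysctl for the hugepage pool size."""
    return Path(root) / "proc/sys/vm/nr_hugepages"


def per_node_pool_path(node, root="/"):
    """Path of the per-node attribute as proposed in this patchset.

    Assumption: the file lives at node<N>/nr_hugepages; it is a proposal
    in this thread, not a guaranteed interface.
    """
    return Path(root) / "sys/devices/system/node" / f"node{node}" / "nr_hugepages"


def read_count(path):
    """Read a single integer page count from a sysctl/sysfs-style file."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    # Only touch the real file if it exists on this machine.
    path = global_pool_path()
    if path.exists():
        print(path, read_count(path))
```

The root parameter exists only so the helpers can be pointed at a scratch
directory instead of the live /proc and /sys trees.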

Nick/Andi's patchset would modify /proc/sys/vm/nr_hugepages to allow
specifying the pool sizes for multiple hugepage sizes.

To make my patchset and Nick's work well together, I think we'd need a
per-node, per-hugepage-size interface in sysfs. I pointed out to Nick
that it might be better to put the extended interface (supporting
multiple hugepage sizes) entirely in sysfs, and leave
/proc/sys/vm/nr_hugepages alone (controlling only the default
hugepage size).

That would leave us with [1]:

/sys/kernel/nr_hugepages --> nr_hugepages_2M
/sys/kernel/nr_hugepages_2M
/sys/kernel/nr_hugepages_1G
/sys/kernel/nr_overcommit_hugepages --> nr_overcommit_hugepages_2M
/sys/kernel/nr_overcommit_hugepages_2M
/sys/kernel/nr_overcommit_hugepages_1G

and [2]

/sys/devices/system/node/nodeX/nr_hugepages --> nr_hugepages_2M
/sys/devices/system/node/nodeX/nr_hugepages_2M
/sys/devices/system/node/nodeX/nr_hugepages_1G
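
To make the flat, size-suffixed naming above concrete, the following Python
sketch enumerates the entries it would produce for a given set of hugepage
sizes and nodes. It only generates path strings and touches no files; the
function name flat_layout and its parameters are hypothetical, and the
layout itself is the proposal in this mail, not an existing kernel
interface.

```python
GLOBAL_ATTRS = ("nr_hugepages", "nr_overcommit_hugepages")


def flat_layout(sizes, default, nodes):
    """Map each proposed path to its symlink target (None for a real file).

    sizes   -- hugepage size labels, e.g. ["2M", "1G"]
    default -- the label the unsuffixed names should point at
    nodes   -- NUMA node numbers, e.g. [0, 1]
    """
    entries = {}
    # Global attributes: one unsuffixed alias plus one file per size.
    for attr in GLOBAL_ATTRS:
        entries[f"/sys/kernel/{attr}"] = f"{attr}_{default}"
        for size in sizes:
            entries[f"/sys/kernel/{attr}_{size}"] = None
    # Per-node attributes: only nr_hugepages appears in the proposal.
    for node in nodes:
        base = f"/sys/devices/system/node/node{node}"
        entries[f"{base}/nr_hugepages"] = f"nr_hugepages_{default}"
        for size in sizes:
            entries[f"{base}/nr_hugepages_{size}"] = None
    return entries


if __name__ == "__main__":
    for path, target in flat_layout(["2M", "1G"], "2M", [0, 1]).items():
        print(path if target is None else f"{path} --> {target}")
```

Each additional hugepage size adds two global files and one file per node,
which is the growth the separation into [1] and [2] implies.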

The questions I see are (with my answers):

Is this separation correct?

	- I believe this puts the globals in one place and the per-nodes
	  in another (both of which are correct) keeping things
	  accurate. The per-node interface would be the first writable
	  attribute in /sys/devices/system/node, though.

Is this separation confusing to an administrator?

	- Similar to the previous question, I think the separation
	  corresponds well to the system's layout.

Is there a better way of presenting these attributes?

	- Nick's alternative was to (I think, please CMIIW) have:

	/sys/kernel/hugetlb/2M/nr_hugepages
	/sys/kernel/hugetlb/2M/nr_overcommit_hugepages
	/sys/kernel/hugetlb/2M/nodeX/nr_hugepages
	/sys/kernel/hugetlb/2M/nodeX/nr_overcommit_hugepages

	with perhaps symlinks in /sys/kernel/ or /sys/kernel/hugetlb
	directly to the default pools. And similar directories/files for
	1G pages. This seems like a lot of duplication of the NUMA
	layout, but I can see it also being better in that all of the
	hugetlb-related interface is in one place. [3]
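
The directory-per-size alternative can be sketched the same way. The Python
below lists the paths that layout would contain, under the assumption that
every size directory repeats one nodeX subdirectory per NUMA node, as in
the 2M example above. The name dir_layout and its parameters are
hypothetical; nothing here reflects an existing kernel interface.

```python
ATTRS = ("nr_hugepages", "nr_overcommit_hugepages")


def dir_layout(sizes, nodes):
    """List the paths of the per-size directory layout.

    sizes -- hugepage size labels, e.g. ["2M", "1G"]
    nodes -- NUMA node numbers, e.g. [0, 1]
    """
    paths = []
    for size in sizes:
        base = f"/sys/kernel/hugetlb/{size}"
        # Global attributes for this hugepage size.
        for attr in ATTRS:
            paths.append(f"{base}/{attr}")
        # The node directories are repeated under every size.
        for node in nodes:
            for attr in ATTRS:
                paths.append(f"{base}/node{node}/{attr}")
    return paths


if __name__ == "__main__":
    for path in dir_layout(["2M", "1G"], [0, 1]):
        print(path)
```

The node directories appear once per hugepage size, so the number of
per-node entries grows as sizes times nodes; that repetition is the
duplication of the NUMA layout mentioned above.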

Do you see a particular approach here that is more the sysfs way, Greg?

Thanks for reading this particularly long e-mail,
Nish

[1] Nick suggested using directories in /sys/kernel per-hugepage-size,
but I'm not sure how they should be named, so I went with the simpler
filename-style, to make the point clearer.

[2] I have a patch to allow for per-node dynamic pool control, but it's
pretty gross. Right now, we let the memory policy enforce where we get
hugepages from, presuming we can allocate there. If we had per-node
control, we'd need some way to pass a limit on how many hugepages can be
allocated on a particular node down to alloc_pages, or use round-robin
allocation, which would probably break mempolicies. For now, I've left
the patch alone while I try to find a better way.

[3] Is there an in-between, perhaps, where we keep the real files
in /sys/devices/system/node, but have symlinks, like
/sys/kernel/hugetlb/nodeX/nr_hugepages_2M -->
/sys/devices/system/node/nodeX/nr_hugepages_2M ? That seems like
overkill...

