linuxppc-dev.lists.ozlabs.org archive mirror
From: Nathan Fontenot <nfont@austin.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linuxppc-dev@ozlabs.org, Greg KH <greg@kroah.com>,
	linux-kernel@vger.kernel.org,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	linux-mm@kvack.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
Date: Mon, 16 Aug 2010 09:34:08 -0500
Message-ID: <4C694C60.6030207@austin.ibm.com>
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On 08/12/2010 02:08 PM, Andrew Morton wrote:
> On Mon, 09 Aug 2010 12:53:00 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
> 
>> This set of patches de-couples the idea that there is a single
>> directory in sysfs for each memory section.  The intent of the
>> patches is to reduce the number of sysfs directories created to
>> resolve a boot-time performance issue.  On very large systems
>> boot times are getting very long (as seen on powerpc hardware)
>> due to the enormous number of sysfs directories being created.
>> On a system with 1 TB of memory we create ~63,000 directories.
>> For even larger systems boot times are being measured in hours.
> 
> And those "hours" are mainly due to this problem, I assume.

Yes, those hours are spent creating the sysfs directories for each
of the memory sections.

> 
>> This set of patches allows for each directory created in sysfs
>> to cover more than one memory section.  The default behavior for
>> sysfs directory creation is the same, in that each directory
>> represents a single memory section.  A new file 'end_phys_index'
>> in each directory contains the physical_id of the last memory
>> section covered by the directory so that users can easily
>> determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue!
> 
> It's not an unresolvable issue, as this is a must-fix problem.  But you
> should tell us what your proposal is to prevent breakage of existing
> installations.  A Kconfig option would be good, but a boot-time kernel
> command line option which selects the new format would be much better.

This shouldn't break existing installations, unless an architecture chooses
to do so.  With my patch only the powerpc/pseries arch is updated such that
what is seen in userspace is different.

The default behavior is maintained for all architectures unless they define
their own version of memory_block_size_bytes().  The default definition of
this routine (defined as __weak in Patch 5/8) sets the memory block size
to the same size it currently is, thus preserving the existing one sysfs
directory per memory section.  The only change that will be seen is a new
property for each memory section, 'end_phys_index', which will have the
same value as the existing 'phys_index' property.
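
To make the default-versus-override behavior concrete, here is a minimal
userspace sketch, not the code from Patch 5/8.  The 16 MB section size and
the helper names sections_per_block(), sysfs_dir_count() and
sections_in_dir() are assumptions made up for illustration; only
memory_block_size_bytes() and the weak-symbol approach come from the
description above.

```c
/*
 * Illustrative sketch only.  SKETCH_SECTION_SIZE_BYTES is an assumed
 * value; the real memory section size is defined per architecture.
 */
#define SKETCH_SECTION_SIZE_BYTES (16ULL << 20)   /* assumed: 16 MB */

/*
 * Weak default: a memory block is exactly one memory section.  A file
 * that provides a non-weak function with the same name replaces this
 * definition at link time, which is how an architecture would opt in
 * to larger blocks.
 */
__attribute__((weak)) unsigned long long memory_block_size_bytes(void)
{
	return SKETCH_SECTION_SIZE_BYTES;
}

/* Hypothetical helper: memory sections covered by one sysfs directory. */
unsigned long long sections_per_block(void)
{
	return memory_block_size_bytes() / SKETCH_SECTION_SIZE_BYTES;
}

/* Hypothetical helper: directories needed for total_bytes of memory. */
unsigned long long sysfs_dir_count(unsigned long long total_bytes)
{
	unsigned long long block = memory_block_size_bytes();

	/* Round up so a partial trailing block still gets a directory. */
	return (total_bytes + block - 1) / block;
}

/*
 * Hypothetical helper: sections spanned by a directory, given the
 * values of its phys_index and end_phys_index files (inclusive range).
 */
unsigned long long sections_in_dir(unsigned long long phys_index,
				   unsigned long long end_phys_index)
{
	return end_phys_index - phys_index + 1;
}
```

With the weak default in place, sections_per_block() is 1 and
end_phys_index equals phys_index for every directory, so userspace sees
the same layout as before.  Under the assumed 16 MB section size, 1 TB of
memory works out to 65,536 directories; an override that returns a larger
block size reduces that count proportionally.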

> 
> However you didn't mention this issue at all, and it's the most
> important one.
> 
> 
>> Updates for version 5 of the patchset include the following:
>>
>> Patch 4/8 Add mutex for add/remove of memory blocks
>> - Define the mutex using DEFINE_MUTEX macro.
>>
>> Patch 8/8 Update memory-hotplug documentation
>> - Add information concerning memory holes in phys_index..end_phys_index.
> 
> And you forgot to tell us how long those machines boot with the
> patchset applied, which is the entire point of the patchset!

Yes, I am working on getting more time on our large systems to get
performance numbers with this patchset.  I'll post them when I have them.

-Nathan
