All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nathan Fontenot <nfont@austin.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linuxppc-dev@ozlabs.org, Greg KH <greg@kroah.com>,
	linux-kernel@vger.kernel.org,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	linux-mm@kvack.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
Date: Mon, 16 Aug 2010 09:34:08 -0500	[thread overview]
Message-ID: <4C694C60.6030207@austin.ibm.com> (raw)
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On 08/12/2010 02:08 PM, Andrew Morton wrote:
> On Mon, 09 Aug 2010 12:53:00 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
> 
>> This set of patches de-couples the idea that there is a single
>> directory in sysfs for each memory section.  The intent of the
>> patches is to reduce the number of sysfs directories created to
>> resolve a boot-time performance issue.  On very large systems
>> boot time are getting very long (as seen on powerpc hardware)
>> due to the enormous number of sysfs directories being created.
>> On a system with 1 TB of memory we create ~63,000 directories.
>> For even larger systems boot times are being measured in hours.
> 
> And those "hours" are mainly due to this problem, I assume.

Yes, those hours are spent creating the sysfs directories for each
of the memory sections.

> 
>> This set of patches allows for each directory created in sysfs
>> to cover more than one memory section.  The default behavior for
>> sysfs directory creation is the same, in that each directory
>> represents a single memory section.  A new file 'end_phys_index'
>> in each directory contains the physical_id of the last memory
>> section covered by the directory so that users can easily
>> determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue!
> 
> It's not an unresolvable issue, as this is a must-fix problem.  But you
> should tell us what your proposal is to prevent breakage of existing
> installations.  A Kconfig option would be good, but a boot-time kernel
> command line option which selects the new format would be much better.

This shouldn't break existing installations, unless an architecture chooses
to do so.  With my patch only the powerpc/pseries arch is updated such that
what is seen in userspace is different.

The default behavior is maintained for all architectures unless they define
their own version of memory_block_size_bytes().  The default definition of
this routine (defined as __weak in Patch 5/8) sets the memory block size
to the same size it currently is, and thus preserving the exisitng 1 sysfs
directory per memory section.  The only change that will be seen is a new
propery for memory section, end_phys_addr, which will have the same value
as the existing 'phys_addr' property.

> 
> However you didn't mention this issue at all, and it's the most
> important one.
> 
> 
>> Updates for version 5 of the patchset include the following:
>>
>> Patch 4/8 Add mutex for add/remove of memory blocks
>> - Define the mutex using DEFINE_MUTEX macro.
>>
>> Patch 8/8 Update memory-hotplug documentation
>> - Add information concerning memory holes in phys_index..end_phys_index.
> 
> And you forgot to tell us how long those machines boot with the
> patchset applied, which is the entire point of the patchset!

Yes,  I am working on getting more time on our large systems to get
performance numbers with this patch.  I'll post them when I get them.

-Nathan

WARNING: multiple messages have this Message-ID (diff)
From: Nathan Fontenot <nfont@austin.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linuxppc-dev@ozlabs.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>, Greg KH <greg@kroah.com>
Subject: Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
Date: Mon, 16 Aug 2010 09:34:08 -0500	[thread overview]
Message-ID: <4C694C60.6030207@austin.ibm.com> (raw)
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On 08/12/2010 02:08 PM, Andrew Morton wrote:
> On Mon, 09 Aug 2010 12:53:00 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
> 
>> This set of patches de-couples the idea that there is a single
>> directory in sysfs for each memory section.  The intent of the
>> patches is to reduce the number of sysfs directories created to
>> resolve a boot-time performance issue.  On very large systems
>> boot time are getting very long (as seen on powerpc hardware)
>> due to the enormous number of sysfs directories being created.
>> On a system with 1 TB of memory we create ~63,000 directories.
>> For even larger systems boot times are being measured in hours.
> 
> And those "hours" are mainly due to this problem, I assume.

Yes, those hours are spent creating the sysfs directories for each
of the memory sections.

> 
>> This set of patches allows for each directory created in sysfs
>> to cover more than one memory section.  The default behavior for
>> sysfs directory creation is the same, in that each directory
>> represents a single memory section.  A new file 'end_phys_index'
>> in each directory contains the physical_id of the last memory
>> section covered by the directory so that users can easily
>> determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue!
> 
> It's not an unresolvable issue, as this is a must-fix problem.  But you
> should tell us what your proposal is to prevent breakage of existing
> installations.  A Kconfig option would be good, but a boot-time kernel
> command line option which selects the new format would be much better.

This shouldn't break existing installations, unless an architecture chooses
to do so.  With my patch only the powerpc/pseries arch is updated such that
what is seen in userspace is different.

The default behavior is maintained for all architectures unless they define
their own version of memory_block_size_bytes().  The default definition of
this routine (defined as __weak in Patch 5/8) sets the memory block size
to the same size it currently is, and thus preserving the exisitng 1 sysfs
directory per memory section.  The only change that will be seen is a new
propery for memory section, end_phys_addr, which will have the same value
as the existing 'phys_addr' property.

> 
> However you didn't mention this issue at all, and it's the most
> important one.
> 
> 
>> Updates for version 5 of the patchset include the following:
>>
>> Patch 4/8 Add mutex for add/remove of memory blocks
>> - Define the mutex using DEFINE_MUTEX macro.
>>
>> Patch 8/8 Update memory-hotplug documentation
>> - Add information concerning memory holes in phys_index..end_phys_index.
> 
> And you forgot to tell us how long those machines boot with the
> patchset applied, which is the entire point of the patchset!

Yes,  I am working on getting more time on our large systems to get
performance numbers with this patch.  I'll post them when I get them.

-Nathan

WARNING: multiple messages have this Message-ID (diff)
From: Nathan Fontenot <nfont@austin.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linuxppc-dev@ozlabs.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>, Greg KH <greg@kroah.com>
Subject: Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
Date: Mon, 16 Aug 2010 09:34:08 -0500	[thread overview]
Message-ID: <4C694C60.6030207@austin.ibm.com> (raw)
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On 08/12/2010 02:08 PM, Andrew Morton wrote:
> On Mon, 09 Aug 2010 12:53:00 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
> 
>> This set of patches de-couples the idea that there is a single
>> directory in sysfs for each memory section.  The intent of the
>> patches is to reduce the number of sysfs directories created to
>> resolve a boot-time performance issue.  On very large systems
>> boot time are getting very long (as seen on powerpc hardware)
>> due to the enormous number of sysfs directories being created.
>> On a system with 1 TB of memory we create ~63,000 directories.
>> For even larger systems boot times are being measured in hours.
> 
> And those "hours" are mainly due to this problem, I assume.

Yes, those hours are spent creating the sysfs directories for each
of the memory sections.

> 
>> This set of patches allows for each directory created in sysfs
>> to cover more than one memory section.  The default behavior for
>> sysfs directory creation is the same, in that each directory
>> represents a single memory section.  A new file 'end_phys_index'
>> in each directory contains the physical_id of the last memory
>> section covered by the directory so that users can easily
>> determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue!
> 
> It's not an unresolvable issue, as this is a must-fix problem.  But you
> should tell us what your proposal is to prevent breakage of existing
> installations.  A Kconfig option would be good, but a boot-time kernel
> command line option which selects the new format would be much better.

This shouldn't break existing installations, unless an architecture chooses
to do so.  With my patch only the powerpc/pseries arch is updated such that
what is seen in userspace is different.

The default behavior is maintained for all architectures unless they define
their own version of memory_block_size_bytes().  The default definition of
this routine (defined as __weak in Patch 5/8) sets the memory block size
to the same size it currently is, and thus preserving the exisitng 1 sysfs
directory per memory section.  The only change that will be seen is a new
propery for memory section, end_phys_addr, which will have the same value
as the existing 'phys_addr' property.

> 
> However you didn't mention this issue at all, and it's the most
> important one.
> 
> 
>> Updates for version 5 of the patchset include the following:
>>
>> Patch 4/8 Add mutex for add/remove of memory blocks
>> - Define the mutex using DEFINE_MUTEX macro.
>>
>> Patch 8/8 Update memory-hotplug documentation
>> - Add information concerning memory holes in phys_index..end_phys_index.
> 
> And you forgot to tell us how long those machines boot with the
> patchset applied, which is the entire point of the patchset!

Yes,  I am working on getting more time on our large systems to get
performance numbers with this patch.  I'll post them when I get them.

-Nathan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-08-16 14:34 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-09 17:53 [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections Nathan Fontenot
2010-08-09 17:53 ` Nathan Fontenot
2010-08-09 17:53 ` Nathan Fontenot
2010-08-09 18:35 ` [PATCH 1/8] v5 Move the find_memory_block() routine up Nathan Fontenot
2010-08-09 18:35   ` Nathan Fontenot
2010-08-09 18:35   ` Nathan Fontenot
2010-08-09 18:36 ` [PATCH 2/8] v5 Add new phys_index properties Nathan Fontenot
2010-08-09 18:36   ` Nathan Fontenot
2010-08-09 18:36   ` Nathan Fontenot
2010-08-09 18:37 ` [PATCH 3/8] v5 Add section count to memory_block Nathan Fontenot
2010-08-09 18:37   ` Nathan Fontenot
2010-08-09 18:37   ` Nathan Fontenot
2010-08-09 18:38 ` [PATCH 4/8] v5 Add mutex for add/remove of memory blocks Nathan Fontenot
2010-08-09 18:38   ` Nathan Fontenot
2010-08-09 18:38   ` Nathan Fontenot
2010-08-09 18:39 ` [PATCH 5/8] v5 Allow memory_block to span multiple memory sections Nathan Fontenot
2010-08-09 18:39   ` Nathan Fontenot
2010-08-09 18:39   ` Nathan Fontenot
2010-08-09 18:41 ` [PATCH 6/8] v5 Update the node sysfs code Nathan Fontenot
2010-08-09 18:41   ` Nathan Fontenot
2010-08-09 18:41   ` Nathan Fontenot
2010-08-09 18:42 ` [PATCH 7/8] v5 Define memory_block_size_bytes() for ppc/pseries Nathan Fontenot
2010-08-09 18:42   ` Nathan Fontenot
2010-08-09 18:42   ` Nathan Fontenot
2010-08-09 18:43 ` [PATCH 8/8] v5 Update memory-hotplug documentation Nathan Fontenot
2010-08-09 18:43   ` Nathan Fontenot
2010-08-09 18:43   ` Nathan Fontenot
2010-08-09 20:44   ` Nishanth Aravamudan
2010-08-09 20:44     ` Nishanth Aravamudan
2010-08-09 20:44     ` Nishanth Aravamudan
2010-08-09 20:48     ` Nishanth Aravamudan
2010-08-09 20:48       ` Nishanth Aravamudan
2010-08-10 12:17     ` Nathan Fontenot
2010-08-10 12:17       ` Nathan Fontenot
2010-08-10 12:17       ` Nathan Fontenot
2010-08-11 15:18 ` [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections Dave Hansen
2010-08-11 15:18   ` Dave Hansen
2010-08-11 15:18   ` Dave Hansen
2010-08-12 19:08 ` Andrew Morton
2010-08-12 19:08   ` Andrew Morton
2010-08-12 19:08   ` Andrew Morton
2010-08-12 20:07   ` Dave Hansen
2010-08-12 20:07     ` Dave Hansen
2010-08-12 20:07     ` Dave Hansen
2010-08-16 14:34   ` Nathan Fontenot [this message]
2010-08-16 14:34     ` Nathan Fontenot
2010-08-16 14:34     ` Nathan Fontenot
2010-08-31 18:12     ` Dave Hansen
2010-08-31 18:12       ` Dave Hansen
2010-08-31 18:12       ` Dave Hansen
2010-08-31 21:57 ` Anton Blanchard
2010-08-31 21:57   ` Anton Blanchard
2010-08-31 21:57   ` Anton Blanchard
2010-09-02 17:39   ` Nathan Fontenot
2010-09-02 17:39     ` Nathan Fontenot
2010-09-02 17:39     ` Nathan Fontenot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C694C60.6030207@austin.ibm.com \
    --to=nfont@austin.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave@linux.vnet.ibm.com \
    --cc=greg@kroah.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.