Re: [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Dave Hansen <dave@sr71.net>
Cc: akpm@linux-foundation.org, mgorman@suse.de,
	matthew.garrett@nebula.com, rientjes@google.com, riel@redhat.com,
	arjan@linux.intel.com, srinivas.pandruvada@linux.intel.com,
	maxime.coquelin@stericsson.com, loic.pallardy@stericsson.com,
	kamezawa.hiroyu@jp.fujitsu.com, lenb@kernel.org, rjw@sisk.pl,
	gargankita@gmail.com, paulmck@linux.vnet.ibm.com,
	amit.kachhap@linaro.org, svaidy@linux.vnet.ibm.com,
	andi@firstfloor.org, wujianguo@huawei.com, kmpark@infradead.org,
	thomas.abraham@linaro.org, santosh.shilimkar@ti.com,
	linux-pm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management
Date: Fri, 19 Apr 2013 12:20:35 +0530	[thread overview]
Message-ID: <5170E93B.5090902@linux.vnet.ibm.com> (raw)
In-Reply-To: <517028F1.6000002@sr71.net>

On 04/18/2013 10:40 PM, Dave Hansen wrote:
> On 04/09/2013 02:45 PM, Srivatsa S. Bhat wrote:
>> 2. Performance overhead is expected to be low: Since we retain the simplicity
>>    of the algorithm in the page allocation path, page allocation can
>>    potentially remain as fast as it would be without memory regions. The
>>    overhead is pushed to the page-freeing paths which are not that critical.
> 
> Numbers, please.  The problem with pushing the overhead to frees is that
> they, believe it or not, actually average out to the same as the number
> of allocs.  Think kernel compile, or a large dd.  Both of those churn
> through a lot of memory, and both do an awful lot of allocs _and_ frees.
>  We need to know both the overhead on a system that does *no* memory
> power management, and the overhead on a system which is carved and
> actually using this code.
> 
>> Kernbench results didn't show any noticeable performance degradation with
>> this patchset as compared to vanilla 3.9-rc5.
> 
> Surely this code isn't magical and there's overhead _somewhere_, and
> such overhead can be quantified _somehow_.  Have you made an effort to
> find those cases, even with microbenchmarks?
>

Sorry for not posting the numbers explicitly. It really shows no difference in
kernbench, see below.

[For the following run, I reverted patch 14, since it seems to be intermittently
causing kernel-instability at high loads. So the numbers below show the effect
of only the sorted-buddy part of the patchset, and not the compaction part.]

Kernbench was run on a 2 socket 8 core machine (HT disabled) with 128 GB RAM,
with allyesconfig on 3.9-rc5 kernel source. 

Vanilla 3.9-rc5:
---------------
Fri Apr 19 08:30:12 IST 2013
3.9.0-rc5
Average Optimal load -j 16 Run (std deviation):
Elapsed Time 574.66 (2.31846)
User Time 3919.12 (3.71256)
System Time 339.296 (0.73694)
Percent CPU 740.4 (2.50998)
Context Switches 1.2183e+06 (4019.47)
Sleeps 1.61239e+06 (2657.33)

This patchset (minus patch 14): [Region size = 512 MB]
------------------------------
Fri Apr 19 09:42:38 IST 2013
3.9.0-rc5-mpmv2-nowq
Average Optimal load -j 16 Run (std deviation):
Elapsed Time 575.668 (2.01583)
User Time 3916.77 (3.48345)
System Time 337.406 (0.701591)
Percent CPU 738.4 (3.36155)
Context Switches 1.21683e+06 (6980.13)
Sleeps 1.61474e+06 (4906.23)


So, that shows almost no degradation due to the sorted-buddy logic (considering
the elapsed time).
 
> I still also want to see some hard numbers on:
>> However, memory consumes a significant amount of power, potentially upto
>> more than a third of total system power on server systems.
> and
>> It had been demonstrated on a Samsung Exynos board
>> (with 2 GB RAM) that upto 6 percent of total system power can be saved by
>> making the Linux kernel MM subsystem power-aware[4]. 
> 
> That was *NOT* with this code, and it's nearing being two years old.
> What can *this* *patch* do?
> 

Please let me clarify that. My intention behind quoting that 6% power-savings
number was _not_ to trick reviewers into believing that _this_ patchset provides
that much power-savings. It was only to show that this whole effort of doing memory
power management is not worthless, and we do have some valuable/tangible
benefits to gain from it. IOW, it was only meant to show an _estimate_ of how
much we can potentially save and thus justify the effort behind managing memory
power-efficiently.

As I had mentioned in the cover-letter, I don't have the the exact power-savings
number for this particular patchset yet. I'll definitely work towards getting
those numbers soon.

> I think there are three scenarios to look at.  Let's say you have an 8GB
> system with 1GB regions:
> 1. Normal unpatched kernel, booted with  mem=1G...8G (in 1GB increments
>    perhaps) running some benchmark which sees performance scale with
>    the amount of memory present in the system.
> 2. Kernel patched with this set, running the same test, but with single
>    memory regions.
> 3. Kernel patched with this set.  But, instead of using mem=, you run
>    it trying to evacuate equivalent amount of memory to the amounts you
>    removed using mem=.
> 
> That will tell us both what the overhead is, and how effective it is.
> I'd much rather see actual numbers and a description of the test than
> some hand waving that it "didn't show any noticeable performance
> degradation".
> 

Sure, I'll perform more extensive tests to evaluate the performance overhead
more thoroughly. I'll first fix the compaction logic that seems to be buggy
and run benchmarks again.

Thanks a lot for your all invaluable inputs, Dave!

Regards,
Srivatsa S. Bhat

WARNING: multiple messages have this Message-ID (diff)

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Dave Hansen <dave@sr71.net>
Cc: akpm@linux-foundation.org, mgorman@suse.de,
	matthew.garrett@nebula.com, rientjes@google.com, riel@redhat.com,
	arjan@linux.intel.com, srinivas.pandruvada@linux.intel.com,
	maxime.coquelin@stericsson.com, loic.pallardy@stericsson.com,
	kamezawa.hiroyu@jp.fujitsu.com, lenb@kernel.org, rjw@sisk.pl,
	gargankita@gmail.com, paulmck@linux.vnet.ibm.com,
	amit.kachhap@linaro.org, svaidy@linux.vnet.ibm.com,
	andi@firstfloor.org, wujianguo@huawei.com, kmpark@infradead.org,
	thomas.abraham@linaro.org, santosh.shilimkar@ti.com,
	linux-pm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management
Date: Fri, 19 Apr 2013 12:20:35 +0530	[thread overview]
Message-ID: <5170E93B.5090902@linux.vnet.ibm.com> (raw)
In-Reply-To: <517028F1.6000002@sr71.net>

On 04/18/2013 10:40 PM, Dave Hansen wrote:
> On 04/09/2013 02:45 PM, Srivatsa S. Bhat wrote:
>> 2. Performance overhead is expected to be low: Since we retain the simplicity
>>    of the algorithm in the page allocation path, page allocation can
>>    potentially remain as fast as it would be without memory regions. The
>>    overhead is pushed to the page-freeing paths which are not that critical.
> 
> Numbers, please.  The problem with pushing the overhead to frees is that
> they, believe it or not, actually average out to the same as the number
> of allocs.  Think kernel compile, or a large dd.  Both of those churn
> through a lot of memory, and both do an awful lot of allocs _and_ frees.
>  We need to know both the overhead on a system that does *no* memory
> power management, and the overhead on a system which is carved and
> actually using this code.
> 
>> Kernbench results didn't show any noticeable performance degradation with
>> this patchset as compared to vanilla 3.9-rc5.
> 
> Surely this code isn't magical and there's overhead _somewhere_, and
> such overhead can be quantified _somehow_.  Have you made an effort to
> find those cases, even with microbenchmarks?
>

Sorry for not posting the numbers explicitly. It really shows no difference in
kernbench, see below.

[For the following run, I reverted patch 14, since it seems to be intermittently
causing kernel-instability at high loads. So the numbers below show the effect
of only the sorted-buddy part of the patchset, and not the compaction part.]

Kernbench was run on a 2 socket 8 core machine (HT disabled) with 128 GB RAM,
with allyesconfig on 3.9-rc5 kernel source. 

Vanilla 3.9-rc5:
---------------
Fri Apr 19 08:30:12 IST 2013
3.9.0-rc5
Average Optimal load -j 16 Run (std deviation):
Elapsed Time 574.66 (2.31846)
User Time 3919.12 (3.71256)
System Time 339.296 (0.73694)
Percent CPU 740.4 (2.50998)
Context Switches 1.2183e+06 (4019.47)
Sleeps 1.61239e+06 (2657.33)

This patchset (minus patch 14): [Region size = 512 MB]
------------------------------
Fri Apr 19 09:42:38 IST 2013
3.9.0-rc5-mpmv2-nowq
Average Optimal load -j 16 Run (std deviation):
Elapsed Time 575.668 (2.01583)
User Time 3916.77 (3.48345)
System Time 337.406 (0.701591)
Percent CPU 738.4 (3.36155)
Context Switches 1.21683e+06 (6980.13)
Sleeps 1.61474e+06 (4906.23)


So, that shows almost no degradation due to the sorted-buddy logic (considering
the elapsed time).
 
> I still also want to see some hard numbers on:
>> However, memory consumes a significant amount of power, potentially upto
>> more than a third of total system power on server systems.
> and
>> It had been demonstrated on a Samsung Exynos board
>> (with 2 GB RAM) that upto 6 percent of total system power can be saved by
>> making the Linux kernel MM subsystem power-aware[4]. 
> 
> That was *NOT* with this code, and it's nearing being two years old.
> What can *this* *patch* do?
> 

Please let me clarify that. My intention behind quoting that 6% power-savings
number was _not_ to trick reviewers into believing that _this_ patchset provides
that much power-savings. It was only to show that this whole effort of doing memory
power management is not worthless, and we do have some valuable/tangible
benefits to gain from it. IOW, it was only meant to show an _estimate_ of how
much we can potentially save and thus justify the effort behind managing memory
power-efficiently.

As I had mentioned in the cover-letter, I don't have the the exact power-savings
number for this particular patchset yet. I'll definitely work towards getting
those numbers soon.

> I think there are three scenarios to look at.  Let's say you have an 8GB
> system with 1GB regions:
> 1. Normal unpatched kernel, booted with  mem=1G...8G (in 1GB increments
>    perhaps) running some benchmark which sees performance scale with
>    the amount of memory present in the system.
> 2. Kernel patched with this set, running the same test, but with single
>    memory regions.
> 3. Kernel patched with this set.  But, instead of using mem=, you run
>    it trying to evacuate equivalent amount of memory to the amounts you
>    removed using mem=.
> 
> That will tell us both what the overhead is, and how effective it is.
> I'd much rather see actual numbers and a description of the test than
> some hand waving that it "didn't show any noticeable performance
> degradation".
> 

Sure, I'll perform more extensive tests to evaluate the performance overhead
more thoroughly. I'll first fix the compaction logic that seems to be buggy
and run benchmarks again.

Thanks a lot for your all invaluable inputs, Dave!

Regards,
Srivatsa S. Bhat

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2013-04-19  6:53 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-09 21:45 [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management Srivatsa S. Bhat
2013-04-09 21:45 ` Srivatsa S. Bhat
2013-04-09 21:45 ` [RFC PATCH v2 01/15] mm: Introduce memory regions data-structure to capture region boundaries within nodes Srivatsa S. Bhat
2013-04-09 21:45   ` Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 02/15] mm: Initialize node memory regions during boot Srivatsa S. Bhat
2013-04-09 21:46   ` Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 03/15] mm: Introduce and initialize zone memory regions Srivatsa S. Bhat
2013-04-09 21:46   ` Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 04/15] mm: Add helpers to retrieve node region and zone region for a given page Srivatsa S. Bhat
2013-04-09 21:46   ` Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 05/15] mm: Add data-structures to describe memory regions within the zones' freelists Srivatsa S. Bhat
2013-04-09 21:46   ` Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 06/15] mm: Demarcate and maintain pageblocks in region-order in " Srivatsa S. Bhat
2013-04-09 21:47   ` Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 07/15] mm: Add an optimized version of del_from_freelist to keep page allocation fast Srivatsa S. Bhat
2013-04-09 21:47   ` Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 08/15] bitops: Document the difference in indexing between fls() and __fls() Srivatsa S. Bhat
2013-04-09 21:47   ` Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 09/15] mm: A new optimized O(log n) sorting algo to speed up buddy-sorting Srivatsa S. Bhat
2013-04-09 21:47   ` Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 10/15] mm: Add support to accurately track per-memory-region allocation Srivatsa S. Bhat
2013-04-09 21:47   ` Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 11/15] mm: Restructure the compaction part of CMA for wider use Srivatsa S. Bhat
2013-04-09 21:48   ` Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 12/15] mm: Add infrastructure to evacuate memory regions using compaction Srivatsa S. Bhat
2013-04-09 21:48   ` Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 13/15] mm: Implement the worker function for memory region compaction Srivatsa S. Bhat
2013-04-09 21:48   ` Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 14/15] mm: Add alloc-free handshake to trigger " Srivatsa S. Bhat
2013-04-09 21:48   ` Srivatsa S. Bhat
2013-04-10 23:26   ` Cody P Schafer
2013-04-10 23:26     ` Cody P Schafer
2013-04-16 13:49     ` Srivatsa S. Bhat
2013-04-16 13:49       ` Srivatsa S. Bhat
2013-04-09 21:49 ` [RFC PATCH v2 15/15] mm: Print memory region statistics to understand the buddy allocator behavior Srivatsa S. Bhat
2013-04-09 21:49   ` Srivatsa S. Bhat
2013-04-17 16:53 ` [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management Srinivas Pandruvada
2013-04-17 16:53   ` Srinivas Pandruvada
2013-04-18  9:54   ` Srivatsa S. Bhat
2013-04-18  9:54     ` Srivatsa S. Bhat
2013-04-18 15:13     ` Srinivas Pandruvada
2013-04-18 15:13       ` Srinivas Pandruvada
2013-04-19  8:11       ` Srivatsa S. Bhat
2013-04-19  8:11         ` Srivatsa S. Bhat
2013-04-18 17:10 ` Dave Hansen
2013-04-18 17:10   ` Dave Hansen
2013-04-19  6:50   ` Srivatsa S. Bhat [this message]
2013-04-19  6:50     ` Srivatsa S. Bhat
2013-04-25 17:57   ` Srivatsa S. Bhat
2013-04-25 17:57     ` Srivatsa S. Bhat
2013-04-19  5:34 ` Simon Jeons
2013-04-19  5:34   ` Simon Jeons
2013-04-19  7:12   ` Srivatsa S. Bhat
2013-04-19  7:12     ` Srivatsa S. Bhat
2013-04-19 15:26     ` Srinivas Pandruvada
2013-04-19 15:26       ` Srinivas Pandruvada
2013-05-28 20:08     ` Phillip Susi
2013-05-28 20:08       ` Phillip Susi
2013-05-29  5:36       ` Srivatsa S. Bhat
2013-05-29  5:36         ` Srivatsa S. Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5170E93B.5090902@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=amit.kachhap@linaro.org \
    --cc=andi@firstfloor.org \
    --cc=arjan@linux.intel.com \
    --cc=dave@sr71.net \
    --cc=gargankita@gmail.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kmpark@infradead.org \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=loic.pallardy@stericsson.com \
    --cc=matthew.garrett@nebula.com \
    --cc=maxime.coquelin@stericsson.com \
    --cc=mgorman@suse.de \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=rjw@sisk.pl \
    --cc=santosh.shilimkar@ti.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=svaidy@linux.vnet.ibm.com \
    --cc=thomas.abraham@linaro.org \
    --cc=wujianguo@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.