From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Dave Hansen <dave@sr71.net>
Cc: akpm@linux-foundation.org, mgorman@suse.de,
	matthew.garrett@nebula.com, rientjes@google.com, riel@redhat.com,
	arjan@linux.intel.com, srinivas.pandruvada@linux.intel.com,
	maxime.coquelin@stericsson.com, loic.pallardy@stericsson.com,
	kamezawa.hiroyu@jp.fujitsu.com, lenb@kernel.org, rjw@sisk.pl,
	gargankita@gmail.com, paulmck@linux.vnet.ibm.com,
	amit.kachhap@linaro.org, svaidy@linux.vnet.ibm.com,
	andi@firstfloor.org, wujianguo@huawei.com, kmpark@infradead.org,
	thomas.abraham@linaro.org, santosh.shilimkar@ti.com,
	linux-pm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management
Date: Fri, 19 Apr 2013 12:20:35 +0530
Message-ID: <5170E93B.5090902@linux.vnet.ibm.com>
In-Reply-To: <517028F1.6000002@sr71.net>

On 04/18/2013 10:40 PM, Dave Hansen wrote:
> On 04/09/2013 02:45 PM, Srivatsa S. Bhat wrote:
>> 2. Performance overhead is expected to be low: Since we retain the simplicity
>>    of the algorithm in the page allocation path, page allocation can
>>    potentially remain as fast as it would be without memory regions. The
>>    overhead is pushed to the page-freeing paths which are not that critical.
> 
> Numbers, please.  The problem with pushing the overhead to frees is that
> they, believe it or not, actually average out to the same as the number
> of allocs.  Think kernel compile, or a large dd.  Both of those churn
> through a lot of memory, and both do an awful lot of allocs _and_ frees.
>  We need to know both the overhead on a system that does *no* memory
> power management, and the overhead on a system which is carved and
> actually using this code.
> 
>> Kernbench results didn't show any noticeable performance degradation with
>> this patchset as compared to vanilla 3.9-rc5.
> 
> Surely this code isn't magical and there's overhead _somewhere_, and
> such overhead can be quantified _somehow_.  Have you made an effort to
> find those cases, even with microbenchmarks?
>

Sorry for not posting the numbers explicitly. The patchset really shows no
noticeable difference in kernbench; see below.

[For the following run, I reverted patch 14, since it seems to be intermittently
causing kernel instability at high loads. So the numbers below show the effect
of only the sorted-buddy part of the patchset, not the compaction part.]

Kernbench was run on a 2-socket, 8-core machine (HT disabled) with 128 GB RAM,
building an allyesconfig kernel from the 3.9-rc5 source.

Vanilla 3.9-rc5:
---------------
Fri Apr 19 08:30:12 IST 2013
3.9.0-rc5
Average Optimal load -j 16 Run (std deviation):
Elapsed Time 574.66 (2.31846)
User Time 3919.12 (3.71256)
System Time 339.296 (0.73694)
Percent CPU 740.4 (2.50998)
Context Switches 1.2183e+06 (4019.47)
Sleeps 1.61239e+06 (2657.33)

This patchset (minus patch 14): [Region size = 512 MB]
------------------------------
Fri Apr 19 09:42:38 IST 2013
3.9.0-rc5-mpmv2-nowq
Average Optimal load -j 16 Run (std deviation):
Elapsed Time 575.668 (2.01583)
User Time 3916.77 (3.48345)
System Time 337.406 (0.701591)
Percent CPU 738.4 (3.36155)
Context Switches 1.21683e+06 (6980.13)
Sleeps 1.61474e+06 (4906.23)


So, that shows almost no degradation due to the sorted-buddy logic: comparing
elapsed times, 575.668s vs 574.66s is a difference of about 0.18% (roughly 1s),
which is well within the ~2s run-to-run standard deviation of either kernel.
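
To illustrate where the overhead goes, here is a highly simplified userspace
sketch of the sorted-buddy idea (this is *not* the actual patch code; the names
are made up, and the O(n) insert is only for illustration, since the real
patches track per-region boundaries within the freelists to keep insertion
cheap):

/*
 * Simplified sketch: pages on a freelist are kept sorted by the memory
 * region they belong to, so allocations always come from the head
 * (lowest region), leaving higher regions free for evacuation.
 * The ordering cost is paid on the free path; the allocation path
 * remains a plain O(1) head removal.
 */
#include <stdio.h>
#include <stdlib.h>

#define REGION_SIZE_PAGES 4	/* pretend each region is 4 pages */

struct page {
	unsigned long pfn;
	struct page *next;
};

static struct page *freelist;

static unsigned int page_region(struct page *p)
{
	return p->pfn / REGION_SIZE_PAGES;
}

/* Free path: sorted insert (linear here purely for brevity). */
static void free_page_sorted(struct page *p)
{
	struct page **pos = &freelist;

	while (*pos && page_region(*pos) <= page_region(p))
		pos = &(*pos)->next;
	p->next = *pos;
	*pos = p;
}

/* Allocation path: unchanged O(1) removal from the head, which always
 * holds a page from the lowest-numbered (most "used") region. */
static struct page *alloc_page_fast(void)
{
	struct page *p = freelist;

	if (p)
		freelist = p->next;
	return p;
}

int main(void)
{
	unsigned long pfns[] = { 9, 2, 14, 5, 0, 11 };
	struct page pages[6];
	struct page *p;
	int i;

	for (i = 0; i < 6; i++) {
		pages[i].pfn = pfns[i];
		free_page_sorted(&pages[i]);
	}

	while ((p = alloc_page_fast()))
		printf("allocated pfn %lu from region %u\n",
		       p->pfn, page_region(p));
	return 0;
}

The point is simply that all of the ordering work is done at free time, while
the allocation side remains a plain removal from the head of the list, just as
in the unpatched buddy allocator.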
 
> I still also want to see some hard numbers on:
>> However, memory consumes a significant amount of power, potentially upto
>> more than a third of total system power on server systems.
> and
>> It had been demonstrated on a Samsung Exynos board
>> (with 2 GB RAM) that upto 6 percent of total system power can be saved by
>> making the Linux kernel MM subsystem power-aware[4]. 
> 
> That was *NOT* with this code, and it's nearing being two years old.
> What can *this* *patch* do?
> 

Please let me clarify that. My intention behind quoting that 6% figure was
_not_ to suggest that _this_ patchset already provides that much power savings.
It was only to show that the effort of doing memory power management is
worthwhile, and that there are valuable, tangible benefits to be gained from
it. IOW, it was only meant as an _estimate_ of how much we can potentially
save, to justify the effort of managing memory power-efficiently.

As I had mentioned in the cover letter, I don't have the exact power-savings
numbers for this particular patchset yet. I'll definitely work towards getting
those numbers soon.

> I think there are three scenarios to look at.  Let's say you have an 8GB
> system with 1GB regions:
> 1. Normal unpatched kernel, booted with  mem=1G...8G (in 1GB increments
>    perhaps) running some benchmark which sees performance scale with
>    the amount of memory present in the system.
> 2. Kernel patched with this set, running the same test, but with single
>    memory regions.
> 3. Kernel patched with this set.  But, instead of using mem=, you run
>    it trying to evacuate equivalent amount of memory to the amounts you
>    removed using mem=.
> 
> That will tell us both what the overhead is, and how effective it is.
> I'd much rather see actual numbers and a description of the test than
> some hand waving that it "didn't show any noticeable performance
> degradation".
> 

Sure, I'll perform more extensive tests to evaluate the performance overhead
thoroughly. I'll first fix the compaction logic, which seems to be buggy, and
then rerun the benchmarks.
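
For the benchmark in scenario 1 that scales with the amount of memory present,
something along the lines of the rough sketch below is one option (purely
hypothetical at this point; I may well use an established benchmark instead).
It repeatedly touches a working set whose size is given on the command line, so
its runtime should degrade once the usable memory (whether limited via mem= or
via evacuated regions) drops below the working-set size:

/*
 * Hypothetical memory-scaling microbenchmark sketch, not something
 * I have actually run yet: touch a working set of the given size
 * repeatedly and report the wall-clock time.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(int argc, char **argv)
{
	size_t mb = argc > 1 ? strtoul(argv[1], NULL, 0) : 512;
	size_t size = mb << 20;
	char *buf = malloc(size);
	struct timespec t0, t1;
	int pass;

	if (!buf) {
		perror("malloc");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (pass = 0; pass < 10; pass++)
		memset(buf, pass, size);	/* touch every page, 10 passes */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("touched %zu MB x10 in %.2f s\n", mb,
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	free(buf);
	return 0;
}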

Thanks a lot for all your invaluable inputs, Dave!

Regards,
Srivatsa S. Bhat


Thread overview: 30+ messages
2013-04-09 21:45 [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management Srivatsa S. Bhat
2013-04-09 21:45 ` [RFC PATCH v2 01/15] mm: Introduce memory regions data-structure to capture region boundaries within nodes Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 02/15] mm: Initialize node memory regions during boot Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 03/15] mm: Introduce and initialize zone memory regions Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 04/15] mm: Add helpers to retrieve node region and zone region for a given page Srivatsa S. Bhat
2013-04-09 21:46 ` [RFC PATCH v2 05/15] mm: Add data-structures to describe memory regions within the zones' freelists Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 06/15] mm: Demarcate and maintain pageblocks in region-order in " Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 07/15] mm: Add an optimized version of del_from_freelist to keep page allocation fast Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 08/15] bitops: Document the difference in indexing between fls() and __fls() Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 09/15] mm: A new optimized O(log n) sorting algo to speed up buddy-sorting Srivatsa S. Bhat
2013-04-09 21:47 ` [RFC PATCH v2 10/15] mm: Add support to accurately track per-memory-region allocation Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 11/15] mm: Restructure the compaction part of CMA for wider use Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 12/15] mm: Add infrastructure to evacuate memory regions using compaction Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 13/15] mm: Implement the worker function for memory region compaction Srivatsa S. Bhat
2013-04-09 21:48 ` [RFC PATCH v2 14/15] mm: Add alloc-free handshake to trigger " Srivatsa S. Bhat
2013-04-10 23:26   ` Cody P Schafer
2013-04-16 13:49     ` Srivatsa S. Bhat
2013-04-09 21:49 ` [RFC PATCH v2 15/15] mm: Print memory region statistics to understand the buddy allocator behavior Srivatsa S. Bhat
2013-04-17 16:53 ` [RFC PATCH v2 00/15][Sorted-buddy] mm: Memory Power Management Srinivas Pandruvada
2013-04-18  9:54   ` Srivatsa S. Bhat
2013-04-18 15:13     ` Srinivas Pandruvada
2013-04-19  8:11       ` Srivatsa S. Bhat
2013-04-18 17:10 ` Dave Hansen
2013-04-19  6:50   ` Srivatsa S. Bhat [this message]
2013-04-25 17:57   ` Srivatsa S. Bhat
2013-04-19  5:34 ` Simon Jeons
2013-04-19  7:12   ` Srivatsa S. Bhat
2013-04-19 15:26     ` Srinivas Pandruvada
2013-05-28 20:08     ` Phillip Susi
2013-05-29  5:36       ` Srivatsa S. Bhat
