From: Mike Travis <travis@sgi.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	davej@codemonkey.org.uk, David Miller <davem@davemloft.net>,
	Eric Dumazet <dada1@cosmosbay.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Jack Steiner <steiner@sgi.com>,
	Jeremy Fitzhardinge <jeremy@goop.org>, Jes Sorensen <jes@sgi.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC 07/13] sched: Reduce stack size requirements in kernel/sched.c
Date: Mon, 08 Sep 2008 07:54:09 -0700
Message-ID: <48C53C91.70604@sgi.com>
In-Reply-To: <1220783087.8687.73.camel@twins.programming.kicks-ass.net>

Peter Zijlstra wrote:
> On Sat, 2008-09-06 at 16:50 -0700, Mike Travis wrote:
>> plain text document attachment (stack-hogs-kernel_sched_c)
>> * Make the following changes to kernel/sched.c functions:
>>
>>     - use node_to_cpumask_ptr in place of node_to_cpumask
>>     - use get_cpumask_var for temporary cpumask_t variables
>>     - use alloc_cpumask_ptr where available
>>
>>   * Remove special code for SCHED_CPUMASK_ALLOC and use CPUMASK_ALLOC
>>     from linux/cpumask.h.
>>
>>   * The resultant stack savings are:
>>
>>     ====== Stack (-l 100)
>>
>> 	1 - initial
>> 	2 - stack-hogs-kernel_sched_c
>> 	'.' is less than the limit(100)
>>
>>        .1.    .2.    ..final..
>>       2216  -1536 680   -69%  __build_sched_domains
>>       1592  -1592   .  -100%  move_task_off_dead_cpu
>>       1096  -1096   .  -100%  sched_balance_self
>>       1032  -1032   .  -100%  sched_setaffinity
>>        616   -616   .  -100%  rebalance_domains
>>        552   -552   .  -100%  free_sched_groups
>>        512   -512   .  -100%  cpu_to_allnodes_group
>>       7616  -6936 680   -91%  Totals
>>
>>
>> Applies to linux-2.6.tip/master.
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>>  kernel/sched.c |  151 ++++++++++++++++++++++++++++++---------------------------
>>  1 file changed, 81 insertions(+), 70 deletions(-)
>>
>> --- linux-2.6.tip.orig/kernel/sched.c
>> +++ linux-2.6.tip/kernel/sched.c
>> @@ -70,6 +70,7 @@
>>  #include <linux/bootmem.h>
>>  #include <linux/debugfs.h>
>>  #include <linux/ctype.h>
>> +#include <linux/cpumask_ptr.h>
>>  #include <linux/ftrace.h>
>>  #include <trace/sched.h>
>>  
>> @@ -117,6 +118,12 @@
>>   */
>>  #define RUNTIME_INF	((u64)~0ULL)
>>  
>> +/*
>> + * temp cpumask variables
>> + */
>> +static DEFINE_PER_CPUMASK(temp_cpumask_1);
>> +static DEFINE_PER_CPUMASK(temp_cpumask_2);
> 
> Yuck, that relies on turning preemption off everywhere you want to use
> those.
> 
> 
>> @@ -5384,11 +5400,14 @@ out_unlock:
>>  
>>  long sched_setaffinity(pid_t pid, const cpumask_t *in_mask)
>>  {
>> -	cpumask_t cpus_allowed;
>> -	cpumask_t new_mask = *in_mask;
>> +	cpumask_ptr cpus_allowed;
>> +	cpumask_ptr new_mask;
>>  	struct task_struct *p;
>>  	int retval;
>>  
>> +	get_cpumask_var(cpus_allowed, temp_cpumask_1);
>> +	get_cpumask_var(new_mask, temp_cpumask_2);
>> +	*new_mask = *in_mask;
>>  	get_online_cpus();
>>  	read_lock(&tasklist_lock);
> 
> BUG!
> 
> get_online_cpus() can sleep, but you just disabled preemption with those
> get_cpumask_var() horribles!
> 
> Couldn't be arsed to look through the rest, but I really hate this
> cpumask_ptr() stuff that relies on disabling preemption.
> 
> NAK

Yeah, I really agree as well.  But I wanted to start playing with using
cpumask_t pointers in some fairly straightforward manner.  Linus's and
Ingo's suggestion to just bite the bullet and redefine cpumask_t would
force a lot of changes to be made, but perhaps that's really the way
to go.

As to obtaining temp cpumask_t's (both early and late), perhaps a pool of
them would be better?  I believe it could be done similarly to alloc_bootmem
(but much simpler), and I don't think there's enough nesting to require a
very large pool.  (4 was the largest depth I could find in io_apic.c.)  Of
course, with preemption enabled, other problems arise...
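
For illustration only, a user-space sketch of what such a small fixed
pool might look like.  This is not code from the patch series; every
name here (sketch_cpumask, sketch_get_temp_mask, sketch_put_temp_mask,
the pool depth of 8, the 4096-CPU size) is hypothetical, and it is
deliberately not safe against concurrent callers, which is exactly the
"other problems" a real kernel version would have to solve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes: 4096 CPUs, pool depth 8 (the mail saw nesting of 4). */
#define SKETCH_NR_CPUS   4096
#define SKETCH_POOL_SIZE 8
#define SKETCH_WORD_BITS (8 * sizeof(unsigned long))
#define SKETCH_WORDS     ((SKETCH_NR_CPUS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Stand-in for cpumask_t: one bit per possible CPU. */
struct sketch_cpumask {
    unsigned long bits[SKETCH_WORDS];
};

/* The pool lives in static storage, so callers put nothing big on the stack. */
static struct sketch_cpumask sketch_pool[SKETCH_POOL_SIZE];
static bool sketch_in_use[SKETCH_POOL_SIZE];

/* Hand out a zeroed mask, or NULL when the pool is exhausted. */
static struct sketch_cpumask *sketch_get_temp_mask(void)
{
    for (size_t i = 0; i < SKETCH_POOL_SIZE; i++) {
        if (!sketch_in_use[i]) {
            sketch_in_use[i] = true;
            memset(&sketch_pool[i], 0, sizeof(sketch_pool[i]));
            return &sketch_pool[i];
        }
    }
    return NULL;
}

/* Return a mask to the pool; asserts that it was actually handed out. */
static void sketch_put_temp_mask(struct sketch_cpumask *mask)
{
    size_t i = (size_t)(mask - sketch_pool);

    assert(i < SKETCH_POOL_SIZE && sketch_in_use[i]);
    sketch_in_use[i] = false;
}

/* Number of masks currently available, useful for sizing the pool. */
static size_t sketch_pool_free_count(void)
{
    size_t n = 0;

    for (size_t i = 0; i < SKETCH_POOL_SIZE; i++)
        if (!sketch_in_use[i])
            n++;
    return n;
}
```

A caller would pair sketch_get_temp_mask() with sketch_put_temp_mask()
around the region that needs the mask, and would have to handle a NULL
return when the nesting depth exceeds the pool size.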

One other really big use was for the "allbutself" cpumask in the send_IPI
functions.  I think here, preemption is ok because the ownership of the
cpumask temp is very short lived.

But thanks for pointing out the get_online_cpus problem.  I did try to
chase down as many call trees as I could, but I obviously missed one
important one.
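
As a rough user-space model of that problem (not kernel code, and not
the debug operations from patch 05): if taking a per-cpu temp mask
disables preemption, then any call that may sleep while the mask is
held is a bug.  The counter and the sketch_* names below are
hypothetical stand-ins, loosely modeled on the kernel's preempt count
and might_sleep() check.

```c
#include <assert.h>

/* Model of the preempt count: > 0 means preemption is disabled. */
static int sketch_preempt_count;
/* Number of times a sleeping call was made with preemption disabled. */
static int sketch_sleep_violations;

/* Assumed behaviour: taking a per-cpu temp mask disables preemption. */
static void sketch_get_cpumask_var(void)
{
    sketch_preempt_count++;
}

/* Releasing the temp mask re-enables preemption. */
static void sketch_put_cpumask_var(void)
{
    assert(sketch_preempt_count > 0);
    sketch_preempt_count--;
}

/* Stand-in for a call that may sleep, such as get_online_cpus(). */
static void sketch_might_sleep(void)
{
    if (sketch_preempt_count > 0)
        sketch_sleep_violations++;
}

/* Ordering from the quoted hunk: take the mask, then call something that sleeps. */
static int sketch_buggy_order(void)
{
    sketch_sleep_violations = 0;
    sketch_get_cpumask_var();
    sketch_might_sleep();
    sketch_put_cpumask_var();
    return sketch_sleep_violations;
}

/* An ordering the check accepts: sleep first, take the mask afterwards. */
static int sketch_checked_order(void)
{
    sketch_sleep_violations = 0;
    sketch_might_sleep();
    sketch_get_cpumask_var();
    sketch_put_cpumask_var();
    return sketch_sleep_violations;
}
```

The second function only shows an ordering that the check accepts; it
is not a proposed fix for sched_setaffinity.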

And thanks for looking it over!
Mike

