All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Jeff Mahoney <jeffm@suse.com>
Cc: Jiri Kosina <jkosina@suse.cz>, Tejun Heo <tj@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	linux-ia64@vger.kernel.org
Subject: Re: Commit 34d76c41 causes linker errors on ia64 with NR_CPUS=4096
Date: Mon, 26 Oct 2009 20:16:51 +0000	[thread overview]
Message-ID: <20091026201651.GD24682@elte.hu> (raw)
In-Reply-To: <4ADDCDED.6060706@suse.com>


* Jeff Mahoney <jeffm@suse.com> wrote:

> On 10/20/2009 10:18 AM, Jeff Mahoney wrote:
> > On 10/20/2009 02:27 AM, Jiri Kosina wrote:
> >> @@ -1627,11 +1623,10 @@ static int tg_shares_up(struct task_group *tg, void *data)
> >>  		return 0;
> >>  
> >>  	local_irq_save(flags);
> >> -	usd = &__get_cpu_var(update_shares_data);
> >>  
> >>  	for_each_cpu(i, sched_domain_span(sd)) {
> >>  		weight = tg->cfs_rq[i]->load.weight;
> >> -		usd->rq_weight[i] = weight;
> >> +		usd = *per_cpu_ptr(update_shares_data, i) = weight;
> >>  
> >>  		/*
> >>  		 * If there are currently no tasks on the cpu pretend there
> > 
> > I don't think this is what you want here.
> > 
> > In the original version, usd is the percpu var using the current cpu. In
> > your version, usd is the percpu var using i instead of the current cpu.
> > 
> > I'll post my version of the patch shortly. I don't think keeping most of
> > the original version is a bad thing. We can just allocate it dynamically
> > instead.
> 
> This version fixes a build issue (__alignof__(unsigned long)) and fixes the
> percpu lookup to be usd = percpu array pointer, usd[i] = actual variable.
> 
> -Jeff
> 
> From: Jiri Kosina <jkosina@suse.cz>
> Subject: sched: move rq_weight data array out of .percpu
> 
> Commit 34d76c41 introduced percpu array update_shares_data, size of which
> being proportional to NR_CPUS. Unfortunately this blows up ia64 for large
> NR_CPUS configuration, as ia64 allows only 64k for .percpu section.
> 
> Fix this by allocating this array dynamically and keep only pointer to it
> percpu.
> 
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> --- 
>  kernel/sched.c |   20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1564,11 +1564,7 @@ static unsigned long cpu_avg_load_per_ta
>  
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  
> -struct update_shares_data {
> -	unsigned long rq_weight[NR_CPUS];
> -};
> -
> -static DEFINE_PER_CPU(struct update_shares_data, update_shares_data);
> +unsigned long *update_shares_data;

That should be __read_mostly - this can be a hotly accessed data 
structure of the scheduler - if it happens to go next to a frequently 
bouncing variable that can be bad for performance.

> -				    struct update_shares_data *usd)
> +				    unsigned long *usd)

I dont think using usd[cpu] is clearer than usd->rq_weight[cpu]. At 
minimum it should be renamed to usd_rq_weight not usd.

>  	local_irq_save(flags);
> -	usd = &__get_cpu_var(update_shares_data);
> +	usd = per_cpu_ptr(update_shares_data, smp_processor_id());

Could we please have a look at the before/after assembly of this 
sequence on x86, to make sure the claims in this thread are true and we 
dont lose performance? (and included it in the changelog with a 
resubmission - with a new, changed '[PATCH] ...' subject line, not 
hidden inside a discussion thread.)

From a merge POV i'm quite nervous about such a change to the scheduler 
this late in the .32 cycle - to offset that risk i'd really like to see 
that this change has been pursued carefully to the edge of possibilities 
- currently it does not give that impression.

Thanks,

	Ingo

WARNING: multiple messages have this Message-ID (diff)
From: Ingo Molnar <mingo@elte.hu>
To: Jeff Mahoney <jeffm@suse.com>
Cc: Jiri Kosina <jkosina@suse.cz>, Tejun Heo <tj@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	linux-ia64@vger.kernel.org
Subject: Re: Commit 34d76c41 causes linker errors on ia64 with NR_CPUS=4096
Date: Mon, 26 Oct 2009 21:16:51 +0100	[thread overview]
Message-ID: <20091026201651.GD24682@elte.hu> (raw)
In-Reply-To: <4ADDCDED.6060706@suse.com>


* Jeff Mahoney <jeffm@suse.com> wrote:

> On 10/20/2009 10:18 AM, Jeff Mahoney wrote:
> > On 10/20/2009 02:27 AM, Jiri Kosina wrote:
> >> @@ -1627,11 +1623,10 @@ static int tg_shares_up(struct task_group *tg, void *data)
> >>  		return 0;
> >>  
> >>  	local_irq_save(flags);
> >> -	usd = &__get_cpu_var(update_shares_data);
> >>  
> >>  	for_each_cpu(i, sched_domain_span(sd)) {
> >>  		weight = tg->cfs_rq[i]->load.weight;
> >> -		usd->rq_weight[i] = weight;
> >> +		usd = *per_cpu_ptr(update_shares_data, i) = weight;
> >>  
> >>  		/*
> >>  		 * If there are currently no tasks on the cpu pretend there
> > 
> > I don't think this is what you want here.
> > 
> > In the original version, usd is the percpu var using the current cpu. In
> > your version, usd is the percpu var using i instead of the current cpu.
> > 
> > I'll post my version of the patch shortly. I don't think keeping most of
> > the original version is a bad thing. We can just allocate it dynamically
> > instead.
> 
> This version fixes a build issue (__alignof__(unsigned long)) and fixes the
> percpu lookup to be usd = percpu array pointer, usd[i] = actual variable.
> 
> -Jeff
> 
> From: Jiri Kosina <jkosina@suse.cz>
> Subject: sched: move rq_weight data array out of .percpu
> 
> Commit 34d76c41 introduced percpu array update_shares_data, size of which
> being proportional to NR_CPUS. Unfortunately this blows up ia64 for large
> NR_CPUS configuration, as ia64 allows only 64k for .percpu section.
> 
> Fix this by allocating this array dynamically and keep only pointer to it
> percpu.
> 
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> --- 
>  kernel/sched.c |   20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1564,11 +1564,7 @@ static unsigned long cpu_avg_load_per_ta
>  
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  
> -struct update_shares_data {
> -	unsigned long rq_weight[NR_CPUS];
> -};
> -
> -static DEFINE_PER_CPU(struct update_shares_data, update_shares_data);
> +unsigned long *update_shares_data;

That should be __read_mostly - this can be a hotly accessed data 
structure of the scheduler - if it happens to go next to a frequently 
bouncing variable that can be bad for performance.

> -				    struct update_shares_data *usd)
> +				    unsigned long *usd)

I dont think using usd[cpu] is clearer than usd->rq_weight[cpu]. At 
minimum it should be renamed to usd_rq_weight not usd.

>  	local_irq_save(flags);
> -	usd = &__get_cpu_var(update_shares_data);
> +	usd = per_cpu_ptr(update_shares_data, smp_processor_id());

Could we please have a look at the before/after assembly of this 
sequence on x86, to make sure the claims in this thread are true and we 
dont lose performance? (and included it in the changelog with a 
resubmission - with a new, changed '[PATCH] ...' subject line, not 
hidden inside a discussion thread.)

>From a merge POV i'm quite nervous about such a change to the scheduler 
this late in the .32 cycle - to offset that risk i'd really like to see 
that this change has been pursued carefully to the edge of possibilities 
- currently it does not give that impression.

Thanks,

	Ingo

  parent reply	other threads:[~2009-10-26 20:16 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-18 22:28 Commit 34d76c41 causes linker errors on ia64 with NR_CPUS=4096 Jeff Mahoney
2009-10-20  2:02 ` Jiri Kosina
2009-10-20  2:02   ` Jiri Kosina
2009-10-20  4:57   ` Jiri Kosina
2009-10-20  4:57     ` Jiri Kosina
2009-10-20  5:21     ` Tejun Heo
2009-10-20  5:21       ` Tejun Heo
2009-10-20  5:58       ` Jiri Kosina
2009-10-20  5:58         ` Jiri Kosina
2009-10-20  6:12         ` Tejun Heo
2009-10-20  6:12           ` Tejun Heo
2009-10-20  6:14           ` Tejun Heo
2009-10-20  6:14             ` Tejun Heo
2009-10-20  6:27             ` Jiri Kosina
2009-10-20  6:27               ` Jiri Kosina
2009-10-20 14:18               ` Jeff Mahoney
2009-10-20 14:18                 ` Jeff Mahoney
2009-10-20 14:49                 ` Jeff Mahoney
2009-10-20 14:49                   ` Jeff Mahoney
2009-10-21  6:11                   ` Ingo Molnar
2009-10-21  6:11                     ` Ingo Molnar
2009-10-21 15:19                     ` Tejun Heo
2009-10-21 15:19                       ` Tejun Heo
2009-10-21 22:11                       ` Luck, Tony
2009-10-21 22:11                         ` Luck, Tony
2009-10-22 14:49                         ` Jiri Kosina
2009-10-22 14:49                           ` Jiri Kosina
2009-10-22 14:53                           ` Jeff Mahoney
2009-10-22 14:53                             ` Jeff Mahoney
2009-10-22 22:24                           ` Luck, Tony
2009-10-22 22:24                             ` Luck, Tony
2009-10-23  7:51                             ` Ingo Molnar
2009-10-23  7:51                               ` Ingo Molnar
2009-10-23 12:30                               ` Jiri Kosina
2009-10-23 12:30                                 ` Jiri Kosina
2009-10-26 16:38                                 ` Jiri Kosina
2009-10-26 16:38                                   ` Jiri Kosina
2009-10-26 20:16                   ` Ingo Molnar [this message]
2009-10-26 20:16                     ` Ingo Molnar
2009-10-27 10:03                     ` Jiri Kosina
2009-10-27 10:03                       ` Jiri Kosina
2009-10-27 10:52       ` Jiri Kosina
2009-10-27 10:52         ` Jiri Kosina
2009-10-20  6:15     ` Ingo Molnar
2009-10-20  6:15       ` Ingo Molnar
2009-10-20  6:26       ` Jiri Kosina
2009-10-20  6:26         ` Jiri Kosina
2009-10-20  6:35         ` Ingo Molnar
2009-10-20  6:35           ` Ingo Molnar
2009-10-20  7:11           ` Eric Dumazet
2009-10-20  7:11             ` Eric Dumazet
2009-10-20  7:39             ` Tejun Heo
2009-10-20  7:39               ` Tejun Heo
2009-10-20  7:12           ` Tejun Heo
2009-10-20  7:12             ` Tejun Heo
2009-10-20  7:17             ` Jiri Kosina
2009-10-20  7:17               ` Jiri Kosina
2009-10-20  7:36               ` Tejun Heo
2009-10-20  7:36                 ` Tejun Heo
2009-10-20 13:08           ` Jeff Mahoney
2009-10-20 13:08             ` Jeff Mahoney
2009-10-20 13:43             ` Ingo Molnar
2009-10-20 13:43               ` Ingo Molnar
2009-10-20 13:57               ` Tejun Heo
2009-10-20 13:57                 ` Tejun Heo
2009-10-20 13:58                 ` Tejun Heo
2009-10-20 13:58                   ` Tejun Heo
2009-10-21  6:43                   ` Christoph Lameter
2009-10-21  6:43                     ` Christoph Lameter
2009-10-20  9:21       ` Peter Zijlstra
2009-10-20  9:21         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091026201651.GD24682@elte.hu \
    --to=mingo@elte.hu \
    --cc=fenghua.yu@intel.com \
    --cc=jeffm@suse.com \
    --cc=jkosina@suse.cz \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.