All of lore.kernel.org
 help / color / mirror / Atom feed
From: Aaron Lu <aaron.lu@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Huang Ying <ying.huang@intel.com>
Subject: Re: [PATCH resend] smp: avoid sending needless IPI in smp_call_function_many()
Date: Mon, 22 May 2017 16:04:57 +0800	[thread overview]
Message-ID: <20170522080457.GR2084@aaronlu.sh.intel.com> (raw)
In-Reply-To: <20170519105954.fp56qt6252vzup4m@hirez.programming.kicks-ass.net>

On Fri, May 19, 2017 at 12:59:54PM +0200, Peter Zijlstra wrote:
> On Fri, May 19, 2017 at 03:53:31PM +0800, Aaron Lu wrote:
> > Inter-Processor-Interrupt(IPI) is needed when a page is unmapped and the
> > process' mm_cpumask() shows the process has ever run on other CPUs. page
> > migration, page reclaim all need IPIs. The number of IPI needed to send
> > to different CPUs is especially large for multi-threaded workload since
> > mm_cpumask() is per process.
> > 
> > For smp_call_function_many(), whenever a CPU queues a CSD to a target
> > CPU, it will send an IPI to let the target CPU to handle the work.
> > This isn't necessary - we need only send IPI when queueing a CSD
> > to an empty call_single_queue.
> > 
> > The reason:
> > flush_smp_call_function_queue() that is called upon a CPU receiving an
> > IPI will empty the queue and then handle all of the CSDs there. So if
> > the target CPU's call_single_queue is not empty, we know that:
> > i.  An IPI for the target CPU has already been sent by 'previous queuers';
> > ii. flush_smp_call_function_queue() hasn't emptied that CPU's queue yet.
> > Thus, it's safe for us to just queue our CSD there without sending an
> > addtional IPI. And for the 'previous queuers', we can limit it to the
> > first queuer.
> > 
> > To demonstrate the effect of this patch, a multi-thread workload that
> > spawns 80 threads to equally consume 100G memory is used. This is tested
> > on a 2 node broadwell-EP which has 44cores/88threads and 32G memory. So
> > after 32G memory is used up, page reclaiming starts to happen a lot.
> > 
> > With this patch, IPI number dropped 88% and throughput increased about
> > 15% for the above workload.
> > 
> > Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> 
> Seems fine to me, I'll queue it.
> 
> 
> > @@ -434,6 +442,7 @@ void smp_call_function_many(const struct cpumask *mask,
> >  	if (unlikely(!cpumask_weight(cfd->cpumask)))
> >  		return;
> >  
> > +	cpumask_clear(cfd->cpumask_ipi);
> >  	for_each_cpu(cpu, cfd->cpumask) {
> >  		struct call_single_data *csd = per_cpu_ptr(cfd->csd, cpu);
> >  
> > @@ -442,11 +451,12 @@ void smp_call_function_many(const struct cpumask *mask,
> >  			csd->flags |= CSD_FLAG_SYNCHRONOUS;
> >  		csd->func = func;
> >  		csd->info = info;
> > -		llist_add(&csd->llist, &per_cpu(call_single_queue, cpu));
> > +		if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
> > +			cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> >  	}
> 
> But looking at this I wonder why cpumask_{set,clear}_cpu() are atomic
> ops while most other cpumask ops are not.
> 
> This seems to suggest we want __cpumask_{set,clear}_cpu() and use those
> here. Because those LOCK prefixes sure are pointless.

Sounds reasonable to me.

> 
> >  
> >  	/* Send a message to all CPUs in the map */
> > -	arch_send_call_function_ipi_mask(cfd->cpumask);
> > +	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
> >  
> >  	if (wait) {
> >  		for_each_cpu(cpu, cfd->cpumask) {
> 
> Something like so on top I suppose.
> 
> Anybody?

With the below patch applied on top, the workload still works fine.
Performance is about the same.

Thanks,
Aaron
 
> ---
>  include/linux/cpumask.h |   11 +++++++++++
>  kernel/smp.c            |    4 ++--
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -293,6 +293,12 @@ static inline void cpumask_set_cpu(unsig
>  	set_bit(cpumask_check(cpu), cpumask_bits(dstp));
>  }
>  
> +static inline void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
> +{
> +	__set_bit(cpumask_check(cpu), cpumask_bits(dstp));
> +}
> +
> +
>  /**
>   * cpumask_clear_cpu - clear a cpu in a cpumask
>   * @cpu: cpu number (< nr_cpu_ids)
> @@ -303,6 +309,11 @@ static inline void cpumask_clear_cpu(int
>  	clear_bit(cpumask_check(cpu), cpumask_bits(dstp));
>  }
>  
> +static inline void __cpumask_clear_cpu(int cpu, struct cpumask *dstp)
> +{
> +	__clear_bit(cpumask_check(cpu), cpumask_bits(dstp));
> +}
> +
>  /**
>   * cpumask_test_cpu - test for a cpu in a cpumask
>   * @cpu: cpu number (< nr_cpu_ids)
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -436,7 +436,7 @@ void smp_call_function_many(const struct
>  	cfd = this_cpu_ptr(&cfd_data);
>  
>  	cpumask_and(cfd->cpumask, mask, cpu_online_mask);
> -	cpumask_clear_cpu(this_cpu, cfd->cpumask);
> +	__cpumask_clear_cpu(this_cpu, cfd->cpumask);
>  
>  	/* Some callers race with other cpus changing the passed mask */
>  	if (unlikely(!cpumask_weight(cfd->cpumask)))
> @@ -452,7 +452,7 @@ void smp_call_function_many(const struct
>  		csd->func = func;
>  		csd->info = info;
>  		if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
> -			cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> +			__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
>  	}
>  
>  	/* Send a message to all CPUs in the map */

  reply	other threads:[~2017-05-22  8:04 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-28 13:19 [PATCH] smp: do not send IPI if call_single_queue not empty Aaron Lu
2017-05-19  7:53 ` [PATCH resend] smp: avoid sending needless IPI in smp_call_function_many() Aaron Lu
2017-05-19 10:59   ` Peter Zijlstra
2017-05-22  8:04     ` Aaron Lu [this message]
2017-05-19 11:09   ` Peter Zijlstra
2017-05-23  8:42   ` [tip:sched/core] smp: Avoid " tip-bot for Aaron Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170522080457.GR2084@aaronlu.sh.intel.com \
    --to=aaron.lu@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.