Re: [PATCH 1/2] smp: Non busy-waiting IPI queue

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <jens.axboe@oracle.com>,
	Kevin Hilman <khilman@linaro.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH 1/2] smp: Non busy-waiting IPI queue
Date: Wed, 2 Apr 2014 11:05:13 -0700
Message-ID: <20140402180513.GQ4284@linux.vnet.ibm.com>
In-Reply-To: <1396455966-21128-2-git-send-email-fweisbec@gmail.com>

On Wed, Apr 02, 2014 at 06:26:05PM +0200, Frederic Weisbecker wrote:
> Some IPI users, such as the nohz subsystem, need to be able to send
> an async IPI (async = non waiting for any other IPI completion) on
> contexts with disabled interrupts. And we want the IPI subsystem to handle
> concurrent calls by itself.
> 
> Currently the nohz machinery uses the scheduler IPI for this purpose
> because it can be triggered from any context and doesn't need any
> serialization from the caller. But this is an abuse of a scheduler
> fast path. We are bloating it with a job that should use its own IPI.
> 
> The current set of IPI functions can't be called when interrupts are
> disabled otherwise we risk a deadlock when two CPUs wait for each
> other's IPI completion.
> 
> OTOH smp_call_function_single_async() can be called when interrupts
> are disabled. But then it's up to the caller to serialize the given
> IPI. This can't be called concurrently without special care.
> 
> So we need a version of the async IPI that takes care of concurrent
> calls.
> 
> The proposed solution is to synchronize the IPI with a specific flag
> that prevents the IPI from being sent if it is already pending but not
> yet executed. Ordering is maintained such that, if the IPI is not sent
> because it's already pending, we guarantee it will see the new state of
> the data we expect it to when it will execute.
> 
> This model is close to the irq_work design. It's also partly inspired by
> suggestions from Peter Zijlstra.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: Kevin Hilman <khilman@linaro.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

Nice, but one question below.

							Thanx, Paul

> ---
>  include/linux/smp.h | 12 ++++++++++++
>  kernel/smp.c        | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 56 insertions(+)
> 
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 633f5ed..155dc86 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -29,6 +29,18 @@ extern unsigned int total_cpus;
>  int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
>  			     int wait);
> 
> +struct queue_single_data;
> +typedef void (*smp_queue_func_t)(struct queue_single_data *qsd);
> +
> +struct queue_single_data {
> +	struct call_single_data data;
> +	smp_queue_func_t func;
> +	int pending;
> +};
> +
> +int smp_queue_function_single(int cpuid, smp_queue_func_t func,
> +			      struct queue_single_data *qsd);
> +
>  /*
>   * Call a function on all processors
>   */
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 06d574e..bfe7b36 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -265,6 +265,50 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)
>  }
>  EXPORT_SYMBOL_GPL(smp_call_function_single_async);
> 
> +void generic_smp_queue_function_single_interrupt(void *info)
> +{
> +	struct queue_single_data *qsd = info;
> +
> +	WARN_ON_ONCE(xchg(&qsd->pending, 0) != 1);

I am probably missing something here, but shouldn't this function copy
*qsd to a local on-stack variable before doing the above xchg()?  What
prevents the following from happening?

o	CPU 0 does smp_queue_function_single(), which sets ->pending
	and fills in ->func and ->data.

o	CPU 1 takes IPI, invoking generic_smp_queue_function_single_interrupt().

o	CPU 1 does xchg(), so that ->pending is now zero.

o	An attempt to reuse the queue_single_data sees ->pending equal
	to zero, so ->func and ->data are overwritten.

o	CPU 1 calls the new ->func with the new ->data (or hits either of
	the other two possible unexpected outcomes), which might not be
	helpful to the kernel's actuarial statistics.

So what am I missing?
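
To make the suggestion concrete, here is a minimal userspace sketch of the
snapshot idea, using C11 atomics in place of the kernel's xchg() and
cmpxchg().  It is a model under stated assumptions, not the kernel code:
the names qsd_model, model_queue(), model_interrupt() and the ->calls
counter are made up for illustration, and model_queue() returns 1 when it
would send the IPI, whereas the patch returns 0 in both cases.  Only ->func
is snapshotted here; anything else reachable through the pointer would
still need its own protection.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the patch's struct queue_single_data (hypothetical). */
struct qsd_model;
typedef void (*queue_func_t)(struct qsd_model *qsd);

struct qsd_model {
	queue_func_t func;
	atomic_int pending;
	int calls;	/* illustration-only counter, not in the patch */
};

/* Handler side: copy ->func while ->pending still guards it, then clear. */
static void model_interrupt(struct qsd_model *qsd)
{
	queue_func_t func = qsd->func;	/* snapshot before the slot can be reused */
	int old = atomic_exchange(&qsd->pending, 0);

	assert(old == 1);	/* plays the role of the WARN_ON_ONCE() in the patch */
	func(qsd);		/* uses the snapshot, not a possibly rewritten ->func */
}

/* Queue side: only the caller that flips ->pending from 0 to 1 writes ->func. */
static int model_queue(struct qsd_model *qsd, queue_func_t func)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&qsd->pending, &expected, 1))
		return 0;	/* already pending, so the request is coalesced */

	qsd->func = func;
	return 1;		/* the real code would send the IPI at this point */
}

/* Trivial callback so the model has something to invoke. */
static void count_call(struct qsd_model *qsd)
{
	qsd->calls++;
}
```

With the snapshot taken before the exchange, a requeue that slips in right
after ->pending drops to zero can rewrite ->func without changing which
function the in-flight handler invokes.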

> +	qsd->func(qsd);
> +}
> +
> +/**
> + * smp_queue_function_single - Queue an asynchronous function to run on a
> + * 				specific CPU unless it's already pending.
> + * @func: The function to run. This must be fast and non-blocking.
> + * @qsd: The data contained in the interested object if any
> + *
> + * Like smp_call_function_single_async() but the call to @func is serialized
> + * and won't be queued if it is already pending. In the latter case, ordering
> + * is still guaranteed such that the pending call will see the new data we
> + * expect it to.
> + *
> + * This must not be called on offline CPUs.
> + *
> + * Returns 0 when @func is successfully queued or already pending, else a negative
> + * status code.
> + */
> +int smp_queue_function_single(int cpu, smp_queue_func_t func, struct queue_single_data *qsd)
> +{
> +	int err;
> +
> +	if (cmpxchg(&qsd->pending, 0, 1))
> +		return 0;
> +
> +	qsd->func = func;
> +
> +	preempt_disable();
> +	err = generic_exec_single(cpu, &qsd->data, generic_smp_queue_function_single_interrupt, qsd, 0);
> +	preempt_enable();
> +
> +	/* Reset in case of error. This must not be called on offline CPUs */
> +	if (err)
> +		qsd->pending = 0;
> +
> +	return err;
> +}
> +
>  /*
>   * smp_call_function_any - Run a function on any of the given cpus
>   * @mask: The mask of cpus it can run on.
> -- 
> 1.8.3.1
>
