All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, rusty@rustcorp.com.au,
	npiggin@suse.de, torvalds@linux-foundation.org, mingo@elte.hu,
	tglx@linutronix.de, peterz@infradead.org, arjan@infradead.org
Subject: Re: [PATCH] use per cpu data for single cpu ipi calls
Date: Wed, 28 Jan 2009 17:30:39 -0800	[thread overview]
Message-ID: <20090128173039.cbc29e81.akpm@linux-foundation.org> (raw)
In-Reply-To: <alpine.DEB.1.10.0901281916400.9203@gandalf.stny.rr.com>

On Wed, 28 Jan 2009 19:52:16 -0500 (EST)
Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> The smp_call_function can be passed a wait parameter telling it to
> wait for all the functions running on other CPUs to complete before
> returning, or to return without waiting. Unfortunately, this is
> currently just a suggestion and not manditory. That is, the

"mandatory"

> smp_call_function can decide not to return and wait instead.
> 
> The reason for this is because it uses kmalloc to allocate storage
> to send to the called CPU and that CPU will free it when it is done.
> But if we fail to allocate the storage, the stack is used instead.
> This means we must wait for the called CPU to finish before
> continuing.
> 
> Unfortunatly, some callers do no abide by this hint and act as if

"Unfortunately".

> the non-wait option is mandatory. The MTRR code for instance will
> deadlock if the smp_call_function is set to wait. This is because
> the smp_call_function will wait for the other CPUs to finish their
> called functions, but those functions are waiting on the caller to
> continue.
> 
> This patch changes the generic smp_call_function code to use per cpu
> variables instead of allocating for a single CPU call. The
> smp_call_function_many will fall back to the smp_call_function_single
> if it fails its alloc. The smp_call_function_single is modified
> to not force the wait state.
> 
> Since we now are using a single data per cpu we must synchronize the
> callers to prevent a second caller modifying the data before the
> first called IPI functions complete. To do so, I added a flag to
> the call_single_data called CSD_FLAG_LOCK. When the single CPU is
> called (which can be called when a many call fails an alloc), we
> set the LOCK bit on this per cpu data. When the caller finishes
> it clears the LOCK bit.
> 
> The caller must wait till the LOCK bit is cleared before setting
> it. When it is cleared, there is no IPI function using it.
> A spinlock is used to synchronize the setting of the bit between
> callers. Since only one callee can be called at a time, and it
> is the only thing to clear it, the IPI does not need to use
> any locking.
> 
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 5cfa0e5..aba3813 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -18,6 +18,7 @@ __cacheline_aligned_in_smp DEFINE_SPINLOCK(call_function_lock);
>  enum {
>  	CSD_FLAG_WAIT		= 0x01,
>  	CSD_FLAG_ALLOC		= 0x02,
> +	CSD_FLAG_LOCK		= 0x04,
>  };
>  
>  struct call_function_data {
> @@ -186,6 +187,9 @@ void generic_smp_call_function_single_interrupt(void)
>  			if (data_flags & CSD_FLAG_WAIT) {
>  				smp_wmb();
>  				data->flags &= ~CSD_FLAG_WAIT;
> +			} else if (data_flags & CSD_FLAG_LOCK) {
> +				smp_wmb();
> +				data->flags &= ~CSD_FLAG_LOCK;
>  			} else if (data_flags & CSD_FLAG_ALLOC)
>  				kfree(data);
>  		}
> @@ -196,6 +200,9 @@ void generic_smp_call_function_single_interrupt(void)
>  	}
>  }
>  
> +static DEFINE_PER_CPU(struct call_single_data, csd_data);
> +static DEFINE_SPINLOCK(csd_data_lock);
> +
>  /*
>   * smp_call_function_single - Run a function on a specific CPU
>   * @func: The function to run. This must be fast and non-blocking.
> @@ -224,14 +231,35 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
>  		func(info);
>  		local_irq_restore(flags);
>  	} else if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) {
> -		struct call_single_data *data = NULL;
> +		struct call_single_data *data;
>  
>  		if (!wait) {
> -			data = kmalloc(sizeof(*data), GFP_ATOMIC);
> -			if (data)
> -				data->flags = CSD_FLAG_ALLOC;
> -		}
> -		if (!data) {
> +			data = &per_cpu(csd_data, cpu);
> +			/*
> +			 * We are calling a function on a single CPU
> +			 * and we are not going to wait for it to finish.
> +			 * We use a per cpu data to pass the information
> +			 * to that CPU, but since all callers of this
> +			 * code will use the same data, we must
> +			 * synchronize the callers to prevent a new caller
> +			 * from corrupting the data before the callee
> +			 * can access it.
> +			 *
> +			 * The CSD_FLAG_LOCK is used to let us know when
> +			 * the IPI handler is done with the data.
> +			 * The first caller will set it, and the callee
> +			 * will clear it. The next caller must wait for
> +			 * it to clear before we set it again. This
> +			 * will make sure the callee is done with the
> +			 * data before a new caller will use it.
> +			 * We use spinlocks to manage the callers.
> +			 */
> +			spin_lock(&csd_data_lock);
> +			while (data->flags & CSD_FLAG_LOCK)
> +				cpu_relax();
> +			data->flags = CSD_FLAG_LOCK;
> +			spin_unlock(&csd_data_lock);
> +		} else {
>  			data = &d;
>  			data->flags = CSD_FLAG_WAIT;
>  		}

Well that looks nice.

Can we make the spinlock a per-cpu thing as well?  Or is that
over-optimising?  We'd need to initialise all those spinlocks at
runtime.

In generic_smp_call_function_single_interrupt(), did you consider
releasing the "lock" _before_ calling the callback function?  That
would reduces latencies a bit, allow more concurrency.  Maybe that's
over-optimising too.

Can generic_smp_call_function_single_interrupt() ever see
CSD_FLAG_ALLOC set now?  If not, that kfree can go away.

And where do we now stand with the architectures which _don't_ use the
kernel/smp.c code?  If someone writes code in generic kernel which
relies upon the new capabilities, it will go bad on those
architectures.  Makes davem sad.


  reply	other threads:[~2009-01-29  1:31 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-28 16:38 Buggy IPI and MTRR code on low memory Steven Rostedt
2009-01-28 16:41 ` Steven Rostedt
2009-01-28 16:46 ` Peter Zijlstra
2009-01-28 16:56   ` Steven Rostedt
2009-01-28 17:00     ` Peter Zijlstra
2009-01-28 17:24   ` Steven Rostedt
2009-01-28 18:20     ` Peter Zijlstra
2009-01-28 18:52       ` Steven Rostedt
2009-01-28 18:22     ` Arjan van de Ven
2009-01-28 18:34       ` Steven Rostedt
2009-01-28 21:12 ` Andrew Morton
2009-01-28 21:13   ` Andrew Morton
2009-01-28 21:23     ` Steven Rostedt
2009-01-28 22:07       ` Andrew Morton
2009-01-28 22:47         ` Steven Rostedt
2009-01-28 23:20           ` Andrew Morton
2009-01-28 23:50             ` Steven Rostedt
2009-01-28 23:25 ` Rusty Russell
2009-01-28 23:41   ` Steven Rostedt
2009-01-29  0:52   ` [PATCH] use per cpu data for single cpu ipi calls Steven Rostedt
2009-01-29  1:30     ` Andrew Morton [this message]
2009-01-29  1:56       ` Steven Rostedt
2009-01-29  8:49       ` Peter Zijlstra
2009-01-29 11:13         ` Ingo Molnar
2009-01-29 11:41           ` Peter Zijlstra
2009-01-29 13:42             ` Ingo Molnar
2009-01-29 14:07             ` Steven Rostedt
2009-01-29 15:08         ` [PATCH -v2] " Steven Rostedt
2009-01-29 15:33           ` Peter Zijlstra
2009-01-29 16:17             ` Ingo Molnar
2009-01-29 17:21           ` Linus Torvalds
2009-01-29 17:44             ` Steven Rostedt
2009-01-29 17:50               ` Steven Rostedt
2009-01-29 18:08               ` Linus Torvalds
2009-01-29 18:11                 ` Steven Rostedt
2009-01-29 18:23                 ` Peter Zijlstra
2009-01-29 18:31                   ` Steven Rostedt
2009-01-29 18:39                   ` Linus Torvalds
2009-01-29 18:44                     ` Peter Zijlstra
2009-01-30 11:23                       ` Jens Axboe
2009-01-30 12:32                         ` [PATCH -v3] " Peter Zijlstra
2009-01-30 12:38                           ` Jens Axboe
2009-01-30 12:48                             ` Peter Zijlstra
2009-01-30 12:55                               ` Jens Axboe
2009-01-30 12:56                                 ` Jens Axboe
2009-01-30 13:00                                   ` Peter Zijlstra
2009-01-30 13:02                           ` [PATCH -v4] " Peter Zijlstra
2009-01-30 14:51                             ` Ingo Molnar
2009-01-30 16:04                           ` [PATCH -v3] " Linus Torvalds
2009-01-30 16:16                             ` Peter Zijlstra
2009-01-31  8:44                               ` Jens Axboe
2009-01-29 18:49                 ` [PATCH -v2] " Ingo Molnar
2009-01-30  1:55                 ` Rusty Russell
2009-01-29 17:47             ` Peter Zijlstra
2009-01-29 17:55               ` Peter Zijlstra
2009-01-29 18:08                 ` Steven Rostedt
2009-01-30  1:11           ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090128173039.cbc29e81.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=arjan@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.