Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Rik van Riel <riel@redhat.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()
Date: Tue, 24 Sep 2013 14:09:43 -0700	[thread overview]
Message-ID: <20130924210943.GI9093@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130924160959.GO9326@twins.programming.kicks-ass.net>

On Tue, Sep 24, 2013 at 06:09:59PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 24, 2013 at 07:42:36AM -0700, Paul E. McKenney wrote:
> > > +#define cpuhp_writer_wake()						\
> > > +	wake_up_process(cpuhp_writer_task)
> > > +
> > > +#define cpuhp_writer_wait(cond)						\
> > > +do {									\
> > > +	for (;;) {							\
> > > +		set_current_state(TASK_UNINTERRUPTIBLE);		\
> > > +		if (cond)						\
> > > +			break;						\
> > > +		schedule();						\
> > > +	}								\
> > > +	__set_current_state(TASK_RUNNING);				\
> > > +} while (0)
> > 
> > Why not wait_event()?  Presumably the above is a bit lighter weight,
> > but is that even something that can be measured?
> 
> I didn't want to mix readers and writers on cpuhp_wq, and I suppose I
> could create a second waitqueue; that might also be a better solution
> for the NULL thing below.

That would have the advantage of being a bit less racy.

> > > +	atomic_inc(&cpuhp_waitcount);
> > > +
> > > +	/*
> > > +	 * We either call schedule() in the wait, or we'll fall through
> > > +	 * and reschedule on the preempt_enable() in get_online_cpus().
> > > +	 */
> > > +	preempt_enable_no_resched();
> > > +	wait_event(cpuhp_wq, !__cpuhp_writer);
> > 
> > Finally!  A good use for preempt_enable_no_resched().  ;-)
> 
> Hehe, there were a few others, but tglx removed most with the
> schedule_preempt_disabled() primitive.

;-)

> In fact, I considered a wait_event_preempt_disabled() but was too lazy.
> That whole wait_event macro fest looks like it could use an iteration or
> two of collapse anyhow.

There are some serious layers there, aren't there?

> > > +	preempt_disable();
> > > +
> > > +	/*
> > > +	 * It would be possible for cpu_hotplug_done() to complete before
> > > +	 * the atomic_inc() above; in which case there is no writer waiting
> > > +	 * and doing a wakeup would be BAD (tm).
> > > +	 *
> > > +	 * If however we still observe cpuhp_writer_task here we know
> > > +	 * cpu_hotplug_done() is currently stuck waiting for cpuhp_waitcount.
> > > +	 */
> > > +	if (atomic_dec_and_test(&cpuhp_waitcount) && cpuhp_writer_task)
> > 
> > OK, I'll bite...  What sequence of events results in the
> > atomic_dec_and_test() returning true but there being no
> > cpuhp_writer_task?
> > 
> > Ah, I see it...
> 
> <snip>
> 
> Indeed, and
> 
> > But what prevents the following sequence of events?
> 
> <snip>
> 
> > o	Task B's call to cpuhp_writer_wake() sees a NULL pointer.
> 
> Quite so.. nothing. See there was a reason I kept being confused about
> it.
> 
> > >  void cpu_hotplug_begin(void)
> > >  {
> > > +	unsigned int count = 0;
> > > +	int cpu;
> > > +
> > > +	lockdep_assert_held(&cpu_add_remove_lock);
> > > 
> > > +	__cpuhp_writer = 1;
> > > +	cpuhp_writer_task = current;
> > 
> > At this point, the value of cpuhp_slowcount can go negative.  Can't see
> > that this causes a problem, given the atomic_add() below.
> 
> Agreed.
> 
> > > +
> > > +	/* After this everybody will observe writer and take the slow path. */
> > > +	synchronize_sched();
> > > +
> > > +	/* Collapse the per-cpu refcount into slowcount */
> > > +	for_each_possible_cpu(cpu) {
> > > +		count += per_cpu(__cpuhp_refcount, cpu);
> > > +		per_cpu(__cpuhp_refcount, cpu) = 0;
> > >  	}
> > 
> > The above is safe because the readers are no longer changing their
> > __cpuhp_refcount values.
> 
> Yes, I'll expand the comment.
> 
> So how about something like this?

A few memory barriers required, if I am reading the code correctly.
Some of them, perhaps all of them, called out by Oleg.

							Thanx, Paul

> ---
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -16,6 +16,7 @@
>  #include <linux/node.h>
>  #include <linux/compiler.h>
>  #include <linux/cpumask.h>
> +#include <linux/percpu.h>
> 
>  struct device;
> 
> @@ -173,10 +174,50 @@ extern struct bus_type cpu_subsys;
>  #ifdef CONFIG_HOTPLUG_CPU
>  /* Stop CPUs going up and down. */
> 
> +extern void cpu_hotplug_init_task(struct task_struct *p);
> +
>  extern void cpu_hotplug_begin(void);
>  extern void cpu_hotplug_done(void);
> -extern void get_online_cpus(void);
> -extern void put_online_cpus(void);
> +
> +extern struct task_struct *__cpuhp_writer;
> +DECLARE_PER_CPU(unsigned int, __cpuhp_refcount);
> +
> +extern void __get_online_cpus(void);
> +
> +static inline void get_online_cpus(void)
> +{
> +	might_sleep();
> +
> +	/* Support reader-in-reader recursion */
> +	if (current->cpuhp_ref++) {
> +		barrier();
> +		return;
> +	}
> +
> +	preempt_disable();
> +	if (likely(!__cpuhp_writer))
> +		__this_cpu_inc(__cpuhp_refcount);

As Oleg noted, need a barrier here for when a new reader runs concurrently
with a completing writer.

> +	else
> +		__get_online_cpus();
> +	preempt_enable();
> +}
> +
> +extern void __put_online_cpus(void);
> +
> +static inline void put_online_cpus(void)
> +{
> +	barrier();
> +	if (--current->cpuhp_ref)
> +		return;
> +
> +	preempt_disable();
> +	if (likely(!__cpuhp_writer))
> +		__this_cpu_dec(__cpuhp_refcount);

No barrier needed here because synchronize_sched() covers it.

> +	else
> +		__put_online_cpus();
> +	preempt_enable();
> +}
> +
>  extern void cpu_hotplug_disable(void);
>  extern void cpu_hotplug_enable(void);
>  #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
> @@ -200,6 +241,8 @@ static inline void cpu_hotplug_driver_un
> 
>  #else		/* CONFIG_HOTPLUG_CPU */
> 
> +static inline void cpu_hotplug_init_task(struct task_struct *p) {}
> +
>  static inline void cpu_hotplug_begin(void) {}
>  static inline void cpu_hotplug_done(void) {}
>  #define get_online_cpus()	do { } while (0)
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1454,6 +1454,9 @@ struct task_struct {
>  	unsigned int	sequential_io;
>  	unsigned int	sequential_io_avg;
>  #endif
> +#ifdef CONFIG_HOTPLUG_CPU
> +	int		cpuhp_ref;
> +#endif
>  };
> 
>  /* Future-safe accessor for struct task_struct's cpus_allowed. */
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -49,88 +49,100 @@ static int cpu_hotplug_disabled;
> 
>  #ifdef CONFIG_HOTPLUG_CPU
> 
> -static struct {
> -	struct task_struct *active_writer;
> -	struct mutex lock; /* Synchronizes accesses to refcount, */
> -	/*
> -	 * Also blocks the new readers during
> -	 * an ongoing cpu hotplug operation.
> -	 */
> -	int refcount;
> -} cpu_hotplug = {
> -	.active_writer = NULL,
> -	.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
> -	.refcount = 0,
> -};
> +struct task_struct *__cpuhp_writer;
> +EXPORT_SYMBOL_GPL(__cpuhp_writer);
> 
> -void get_online_cpus(void)
> -{
> -	might_sleep();
> -	if (cpu_hotplug.active_writer == current)
> -		return;
> -	mutex_lock(&cpu_hotplug.lock);
> -	cpu_hotplug.refcount++;
> -	mutex_unlock(&cpu_hotplug.lock);
> +DEFINE_PER_CPU(unsigned int, __cpuhp_refcount);
> +EXPORT_PER_CPU_SYMBOL_GPL(__cpuhp_refcount);
> +
> +static atomic_t cpuhp_waitcount;
> +static atomic_t cpuhp_slowcount;
> +static DECLARE_WAIT_QUEUE_HEAD(cpuhp_readers);
> +static DECLARE_WAIT_QUEUE_HEAD(cpuhp_writer);
> 
> +void cpu_hotplug_init_task(struct task_struct *p)
> +{
> +	p->cpuhp_ref = 0;
>  }
> -EXPORT_SYMBOL_GPL(get_online_cpus);
> 
> -void put_online_cpus(void)
> +void __get_online_cpus(void)
>  {
> -	if (cpu_hotplug.active_writer == current)
> +	/* Support reader-in-writer recursion */
> +	if (__cpuhp_writer == current)
>  		return;
> -	mutex_lock(&cpu_hotplug.lock);
> 
> -	if (WARN_ON(!cpu_hotplug.refcount))
> -		cpu_hotplug.refcount++; /* try to fix things up */
> +	atomic_inc(&cpuhp_waitcount);
> 
> -	if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> -		wake_up_process(cpu_hotplug.active_writer);
> -	mutex_unlock(&cpu_hotplug.lock);
> +	/*
> +	 * We either call schedule() in the wait, or we'll fall through
> +	 * and reschedule on the preempt_enable() in get_online_cpus().
> +	 */
> +	preempt_enable_no_resched();
> +	wait_event(cpuhp_readers, !__cpuhp_writer);
> +	preempt_disable();
> +
> +	if (atomic_dec_and_test(&cpuhp_waitcount))

This provides the needed memory barrier for concurrent write releases.

> +		wake_up_all(&cpuhp_writer);
> +}
> +EXPORT_SYMBOL_GPL(__get_online_cpus);
> +
> +void __put_online_cpus(void)
> +{
> +	if (__cpuhp_writer == current)
> +		return;
> 
> +	if (atomic_dec_and_test(&cpuhp_slowcount))

This provides the needed memory barrier for concurrent write acquisitions.

> +		wake_up_all(&cpuhp_writer);
>  }
> -EXPORT_SYMBOL_GPL(put_online_cpus);
> +EXPORT_SYMBOL_GPL(__put_online_cpus);
> 
>  /*
>   * This ensures that the hotplug operation can begin only when the
>   * refcount goes to zero.
>   *
> - * Note that during a cpu-hotplug operation, the new readers, if any,
> - * will be blocked by the cpu_hotplug.lock
> - *
>   * Since cpu_hotplug_begin() is always called after invoking
>   * cpu_maps_update_begin(), we can be sure that only one writer is active.
> - *
> - * Note that theoretically, there is a possibility of a livelock:
> - * - Refcount goes to zero, last reader wakes up the sleeping
> - *   writer.
> - * - Last reader unlocks the cpu_hotplug.lock.
> - * - A new reader arrives at this moment, bumps up the refcount.
> - * - The writer acquires the cpu_hotplug.lock finds the refcount
> - *   non zero and goes to sleep again.
> - *
> - * However, this is very difficult to achieve in practice since
> - * get_online_cpus() not an api which is called all that often.
> - *
>   */
>  void cpu_hotplug_begin(void)
>  {
> -	cpu_hotplug.active_writer = current;
> +	unsigned int count = 0;
> +	int cpu;
> +
> +	lockdep_assert_held(&cpu_add_remove_lock);
> 
> -	for (;;) {
> -		mutex_lock(&cpu_hotplug.lock);
> -		if (likely(!cpu_hotplug.refcount))
> -			break;
> -		__set_current_state(TASK_UNINTERRUPTIBLE);
> -		mutex_unlock(&cpu_hotplug.lock);
> -		schedule();
> +	__cpuhp_writer = current;
> +
> +	/* 
> +	 * After this everybody will observe writer and take the slow path.
> +	 */
> +	synchronize_sched();
> +
> +	/* 
> +	 * Collapse the per-cpu refcount into slowcount. This is safe because
> +	 * readers are now taking the slow path (per the above) which doesn't
> +	 * touch __cpuhp_refcount.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		count += per_cpu(__cpuhp_refcount, cpu);
> +		per_cpu(__cpuhp_refcount, cpu) = 0;
>  	}
> +	atomic_add(count, &cpuhp_slowcount);
> +
> +	/* Wait for all readers to go away */
> +	wait_event(cpuhp_writer, !atomic_read(&cpuhp_slowcount));

Oddly enough, there appear to be cases where you need a memory barrier
here.  Suppose that all the readers finish after the atomic_add() above,
but before the wait_event().  Then wait_event() just checks the condition
without any memory barriers.  So smp_mb() needed here.

/me runs off to check RCU's use of wait_event()...

Found one missing.  And some places in need of comments.  And a few
places that could use an ACCESS_ONCE().

Back to the review...

>  }
> 
>  void cpu_hotplug_done(void)
>  {
> -	cpu_hotplug.active_writer = NULL;
> -	mutex_unlock(&cpu_hotplug.lock);
> +	/* Signal the writer is done */

And I believe we need a memory barrier here to keep the write-side
critical section confined from the viewpoint of a reader that starts
just after the NULLing of cpuhp_writer.

Of course, being who I am, I cannot resist pointing out that you have
the same number of memory barriers as would use of SRCU, and that
synchronize_srcu() can be quite a bit faster than synchronize_sched()
in the case where there are no readers.  ;-)

> +	cpuhp_writer = NULL;
> +	wake_up_all(&cpuhp_readers);
> +
> +	/* 
> +	 * Wait for any pending readers to be running. This ensures readers
> +	 * after writer and avoids writers starving readers.
> +	 */
> +	wait_event(cpuhp_writer, !atomic_read(&cpuhp_waitcount));
>  }
> 
>  /*
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1736,6 +1736,8 @@ static void __sched_fork(unsigned long c
>  	INIT_LIST_HEAD(&p->numa_entry);
>  	p->numa_group = NULL;
>  #endif /* CONFIG_NUMA_BALANCING */
> +
> +	cpu_hotplug_init_task(p);
>  }
> 
>  #ifdef CONFIG_NUMA_BALANCING
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Rik van Riel <riel@redhat.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()
Date: Tue, 24 Sep 2013 14:09:43 -0700	[thread overview]
Message-ID: <20130924210943.GI9093@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130924160959.GO9326@twins.programming.kicks-ass.net>

On Tue, Sep 24, 2013 at 06:09:59PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 24, 2013 at 07:42:36AM -0700, Paul E. McKenney wrote:
> > > +#define cpuhp_writer_wake()						\
> > > +	wake_up_process(cpuhp_writer_task)
> > > +
> > > +#define cpuhp_writer_wait(cond)						\
> > > +do {									\
> > > +	for (;;) {							\
> > > +		set_current_state(TASK_UNINTERRUPTIBLE);		\
> > > +		if (cond)						\
> > > +			break;						\
> > > +		schedule();						\
> > > +	}								\
> > > +	__set_current_state(TASK_RUNNING);				\
> > > +} while (0)
> > 
> > Why not wait_event()?  Presumably the above is a bit lighter weight,
> > but is that even something that can be measured?
> 
> I didn't want to mix readers and writers on cpuhp_wq, and I suppose I
> could create a second waitqueue; that might also be a better solution
> for the NULL thing below.

That would have the advantage of being a bit less racy.

> > > +	atomic_inc(&cpuhp_waitcount);
> > > +
> > > +	/*
> > > +	 * We either call schedule() in the wait, or we'll fall through
> > > +	 * and reschedule on the preempt_enable() in get_online_cpus().
> > > +	 */
> > > +	preempt_enable_no_resched();
> > > +	wait_event(cpuhp_wq, !__cpuhp_writer);
> > 
> > Finally!  A good use for preempt_enable_no_resched().  ;-)
> 
> Hehe, there were a few others, but tglx removed most with the
> schedule_preempt_disabled() primitive.

;-)

> In fact, I considered a wait_event_preempt_disabled() but was too lazy.
> That whole wait_event macro fest looks like it could use an iteration or
> two of collapse anyhow.

There are some serious layers there, aren't there?

> > > +	preempt_disable();
> > > +
> > > +	/*
> > > +	 * It would be possible for cpu_hotplug_done() to complete before
> > > +	 * the atomic_inc() above; in which case there is no writer waiting
> > > +	 * and doing a wakeup would be BAD (tm).
> > > +	 *
> > > +	 * If however we still observe cpuhp_writer_task here we know
> > > +	 * cpu_hotplug_done() is currently stuck waiting for cpuhp_waitcount.
> > > +	 */
> > > +	if (atomic_dec_and_test(&cpuhp_waitcount) && cpuhp_writer_task)
> > 
> > OK, I'll bite...  What sequence of events results in the
> > atomic_dec_and_test() returning true but there being no
> > cpuhp_writer_task?
> > 
> > Ah, I see it...
> 
> <snip>
> 
> Indeed, and
> 
> > But what prevents the following sequence of events?
> 
> <snip>
> 
> > o	Task B's call to cpuhp_writer_wake() sees a NULL pointer.
> 
> Quite so.. nothing. See there was a reason I kept being confused about
> it.
> 
> > >  void cpu_hotplug_begin(void)
> > >  {
> > > +	unsigned int count = 0;
> > > +	int cpu;
> > > +
> > > +	lockdep_assert_held(&cpu_add_remove_lock);
> > > 
> > > +	__cpuhp_writer = 1;
> > > +	cpuhp_writer_task = current;
> > 
> > At this point, the value of cpuhp_slowcount can go negative.  Can't see
> > that this causes a problem, given the atomic_add() below.
> 
> Agreed.
> 
> > > +
> > > +	/* After this everybody will observe writer and take the slow path. */
> > > +	synchronize_sched();
> > > +
> > > +	/* Collapse the per-cpu refcount into slowcount */
> > > +	for_each_possible_cpu(cpu) {
> > > +		count += per_cpu(__cpuhp_refcount, cpu);
> > > +		per_cpu(__cpuhp_refcount, cpu) = 0;
> > >  	}
> > 
> > The above is safe because the readers are no longer changing their
> > __cpuhp_refcount values.
> 
> Yes, I'll expand the comment.
> 
> So how about something like this?

A few memory barriers required, if I am reading the code correctly.
Some of them, perhaps all of them, called out by Oleg.

							Thanx, Paul

> ---
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -16,6 +16,7 @@
>  #include <linux/node.h>
>  #include <linux/compiler.h>
>  #include <linux/cpumask.h>
> +#include <linux/percpu.h>
> 
>  struct device;
> 
> @@ -173,10 +174,50 @@ extern struct bus_type cpu_subsys;
>  #ifdef CONFIG_HOTPLUG_CPU
>  /* Stop CPUs going up and down. */
> 
> +extern void cpu_hotplug_init_task(struct task_struct *p);
> +
>  extern void cpu_hotplug_begin(void);
>  extern void cpu_hotplug_done(void);
> -extern void get_online_cpus(void);
> -extern void put_online_cpus(void);
> +
> +extern struct task_struct *__cpuhp_writer;
> +DECLARE_PER_CPU(unsigned int, __cpuhp_refcount);
> +
> +extern void __get_online_cpus(void);
> +
> +static inline void get_online_cpus(void)
> +{
> +	might_sleep();
> +
> +	/* Support reader-in-reader recursion */
> +	if (current->cpuhp_ref++) {
> +		barrier();
> +		return;
> +	}
> +
> +	preempt_disable();
> +	if (likely(!__cpuhp_writer))
> +		__this_cpu_inc(__cpuhp_refcount);

As Oleg noted, need a barrier here for when a new reader runs concurrently
with a completing writer.

> +	else
> +		__get_online_cpus();
> +	preempt_enable();
> +}
> +
> +extern void __put_online_cpus(void);
> +
> +static inline void put_online_cpus(void)
> +{
> +	barrier();
> +	if (--current->cpuhp_ref)
> +		return;
> +
> +	preempt_disable();
> +	if (likely(!__cpuhp_writer))
> +		__this_cpu_dec(__cpuhp_refcount);

No barrier needed here because synchronize_sched() covers it.

> +	else
> +		__put_online_cpus();
> +	preempt_enable();
> +}
> +
>  extern void cpu_hotplug_disable(void);
>  extern void cpu_hotplug_enable(void);
>  #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
> @@ -200,6 +241,8 @@ static inline void cpu_hotplug_driver_un
> 
>  #else		/* CONFIG_HOTPLUG_CPU */
> 
> +static inline void cpu_hotplug_init_task(struct task_struct *p) {}
> +
>  static inline void cpu_hotplug_begin(void) {}
>  static inline void cpu_hotplug_done(void) {}
>  #define get_online_cpus()	do { } while (0)
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1454,6 +1454,9 @@ struct task_struct {
>  	unsigned int	sequential_io;
>  	unsigned int	sequential_io_avg;
>  #endif
> +#ifdef CONFIG_HOTPLUG_CPU
> +	int		cpuhp_ref;
> +#endif
>  };
> 
>  /* Future-safe accessor for struct task_struct's cpus_allowed. */
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -49,88 +49,100 @@ static int cpu_hotplug_disabled;
> 
>  #ifdef CONFIG_HOTPLUG_CPU
> 
> -static struct {
> -	struct task_struct *active_writer;
> -	struct mutex lock; /* Synchronizes accesses to refcount, */
> -	/*
> -	 * Also blocks the new readers during
> -	 * an ongoing cpu hotplug operation.
> -	 */
> -	int refcount;
> -} cpu_hotplug = {
> -	.active_writer = NULL,
> -	.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
> -	.refcount = 0,
> -};
> +struct task_struct *__cpuhp_writer;
> +EXPORT_SYMBOL_GPL(__cpuhp_writer);
> 
> -void get_online_cpus(void)
> -{
> -	might_sleep();
> -	if (cpu_hotplug.active_writer == current)
> -		return;
> -	mutex_lock(&cpu_hotplug.lock);
> -	cpu_hotplug.refcount++;
> -	mutex_unlock(&cpu_hotplug.lock);
> +DEFINE_PER_CPU(unsigned int, __cpuhp_refcount);
> +EXPORT_PER_CPU_SYMBOL_GPL(__cpuhp_refcount);
> +
> +static atomic_t cpuhp_waitcount;
> +static atomic_t cpuhp_slowcount;
> +static DECLARE_WAIT_QUEUE_HEAD(cpuhp_readers);
> +static DECLARE_WAIT_QUEUE_HEAD(cpuhp_writer);
> 
> +void cpu_hotplug_init_task(struct task_struct *p)
> +{
> +	p->cpuhp_ref = 0;
>  }
> -EXPORT_SYMBOL_GPL(get_online_cpus);
> 
> -void put_online_cpus(void)
> +void __get_online_cpus(void)
>  {
> -	if (cpu_hotplug.active_writer == current)
> +	/* Support reader-in-writer recursion */
> +	if (__cpuhp_writer == current)
>  		return;
> -	mutex_lock(&cpu_hotplug.lock);
> 
> -	if (WARN_ON(!cpu_hotplug.refcount))
> -		cpu_hotplug.refcount++; /* try to fix things up */
> +	atomic_inc(&cpuhp_waitcount);
> 
> -	if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> -		wake_up_process(cpu_hotplug.active_writer);
> -	mutex_unlock(&cpu_hotplug.lock);
> +	/*
> +	 * We either call schedule() in the wait, or we'll fall through
> +	 * and reschedule on the preempt_enable() in get_online_cpus().
> +	 */
> +	preempt_enable_no_resched();
> +	wait_event(cpuhp_readers, !__cpuhp_writer);
> +	preempt_disable();
> +
> +	if (atomic_dec_and_test(&cpuhp_waitcount))

This provides the needed memory barrier for concurrent write releases.

> +		wake_up_all(&cpuhp_writer);
> +}
> +EXPORT_SYMBOL_GPL(__get_online_cpus);
> +
> +void __put_online_cpus(void)
> +{
> +	if (__cpuhp_writer == current)
> +		return;
> 
> +	if (atomic_dec_and_test(&cpuhp_slowcount))

This provides the needed memory barrier for concurrent write acquisitions.

> +		wake_up_all(&cpuhp_writer);
>  }
> -EXPORT_SYMBOL_GPL(put_online_cpus);
> +EXPORT_SYMBOL_GPL(__put_online_cpus);
> 
>  /*
>   * This ensures that the hotplug operation can begin only when the
>   * refcount goes to zero.
>   *
> - * Note that during a cpu-hotplug operation, the new readers, if any,
> - * will be blocked by the cpu_hotplug.lock
> - *
>   * Since cpu_hotplug_begin() is always called after invoking
>   * cpu_maps_update_begin(), we can be sure that only one writer is active.
> - *
> - * Note that theoretically, there is a possibility of a livelock:
> - * - Refcount goes to zero, last reader wakes up the sleeping
> - *   writer.
> - * - Last reader unlocks the cpu_hotplug.lock.
> - * - A new reader arrives at this moment, bumps up the refcount.
> - * - The writer acquires the cpu_hotplug.lock finds the refcount
> - *   non zero and goes to sleep again.
> - *
> - * However, this is very difficult to achieve in practice since
> - * get_online_cpus() not an api which is called all that often.
> - *
>   */
>  void cpu_hotplug_begin(void)
>  {
> -	cpu_hotplug.active_writer = current;
> +	unsigned int count = 0;
> +	int cpu;
> +
> +	lockdep_assert_held(&cpu_add_remove_lock);
> 
> -	for (;;) {
> -		mutex_lock(&cpu_hotplug.lock);
> -		if (likely(!cpu_hotplug.refcount))
> -			break;
> -		__set_current_state(TASK_UNINTERRUPTIBLE);
> -		mutex_unlock(&cpu_hotplug.lock);
> -		schedule();
> +	__cpuhp_writer = current;
> +
> +	/* 
> +	 * After this everybody will observe writer and take the slow path.
> +	 */
> +	synchronize_sched();
> +
> +	/* 
> +	 * Collapse the per-cpu refcount into slowcount. This is safe because
> +	 * readers are now taking the slow path (per the above) which doesn't
> +	 * touch __cpuhp_refcount.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		count += per_cpu(__cpuhp_refcount, cpu);
> +		per_cpu(__cpuhp_refcount, cpu) = 0;
>  	}
> +	atomic_add(count, &cpuhp_slowcount);
> +
> +	/* Wait for all readers to go away */
> +	wait_event(cpuhp_writer, !atomic_read(&cpuhp_slowcount));

Oddly enough, there appear to be cases where you need a memory barrier
here.  Suppose that all the readers finish after the atomic_add() above,
but before the wait_event().  Then wait_event() just checks the condition
without any memory barriers.  So smp_mb() needed here.

/me runs off to check RCU's use of wait_event()...

Found one missing.  And some places in need of comments.  And a few
places that could use an ACCESS_ONCE().

Back to the review...

>  }
> 
>  void cpu_hotplug_done(void)
>  {
> -	cpu_hotplug.active_writer = NULL;
> -	mutex_unlock(&cpu_hotplug.lock);
> +	/* Signal the writer is done */

And I believe we need a memory barrier here to keep the write-side
critical section confined from the viewpoint of a reader that starts
just after the NULLing of cpuhp_writer.

Of course, being who I am, I cannot resist pointing out that you have
the same number of memory barriers as would use of SRCU, and that
synchronize_srcu() can be quite a bit faster than synchronize_sched()
in the case where there are no readers.  ;-)

> +	cpuhp_writer = NULL;
> +	wake_up_all(&cpuhp_readers);
> +
> +	/* 
> +	 * Wait for any pending readers to be running. This ensures readers
> +	 * after writer and avoids writers starving readers.
> +	 */
> +	wait_event(cpuhp_writer, !atomic_read(&cpuhp_waitcount));
>  }
> 
>  /*
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1736,6 +1736,8 @@ static void __sched_fork(unsigned long c
>  	INIT_LIST_HEAD(&p->numa_entry);
>  	p->numa_group = NULL;
>  #endif /* CONFIG_NUMA_BALANCING */
> +
> +	cpu_hotplug_init_task(p);
>  }
> 
>  #ifdef CONFIG_NUMA_BALANCING
>

next prev parent reply	other threads:[~2013-09-24 21:09 UTC|newest]

Thread overview: 361+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-10  9:31 [PATCH 0/50] Basic scheduler support for automatic NUMA balancing V7 Mel Gorman
2013-09-10  9:31 ` Mel Gorman
2013-09-10  9:31 ` [PATCH 01/50] sched: monolithic code dump of what is being pushed upstream Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-11  0:58   ` Joonsoo Kim
2013-09-11  0:58     ` Joonsoo Kim
2013-09-11  3:11   ` Hillf Danton
2013-09-11  3:11     ` Hillf Danton
2013-09-13  8:11     ` Mel Gorman
2013-09-13  8:11       ` Mel Gorman
2013-09-10  9:31 ` [PATCH 02/50] mm: numa: Document automatic NUMA balancing sysctls Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 03/50] sched, numa: Comment fixlets Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 04/50] mm: numa: Do not account for a hinting fault if we raced Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 05/50] mm: Wait for THP migrations to complete during NUMA hinting faults Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 06/50] mm: Prevent parallel splits during THP migration Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 07/50] mm: Account for a THP NUMA hinting update as one PTE update Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-16 12:36   ` Peter Zijlstra
2013-09-16 12:36     ` Peter Zijlstra
2013-09-16 13:39     ` Rik van Riel
2013-09-16 13:39       ` Rik van Riel
2013-09-16 14:54       ` Peter Zijlstra
2013-09-16 14:54         ` Peter Zijlstra
2013-09-16 16:11         ` Mel Gorman
2013-09-16 16:11           ` Mel Gorman
2013-09-16 16:37           ` Peter Zijlstra
2013-09-16 16:37             ` Peter Zijlstra
2013-09-10  9:31 ` [PATCH 08/50] mm: numa: Sanitize task_numa_fault() callsites Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 09/50] mm: numa: Do not migrate or account for hinting faults on the zero page Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 10/50] sched: numa: Mitigate chance that same task always updates PTEs Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 11/50] sched: numa: Continue PTE scanning even if migrate rate limited Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 12/50] Revert "mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node" Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 13/50] sched: numa: Initialise numa_next_scan properly Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 14/50] sched: Set the scan rate proportional to the memory usage of the task being scanned Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-16 15:18   ` Peter Zijlstra
2013-09-16 15:18     ` Peter Zijlstra
2013-09-16 15:40     ` Mel Gorman
2013-09-16 15:40       ` Mel Gorman
2013-09-10  9:31 ` [PATCH 15/50] sched: numa: Correct adjustment of numa_scan_period Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 16/50] mm: Only flush TLBs if a transhuge PMD is modified for NUMA pte scanning Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 17/50] mm: Do not flush TLB during protection change if !pte_present && !migration_entry Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-16 16:35   ` Peter Zijlstra
2013-09-16 16:35     ` Peter Zijlstra
2013-09-17 17:00     ` Mel Gorman
2013-09-17 17:00       ` Mel Gorman
2013-09-10  9:31 ` [PATCH 18/50] sched: numa: Slow scan rate if no NUMA hinting faults are being recorded Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:31 ` [PATCH 19/50] sched: Track NUMA hinting faults on per-node basis Mel Gorman
2013-09-10  9:31   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 20/50] sched: Select a preferred node with the most numa hinting faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 21/50] sched: Update NUMA hinting faults once per scan Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 22/50] sched: Favour moving tasks towards the preferred node Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 23/50] sched: Resist moving tasks towards nodes with fewer hinting faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 24/50] sched: Reschedule task on preferred NUMA node once selected Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 25/50] sched: Add infrastructure for split shared/private accounting of NUMA hinting faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 26/50] sched: Check current->mm before allocating NUMA faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 27/50] mm: numa: Scan pages with elevated page_mapcount Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-12  2:10   ` Hillf Danton
2013-09-12  2:10     ` Hillf Danton
2013-09-13  8:11     ` Mel Gorman
2013-09-13  8:11       ` Mel Gorman
2013-09-10  9:32 ` [PATCH 28/50] sched: Remove check that skips small VMAs Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 29/50] sched: Set preferred NUMA node based on number of private faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 30/50] sched: Do not migrate memory immediately after switching node Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 31/50] sched: Avoid overloading CPUs on a preferred NUMA node Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 32/50] sched: Retry migration of tasks to CPU on a preferred node Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 33/50] sched: numa: increment numa_migrate_seq when task runs in correct location Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 34/50] sched: numa: Do not trap hinting faults for shared libraries Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-17  2:02   ` 答复: " 张天飞
2013-09-17  8:05     ` ????: " Mel Gorman
2013-09-17  8:05       ` Mel Gorman
2013-09-17  8:22       ` Figo.zhang
2013-09-10  9:32 ` [PATCH 35/50] mm: numa: Only trap pmd hinting faults if we would otherwise trap PTE faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 36/50] stop_machine: Introduce stop_two_cpus() Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 37/50] sched: Introduce migrate_swap() Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-17 14:30   ` [PATCH] hotplug: Optimize {get,put}_online_cpus() Peter Zijlstra
2013-09-17 14:30     ` Peter Zijlstra
2013-09-17 16:20     ` Mel Gorman
2013-09-17 16:20       ` Mel Gorman
2013-09-17 16:45       ` Peter Zijlstra
2013-09-17 16:45         ` Peter Zijlstra
2013-09-18 15:49         ` Peter Zijlstra
2013-09-18 15:49           ` Peter Zijlstra
2013-09-19 14:32           ` Peter Zijlstra
2013-09-19 14:32             ` Peter Zijlstra
2013-09-21 16:34             ` Oleg Nesterov
2013-09-21 16:34               ` Oleg Nesterov
2013-09-21 19:13               ` Oleg Nesterov
2013-09-21 19:13                 ` Oleg Nesterov
2013-09-23  9:29               ` Peter Zijlstra
2013-09-23  9:29                 ` Peter Zijlstra
2013-09-23 17:32                 ` Oleg Nesterov
2013-09-23 17:32                   ` Oleg Nesterov
2013-09-24 20:24                   ` Peter Zijlstra
2013-09-24 20:24                     ` Peter Zijlstra
2013-09-24 21:02                     ` Peter Zijlstra
2013-09-24 21:02                       ` Peter Zijlstra
2013-09-25 15:55                     ` Oleg Nesterov
2013-09-25 15:55                       ` Oleg Nesterov
2013-09-25 16:59                       ` Paul E. McKenney
2013-09-25 16:59                         ` Paul E. McKenney
2013-09-25 17:43                       ` Peter Zijlstra
2013-09-25 17:43                         ` Peter Zijlstra
2013-09-25 17:50                         ` Oleg Nesterov
2013-09-25 17:50                           ` Oleg Nesterov
2013-09-25 18:40                           ` Peter Zijlstra
2013-09-25 18:40                             ` Peter Zijlstra
2013-09-25 21:22                             ` Paul E. McKenney
2013-09-25 21:22                               ` Paul E. McKenney
2013-09-26 11:10                               ` Peter Zijlstra
2013-09-26 11:10                                 ` Peter Zijlstra
     [not found]                                 ` <20130926155321.GA4342@redhat.com>
2013-09-26 16:13                                   ` Peter Zijlstra
2013-09-26 16:13                                     ` Peter Zijlstra
2013-09-26 16:14                                     ` Oleg Nesterov
2013-09-26 16:14                                       ` Oleg Nesterov
2013-09-26 16:40                                       ` Peter Zijlstra
2013-09-26 16:40                                         ` Peter Zijlstra
2013-09-26 16:58                                 ` Oleg Nesterov
2013-09-26 16:58                                   ` Oleg Nesterov
2013-09-26 17:50                                   ` Peter Zijlstra
2013-09-26 17:50                                     ` Peter Zijlstra
2013-09-27 18:15                                     ` Oleg Nesterov
2013-09-27 18:15                                       ` Oleg Nesterov
2013-09-27 20:41                                       ` Peter Zijlstra
2013-09-27 20:41                                         ` Peter Zijlstra
2013-09-28 12:48                                         ` Oleg Nesterov
2013-09-28 12:48                                           ` Oleg Nesterov
2013-09-28 14:47                                           ` Peter Zijlstra
2013-09-28 14:47                                             ` Peter Zijlstra
2013-09-28 16:31                                             ` Oleg Nesterov
2013-09-28 16:31                                               ` Oleg Nesterov
2013-09-30 20:11                                               ` Rafael J. Wysocki
2013-09-30 20:11                                                 ` Rafael J. Wysocki
2013-10-01 17:11                                                 ` Srivatsa S. Bhat
2013-10-01 17:11                                                   ` Srivatsa S. Bhat
2013-10-01 17:36                                                   ` Peter Zijlstra
2013-10-01 17:36                                                     ` Peter Zijlstra
2013-10-01 17:45                                                     ` Oleg Nesterov
2013-10-01 17:45                                                       ` Oleg Nesterov
2013-10-01 17:56                                                       ` Peter Zijlstra
2013-10-01 17:56                                                         ` Peter Zijlstra
2013-10-01 18:07                                                         ` Oleg Nesterov
2013-10-01 18:07                                                           ` Oleg Nesterov
2013-10-01 19:05                                                           ` Paul E. McKenney
2013-10-01 19:05                                                             ` Paul E. McKenney
2013-10-02 12:16                                                             ` Oleg Nesterov
2013-10-02 12:16                                                               ` Oleg Nesterov
2013-10-02  9:08                                                           ` Peter Zijlstra
2013-10-02  9:08                                                             ` Peter Zijlstra
2013-10-02 12:13                                                             ` Oleg Nesterov
2013-10-02 12:13                                                               ` Oleg Nesterov
2013-10-02 12:25                                                               ` Peter Zijlstra
2013-10-02 12:25                                                                 ` Peter Zijlstra
2013-10-02 13:31                                                               ` Peter Zijlstra
2013-10-02 13:31                                                                 ` Peter Zijlstra
2013-10-02 14:00                                                                 ` Oleg Nesterov
2013-10-02 14:00                                                                   ` Oleg Nesterov
2013-10-02 15:17                                                                   ` Peter Zijlstra
2013-10-02 15:17                                                                     ` Peter Zijlstra
2013-10-02 16:31                                                                     ` Oleg Nesterov
2013-10-02 16:31                                                                       ` Oleg Nesterov
2013-10-02 17:52                                                                   ` Paul E. McKenney
2013-10-02 17:52                                                                     ` Paul E. McKenney
2013-10-01 19:03                                                         ` Srivatsa S. Bhat
2013-10-01 19:03                                                           ` Srivatsa S. Bhat
2013-10-01 18:14                                                     ` Srivatsa S. Bhat
2013-10-01 18:14                                                       ` Srivatsa S. Bhat
2013-10-01 18:56                                                       ` Srivatsa S. Bhat
2013-10-01 18:56                                                         ` Srivatsa S. Bhat
2013-10-02 10:14                                                       ` Srivatsa S. Bhat
2013-10-02 10:14                                                         ` Srivatsa S. Bhat
2013-09-28 20:46                                           ` Paul E. McKenney
2013-09-28 20:46                                             ` Paul E. McKenney
2013-10-01  3:56                                         ` Paul E. McKenney
2013-10-01  3:56                                           ` Paul E. McKenney
2013-10-01 14:14                                           ` Oleg Nesterov
2013-10-01 14:14                                             ` Oleg Nesterov
2013-10-01 14:45                                             ` Paul E. McKenney
2013-10-01 14:45                                               ` Paul E. McKenney
2013-10-01 14:48                                               ` Peter Zijlstra
2013-10-01 14:48                                                 ` Peter Zijlstra
2013-10-01 15:24                                                 ` Paul E. McKenney
2013-10-01 15:24                                                   ` Paul E. McKenney
2013-10-01 15:34                                                   ` Oleg Nesterov
2013-10-01 15:34                                                     ` Oleg Nesterov
2013-10-01 15:00                                               ` Oleg Nesterov
2013-10-01 15:00                                                 ` Oleg Nesterov
2013-09-29 13:56                                       ` Oleg Nesterov
2013-09-29 13:56                                         ` Oleg Nesterov
2013-10-01 15:38                                         ` Paul E. McKenney
2013-10-01 15:38                                           ` Paul E. McKenney
2013-10-01 15:40                                           ` Oleg Nesterov
2013-10-01 15:40                                             ` Oleg Nesterov
2013-10-01 20:40                                 ` Paul E. McKenney
2013-10-01 20:40                                   ` Paul E. McKenney
2013-09-23 14:50             ` Steven Rostedt
2013-09-23 14:50               ` Steven Rostedt
2013-09-23 14:54               ` Peter Zijlstra
2013-09-23 14:54                 ` Peter Zijlstra
2013-09-23 15:13                 ` Steven Rostedt
2013-09-23 15:13                   ` Steven Rostedt
2013-09-23 15:22                   ` Peter Zijlstra
2013-09-23 15:22                     ` Peter Zijlstra
2013-09-23 15:59                     ` Steven Rostedt
2013-09-23 15:59                       ` Steven Rostedt
2013-09-23 16:02                       ` Peter Zijlstra
2013-09-23 16:02                         ` Peter Zijlstra
2013-09-23 15:50                   ` Paul E. McKenney
2013-09-23 15:50                     ` Paul E. McKenney
2013-09-23 16:01                     ` Peter Zijlstra
2013-09-23 16:01                       ` Peter Zijlstra
2013-09-23 17:04                       ` Paul E. McKenney
2013-09-23 17:04                         ` Paul E. McKenney
2013-09-23 17:30                         ` Peter Zijlstra
2013-09-23 17:30                           ` Peter Zijlstra
2013-09-23 17:50             ` Oleg Nesterov
2013-09-23 17:50               ` Oleg Nesterov
2013-09-24 12:38               ` Peter Zijlstra
2013-09-24 12:38                 ` Peter Zijlstra
2013-09-24 14:42                 ` Paul E. McKenney
2013-09-24 14:42                   ` Paul E. McKenney
2013-09-24 16:09                   ` Peter Zijlstra
2013-09-24 16:09                     ` Peter Zijlstra
2013-09-24 16:31                     ` Oleg Nesterov
2013-09-24 16:31                       ` Oleg Nesterov
2013-09-24 21:09                     ` Paul E. McKenney [this message]
2013-09-24 21:09                       ` Paul E. McKenney
2013-09-24 16:03                 ` Oleg Nesterov
2013-09-24 16:03                   ` Oleg Nesterov
2013-09-24 16:43                   ` Steven Rostedt
2013-09-24 16:43                     ` Steven Rostedt
2013-09-24 17:06                     ` Oleg Nesterov
2013-09-24 17:06                       ` Oleg Nesterov
2013-09-24 17:47                       ` Paul E. McKenney
2013-09-24 17:47                         ` Paul E. McKenney
2013-09-24 18:00                         ` Oleg Nesterov
2013-09-24 18:00                           ` Oleg Nesterov
2013-09-24 20:35                           ` Peter Zijlstra
2013-09-24 20:35                             ` Peter Zijlstra
2013-09-25 15:16                             ` Oleg Nesterov
2013-09-25 15:16                               ` Oleg Nesterov
2013-09-25 15:35                               ` Peter Zijlstra
2013-09-25 15:35                                 ` Peter Zijlstra
2013-09-25 16:33                                 ` Oleg Nesterov
2013-09-25 16:33                                   ` Oleg Nesterov
2013-09-24 16:49                   ` Paul E. McKenney
2013-09-24 16:49                     ` Paul E. McKenney
2013-09-24 16:54                     ` Peter Zijlstra
2013-09-24 16:54                       ` Peter Zijlstra
2013-09-24 17:02                       ` Oleg Nesterov
2013-09-24 17:02                         ` Oleg Nesterov
2013-09-24 16:51                   ` Peter Zijlstra
2013-09-24 16:51                     ` Peter Zijlstra
2013-09-24 16:39                 ` Steven Rostedt
2013-09-24 16:39                   ` Steven Rostedt
2013-09-29 18:36     ` [RFC] introduce synchronize_sched_{enter,exit}() Oleg Nesterov
2013-09-29 18:36       ` Oleg Nesterov
2013-09-29 20:01       ` Paul E. McKenney
2013-09-29 20:01         ` Paul E. McKenney
2013-09-30 12:42         ` Oleg Nesterov
2013-09-30 12:42           ` Oleg Nesterov
2013-09-29 21:34       ` Steven Rostedt
2013-09-29 21:34         ` Steven Rostedt
2013-09-30 13:03         ` Oleg Nesterov
2013-09-30 13:03           ` Oleg Nesterov
2013-09-30 12:59       ` Peter Zijlstra
2013-09-30 12:59         ` Peter Zijlstra
2013-09-30 14:24         ` Peter Zijlstra
2013-09-30 14:24           ` Peter Zijlstra
2013-09-30 15:06           ` Peter Zijlstra
2013-09-30 15:06             ` Peter Zijlstra
2013-09-30 16:58             ` Oleg Nesterov
2013-09-30 16:58               ` Oleg Nesterov
2013-09-30 16:38         ` Oleg Nesterov
2013-09-30 16:38           ` Oleg Nesterov
2013-10-02 14:41       ` Peter Zijlstra
2013-10-02 14:41         ` Peter Zijlstra
2013-10-03  7:04         ` Ingo Molnar
2013-10-03  7:04           ` Ingo Molnar
2013-10-03  7:43           ` Peter Zijlstra
2013-10-03  7:43             ` Peter Zijlstra
2013-09-17 14:32   ` [PATCH 37/50] sched: Introduce migrate_swap() Peter Zijlstra
2013-09-17 14:32     ` Peter Zijlstra
2013-09-10  9:32 ` [PATCH 38/50] sched: numa: Use a system-wide search to find swap/migration candidates Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 39/50] sched: numa: Favor placing a task on the preferred node Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 40/50] mm: numa: Change page last {nid,pid} into {cpu,pid} Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 41/50] sched: numa: Use {cpu, pid} to create task groups for shared faults Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-12 12:42   ` Hillf Danton
2013-09-12 12:42     ` Hillf Danton
2013-09-12 14:40     ` Mel Gorman
2013-09-12 14:40       ` Mel Gorman
2013-09-12 12:45   ` Hillf Danton
2013-09-12 12:45     ` Hillf Danton
2013-09-10  9:32 ` [PATCH 42/50] sched: numa: Report a NUMA task group ID Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 43/50] mm: numa: Do not group on RO pages Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 44/50] sched: numa: stay on the same node if CLONE_VM Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 45/50] sched: numa: use group fault statistics in numa placement Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 46/50] sched: numa: Prevent parallel updates to group stats during placement Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-20  9:55   ` Peter Zijlstra
2013-09-20  9:55     ` Peter Zijlstra
2013-09-20 12:31     ` Mel Gorman
2013-09-20 12:31       ` Mel Gorman
2013-09-20 12:36       ` Peter Zijlstra
2013-09-20 12:36         ` Peter Zijlstra
2013-09-20 13:31       ` Mel Gorman
2013-09-20 13:31         ` Mel Gorman
2013-09-10  9:32 ` [PATCH 47/50] sched: numa: add debugging Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 48/50] sched: numa: Decide whether to favour task or group weights based on swap candidate relationships Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 49/50] sched: numa: fix task or group comparison Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-10  9:32 ` [PATCH 50/50] sched: numa: Avoid migrating tasks that are placed on their preferred node Mel Gorman
2013-09-10  9:32   ` Mel Gorman
2013-09-11  2:03 ` [PATCH 0/50] Basic scheduler support for automatic NUMA balancing V7 Rik van Riel
2013-09-14  2:57 ` Bob Liu
2013-09-14  2:57   ` Bob Liu
2013-09-30 10:30   ` Mel Gorman
2013-09-30 10:30     ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130924210943.GI9093@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.