linux-mm.kvack.org archive mirror
From: Rik van Riel <riel@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Paul Turner <pjt@google.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Mike Galbraith <efault@gmx.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Bharata B Rao <bharata.rao@gmail.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Christoph Lameter <cl@linux.com>, Alex Shi <alex.shi@intel.com>,
	Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Don Morris <don.morris@hp.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: [PATCH 13/40] autonuma: CPU follow memory algorithm
Date: Fri, 29 Jun 2012 14:03:46 -0400
Message-ID: <4FEDEE02.7060405@redhat.com>
In-Reply-To: <1340888180-15355-14-git-send-email-aarcange@redhat.com>

On 06/28/2012 08:55 AM, Andrea Arcangeli wrote:
> This algorithm takes as input the statistical information filled by the
> knuma_scand (mm->mm_autonuma) and by the NUMA hinting page faults
> (p->sched_autonuma),

Somewhat confusing patch order, since the NUMA hinting page faults
appear to be later in the patch series.

At least the data structures got introduced earlier, albeit without
any documentation whatsoever (that needs fixing).

> evaluates it for the current scheduled task, and
> compares it against every other running process to see if it should
> move the current task to another NUMA node.

This is a little worrying. What if you are running on a system
with hundreds of NUMA nodes? How often does this code run?

> +static bool inline task_autonuma_cpu(struct task_struct *p, int cpu)
> +{
> +#ifdef CONFIG_AUTONUMA
> +	int autonuma_node;
> +	struct task_autonuma *task_autonuma = p->task_autonuma;
> +
> +	if (!task_autonuma)
> +		return true;
> +
> +	autonuma_node = ACCESS_ONCE(task_autonuma->autonuma_node);
> +	if (autonuma_node < 0 || autonuma_node == cpu_to_node(cpu))
> +		return true;
> +	else
> +		return false;
> +#else
> +	return true;
> +#endif
> +}

What is the return value of task_autonuma_cpu supposed
to represent?  It is not at all clear what this function
is trying to do...

> +#ifdef CONFIG_AUTONUMA
> +	/* this is used by the scheduler and the page allocator */
> +	struct mm_autonuma *mm_autonuma;
> +#endif
>   };

Great.  What is it used for, and how?
Why is that not documented?

> @@ -1514,6 +1514,9 @@ struct task_struct {
>   	struct mempolicy *mempolicy;	/* Protected by alloc_lock */
>   	short il_next;
>   	short pref_node_fork;
> +#ifdef CONFIG_AUTONUMA
> +	struct task_autonuma *task_autonuma;
> +#endif

This could use a comment, too.  I know task_struct has historically
been documented rather poorly, but it may be time to break that
tradition and add documentation.

> +/*
> + * autonuma_balance_cpu_stop() is a callback to be invoked by
> + * stop_one_cpu_nowait(). It is used by sched_autonuma_balance() to
> + * migrate the tasks to the selected_cpu, from softirq context.
> + */
> +static int autonuma_balance_cpu_stop(void *data)
> +{

Uhhh what?  It does not look like anything ends up stopped
as a result of this function running.

It looks like the function migrates a task to another NUMA
node, and always returns 0. Maybe void should be the return
type, and not the argument type?

It would be nice if the function name described what the
function does.

> +	struct rq *src_rq = data;

A void* as the function parameter, when you know what the
data pointer actually is?

Why are you doing this?

> +	int src_cpu = cpu_of(src_rq);
> +	int dst_cpu = src_rq->autonuma_balance_dst_cpu;
> +	struct task_struct *p = src_rq->autonuma_balance_task;

Why is the task to be migrated an item in the runqueue struct,
and not a function argument?

This seems backwards from the way things are usually done.
Not saying it is wrong, but doing things this way needs a good
explanation.

> +out_unlock:
> +	src_rq->autonuma_balance = false;
> +	raw_spin_unlock(&src_rq->lock);
> +	/* spinlocks acts as barrier() so p is stored local on the stack */

What race are you trying to protect against?

Surely the reason p continues to be valid is that you are
holding a refcount to the task?

> +	raw_spin_unlock_irq(&p->pi_lock);
> +	put_task_struct(p);
> +	return 0;
> +}

> +enum {
> +	W_TYPE_THREAD,
> +	W_TYPE_PROCESS,
> +};

What is W?  What is the difference between thread type
and process type Ws?

You wrote a lot of text describing sched_autonuma_balance(),
but none of it helps me understand what you are trying to do :(

> + * We run the above math on every CPU not part of the current NUMA
> + * node, and we compare the current process against the other
> + * processes running in the other CPUs in the remote NUMA nodes. The
> + * objective is to select the cpu (in selected_cpu) with a bigger
> + * "weight". The bigger the "weight" the biggest gain we'll get by
> + * moving the current process to the selected_cpu (not only the
> + * biggest immediate CPU gain but also the fewer async memory
> + * migrations that will be required to reach full convergence
> + * later). If we select a cpu we migrate the current process to it.

The one thing you have not described at all is what factors
go into the weight calculation, and why you are using those.

We can all read C and figure out what the code does, but
we need to know why.

What factors does the code use to weigh the NUMA nodes and processes?

Why are statistics kept both on a per process and a per thread basis?

What is the difference between those two?

What makes a particular NUMA node a good node for a thread to run on?

When is it worthwhile moving stuff around?

When is it not worthwhile?

> + * One non trivial bit of this logic that deserves an explanation is
> + * how the three crucial variables of the core math
> + * (w_nid/w_other/wcpu_nid) are going to change depending on whether
> + * the other CPU is running a thread of the current process, or a
> + * thread of a different process.

It would be nice to know what w_nid/w_other/w_cpu_nid mean.

You have a one-line description of them higher up in the comment,
but there is still no description at all of what factors go into
calculating the weights, or why...

> + * A simple example is required. Given the following:
> + * - 2 processes
> + * - 4 threads per process
> + * - 2 NUMA nodes
> + * - 4 CPUS per NUMA node
> + *
> + * Because the 8 threads belong to 2 different processes, by using the
> + * process statistics when comparing threads of different processes,
> + * we will converge reliably and quickly to a configuration where the
> + * 1st process is entirely contained in one node and the 2nd process
> + * in the other node.
> + *
> + * If all threads only use thread local memory (no sharing of memory
> + * between the threads), it wouldn't matter if we use per-thread or
> + * per-mm statistics for w_nid/w_other/w_cpu_nid. We could then use
> + * per-thread statistics all the time.
> + *
> + * But clearly with threads it's expected to get some sharing of
> + * memory. To avoid false sharing it's better to keep all threads of
> + * the same process in the same node (or if they don't fit in a single
> + * node, in as fewer nodes as possible). This is why we have to use
> + * processes statistics in w_nid/w_other/wcpu_nid when comparing
> + * threads of different processes. Why instead do we have to use
> + * thread statistics when comparing threads of the same process? This
> + * should be obvious if you're still reading

Nothing at all here is obvious, because you have not explained
what factors go into determining each weight.

You describe a lot of specific details, but are missing the
general overview that helps us make sense of things.

> +void sched_autonuma_balance(void)
> +{
> +	int cpu, nid, selected_cpu, selected_nid, selected_nid_mm;
> +	int cpu_nid = numa_node_id();
> +	int this_cpu = smp_processor_id();
> +	/*
> +	 * w_t: node thread weight
> +	 * w_t_t: total sum of all node thread weights
> +	 * w_m: node mm/process weight
> +	 * w_m_t: total sum of all node mm/process weights
> +	 */
> +	unsigned long w_t, w_t_t, w_m, w_m_t;
> +	unsigned long w_t_max, w_m_max;
> +	unsigned long weight_max, weight;
> +	long s_w_nid = -1, s_w_cpu_nid = -1, s_w_other = -1;
> +	int s_w_type = -1;
> +	struct cpumask *allowed;
> +	struct task_struct *p = current;
> +	struct task_autonuma *task_autonuma = p->task_autonuma;

Considering that p is always current, it may be better to just use
current throughout the function, that way people can see at a glance
that "p" cannot go away while the code is running, because current
is running the code on itself.

> +	/*
> +	 * The below two arrays holds the NUMA affinity information of
> +	 * the current process if scheduled in the "nid". This is task
> +	 * local and mm local information. We compute this information
> +	 * for all nodes.
> +	 *
> +	 * task/mm_numa_weight[nid] will become w_nid.
> +	 * task/mm_numa_weight[cpu_nid] will become w_cpu_nid.
> +	 */
> +	rq = cpu_rq(this_cpu);
> +	task_numa_weight = rq->task_numa_weight;
> +	mm_numa_weight = rq->mm_numa_weight;

It is a mystery to me why these items are allocated in the
runqueue structure. We have per-cpu allocations for things
like this, why are you adding them to the runqueue?

If there is a reason, you need to document it.

> +	w_t_max = w_m_max = 0;
> +	selected_nid = selected_nid_mm = -1;
> +	for_each_online_node(nid) {
> +		w_m = ACCESS_ONCE(p->mm->mm_autonuma->mm_numa_fault[nid]);
> +		w_t = task_autonuma->task_numa_fault[nid];
> +		if (w_m > w_m_t)
> +			w_m_t = w_m;
> +		mm_numa_weight[nid] = w_m*AUTONUMA_BALANCE_SCALE/w_m_t;
> +		if (w_t > w_t_t)
> +			w_t_t = w_t;
> +		task_numa_weight[nid] = w_t*AUTONUMA_BALANCE_SCALE/w_t_t;
> +		if (mm_numa_weight[nid] > w_m_max) {
> +			w_m_max = mm_numa_weight[nid];
> +			selected_nid_mm = nid;
> +		}
> +		if (task_numa_weight[nid] > w_t_max) {
> +			w_t_max = task_numa_weight[nid];
> +			selected_nid = nid;
> +		}
> +	}

What do the task and mm numa weights mean?

What factors go into calculating them?

Is it better to have a higher or a lower number? :)

We could use less documentation of what the code
does, and more explanation of what the code is trying
to do, and why.
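
For what it is worth, here is how I read the normalization in the
loop above, written as a standalone sketch. The names
normalize_weights() and BALANCE_SCALE are made up for illustration,
and the scale value is a guess, since the definition of
AUTONUMA_BALANCE_SCALE is not visible in this hunk. It assumes the
totals are the per-task and per-mm fault totals, which is also an
assumption on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for AUTONUMA_BALANCE_SCALE (real value not shown here). */
#define BALANCE_SCALE 1000UL

/*
 * Turn per-node NUMA hinting fault counts into per-node weights in the
 * range 0..BALANCE_SCALE, where a higher weight means a larger share of
 * the faults landed on that node.
 *
 * Returns the node with the highest non-zero weight, or -1 if every
 * weight is zero.
 */
static int normalize_weights(const unsigned long *faults, unsigned long total,
			     long *weight, size_t nr_nodes)
{
	int best = -1;
	long best_weight = 0;
	size_t nid;

	assert(nr_nodes > 0);

	for (nid = 0; nid < nr_nodes; nid++) {
		unsigned long w = faults[nid];

		/* Same clamp as the patch: the total never lags a single node. */
		if (w > total)
			total = w;

		/* Guard the division; a task with no faults has no preference. */
		weight[nid] = total ? (long)(w * BALANCE_SCALE / total) : 0;

		if (weight[nid] > best_weight) {
			best_weight = weight[nid];
			best = (int)nid;
		}
	}
	return best;
}
```

With fault counts of 30, 10 and 60 and a total of 100, this gives
weights of 300, 100 and 600 and picks node 2. If that matches the
intent, a comment saying so in the patch would answer the questions
above.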


Under what circumstances do we continue into this loop?

What is it trying to do?

> +	for_each_online_node(nid) {
> +		/*
> +		 * Calculate the "weight" for all CPUs that the
> +		 * current process is allowed to be migrated to,
> +		 * except the CPUs of the current nid (it would be
> +		 * worthless from a NUMA affinity standpoint to
> +		 * migrate the task to another CPU of the current
> +		 * node).
> +		 */
> +		if (nid == cpu_nid)
> +			continue;
> +		for_each_cpu_and(cpu, cpumask_of_node(nid), allowed) {
> +			long w_nid, w_cpu_nid, w_other;
> +			int w_type;
> +			struct mm_struct *mm;
> +			rq = cpu_rq(cpu);
> +			if (!cpu_online(cpu))
> +				continue;
> +
> +			if (idle_cpu(cpu))
> +				/*
> +				 * Offload the while IDLE balancing
> +				 * and physical / logical imbalances
> +				 * to CFS.
> +				 */

			/* CFS idle balancing takes care of this */

> +				continue;
> +
> +			mm = rq->curr->mm;
> +			if (!mm)
> +				continue;
> +			/*
> +			 * Grab the w_m/w_t/w_m_t/w_t_t of the
> +			 * processes running in the other CPUs to
> +			 * compute w_other.
> +			 */
> +			raw_spin_lock_irq(&rq->lock);
> +			/* recheck after implicit barrier() */
> +			mm = rq->curr->mm;
> +			if (!mm) {
> +				raw_spin_unlock_irq(&rq->lock);
> +				continue;
> +			}
> +			w_m_t = ACCESS_ONCE(mm->mm_autonuma->mm_numa_fault_tot);
> +			w_t_t = rq->curr->task_autonuma->task_numa_fault_tot;
> +			if (!w_m_t || !w_t_t) {
> +				raw_spin_unlock_irq(&rq->lock);
> +				continue;
> +			}
> +			w_m = ACCESS_ONCE(mm->mm_autonuma->mm_numa_fault[nid]);
> +			w_t = rq->curr->task_autonuma->task_numa_fault[nid];
> +			raw_spin_unlock_irq(&rq->lock);

Is this why the info is stored in the runqueue struct?

How do we know the other runqueue's data is consistent?
We seem to be doing our own updates outside of the lock...

How do we know the other runqueue's data is up to date?
How often is this function run?

> +			/*
> +			 * Generate the w_nid/w_cpu_nid from the
> +			 * pre-computed mm/task_numa_weight[] and
> +			 * compute w_other using the w_m/w_t info
> +			 * collected from the other process.
> +			 */
> +			if (mm == p->mm) {

			if (mm == current->mm) {

> +				if (w_t > w_t_t)
> +					w_t_t = w_t;
> +				w_other = w_t*AUTONUMA_BALANCE_SCALE/w_t_t;
> +				w_nid = task_numa_weight[nid];
> +				w_cpu_nid = task_numa_weight[cpu_nid];
> +				w_type = W_TYPE_THREAD;
> +			} else {
> +				if (w_m > w_m_t)
> +					w_m_t = w_m;
> +				w_other = w_m*AUTONUMA_BALANCE_SCALE/w_m_t;
> +				w_nid = mm_numa_weight[nid];
> +				w_cpu_nid = mm_numa_weight[cpu_nid];
> +				w_type = W_TYPE_PROCESS;
> +			}

Wait, what?

Why is w_t used in one case, and w_m in the other?

Explaining the meaning of the two, and how each is used,
would help people understand this code.

> +			/*
> +			 * Finally check if there's a combined gain in
> +			 * NUMA affinity. If there is and it's the
> +			 * biggest weight seen so far, record its
> +			 * weight and select this NUMA remote "cpu" as
> +			 * candidate migration destination.
> +			 */
> +			if (w_nid > w_other && w_nid > w_cpu_nid) {
> +				weight = w_nid - w_other + w_nid - w_cpu_nid;

I read this as "check if moving the current task to the other CPU,
and moving the task currently running there away, would increase
overall NUMA affinity".

Is that correct?
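
If so, the check boils down to something like the sketch below. The
helper name migration_gain() is hypothetical, and the parameter
descriptions are my interpretation of w_nid/w_cpu_nid/w_other, not
something the patch states.

```c
#include <assert.h>

/*
 * w_nid:     weight of the current task on the candidate remote node
 * w_cpu_nid: weight of the current task on the node it runs on now
 * w_other:   weight, on the candidate node, of the task running there
 *
 * Returns the combined gain from moving the current task to the
 * candidate node, or 0 if the move would not improve affinity.
 */
static long migration_gain(long w_nid, long w_cpu_nid, long w_other)
{
	/* Weights are normalized shares, so they are never negative. */
	assert(w_nid >= 0 && w_cpu_nid >= 0 && w_other >= 0);

	/*
	 * Only migrate if we want the remote node more than the task
	 * already there does, and more than we want our current node.
	 */
	if (w_nid > w_other && w_nid > w_cpu_nid)
		return (w_nid - w_other) + (w_nid - w_cpu_nid);

	return 0;
}
```

For example, with w_nid = 800, w_cpu_nid = 200 and w_other = 300 the
gain is 500 + 600 = 1100, while any case where the other task has an
equal or stronger claim on the node yields 0. Spelling this out in the
comment would make the intent much easier to follow.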

> +	stop_one_cpu_nowait(this_cpu,
> +			    autonuma_balance_cpu_stop, rq,
> +			&rq->autonuma_balance_work);
> +#ifdef __ia64__
> +#error "NOTE: tlb_migrate_finish won't run here"
> +#endif
> +}

So that is why the function is called autonuma_balance_cpu_stop?
Even though its function is to migrate a task?

What will happen in the IA64 case?
Memory corruption?
What would the IA64 maintainers have to do to make things work?

> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 6d52cea..e5b7ae9 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -463,6 +463,24 @@ struct rq {
>   #ifdef CONFIG_SMP
>   	struct llist_head wake_list;
>   #endif
> +#ifdef CONFIG_AUTONUMA
> +	/*
> +	 * Per-cpu arrays to compute the per-thread and per-process
> +	 * statistics. Allocated statically to avoid overflowing the
> +	 * stack with large MAX_NUMNODES values.
> +	 *
> +	 * FIXME: allocate dynamically and with num_possible_nodes()
> +	 * array sizes only if autonuma is not impossible, to save
> +	 * some dozen KB of RAM when booting on not NUMA (or small
> +	 * NUMA) systems.
> +	 */

I have a second FIXME: document what these fields actually mean,
what they are used for, and why they are allocated as part of the
runqueue structure.

> +	long task_numa_weight[MAX_NUMNODES];
> +	long mm_numa_weight[MAX_NUMNODES];
> +	bool autonuma_balance;
> +	int autonuma_balance_dst_cpu;
> +	struct task_struct *autonuma_balance_task;
> +	struct cpu_stop_work autonuma_balance_work;
> +#endif


-- 
All rights reversed
