From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Ingo Molnar <mingo@kernel.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 11/13] sched: Check current->mm before allocating NUMA faults
Date: Thu, 4 Jul 2013 18:18:23 +0530
Message-ID: <20130704124823.GB29916@linux.vnet.ibm.com>
In-Reply-To: <1372861300-9973-12-git-send-email-mgorman@suse.de>
* Mel Gorman <mgorman@suse.de> [2013-07-03 15:21:38]:
> task_numa_placement checks current->mm but after buffers for faults
> have already been uselessly allocated. Move the check earlier.
>
> [peterz@infradead.org: Identified the problem]
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> ---
> kernel/sched/fair.c | 22 ++++++++++++++--------
> 1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 336074f..3c796b0 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -870,8 +870,6 @@ static void task_numa_placement(struct task_struct *p)
> int seq, nid, max_nid = 0;
> unsigned long max_faults = 0;
>
> - if (!p->mm) /* for example, ksmd faulting in a user's mm */
> - return;
> seq = ACCESS_ONCE(p->mm->numa_scan_seq);
> if (p->numa_scan_seq == seq)
> return;
> @@ -945,6 +943,12 @@ void task_numa_fault(int last_nid, int node, int pages, bool migrated)
> if (!sched_feat_numa(NUMA))
> return;
>
> + /* for example, ksmd faulting in a user's mm */
> + if (!p->mm) {
> + p->numa_scan_period = sysctl_numa_balancing_scan_period_max;

Naive question:
Why are we resetting the scan_period?

> + return;
> + }
> +
> /* Allocate buffer to track faults on a per-node basis */
> if (unlikely(!p->numa_faults)) {
> int size = sizeof(*p->numa_faults) * 2 * nr_node_ids;
> @@ -1072,16 +1076,18 @@ void task_numa_work(struct callback_head *work)
> end = ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE);
> end = min(end, vma->vm_end);
> nr_pte_updates += change_prot_numa(vma, start, end);
> - pages -= (end - start) >> PAGE_SHIFT;
> -
> - start = end;
>
> /*
> * Scan sysctl_numa_balancing_scan_size but ensure that
> - * least one PTE is updated so that unused virtual
> - * address space is quickly skipped
> + * at least one PTE is updated so that unused virtual
> + * address space is quickly skipped.
> */
> - if (pages <= 0 && nr_pte_updates)
> + if (nr_pte_updates)
> + pages -= (end - start) >> PAGE_SHIFT;
> +
> + start = end;
> +
> + if (pages <= 0)
> goto out;
> } while (end != vma->vm_end);
> }
> --
> 1.8.1.4
>
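
For reference, the effect of the task_numa_fault hunk can be modelled in
user space. This is only a sketch under stated assumptions, not the kernel
code: task_model, mm_model, task_numa_fault_model, NR_NODE_IDS and
SCAN_PERIOD_MAX_MS are made-up stand-ins, and the values chosen for the last
two are arbitrary. It shows only that a task without an mm returns before any
fault buffer is allocated and has its scan period set to the maximum.

```c
/* User-space model of the quoted hunk; all names here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_NODE_IDS        4       /* assumed node count for the model */
#define SCAN_PERIOD_MAX_MS 60000U  /* stand-in for sysctl_numa_balancing_scan_period_max; value assumed */

struct mm_model {
	int numa_scan_seq;
};

struct task_model {
	struct mm_model *mm;           /* NULL for kernel threads such as ksmd */
	unsigned long *numa_faults;    /* per-node counters, allocated lazily */
	unsigned int numa_scan_period;
};

/* Returns 1 if the fault was recorded, 0 if ignored, -1 if allocation failed. */
static int task_numa_fault_model(struct task_model *p, int node, int pages)
{
	assert(p != NULL && node >= 0 && node < NR_NODE_IDS);

	/* Check mm first so that no buffer is allocated for a task without one. */
	if (!p->mm) {
		p->numa_scan_period = SCAN_PERIOD_MAX_MS;
		return 0;
	}

	/* Allocate the per-node buffer on first use, sized as in the patch. */
	if (!p->numa_faults) {
		p->numa_faults = calloc(2 * NR_NODE_IDS, sizeof(*p->numa_faults));
		if (!p->numa_faults)
			return -1;
	}

	p->numa_faults[node] += (unsigned long)pages;
	return 1;
}
```

Calling task_numa_fault_model() on a task whose mm is NULL leaves numa_faults
unallocated, which is the behaviour the changelog asks for. The model does not
explain why the period is set to the maximum in that path.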
--
Thanks and Regards
Srikar Dronamraju