From: nzimmer <nzimmer@sgi.com>
To: Andrew Morton <akpm@linux-foundation.org>, Mel Gorman <mgorman@suse.de>
Cc: Waiman Long <waiman.long@hp.com>,
Dave Hansen <dave.hansen@intel.com>,
Scott Norton <scott.norton@hp.com>,
Daniel J Blueman <daniel@numascale.com>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: meminit: Finish initialisation of struct pages before basic setup
Date: Wed, 13 May 2015 10:53:33 -0500
Message-ID: <5553737D.8080904@sgi.com>
In-Reply-To: <20150507150932.79e038167f70dd467c25d6ee@linux-foundation.org>
I just noticed a hang on my largest box.
I can only reproduce it with large core counts; if I turn down the number
of cpus it doesn't have an issue.
Also, as time goes on, the amount of time required to initialize pages
goes up.
log_uv48_05121052:[ 177.250385] node 0 initialised, 14950072 pages in 544ms
log_uv48_05121052:[ 177.269629] node 1 initialised, 15990505 pages in 564ms
log_uv48_05121052:[ 177.436047] node 215 initialised, 3600110 pages in 724ms
log_uv48_05121052:[ 177.464056] node 102 initialised, 3604205 pages in 756ms
log_uv48_05121052:[ 178.073822] node 30 initialised, 7732972 pages in 1368ms
log_uv48_05121052:[ 178.082888] node 31 initialised, 7728877 pages in 1372ms
log_uv48_05121052:[ 178.080060] node 29 initialised, 7728877 pages in 1376ms
....
log_uv48_05121052:[ 178.217980] node 197 initialised, 7728877 pages in 1504ms
log_uv48_05121052:[ 178.217851] node 196 initialised, 7732972 pages in 1504ms
log_uv48_05121052:[ 178.219992] node 247 initialised, 7726418 pages in 1504ms
log_uv48_05121052:[ 178.325299] node 3 initialised, 15986409 pages in 1624ms
log_uv48_05121052:[ 178.328455] node 2 initialised, 15990505 pages in 1624ms
log_uv48_05121052:[ 178.383371] node 4 initialised, 15990505 pages in 1680ms
...
log_uv48_05121052:[ 178.438401] node 19 initialised, 15986409 pages in 1728ms
I apologize for the tardiness of this report but I have not been able to
get to the largest boxes reliably.
Hopefully I will have more access this week.
On 05/07/2015 05:09 PM, Andrew Morton wrote:
> On Thu, 7 May 2015 08:25:18 +0100 Mel Gorman <mgorman@suse.de> wrote:
>
>> Waiman Long reported that 24TB machines hit OOM during basic setup when
>> struct page initialisation was deferred. One approach is to initialise memory
>> on demand but it interferes with page allocator paths. This patch creates
>> dedicated threads to initialise memory before basic setup. It then blocks
>> on a rw_semaphore until completion as a wait_queue and counter is overkill.
>> This may be slower to boot but it's simpler overall and also gets rid of a
>> section mangling which existed so kswapd could do the initialisation.
> Seems a reasonable compromise. It makes a bit of a mess of the patch
> sequencing.
>
> Have some tweaklets:
>
>
>
> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: mm-meminit-finish-initialisation-of-struct-pages-before-basic-setup-fix
>
> include rwsem.h, use DECLARE_RWSEM, fix comment, remove unneeded cast
>
> Cc: Daniel J Blueman <daniel@numascale.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Nathan Zimmer <nzimmer@sgi.com>
> Cc: Scott Norton <scott.norton@hp.com>
> Cc: Waiman Long <waiman.long@hp.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> mm/page_alloc.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff -puN mm/page_alloc.c~mm-meminit-finish-initialisation-of-struct-pages-before-basic-setup-fix mm/page_alloc.c
> --- a/mm/page_alloc.c~mm-meminit-finish-initialisation-of-struct-pages-before-basic-setup-fix
> +++ a/mm/page_alloc.c
> @@ -18,6 +18,7 @@
> #include <linux/mm.h>
> #include <linux/swap.h>
> #include <linux/interrupt.h>
> +#include <linux/rwsem.h>
> #include <linux/pagemap.h>
> #include <linux/jiffies.h>
> #include <linux/bootmem.h>
> @@ -1075,12 +1076,12 @@ static void __init deferred_free_range(s
> __free_pages_boot_core(page, pfn, 0);
> }
>
> -static struct rw_semaphore __initdata pgdat_init_rwsem;
> +static __initdata DECLARE_RWSEM(pgdat_init_rwsem);
>
> /* Initialise remaining memory on a node */
> static int __init deferred_init_memmap(void *data)
> {
> - pg_data_t *pgdat = (pg_data_t *)data;
> + pg_data_t *pgdat = data;
> int nid = pgdat->node_id;
> struct mminit_pfnnid_cache nid_init_state = { };
> unsigned long start = jiffies;
> @@ -1096,7 +1097,7 @@ static int __init deferred_init_memmap(v
> return 0;
> }
>
> - /* Bound memory initialisation to a local node if possible */
> + /* Bind memory initialisation thread to a local node if possible */
> if (!cpumask_empty(cpumask))
> set_cpus_allowed_ptr(current, cpumask);
>
> @@ -1200,7 +1201,6 @@ void __init page_alloc_init_late(void)
> {
> int nid;
>
> - init_rwsem(&pgdat_init_rwsem);
> for_each_node_state(nid, N_MEMORY) {
> down_read(&pgdat_init_rwsem);
> kthread_run(deferred_init_memmap, NODE_DATA(nid), "pgdatinit%d", nid);
> _
>