From: Mel Gorman <mgorman@techsingularity.net>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org,
	Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>,
	Hari Bathini <hbathini@linux.vnet.ibm.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Balbir Singh <bsingharora@gmail.com>
Subject: Re: [PATCH] fadump: Register the memory reserved by fadump
Date: Thu, 4 Aug 2016 15:09:34 +0100
Message-ID: <20160804140934.GM2799@techsingularity.net>
In-Reply-To: <1470318165-2521-1-git-send-email-srikar@linux.vnet.ibm.com>

On Thu, Aug 04, 2016 at 07:12:45PM +0530, Srikar Dronamraju wrote:
> Fadump kernel reserves large chunks of memory even before the pages are
> initialized. This could mean memory that corresponds to several nodes might
> fall in memblock reserved regions.
> 
> Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize
> only certain size memory per node. The certain size takes into account
> the dentry and inode cache sizes. Currently the cache sizes are
> calculated based on the total system memory including the reserved
> memory. However such a kernel when booting the same kernel as fadump
> kernel will not be able to allocate the required amount of memory to
> suffice for the dentry and inode caches. This results in crashes like
> the below on large systems such as 32 TB systems.
> 
> Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes)
> vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes
> swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3
> Call Trace:
> [c00000000108fb10] [c0000000007fac88] dump_stack+0xb0/0xf0 (unreliable)
> [c00000000108fb50] [c000000000235264] warn_alloc_failed+0x114/0x160
> [c00000000108fbf0] [c000000000281484] __vmalloc_node_range+0x304/0x340
> [c00000000108fca0] [c00000000028152c] __vmalloc+0x6c/0x90
> [c00000000108fd40] [c000000000aecfb0] alloc_large_system_hash+0x1b8/0x2c0
> [c00000000108fe00] [c000000000af7240] inode_init+0x94/0xe4
> [c00000000108fe80] [c000000000af6fec] vfs_caches_init+0x8c/0x13c
> [c00000000108ff00] [c000000000ac4014] start_kernel+0x50c/0x578
> [c00000000108ff90] [c000000000008c6c] start_here_common+0x20/0xa8
> 
> Register the memory reserved by fadump, so that the cache sizes are
> calculated based on the free memory (i.e Total memory - reserved
> memory).
> 
> Suggested-by: Mel Gorman <mgorman@techsingularity.net>

I didn't suggest this specifically. While it happens to be safe on ppc64,
it potentially overwrites the value set by any future caller of
set_dma_reserve. The only other caller today is for the e820 map, but it
may be better to change the API to inc_dma_reserve?

It's also unfortunate that it's called dma_reserve because it has
nothing to do with DMA or ZONE_DMA. inc_kernel_reserve may be more
appropriate?
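
To illustrate the overwrite concern, here is a minimal user-space sketch,
not kernel code. It assumes the existing setter simply assigns to a single
global counter; inc_dma_reserve is the hypothetical accumulating variant
suggested above, and the page counts in the note below are made up.

```c
/* User-space stand-in for the kernel's reserved-page counter. */
static unsigned long dma_reserve;

/* Models the existing setter: each call replaces the previous value. */
static void set_dma_reserve(unsigned long new_dma_reserve)
{
	dma_reserve = new_dma_reserve;
}

/* Hypothetical variant: each caller adds its own reservation. */
static void inc_dma_reserve(unsigned long nr_pages)
{
	dma_reserve += nr_pages;
}
```

With two callers reporting 100 and 4000 pages, the setter leaves only the
last value (4000) in the counter, whereas the incrementing variant, starting
from zero, accounts for both (4100).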

-- 
Mel Gorman
SUSE Labs
