All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
To: David Hildenbrand <david@redhat.com>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mhocko@kernel.org,
	Pavel.Tatashin@microsoft.com, schwidefsky@de.ibm.com,
	heiko.carstens@de.ibm.com
Subject: Re: [PATCH v2 1/1] mm, memory_hotplug: Initialize struct pages for the full memory section
Date: Fri, 14 Dec 2018 20:23:15 +0100	[thread overview]
Message-ID: <20181214202315.1c685f1e@thinkpad> (raw)
In-Reply-To: <bcd0c49c-e417-ef8b-996f-99ecef540d9c@redhat.com>

On Fri, 14 Dec 2018 16:49:14 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 14.12.18 16:22, David Hildenbrand wrote:
> > On 12.12.18 18:27, Mikhail Zaslonko wrote:  
> >> If memory end is not aligned with the sparse memory section boundary, the
> >> mapping of such a section is only partly initialized. This may lead to
> >> VM_BUG_ON due to uninitialized struct page access from
> >> is_mem_section_removable() or test_pages_in_a_zone() function triggered by
> >> memory_hotplug sysfs handlers:
> >>
> >> Here are the the panic examples:
> >>  CONFIG_DEBUG_VM=y
> >>  CONFIG_DEBUG_VM_PGFLAGS=y
> >>
> >>  kernel parameter mem=2050M
> >>  --------------------------
> >>  page:000003d082008000 is uninitialized and poisoned
> >>  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> >>  Call Trace:
> >>  ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
> >>   [<00000000008f15c4>] show_valid_zones+0x5c/0x190
> >>   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> >>   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> >>   [<00000000003e4194>] seq_read+0x204/0x480
> >>   [<00000000003b53ea>] __vfs_read+0x32/0x178
> >>   [<00000000003b55b2>] vfs_read+0x82/0x138
> >>   [<00000000003b5be2>] ksys_read+0x5a/0xb0
> >>   [<0000000000b86ba0>] system_call+0xdc/0x2d8
> >>  Last Breaking-Event-Address:
> >>   [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
> >>  Kernel panic - not syncing: Fatal exception: panic_on_oops
> >>
> >>  kernel parameter mem=3075M
> >>  --------------------------
> >>  page:000003d08300c000 is uninitialized and poisoned
> >>  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> >>  Call Trace:
> >>  ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> >>   [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> >>   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> >>   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> >>   [<00000000003e4194>] seq_read+0x204/0x480
> >>   [<00000000003b53ea>] __vfs_read+0x32/0x178
> >>   [<00000000003b55b2>] vfs_read+0x82/0x138
> >>   [<00000000003b5be2>] ksys_read+0x5a/0xb0
> >>   [<0000000000b86ba0>] system_call+0xdc/0x2d8
> >>  Last Breaking-Event-Address:
> >>   [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> >>  Kernel panic - not syncing: Fatal exception: panic_on_oops
> >>
> >> Fix the problem by initializing the last memory section of each zone
> >> in memmap_init_zone() till the very end, even if it goes beyond the zone
> >> end.
> >>
> >> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
> >> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> >> Cc: <stable@vger.kernel.org>
> >> ---
> >>  mm/page_alloc.c | 12 ++++++++++++
> >>  1 file changed, 12 insertions(+)
> >>
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 2ec9cc407216..e2afdb2dc2c5 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -5542,6 +5542,18 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> >>  			cond_resched();
> >>  		}
> >>  	}
> >> +#ifdef CONFIG_SPARSEMEM
> >> +	/*
> >> +	 * If the zone does not span the rest of the section then
> >> +	 * we should at least initialize those pages. Otherwise we
> >> +	 * could blow up on a poisoned page in some paths which depend
> >> +	 * on full sections being initialized (e.g. memory hotplug).
> >> +	 */
> >> +	while (end_pfn % PAGES_PER_SECTION) {
> >> +		__init_single_page(pfn_to_page(end_pfn), end_pfn, zone, nid);
> >> +		end_pfn++;  
> > 
> > This page will not be marked as PG_reserved - although it is a physical
> > memory gap. Do we care?
> >   
> 
> Hm, or do we even have any idea what this is (e.g. could it also be
> something not a gap)?

In the "mem=" restriction scenario it would be a gap, and probably fall
into the PG_reserved categorization from your recent patch:
 * - Pages falling into physical memory gaps - not IORESOURCE_SYSRAM. Trying
 *   to read/write these pages might end badly. Don't touch!

Not sure if it could be something else. In theory, if it is possible to have
a scenario where memory zones are not section-aligned, then this
end_pfn % PAGES_PER_SECTION part could be part of another zone. But then it
should not matter if the pages get pre-initialized here, with or w/o
PG_reseved, because they should later be properly initialized in their zone.

So marking them as PG_reserved sounds right, especially in the light of your
current PG_reserved clean-up.

> 
> For physical memory gaps within a section, architectures usually exclude
> that memory from getting passed to e.g. the page allocator by
> memblock_reserve().
> 
> Before handing all free pages to the page allocator, all such reserved
> memblocks will be marked reserved.
> 
> But this here seems to be different. We don't have a previous
> memblock_reserve(), because otherwise these pages would have properly
> been initialized already when marking them reserved.

Not sure how memblock_reserve() and struct page initialization are
related, but at least on s390 there is a memblock_reserve() on the range
in question in setup_arch() -> reserve_memory_end(). However, in this
"mem=" scenario, the range is also removed later with memblock_remove()
in setup_memory_end(), because it is beyond memory_end.

  reply	other threads:[~2018-12-14 19:23 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 17:27 [PATCH v2 0/1] Initialize struct pages for the full section Mikhail Zaslonko
2018-12-12 17:27 ` [PATCH v2 1/1] mm, memory_hotplug: Initialize struct pages for the full memory section Mikhail Zaslonko
2018-12-13  3:46   ` Wei Yang
2018-12-13 12:37     ` Zaslonko Mikhail
2018-12-13 15:12       ` Wei Yang
2018-12-14  9:33         ` Zaslonko Mikhail
2018-12-14 10:19           ` Michal Hocko
2018-12-14 10:19             ` Michal Hocko
2018-12-15  0:26             ` Wei Yang
2018-12-13 12:59   ` Michal Hocko
2018-12-14 15:22   ` David Hildenbrand
2018-12-14 15:49     ` David Hildenbrand
2018-12-14 19:23       ` Gerald Schaefer [this message]
2018-12-17  9:38         ` David Hildenbrand
2018-12-17 12:28           ` Michal Hocko
2018-12-17 13:29             ` David Hildenbrand
2018-12-17 13:35               ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181214202315.1c685f1e@thinkpad \
    --to=gerald.schaefer@de.ibm.com \
    --cc=Pavel.Tatashin@microsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=schwidefsky@de.ibm.com \
    --cc=zaslonko@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.