From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Baoquan He <bhe@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
Mike Rapoport <rppt@linux.ibm.com>, Qian Cai <cai@lca.pw>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [PATCH v3 1/2] x86/setup: don't remove E820_TYPE_RAM for pfn 0
Date: Wed, 13 Jan 2021 17:35:09 +0200 [thread overview]
Message-ID: <20210113153509.GH1106298@kernel.org> (raw)
In-Reply-To: <6ba6bde3-1520-5cd0-f987-32d543f0b79f@redhat.com>
On Wed, Jan 13, 2021 at 01:56:45PM +0100, David Hildenbrand wrote:
> On 11.01.21 20:40, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> >
> > The first 4Kb of memory is a BIOS owned area and to avoid its allocation
> > for the kernel it was not listed in e820 tables as memory. As the result,
> > pfn 0 was never recognised by the generic memory management and it is not a
> > part of neither node 0 nor ZONE_DMA.
> >
> > If set_pfnblock_flags_mask() would be ever called for the pageblock
> > corresponding to the first 2Mbytes of memory, having pfn 0 outside of
> > ZONE_DMA would trigger
> >
> > VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
> >
> > Along with reserving the first 4Kb in e820 tables, several first pages are
> > reserved with memblock in several places during setup_arch(). These
> > reservations are enough to ensure the kernel does not touch the BIOS area
> > and it is not necessary to remove E820_TYPE_RAM for pfn 0.
> >
> > Remove the update of e820 table that changes the type of pfn 0 and move the
> > comment describing why it was done to trim_low_memory_range() that reserves
> > the beginning of the memory.
> >
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> > arch/x86/kernel/setup.c | 20 +++++++++-----------
> > 1 file changed, 9 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 740f3bdb3f61..3412c4595efd 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -660,17 +660,6 @@ static void __init trim_platform_memory_ranges(void)
> >
> > static void __init trim_bios_range(void)
> > {
> > - /*
> > - * A special case is the first 4Kb of memory;
> > - * This is a BIOS owned area, not kernel ram, but generally
> > - * not listed as such in the E820 table.
> > - *
> > - * This typically reserves additional memory (64KiB by default)
> > - * since some BIOSes are known to corrupt low memory. See the
> > - * Kconfig help text for X86_RESERVE_LOW.
> > - */
> > - e820__range_update(0, PAGE_SIZE, E820_TYPE_RAM, E820_TYPE_RESERVED);
> > -
> > /*
> > * special case: Some BIOSes report the PC BIOS
> > * area (640Kb -> 1Mb) as RAM even though it is not.
> > @@ -728,6 +717,15 @@ early_param("reservelow", parse_reservelow);
> >
> > static void __init trim_low_memory_range(void)
> > {
> > + /*
> > + * A special case is the first 4Kb of memory;
> > + * This is a BIOS owned area, not kernel ram, but generally
> > + * not listed as such in the E820 table.
> > + *
> > + * This typically reserves additional memory (64KiB by default)
> > + * since some BIOSes are known to corrupt low memory. See the
> > + * Kconfig help text for X86_RESERVE_LOW.
> > + */
> > memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
> > }
> >
> >
>
> The only somewhat-confusing thing is that in-between
> e820__memblock_setup() and trim_low_memory_range(), we already have
> memblock allocations. So [0..4095] might look like ordinary memory until
> we reserve it later on.
>
> E.g., reserve_real_mode() does a
>
> mem = memblock_find_in_range(0, 1<<20, size, PAGE_SIZE);
> ...
> memblock_reserve(mem, size);
> set_real_mode_mem(mem);
>
> which looks kind of suspicious to me. Most probably I am missing
> something, just wanted to point that out. We might want to do such
> trimming/adjustments before any kind of allocations.
You are right and it looks suspicious, but the first page is reserved at
the very beginning of x86::setup_arch() and, moreover, memblock never
allocates it (look at memblock::memblock_find_in_range_node()).
As for the range 0x1000 <-> reserve_low, we are unlikely to allocate it in
the default top-down mode. The bottom-up mode was only allocating memory
above the kernel so this would also prevent allocation of the lowest
memory, at least until the recent changes for CMA allocation:
https://lore.kernel.org/lkml/20201217201214.3414100-1-guro@fb.com
That said, we'd better consolidate all the trim_some_memory() and move it
closer to the beginning of setup_arch().
I'm going to take a look at it in the next few days.
> --
> Thanks,
>
> David / dhildenb
>
--
Sincerely yours,
Mike.
next prev parent reply other threads:[~2021-01-13 15:35 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-11 19:40 [PATCH v3 0/2] mm: fix initialization of struct page for holes in memory layout Mike Rapoport
2021-01-11 19:40 ` [PATCH v3 1/2] x86/setup: don't remove E820_TYPE_RAM for pfn 0 Mike Rapoport
2021-01-13 8:56 ` Oscar Salvador
2021-01-13 11:23 ` Mike Rapoport
2021-01-13 12:56 ` David Hildenbrand
2021-01-13 15:35 ` Mike Rapoport [this message]
2021-01-21 13:25 ` Borislav Petkov
2021-01-11 19:40 ` [PATCH v3 2/2] mm: fix initialization of struct page for holes in memory layout Mike Rapoport
2021-02-01 9:14 ` David Hildenbrand
2021-02-01 9:39 ` Baoquan He
2021-02-01 14:12 ` Mike Rapoport
2021-01-12 0:58 ` [PATCH v3 0/2] " Andrew Morton
2021-01-12 5:25 ` Mike Rapoport
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210113153509.GH1106298@kernel.org \
--to=rppt@kernel.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=bhe@redhat.com \
--cc=bp@alien8.de \
--cc=cai@lca.pw \
--cc=david@redhat.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=mingo@redhat.com \
--cc=rppt@linux.ibm.com \
--cc=tglx@linutronix.de \
--cc=vbabka@suse.cz \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.