From: Mike Rapoport <rppt@linux.ibm.com>
To: Baoquan He <bhe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>,
kexec@lists.infradead.org, Stefan Agner <stefan@agner.ch>,
Tang Chen <tangchen@cn.fujitsu.com>,
linux-mm@kvack.org, Yaowei Bai <baiyaowei@cmss.chinamobile.com>,
Jonathan Corbet <corbet@lwn.net>,
Pavel Tatashin <pasha.tatashin@oracle.com>,
linux-acpi@vger.kernel.org, Dave Young <dyoung@redhat.com>,
Daniel Vacek <neelx@redhat.com>,
vgoyal@redhat.com, Len Brown <lenb@kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Pingfan Liu <kernelfans@gmail.com>,
yinghai@kernel.org, Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Mathieu Malaterre <malat@debian.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCHv3 1/2] mm/memblock: extend the limit inferior of bottom-up after parsing hotplug attr
Date: Sun, 6 Jan 2019 08:27:34 +0200
Message-ID: <20190106062733.GA3728@rapoport-lnx>
In-Reply-To: <20190105034450.GE30750@MiWiFi-R3L-srv>
On Sat, Jan 05, 2019 at 11:44:50AM +0800, Baoquan He wrote:
> On 01/04/19 at 05:09pm, Mike Rapoport wrote:
> > On Thu, Jan 03, 2019 at 10:47:06AM -0800, Tejun Heo wrote:
> > > Hello,
> > >
> > > On Wed, Jan 02, 2019 at 07:05:38PM +0200, Mike Rapoport wrote:
> > > > I agree that currently the bottom-up allocation after the kernel text has
> > > > issues with KASLR. But these issues are not necessarily related to
> > > > memory hotplug. Even with a single memory node, a bottom-up allocation will
> > > > fail if KASLR puts the kernel near the end of node0.
> > > >
> > > > What I am trying to understand is whether there is a fundamental reason to
> > > > prevent allocations from [0, kernel_start)?
> > > >
> > > > Maybe Tejun can recall why he suggested starting bottom-up allocations
> > > > from kernel_end.
> > >
> > > That's from 79442ed189ac ("mm/memblock.c: introduce bottom-up
> > > allocation mode"). I wasn't involved in that patch, so no idea why
> > > the restrictions were added, but FWIW it doesn't seem necessary to me.
> >
> > I should have added the reference [1] in the first place :)
> > Thanks!
> >
> > [1] https://lore.kernel.org/lkml/20130904192215.GG26609@mtj.dyndns.org/
>
> As I understand it, we may not be able to discard the bottom-up
> method in the current kernel. It is tied to the hotplug feature when the
> 'movable_node' kernel parameter is specified. With 'movable_node', the
> system relies on reading hotplug information from firmware; on x86 that is
> the ACPI SRAT table. Currently, we allocate memblock regions
> top-down by default. However, several memblock allocations happen before
> that hotplug information is retrieved, and a top-down memblock
> allocation there breaks hotplug because it places kernel data
> in the movable zone, which is usually on the last node on bare metal systems.

I am not suggesting that we discard the bottom-up method; I am merely
suggesting that we allow it to use [0, kernel_start).

> This bottom-up approach is used on many ARCHes, and it works well when
> KASLR is not enabled. Below is the grep result from the current Linux
> kernel; we can see that all ARCHes have this mechanism except
> arm/arm64. But currently only arm64/mips/x86 have KASLR.
>
> Without KASLR, allocating memblock regions above the kernel end before
> hotplug info is parsed looks very reasonable, since the kernel is usually
> placed at a low address, e.g. 16M on x86. My thought is that we need to do
> memblock allocation around the kernel before hotplug info is parsed. That is,
> for systems without KASLR, we keep the current bottom-up way; for systems
> with KASLR, we should allocate memblock regions top-down just below
> the kernel start.

I completely agree. I was thinking about making
memblock_find_in_range_node() do something like

	if (memblock_bottom_up()) {
		/* first try bottom-up above the kernel image */
		bottom_up_start = max(start, kernel_end);
		ret = __memblock_find_range_bottom_up(bottom_up_start, end,
						      size, align, nid, flags);
		if (ret)
			return ret;

		/* then fall back to top-down in [start, kernel_start) */
		bottom_up_start = max(start, 0);
		end = kernel_start;
		ret = __memblock_find_range_top_down(bottom_up_start, end,
						     size, align, nid, flags);
		if (ret)
			return ret;
	}
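
To make the fallback order concrete, here is a minimal userspace sketch of
the same idea. It is only a toy model, not memblock code: find_bottom_up(),
find_top_down(), find_around_kernel(), TOY_PAGE_SIZE and MiB are hypothetical
names, and memory is assumed to be a single free range with no reserved
regions, NUMA nodes or flags. As in memblock, 0 means "nothing found", so the
first page is never handed out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

#define TOY_PAGE_SIZE ((phys_addr_t)4096)	/* assumption: 4K pages */
#define MiB ((phys_addr_t)1 << 20)

static phys_addr_t align_up(phys_addr_t x, phys_addr_t a)
{
	return (x + a - 1) / a * a;
}

static phys_addr_t align_down(phys_addr_t x, phys_addr_t a)
{
	return x / a * a;
}

/* Lowest aligned block of 'size' bytes inside [start, end), or 0. */
static phys_addr_t find_bottom_up(phys_addr_t start, phys_addr_t end,
				  phys_addr_t size, phys_addr_t align)
{
	phys_addr_t cand = align_up(start, align);

	if (cand < end && end - cand >= size)
		return cand;
	return 0;
}

/* Highest aligned block of 'size' bytes inside [start, end), or 0. */
static phys_addr_t find_top_down(phys_addr_t start, phys_addr_t end,
				 phys_addr_t size, phys_addr_t align)
{
	phys_addr_t cand;

	if (end <= start || end - start < size)
		return 0;
	cand = align_down(end - size, align);
	return cand >= start ? cand : 0;
}

/*
 * Toy version of the proposed order: bottom-up above kernel_end first,
 * then top-down just below kernel_start.
 */
phys_addr_t find_around_kernel(phys_addr_t start, phys_addr_t end,
			       phys_addr_t kernel_start,
			       phys_addr_t kernel_end,
			       phys_addr_t size, phys_addr_t align)
{
	phys_addr_t lo, hi, ret;

	assert(size > 0 && align > 0);

	/* never return address 0, it doubles as the failure value */
	if (start < TOY_PAGE_SIZE)
		start = TOY_PAGE_SIZE;

	lo = start > kernel_end ? start : kernel_end;
	ret = find_bottom_up(lo, end, size, align);
	if (ret)
		return ret;

	hi = end < kernel_start ? end : kernel_start;
	return find_top_down(start, hi, size, align);
}
```

With 1G of memory and the kernel at [16M, 32M), a 1M request lands right at
32M. If KASLR puts the kernel at [1008M, 1024M), the bottom-up step finds
nothing and the request falls back to 1007M, just below kernel_start.
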
> This issue must break hotplug; it is only covered up because bare metal
> systems currently need to add 'nokaslr' to disable KASLR while another
> bug fix is under discussion, as below.
>
> [PATCH v14 0/5] x86/boot/KASLR: Parse ACPI table and limit KASLR to choosing immovable memory
> lkml.kernel.org/r/20181214093013.13370-1-fanc.fnst@cn.fujitsu.com
>
> [~ ]$ git grep memblock_set_bottom_up
> arch/alpha/kernel/setup.c: memblock_set_bottom_up(true);
> arch/m68k/mm/motorola.c: memblock_set_bottom_up(true);
> arch/mips/kernel/setup.c: memblock_set_bottom_up(true);
> arch/mips/kernel/traps.c: memblock_set_bottom_up(false);
> arch/nds32/kernel/setup.c: memblock_set_bottom_up(true);
> arch/powerpc/kernel/paca.c: memblock_set_bottom_up(true);
> arch/powerpc/kernel/paca.c: memblock_set_bottom_up(false);
> arch/s390/kernel/setup.c: memblock_set_bottom_up(true);
> arch/s390/kernel/setup.c: memblock_set_bottom_up(false);
> arch/sparc/mm/init_32.c: memblock_set_bottom_up(true);
> arch/x86/kernel/setup.c: memblock_set_bottom_up(true);
> arch/x86/mm/numa.c: memblock_set_bottom_up(false);
> include/linux/memblock.h:static inline void __init memblock_set_bottom_up(bool enable)
>
--
Sincerely yours,
Mike.