From: Mike Rapoport <rppt@kernel.org>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: Mike Rapoport <rppt@linux.ibm.com>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
David Hildenbrand <david@redhat.com>,
Heiko Carstens <hca@linux.ibm.com>,
Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
Vasily Gorbik <gor@linux.ibm.com>, Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
linux-mm@kvack.org, linux-s390@vger.kernel.org
Subject: Re: [RFC/RFT PATCH 2/5] memblock: introduce generic memblock_setup_resources()
Date: Wed, 2 Jun 2021 21:43:32 +0300
Message-ID: <YLfRVGC+tq5L0TZ6@kernel.org>
In-Reply-To: <20210602155141.GM30436@shell.armlinux.org.uk>

On Wed, Jun 02, 2021 at 04:51:41PM +0100, Russell King (Oracle) wrote:
> On Wed, Jun 02, 2021 at 04:54:17PM +0300, Mike Rapoport wrote:
> > On Wed, Jun 02, 2021 at 11:15:21AM +0100, Russell King (Oracle) wrote:
> > > On Wed, Jun 02, 2021 at 11:33:10AM +0300, Mike Rapoport wrote:
> > > > On Tue, Jun 01, 2021 at 02:54:15PM +0100, Russell King (Oracle) wrote:
> > > > > If I look at one of my kernels:
> > > > >
> > > > > c0008000 T _text
> > > > > c0b5b000 R __end_rodata
> > > > > ... exception and unwind tables live here ...
> > > > > c0c00000 T __init_begin
> > > > > c0e00000 D _sdata
> > > > > c0e68870 D _edata
> > > > > c0e68870 B __bss_start
> > > > > c0e995d4 B __bss_stop
> > > > > c0e995d4 B _end
> > > > >
> > > > > So the original covers _text..__init_begin-1 which includes the
> > > > > exception and unwind tables. Your version above omits these, which
> > > > > leaves them exposed.
> > > >
> > > > Right, this needs to be fixed. Is there any reason the exception and unwind
> > > > tables cannot be placed between _sdata and _edata?
> > > >
> > > > It seems to me that they were left outside for purely historical reasons.
> > > > Commit ee951c630c5c ("ARM: 7568/1: Sort exception table at compile time")
> > > > moved the exception tables out of .data section before _sdata existed.
> > > > Commit 14c4a533e099 ("ARM: 8583/1: mm: fix location of _etext") moved
> > > > _etext before the unwind tables and didn't bother to put them into data or
> > > > rodata areas.
> > >
> > > You can not assume that all sections will be between these symbols. This
> > > isn't specific to 32-bit ARM. If you look at x86's vmlinux.lds.in, you
> > > will see that BUG_TABLE and ORC_UNWIND_TABLE are after _edata, along
> > > with many other undiscarded sections before __bss_start.
> >
> > But if you look at x86's setup_arch() all these never make it to the
> > resource tree. So there are holes in /proc/iomem between the kernel
> > resources.
>
> Also true. However, my point was to counter your claim that these
> sections should be part of the .text/.data/.rodata etc sections in the
> output vmlinux.
>
> There is, however, a more important point. The __ex_table section
> must exist and be separate from the .text/.data/.rodata sections in
> the output ELF file, as sorttable (the exception table sorter) relies
> on this to be able to find the table and sort it.
>
> So, it isn't entirely "for historical reasons" as you said two messages
> ago.

Back when __ex_table was moved out of the .data section, _sdata and _edata
were part of the .data section. Today they are not. So something like the
patch below would ensure, for instance, that __ex_table is part of
"Kernel data" in /proc/iomem without moving it into the .data section:

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index f7f4620d59c3..2991feceab31 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -72,13 +72,6 @@ SECTIONS
 
 	RO_DATA(PAGE_SIZE)
 
-	. = ALIGN(4);
-	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
-		__start___ex_table = .;
-		ARM_MMU_KEEP(*(__ex_table))
-		__stop___ex_table = .;
-	}
-
 #ifdef CONFIG_ARM_UNWIND
 	ARM_UNWIND_SECTIONS
 #endif
@@ -143,6 +136,14 @@ SECTIONS
 	__init_end = .;
 
 	_sdata = .;
+
+	. = ALIGN(4);
+	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
+		__start___ex_table = .;
+		ARM_MMU_KEEP(*(__ex_table))
+		__stop___ex_table = .;
+	}
+
 	RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
 	_edata = .;

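
For illustration only, the gap under discussion can be checked against the
symbol addresses quoted earlier in this mail. This is a rough Python sketch,
not kernel code. The "narrower" layout is an assumption made for the example
(code/rodata resource ending at __end_rodata, data resource covering
_sdata.._edata-1); the helper names are hypothetical.

```python
# Symbol addresses copied from the symbol listing quoted earlier in this mail.
SYMBOLS = {
    "_text": 0xC0008000,
    "__end_rodata": 0xC0B5B000,
    "__init_begin": 0xC0C00000,
    "_sdata": 0xC0E00000,
    "_edata": 0xC0E68870,
    "__bss_start": 0xC0E68870,
    "__bss_stop": 0xC0E995D4,
    "_end": 0xC0E995D4,
}


def span(start, end):
    """Inclusive (first, last) address range from `start` up to `end` - 1."""
    return (SYMBOLS[start], SYMBOLS[end] - 1)


def covered(addr, ranges):
    """True if addr falls inside any of the inclusive ranges."""
    return any(first <= addr <= last for first, last in ranges)


# Coverage described in the thread for the original code: _text..__init_begin-1.
original = [span("_text", "__init_begin")]

# ASSUMPTION (for illustration): a layout whose first resource stops at
# __end_rodata and whose data resource covers _sdata.._edata-1.
narrower = [span("_text", "__end_rodata"), span("_sdata", "_edata")]

# Some address in the area where the exception and unwind tables live.
table_addr = SYMBOLS["__end_rodata"] + 0x100

# Size of the area between __end_rodata and __init_begin (may include padding).
gap = SYMBOLS["__init_begin"] - SYMBOLS["__end_rodata"]

print("covered by original layout:", covered(table_addr, original))
print("covered by narrower layout:", covered(table_addr, narrower))
print("gap size in bytes:", hex(gap))
```

With the patch above, __ex_table would be emitted after _sdata, so it would
fall inside the _sdata.._edata-1 range rather than in that gap. The unwind
tables are not moved by this patch.
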
> Now, bear in mind that /proc/iomem is a user API, one which userspace
> depends on. If we start going around making /proc/iomem report stuff
> like kernel boot time reservations as "reserved" memory, we will end up
> breaking the kexec tooling on some platforms. For example, kexec
> tooling for 32-bit ARM parses /proc/iomem, looking for "System RAM",
> "System RAM (boot alias)" and "reserved" regions.
>
> So, I think changes to make this "more consistent" come with high
> risk.

I agree there is a risk, but I don't think it's high. It does not look like
minor changes in "reserved" reporting in /proc/iomem would break kexec
tooling. In any case, the amount of reserved and free memory depends on the
particular system, kernel version, configuration and command line.

I have no intention of reporting kernel boot time reservations
in /proc/iomem on architectures that do not report them there today,
although this also does not seem like a significant factor.

On the other hand, making /proc/iomem reporting consistent among
architectures will make it possible to reduce the complexity of both the
kernel and the kexec tools in the long run.
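
To make the userspace dependency concrete, here is a minimal Python sketch of
the kind of /proc/iomem parsing described above: picking out regions by name.
It is not the kexec-tools implementation. The sample text and its addresses
are invented for illustration, and parse_iomem() and regions_named() are
hypothetical helpers.

```python
import re

# HYPOTHETICAL sample in /proc/iomem format; addresses are made up.
# Child resources are indented by two spaces per nesting level.
SAMPLE = """\
80000000-8fffffff : System RAM
  80008000-80bfffff : Kernel code
  80e00000-80e995d3 : Kernel data
90000000-9fffffff : System RAM (boot alias)
a0000000-a00fffff : reserved
"""

LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_iomem(text):
    """Return a list of (depth, start, end, name) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a resource line
        indent, start, end, name = m.groups()
        depth = len(indent) // 2
        entries.append((depth, int(start, 16), int(end, 16), name.strip()))
    return entries


def regions_named(entries, names):
    """Keep only the entries whose name is in `names`."""
    return [e for e in entries if e[3] in names]


# Region names mentioned in the thread for 32-bit ARM kexec tooling.
WANTED = {"System RAM", "System RAM (boot alias)", "reserved"}

entries = parse_iomem(SAMPLE)
for depth, start, end, name in regions_named(entries, WANTED):
    print(f"{start:08x}-{end:08x} {name}")
```

A tool that matches on exact names like this depends on which regions each
architecture chooses to report under those names, which is the compatibility
concern raised above.
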
--
Sincerely yours,
Mike.