* [PATCH 0/3] arm64: simplify and optimize kernel mapping
@ 2016-03-03 13:09 Ard Biesheuvel
2016-03-03 13:09 ` [PATCH 1/3] arm64: move early boot code to the .init segment Ard Biesheuvel
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
To: linux-arm-kernel
This series makes a couple of minor changes that should result in the
kernel being mapped in a more efficient manner.
First of all, it merges the .head.text section with the .text section (patch #2)
after moving everything except the kernel and EFI headers into the __init
section (patch #1).
Then, it standardizes the segment alignment to 64 KB for all page sizes
(patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
patch applied), we lose 80 KB in total to padding, but the resulting
mappings do look somewhat better.
Before:
0xffff000008082000-0xffff000008090000 56K ro x SHD AF UXN MEM
0xffff000008090000-0xffff000008200000 1472K ro x SHD AF CON UXN MEM
0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
0xffff000008600000-0xffff000008660000 384K ro x SHD AF CON UXN MEM
0xffff000008660000-0xffff00000866c000 48K ro x SHD AF UXN MEM
0xffff00000866c000-0xffff000008670000 16K ro NX SHD AF UXN MEM
0xffff000008670000-0xffff000008900000 2624K ro NX SHD AF CON UXN MEM
0xffff000008900000-0xffff000008909000 36K ro NX SHD AF UXN MEM
0xffff000008c39000-0xffff000008c40000 28K RW NX SHD AF UXN MEM
0xffff000008c40000-0xffff000008d50000 1088K RW NX SHD AF CON UXN MEM
0xffff000008d50000-0xffff000008d57000 28K RW NX SHD AF UXN MEM
After:
0xffff000008080000-0xffff000008200000 1536K ro x SHD AF CON UXN MEM
0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
0xffff000008600000-0xffff000008670000 448K ro x SHD AF CON UXN MEM
0xffff000008670000-0xffff000008910000 2688K ro NX SHD AF CON UXN MEM
0xffff000008c50000-0xffff000008d60000 1088K RW NX SHD AF CON UXN MEM
0xffff000008d60000-0xffff000008d6b000 44K RW NX SHD AF UXN MEM
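For a rough sense of where the padding goes: with patch #3 applied, each of the
four segments is rounded up to the 64 KB segment alignment individually. A
minimal sketch, using hypothetical segment sizes (the real vmlinux figures will
differ):

```python
# Illustrative only: padding cost of aligning each kernel segment to 64 KB.
# The four payload sizes below are made-up stand-ins, not real vmlinux data.
SZ_64K = 64 * 1024

def round_up(size, align):
    """Round size up to the next multiple of align (align must be a power of two)."""
    return (size + align - 1) & ~(align - 1)

segments = {            # hypothetical payload sizes, in bytes
    ".text":   6100 * 1024,
    ".rodata": 2660 * 1024,
    ".init":    803 * 1024,
    ".data":   1117 * 1024,
}

padding = sum(round_up(size, SZ_64K) - size for size in segments.values())
print(padding // 1024, "KB lost to segment padding")  # 136 KB for these sizes
```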
I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
(only the Kconfig change is not included here).
Ard Biesheuvel (3):
arm64: move early boot code to the .init segment
arm64: cover the .head.text section in the .text segment mapping
arm64: simplify kernel segment mapping granularity
arch/arm64/kernel/efi-entry.S | 2 +-
arch/arm64/kernel/head.S | 32 +++++++++-----------
arch/arm64/kernel/image.h | 4 +++
arch/arm64/kernel/vmlinux.lds.S | 26 +++++++++-------
arch/arm64/mm/mmu.c | 10 +++---
5 files changed, 40 insertions(+), 34 deletions(-)
--
2.5.0
* [PATCH 1/3] arm64: move early boot code to the .init segment
2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
@ 2016-03-03 13:09 ` Ard Biesheuvel
2016-03-03 13:09 ` [PATCH 2/3] arm64: cover the .head.text section in the .text segment mapping Ard Biesheuvel
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
To: linux-arm-kernel
Apart from the arm64/linux and EFI header data structures, there is nothing
in the .head.text section that must reside at the beginning of the Image.
So let's move it to the .init section where it belongs.
Note that this involves some minor tweaking of the EFI header, primarily
because the address of 'stext' no longer coincides with the start of the
.text section. It also requires a couple of relocated symbol references
to be slightly rewritten or their definition moved to the linker script.
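For reference, the code-related PE/COFF fields that the patch rewrites are
interrelated; a hypothetical sketch of the invariants (the offsets are made up
for illustration, only the relationships matter):

```python
# Hypothetical image layout, for illustration only. After the patch, every
# code-related PE/COFF field is expressed relative to efi_header_end
# instead of stext.
_head          = 0x0
efi_header_end = 0x1000     # assumed: header padded to the 4 KB .text alignment
_edata         = 0xa00000   # assumed end of initialised data in the file
_end           = 0xb00000   # assumed end of the loaded image

size_of_headers     = efi_header_end - _head   # SizeOfHeaders
base_of_code        = efi_header_end - _head   # BaseOfCode
pointer_to_raw_data = efi_header_end - _head   # PointerToRawData
virtual_size        = _end - efi_header_end    # VirtualSize
size_of_raw_data    = _edata - efi_header_end  # SizeOfRawData
size_of_image       = _end - _head             # SizeOfImage

# Everything before efi_header_end counts as header, so header plus code
# must tile the whole image:
assert size_of_headers + virtual_size == size_of_image
```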
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/efi-entry.S | 2 +-
arch/arm64/kernel/head.S | 32 +++++++++-----------
arch/arm64/kernel/image.h | 4 +++
3 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index f82036e02485..936022f0655e 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
*/
mov x20, x0 // DTB address
ldr x0, [sp, #16] // relocated _text address
- movz x21, #:abs_g0:stext_offset
+ ldr w21, =stext_offset
add x21, x0, x21
/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 50c2134a4aaf..af522c853b7f 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -101,8 +101,6 @@ _head:
#endif
#ifdef CONFIG_EFI
- .globl __efistub_stext_offset
- .set __efistub_stext_offset, stext - _head
.align 3
pe_header:
.ascii "PE"
@@ -122,11 +120,11 @@ optional_header:
.short 0x20b // PE32+ format
.byte 0x02 // MajorLinkerVersion
.byte 0x14 // MinorLinkerVersion
- .long _end - stext // SizeOfCode
+ .long _end - efi_header_end // SizeOfCode
.long 0 // SizeOfInitializedData
.long 0 // SizeOfUninitializedData
.long __efistub_entry - _head // AddressOfEntryPoint
- .long __efistub_stext_offset // BaseOfCode
+ .long efi_header_end - _head // BaseOfCode
extra_header_fields:
.quad 0 // ImageBase
@@ -143,7 +141,7 @@ extra_header_fields:
.long _end - _head // SizeOfImage
// Everything before the kernel image is considered part of the header
- .long __efistub_stext_offset // SizeOfHeaders
+ .long efi_header_end - _head // SizeOfHeaders
.long 0 // CheckSum
.short 0xa // Subsystem (EFI application)
.short 0 // DllCharacteristics
@@ -187,10 +185,10 @@ section_table:
.byte 0
.byte 0
.byte 0 // end of 0 padding of section name
- .long _end - stext // VirtualSize
- .long __efistub_stext_offset // VirtualAddress
- .long _edata - stext // SizeOfRawData
- .long __efistub_stext_offset // PointerToRawData
+ .long _end - efi_header_end // VirtualSize
+ .long efi_header_end - _head // VirtualAddress
+ .long _edata - efi_header_end // SizeOfRawData
+ .long efi_header_end - _head // PointerToRawData
.long 0 // PointerToRelocations (0 for executables)
.long 0 // PointerToLineNumbers (0 for executables)
@@ -199,15 +197,18 @@ section_table:
.long 0xe0500020 // Characteristics (section flags)
/*
- * EFI will load stext onwards at the 4k section alignment
+ * EFI will load .text onwards at the 4k section alignment
* described in the PE/COFF header. To ensure that instruction
* sequences using an adrp and a :lo12: immediate will function
- * correctly at this alignment, we must ensure that stext is
+ * correctly at this alignment, we must ensure that .text is
* placed at a 4k boundary in the Image to begin with.
*/
.align 12
+efi_header_end:
#endif
+ __INIT
+
ENTRY(stext)
bl preserve_boot_args
bl el2_setup // Drop to EL1, w20=cpu_boot_mode
@@ -222,12 +223,12 @@ ENTRY(stext)
* the TCR will have been set.
*/
ldr x27, 0f // address to jump to after
- // MMU has been enabled
+ neg x27, x27 // MMU has been enabled
adr_l lr, __enable_mmu // return (PIC) address
b __cpu_setup // initialise processor
ENDPROC(stext)
.align 3
-0: .quad __mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
+0: .quad (_text - TEXT_OFFSET) - __mmap_switched - KIMAGE_VADDR
/*
* Preserve the arguments passed by the bootloader in x0 .. x3
@@ -396,7 +397,7 @@ __create_page_tables:
ldr x5, =KIMAGE_VADDR
add x5, x5, x23 // add KASLR displacement
create_pgd_entry x0, x5, x3, x6
- ldr w6, kernel_img_size
+ ldr w6, =kernel_img_size
add x6, x6, x5
mov x3, x24 // phys offset
create_block_map x0, x7, x3, x5, x6
@@ -413,9 +414,6 @@ __create_page_tables:
ret x28
ENDPROC(__create_page_tables)
-
-kernel_img_size:
- .long _end - (_head - TEXT_OFFSET)
.ltorg
/*
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index db1bf57948f1..5ff892f40a0a 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -71,8 +71,12 @@
DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET); \
DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
+kernel_img_size = _end - (_text - TEXT_OFFSET);
+
#ifdef CONFIG_EFI
+__efistub_stext_offset = stext - _text;
+
/*
* Prevent the symbol aliases below from being emitted into the kallsyms
* table, by forcing them to be absolute symbols (which are conveniently
--
2.5.0
* [PATCH 2/3] arm64: cover the .head.text section in the .text segment mapping
2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
2016-03-03 13:09 ` [PATCH 1/3] arm64: move early boot code to the .init segment Ard Biesheuvel
@ 2016-03-03 13:09 ` Ard Biesheuvel
2016-03-03 13:09 ` [PATCH 3/3] arm64: simplify kernel segment mapping granularity Ard Biesheuvel
2016-03-07 1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
3 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
To: linux-arm-kernel
Keeping .head.text out of the .text mapping buys us very little: its actual
payload is only 4 KB, most of which is padding, but the page alignment may
add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional
padding to the uncompressed kernel Image.
Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to
map the adjacent 56 KB of code without the PTE_CONT attribute, and since
this region contains the GIC interrupt handling entry point, among other
things, it is likely to benefit from the reduced TLB pressure that results
from PTE_CONT mappings.
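The 56 KB figure follows directly from the alignment arithmetic: on a 4 KB
granule, a PTE_CONT run covers 16 pages (64 KB) and can only start at a 64 KB
aligned virtual address. A minimal sketch (constants assumed for the 4 KB
granule; the start address is the _stext value from the 'Before' listing in the
cover letter):

```python
# Sketch: bytes that must be mapped with plain (non-PTE_CONT) PTEs before
# the first contiguous-capable boundary. Constants are the 4 KB granule
# values: 16 contiguous PTEs covering 64 KB.
PAGE_SIZE = 4 * 1024
CONT_PTES = 16
CONT_SIZE = CONT_PTES * PAGE_SIZE   # 64 KB

def non_cont_head(va_start):
    """Distance from va_start up to the next CONT_SIZE-aligned address."""
    return (-va_start) % CONT_SIZE

# _stext at base + 0x2000, as in the 'Before' mapping dump:
print(non_cont_head(0xffff000008082000) // 1024, "KB")  # 56 KB
```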
So remove the alignment between the .head.text and .text sections, and use
the [_text, _etext) rather than the [_stext, _etext) interval for mapping
the .text segment.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/vmlinux.lds.S | 1 -
arch/arm64/mm/mmu.c | 10 +++++-----
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 4c56e7a0621b..7a141c098bbb 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -96,7 +96,6 @@ SECTIONS
_text = .;
HEAD_TEXT
}
- ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
.text : { /* Real text segment */
_stext = .; /* Text and read-only data */
__exception_text_start = .;
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d2d8b8c2e17f..1d727018e90b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -387,7 +387,7 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
{
- unsigned long kernel_start = __pa(_stext);
+ unsigned long kernel_start = __pa(_text);
unsigned long kernel_end = __pa(_etext);
/*
@@ -419,7 +419,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
early_pgtable_alloc);
/*
- * Map the linear alias of the [_stext, _etext) interval as
+ * Map the linear alias of the [_text, _etext) interval as
* read-only/non-executable. This makes the contents of the
* region accessible to subsystems such as hibernate, but
* protects it from inadvertent modification or execution.
@@ -451,8 +451,8 @@ void mark_rodata_ro(void)
{
unsigned long section_size;
- section_size = (unsigned long)__start_rodata - (unsigned long)_stext;
- create_mapping_late(__pa(_stext), (unsigned long)_stext,
+ section_size = (unsigned long)__start_rodata - (unsigned long)_text;
+ create_mapping_late(__pa(_text), (unsigned long)_text,
section_size, PAGE_KERNEL_ROX);
/*
* mark .rodata as read only. Use _etext rather than __end_rodata to
@@ -501,7 +501,7 @@ static void __init map_kernel(pgd_t *pgd)
{
static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;
- map_kernel_chunk(pgd, _stext, __start_rodata, PAGE_KERNEL_EXEC, &vmlinux_text);
+ map_kernel_chunk(pgd, _text, __start_rodata, PAGE_KERNEL_EXEC, &vmlinux_text);
map_kernel_chunk(pgd, __start_rodata, _etext, PAGE_KERNEL, &vmlinux_rodata);
map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
&vmlinux_init);
--
2.5.0
* [PATCH 3/3] arm64: simplify kernel segment mapping granularity
2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
2016-03-03 13:09 ` [PATCH 1/3] arm64: move early boot code to the .init segment Ard Biesheuvel
2016-03-03 13:09 ` [PATCH 2/3] arm64: cover the .head.text section in the .text segment mapping Ard Biesheuvel
@ 2016-03-03 13:09 ` Ard Biesheuvel
2016-03-07 1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
3 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
To: linux-arm-kernel
The mapping of the kernel consists of four segments, each of which is mapped
with different permission attributes and/or lifetimes. To optimize the TLB
and translation table footprint, we define various opaque constants in the
linker script that resolve to different alignment values depending on the
page size and whether CONFIG_DEBUG_ALIGN_RODATA is set.
Considering that
- a 4 KB granule kernel benefits from a 64 KB segment alignment (due to
the fact that it allows the use of the contiguous bit),
- the minimum alignment of the .data segment is THREAD_SIZE already, not
PAGE_SIZE (i.e., we already have padding between _data and the start of
the .data payload in many cases),
- 2 MB is a suitable alignment value on all granule sizes, either for
mapping directly (level 2 on 4 KB), or via the contiguous bit (level 3 on
16 KB and 64 KB),
- anything beyond 2 MB exceeds the minimum alignment mandated by the boot
protocol, and can only be mapped efficiently if the physical alignment
happens to be the same,
we can simplify this by standardizing on 64 KB (or 2 MB) explicitly, i.e.,
regardless of granule size, all segments are aligned either to 64 KB, or to
2 MB if CONFIG_DEBUG_ALIGN_RODATA=y.
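The entry counts quoted above can be sanity-checked with a short sketch; the
per-granule parameters (contiguous run length, level 2 block size) are the
architectural values assumed here:

```python
# Sketch: how one segment of size `align` maps on each granule. Assumed
# architectural parameters: contiguous runs of 16/128/32 PTEs and level 2
# blocks of 2 MB/32 MB/512 MB on 4/16/64 KB granules respectively.
SZ_64K = 64 * 1024
SZ_2M = 2 * 1024 * 1024

def segment_mapping(align, page, cont_ptes, block):
    """Return (entry count, description) for an align-sized region."""
    if align % block == 0:
        return align // block, "level 2 block(s)"
    entries = align // page
    cont = align % (cont_ptes * page) == 0
    return entries, "level 3 entries, %s contiguous bit" % ("with" if cont else "without")

granules = {  # page size: (contiguous PTEs, level 2 block size)
    4  * 1024: (16,  SZ_2M),
    16 * 1024: (128, 32 * 1024 * 1024),
    64 * 1024: (32,  512 * 1024 * 1024),
}

for align in (SZ_64K, SZ_2M):
    for page, (cont_ptes, block) in granules.items():
        n, desc = segment_mapping(align, page, cont_ptes, block)
        print("%4d KB align, %2d KB granule: %3d %s" % (align // 1024, page // 1024, n, desc))
```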
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/kernel/vmlinux.lds.S | 25 ++++++++++++--------
1 file changed, 15 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7a141c098bbb..2c60d19b038c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -63,14 +63,19 @@ PECOFF_FILE_ALIGNMENT = 0x200;
#endif
#if defined(CONFIG_DEBUG_ALIGN_RODATA)
-#define ALIGN_DEBUG_RO . = ALIGN(1<<SECTION_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min) ALIGN_DEBUG_RO
-#elif defined(CONFIG_DEBUG_RODATA)
-#define ALIGN_DEBUG_RO . = ALIGN(1<<PAGE_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min) ALIGN_DEBUG_RO
+/*
+ * 4 KB granule: 1 level 2 entry
+ * 16 KB granule: 128 level 3 entries, with contiguous bit
+ * 64 KB granule: 32 level 3 entries, with contiguous bit
+ */
+#define SEGMENT_ALIGN SZ_2M
#else
-#define ALIGN_DEBUG_RO
-#define ALIGN_DEBUG_RO_MIN(min) . = ALIGN(min);
+/*
+ * 4 KB granule: 16 level 3 entries, with contiguous bit
+ * 16 KB granule: 4 level 3 entries, without contiguous bit
+ * 64 KB granule: 1 level 3 entry
+ */
+#define SEGMENT_ALIGN SZ_64K
#endif
SECTIONS
@@ -113,12 +118,12 @@ SECTIONS
*(.got) /* Global offset table */
}
- ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
+ . = ALIGN(SEGMENT_ALIGN);
RO_DATA(PAGE_SIZE) /* everything from this point to */
EXCEPTION_TABLE(8) /* _etext will be marked RO NX */
NOTES
- ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
+ . = ALIGN(SEGMENT_ALIGN);
_etext = .; /* End of text and rodata section */
__init_begin = .;
@@ -166,7 +171,7 @@ SECTIONS
*(.hash)
}
- . = ALIGN(PAGE_SIZE);
+ . = ALIGN(SEGMENT_ALIGN);
__init_end = .;
_data = .;
--
2.5.0
* [PATCH 0/3] arm64: simplify and optimize kernel mapping
2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
` (2 preceding siblings ...)
2016-03-03 13:09 ` [PATCH 3/3] arm64: simplify kernel segment mapping granularity Ard Biesheuvel
@ 2016-03-07 1:40 ` Mark Rutland
2016-03-08 0:38 ` Jeremy Linton
2016-03-09 5:03 ` Ard Biesheuvel
3 siblings, 2 replies; 7+ messages in thread
From: Mark Rutland @ 2016-03-07 1:40 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
I like this series, though I have a few minor comments below.
On Thu, Mar 03, 2016 at 02:09:38PM +0100, Ard Biesheuvel wrote:
> This series makes a couple of minor changes that should result in the
> kernel being mapped in a more efficient manner.
>
> First of all, it merges the .head.text with the .text section (patch #2)
> after moving everything except the kernel and EFI header into the __init
> section (patch #1)
Face-to-face, you suggested it might be possible to move .init before .text, so
we could place the EFI header in there too (and keep .text aligned while making
it smaller). Is there some reason we missed that means we cannot do this?
> Then, it standardizes the segment alignment to 64 KB for all page sizes.
> (patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
> patch applied), we lose 80 KB in total to padding, but the resulting
> mappings do look somewhat better.
I suspect some people might want a minimal alignment option for tinification
purposes, but this sounds fine to me as a default. I also think we can wait
until someone asks.
As a general thing, currently we use "chunk" instead of "segment" in the mm
code. We only used "chunk" so as to not overload "section". For consistency it
would be nice to either keep with "chunk" or convert existing uses to
"segment". I much prefer the latter!
> Before:
>
> 0xffff000008082000-0xffff000008090000 56K ro x SHD AF UXN MEM
> 0xffff000008090000-0xffff000008200000 1472K ro x SHD AF CON UXN MEM
> 0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
> 0xffff000008600000-0xffff000008660000 384K ro x SHD AF CON UXN MEM
> 0xffff000008660000-0xffff00000866c000 48K ro x SHD AF UXN MEM
> 0xffff00000866c000-0xffff000008670000 16K ro NX SHD AF UXN MEM
> 0xffff000008670000-0xffff000008900000 2624K ro NX SHD AF CON UXN MEM
> 0xffff000008900000-0xffff000008909000 36K ro NX SHD AF UXN MEM
> 0xffff000008c39000-0xffff000008c40000 28K RW NX SHD AF UXN MEM
> 0xffff000008c40000-0xffff000008d50000 1088K RW NX SHD AF CON UXN MEM
> 0xffff000008d50000-0xffff000008d57000 28K RW NX SHD AF UXN MEM
>
> After:
>
> 0xffff000008080000-0xffff000008200000 1536K ro x SHD AF CON UXN MEM
> 0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
> 0xffff000008600000-0xffff000008670000 448K ro x SHD AF CON UXN MEM
> 0xffff000008670000-0xffff000008910000 2688K ro NX SHD AF CON UXN MEM
> 0xffff000008c50000-0xffff000008d60000 1088K RW NX SHD AF CON UXN MEM
> 0xffff000008d60000-0xffff000008d6b000 44K RW NX SHD AF UXN MEM
>
> I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
> when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
> (only the Kconfig change is not included here)
Jeremy, do you have any thoughts on this series?
Thanks,
Mark.
> Ard Biesheuvel (3):
> arm64: move early boot code to the .init segment
> arm64: cover the .head.text section in the .text segment mapping
> arm64: simplify kernel segment mapping granularity
>
> arch/arm64/kernel/efi-entry.S | 2 +-
> arch/arm64/kernel/head.S | 32 +++++++++-----------
> arch/arm64/kernel/image.h | 4 +++
> arch/arm64/kernel/vmlinux.lds.S | 26 +++++++++-------
> arch/arm64/mm/mmu.c | 10 +++---
> 5 files changed, 40 insertions(+), 34 deletions(-)
>
> --
> 2.5.0
>
* [PATCH 0/3] arm64: simplify and optimize kernel mapping
2016-03-07 1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
@ 2016-03-08 0:38 ` Jeremy Linton
2016-03-09 5:03 ` Ard Biesheuvel
1 sibling, 0 replies; 7+ messages in thread
From: Jeremy Linton @ 2016-03-08 0:38 UTC (permalink / raw)
To: linux-arm-kernel
On 03/06/2016 07:40 PM, Mark Rutland wrote:
> Hi,
>
> I like this series, though I have a few minor comments below.
>
> On Thu, Mar 03, 2016 at 02:09:38PM +0100, Ard Biesheuvel wrote:
>> This series makes a couple of minor changes that should result in the
>> kernel being mapped in a more efficient manner.
>>
>> First of all, it merges the .head.text with the .text section (patch #2)
>> after moving everything except the kernel and EFI header into the __init
>> section (patch #1)
>
> Face-to-face, you suggested it might be possible to move .init before .text, so
> we could place the EFI header in there too (and keep .text aligned while making
> it smaller). Is there some reason we missed that means we cannot do this?
>
>> Then, it standardizes the segment alignment to 64 KB for all page sizes.
>> (patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
>> patch applied), we lose 80 KB in total to padding, but the resulting
>> mappings do look somewhat better.
>
> I suspect some people might want a minimal alignment option for tinification
> purposes, but this sounds fine to me as a default. I also think we can wait
> until someone asks.
>
> As a general thing, currently we use "chunk" instead of "segment" in the mm
> code. We only used "chunk" so as to not overload "section". For consistency it
> would be nice to either keep with "chunk" or convert existing uses to
> "segment". I much prefer the latter!
>
>> Before:
>>
>> 0xffff000008082000-0xffff000008090000 56K ro x SHD AF UXN MEM
>> 0xffff000008090000-0xffff000008200000 1472K ro x SHD AF CON UXN MEM
>> 0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
>> 0xffff000008600000-0xffff000008660000 384K ro x SHD AF CON UXN MEM
>> 0xffff000008660000-0xffff00000866c000 48K ro x SHD AF UXN MEM
>> 0xffff00000866c000-0xffff000008670000 16K ro NX SHD AF UXN MEM
>> 0xffff000008670000-0xffff000008900000 2624K ro NX SHD AF CON UXN MEM
>> 0xffff000008900000-0xffff000008909000 36K ro NX SHD AF UXN MEM
>> 0xffff000008c39000-0xffff000008c40000 28K RW NX SHD AF UXN MEM
>> 0xffff000008c40000-0xffff000008d50000 1088K RW NX SHD AF CON UXN MEM
>> 0xffff000008d50000-0xffff000008d57000 28K RW NX SHD AF UXN MEM
>>
>> After:
>>
>> 0xffff000008080000-0xffff000008200000 1536K ro x SHD AF CON UXN MEM
>> 0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
>> 0xffff000008600000-0xffff000008670000 448K ro x SHD AF CON UXN MEM
>> 0xffff000008670000-0xffff000008910000 2688K ro NX SHD AF CON UXN MEM
>> 0xffff000008c50000-0xffff000008d60000 1088K RW NX SHD AF CON UXN MEM
>> 0xffff000008d60000-0xffff000008d6b000 44K RW NX SHD AF UXN MEM
>>
>> I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
>> when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
>> (only the Kconfig change is not included here)
>
> Jeremy, do you have any thoughts on this series?
I was waiting to see if anyone questioned padding the minimum kernel
alignment. But other than that, it looks good.
I've been meaning to test it at 64k pages to see how much rearranging
everything helps. I guess I will be rolling another CONT set to address
Will's reluctance to have the extra TLB flushes (and probably rework the
block break case as well) unless someone decides they want this now. I
will make sure that these two patch sets merge cleanly when I do that.
Thanks,
* [PATCH 0/3] arm64: simplify and optimize kernel mapping
2016-03-07 1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
2016-03-08 0:38 ` Jeremy Linton
@ 2016-03-09 5:03 ` Ard Biesheuvel
1 sibling, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-09 5:03 UTC (permalink / raw)
To: linux-arm-kernel
On 7 March 2016 at 08:40, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> I like this series, though I have a few minor comments below.
>
> On Thu, Mar 03, 2016 at 02:09:38PM +0100, Ard Biesheuvel wrote:
>> This series makes a couple of minor changes that should result in the
>> kernel being mapped in a more efficient manner.
>>
>> First of all, it merges the .head.text with the .text section (patch #2)
>> after moving everything except the kernel and EFI header into the __init
>> section (patch #1)
>
> Face-to-face, you suggested it might be possible to move .init before .text, so
> we could place the EFI header in there too (and keep .text aligned while making
> it smaller). Is there some reason we missed that means we cannot do this?
>
Well, I tried implementing this, and it breaks the PIE linking. That
itself is probably a binutils problem, and I will try to follow up on
that with the toolchain group. But in the meantime, it means we cannot
reorder .text and .init.
>> Then, it standardizes the segment alignment to 64 KB for all page sizes.
>> (patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
>> patch applied), we lose 80 KB in total to padding, but the resulting
>> mappings do look somewhat better.
>
> I suspect some people might want a minimal alignment option for tinification
> purposes, but this sounds fine to me as a default. I also think we can wait
> until someone asks.
>
> As a general thing, currently we use "chunk" instead of "segment" in the mm
> code. We only used "chunk" so as to not overload "section". For consistency it
> would be nice to either keep with "chunk" or convert existing uses to
> "segment". I much prefer the latter!
>
OK
>> Before:
>>
>> 0xffff000008082000-0xffff000008090000 56K ro x SHD AF UXN MEM
>> 0xffff000008090000-0xffff000008200000 1472K ro x SHD AF CON UXN MEM
>> 0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
>> 0xffff000008600000-0xffff000008660000 384K ro x SHD AF CON UXN MEM
>> 0xffff000008660000-0xffff00000866c000 48K ro x SHD AF UXN MEM
>> 0xffff00000866c000-0xffff000008670000 16K ro NX SHD AF UXN MEM
>> 0xffff000008670000-0xffff000008900000 2624K ro NX SHD AF CON UXN MEM
>> 0xffff000008900000-0xffff000008909000 36K ro NX SHD AF UXN MEM
>> 0xffff000008c39000-0xffff000008c40000 28K RW NX SHD AF UXN MEM
>> 0xffff000008c40000-0xffff000008d50000 1088K RW NX SHD AF CON UXN MEM
>> 0xffff000008d50000-0xffff000008d57000 28K RW NX SHD AF UXN MEM
>>
>> After:
>>
>> 0xffff000008080000-0xffff000008200000 1536K ro x SHD AF CON UXN MEM
>> 0xffff000008200000-0xffff000008600000 4M ro x SHD AF BLK UXN MEM
>> 0xffff000008600000-0xffff000008670000 448K ro x SHD AF CON UXN MEM
>> 0xffff000008670000-0xffff000008910000 2688K ro NX SHD AF CON UXN MEM
>> 0xffff000008c50000-0xffff000008d60000 1088K RW NX SHD AF CON UXN MEM
>> 0xffff000008d60000-0xffff000008d6b000 44K RW NX SHD AF UXN MEM
>>
>> I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
>> when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
>> (only the Kconfig change is not included here)
>
> Jeremy, do you have any thoughts on this series?
>
> Thanks,
> Mark.
>
>> Ard Biesheuvel (3):
>> arm64: move early boot code to the .init segment
>> arm64: cover the .head.text section in the .text segment mapping
>> arm64: simplify kernel segment mapping granularity
>>
>> arch/arm64/kernel/efi-entry.S | 2 +-
>> arch/arm64/kernel/head.S | 32 +++++++++-----------
>> arch/arm64/kernel/image.h | 4 +++
>> arch/arm64/kernel/vmlinux.lds.S | 26 +++++++++-------
>> arch/arm64/mm/mmu.c | 10 +++---
>> 5 files changed, 40 insertions(+), 34 deletions(-)
>>
>> --
>> 2.5.0
>>