linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/3] arm64: simplify and optimize kernel mapping
@ 2016-03-03 13:09 Ard Biesheuvel
  2016-03-03 13:09 ` [PATCH 1/3] arm64: move early boot code to the .init segment Ard Biesheuvel
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

This series makes a couple of minor changes that should result in the
kernel being mapped in a more efficient manner.

First of all, it merges the .head.text section with the .text section
(patch #2), after moving everything except the kernel and EFI headers into
the __init section (patch #1).

Then, it standardizes the segment alignment to 64 KB for all page sizes
(patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
patch applied), we lose 80 KB in total to padding, but the resulting
mappings do look somewhat better.

Before:

0xffff000008082000-0xffff000008090000    56K  ro x  SHD AF         UXN MEM
0xffff000008090000-0xffff000008200000  1472K  ro x  SHD AF CON     UXN MEM
0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
0xffff000008600000-0xffff000008660000   384K  ro x  SHD AF CON     UXN MEM
0xffff000008660000-0xffff00000866c000    48K  ro x  SHD AF         UXN MEM
0xffff00000866c000-0xffff000008670000    16K  ro NX SHD AF         UXN MEM
0xffff000008670000-0xffff000008900000  2624K  ro NX SHD AF CON     UXN MEM
0xffff000008900000-0xffff000008909000    36K  ro NX SHD AF         UXN MEM
0xffff000008c39000-0xffff000008c40000    28K  RW NX SHD AF         UXN MEM
0xffff000008c40000-0xffff000008d50000  1088K  RW NX SHD AF CON     UXN MEM
0xffff000008d50000-0xffff000008d57000    28K  RW NX SHD AF         UXN MEM

After:

0xffff000008080000-0xffff000008200000  1536K  ro x  SHD AF CON     UXN MEM
0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
0xffff000008600000-0xffff000008670000   448K  ro x  SHD AF CON     UXN MEM
0xffff000008670000-0xffff000008910000  2688K  ro NX SHD AF CON     UXN MEM
0xffff000008c50000-0xffff000008d60000  1088K  RW NX SHD AF CON     UXN MEM
0xffff000008d60000-0xffff000008d6b000    44K  RW NX SHD AF         UXN MEM
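As an aside, the padding cost of a given segment alignment is simple round-up
arithmetic. A minimal sketch (illustrative only, not part of the series; the
example end address is taken from the "Before" map above):

```python
def align_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align (align is a power of two)."""
    return (addr + align - 1) & ~(align - 1)

def padding(addr: int, align: int) -> int:
    """Bytes of padding inserted when the next segment must start aligned."""
    return align_up(addr, align) - addr

SZ_64K = 0x10000
end_of_data = 0xffff000008d57000   # end of the RW region in the "Before" map
print(hex(align_up(end_of_data, SZ_64K)))    # next 64 KB boundary
print(padding(end_of_data, SZ_64K) // 1024)  # KB lost to padding at this point
```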

I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
(only the Kconfig change is not included here)

Ard Biesheuvel (3):
  arm64: move early boot code to the .init segment
  arm64: cover the .head.text section in the .text segment mapping
  arm64: simplify kernel segment mapping granularity

 arch/arm64/kernel/efi-entry.S   |  2 +-
 arch/arm64/kernel/head.S        | 32 +++++++++-----------
 arch/arm64/kernel/image.h       |  4 +++
 arch/arm64/kernel/vmlinux.lds.S | 26 +++++++++-------
 arch/arm64/mm/mmu.c             | 10 +++---
 5 files changed, 40 insertions(+), 34 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] arm64: move early boot code to the .init segment
  2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
@ 2016-03-03 13:09 ` Ard Biesheuvel
  2016-03-03 13:09 ` [PATCH 2/3] arm64: cover the .head.text section in the .text segment mapping Ard Biesheuvel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

Apart from the arm64/Linux and EFI header data structures, there is nothing
in the .head.text section that must reside at the beginning of the Image.
So let's move it to the .init section where it belongs.

Note that this involves some minor tweaking of the EFI header, primarily
because the address of 'stext' no longer coincides with the start of the
.text section. It also requires a couple of relocated symbol references
to be slightly rewritten or their definition moved to the linker script.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi-entry.S |  2 +-
 arch/arm64/kernel/head.S      | 32 +++++++++-----------
 arch/arm64/kernel/image.h     |  4 +++
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index f82036e02485..936022f0655e 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -61,7 +61,7 @@ ENTRY(entry)
 	 */
 	mov	x20, x0		// DTB address
 	ldr	x0, [sp, #16]	// relocated _text address
-	movz	x21, #:abs_g0:stext_offset
+	ldr	w21, =stext_offset
 	add	x21, x0, x21
 
 	/*
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 50c2134a4aaf..af522c853b7f 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -101,8 +101,6 @@ _head:
 #endif
 
 #ifdef CONFIG_EFI
-	.globl	__efistub_stext_offset
-	.set	__efistub_stext_offset, stext - _head
 	.align 3
 pe_header:
 	.ascii	"PE"
@@ -122,11 +120,11 @@ optional_header:
 	.short	0x20b				// PE32+ format
 	.byte	0x02				// MajorLinkerVersion
 	.byte	0x14				// MinorLinkerVersion
-	.long	_end - stext			// SizeOfCode
+	.long	_end - efi_header_end		// SizeOfCode
 	.long	0				// SizeOfInitializedData
 	.long	0				// SizeOfUninitializedData
 	.long	__efistub_entry - _head		// AddressOfEntryPoint
-	.long	__efistub_stext_offset		// BaseOfCode
+	.long	efi_header_end - _head		// BaseOfCode
 
 extra_header_fields:
 	.quad	0				// ImageBase
@@ -143,7 +141,7 @@ extra_header_fields:
 	.long	_end - _head			// SizeOfImage
 
 	// Everything before the kernel image is considered part of the header
-	.long	__efistub_stext_offset		// SizeOfHeaders
+	.long	efi_header_end - _head		// SizeOfHeaders
 	.long	0				// CheckSum
 	.short	0xa				// Subsystem (EFI application)
 	.short	0				// DllCharacteristics
@@ -187,10 +185,10 @@ section_table:
 	.byte	0
 	.byte	0
 	.byte	0        		// end of 0 padding of section name
-	.long	_end - stext		// VirtualSize
-	.long	__efistub_stext_offset	// VirtualAddress
-	.long	_edata - stext		// SizeOfRawData
-	.long	__efistub_stext_offset	// PointerToRawData
+	.long	_end - efi_header_end	// VirtualSize
+	.long	efi_header_end - _head	// VirtualAddress
+	.long	_edata - efi_header_end	// SizeOfRawData
+	.long	efi_header_end - _head	// PointerToRawData
 
 	.long	0		// PointerToRelocations (0 for executables)
 	.long	0		// PointerToLineNumbers (0 for executables)
@@ -199,15 +197,18 @@ section_table:
 	.long	0xe0500020	// Characteristics (section flags)
 
 	/*
-	 * EFI will load stext onwards at the 4k section alignment
+	 * EFI will load .text onwards at the 4k section alignment
 	 * described in the PE/COFF header. To ensure that instruction
 	 * sequences using an adrp and a :lo12: immediate will function
-	 * correctly at this alignment, we must ensure that stext is
+	 * correctly at this alignment, we must ensure that .text is
 	 * placed at a 4k boundary in the Image to begin with.
 	 */
 	.align 12
+efi_header_end:
 #endif
 
+	__INIT
+
 ENTRY(stext)
 	bl	preserve_boot_args
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
@@ -222,12 +223,12 @@ ENTRY(stext)
 	 * the TCR will have been set.
 	 */
 	ldr	x27, 0f				// address to jump to after
 						// MMU has been enabled
 	adr_l	lr, __enable_mmu		// return (PIC) address
 	b	__cpu_setup			// initialise processor
 ENDPROC(stext)
 	.align	3
-0:	.quad	__mmap_switched - (_head - TEXT_OFFSET) + KIMAGE_VADDR
+0:	.quad	__mmap_switched - (_text - TEXT_OFFSET) + KIMAGE_VADDR
 
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
@@ -396,7 +397,7 @@ __create_page_tables:
 	ldr	x5, =KIMAGE_VADDR
 	add	x5, x5, x23			// add KASLR displacement
 	create_pgd_entry x0, x5, x3, x6
-	ldr	w6, kernel_img_size
+	ldr	w6, =kernel_img_size
 	add	x6, x6, x5
 	mov	x3, x24				// phys offset
 	create_block_map x0, x7, x3, x5, x6
@@ -413,9 +414,6 @@ __create_page_tables:
 
 	ret	x28
 ENDPROC(__create_page_tables)
-
-kernel_img_size:
-	.long	_end - (_head - TEXT_OFFSET)
 	.ltorg
 
 /*
diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
index db1bf57948f1..5ff892f40a0a 100644
--- a/arch/arm64/kernel/image.h
+++ b/arch/arm64/kernel/image.h
@@ -71,8 +71,12 @@
 	DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET);	\
 	DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
 
+kernel_img_size = _end - (_text - TEXT_OFFSET);
+
 #ifdef CONFIG_EFI
 
+__efistub_stext_offset = stext - _text;
+
 /*
  * Prevent the symbol aliases below from being emitted into the kallsyms
  * table, by forcing them to be absolute symbols (which are conveniently
-- 
2.5.0


* [PATCH 2/3] arm64: cover the .head.text section in the .text segment mapping
  2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
  2016-03-03 13:09 ` [PATCH 1/3] arm64: move early boot code to the .init segment Ard Biesheuvel
@ 2016-03-03 13:09 ` Ard Biesheuvel
  2016-03-03 13:09 ` [PATCH 3/3] arm64: simplify kernel segment mapping granularity Ard Biesheuvel
  2016-03-07  1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
  3 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

Keeping .head.text out of the .text mapping buys us very little: its actual
payload is only 4 KB, most of which is padding, but the page alignment may
add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional
padding to the uncompressed kernel Image.

Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to
map the adjacent 56 KB of code without the PTE_CONT attribute. Since this
region contains the GIC interrupt handling entry point, among other things,
it is likely to benefit from the reduced TLB pressure that results from
PTE_CONT mappings.

So remove the alignment between the .head.text and .text sections, and use
the [_text, _etext) rather than the [_stext, _etext) interval for mapping
the .text segment.
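For illustration only (not part of the patch), the 56 KB figure can be
reproduced by computing where the first 64 KB PTE_CONT boundary falls above
the misaligned .text start from the "Before" map in the cover letter:

```python
PAGE_SIZE = 4096
CONT_PTES = 16                      # a 4 KB granule contiguous hint spans 16 PTEs
CONT_SIZE = CONT_PTES * PAGE_SIZE   # 64 KB

text_start = 0xffff000008082000     # misaligned .text start ("Before" map)
first_cont = (text_start + CONT_SIZE - 1) & ~(CONT_SIZE - 1)

# Everything below the first 64 KB boundary must be mapped as single pages.
print((first_cont - text_start) // 1024)  # -> 56 (KB mapped without PTE_CONT)
```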

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/vmlinux.lds.S |  1 -
 arch/arm64/mm/mmu.c             | 10 +++++-----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 4c56e7a0621b..7a141c098bbb 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -96,7 +96,6 @@ SECTIONS
 		_text = .;
 		HEAD_TEXT
 	}
-	ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
 	.text : {			/* Real text segment		*/
 		_stext = .;		/* Text and read-only data	*/
 			__exception_text_start = .;
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d2d8b8c2e17f..1d727018e90b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -387,7 +387,7 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
 
 static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
 {
-	unsigned long kernel_start = __pa(_stext);
+	unsigned long kernel_start = __pa(_text);
 	unsigned long kernel_end = __pa(_etext);
 
 	/*
@@ -419,7 +419,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
 				     early_pgtable_alloc);
 
 	/*
-	 * Map the linear alias of the [_stext, _etext) interval as
+	 * Map the linear alias of the [_text, _etext) interval as
 	 * read-only/non-executable. This makes the contents of the
 	 * region accessible to subsystems such as hibernate, but
 	 * protects it from inadvertent modification or execution.
@@ -451,8 +451,8 @@ void mark_rodata_ro(void)
 {
 	unsigned long section_size;
 
-	section_size = (unsigned long)__start_rodata - (unsigned long)_stext;
-	create_mapping_late(__pa(_stext), (unsigned long)_stext,
+	section_size = (unsigned long)__start_rodata - (unsigned long)_text;
+	create_mapping_late(__pa(_text), (unsigned long)_text,
 			    section_size, PAGE_KERNEL_ROX);
 	/*
 	 * mark .rodata as read only. Use _etext rather than __end_rodata to
@@ -501,7 +501,7 @@ static void __init map_kernel(pgd_t *pgd)
 {
 	static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data;
 
-	map_kernel_chunk(pgd, _stext, __start_rodata, PAGE_KERNEL_EXEC, &vmlinux_text);
+	map_kernel_chunk(pgd, _text, __start_rodata, PAGE_KERNEL_EXEC, &vmlinux_text);
 	map_kernel_chunk(pgd, __start_rodata, _etext, PAGE_KERNEL, &vmlinux_rodata);
 	map_kernel_chunk(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC,
 			 &vmlinux_init);
-- 
2.5.0


* [PATCH 3/3] arm64: simplify kernel segment mapping granularity
  2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
  2016-03-03 13:09 ` [PATCH 1/3] arm64: move early boot code to the .init segment Ard Biesheuvel
  2016-03-03 13:09 ` [PATCH 2/3] arm64: cover the .head.text section in the .text segment mapping Ard Biesheuvel
@ 2016-03-03 13:09 ` Ard Biesheuvel
  2016-03-07  1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
  3 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-03 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

The mapping of the kernel consists of four segments, each of which is mapped
with different permission attributes and/or lifetimes. To optimize the TLB
and translation table footprint, we define various opaque constants in the
linker script that resolve to different alignment values depending on the
page size and whether CONFIG_DEBUG_ALIGN_RODATA is set.

Considering that
- a 4 KB granule kernel benefits from a 64 KB segment alignment (due to
  the fact that it allows the use of the contiguous bit),
- the minimum alignment of the .data segment is THREAD_SIZE already, not
  PAGE_SIZE (i.e., we already have padding between _data and the start of
  the .data payload in many cases),
- 2 MB is a suitable alignment value on all granule sizes, either for
  mapping directly (level 2 on 4 KB), or via the contiguous bit (level 3 on
  16 KB and 64 KB),
- anything beyond 2 MB exceeds the minimum alignment mandated by the boot
  protocol, and can only be mapped efficiently if the physical alignment
  happens to be the same,

we can simplify this by standardizing on 64 KB (or 2 MB) explicitly, i.e.,
regardless of granule size, all segments are aligned either to 64 KB, or to
2 MB if CONFIG_DEBUG_ALIGN_RODATA=y.
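The entry counts given in the comments this patch adds follow directly from
dividing the segment alignment by the granule size; as an illustrative check
(not part of the patch):

```python
SZ_64K = 64 * 1024
SZ_2M = 2 * 1024 * 1024

for name, granule in (("4 KB", 4096), ("16 KB", 16384), ("64 KB", 65536)):
    # number of level-3 entries needed to span one segment alignment unit
    print(name, "64K align:", SZ_64K // granule, "2M align:", SZ_2M // granule)

# With a 4 KB granule, 2 MB covers 512 level-3 entries, i.e. exactly one
# level-2 block mapping; with 16 KB and 64 KB granules it covers 128 and 32
# level-3 entries respectively, which the contiguous bit can exploit.
```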

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/vmlinux.lds.S | 25 ++++++++++++--------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7a141c098bbb..2c60d19b038c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -63,14 +63,19 @@ PECOFF_FILE_ALIGNMENT = 0x200;
 #endif
 
 #if defined(CONFIG_DEBUG_ALIGN_RODATA)
-#define ALIGN_DEBUG_RO			. = ALIGN(1<<SECTION_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
-#elif defined(CONFIG_DEBUG_RODATA)
-#define ALIGN_DEBUG_RO			. = ALIGN(1<<PAGE_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
+/*
+ *  4 KB granule:   1 level 2 entry
+ * 16 KB granule: 128 level 3 entries, with contiguous bit
+ * 64 KB granule:  32 level 3 entries, with contiguous bit
+ */
+#define SEGMENT_ALIGN			SZ_2M
 #else
-#define ALIGN_DEBUG_RO
-#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(min);
+/*
+ *  4 KB granule:  16 level 3 entries, with contiguous bit
+ * 16 KB granule:   4 level 3 entries, without contiguous bit
+ * 64 KB granule:   1 level 3 entry
+ */
+#define SEGMENT_ALIGN			SZ_64K
 #endif
 
 SECTIONS
@@ -113,12 +118,12 @@ SECTIONS
 		*(.got)			/* Global offset table		*/
 	}
 
-	ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
+	. = ALIGN(SEGMENT_ALIGN);
 	RO_DATA(PAGE_SIZE)		/* everything from this point to */
 	EXCEPTION_TABLE(8)		/* _etext will be marked RO NX   */
 	NOTES
 
-	ALIGN_DEBUG_RO_MIN(PAGE_SIZE)
+	. = ALIGN(SEGMENT_ALIGN);
 	_etext = .;			/* End of text and rodata section */
 	__init_begin = .;
 
@@ -166,7 +171,7 @@ SECTIONS
 		*(.hash)
 	}
 
-	. = ALIGN(PAGE_SIZE);
+	. = ALIGN(SEGMENT_ALIGN);
 	__init_end = .;
 
 	_data = .;
-- 
2.5.0


* [PATCH 0/3] arm64: simplify and optimize kernel mapping
  2016-03-03 13:09 [PATCH 0/3] arm64: simplify and optimize kernel mapping Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2016-03-03 13:09 ` [PATCH 3/3] arm64: simplify kernel segment mapping granularity Ard Biesheuvel
@ 2016-03-07  1:40 ` Mark Rutland
  2016-03-08  0:38   ` Jeremy Linton
  2016-03-09  5:03   ` Ard Biesheuvel
  3 siblings, 2 replies; 7+ messages in thread
From: Mark Rutland @ 2016-03-07  1:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I like this series, though I have a few minor comments below.

On Thu, Mar 03, 2016 at 02:09:38PM +0100, Ard Biesheuvel wrote:
> This series makes a couple of minor changes that should result in the
> kernel being mapped in a more efficient manner.
> 
> First of all, it merges the .head.text with the .text section (patch #2)
> after moving everything except the kernel and EFI header into the __init
> section (patch #1)

Face-to-face, you suggested it might be possible to move .init before .text, so
we could place the EFI header in there too (and keep .text aligned while making
it smaller). Is there some reason we missed that means we cannot do this?

> Then, it standardizes the segment alignment to 64 KB for all page sizes.
> (patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
> patch applied), we lose 80 KB in total to padding, but the resulting
> mappings do look somewhat better.

I suspect some people might want a minimal alignment option for tinification
purposes, but this sounds fine to me as a default. I also think we can wait
until someone asks.

As a general thing, currently we use "chunk" instead of "segment" in the mm
code. We only used "chunk" so as to not overload "section". For consistency it
would be nice to either stick with "chunk" or convert existing uses to
"segment". I much prefer the latter!

> Before:
> 
> 0xffff000008082000-0xffff000008090000    56K  ro x  SHD AF         UXN MEM
> 0xffff000008090000-0xffff000008200000  1472K  ro x  SHD AF CON     UXN MEM
> 0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
> 0xffff000008600000-0xffff000008660000   384K  ro x  SHD AF CON     UXN MEM
> 0xffff000008660000-0xffff00000866c000    48K  ro x  SHD AF         UXN MEM
> 0xffff00000866c000-0xffff000008670000    16K  ro NX SHD AF         UXN MEM
> 0xffff000008670000-0xffff000008900000  2624K  ro NX SHD AF CON     UXN MEM
> 0xffff000008900000-0xffff000008909000    36K  ro NX SHD AF         UXN MEM
> 0xffff000008c39000-0xffff000008c40000    28K  RW NX SHD AF         UXN MEM
> 0xffff000008c40000-0xffff000008d50000  1088K  RW NX SHD AF CON     UXN MEM
> 0xffff000008d50000-0xffff000008d57000    28K  RW NX SHD AF         UXN MEM
> 
> After:
> 
> 0xffff000008080000-0xffff000008200000  1536K  ro x  SHD AF CON     UXN MEM
> 0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
> 0xffff000008600000-0xffff000008670000   448K  ro x  SHD AF CON     UXN MEM
> 0xffff000008670000-0xffff000008910000  2688K  ro NX SHD AF CON     UXN MEM
> 0xffff000008c50000-0xffff000008d60000  1088K  RW NX SHD AF CON     UXN MEM
> 0xffff000008d60000-0xffff000008d6b000    44K  RW NX SHD AF         UXN MEM
> 
> I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
> when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
> (only the Kconfig change is not included here)

Jeremy, do you have any thoughts on this series?

Thanks,
Mark.

> Ard Biesheuvel (3):
>   arm64: move early boot code to the .init segment
>   arm64: cover the .head.text section in the .text segment mapping
>   arm64: simplify kernel segment mapping granularity
> 
>  arch/arm64/kernel/efi-entry.S   |  2 +-
>  arch/arm64/kernel/head.S        | 32 +++++++++-----------
>  arch/arm64/kernel/image.h       |  4 +++
>  arch/arm64/kernel/vmlinux.lds.S | 26 +++++++++-------
>  arch/arm64/mm/mmu.c             | 10 +++---
>  5 files changed, 40 insertions(+), 34 deletions(-)
> 
> -- 
> 2.5.0
> 


* [PATCH 0/3] arm64: simplify and optimize kernel mapping
  2016-03-07  1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
@ 2016-03-08  0:38   ` Jeremy Linton
  2016-03-09  5:03   ` Ard Biesheuvel
  1 sibling, 0 replies; 7+ messages in thread
From: Jeremy Linton @ 2016-03-08  0:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/06/2016 07:40 PM, Mark Rutland wrote:
> Hi,
>
> I like this series, though I have a few minor comments below.
>
> On Thu, Mar 03, 2016 at 02:09:38PM +0100, Ard Biesheuvel wrote:
>> This series makes a couple of minor changes that should result in the
>> kernel being mapped in a more efficient manner.
>>
>> First of all, it merges the .head.text with the .text section (patch #2)
>> after moving everything except the kernel and EFI header into the __init
>> section (patch #1)
>
> Face-to-face, you suggested it might be possible to move .init before .text, so
> we could place the EFI header in there too (and keep .text aligned while making
> it smaller). Is there some reason we missed that means we cannot do this?
>
>> Then, it standardizes the segment alignment to 64 KB for all page sizes.
>> (patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
>> patch applied), we lose 80 KB in total to padding, but the resulting
>> mappings do look somewhat better.
>
> I suspect some people might want a minimal alignment option for tinification
> purposes, but this sounds fine to me as a default. I also think we can wait
> until someone asks.
>
> As a general thing, currently we use "chunk" instead of "segment" in the mm
> code. We only used "chunk" so as to not overload "section". For consistency it
> would be nice to either stick with "chunk" or convert existing uses to
> "segment". I much prefer the latter!
>
>> Before:
>>
>> 0xffff000008082000-0xffff000008090000    56K  ro x  SHD AF         UXN MEM
>> 0xffff000008090000-0xffff000008200000  1472K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
>> 0xffff000008600000-0xffff000008660000   384K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008660000-0xffff00000866c000    48K  ro x  SHD AF         UXN MEM
>> 0xffff00000866c000-0xffff000008670000    16K  ro NX SHD AF         UXN MEM
>> 0xffff000008670000-0xffff000008900000  2624K  ro NX SHD AF CON     UXN MEM
>> 0xffff000008900000-0xffff000008909000    36K  ro NX SHD AF         UXN MEM
>> 0xffff000008c39000-0xffff000008c40000    28K  RW NX SHD AF         UXN MEM
>> 0xffff000008c40000-0xffff000008d50000  1088K  RW NX SHD AF CON     UXN MEM
>> 0xffff000008d50000-0xffff000008d57000    28K  RW NX SHD AF         UXN MEM
>>
>> After:
>>
>> 0xffff000008080000-0xffff000008200000  1536K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
>> 0xffff000008600000-0xffff000008670000   448K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008670000-0xffff000008910000  2688K  ro NX SHD AF CON     UXN MEM
>> 0xffff000008c50000-0xffff000008d60000  1088K  RW NX SHD AF CON     UXN MEM
>> 0xffff000008d60000-0xffff000008d6b000    44K  RW NX SHD AF         UXN MEM
>>
>> I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
>> when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
>> (only the Kconfig change is not included here)
>
> Jeremy, do you have any thoughts on this series?

	I was waiting to see if anyone questioned padding the minimum kernel 
alignment. But other than that, it looks good.

	I've been meaning to test it at 64k pages to see how much rearranging 
everything helps. I guess I will be rolling another CONT set to address 
Will's reluctance to have the extra TLB flushes (and probably rework the 
block break case as well) unless someone decides they want this now. I 
will make sure that these two patch sets merge cleanly when I do that.

	Thanks,


* [PATCH 0/3] arm64: simplify and optimize kernel mapping
  2016-03-07  1:40 ` [PATCH 0/3] arm64: simplify and optimize kernel mapping Mark Rutland
  2016-03-08  0:38   ` Jeremy Linton
@ 2016-03-09  5:03   ` Ard Biesheuvel
  1 sibling, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-03-09  5:03 UTC (permalink / raw)
  To: linux-arm-kernel

On 7 March 2016 at 08:40, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> I like this series, though I have a few minor comments below.
>
> On Thu, Mar 03, 2016 at 02:09:38PM +0100, Ard Biesheuvel wrote:
>> This series makes a couple of minor changes that should result in the
>> kernel being mapped in a more efficient manner.
>>
>> First of all, it merges the .head.text with the .text section (patch #2)
>> after moving everything except the kernel and EFI header into the __init
>> section (patch #1)
>
> Face-to-face, you suggested it might be possible to move .init before .text, so
> we could place the EFI header in there too (and keep .text aligned while making
> it smaller). Is there some reason we missed that means we cannot do this?
>

Well, I tried implementing this, and it breaks the PIE linking. That
itself is probably a binutils problem, and I will try to follow up on
that with the toolchain group. But in the meantime, it means we cannot
reorder .text and .init.

>> Then, it standardizes the segment alignment to 64 KB for all page sizes.
>> (patch #3). In the example below (4 KB granule, with Jeremy's PTE_CONT
>> patch applied), we lose 80 KB in total to padding, but the resulting
>> mappings do look somewhat better.
>
> I suspect some people might want a minimal alignment option for tinification
> purposes, but this sounds fine to me as a default. I also think we can wait
> until someone asks.
>
> As a general thing, currently we use "chunk" instead of "segment" in the mm
> code. We only used "chunk" so as to not overload "section". For consistency it
> would be nice to either stick with "chunk" or convert existing uses to
> "segment". I much prefer the latter!
>

OK

>> Before:
>>
>> 0xffff000008082000-0xffff000008090000    56K  ro x  SHD AF         UXN MEM
>> 0xffff000008090000-0xffff000008200000  1472K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
>> 0xffff000008600000-0xffff000008660000   384K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008660000-0xffff00000866c000    48K  ro x  SHD AF         UXN MEM
>> 0xffff00000866c000-0xffff000008670000    16K  ro NX SHD AF         UXN MEM
>> 0xffff000008670000-0xffff000008900000  2624K  ro NX SHD AF CON     UXN MEM
>> 0xffff000008900000-0xffff000008909000    36K  ro NX SHD AF         UXN MEM
>> 0xffff000008c39000-0xffff000008c40000    28K  RW NX SHD AF         UXN MEM
>> 0xffff000008c40000-0xffff000008d50000  1088K  RW NX SHD AF CON     UXN MEM
>> 0xffff000008d50000-0xffff000008d57000    28K  RW NX SHD AF         UXN MEM
>>
>> After:
>>
>> 0xffff000008080000-0xffff000008200000  1536K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008200000-0xffff000008600000     4M  ro x  SHD AF     BLK UXN MEM
>> 0xffff000008600000-0xffff000008670000   448K  ro x  SHD AF CON     UXN MEM
>> 0xffff000008670000-0xffff000008910000  2688K  ro NX SHD AF CON     UXN MEM
>> 0xffff000008c50000-0xffff000008d60000  1088K  RW NX SHD AF CON     UXN MEM
>> 0xffff000008d60000-0xffff000008d6b000    44K  RW NX SHD AF         UXN MEM
>>
>> I am aware that this clashes with Jeremy's patch to allow CONT_SIZE alignment
>> when CONFIG_DEBUG_ALIGN_RODATA=y, but the net effect of patch #3 is the same
>> (only the Kconfig change is not included here)
>
> Jeremy, do you have any thoughts on this series?
>
> Thanks,
> Mark.
>
>> Ard Biesheuvel (3):
>>   arm64: move early boot code to the .init segment
>>   arm64: cover the .head.text section in the .text segment mapping
>>   arm64: simplify kernel segment mapping granularity
>>
>>  arch/arm64/kernel/efi-entry.S   |  2 +-
>>  arch/arm64/kernel/head.S        | 32 +++++++++-----------
>>  arch/arm64/kernel/image.h       |  4 +++
>>  arch/arm64/kernel/vmlinux.lds.S | 26 +++++++++-------
>>  arch/arm64/mm/mmu.c             | 10 +++---
>>  5 files changed, 40 insertions(+), 34 deletions(-)
>>
>> --
>> 2.5.0
>>

