linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
@ 2017-05-23  5:02 Pratyush Anand
  2017-05-23  5:02 ` [PATCH v3 1/2] kexec: arm64: create identity page table to be used " Pratyush Anand
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Pratyush Anand @ 2017-05-23  5:02 UTC (permalink / raw)
  To: linux-arm-kernel

It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
is around 13MB and initramfs is around 30MB. It takes more than 20 second
even when we have -O2 optimization enabled. However, if dcache is enabled
during purgatory execution then, it takes just a second in SHA
verification.

Therefore, these patches adds support for dcache enabling facility during
purgatory execution.

Changes since V2:
	- typo in name create_block_entry() fixed
	- moved shared definition between kexec and purgatory to a common
	  file.
	- MT_NORMAL as 0
	- PTRS_PER_*  and other calculation around pmd_index() etc have
	  been defined like in kernel
	- page table size has been kept for 5 tables, with comment about
	  expansion
	- Pass only virt address to create_block_entry() now
	- Addresses in kexec and purgatory domain have been defined as
	  host_addr_t and guest_addr_t respectively for clear code reading


Changes since V1:
	- Moved page table creation logic from purgatory to kexec code.
	- Only 4K page table is supported, with 48 bit VA and 2M block size
	- if platform supports a 4K page size, then D-cache is always
	  enabled now.
 

Pratyush Anand (2):
  kexec: arm64: create identity page table to be used in purgatory
  arm64: enable d-cache support during purgatory sha verification

 kexec/arch/arm64/kexec-arm64.c         | 119 +++++++++++++++
 kexec/arch/arm64/kexec-mmu.h           |  76 ++++++++++
 purgatory/arch/arm64/Makefile          |   1 +
 purgatory/arch/arm64/cache.S           | 264 +++++++++++++++++++++++++++++++++
 purgatory/arch/arm64/purgatory-arm64.c |   5 +
 5 files changed, 465 insertions(+)
 create mode 100644 kexec/arch/arm64/kexec-mmu.h
 create mode 100644 purgatory/arch/arm64/cache.S

-- 
2.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 1/2] kexec: arm64: create identity page table to be used in purgatory
  2017-05-23  5:02 [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory Pratyush Anand
@ 2017-05-23  5:02 ` Pratyush Anand
  2017-06-02  8:24   ` James Morse
  2017-05-23  5:02 ` [PATCH v3 2/2] arm64: enable d-cache support during purgatory sha verification Pratyush Anand
  2017-06-02  8:23 ` [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory James Morse
  2 siblings, 1 reply; 20+ messages in thread
From: Pratyush Anand @ 2017-05-23  5:02 UTC (permalink / raw)
  To: linux-arm-kernel

Purgatory sha verification is very slow when D-cache is not enabled there.
We need to enable MMU as well to enable D-Cache.Therefore,we need to
have an identity mapped page table in purgatory.

Since debugging is very difficult in purgatory therefore we prefer to do as
much work as possible in kexec.

This patch prepares page table for purgatory in advance. We support only 4K
page table,because it will be available on most of the platform. This page
table is passed to the purgatory as a new segment.

VA bit is fixed as 48, page table level is 3 where 3rd level page table
contains 2M block entries.

Signed-off-by: Pratyush Anand <panand@redhat.com>
---
 kexec/arch/arm64/kexec-arm64.c | 119 +++++++++++++++++++++++++++++++++++++++++
 kexec/arch/arm64/kexec-mmu.h   |  58 ++++++++++++++++++++
 2 files changed, 177 insertions(+)
 create mode 100644 kexec/arch/arm64/kexec-mmu.h

diff --git a/kexec/arch/arm64/kexec-arm64.c b/kexec/arch/arm64/kexec-arm64.c
index 62f37585b788..af5dab331e97 100644
--- a/kexec/arch/arm64/kexec-arm64.c
+++ b/kexec/arch/arm64/kexec-arm64.c
@@ -17,6 +17,7 @@
 
 #include "kexec.h"
 #include "kexec-arm64.h"
+#include "kexec-mmu.h"
 #include "crashdump.h"
 #include "crashdump-arm64.h"
 #include "dt-ops.h"
@@ -33,6 +34,10 @@
 #define PROP_ELFCOREHDR "linux,elfcorehdr"
 #define PROP_USABLE_MEM_RANGE "linux,usable-memory-range"
 
+typedef unsigned long guest_addr_t;
+typedef unsigned long host_addr_t;
+static int next_tbl_cnt = 1;
+
 /* Global varables the core kexec routines expect. */
 
 unsigned char reuse_initrd;
@@ -519,6 +524,118 @@ unsigned long arm64_locate_kernel_segment(struct kexec_info *info)
 	return hole;
 }
 
+static host_addr_t *create_table_entry(host_addr_t *pgtbl_buf,
+		guest_addr_t pgtbl_mem, host_addr_t *tbl,
+		guest_addr_t virt, int shift,
+		unsigned long ptrs)
+{
+	unsigned long index, desc, offset;
+
+	index = (virt >> shift) & (ptrs - 1);
+	/* check if we have allocated a table already for this index */
+	if (tbl[index]) {
+		/*
+		 * table index will have entry as per purgatory (guest)page
+		 * table memory. Find out corresponding buffer address of
+		 * table (host).
+		 */
+		desc = tbl[index] & ~3UL;
+		offset = desc - pgtbl_mem;
+		return &pgtbl_buf[offset >> 3];
+	}
+
+	if (next_tbl_cnt >= MAX_PGTBLE_SZ / PAGE_SIZE)
+		die("%s:No more memory for page table\n", __func__);
+	/*
+	 * Always write page table content as per page table memory
+	 * allocated for purgatory(guest) area, but return corresponding
+	 * buffer area allocated in kexec (host).
+	 */
+
+	tbl[index] = (pgtbl_mem + PAGE_SIZE * next_tbl_cnt) | PMD_TYPE_TABLE;
+
+	return &pgtbl_buf[(next_tbl_cnt++ * PAGE_SIZE) >> 3];
+}
+
+static void create_block_entry(host_addr_t *tbl, unsigned long flags,
+				guest_addr_t virt)
+{
+	unsigned long index;
+	unsigned long desc;
+
+	index = pmd_index(virt);
+	desc = virt & PMD_MASK;
+	desc |= flags;
+	tbl[index] = desc;
+}
+
+static void create_identity_entry(host_addr_t *pgtbl_buf,
+		guest_addr_t pgtbl_mem, guest_addr_t virt,
+		unsigned long flags)
+{
+	host_addr_t *tbl = pgtbl_buf;
+
+	tbl = create_table_entry(pgtbl_buf, pgtbl_mem, tbl, virt,
+			EXTRA_SHIFT, EXTRA_PTRS);
+	tbl = create_table_entry(pgtbl_buf, pgtbl_mem, tbl, virt,
+			IDENTITY_SHIFT, PTRS_PER_PTE);
+	create_block_entry(tbl, flags, virt);
+}
+
+/**
+ * arm64_create_pgtbl_segment - Create page table segments to be used by
+ * purgatory. Page table will have entries to access memory area of all
+ * those segments which becomes part of sha verification in purgatory.
+ * Additionally, we also create page table for purgatory segment as well.
+ */
+
+static int arm64_create_pgtbl_segment(struct kexec_info *info,
+		unsigned long hole_min, unsigned long hole_max)
+{
+	host_addr_t *pgtbl_buf;
+	guest_addr_t pgtbl_mem, mstart, mend;
+	int i;
+	unsigned long purgatory_base, purgatory_len;
+
+	pgtbl_buf = xmalloc(MAX_PGTBLE_SZ);
+	memset(pgtbl_buf, 0, MAX_PGTBLE_SZ);
+	pgtbl_mem = add_buffer_phys_virt(info, pgtbl_buf, MAX_PGTBLE_SZ,
+			MAX_PGTBLE_SZ, PAGE_SIZE, hole_min, hole_max, 1, 0);
+	for (i = 0; i < info->nr_segments; i++) {
+		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
+			purgatory_base = (unsigned long)info->segment[i].mem;
+			purgatory_len = info->segment[i].memsz;
+		}
+		mstart = (unsigned long)info->segment[i].mem;
+		mend = mstart + info->segment[i].memsz;
+		mstart &= ~(SECTION_SIZE - 1);
+		while (mstart < mend) {
+			create_identity_entry(pgtbl_buf, pgtbl_mem,
+					mstart, MMU_FLAGS_NORMAL);
+			mstart += SECTION_SIZE;
+		}
+	}
+
+	/* we will need pgtble_base in purgatory for enabling d-cache */
+	elf_rel_set_symbol(&info->rhdr, "pgtble_base", &pgtbl_mem,
+		sizeof(pgtbl_mem));
+	/*
+	 * We need to disable d-cache before we exit from purgatory.
+	 * Since, only d-cache flush by VAs is recommended, therefore we
+	 * will also need memory location of all those area which will be
+	 * accessed in purgatory with enabled d-cache. sha256_regions
+	 * already have start and length for all the segments except
+	 * purgatory. Therefore, we will need to pass start and length of
+	 * purgatory additionally.
+	 */
+	elf_rel_set_symbol(&info->rhdr, "purgatory_base", &purgatory_base,
+		sizeof(purgatory_base));
+	elf_rel_set_symbol(&info->rhdr, "purgatory_len", &purgatory_len,
+		sizeof(purgatory_len));
+
+	return 0;
+}
+
 /**
  * arm64_load_other_segments - Prepare the dtb, initrd and purgatory segments.
  */
@@ -630,6 +747,8 @@ int arm64_load_other_segments(struct kexec_info *info,
 	elf_rel_set_symbol(&info->rhdr, "arm64_dtb_addr", &dtb_base,
 		sizeof(dtb_base));
 
+	arm64_create_pgtbl_segment(info, hole_min, hole_max);
+
 	return 0;
 }
 
diff --git a/kexec/arch/arm64/kexec-mmu.h b/kexec/arch/arm64/kexec-mmu.h
new file mode 100644
index 000000000000..55354b5e3002
--- /dev/null
+++ b/kexec/arch/arm64/kexec-mmu.h
@@ -0,0 +1,58 @@
+#if !defined(KEXEC_MMU_H)
+#define KEXEC_MMU_H
+/*
+ * kexec creates identity page table to be used in purgatory so that
+ * dcache verification becomes faster.
+ *
+ * These are the definitions to be used by page table creation routine.
+ *
+ * Only 4K page table, 3 level, 2M block mapping, 48bit VA is supported
+ */
+#define PAGE_SHIFT		12
+#define PGTABLE_LEVELS		3
+#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n) ((PAGE_SHIFT - 3) * (4 - (n)) + 3)
+#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - PGTABLE_LEVELS)
+#define EXTRA_SHIFT		(PGDIR_SHIFT + PAGE_SHIFT - 3)
+#define EXTRA_PTRS		(1 << (48 - EXTRA_SHIFT))
+#define PUD_SHIFT		PGDIR_SHIFT
+#define PMD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
+#define PMD_SIZE		(1UL << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+#define IDENTITY_SHIFT		PUD_SHIFT
+#define PTRS_PER_PTE		(1 << (PAGE_SHIFT - 3))
+#define PTRS_PER_PMD		PTRS_PER_PTE
+#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
+#define PMD_TYPE_TABLE		(3UL << 0)
+#define PMD_TYPE_SECT		(1UL << 0)
+#define PMD_SECT_AF		(1UL << 10)
+#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
+#define MT_NORMAL		0
+#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
+#define MMU_FLAGS_NORMAL	(PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL)
+#define SECTION_SHIFT		PMD_SHIFT
+#define SECTION_SIZE		(1UL << SECTION_SHIFT)
+#define PAGE_SIZE		(1 << PAGE_SHIFT)
+/*
+ * Since we are using 3 level of page tables, therefore minimum number of
+ * table will be 3. Each entry in level 3 page table can map 2MB memory
+ * area. Thus a level 3 page table indexed by bit 29:21 can map a total of
+ * 1G memory area. Therefore, if any segment crosses 1G boundary, then we
+ * will need one more level 3 table. Similarly, level 2 page table indexed
+ * by bit 38:30 can map a total of 512G memory area. If segment addresses
+ * are more than 512G apart then we will need two more table for each such
+ * block.
+ * Lets consider 2G as the maximum size of crashkernel. This 2G memory
+ * location might not be allocated at 1G aligned boundary, so in that case
+ * we need to have 5 table size resrved to map any location of the crash
+ * kernel region.
+ *
+ * If we will ever wish to support uart debugging in purgatory then that
+ * might cross the boundary and therefore additional 2 more table space. In
+ * that case, we will need a total of 7 table space.
+ *
+ * As of now keep it fixed@5. Increase it if any platform either
+ * supports uart or more than 2G of crash kernel size.
+ */
+#define MAX_PGTBLE_SZ	(5 * PAGE_SIZE)
+
+#endif
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 2/2] arm64: enable d-cache support during purgatory sha verification
  2017-05-23  5:02 [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory Pratyush Anand
  2017-05-23  5:02 ` [PATCH v3 1/2] kexec: arm64: create identity page table to be used " Pratyush Anand
@ 2017-05-23  5:02 ` Pratyush Anand
  2017-06-02  8:23 ` [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory James Morse
  2 siblings, 0 replies; 20+ messages in thread
From: Pratyush Anand @ 2017-05-23  5:02 UTC (permalink / raw)
  To: linux-arm-kernel

If a platform supports 4K page table then enable D-cache in purgatory
before SHA verification. Disable it before switching to kernel.

Signed-off-by: Pratyush Anand <panand@redhat.com>
---
 kexec/arch/arm64/kexec-mmu.h           |  20 ++-
 purgatory/arch/arm64/Makefile          |   1 +
 purgatory/arch/arm64/cache.S           | 264 +++++++++++++++++++++++++++++++++
 purgatory/arch/arm64/purgatory-arm64.c |   5 +
 4 files changed, 289 insertions(+), 1 deletion(-)
 create mode 100644 purgatory/arch/arm64/cache.S

diff --git a/kexec/arch/arm64/kexec-mmu.h b/kexec/arch/arm64/kexec-mmu.h
index 55354b5e3002..6f0a8e90a205 100644
--- a/kexec/arch/arm64/kexec-mmu.h
+++ b/kexec/arch/arm64/kexec-mmu.h
@@ -1,5 +1,24 @@
 #if !defined(KEXEC_MMU_H)
 #define KEXEC_MMU_H
+
+#define SCTLR_ELx_I		(1 << 12)
+#define SCTLR_ELx_C		(1 << 2)
+#define SCTLR_ELx_M		(1 << 0)
+#define SCTLR_ELx_FLAGS 	(SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I)
+#define TCR_SHARED_NONE		(0 << 12)
+#define TCR_ORGN_WBWA		(1 << 10)
+#define TCR_IRGN_WBWA		(1 << 8)
+#define TCR_T0SZ_48		16
+#define TCR_TG0_4K		(0 << 14)
+#define TCR_FLAGS 		(TCR_SHARED_NONE | TCR_ORGN_WBWA |\
+				TCR_IRGN_WBWA | TCR_T0SZ_48 | TCR_TG0_4K)
+#define TCR_IPS_EL1_SHIFT	32
+#define TCR_IPS_EL2_SHIFT	16
+#define ID_AA64MMFR0_TGRAN4_SHIFT	28
+#define ID_AA64MMFR0_PARANGE_MASK	0xF
+#define MT_NORMAL		0
+#define MEMORY_ATTRIBUTES	(0xFF << (MT_NORMAL*8))
+
 /*
  * kexec creates identity page table to be used in purgatory so that
  * dcache verification becomes faster.
@@ -26,7 +45,6 @@
 #define PMD_TYPE_SECT		(1UL << 0)
 #define PMD_SECT_AF		(1UL << 10)
 #define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
-#define MT_NORMAL		0
 #define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
 #define MMU_FLAGS_NORMAL	(PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL)
 #define SECTION_SHIFT		PMD_SHIFT
diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abeab17b2..db28a0de6891 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -11,6 +11,7 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 
 arm64_PURGATORY_SRCS += \
 	purgatory/arch/arm64/entry.S \
+	purgatory/arch/arm64/cache.S \
 	purgatory/arch/arm64/purgatory-arm64.c
 
 dist += \
diff --git a/purgatory/arch/arm64/cache.S b/purgatory/arch/arm64/cache.S
new file mode 100644
index 000000000000..bb5e08397ef9
--- /dev/null
+++ b/purgatory/arch/arm64/cache.S
@@ -0,0 +1,264 @@
+/*
+ * Some of the routines have been copied from Linux Kernel, therefore
+ * copying the license as well.
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 Pratyush Anand <panand@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "../../../kexec/arch/arm64/kexec-mmu.h"
+/*
+ * 	dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ *	flush_dcache_range(start, end)
+ *	- x0 - start	- start address of region
+ *	- x1 - end	- end address of region
+ *
+ */
+flush_dcache_range:
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x0, x0, x3
+1:	dc	civac, x0			// clean & invalidate D line / unified line
+	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+
+/*
+ *	invalidate_tlbs_el1()
+ */
+invalidate_tlbs_el1:
+	dsb	nshst
+	tlbi	vmalle1
+	dsb	nsh
+	isb
+	ret
+
+/*
+ *	invalidate_tlbs_el2()
+ */
+invalidate_tlbs_el2:
+	dsb	nshst
+	tlbi	alle2
+	dsb	nsh
+	isb
+	ret
+/*
+ *	is_4k_page_not_supported - return nonzero if 4k page is not supported
+ */
+is_4k_page_not_supported:
+	mrs	x0, ID_AA64MMFR0_EL1
+	and	x0, x0, #(0xF << ID_AA64MMFR0_TGRAN4_SHIFT)
+	ret
+
+/*
+ *	get_ips_bits - return supported IPS bits
+ */
+get_ips_bits:
+	mrs	x0, ID_AA64MMFR0_EL1
+	and	x0, x0, #ID_AA64MMFR0_PARANGE_MASK
+	ret
+
+/*
+ * 	get_current_el - Get information about current exception level
+ */
+get_current_el:
+	mrs 	x0, CurrentEL
+	lsr	x0, x0, #2
+	ret
+
+/*
+ * 	invalidate_icache - Invalidate I-cache
+ */
+invalidate_icache:
+	ic	iallu
+	dsb	nsh
+	isb
+	ret
+
+/*
+ * 	set_mair_tcr_ttbr_sctlr_el1(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
+ * 	x0 - page_table - Page Table Base
+ * 	x1 - tcr_flags - TCR Flags to be set
+ */
+set_mair_tcr_ttbr_sctlr_el1:
+	ldr	x2, =MEMORY_ATTRIBUTES
+	msr	mair_el1, x2
+	msr	tcr_el1, x1
+	msr	ttbr0_el1, x0
+	isb
+	mrs	x0, sctlr_el1
+	ldr	x3, =SCTLR_ELx_FLAGS
+	orr	x0, x0, x3
+	msr	sctlr_el1, x0
+	isb
+	ret
+
+/*
+ * 	set_mair_tcr_ttbr_sctlr_el2(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
+ * 	x0 - page_table - Page Table Base
+ * 	x1 - tcr_flags - TCR Flags to be set
+ */
+set_mair_tcr_ttbr_sctlr_el2:
+	ldr	x2, =MEMORY_ATTRIBUTES
+	msr	mair_el2, x2
+	msr	tcr_el2, x1
+	msr	ttbr0_el2, x0
+	isb
+	mrs	x0, sctlr_el2
+	ldr	x3, =SCTLR_ELx_FLAGS
+	orr	x0, x0, x3
+	msr	sctlr_el2, x0
+	isb
+	ret
+
+/*
+ * reset_sctlr_el1 - disables cache and mmu
+ */
+reset_sctlr_el1:
+	mrs	x0, sctlr_el1
+	bic	x0, x0, #SCTLR_ELx_C
+	bic	x0, x0, #SCTLR_ELx_M
+	msr	sctlr_el1, x0
+	isb
+	ret
+
+/*
+ * reset_sctlr_el2 - disables cache and mmu
+ */
+reset_sctlr_el2:
+	mrs	x0, sctlr_el2
+	bic	x0, x0, #SCTLR_ELx_C
+	bic	x0, x0, #SCTLR_ELx_M
+	msr	sctlr_el2, x0
+	isb
+	ret
+
+.globl enable_dcache
+/*
+ * x6 - pgtble_base
+ * x7 - tcr_flags
+ * x8 - current_el
+ */
+enable_dcache:
+	stp	x29, x30, [sp,#-16]!
+	stp	x6, x7, [sp,#-16]!
+	stp	x8, x9, [sp,#-16]!
+	bl	is_4k_page_not_supported
+	cmp	x0, #0
+	b.ne	1f
+	ldr	x6, pgtble_base
+	ldr	x7, =TCR_FLAGS
+	bl	get_current_el
+	mov	x8, x0
+	cmp	x8, #2
+	b.ne	2f
+	bl	invalidate_tlbs_el2
+	bl	get_ips_bits
+	lsl	x1, x0, #TCR_IPS_EL2_SHIFT
+	orr	x1, x1, x7
+	mov	x0, x6
+	bl	set_mair_tcr_ttbr_sctlr_el2
+	b	1f
+2:
+	cmp	x8, #1
+	b.ne	1f
+	bl	invalidate_tlbs_el1
+	bl	get_ips_bits
+	lsl	x1, x0, #TCR_IPS_EL1_SHIFT
+	orr	x1, x1, x7
+	mov	x0, x6
+	bl	set_mair_tcr_ttbr_sctlr_el1
+1:
+	ldp	x8, x9, [sp],#16
+	ldp	x6, x7, [sp],#16
+	ldp	x29, x30, [sp],#16
+	ret
+
+.extern sha256_regions
+.globl disable_dcache
+/*
+ * x6 - pgtble_base
+ * x7 - current_el
+ */
+disable_dcache:
+	stp	x29, x30, [sp,#-16]!
+	stp	x6, x7, [sp,#-16]!
+	bl	is_4k_page_not_supported
+	cmp	x0, #0
+	b.ne	1f
+	ldr	x6, pgtble_base
+	bl	get_current_el
+	mov	x7, x0
+	cmp	x7, #2
+	b.ne	2f
+	bl	reset_sctlr_el2
+	b	3f
+2:
+	cmp	x7, #1
+	b.ne	1f
+	bl	reset_sctlr_el1
+3:
+	/*
+	 * we can only branch to function which does not use stack, until
+	 * all memories are flushed.
+	 */
+	bl	invalidate_icache
+	/* flush d cache for purgatory region */
+	ldr	x0, purgatory_base
+	ldr	x1, purgatory_len
+	add	x1, x1, x0
+	bl	flush_dcache_range
+	/* flush d cache for rest of the regions */
+	ldr	x6, =sha256_regions
+4:
+	ldp	x0, x1, [x6],#16
+	cmp	x1, #0
+	b.eq	1f
+	add	x1, x1, x0
+	bl	flush_dcache_range
+	b	4b
+1:
+	ldp	x6, x7, [sp],#16
+	ldp	x29, x30, [sp],#16
+	ret
+
+.align 3
+
+.globl pgtble_base
+pgtble_base:
+	.quad	0
+	.size	pgtble_base, .-pgtble_base
+
+.globl purgatory_base
+purgatory_base:
+	.quad	0
+	.size	purgatory_base, .-purgatory_base
+
+.globl purgatory_len
+purgatory_len:
+	.quad	0
+	.size	purgatory_len, .-purgatory_len
diff --git a/purgatory/arch/arm64/purgatory-arm64.c b/purgatory/arch/arm64/purgatory-arm64.c
index fe50fcf8ebc3..638fb11d9843 100644
--- a/purgatory/arch/arm64/purgatory-arm64.c
+++ b/purgatory/arch/arm64/purgatory-arm64.c
@@ -5,6 +5,9 @@
 #include <stdint.h>
 #include <purgatory.h>
 
+void enable_dcache(void);
+void disable_dcache(void);
+
 void putchar(int ch)
 {
 	/* Nothing for now */
@@ -12,8 +15,10 @@ void putchar(int ch)
 
 void post_verification_setup_arch(void)
 {
+	disable_dcache();
 }
 
 void setup_arch(void)
 {
+	enable_dcache();
 }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-05-23  5:02 [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory Pratyush Anand
  2017-05-23  5:02 ` [PATCH v3 1/2] kexec: arm64: create identity page table to be used " Pratyush Anand
  2017-05-23  5:02 ` [PATCH v3 2/2] arm64: enable d-cache support during purgatory sha verification Pratyush Anand
@ 2017-06-02  8:23 ` James Morse
  2017-06-02  9:55   ` Ard Biesheuvel
  2017-06-02 14:42   ` Pratyush Anand
  2 siblings, 2 replies; 20+ messages in thread
From: James Morse @ 2017-06-02  8:23 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:
> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
> is around 13MB and initramfs is around 30MB. It takes more than 20 second
> even when we have -O2 optimization enabled. However, if dcache is enabled
> during purgatory execution then, it takes just a second in SHA
> verification.
> 
> Therefore, these patches adds support for dcache enabling facility during
> purgatory execution.

I'm still not convinced we need this. Moving the SHA verification to happen
before the dcache+mmu are disabled would also solve the delay problem, and we
can print an error message or fail the syscall.

For kexec we don't expect memory corruption, what are we testing for?
I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
can't accidentally write over it.

(we discussed all this last time, but it fizzled-out. If you and the
 kexec-tools maintainer think its necessary, fine by me!)

I have some comments on making this code easier to maintain..


Thanks,

James

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 1/2] kexec: arm64: create identity page table to be used in purgatory
  2017-05-23  5:02 ` [PATCH v3 1/2] kexec: arm64: create identity page table to be used " Pratyush Anand
@ 2017-06-02  8:24   ` James Morse
  2017-06-02 14:29     ` Pratyush Anand
  0 siblings, 1 reply; 20+ messages in thread
From: James Morse @ 2017-06-02  8:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

On 23/05/17 06:02, Pratyush Anand wrote:
> Purgatory sha verification is very slow when D-cache is not enabled there.
> We need to enable MMU as well to enable D-Cache.Therefore,we need to
> have an identity mapped page table in purgatory.
> 
> Since debugging is very difficult in purgatory therefore we prefer to do as
> much work as possible in kexec.
> 
> This patch prepares page table for purgatory in advance. We support only 4K
> page table,because it will be available on most of the platform. This page
> table is passed to the purgatory as a new segment.
> 
> VA bit is fixed as 48, page table level is 3 where 3rd level page table
> contains 2M block entries.

This had me confused: You actually have 4 levels of page tables, but the last
level is never used, because the third level always contains block entries.

If you want to use the kernel's macros for shifts and masks (which I think you
should), you must use the same terminology.

Because you wrote this:
> +#define PGTABLE_LEVELS		3

The calculated PGDIR_SHIFT is what the kernel would call PUD_SHIFT with 48bit VA.

> +#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n) ((PAGE_SHIFT - 3) * (4 - (n)) + 3)
> +#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - PGTABLE_LEVELS)

Which then explains why you have to invent a new name for PGD_SHIFT, and
calculate it manually:
> +#define EXTRA_SHIFT		(PGDIR_SHIFT + PAGE_SHIFT - 3)


kexec-tools is closely coupled to the kernel, can we try to make this as similar
as possible so that code (and reviewers!) can easily move between the two.

(some other comments below)

Thanks,


James

> diff --git a/kexec/arch/arm64/kexec-arm64.c b/kexec/arch/arm64/kexec-arm64.c
> index 62f37585b788..af5dab331e97 100644
> --- a/kexec/arch/arm64/kexec-arm64.c
> +++ b/kexec/arch/arm64/kexec-arm64.c
> @@ -33,6 +34,10 @@
>  #define PROP_ELFCOREHDR "linux,elfcorehdr"
>  #define PROP_USABLE_MEM_RANGE "linux,usable-memory-range"
>  
> +typedef unsigned long guest_addr_t;
> +typedef unsigned long host_addr_t;

What do guest and host mean here?
I assumed everything that happens in purgatory would have something like
phys_addr_t (for both physical and virtual addresses).


> +static int next_tbl_cnt = 1;
> +
>  /* Global varables the core kexec routines expect. */

Nit: variables


>  
>  unsigned char reuse_initrd;
> @@ -519,6 +524,118 @@ unsigned long arm64_locate_kernel_segment(struct kexec_info *info)
>  	return hole;
>  }
>  
> +static host_addr_t *create_table_entry(host_addr_t *pgtbl_buf,
> +		guest_addr_t pgtbl_mem, host_addr_t *tbl,
> +		guest_addr_t virt, int shift,
> +		unsigned long ptrs)
> +{
> +	unsigned long index, desc, offset;
> +
> +	index = (virt >> shift) & (ptrs - 1);
> +	/* check if we have allocated a table already for this index */
> +	if (tbl[index]) {
> +		/*
> +		 * table index will have entry as per purgatory (guest)page
> +		 * table memory. Find out corresponding buffer address of
> +		 * table (host).
> +		 */
> +		desc = tbl[index] & ~3UL;
> +		offset = desc - pgtbl_mem;
> +		return &pgtbl_buf[offset >> 3];
> +	}
> +
> +	if (next_tbl_cnt >= MAX_PGTBLE_SZ / PAGE_SIZE)
> +		die("%s:No more memory for page table\n", __func__);
> +	/*
> +	 * Always write page table content as per page table memory
> +	 * allocated for purgatory(guest) area, but return corresponding
> +	 * buffer area allocated in kexec (host).
> +	 */
> +
> +	tbl[index] = (pgtbl_mem + PAGE_SIZE * next_tbl_cnt) | PMD_TYPE_TABLE;
> +
> +	return &pgtbl_buf[(next_tbl_cnt++ * PAGE_SIZE) >> 3];
> +}
> +
> +static void create_block_entry(host_addr_t *tbl, unsigned long flags,
> +				guest_addr_t virt)
> +{
> +	unsigned long index;
> +	unsigned long desc;
> +
> +	index = pmd_index(virt);
> +	desc = virt & PMD_MASK;
> +	desc |= flags;
> +	tbl[index] = desc;
> +}
> +
> +static void create_identity_entry(host_addr_t *pgtbl_buf,
> +		guest_addr_t pgtbl_mem, guest_addr_t virt,
> +		unsigned long flags)
> +{
> +	host_addr_t *tbl = pgtbl_buf;
> +	tbl = create_table_entry(pgtbl_buf, pgtbl_mem, tbl, virt,
> +			EXTRA_SHIFT, EXTRA_PTRS);
> +	tbl = create_table_entry(pgtbl_buf, pgtbl_mem, tbl, virt,
> +			IDENTITY_SHIFT, PTRS_PER_PTE);

Are 'extra' and 'identity' names for levels in the page tables? If so can we use
the same names as the kernel.


> +	create_block_entry(tbl, flags, virt);
> +}
> +
> +/**
> + * arm64_create_pgtbl_segment - Create page table segments to be used by
> + * purgatory. Page table will have entries to access memory area of all
> + * those segments which becomes part of sha verification in purgatory.
> + * Additionally, we also create page table for purgatory segment as well.
> + */
> +
> +static int arm64_create_pgtbl_segment(struct kexec_info *info,
> +		unsigned long hole_min, unsigned long hole_max)
> +{
> +	host_addr_t *pgtbl_buf;
> +	guest_addr_t pgtbl_mem, mstart, mend;
> +	int i;
> +	unsigned long purgatory_base, purgatory_len;
> +
> +	pgtbl_buf = xmalloc(MAX_PGTBLE_SZ);
> +	memset(pgtbl_buf, 0, MAX_PGTBLE_SZ);
> +	pgtbl_mem = add_buffer_phys_virt(info, pgtbl_buf, MAX_PGTBLE_SZ,
> +			MAX_PGTBLE_SZ, PAGE_SIZE, hole_min, hole_max, 1, 0);
> +	for (i = 0; i < info->nr_segments; i++) {
> +		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
> +			purgatory_base = (unsigned long)info->segment[i].mem;
> +			purgatory_len = info->segment[i].memsz;
> +		}
> +		mstart = (unsigned long)info->segment[i].mem;
> +		mend = mstart + info->segment[i].memsz;
> +		mstart &= ~(SECTION_SIZE - 1);
> +		while (mstart < mend) {
> +			create_identity_entry(pgtbl_buf, pgtbl_mem,
> +					mstart, MMU_FLAGS_NORMAL);
> +			mstart += SECTION_SIZE;
> +		}
> +	}
> +
> +	/* we will need pgtble_base in purgatory for enabling d-cache */
> +	elf_rel_set_symbol(&info->rhdr, "pgtble_base", &pgtbl_mem,
> +		sizeof(pgtbl_mem));
> +	/*
> +	 * We need to disable d-cache before we exit from purgatory.
> +	 * Since, only d-cache flush by VAs is recommended, therefore we
> +	 * will also need memory location of all those area which will be
> +	 * accessed in purgatory with enabled d-cache. sha256_regions
> +	 * already have start and length for all the segments except
> +	 * purgatory. Therefore, we will need to pass start and length of
> +	 * purgatory additionally.
> +	 */
> +	elf_rel_set_symbol(&info->rhdr, "purgatory_base", &purgatory_base,
> +		sizeof(purgatory_base));
> +	elf_rel_set_symbol(&info->rhdr, "purgatory_len", &purgatory_len,
> +		sizeof(purgatory_len));
> +
> +	return 0;
> +}
> +
>  /**
>   * arm64_load_other_segments - Prepare the dtb, initrd and purgatory segments.
>   */
> @@ -630,6 +747,8 @@ int arm64_load_other_segments(struct kexec_info *info,
>  	elf_rel_set_symbol(&info->rhdr, "arm64_dtb_addr", &dtb_base,
>  		sizeof(dtb_base));
>  
> +	arm64_create_pgtbl_segment(info, hole_min, hole_max);

hole_min->hole_max is the region from the end of the Kernel image, to the end of
usable memory. I can't see how this window is made smaller as the dtb, initramfs
then page tables are added to it... (kvmtool updates the equivalent values as it
adds things)

Does kexec-tools have some other accounting for what memory is in use? (in which
case why are these values needed?)


>  	return 0;
>  }
>  
> diff --git a/kexec/arch/arm64/kexec-mmu.h b/kexec/arch/arm64/kexec-mmu.h
> new file mode 100644
> index 000000000000..55354b5e3002
> --- /dev/null
> +++ b/kexec/arch/arm64/kexec-mmu.h
> @@ -0,0 +1,58 @@
> +#if !defined(KEXEC_MMU_H)
> +#define KEXEC_MMU_H

As lots of the symbols names and values below were copied from the kernel, we
should copy the copyright statement too. Something like:

/*
 * Based on linux v4.11's arch/arm64/include/asm/pgtable-hwdef.h
 * Copyright (C) 2012 ARM Ltd.
 */


> +/*
> + * kexec creates identity page table to be used in purgatory so that

> + * dcache verification becomes faster.

... so that SHA verification of the image can make use of the dcache.


> + *
> + * These are the definitions to be used by page table creation routine.
> + *
> + * Only 4K page table, 3 level, 2M block mapping, 48bit VA is supported
> + */
> +#define PAGE_SHIFT		12
> +#define PGTABLE_LEVELS		3
> +#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n) ((PAGE_SHIFT - 3) * (4 - (n)) + 3)
> +#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - PGTABLE_LEVELS)

> +#define EXTRA_SHIFT		(PGDIR_SHIFT + PAGE_SHIFT - 3)
> +#define EXTRA_PTRS		(1 << (48 - EXTRA_SHIFT))

This looks like the level above PGD? What do you need this for if you have only
3 levels of page tables?


> +#define PUD_SHIFT		PGDIR_SHIFT
> +#define PMD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
> +#define PMD_SIZE		(1UL << PMD_SHIFT)
> +#define PMD_MASK		(~(PMD_SIZE-1))
> +#define IDENTITY_SHIFT		PUD_SHIFT
> +#define PTRS_PER_PTE		(1 << (PAGE_SHIFT - 3))
> +#define PTRS_PER_PMD		PTRS_PER_PTE
> +#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
> +#define PMD_TYPE_TABLE		(3UL << 0)
> +#define PMD_TYPE_SECT		(1UL << 0)
> +#define PMD_SECT_AF		(1UL << 10)
> +#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
> +#define MT_NORMAL		0
> +#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
> +#define MMU_FLAGS_NORMAL	(PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL)
> +#define SECTION_SHIFT		PMD_SHIFT
> +#define SECTION_SIZE		(1UL << SECTION_SHIFT)
> +#define PAGE_SIZE		(1 << PAGE_SHIFT)
> +/*
> + * Since we are using 3 level of page tables, therefore minimum number of
> + * table will be 3. Each entry in level 3 page table can map 2MB memory
> + * area. Thus a level 3 page table indexed by bit 29:21 can map a total of
> + * 1G memory area. Therefore, if any segment crosses 1G boundary, then we
> + * will need one more level 3 table. Similarly, level 2 page table indexed
> + * by bit 38:30 can map a total of 512G memory area. If segment addresses
> + * are more than 512G apart then we will need two more table for each such
> + * block.
> + * Lets consider 2G as the maximum size of crashkernel.

crashkernel+initramfs ?

A good assumption. Can we test this is true when the image is loaded?
(otherwise its going to be really fun to debug!)


> This 2G memory
> + * location might not be allocated at 1G aligned boundary, so in that case
> + * we need to have 5 table size resrved to map any location of the crash

(Nit: reserved)


> + * kernel region.
> + *
> + * If we will ever wish to support uart debugging in purgatory then that
> + * might cross the boundary and therefore additional 2 more table space. In
> + * that case, we will need a total of 7 table space.
> + *
> + * As of now keep it fixed at 5. Increase it if any platform either
> + * supports uart or more than 2G of crash kernel size.
> + */
> +#define MAX_PGTBLE_SZ	(5 * PAGE_SIZE)
> +
> +#endif
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02  8:23 ` [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory James Morse
@ 2017-06-02  9:55   ` Ard Biesheuvel
  2017-06-02 11:15     ` Bhupesh SHARMA
  2017-06-02 14:42   ` Pratyush Anand
  1 sibling, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2017-06-02  9:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 June 2017 at 08:23, James Morse <james.morse@arm.com> wrote:
> Hi Pratyush,
>
> On 23/05/17 06:02, Pratyush Anand wrote:
>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA
>> verification.
>>
>> Therefore, these patches adds support for dcache enabling facility during
>> purgatory execution.
>
> I'm still not convinced we need this. Moving the SHA verification to happen
> before the dcache+mmu are disabled would also solve the delay problem, and we
> can print an error message or fail the syscall.
>
> For kexec we don't expect memory corruption, what are we testing for?

This is a very good question. SHA-256 is quite a heavy hammer if all
you need is CRC style error detection. Note that SHA-256 uses 256
bytes of round keys, which means that in the absence of a cache, each
64 byte chunk of data processed involves (re)reading 320 bytes from
DRAM. That also means you could write a SHA-256 implementation for
AArch64 that keeps the round keys in NEON registers instead, and it
would probably be a lot faster.

> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
> can't accidentally write over it.
>
> (we discussed all this last time, but it fizzled-out. If you and the
>  kexec-tools maintainer think its necessary, fine by me!)
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02  9:55   ` Ard Biesheuvel
@ 2017-06-02 11:15     ` Bhupesh SHARMA
  2017-06-02 11:36       ` Ard Biesheuvel
  2017-06-02 16:36       ` James Morse
  0 siblings, 2 replies; 20+ messages in thread
From: Bhupesh SHARMA @ 2017-06-02 11:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ard, James

On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> On 2 June 2017 at 08:23, James Morse <james.morse@arm.com> wrote:
>> Hi Pratyush,
>>
>> On 23/05/17 06:02, Pratyush Anand wrote:
>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>> during purgatory execution then, it takes just a second in SHA
>>> verification.
>>>
>>> Therefore, these patches adds support for dcache enabling facility during
>>> purgatory execution.
>>
>> I'm still not convinced we need this. Moving the SHA verification to happen
>> before the dcache+mmu are disabled would also solve the delay problem, and we
>> can print an error message or fail the syscall.
>>
>> For kexec we don't expect memory corruption, what are we testing for?
>
> This is a very good question. SHA-256 is quite a heavy hammer if all
> you need is CRC style error detection. Note that SHA-256 uses 256
> bytes of round keys, which means that in the absence of a cache, each
> 64 byte chunk of data processed involves (re)reading 320 bytes from
> DRAM. That also means you could write a SHA-256 implementation for
> AArch64 that keeps the round keys in NEON registers instead, and it
> would probably be a lot faster.

AFAICR the sha-256 implementation was proposed to boot a signed
kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
restrictions (see [1]).

As Matthew Garret rightly noted (see[2]), secure Boot, if enabled, is
explicitly designed to stop you booting modified kernels unless you've
added your own keys.

But if you boot a signed Linux distribution with kexec enabled without
using the SHA like feature in the purgatory (like, say, Ubuntu) then
you're able to boot a modified Windows kernel that will still believe
it was booted securely.

So, CRC wouldn't possibly fulfil the functionality we are trying to
achieve with SHA-256 in the purgatory.

However, having seen the power of using the inbuilt CRC instructions
from the ARM64 ISA on a SoC which supports it, I can vouch that the
native ISA implementations are much faster than other approaches.

However, using the SHA-256 implementation (as you rightly noted) would
employ NEON registers and can be faster, however I remember some SoC
vendors disabling co-processor extensions in their ARM implementations
in the past, so I am not sure we can assume that NEON extensions would
be available in all ARMv8 implementations by default.

[1] https://lwn.net/Articles/603116/
[2] http://mjg59.dreamwidth.org/28746.html

Regards,
Bhupesh

>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>> can't accidentally write over it.
>>
>> (we discussed all this last time, but it fizzled-out. If you and the
>>  kexec-tools maintainer think its necessary, fine by me!)
>>
>
> _______________________________________________
> kexec mailing list
> kexec at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02 11:15     ` Bhupesh SHARMA
@ 2017-06-02 11:36       ` Ard Biesheuvel
  2017-06-02 11:44         ` Ard Biesheuvel
  2017-06-02 16:36       ` James Morse
  1 sibling, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2017-06-02 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 June 2017 at 11:15, Bhupesh SHARMA <bhupesh.linux@gmail.com> wrote:
> Hi Ard, James
>
> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> On 2 June 2017 at 08:23, James Morse <james.morse@arm.com> wrote:
>>> Hi Pratyush,
>>>
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution then, it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches adds support for dcache enabling facility during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>
>> This is a very good question. SHA-256 is quite a heavy hammer if all
>> you need is CRC style error detection. Note that SHA-256 uses 256
>> bytes of round keys, which means that in the absence of a cache, each
>> 64 byte chunk of data processed involves (re)reading 320 bytes from
>> DRAM. That also means you could write a SHA-256 implementation for
>> AArch64 that keeps the round keys in NEON registers instead, and it
>> would probably be a lot faster.
>
> AFAICR the sha-256 implementation was proposed to boot a signed
> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
> restrictions (see [1]).
>
> As Matthew Garret rightly noted (see[2]), secure Boot, if enabled, is
> explicitly designed to stop you booting modified kernels unless you've
> added your own keys.
>
> But if you boot a signed Linux distribution with kexec enabled without
> using the SHA like feature in the purgatory (like, say, Ubuntu) then
> you're able to boot a modified Windows kernel that will still believe
> it was booted securely.
>
> So, CRC wouldn't possibly fulfil the functionality we are trying to
> achieve with SHA-256 in the purgatory.
>

OK. But it appears that kexec_load_file() generates the hashes, and
the purgatory just double checks them? That means there is wiggle room
in terms of hash implementation, even though a non-cryptographic hash
may be out of the question.

> However, having seen the power of using the inbuilt CRC instructions
> from the ARM64 ISA on a SoC which supports it, I can vouch that the
> native ISA implementations are much faster than other approaches.
>
> However, using the SHA-256 implementation (as you rightly noted) would
> employ NEON registers and can be faster, however I remember some SoC
> vendors disabling co-processor extensions in their ARM implementations
> in the past, so I am not sure we can assume that NEON extensions would
> be available in all ARMv8 implementations by default.
>

Alternatively, a SHA-256 implementation that uses movz/movk sequences
instead of ldr instructions to load the round constants would already
be 5x faster, given that we don't need page tables to enable the
I-cache.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02 11:36       ` Ard Biesheuvel
@ 2017-06-02 11:44         ` Ard Biesheuvel
  2017-06-02 14:32           ` Pratyush Anand
  0 siblings, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2017-06-02 11:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 June 2017 at 11:36, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 2 June 2017 at 11:15, Bhupesh SHARMA <bhupesh.linux@gmail.com> wrote:
>> Hi Ard, James
>>
>> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
>> <ard.biesheuvel@linaro.org> wrote:
>>> On 2 June 2017 at 08:23, James Morse <james.morse@arm.com> wrote:
>>>> Hi Pratyush,
>>>>
>>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>>> during purgatory execution then, it takes just a second in SHA
>>>>> verification.
>>>>>
>>>>> Therefore, these patches adds support for dcache enabling facility during
>>>>> purgatory execution.
>>>>
>>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>>> can print an error message or fail the syscall.
>>>>
>>>> For kexec we don't expect memory corruption, what are we testing for?
>>>
>>> This is a very good question. SHA-256 is quite a heavy hammer if all
>>> you need is CRC style error detection. Note that SHA-256 uses 256
>>> bytes of round keys, which means that in the absence of a cache, each
>>> 64 byte chunk of data processed involves (re)reading 320 bytes from
>>> DRAM. That also means you could write a SHA-256 implementation for
>>> AArch64 that keeps the round keys in NEON registers instead, and it
>>> would probably be a lot faster.
>>
>> AFAICR the sha-256 implementation was proposed to boot a signed
>> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
>> restrictions (see [1]).
>>
>> As Matthew Garret rightly noted (see[2]), secure Boot, if enabled, is
>> explicitly designed to stop you booting modified kernels unless you've
>> added your own keys.
>>
>> But if you boot a signed Linux distribution with kexec enabled without
>> using the SHA like feature in the purgatory (like, say, Ubuntu) then
>> you're able to boot a modified Windows kernel that will still believe
>> it was booted securely.
>>
>> So, CRC wouldn't possibly fulfil the functionality we are trying to
>> achieve with SHA-256 in the purgatory.
>>
>
> OK. But it appears that kexec_load_file() generates the hashes, and
> the purgatory just double checks them? That means there is wiggle room
> in terms of hash implementation, even though a non-cryptographic hash
> may be out of the question.
>
>> However, having seen the power of using the inbuilt CRC instructions
>> from the ARM64 ISA on a SoC which supports it, I can vouch that the
>> native ISA implementations are much faster than other approaches.
>>
>> However, using the SHA-256 implementation (as you rightly noted) would
>> employ NEON registers and can be faster, however I remember some SoC
>> vendors disabling co-processor extensions in their ARM implementations
>> in the past, so I am not sure we can assume that NEON extensions would
>> be available in all ARMv8 implementations by default.
>>
>
> Alternatively, a SHA-256 implementation that uses movz/movk sequences
> instead of ldr instructions to load the round constants would already
> be 5x faster, given that we don't need page tables to enable the
> I-cache.

Actually, looking at the C code and the objdump of the kernel's
sha256_generic driver, it is likely that it is already doing this, and
none of the points I made actually make a lot of sense ...

Pratyush: I assume you are already enabling the I-cache in the purgatory?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 1/2] kexec: arm64: create identity page table to be used in purgatory
  2017-06-02  8:24   ` James Morse
@ 2017-06-02 14:29     ` Pratyush Anand
  0 siblings, 0 replies; 20+ messages in thread
From: Pratyush Anand @ 2017-06-02 14:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

Thanks for taking out time to review the code.

On Friday 02 June 2017 01:54 PM, James Morse wrote:
> Hi Pratyush,
>
> On 23/05/17 06:02, Pratyush Anand wrote:
>> Purgatory sha verification is very slow when D-cache is not enabled there.
>> We need to enable MMU as well to enable D-Cache.Therefore,we need to
>> have an identity mapped page table in purgatory.
>>
>> Since debugging is very difficult in purgatory therefore we prefer to do as
>> much work as possible in kexec.
>>
>> This patch prepares page table for purgatory in advance. We support only 4K
>> page table,because it will be available on most of the platform. This page
>> table is passed to the purgatory as a new segment.
>>
>> VA bit is fixed as 48, page table level is 3 where 3rd level page table
>> contains 2M block entries.
>
> This had me confused: You actually have 4 levels of page tables, but the last
> level is never used, because the third level always contains block entries.
>
> If you want to use the kernel's macros for shifts and masks (which I think you
> should), you must use the same terminology.
>
> Because you wrote this:
>> +#define PGTABLE_LEVELS		3
>
> The calculated PGDIR_SHIFT is what the kernel would call PUD_SHIFT with 48bit VA.
>
>> +#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n) ((PAGE_SHIFT - 3) * (4 - (n)) + 3)
>> +#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - PGTABLE_LEVELS)
>
> Which then explains why you have to invent a new name for PGD_SHIFT, and
> calculate it manually:
>> +#define EXTRA_SHIFT		(PGDIR_SHIFT + PAGE_SHIFT - 3)
>
>
> kexec-tools is closely coupled to the kernel, can we try to make this as similar
> as possible so that code (and reviewers!) can easily move between the two.

OK..I will try to sort out the differences
...
>
> (some other comments below)
>
> Thanks,
>
>
> James
>
>> diff --git a/kexec/arch/arm64/kexec-arm64.c b/kexec/arch/arm64/kexec-arm64.c
>> index 62f37585b788..af5dab331e97 100644
>> --- a/kexec/arch/arm64/kexec-arm64.c
>> +++ b/kexec/arch/arm64/kexec-arm64.c
>> @@ -33,6 +34,10 @@
>>  #define PROP_ELFCOREHDR "linux,elfcorehdr"
>>  #define PROP_USABLE_MEM_RANGE "linux,usable-memory-range"
>>
>> +typedef unsigned long guest_addr_t;
>> +typedef unsigned long host_addr_t;
>
> What do guest and host mean here?

guest is something which will be in purgatory domain and host is something in 
kexec domain. I remember, you had suggested to differentiate these two address 
domain for better code readability.

> I assumed everything that happens in purgatory would have something like
> phys_addr_t (for both physical and virtual addresses).

Yes, so guest_addr_t is like phys_addr_t.  I can replace, if there are some 
better names to differentiate between these two address domain.

>
>
>> +static int next_tbl_cnt = 1;
>> +
>>  /* Global varables the core kexec routines expect. */
>
> Nit: variables

OK.

>
>
>>
>>  unsigned char reuse_initrd;
>> @@ -519,6 +524,118 @@ unsigned long arm64_locate_kernel_segment(struct kexec_info *info)
>>  	return hole;
>>  }
>>
>> +static host_addr_t *create_table_entry(host_addr_t *pgtbl_buf,
>> +		guest_addr_t pgtbl_mem, host_addr_t *tbl,
>> +		guest_addr_t virt, int shift,
>> +		unsigned long ptrs)
>> +{
>> +	unsigned long index, desc, offset;
>> +
>> +	index = (virt >> shift) & (ptrs - 1);
>> +	/* check if we have allocated a table already for this index */
>> +	if (tbl[index]) {
>> +		/*
>> +		 * table index will have entry as per purgatory (guest)page
>> +		 * table memory. Find out corresponding buffer address of
>> +		 * table (host).
>> +		 */
>> +		desc = tbl[index] & ~3UL;
>> +		offset = desc - pgtbl_mem;
>> +		return &pgtbl_buf[offset >> 3];
>> +	}
>> +
>> +	if (next_tbl_cnt >= MAX_PGTBLE_SZ / PAGE_SIZE)
>> +		die("%s:No more memory for page table\n", __func__);
>> +	/*
>> +	 * Always write page table content as per page table memory
>> +	 * allocated for purgatory(guest) area, but return corresponding
>> +	 * buffer area allocated in kexec (host).
>> +	 */
>> +
>> +	tbl[index] = (pgtbl_mem + PAGE_SIZE * next_tbl_cnt) | PMD_TYPE_TABLE;
>> +
>> +	return &pgtbl_buf[(next_tbl_cnt++ * PAGE_SIZE) >> 3];
>> +}
>> +
>> +static void create_block_entry(host_addr_t *tbl, unsigned long flags,
>> +				guest_addr_t virt)
>> +{
>> +	unsigned long index;
>> +	unsigned long desc;
>> +
>> +	index = pmd_index(virt);
>> +	desc = virt & PMD_MASK;
>> +	desc |= flags;
>> +	tbl[index] = desc;
>> +}
>> +
>> +static void create_identity_entry(host_addr_t *pgtbl_buf,
>> +		guest_addr_t pgtbl_mem, guest_addr_t virt,
>> +		unsigned long flags)
>> +{
>> +	host_addr_t *tbl = pgtbl_buf;
>> +	tbl = create_table_entry(pgtbl_buf, pgtbl_mem, tbl, virt,
>> +			EXTRA_SHIFT, EXTRA_PTRS);
>> +	tbl = create_table_entry(pgtbl_buf, pgtbl_mem, tbl, virt,
>> +			IDENTITY_SHIFT, PTRS_PER_PTE);
>
> Are 'extra' and 'identity' names for levels in the page tables? If so can we use
> the same names as the kernel.
>

OK,will make it similar as per 4 level page (so 3 level swapper page) in kernel.

>
>> +	create_block_entry(tbl, flags, virt);
>> +}
>> +
>> +/**
>> + * arm64_create_pgtbl_segment - Create page table segments to be used by
>> + * purgatory. Page table will have entries to access memory area of all
>> + * those segments which becomes part of sha verification in purgatory.
>> + * Additionally, we also create page table for purgatory segment as well.
>> + */
>> +
>> +static int arm64_create_pgtbl_segment(struct kexec_info *info,
>> +		unsigned long hole_min, unsigned long hole_max)
>> +{
>> +	host_addr_t *pgtbl_buf;
>> +	guest_addr_t pgtbl_mem, mstart, mend;
>> +	int i;
>> +	unsigned long purgatory_base, purgatory_len;
>> +
>> +	pgtbl_buf = xmalloc(MAX_PGTBLE_SZ);
>> +	memset(pgtbl_buf, 0, MAX_PGTBLE_SZ);
>> +	pgtbl_mem = add_buffer_phys_virt(info, pgtbl_buf, MAX_PGTBLE_SZ,
>> +			MAX_PGTBLE_SZ, PAGE_SIZE, hole_min, hole_max, 1, 0);
>> +	for (i = 0; i < info->nr_segments; i++) {
>> +		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
>> +			purgatory_base = (unsigned long)info->segment[i].mem;
>> +			purgatory_len = info->segment[i].memsz;
>> +		}
>> +		mstart = (unsigned long)info->segment[i].mem;
>> +		mend = mstart + info->segment[i].memsz;
>> +		mstart &= ~(SECTION_SIZE - 1);
>> +		while (mstart < mend) {
>> +			create_identity_entry(pgtbl_buf, pgtbl_mem,
>> +					mstart, MMU_FLAGS_NORMAL);
>> +			mstart += SECTION_SIZE;
>> +		}
>> +	}
>> +
>> +	/* we will need pgtble_base in purgatory for enabling d-cache */
>> +	elf_rel_set_symbol(&info->rhdr, "pgtble_base", &pgtbl_mem,
>> +		sizeof(pgtbl_mem));
>> +	/*
>> +	 * We need to disable d-cache before we exit from purgatory.
>> +	 * Since, only d-cache flush by VAs is recommended, therefore we
>> +	 * will also need memory location of all those area which will be
>> +	 * accessed in purgatory with enabled d-cache. sha256_regions
>> +	 * already have start and length for all the segments except
>> +	 * purgatory. Therefore, we will need to pass start and length of
>> +	 * purgatory additionally.n
>> +	 */
>> +	elf_rel_set_symbol(&info->rhdr, "purgatory_base", &purgatory_base,
>> +		sizeof(purgatory_base));
>> +	elf_rel_set_symbol(&info->rhdr, "purgatory_len", &purgatory_len,
>> +		sizeof(purgatory_len));
>> +
>> +	return 0;
>> +}
>> +
>>  /**
>>   * arm64_load_other_segments - Prepare the dtb, initrd and purgatory segments.
>>   */
>> @@ -630,6 +747,8 @@ int arm64_load_other_segments(struct kexec_info *info,
>>  	elf_rel_set_symbol(&info->rhdr, "arm64_dtb_addr", &dtb_base,
>>  		sizeof(dtb_base));
>>
>> +	arm64_create_pgtbl_segment(info, hole_min, hole_max);
>
> hole_min->hole_max is the region from the end of the Kernel image, to the end of
> usable memory. I can't see how this window is made smaller as the dtb, initramfs
> then page tables are added to it... (kvmtool updates the equivalent values as it
> adds things)
>
> Does kexec-tools have some other accounting for what memory is in use? (in which
> case why are these values needed?)
>

Please see locate_hole() and add_segment_phys_virt().

locate_hole() finds a hole from info->memory_range lying between hole_min and 
hole_max. add_segment_phys_virt() assigned allocated hole to info->segment. 
locate_hole()  uses info->segment as well to know about consumed regions.

>
>>  	return 0;
>>  }
>>
>> diff --git a/kexec/arch/arm64/kexec-mmu.h b/kexec/arch/arm64/kexec-mmu.h
>> new file mode 100644
>> index 000000000000..55354b5e3002
>> --- /dev/null
>> +++ b/kexec/arch/arm64/kexec-mmu.h
>> @@ -0,0 +1,58 @@
>> +#if !defined(KEXEC_MMU_H)
>> +#define KEXEC_MMU_H
>
> As lots of the symbols names and values below were copied from the kernel, we
> should copy the copyright statement too. Something like:
>
> /*
>  * Based on linux v4.11's arch/arm64/include/asm/pgtable-hwdef.h
>  * Copyright (C) 2012 ARM Ltd.
>  */
>

OK

>
>> +/*
>> + * kexec creates identity page table to be used in purgatory so that
>
>> + * dcache verification becomes faster.
>
> ... so that SHA verification of the image can make use of the dcache.

OK.. :-)

>
>
>> + *
>> + * These are the definitions to be used by page table creation routine.
>> + *
>> + * Only 4K page table, 3 level, 2M block mapping, 48bit VA is supported
>> + */
>> +#define PAGE_SHIFT		12
>> +#define PGTABLE_LEVELS		3
>> +#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n) ((PAGE_SHIFT - 3) * (4 - (n)) + 3)
>> +#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - PGTABLE_LEVELS)
>
>> +#define EXTRA_SHIFT		(PGDIR_SHIFT + PAGE_SHIFT - 3)
>> +#define EXTRA_PTRS		(1 << (48 - EXTRA_SHIFT))
>
> This looks like the level above PGD? What do you need this for if you have only
> 3 levels of page tables?

As said above, will remove these names now.

>
>
>> +#define PUD_SHIFT		PGDIR_SHIFT
>> +#define PMD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
>> +#define PMD_SIZE		(1UL << PMD_SHIFT)
>> +#define PMD_MASK		(~(PMD_SIZE-1))
>> +#define IDENTITY_SHIFT		PUD_SHIFT
>> +#define PTRS_PER_PTE		(1 << (PAGE_SHIFT - 3))
>> +#define PTRS_PER_PMD		PTRS_PER_PTE
>> +#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
>> +#define PMD_TYPE_TABLE		(3UL << 0)
>> +#define PMD_TYPE_SECT		(1UL << 0)
>> +#define PMD_SECT_AF		(1UL << 10)
>> +#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
>> +#define MT_NORMAL		0
>> +#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
>> +#define MMU_FLAGS_NORMAL	(PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL)
>> +#define SECTION_SHIFT		PMD_SHIFT
>> +#define SECTION_SIZE		(1UL << SECTION_SHIFT)
>> +#define PAGE_SIZE		(1 << PAGE_SHIFT)
>> +/*
>> + * Since we are using 3 level of page tables, therefore minimum number of
>> + * table will be 3. Each entry in level 3 page table can map 2MB memory
>> + * area. Thus a level 3 page table indexed by bit 29:21 can map a total of
>> + * 1G memory area. Therefore, if any segment crosses 1G boundary, then we
>> + * will need one more level 3 table. Similarly, level 2 page table indexed
>> + * by bit 38:30 can map a total of 512G memory area. If segment addresses
>> + * are more than 512G apart then we will need two more table for each such
>> + * block.
>> + * Lets consider 2G as the maximum size of crashkernel.
>
> crashkernel+initramfs ?

OK

>
> A good assumption. Can we test this is true when the image is loaded?
> (otherwise its going to be really fun to debug!)
>

That we already do. If we cross that boundary we will need additional space 
and it will die() during kexec load.
         if (next_tbl_cnt >= MAX_PGTBLE_SZ / PAGE_SIZE)
                 die("%s:No more memory for page table\n", __func__);
[It is difficult to implement, get going without MMU enabled in that case]

>
>> This 2G memory
>> + * location might not be allocated at 1G aligned boundary, so in that case
>> + * we need to have 5 table size resrved to map any location of the crash
>
> (Nit: reserved)

OK.

>
>
>> + * kernel region.
>> + *
>> + * If we will ever wish to support uart debugging in purgatory then that
>> + * might cross the boundary and therefore additional 2 more table space. In
>> + * that case, we will need a total of 7 table space.
>> + *
>> + * As of now keep it fixed at 5. Increase it if any platform either
>> + * supports uart or more than 2G of crash kernel size.
>> + */
>> +#define MAX_PGTBLE_SZ	(5 * PAGE_SIZE)
>> +
>> +#endif
>>


~Pratyush

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02 11:44         ` Ard Biesheuvel
@ 2017-06-02 14:32           ` Pratyush Anand
  0 siblings, 0 replies; 20+ messages in thread
From: Pratyush Anand @ 2017-06-02 14:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ard,

Thanks for all the inputs.

On Friday 02 June 2017 05:14 PM, Ard Biesheuvel wrote:
>> Alternatively, a SHA-256 implementation that uses movz/movk sequences
>> instead of ldr instructions to load the round constants would already
>> be 5x faster, given that we don't need page tables to enable the
>> I-cache.
> Actually, looking at the C code and the objdump of the kernel's
> sha256_generic driver, it is likely that it is already doing this, and
> none of the points I made actually make a lot of sense ...
>
> Pratyush: I assume you are already enabling the I-cache in the purgatory?

It's not enabled yet. But, I had tried first with enabling only I-cache, but 
that did not make much difference in execution time.

~Pratyush

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02  8:23 ` [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory James Morse
  2017-06-02  9:55   ` Ard Biesheuvel
@ 2017-06-02 14:42   ` Pratyush Anand
       [not found]     ` <CAB=otbT23ySN4VC6G0sBKF5p4SvsnG8PS-C_beBgn8YJUsbw9Q@mail.gmail.com>
  1 sibling, 1 reply; 20+ messages in thread
From: Pratyush Anand @ 2017-06-02 14:42 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

On Friday 02 June 2017 01:53 PM, James Morse wrote:
> Hi Pratyush,
>
> On 23/05/17 06:02, Pratyush Anand wrote:
>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA
>> verification.
>>
>> Therefore, these patches adds support for dcache enabling facility during
>> purgatory execution.
>
> I'm still not convinced we need this. Moving the SHA verification to happen
> before the dcache+mmu are disabled would also solve the delay problem,

Humm..I am not sure, if we can do that.

When we leave kernel (and enter into purgatory), icache+dcache+mmu are already 
disabled. I think, that would be possible when we will be in a position to use 
in-kernel purgatory.

> and we
> can print an error message or fail the syscall.
>
> For kexec we don't expect memory corruption, what are we testing for?
> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
> can't accidentally write over it.
>
> (we discussed all this last time, but it fizzled-out. If you and the
>  kexec-tools maintainer think its necessary, fine by me!)

Yes, there had already been discussion and MAINTAINERs have discouraged 
none-purgatory implementation.

>
> I have some comments on making this code easier to maintain..
>

Thanks.

I have implemented your review comments and have archived the code in

https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache

I will be posting the next version only when someone complains about ARM64 
kdump behavior that it is not as fast as x86.

Thanks for all your time on this series. That really helped me to understand 
the arm64 page table in a better way.

~Pratyush

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2017-06-02 11:15     ` Bhupesh SHARMA
  2017-06-02 11:36       ` Ard Biesheuvel
@ 2017-06-02 16:36       ` James Morse
  1 sibling, 0 replies; 20+ messages in thread
From: James Morse @ 2017-06-02 16:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Bhupesh,

On 02/06/17 12:15, Bhupesh SHARMA wrote:
> On Fri, Jun 2, 2017 at 3:25 PM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> On 2 June 2017 at 08:23, James Morse <james.morse@arm.com> wrote:
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution then, it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches adds support for dcache enabling facility during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem, and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>
>> This is a very good question. SHA-256 is quite a heavy hammer if all
>> you need is CRC style error detection.

Thanks for the history links.

We don't (yet) support KEXEC_FILE or KEXEC_VERIFY_SIG, and arm64 doesn't have an
in-kernel purgatory (which looks to be required for kexec_file under secure-boot).


> AFAICR the sha-256 implementation was proposed to boot a signed
> kexec/kdump kernel to circumvent kexec from violating UEFI secure boot
> restrictions (see [1]).

The beginning of the kexec-tools git history is 'kexec-tools-1.101' in 2006, it
had util_lib/sha256.c. It looks like SecureBoot arrived in 2011 with v2.3.1 of UEFI.
I can see how x86 picked up on this checksum for secure-boot as kexec-tools
already did this work, (some of the files under arch/x86/purgatory note their
kexec-tools origin), my question is why did it do it in the first place?
If the reason is accidental writes, we mitigate this on arm64 by unmapping the
kdump region instead of just marking it read-only.


> As Matthew Garret rightly noted (see[2]), secure Boot, if enabled, is
> explicitly designed to stop you booting modified kernels unless you've
> added your own keys.

> So, CRC wouldn't possibly fulfil the functionality we are trying to
> achieve with SHA-256 in the purgatory.

Is this still true for a purgatory provided by user-space?


Thanks,

James

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
       [not found]     ` <CAB=otbT23ySN4VC6G0sBKF5p4SvsnG8PS-C_beBgn8YJUsbw9Q@mail.gmail.com>
@ 2018-04-04 12:45       ` Kostiantyn Iarmak
  2018-04-04 13:04         ` Kostiantyn Iarmak
  2018-04-04 13:28         ` James Morse
  0 siblings, 2 replies; 20+ messages in thread
From: Kostiantyn Iarmak @ 2018-04-04 12:45 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

From: Pratyush Anand <panand@redhat.com>
> Date: Fri, Jun 2, 2017 at 5:42 PM
> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
> To: James Morse <james.morse@arm.com>
> Cc: mark.rutland at arm.com, bhe at redhat.com, kexec at lists.infradead.org,
> horms at verge.net.au, dyoung at redhat.com,
> linux-arm-kernel at lists.infradead.org
>
>
> Hi James,
>
> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>> Hi Pratyush,
>>
>> On 23/05/17 06:02, Pratyush Anand wrote:
>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>> during purgatory execution then, it takes just a second in SHA
>>> verification.
>>>
>>> Therefore, these patches adds support for dcache enabling facility during
>>> purgatory execution.
>>
>> I'm still not convinced we need this. Moving the SHA verification to happen
>> before the dcache+mmu are disabled would also solve the delay problem,
>
> Humm..I am not sure, if we can do that.
>
> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
> already disabled. I think, that would be possible when we will be in a
> position to use in-kernel purgatory.
>
>> and we
>> can print an error message or fail the syscall.
>>
>> For kexec we don't expect memory corruption, what are we testing for?
>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>> can't accidentally write over it.
>>
>> (we discussed all this last time, but it fizzled-out. If you and the
>>   kexec-tools maintainer think its necessary, fine by me!)
>
> Yes, there had already been discussion and MAINTAINERs have
> discouraged none-purgatory implementation.
>
>> I have some comments on making this code easier to maintain..
>>
> Thanks.
>
> I have implemented your review comments and have archived the code in
>
> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
>
> I will be posting the next version only when someone complains about
> ARM64 kdump behavior that it is not as fast as x86.
On our ARM64-based platform we have very long main kernel-secondary 
kernel switch time.

This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 
kexec-tools version), we can get ~25x speedup, with this patch secondary 
kernel boots in ~3 seconds while on 2.0.13-2.0.16 kexec-tools without 
this patch switch takes about 75 seconds.

When do you plan merge this patch?

I can help you with testing on our platform.

> Thanks for all your time on this series. That really helped me to
> understand the arm64 page table in a better way.
>
> ~Pratyush
>
>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

-- 
Best Regards,

   Kostiantyn Iarmak.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2018-04-04 12:45       ` Kostiantyn Iarmak
@ 2018-04-04 13:04         ` Kostiantyn Iarmak
  2018-04-04 13:28         ` James Morse
  1 sibling, 0 replies; 20+ messages in thread
From: Kostiantyn Iarmak @ 2018-04-04 13:04 UTC (permalink / raw)
  To: linux-arm-kernel

Unfortunately got delivery failure notification for Pratyush Anand's 
address (Unknown address),

   who can help with merging this patch set?


On 04.04.18 15:45, Kostiantyn Iarmak wrote:
> Hi Pratyush,
>
> From: Pratyush Anand <panand@redhat.com>
>> Date: Fri, Jun 2, 2017 at 5:42 PM
>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in 
>> purgatory
>> To: James Morse <james.morse@arm.com>
>> Cc: mark.rutland at arm.com, bhe at redhat.com, kexec at lists.infradead.org,
>> horms at verge.net.au, dyoung at redhat.com,
>> linux-arm-kernel at lists.infradead.org
>>
>>
>> Hi James,
>>
>> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>>> Hi Pratyush,
>>>
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more that 2 minutes to verify SHA in purgatory when 
>>>> vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 
>>>> second
>>>> even when we have -O2 optimization enabled. However, if dcache is 
>>>> enabled
>>>> during purgatory execution then, it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches adds support for dcache enabling facility 
>>>> during
>>>> purgatory execution.
>>>
>>> I'm still not convinced we need this. Moving the SHA verification to 
>>> happen
>>> before the dcache+mmu are disabled would also solve the delay problem,
>>
>> Humm..I am not sure, if we can do that.
>>
>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
>> already disabled. I think, that would be possible when we will be in a
>> position to use in-kernel purgatory.
>>
>>> and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>> I can see the use for kdump, but the kdump-kernel is unmapped so the 
>>> kernel
>>> can't accidentally write over it.
>>>
>>> (we discussed all this last time, but it fizzled-out. If you and the
>>>   kexec-tools maintainer think its necessary, fine by me!)
>>
>> Yes, there had already been discussion and MAINTAINERs have
>> discouraged none-purgatory implementation.
>>
>>> I have some comments on making this code easier to maintain..
>>>
>> Thanks.
>>
>> I have implemented your review comments and have archived the code in
>>
>> https://github.com/pratyushanand/kexec-tools.git : 
>> purgatory-enable-dcache
>>
>> I will be posting the next version only when someone complains about
>> ARM64 kdump behavior that it is not as fast as x86.
> On our ARM64-based platform we have very long main kernel-secondary 
> kernel switch time.
>
> This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 
> kexec-tools version), we can get ~25x speedup, with this patch 
> secondary kernel boots in ~3 seconds while on 2.0.13-2.0.16 
> kexec-tools without this patch switch takes about 75 seconds.
>
> When do you plan merge this patch?
>
> I can help you with testing on our platform.
>
>> Thanks for all your time on this series. That really helped me to
>> understand the arm64 page table in a better way.
>>
>> ~Pratyush
>>
>>
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

-- 
Best Regards,

   Kostiantyn (Kostia) Iarmak.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2018-04-04 12:45       ` Kostiantyn Iarmak
  2018-04-04 13:04         ` Kostiantyn Iarmak
@ 2018-04-04 13:28         ` James Morse
  2018-04-04 22:55           ` Ard Biesheuvel
                             ` (3 more replies)
  1 sibling, 4 replies; 20+ messages in thread
From: James Morse @ 2018-04-04 13:28 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Kostiantyn,

On 04/04/18 13:45, Kostiantyn Iarmak wrote:
> From: Pratyush Anand <panand@redhat.com>
>> Date: Fri, Jun 2, 2017 at 5:42 PM
>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
>> To: James Morse <james.morse@arm.com>
>> Cc: mark.rutland at arm.com, bhe at redhat.com, kexec at lists.infradead.org,
>> horms at verge.net.au, dyoung at redhat.com,
>> linux-arm-kernel at lists.infradead.org
>>
>> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>> during purgatory execution then, it takes just a second in SHA
>>>> verification.
>>>>
>>>> Therefore, these patches adds support for dcache enabling facility during
>>>> purgatory execution.

>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>> before the dcache+mmu are disabled would also solve the delay problem,
>>
>> Humm..I am not sure, if we can do that.

>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
>> already disabled. I think, that would be possible when we will be in a
>> position to use in-kernel purgatory.
>>
>>> and we
>>> can print an error message or fail the syscall.
>>>
>>> For kexec we don't expect memory corruption, what are we testing for?
>>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>>> can't accidentally write over it.
>>>
>>> (we discussed all this last time, but it fizzled-out. If you and the
>>> ? kexec-tools maintainer think its necessary, fine by me!)

>> Yes, there had already been discussion and MAINTAINERs have
>> discouraged none-purgatory implementation.
>>
>>> I have some comments on making this code easier to maintain..
>>>
>> Thanks.
>>
>> I have implemented your review comments and have archived the code in
>>
>> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
>>
>> I will be posting the next version only when someone complains about
>> ARM64 kdump behavior that it is not as fast as x86.

> On our ARM64-based platform we have very long main kernel-secondary kernel
> switch time.
> 
> This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools
> version), we can get ~25x speedup, with this patch secondary kernel boots in ~3
> seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes about
> 75 seconds.

This is slow because its generating a checksum of the kernel without the benefit
of the caches. This series generated page tables so that it could enable the MMU
and caches. But, the purgatory code also needs to be a simple as possible
because its practically impossible to debug.

The purgatory code does this checksum-ing because its worried the panic() was
because the kernel cause some memory corruption, and that memory corruption may
have affected the kdump kernel too.

This can't happen on arm64 as we unmap kdump's crash region, so not even the
kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash
dump kernel memory") has all the details.

(we also needed to do this to avoid the risk of mismatched memory attributes if
kdump boots and some CPUs are still stuck in the old kernel)


> When do you plan merge this patch?

We ended up with the check-summing code because its the default behaviour of
kexec-tools on other architectures.

One alternative is to rip it out for arm64. Untested:
--------------------%<--------------------
diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abea..f10c148 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
        -Werror-implicit-function-declaration \
        -Wdeclaration-after-statement \
        -Werror=implicit-int \
-       -Werror=strict-prototypes
+       -Werror=strict-prototypes \
+       -DNO_SHA_IN_PURGATORY

 arm64_PURGATORY_SRCS += \
        purgatory/arch/arm64/entry.S \
diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
index 3bbcc09..44e792a 100644
--- a/purgatory/purgatory.c
+++ b/purgatory/purgatory.c
@@ -9,6 +9,8 @@
 struct sha256_region sha256_regions[SHA256_REGIONS] = {};
 sha256_digest_t sha256_digest = { };

+#ifndef NO_SHA_IN_PURGATORY
+
 int verify_sha256_digest(void)
 {
        struct sha256_region *ptr, *end;
@@ -39,14 +41,18 @@ int verify_sha256_digest(void)
        return 0;
 }

+#endif /* NO_SHA_IN_PURGATORY */
+
 void purgatory(void)
 {
        printf("I'm in purgatory\n");
        setup_arch();
+#ifndef NO_SHA_IN_PURGATORY
        if (verify_sha256_digest()) {
                for(;;) {
                        /* loop forever */
                }
        }
+#endif /* NO_SHA_IN_PURGATORY */
        post_verification_setup_arch();
 }
--------------------%<--------------------


Thanks,

James

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2018-04-04 13:28         ` James Morse
@ 2018-04-04 22:55           ` Ard Biesheuvel
  2018-04-05  1:47           ` AKASHI Takahiro
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Ard Biesheuvel @ 2018-04-04 22:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 April 2018 at 15:28, James Morse <james.morse@arm.com> wrote:
> Hi Kostiantyn,
>
> On 04/04/18 13:45, Kostiantyn Iarmak wrote:
>> From: Pratyush Anand <panand@redhat.com>
>>> Date: Fri, Jun 2, 2017 at 5:42 PM
>>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
>>> To: James Morse <james.morse@arm.com>
>>> Cc: mark.rutland at arm.com, bhe at redhat.com, kexec at lists.infradead.org,
>>> horms at verge.net.au, dyoung at redhat.com,
>>> linux-arm-kernel at lists.infradead.org
>>>
>>> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>>> during purgatory execution then, it takes just a second in SHA
>>>>> verification.
>>>>>
>>>>> Therefore, these patches adds support for dcache enabling facility during
>>>>> purgatory execution.
>
>>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>>> before the dcache+mmu are disabled would also solve the delay problem,
>>>
>>> Humm..I am not sure, if we can do that.
>
>>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
>>> already disabled. I think, that would be possible when we will be in a
>>> position to use in-kernel purgatory.
>>>
>>>> and we
>>>> can print an error message or fail the syscall.
>>>>
>>>> For kexec we don't expect memory corruption, what are we testing for?
>>>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>>>> can't accidentally write over it.
>>>>
>>>> (we discussed all this last time, but it fizzled-out. If you and the
>>>>   kexec-tools maintainer think its necessary, fine by me!)
>
>>> Yes, there had already been discussion and MAINTAINERs have
>>> discouraged none-purgatory implementation.
>>>
>>>> I have some comments on making this code easier to maintain..
>>>>
>>> Thanks.
>>>
>>> I have implemented your review comments and have archived the code in
>>>
>>> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
>>>
>>> I will be posting the next version only when someone complains about
>>> ARM64 kdump behavior that it is not as fast as x86.
>
>> On our ARM64-based platform we have very long main kernel-secondary kernel
>> switch time.
>>
>> This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools
>> version), we can get ~25x speedup, with this patch secondary kernel boots in ~3
>> seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes about
>> 75 seconds.
>
> This is slow because its generating a checksum of the kernel without the benefit
> of the caches. This series generated page tables so that it could enable the MMU
> and caches. But, the purgatory code also needs to be a simple as possible
> because its practically impossible to debug.
>
> The purgatory code does this checksum-ing because its worried the panic() was
> because the kernel cause some memory corruption, and that memory corruption may
> have affected the kdump kernel too.
>

If this is the only reason, there is no need to use a strong
cryptographic hash, and we should be able to recover some performance
by switching to CRC32 instead, preferably using the special arm64
instructions (if implemented).

But I agree that skipping the checksum calculation altogether is
probably the best approach here.


> This can't happen on arm64 as we unmap kdump's crash region, so not even the
> kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash
> dump kernel memory") has all the details.
>
> (we also needed to do this to avoid the risk of mismatched memory attributes if
> kdump boots and some CPUs are still stuck in the old kernel)
>
>
>> When do you plan merge this patch?
>
> We ended up with the check-summing code because its the default behaviour of
> kexec-tools on other architectures.
>
> One alternative is to rip it out for arm64. Untested:
> --------------------%<--------------------
> diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
> index 636abea..f10c148 100644
> --- a/purgatory/arch/arm64/Makefile
> +++ b/purgatory/arch/arm64/Makefile
> @@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
>         -Werror-implicit-function-declaration \
>         -Wdeclaration-after-statement \
>         -Werror=implicit-int \
> -       -Werror=strict-prototypes
> +       -Werror=strict-prototypes \
> +       -DNO_SHA_IN_PURGATORY
>
>  arm64_PURGATORY_SRCS += \
>         purgatory/arch/arm64/entry.S \
> diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
> index 3bbcc09..44e792a 100644
> --- a/purgatory/purgatory.c
> +++ b/purgatory/purgatory.c
> @@ -9,6 +9,8 @@
>  struct sha256_region sha256_regions[SHA256_REGIONS] = {};
>  sha256_digest_t sha256_digest = { };
>
> +#ifndef NO_SHA_IN_PURGATORY
> +
>  int verify_sha256_digest(void)
>  {
>         struct sha256_region *ptr, *end;
> @@ -39,14 +41,18 @@ int verify_sha256_digest(void)
>         return 0;
>  }
>
> +#endif /* NO_SHA_IN_PURGATORY */
> +
>  void purgatory(void)
>  {
>         printf("I'm in purgatory\n");
>         setup_arch();
> +#ifndef NO_SHA_IN_PURGATORY
>         if (verify_sha256_digest()) {
>                 for(;;) {
>                         /* loop forever */
>                 }
>         }
> +#endif /* NO_SHA_IN_PURGATORY */
>         post_verification_setup_arch();
>  }
> --------------------%<--------------------
>
>
> Thanks,
>
> James
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2018-04-04 13:28         ` James Morse
  2018-04-04 22:55           ` Ard Biesheuvel
@ 2018-04-05  1:47           ` AKASHI Takahiro
  2018-04-06 18:15           ` Kostiantyn Iarmak
  2018-04-11 15:59           ` Geoff Levand
  3 siblings, 0 replies; 20+ messages in thread
From: AKASHI Takahiro @ 2018-04-05  1:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 04, 2018 at 02:28:52PM +0100, James Morse wrote:
> Hi Kostiantyn,
> 
> On 04/04/18 13:45, Kostiantyn Iarmak wrote:
> > From: Pratyush Anand <panand@redhat.com>
> >> Date: Fri, Jun 2, 2017 at 5:42 PM
> >> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
> >> To: James Morse <james.morse@arm.com>
> >> Cc: mark.rutland at arm.com, bhe at redhat.com, kexec at lists.infradead.org,
> >> horms at verge.net.au, dyoung at redhat.com,
> >> linux-arm-kernel at lists.infradead.org
> >>
> >> On Friday 02 June 2017 01:53 PM, James Morse wrote:
> >>> On 23/05/17 06:02, Pratyush Anand wrote:
> >>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
> >>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
> >>>> even when we have -O2 optimization enabled. However, if dcache is enabled
> >>>> during purgatory execution then, it takes just a second in SHA
> >>>> verification.
> >>>>
> >>>> Therefore, these patches adds support for dcache enabling facility during
> >>>> purgatory execution.
> 
> >>> I'm still not convinced we need this. Moving the SHA verification to happen
> >>> before the dcache+mmu are disabled would also solve the delay problem,
> >>
> >> Humm..I am not sure, if we can do that.
> 
> >> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
> >> already disabled. I think, that would be possible when we will be in a
> >> position to use in-kernel purgatory.
> >>
> >>> and we
> >>> can print an error message or fail the syscall.
> >>>
> >>> For kexec we don't expect memory corruption, what are we testing for?
> >>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
> >>> can't accidentally write over it.
> >>>
> >>> (we discussed all this last time, but it fizzled-out. If you and the
> >>> ? kexec-tools maintainer think its necessary, fine by me!)
> 
> >> Yes, there had already been discussion and MAINTAINERs have
> >> discouraged none-purgatory implementation.

I don't remember the discussion like this quite well, but anyhow ...

> >>
> >>> I have some comments on making this code easier to maintain..
> >>>
> >> Thanks.
> >>
> >> I have implemented your review comments and have archived the code in
> >>
> >> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
> >>
> >> I will be posting the next version only when someone complains about
> >> ARM64 kdump behavior that it is not as fast as x86.
> 
> > On our ARM64-based platform we have very long main kernel-secondary kernel
> > switch time.
> > 
> > This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools
> > version), we can get ~25x speedup, with this patch secondary kernel boots in ~3
> > seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes about
> > 75 seconds.
> 
> This is slow because its generating a checksum of the kernel without the benefit
> of the caches. This series generated page tables so that it could enable the MMU
> and caches. But, the purgatory code also needs to be a simple as possible
> because its practically impossible to debug.

Not impossible, but I admit that I occasionally had hard time in debugging.

> The purgatory code does this checksum-ing because its worried the panic() was
> because the kernel cause some memory corruption, and that memory corruption may
> have affected the kdump kernel too.
> 
> This can't happen on arm64 as we unmap kdump's crash region, so not even the
> kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash
> dump kernel memory") has all the details.
> 
> (we also needed to do this to avoid the risk of mismatched memory attributes if
> kdump boots and some CPUs are still stuck in the old kernel)
> 
> 
> > When do you plan merge this patch?
> 
> We ended up with the check-summing code because its the default behaviour of
> kexec-tools on other architectures.
> 
> One alternative is to rip it out for arm64. Untested:

Thanks for the patch. This eventually eliminates "reason d'etre" of
purgatory on arm64 as I does in my kexec_file patch, although it would
require a small re-work.

-Takahiro AKASHI


> --------------------%<--------------------
> diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
> index 636abea..f10c148 100644
> --- a/purgatory/arch/arm64/Makefile
> +++ b/purgatory/arch/arm64/Makefile
> @@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
>         -Werror-implicit-function-declaration \
>         -Wdeclaration-after-statement \
>         -Werror=implicit-int \
> -       -Werror=strict-prototypes
> +       -Werror=strict-prototypes \
> +       -DNO_SHA_IN_PURGATORY
> 
>  arm64_PURGATORY_SRCS += \
>         purgatory/arch/arm64/entry.S \
> diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
> index 3bbcc09..44e792a 100644
> --- a/purgatory/purgatory.c
> +++ b/purgatory/purgatory.c
> @@ -9,6 +9,8 @@
>  struct sha256_region sha256_regions[SHA256_REGIONS] = {};
>  sha256_digest_t sha256_digest = { };
> 
> +#ifndef NO_SHA_IN_PURGATORY
> +
>  int verify_sha256_digest(void)
>  {
>         struct sha256_region *ptr, *end;
> @@ -39,14 +41,18 @@ int verify_sha256_digest(void)
>         return 0;
>  }
> 
> +#endif /* NO_SHA_IN_PURGATORY */
> +
>  void purgatory(void)
>  {
>         printf("I'm in purgatory\n");
>         setup_arch();
> +#ifndef NO_SHA_IN_PURGATORY
>         if (verify_sha256_digest()) {
>                 for(;;) {
>                         /* loop forever */
>                 }
>         }
> +#endif /* NO_SHA_IN_PURGATORY */
>         post_verification_setup_arch();
>  }
> --------------------%<--------------------
> 
> 
> Thanks,
> 
> James
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2018-04-04 13:28         ` James Morse
  2018-04-04 22:55           ` Ard Biesheuvel
  2018-04-05  1:47           ` AKASHI Takahiro
@ 2018-04-06 18:15           ` Kostiantyn Iarmak
  2018-04-11 15:59           ` Geoff Levand
  3 siblings, 0 replies; 20+ messages in thread
From: Kostiantyn Iarmak @ 2018-04-06 18:15 UTC (permalink / raw)
  To: linux-arm-kernel


On 04.04.18 16:28, James Morse wrote:
> Hi Kostiantyn,
>
> On 04/04/18 13:45, Kostiantyn Iarmak wrote:
>> From: Pratyush Anand <panand@redhat.com>
>>> Date: Fri, Jun 2, 2017 at 5:42 PM
>>> Subject: Re: [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
>>> To: James Morse <james.morse@arm.com>
>>> Cc: mark.rutland at arm.com, bhe at redhat.com, kexec at lists.infradead.org,
>>> horms at verge.net.au, dyoung at redhat.com,
>>> linux-arm-kernel at lists.infradead.org
>>>
>>> On Friday 02 June 2017 01:53 PM, James Morse wrote:
>>>> On 23/05/17 06:02, Pratyush Anand wrote:
>>>>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>>>>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>>>>> even when we have -O2 optimization enabled. However, if dcache is enabled
>>>>> during purgatory execution then, it takes just a second in SHA
>>>>> verification.
>>>>>
>>>>> Therefore, these patches adds support for dcache enabling facility during
>>>>> purgatory execution.
>>>> I'm still not convinced we need this. Moving the SHA verification to happen
>>>> before the dcache+mmu are disabled would also solve the delay problem,
>>> Humm..I am not sure, if we can do that.
>>> When we leave kernel (and enter into purgatory), icache+dcache+mmu are
>>> already disabled. I think, that would be possible when we will be in a
>>> position to use in-kernel purgatory.
>>>
>>>> and we
>>>> can print an error message or fail the syscall.
>>>>
>>>> For kexec we don't expect memory corruption, what are we testing for?
>>>> I can see the use for kdump, but the kdump-kernel is unmapped so the kernel
>>>> can't accidentally write over it.
>>>>
>>>> (we discussed all this last time, but it fizzled-out. If you and the
>>>>    kexec-tools maintainer think its necessary, fine by me!)
>>> Yes, there had already been discussion and MAINTAINERs have
>>> discouraged none-purgatory implementation.
>>>
>>>> I have some comments on making this code easier to maintain..
>>>>
>>> Thanks.
>>>
>>> I have implemented your review comments and have archived the code in
>>>
>>> https://github.com/pratyushanand/kexec-tools.git : purgatory-enable-dcache
>>>
>>> I will be posting the next version only when someone complains about
>>> ARM64 kdump behavior that it is not as fast as x86.
>> On our ARM64-based platform we have very long main kernel-secondary kernel
>> switch time.
>>
>> This patch set fixes the issue (we are using 4.4 kernel and 2.0.13 kexec-tools
>> version), we can get ~25x speedup, with this patch secondary kernel boots in ~3
>> seconds while on 2.0.13-2.0.16 kexec-tools without this patch switch takes about
>> 75 seconds.
> This is slow because its generating a checksum of the kernel without the benefit
> of the caches. This series generated page tables so that it could enable the MMU
> and caches. But, the purgatory code also needs to be a simple as possible
> because its practically impossible to debug.
>
> The purgatory code does this checksum-ing because its worried the panic() was
> because the kernel cause some memory corruption, and that memory corruption may
> have affected the kdump kernel too.
>
> This can't happen on arm64 as we unmap kdump's crash region, so not even the
> kernel can accidentally write to it. 98d2e1539b84 ("arm64: kdump: protect crash
> dump kernel memory") has all the details.
>
> (we also needed to do this to avoid the risk of mismatched memory attributes if
> kdump boots and some CPUs are still stuck in the old kernel)
>
>
>> When do you plan merge this patch?
> We ended up with the check-summing code because its the default behaviour of
> kexec-tools on other architectures.
>
> One alternative is to rip it out for arm64. Untested:
> --------------------%<--------------------
> diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
> index 636abea..f10c148 100644
> --- a/purgatory/arch/arm64/Makefile
> +++ b/purgatory/arch/arm64/Makefile
> @@ -7,7 +7,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
>          -Werror-implicit-function-declaration \
>          -Wdeclaration-after-statement \
>          -Werror=implicit-int \
> -       -Werror=strict-prototypes
> +       -Werror=strict-prototypes \
> +       -DNO_SHA_IN_PURGATORY
>
>   arm64_PURGATORY_SRCS += \
>          purgatory/arch/arm64/entry.S \
> diff --git a/purgatory/purgatory.c b/purgatory/purgatory.c
> index 3bbcc09..44e792a 100644
> --- a/purgatory/purgatory.c
> +++ b/purgatory/purgatory.c
> @@ -9,6 +9,8 @@
>   struct sha256_region sha256_regions[SHA256_REGIONS] = {};
>   sha256_digest_t sha256_digest = { };
>
> +#ifndef NO_SHA_IN_PURGATORY
> +
>   int verify_sha256_digest(void)
>   {
>          struct sha256_region *ptr, *end;
> @@ -39,14 +41,18 @@ int verify_sha256_digest(void)
>          return 0;
>   }
>
> +#endif /* NO_SHA_IN_PURGATORY */
> +
>   void purgatory(void)
>   {
>          printf("I'm in purgatory\n");
>          setup_arch();
> +#ifndef NO_SHA_IN_PURGATORY
>          if (verify_sha256_digest()) {
>                  for(;;) {
>                          /* loop forever */
>                  }
>          }
> +#endif /* NO_SHA_IN_PURGATORY */
>          post_verification_setup_arch();
>   }
> --------------------%<--------------------

Thank you, I've tested this patch (no issues).

>
> Thanks,
>
> James

-- 
Best Regards,

   Kostiantyn Iarmak.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory
  2018-04-04 13:28         ` James Morse
                             ` (2 preceding siblings ...)
  2018-04-06 18:15           ` Kostiantyn Iarmak
@ 2018-04-11 15:59           ` Geoff Levand
  3 siblings, 0 replies; 20+ messages in thread
From: Geoff Levand @ 2018-04-11 15:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/04/2018 06:28 AM, James Morse wrote:
> We ended up with the check-summing code because its the default behaviour of
> kexec-tools on other architectures.
> 
> One alternative is to rip it out for arm64.

Or add arm64 support to kexec-lite:

  https://github.com/antonblanchard/kexec-lite

Or accept my bypass patch:

  http://lists.infradead.org/pipermail/kexec/2015-October/014573.html

-Geoff

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-04-11 15:59 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-23  5:02 [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory Pratyush Anand
2017-05-23  5:02 ` [PATCH v3 1/2] kexec: arm64: create identity page table to be used " Pratyush Anand
2017-06-02  8:24   ` James Morse
2017-06-02 14:29     ` Pratyush Anand
2017-05-23  5:02 ` [PATCH v3 2/2] arm64: enable d-cache support during purgatory sha verification Pratyush Anand
2017-06-02  8:23 ` [PATCH v3 0/2] kexec-tools: arm64: Enable D-cache in purgatory James Morse
2017-06-02  9:55   ` Ard Biesheuvel
2017-06-02 11:15     ` Bhupesh SHARMA
2017-06-02 11:36       ` Ard Biesheuvel
2017-06-02 11:44         ` Ard Biesheuvel
2017-06-02 14:32           ` Pratyush Anand
2017-06-02 16:36       ` James Morse
2017-06-02 14:42   ` Pratyush Anand
     [not found]     ` <CAB=otbT23ySN4VC6G0sBKF5p4SvsnG8PS-C_beBgn8YJUsbw9Q@mail.gmail.com>
2018-04-04 12:45       ` Kostiantyn Iarmak
2018-04-04 13:04         ` Kostiantyn Iarmak
2018-04-04 13:28         ` James Morse
2018-04-04 22:55           ` Ard Biesheuvel
2018-04-05  1:47           ` AKASHI Takahiro
2018-04-06 18:15           ` Kostiantyn Iarmak
2018-04-11 15:59           ` Geoff Levand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).