Kexec Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v9 01/18] arm64: kexec: make dtb_mem always enabled
       [not found] ` <20200326032420.27220-2-pasha.tatashin@soleen.com>
@ 2020-04-29 17:00   ` James Morse
  2021-01-23  0:17     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:00 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Currently, dtb_mem is enabled only when CONFIG_KEXEC_FILE is
> enabled. This adds ugly ifdefs to c files.

~s/dtb_mem/ARCH_HAS_KIMAGE_ARCH/ ?
dtb_mem is just one member of struct kimage_arch.


> Always enabled dtb_mem, when it is not used, it is NULL.
> Change the dtb_mem to phys_addr_t, as it is a physical address.

Regardless,
Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 02/18] arm64: hibernate: move page handling function to new trans_pgd.c
       [not found] ` <20200326032420.27220-3-pasha.tatashin@soleen.com>
@ 2020-04-29 17:00   ` James Morse
  2021-01-23  0:18     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:00 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Now, that we abstracted the required functions move them to a new home.
> Later, we will generalize these function in order to be useful outside
> of hibernation.

Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 03/18] arm64: trans_pgd: make trans_pgd_map_page generic
       [not found] ` <20200326032420.27220-4-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  2021-01-22 21:52     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> kexec is going to use a different allocator, so make

> trans_pgd_map_page to accept allocator as an argument, and also
> kexec is going to use a different map protection, so also pass
> it via argument.

This trans_pgd_map_page() used to be create_single_mapping() It creates page tables that
map one page: the relocation code.

Why do you need a different pgprot? Surely PAGE_KERNEL_EXEC is exactly what you want.


> diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h
> index 23153c13d1ce..ad5194ad178d 100644
> --- a/arch/arm64/include/asm/trans_pgd.h
> +++ b/arch/arm64/include/asm/trans_pgd.h
> @@ -12,10 +12,24 @@
>  #include <linux/types.h>
>  #include <asm/pgtable-types.h>
>  
> +/*
> + * trans_alloc_page
> + *	- Allocator that should return exactly one zeroed page, if this
> + *	 allocator fails, trans_pgd returns -ENOMEM error.

trans_pgd is what you pass in to trans_pgd_map_page() or trans_pgd_create_copy().
Do you mean what those functions return?


> + *
> + * trans_alloc_arg
> + *	- Passed to trans_alloc_page as an argument
> + */

> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 3d6f0fd73591..607bb1fbc349 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -195,6 +200,11 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  				 unsigned long dst_addr,
>  				 phys_addr_t *phys_dst_addr)
>  {
> +	struct trans_pgd_info trans_info = {
> +		.trans_alloc_page	= hibernate_page_alloc,
> +		.trans_alloc_arg	= (void *)GFP_ATOMIC,
> +	};

As you need another copy of this in the next patch, is it worth declaring this globally
and making it const?


> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
> index d20e48520cef..275a79935d7e 100644
> --- a/arch/arm64/mm/trans_pgd.c
> +++ b/arch/arm64/mm/trans_pgd.c
> @@ -180,8 +185,18 @@ int trans_pgd_create_copy(pgd_t **dst_pgdp, unsigned long start,
>  	return rc;
>  }
>  
> -int trans_pgd_map_page(pgd_t *trans_pgd, void *page, unsigned long dst_addr,
> -		       pgprot_t pgprot)
> +/*
> + * Add map entry to trans_pgd for a base-size page at PTE level.
> + * info:	contains allocator and its argument
> + * trans_pgd:	page table in which new map is added.
> + * page:	page to be mapped.

> + * dst_addr:	new VA address for the pages

~s/pages/page/

This thing only maps one page.


> + * pgprot:	protection for the page.
> + *
> + * Returns 0 on success, and -ENOMEM on failure.
> + */
> +int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
> +		       void *page, unsigned long dst_addr, pgprot_t pgprot)
>  {
>  	pgd_t *pgdp;
>  	pud_t *pudp;



Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 04/18] arm64: trans_pgd: pass allocator trans_pgd_create_copy
       [not found] ` <20200326032420.27220-5-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  2021-01-23  0:20     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Make trans_pgd_create_copy and its subroutines to use allocator that is
> passed as an argument

Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 05/18] arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions
       [not found] ` <20200326032420.27220-6-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  2021-01-23  0:22     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> trans_pgd_* should be independent from mm context because the tables that
> are created by this code are used when there are no mm context around, as
> it is between kernels. Simply replace mm_init's with NULL.

arm64's p?d_populate() helpers don't use the mm parameter, so it doesn't make any
difference. This was originally done so that if we ever needed anything from the mm, we
didn't get a NULL dereference or EL0 behaviour due to a future '!= &init_mm'.

If you think it matters,
Acked-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 07/18] arm64: trans_pgd: hibernate: idmap the single page that holds the copy page routines
       [not found] ` <20200326032420.27220-8-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  2021-01-23  0:35     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> From: James Morse <james.morse@arm.com>
> 
> To resume from hibernate, the contents of memory are restored from
> the swap image. This may overwrite any page, including the running
> kernel and its page tables.
> 
> Hibernate copies the code it uses to do the restore into a single
> page that it knows won't be overwritten, and maps it with page tables
> built from pages that won't be overwritten.
> 
> Today the address it uses for this mapping is arbitrary, but to allow
> kexec to reuse this code, it needs to be idmapped. To idmap the page
> we must avoid the kernel helpers that have VA_BITS baked in.
> 
> Convert create_single_mapping() to take a single PA, and idmap it.
> The page tables are built in the reverse order to normal using
> pfn_pte() to stir in any bits between 52:48. T0SZ is always increased
> to cover 48bits, or 52 if the copy code has bits 52:48 in its PA.
> 
> Pasha: The original patch from James
> inux-arm-kernel/20200115143322.214247-4-james.morse@arm.com

-EBROKENLINK

The convention is to use a 'Link:' tag in the signed-off area.
e.g. 5a3577039cbe

> Adopted it to trans_pgd, so it can be commonly used by both Kexec
> and Hibernate. Some minor clean-ups.

Please describe your changes just before your SoB. This means each author sign's off on
the stuff above their SoB, and its obvious who made which changes.

Search for 'Lucky K Maintainer' in process/submitting-patches.rst for an example.


> diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h
> index 97a7ea73b289..4912d3caf0ca 100644
> --- a/arch/arm64/include/asm/trans_pgd.h
> +++ b/arch/arm64/include/asm/trans_pgd.h
> @@ -32,4 +32,7 @@ int trans_pgd_create_copy(struct trans_pgd_info *info, pgd_t **trans_pgd,
>  int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
>  		       void *page, unsigned long dst_addr, pgprot_t pgprot);

This trans_pgd_map_page() used to be create_single_mapping(), which is where the original
patch made its changes.

You should only need one of these, not both.


> +int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
> +			 unsigned long *t0sz, void *page);
> +
>  #endif /* _ASM_TRANS_TABLE_H */

> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
> index 37d7d1c60f65..c2517d1af2af 100644
> --- a/arch/arm64/mm/trans_pgd.c
> +++ b/arch/arm64/mm/trans_pgd.c
> @@ -242,3 +242,52 @@ int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
>  
>  	return 0;
>  }
> +
> +/*
> + * The page we want to idmap may be outside the range covered by VA_BITS that
> + * can be built using the kernel's p?d_populate() helpers. As a one off, for a
> + * single page, we build these page tables bottom up and just assume that will
> + * need the maximum T0SZ.
> + *
> + * Returns 0 on success, and -ENOMEM on failure.
> + * On success trans_ttbr0 contains page table with idmapped page, t0sz is set to

> + * maxumum T0SZ for this page.

maxumum

> + */


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 08/18] arm64: kexec: move relocation function setup
       [not found] ` <20200326032420.27220-9-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  2021-01-23  1:01     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Currently, kernel relocation function is configured in machine_kexec()
> at the time of kexec reboot by using control_code_page.
> 
> This operation, however, is more logical to be done during kexec_load,
> and thus remove from reboot time. Move, setup of this function to
> newly added machine_kexec_post_load().

This would avoid the need to special-case the cache maintenance, so its a good cleanup...


> Because once MMU is enabled, kexec control page will contain more than
> relocation kernel, but also vector table, add pointer to the actual
> function within this page arch.kern_reloc. Currently, it equals to the
> beginning of page, we will add offsets later, when vector table is
> added.

If the vector table always comes second, wouldn't this be extra work to hold the value 0?
You can control the layout of this relocation code, as it has to be written in assembly.
I don't get why this would be necessary.


> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index ae1bad0156cd..ec71a153cc2d 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -58,6 +59,17 @@ void machine_kexec_cleanup(struct kimage *kimage)
>  	/* Empty routine needed to avoid build errors. */
>  }
>  
> +int machine_kexec_post_load(struct kimage *kimage)
> +{
> +	void *reloc_code = page_to_virt(kimage->control_code_page);
> +
> +	memcpy(reloc_code, arm64_relocate_new_kernel,
> +	       arm64_relocate_new_kernel_size);
> +	kimage->arch.kern_reloc = __pa(reloc_code);

Could we move the two cache maintenance calls for this area in here too. Keeping it next
to the modification makes it clearer why it is required.

In this case we can use flush_icache_range() instead of its __variant because this now
happens much earlier.


> +	return 0;
> +}

Regardless,
Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 09/18] arm64: kexec: call kexec_image_info only once
       [not found] ` <20200326032420.27220-10-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Currently, kexec_image_info() is called during load time, and
> right before kernel is being kexec'ed. There is no need to do both.

I think the original logic was if debugging, you'd see the load-time value in dmesg, and
the kexec-time values when the machine panic()d and you hadn't made a note of the previous
set. But I'm not sure why you'd have these debug messages enabled unless you were
debugging kexec.


> So, call it only once when segments are loaded and the physical
> location of page with copy of arm64_relocate_new_kernel is known.

Sure, keep the earlier version that is more likely to be seen.

Acked-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 10/18] arm64: kexec: cpu_soft_restart change argument types
       [not found] ` <20200326032420.27220-11-pasha.tatashin@soleen.com>
@ 2020-04-29 17:01   ` James Morse
  2021-01-23  1:14     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-04-29 17:01 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Change argument types from unsigned long to a more descriptive
> phys_addr_t.

For 'entry', which is a physical addresses, sure...

> diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> index ed50e9587ad8..38cbd4068019 100644
> --- a/arch/arm64/kernel/cpu-reset.h
> +++ b/arch/arm64/kernel/cpu-reset.h
> @@ -10,17 +10,17 @@
>  
>  #include <asm/virt.h>
>  
> -void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
> -	unsigned long arg0, unsigned long arg1, unsigned long arg2);

> +void __cpu_soft_restart(phys_addr_t el2_switch, phys_addr_t entry,
> +			phys_addr_t arg0, phys_addr_t arg1, phys_addr_t arg2);

This looks weird because its re-using the hyp-stub API, because it might call the hyp-stub
from the idmap. entry is passed in, so this isn't tied to kexec. Without tying it to
kexec, how do you know arg2 is a physical address?
I think it tried to be re-usable because 32bit has more users for this.

arg0-2 are unsigned long because the hyp-stub is just moving general purpose registers around.

This is to avoid casting?
Sure, its only got one caller. This thing evolved because the platform-has-EL2 and
kdump-while-KVM-was-running code was bolted on as they were discovered.


> -static inline void __noreturn cpu_soft_restart(unsigned long entry,
> -					       unsigned long arg0,
> -					       unsigned long arg1,
> -					       unsigned long arg2)
> +static inline void __noreturn cpu_soft_restart(phys_addr_t entry,
> +					       phys_addr_t arg0,
> +					       phys_addr_t arg1,
> +					       phys_addr_t arg2)
>  {
>  	typeof(__cpu_soft_restart) *restart;
>  
> -	unsigned long el2_switch = !is_kernel_in_hyp_mode() &&
> +	phys_addr_t el2_switch = !is_kernel_in_hyp_mode() &&
>  		is_hyp_mode_available();

What on earth happened here!?


>  	restart = (void *)__pa_symbol(__cpu_soft_restart);


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 15/18] arm64: kexec: kexec EL2 vectors
       [not found] ` <20200326032420.27220-16-pasha.tatashin@soleen.com>
@ 2020-04-29 17:35   ` Marc Zyngier
  2021-01-25 19:07     ` Pavel Tatashin
  2020-05-07 16:21   ` James Morse
  1 sibling, 1 reply; 26+ messages in thread
From: Marc Zyngier @ 2020-04-29 17:35 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, selindag, jmorris,
	linux-mm, james.morse, ebiederm, matthias.bgg, rfontana, will,
	tglx, linux-arm-kernel

On 2020-03-26 03:24, Pavel Tatashin wrote:
> If we have a EL2 mode without VHE, the EL2 vectors are needed in order
> to switch to EL2 and jump to new world with hyperivsor privileges.
> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
>  arch/arm64/include/asm/kexec.h      |  5 +++++
>  arch/arm64/kernel/asm-offsets.c     |  1 +
>  arch/arm64/kernel/machine_kexec.c   |  5 +++++
>  arch/arm64/kernel/relocate_kernel.S | 35 +++++++++++++++++++++++++++++
>  4 files changed, 46 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kexec.h 
> b/arch/arm64/include/asm/kexec.h
> index d944c2e289b2..0f758fd51518 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -95,6 +95,7 @@ static inline void crash_post_resume(void) {}
>  extern const unsigned long kexec_relocate_code_size;
>  extern const unsigned char kexec_relocate_code_start[];
>  extern const unsigned long kexec_kern_reloc_offset;
> +extern const unsigned long kexec_el2_vectors_offset;
>  #endif
> 
>  /*
> @@ -104,6 +105,9 @@ extern const unsigned long kexec_kern_reloc_offset;
>   *		kernel, or purgatory entry address).
>   * kern_arg0	first argument to kernel is its dtb address. The other
>   *		arguments are currently unused, and must be set to 0
> + * el2_vector	If present means that relocation routine will go to EL1
> + *		from EL2 to do the copy, and then back to EL2 to do the jump
> + *		to new world.
>   */
>  struct kern_reloc_arg {
>  	phys_addr_t head;
> @@ -112,6 +116,7 @@ struct kern_reloc_arg {
>  	phys_addr_t kern_arg1;
>  	phys_addr_t kern_arg2;
>  	phys_addr_t kern_arg3;
> +	phys_addr_t el2_vector;
>  };
> 
>  #define ARCH_HAS_KIMAGE_ARCH
> diff --git a/arch/arm64/kernel/asm-offsets.c 
> b/arch/arm64/kernel/asm-offsets.c
> index 448230684749..ff974b648347 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -136,6 +136,7 @@ int main(void)
>    DEFINE(KEXEC_KRELOC_KERN_ARG1,	offsetof(struct kern_reloc_arg, 
> kern_arg1));
>    DEFINE(KEXEC_KRELOC_KERN_ARG2,	offsetof(struct kern_reloc_arg, 
> kern_arg2));
>    DEFINE(KEXEC_KRELOC_KERN_ARG3,	offsetof(struct kern_reloc_arg, 
> kern_arg3));
> +  DEFINE(KEXEC_KRELOC_EL2_VECTOR,	offsetof(struct kern_reloc_arg, 
> el2_vector));
>  #endif
>    return 0;
>  }
> diff --git a/arch/arm64/kernel/machine_kexec.c
> b/arch/arm64/kernel/machine_kexec.c
> index ab571fca9bd1..bd398def7627 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -84,6 +84,11 @@ int machine_kexec_post_load(struct kimage *kimage)
>  	kern_reloc_arg->head = kimage->head;
>  	kern_reloc_arg->entry_addr = kimage->start;
>  	kern_reloc_arg->kern_arg0 = kimage->arch.dtb_mem;
> +	/* Setup vector table only when EL2 is available, but no VHE */
> +	if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) {
> +		kern_reloc_arg->el2_vector = __pa(reloc_code)
> +						+ kexec_el2_vectors_offset;
> +	}
>  	kexec_image_info(kimage);
> 
>  	return 0;
> diff --git a/arch/arm64/kernel/relocate_kernel.S
> b/arch/arm64/kernel/relocate_kernel.S
> index aa9f2b2cd77c..6fd2fc0ef373 100644
> --- a/arch/arm64/kernel/relocate_kernel.S
> +++ b/arch/arm64/kernel/relocate_kernel.S
> @@ -89,6 +89,38 @@ ENTRY(arm64_relocate_new_kernel)
>  .ltorg
>  END(arm64_relocate_new_kernel)
> 
> +.macro el1_sync_64
> +	br	x4			/* Jump to new world from el2 */
> +	.fill 31, 4, 0			/* Set other 31 instr to zeroes */
> +.endm

The common idiom to write this is to align the beginning of the
macro, and not to bother about what follows:

.macro whatever
         .align 7
         br      x4
.endm

Specially given that 0 is an undefined instruction, and I really hate to 
see
those in the actual text. On the contrary, .align generates NOPs.

> +
> +.macro invalid_vector label
> +\label:
> +	b \label
> +	.fill 31, 4, 0			/* Set other 31 instr to zeroes */
> +.endm
> +
> +/* el2 vectors - switch el2 here while we restore the memory image. */
> +	.align 11
> +ENTRY(kexec_el2_vectors)

Please see commit 617a2f392c92 ("arm64: kvm: Annotate assembly using 
modern
annoations"), and follow the same pattern.

> +	invalid_vector el2_sync_invalid_sp0	/* Synchronous EL2t */
> +	invalid_vector el2_irq_invalid_sp0	/* IRQ EL2t */
> +	invalid_vector el2_fiq_invalid_sp0	/* FIQ EL2t */
> +	invalid_vector el2_error_invalid_sp0	/* Error EL2t */
> +	invalid_vector el2_sync_invalid_spx	/* Synchronous EL2h */
> +	invalid_vector el2_irq_invalid_spx	/* IRQ EL2h */
> +	invalid_vector el2_fiq_invalid_spx	/* FIQ EL2h */
> +	invalid_vector el2_error_invalid_spx	/* Error EL2h */
> +		el1_sync_64			/* Synchronous 64-bit EL1 */
> +	invalid_vector el1_irq_invalid_64	/* IRQ 64-bit EL1 */
> +	invalid_vector el1_fiq_invalid_64	/* FIQ 64-bit EL1 */
> +	invalid_vector el1_error_invalid_64	/* Error 64-bit EL1 */
> +	invalid_vector el1_sync_invalid_32	/* Synchronous 32-bit EL1 */
> +	invalid_vector el1_irq_invalid_32	/* IRQ 32-bit EL1 */
> +	invalid_vector el1_fiq_invalid_32	/* FIQ 32-bit EL1 */
> +	invalid_vector el1_error_invalid_32	/* Error 32-bit EL1 */
> +END(kexec_el2_vectors)

Please write the vectors in 4 groups of 4, as this makes it a lot easier
to follow what is what.

> +
>  .Lkexec_relocate_code_end:
>  .org	KEXEC_CONTROL_PAGE_SIZE
>  .align 3	/* To keep the 64-bit values below naturally aligned. */
> @@ -102,3 +134,6 @@ kexec_relocate_code_size:
>  .globl kexec_kern_reloc_offset
>  kexec_kern_reloc_offset:
>  	.quad	arm64_relocate_new_kernel - kexec_relocate_code_start
> +.globl kexec_el2_vectors_offset
> +kexec_el2_vectors_offset:
> +	.quad	kexec_el2_vectors - kexec_relocate_code_start

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 15/18] arm64: kexec: kexec EL2 vectors
       [not found] ` <20200326032420.27220-16-pasha.tatashin@soleen.com>
  2020-04-29 17:35   ` [PATCH v9 15/18] arm64: kexec: kexec EL2 vectors Marc Zyngier
@ 2020-05-07 16:21   ` James Morse
  1 sibling, 0 replies; 26+ messages in thread
From: James Morse @ 2020-05-07 16:21 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

What happened to the subject?
(it really needs a verb to make any sense)

On 26/03/2020 03:24, Pavel Tatashin wrote:
> If we have a EL2 mode without VHE, the EL2 vectors are needed in order
> to switch to EL2 and jump to new world with hyperivsor privileges.

Yes, but the hyp-stub has an API to let you do this... but you need your own version.

Could you explain why in the commit message?

(spelling: hyperivsor)


> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index ab571fca9bd1..bd398def7627 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -84,6 +84,11 @@ int machine_kexec_post_load(struct kimage *kimage)
>  	kern_reloc_arg->head = kimage->head;
>  	kern_reloc_arg->entry_addr = kimage->start;
>  	kern_reloc_arg->kern_arg0 = kimage->arch.dtb_mem;
> +	/* Setup vector table only when EL2 is available, but no VHE */
> +	if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) {
> +		kern_reloc_arg->el2_vector = __pa(reloc_code)
> +						+ kexec_el2_vectors_offset;
> +	}

Why does the asm relocation code need to know where the vector is? It must access it via HVC.




Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 11/18] arm64: kexec: arm64_relocate_new_kernel clean-ups
       [not found] ` <20200326032420.27220-12-pasha.tatashin@soleen.com>
@ 2020-05-07 16:22   ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2020-05-07 16:22 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Remove excessive empty lines from arm64_relocate_new_kernel.

To make it harder to read? Or just for the churn ...

> Also, use comments on the same lines with instructions where
> appropriate.

Churn,


> Change ENDPROC to END it never returns.

It might be more useful to convert this to the new style annotations, which should be a
separate patch. See Documentation/asm-annotations.rst


> copy_page(dest, src, tmps...)
> Increments dest and src by PAGE_SIZE, so no need to store dest
> prior to calling copy_page and increment it after. Also, src is not
> used after a copy, not need to copy either.

This bit sounds like cleanup, but I can't isolate it from the noise below....


> Call raw_dcache_line_size()  only when relocation is actually going to
> happen.

Why?

The pattern in this code is to setup register that don't change at the top, then do all
the work. I think this was an attempt to make it more readable.

Nothing branches back to that label, so this is fine, its just less obviously correct.


> Since '.align 3' is intended to align globals at the end of the file,
> move it there.


Please don't make noisy changes to whitespace and comments, its never worth it.


Thanks,

James


> diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> index c1d7db71a726..e9c974ea4717 100644
> --- a/arch/arm64/kernel/relocate_kernel.S
> +++ b/arch/arm64/kernel/relocate_kernel.S
> @@ -8,7 +8,6 @@
>  
>  #include <linux/kexec.h>
>  #include <linux/linkage.h>
> -
>  #include <asm/assembler.h>
>  #include <asm/kexec.h>
>  #include <asm/page.h>
> @@ -17,25 +16,21 @@
>  /*
>   * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
>   *
> - * The memory that the old kernel occupies may be overwritten when coping the
> + * The memory that the old kernel occupies may be overwritten when copying the
>   * new image to its final location.  To assure that the
>   * arm64_relocate_new_kernel routine which does that copy is not overwritten,
>   * all code and data needed by arm64_relocate_new_kernel must be between the
>   * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
>   * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
> - * control_code_page, a special page which has been set up to be preserved
> - * during the copy operation.
> + * safe memory that has been set up to be preserved during the copy operation.
>   */
>  ENTRY(arm64_relocate_new_kernel)
> -
>  	/* Setup the list loop variables. */
>  	mov	x18, x2				/* x18 = dtb address */
>  	mov	x17, x1				/* x17 = kimage_start */
>  	mov	x16, x0				/* x16 = kimage_head */
> -	raw_dcache_line_size x15, x0		/* x15 = dcache line size */
>  	mov	x14, xzr			/* x14 = entry ptr */
>  	mov	x13, xzr			/* x13 = copy dest */
> -
>  	/* Clear the sctlr_el2 flags. */
>  	mrs	x0, CurrentEL
>  	cmp	x0, #CurrentEL_EL2
> @@ -46,14 +41,11 @@ ENTRY(arm64_relocate_new_kernel)
>  	pre_disable_mmu_workaround
>  	msr	sctlr_el2, x0
>  	isb
> -1:
> -
> -	/* Check if the new image needs relocation. */
> +1:	/* Check if the new image needs relocation. */
>  	tbnz	x16, IND_DONE_BIT, .Ldone
> -
> +	raw_dcache_line_size x15, x1		/* x15 = dcache line size */
>  .Lloop:
>  	and	x12, x16, PAGE_MASK		/* x12 = addr */
> -
>  	/* Test the entry flags. */
>  .Ltest_source:
>  	tbz	x16, IND_SOURCE_BIT, .Ltest_indirection
> @@ -69,34 +61,18 @@ ENTRY(arm64_relocate_new_kernel)
>  	b.lo    2b
>  	dsb     sy
>  
> -	mov x20, x13
> -	mov x21, x12
> -	copy_page x20, x21, x0, x1, x2, x3, x4, x5, x6, x7
> -
> -	/* dest += PAGE_SIZE */
> -	add	x13, x13, PAGE_SIZE
> +	copy_page x13, x12, x0, x1, x2, x3, x4, x5, x6, x7
>  	b	.Lnext
> -
>  .Ltest_indirection:
>  	tbz	x16, IND_INDIRECTION_BIT, .Ltest_destination
> -
> -	/* ptr = addr */
> -	mov	x14, x12
> +	mov	x14, x12			/* ptr = addr */
>  	b	.Lnext
> -
>  .Ltest_destination:
>  	tbz	x16, IND_DESTINATION_BIT, .Lnext
> -
> -	/* dest = addr */
> -	mov	x13, x12
> -
> +	mov	x13, x12			/* dest = addr */
>  .Lnext:
> -	/* entry = *ptr++ */
> -	ldr	x16, [x14], #8
> -
> -	/* while (!(entry & DONE)) */
> -	tbz	x16, IND_DONE_BIT, .Lloop
> -
> +	ldr	x16, [x14], #8			/* entry = *ptr++ */
> +	tbz	x16, IND_DONE_BIT, .Lloop	/* while (!(entry & DONE)) */
>  .Ldone:
>  	/* wait for writes from copy_page to finish */
>  	dsb	nsh
> @@ -110,16 +86,12 @@ ENTRY(arm64_relocate_new_kernel)
>  	mov	x2, xzr
>  	mov	x3, xzr
>  	br	x17
> -
> -ENDPROC(arm64_relocate_new_kernel)
> -
>  .ltorg
> -
> -.align 3	/* To keep the 64-bit values below naturally aligned. */
> +END(arm64_relocate_new_kernel)
>  
>  .Lcopy_end:
>  .org	KEXEC_CONTROL_PAGE_SIZE
> -
> +.align 3	/* To keep the 64-bit values below naturally aligned. */
>  /*
>   * arm64_relocate_new_kernel_size - Number of bytes to copy to the
>   * control_code_page.
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 13/18] arm64: kexec: add expandable argument to relocation function
       [not found] ` <20200326032420.27220-14-pasha.tatashin@soleen.com>
@ 2020-05-07 16:22   ` James Morse
  2021-01-23  2:49     ` Pavel Tatashin
  0 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2020-05-07 16:22 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Currently, kexec relocation function (arm64_relocate_new_kernel) accepts
> the following arguments:
> 
> head:		start of array that contains relocation information.
> entry:		entry point for new kernel or purgatory.
> dtb_mem:	first and only argument to entry.

> The number of arguments cannot be easily expended, because this
> function is also called from HVC_SOFT_RESTART, which preserves only
> three arguments. And, also arm64_relocate_new_kernel is written in
> assembly but called without stack, thus no place to move extra
> arguments to free registers.
> 
> Soon, we will need to pass more arguments: once we enable MMU we
> will need to pass information about page tables.


> Another benefit of allowing this function to accept more arguments, is that
> kernel can actually accept up to 4 arguments (x0-x3), however currently
> only one is used, but if in the future we will need for more (for example,
> pass information about when previous kernel exited to have a precise
> measurement in time spent in purgatory), we won't be easilty do that
> if arm64_relocate_new_kernel can't accept more arguments.

This is a niche debug hack.
We really don't want an ABI with purgatory. I think the register values it gets were added
early for compatibility with kexec_file_load().


> So, add a new struct: kern_reloc_arg, and place it in kexec safe page (i.e
> memory that is not overwritten during relocation).
> Thus, make arm64_relocate_new_kernel to only take one argument, that
> contains all the needed information.

Do we really not have enough registers?

The PCS[0] gives you 8 arguments. In this patch you use 6.


If this is really about the hyp-stub abi, please state that.


> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index cee3be586384..b1122eea627e 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -59,13 +60,35 @@ void machine_kexec_cleanup(struct kimage *kimage)

>  int machine_kexec_post_load(struct kimage *kimage)
>  {
>  	void *reloc_code = page_to_virt(kimage->control_code_page);
> +	struct kern_reloc_arg *kern_reloc_arg = kexec_page_alloc(kimage);
> +
> +	if (!kern_reloc_arg)
> +		return -ENOMEM;
>  
>  	memcpy(reloc_code, arm64_relocate_new_kernel,
>  	       arm64_relocate_new_kernel_size);
>  	kimage->arch.kern_reloc = __pa(reloc_code);
> +	kimage->arch.kern_reloc_arg = __pa(kern_reloc_arg);
> +	kern_reloc_arg->head = kimage->head;
> +	kern_reloc_arg->entry_addr = kimage->start;
> +	kern_reloc_arg->kern_arg0 = kimage->arch.dtb_mem;

These kern_reloc_arg values are written via the cacheable linear map.
They are read in arm64_relocate_new_kernel() where the MMU is disabled an all memory
access are non-cacheable.

To ensure you read the values you wrote, you must clean kern_reloc_arg to the PoC.


>  	kexec_image_info(kimage);
>  
>  	return 0;Thanks,

James

[0]
https://developer.arm.com/docs/ihi0055/d/procedure-call-standard-for-the-arm-64-bit-architecture

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 12/18] arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp
       [not found] ` <20200326032420.27220-13-pasha.tatashin@soleen.com>
@ 2020-05-07 16:22   ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2020-05-07 16:22 UTC (permalink / raw)
  To: Pavel Tatashin, sashal
  Cc: mark.rutland, vladimir.murzin, corbet, catalin.marinas, bhsharma,
	steve.capper, kexec, linux-kernel, jmorris, linux-mm, rfontana,
	ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> x0 will contain the only argument to arm64_relocate_new_kernel; don't
> use it as a temp. Reassigned registers to free-up x0.

The missing bit of motivation is _why_ you need x0 keep its value until the end of this code.

With that covered,
Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 14/18] arm64: kexec: offset for relocation function
       [not found] ` <20200326032420.27220-15-pasha.tatashin@soleen.com>
@ 2020-05-07 16:22   ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2020-05-07 16:22 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Soon, relocation function will share the same page with EL2 vectors.

The EL2 vectors would only be executed with the MMU off, so they don't need to be mapped
anywhere in particular. (this is something hibernate probably does sloppily).


> Add offset within this page to arm64_relocate_new_kernel, and also
> the total size of relocation code which will include both the function
> and the EL2 vectors.

See arch/arm64/kernel/vmlinux.lds.S and sections.h for an example of how the idmap,
hibernate and the non-KVM hyp code do this.

If we're going to change this, I'd prefer it be the same as the other users...


> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 990185744148..d944c2e289b2 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -90,6 +90,13 @@ static inline void crash_prepare_suspend(void) {}
>  static inline void crash_post_resume(void) {}
>  #endif
>  
> +#if defined(CONFIG_KEXEC_CORE)
> +/* The beginning and size of relcation code to stage 2 kernel */
> +extern const unsigned long kexec_relocate_code_size;
> +extern const unsigned char kexec_relocate_code_start[];
> +extern const unsigned long kexec_kern_reloc_offset;
> +#endif

This makes these things visible globally ... but drops the arm64_ prefix!


> diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> index 22ccdcb106d3..aa9f2b2cd77c 100644
> --- a/arch/arm64/kernel/relocate_kernel.S
> +++ b/arch/arm64/kernel/relocate_kernel.S
> @@ -14,6 +14,9 @@
>  #include <asm/page.h>
>  #include <asm/sysreg.h>
>  
> +.globl kexec_relocate_code_start
> +kexec_relocate_code_start:

Which of the fancy new asm-annotations should this be?



> @@ -86,13 +89,16 @@ ENTRY(arm64_relocate_new_kernel)
>  .ltorg
>  END(arm64_relocate_new_kernel)
>  
> -.Lcopy_end:
> +.Lkexec_relocate_code_end:
>  .org	KEXEC_CONTROL_PAGE_SIZE
>  .align 3	/* To keep the 64-bit values below naturally aligned. */

>  /*
> - * arm64_relocate_new_kernel_size - Number of bytes to copy to the
> + * kexec_relocate_code_size - Number of bytes to copy to the
>   * control_code_page.
>   */
> -.globl arm64_relocate_new_kernel_size
> -arm64_relocate_new_kernel_size:
> -	.quad	.Lcopy_end - arm64_relocate_new_kernel
> +.globl kexec_relocate_code_size
> +kexec_relocate_code_size:
> +	.quad	.Lkexec_relocate_code_end - kexec_relocate_code_start
> +.globl kexec_kern_reloc_offset
> +kexec_kern_reloc_offset:
> +	.quad	arm64_relocate_new_kernel - kexec_relocate_code_start
> 

Can't we calculate this from the start/end markers? These variables held in assembly
generated code is pretty manky.


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 16/18] arm64: kexec: configure trans_pgd page table for kexec
       [not found] ` <20200326032420.27220-17-pasha.tatashin@soleen.com>
@ 2020-05-07 16:22   ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2020-05-07 16:22 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: sashal, mark.rutland, vladimir.murzin, corbet, catalin.marinas,
	bhsharma, steve.capper, kexec, linux-kernel, jmorris, linux-mm,
	rfontana, ebiederm, maz, matthias.bgg, tglx, will, selindag,
	linux-arm-kernel

Hi Pavel,

On 26/03/2020 03:24, Pavel Tatashin wrote:
> Configure a page table located in kexec-safe memory that has
> the following mappings:
> 
> 1. identity mapping for text of relocation function with executable
>    permission.
> 2. linear mappings for all source ranges
> 3. linear mappings for all destination ranges.

Its taken this long to work out your definition of linear here doesn't match the way the
rest of the arch code uses the term.

You are using the MMU to re-assemble the scattered kexec image in VA space, so that the
relocation code doesn't have to walk the list.

While its a cool trick, I don't think this is a good idea, it makes it much harder to
debug as we have a new definition for VA->PA, instead of re-using the kernels. We should
do the least surprising thing. The person debugging a problem's first assumptions should
be correct. Doing this means any debug information printed before kexec() is suddenly
useless for debugging a problem that occurs during relocation.

...

Let me hack together what I've been describing and we can discuss whether its simpler.
(most of next week is gone already though...)

(some Nits below)

> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 0f758fd51518..8f4332ac607a 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -108,6 +108,12 @@ extern const unsigned long kexec_el2_vectors_offset;
>   * el2_vector	If present means that relocation routine will go to EL1
>   *		from EL2 to do the copy, and then back to EL2 to do the jump
>   *		to new world.
> + * trans_ttbr0	idmap for relocation function and its argument
> + * trans_ttbr1	linear map for source/destination addresses.
> + * trans_t0sz	t0sz for idmap page in trans_ttbr0

You should be able to load the TTBR0_EL1 (and corresponding TCR_EL1.T0SZ) before kicking
off the relocation code. There should be no need to pass it in to assembly.

For example, hibernate sets TTBR0_EL1 in create_safe_exec_page().


> + * src_addr	linear map for source pages.
> + * dst_addr	linear map for destination pages.
> + * copy_len	Number of bytes that need to be copied
>   */
>  struct kern_reloc_arg {
>  	phys_addr_t head;

> @@ -70,10 +71,90 @@ static void *kexec_page_alloc(void *arg)
>  	return page_address(page);
>  }
>  
> +/*
> + * Map source segments starting from src_va, and map destination
> + * segments starting from dst_va, and return size of copy in
> + * *copy_len argument.
> + * Relocation function essentially needs to do:
> + * memcpy(dst_va, src_va, copy_len);
> + */
> +static int map_segments(struct kimage *kimage, pgd_t *pgdp,
> +			struct trans_pgd_info *info,
> +			unsigned long src_va,
> +			unsigned long dst_va,
> +			unsigned long *copy_len)
> +{
> +	unsigned long *ptr = 0;
> +	unsigned long dest = 0;
> +	unsigned long len = 0;
> +	unsigned long entry, addr;
> +	int rc;
> +
> +	for (entry = kimage->head; !(entry & IND_DONE); entry = *ptr++) {
> +		addr = entry & PAGE_MASK;
> +
> +		switch (entry & IND_FLAGS) {
> +		case IND_DESTINATION:
> +			dest = addr;
> +			break;

So we hope to always find a destination first?


> +		case IND_INDIRECTION:
> +			ptr = __va(addr);
> +			if (rc)
> +				return rc;

Where does rc come from?

> +			break;

> +		case IND_SOURCE:
> +			rc = trans_pgd_map_page(info, pgdp, __va(addr),
> +						src_va, PAGE_KERNEL);
> +			if (rc)
> +				return rc;
> +			rc = trans_pgd_map_page(info, pgdp, __va(dest),
> +						dst_va, PAGE_KERNEL);
> +			if (rc)
> +				return rc;
> +			dest += PAGE_SIZE;
> +			src_va += PAGE_SIZE;
> +			dst_va += PAGE_SIZE;
> +			len += PAGE_SIZE;
> +		}
> +	}
> +	*copy_len = len;
> +
> +	return 0;
> +}
> +
> @@ -89,9 +170,18 @@ int machine_kexec_post_load(struct kimage *kimage)
>  		kern_reloc_arg->el2_vector = __pa(reloc_code)
>  						+ kexec_el2_vectors_offset;
>  	}
> +
> +	/*
> +	 * If relocation is not needed, we do not need to enable MMU in

Strictly you aren't enabling it, but disabling it _after_ the relocation.


> +	 * relocation routine, therefore do not create page tables for
> +	 * scenarios such as crash kernel
> +	 */
> +	if (!(kimage->head & IND_DONE))
> +		rc = mmu_relocate_setup(kimage, reloc_code, kern_reloc_arg);
> +
>  	kexec_image_info(kimage);
>  
> -	return 0;
> +	return rc;
>  }


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 03/18] arm64: trans_pgd: make trans_pgd_map_page generic
  2020-04-29 17:01   ` [PATCH v9 03/18] arm64: trans_pgd: make trans_pgd_map_page generic James Morse
@ 2021-01-22 21:52     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-22 21:52 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Selin Dag, Linux ARM

Hi James,

I am working on an updated version of this patch series. We had back
and forth discussion on the list and off the list about MMU-enabled
series. So, I decided to sync the last series I had with the current
mainline. Address your last comments (those that I can address), and
send it again, so we can take a fresh look. I will reply to some of
your comments, as I address them in the synced version of my series.

On Wed, Apr 29, 2020 at 1:01 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > kexec is going to use a different allocator, so make
>
> > trans_pgd_map_page to accept allocator as an argument, and also
> > kexec is going to use a different map protection, so also pass
> > it via argument.
>
> This trans_pgd_map_page() used to be create_single_mapping() It creates page tables that
> map one page: the relocation code.
>
> Why do you need a different pgprot? Surely PAGE_KERNEL_EXEC is exactly what you want.

For hibernate case yes, but for MMU enabled kexec case, PAGE_KERNEL is
used, because it is used to copy data segments.

> > diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h
> > index 23153c13d1ce..ad5194ad178d 100644
> > --- a/arch/arm64/include/asm/trans_pgd.h
> > +++ b/arch/arm64/include/asm/trans_pgd.h
> > @@ -12,10 +12,24 @@
> >  #include <linux/types.h>
> >  #include <asm/pgtable-types.h>
> >
> > +/*
> > + * trans_alloc_page
> > + *   - Allocator that should return exactly one zeroed page, if this
> > + *    allocator fails, trans_pgd returns -ENOMEM error.
>
> trans_pgd is what you pass in to trans_pgd_map_page() or trans_pgd_create_copy().
> Do you mean what those functions return?

I meant to say trans_pgd_*, but I will change the comment to
explicitly say trans_pgd_map_page() and trans_pgd_create_copy() will
return -ENOMEM.

>
>
> > + *
> > + * trans_alloc_arg
> > + *   - Passed to trans_alloc_page as an argument
> > + */
>
> > diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> > index 3d6f0fd73591..607bb1fbc349 100644
> > --- a/arch/arm64/kernel/hibernate.c
> > +++ b/arch/arm64/kernel/hibernate.c
> > @@ -195,6 +200,11 @@ static int create_safe_exec_page(void *src_start, size_t length,
> >                                unsigned long dst_addr,
> >                                phys_addr_t *phys_dst_addr)
> >  {
> > +     struct trans_pgd_info trans_info = {
> > +             .trans_alloc_page       = hibernate_page_alloc,
> > +             .trans_alloc_arg        = (void *)GFP_ATOMIC,
> > +     };
>
> As you need another copy of this in the next patch, is it worth declaring this globally
> and making it const?

I think it is alright to have it on the stack instead of permanently
using the data section for this. Plus, we will have a different one
for the kexec case, so having this globally available will make it
strange.

>
>
> > diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
> > index d20e48520cef..275a79935d7e 100644
> > --- a/arch/arm64/mm/trans_pgd.c
> > +++ b/arch/arm64/mm/trans_pgd.c
> > @@ -180,8 +185,18 @@ int trans_pgd_create_copy(pgd_t **dst_pgdp, unsigned long start,
> >       return rc;
> >  }
> >
> > -int trans_pgd_map_page(pgd_t *trans_pgd, void *page, unsigned long dst_addr,
> > -                    pgprot_t pgprot)
> > +/*
> > + * Add map entry to trans_pgd for a base-size page at PTE level.
> > + * info:     contains allocator and its argument
> > + * trans_pgd:        page table in which new map is added.
> > + * page:     page to be mapped.
>
> > + * dst_addr: new VA address for the pages
>
> ~s/pages/page/
>
> This thing only maps one page.

Sure, I will change that.

Thank you,
Pasha

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 01/18] arm64: kexec: make dtb_mem always enabled
  2020-04-29 17:00   ` [PATCH v9 01/18] arm64: kexec: make dtb_mem always enabled James Morse
@ 2021-01-23  0:17     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  0:17 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Selin Dag, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Linux ARM

On Wed, Apr 29, 2020 at 1:00 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > Currently, dtb_mem is enabled only when CONFIG_KEXEC_FILE is
> > enabled. This adds ugly ifdefs to c files.
>
> ~s/dtb_mem/ARCH_HAS_KIMAGE_ARCH/ ?
> dtb_mem is just one member of struct kimage_arch.
>
>
> > Always enabled dtb_mem, when it is not used, it is NULL.
> > Change the dtb_mem to phys_addr_t, as it is a physical address.
>
> Regardless,
> Reviewed-by: James Morse <james.morse@arm.com>

Thank you.

Pasha

>
>
> Thanks,
>
> James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 02/18] arm64: hibernate: move page handling function to new trans_pgd.c
  2020-04-29 17:00   ` [PATCH v9 02/18] arm64: hibernate: move page handling function to new trans_pgd.c James Morse
@ 2021-01-23  0:18     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  0:18 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Selin Dag, Linux ARM

On Wed, Apr 29, 2020 at 1:00 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > Now, that we abstracted the required functions move them to a new home.
> > Later, we will generalize these function in order to be useful outside
> > of hibernation.
>
> Reviewed-by: James Morse <james.morse@arm.com>

Thank you,
Pasha

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 04/18] arm64: trans_pgd: pass allocator trans_pgd_create_copy
  2020-04-29 17:01   ` [PATCH v9 04/18] arm64: trans_pgd: pass allocator trans_pgd_create_copy James Morse
@ 2021-01-23  0:20     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  0:20 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Selin Dag, Linux ARM

On Wed, Apr 29, 2020 at 1:01 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > Make trans_pgd_create_copy and its subroutines to use allocator that is
> > passed as an argument
>
> Reviewed-by: James Morse <james.morse@arm.com>

Thank you,
Pasha

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 05/18] arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions
  2020-04-29 17:01   ` [PATCH v9 05/18] arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions James Morse
@ 2021-01-23  0:22     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  0:22 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Selin Dag, Linux ARM

> Acked-by: James Morse <james.morse@arm.com>

Thank you for the review.
Pasha

>
>
> Thanks,
>
> James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 07/18] arm64: trans_pgd: hibernate: idmap the single page that holds the copy page routines
  2020-04-29 17:01   ` [PATCH v9 07/18] arm64: trans_pgd: hibernate: idmap the single page that holds the copy page routines James Morse
@ 2021-01-23  0:35     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  0:35 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Selin Dag, Linux ARM

On Wed, Apr 29, 2020 at 1:01 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > From: James Morse <james.morse@arm.com>
> >
> > To resume from hibernate, the contents of memory are restored from
> > the swap image. This may overwrite any page, including the running
> > kernel and its page tables.
> >
> > Hibernate copies the code it uses to do the restore into a single
> > page that it knows won't be overwritten, and maps it with page tables
> > built from pages that won't be overwritten.
> >
> > Today the address it uses for this mapping is arbitrary, but to allow
> > kexec to reuse this code, it needs to be idmapped. To idmap the page
> > we must avoid the kernel helpers that have VA_BITS baked in.
> >
> > Convert create_single_mapping() to take a single PA, and idmap it.
> > The page tables are built in the reverse order to normal using
> > pfn_pte() to stir in any bits between 52:48. T0SZ is always increased
> > to cover 48bits, or 52 if the copy code has bits 52:48 in its PA.
> >
> > Pasha: The original patch from James
> > inux-arm-kernel/20200115143322.214247-4-james.morse@arm.com
>
> -EBROKENLINK
>
> The convention is to use a 'Link:' tag in the signed-off area.
> e.g. 5a3577039cbe

OK Fixed.

>
> > Adopted it to trans_pgd, so it can be commonly used by both Kexec
> > and Hibernate. Some minor clean-ups.
>
> Please describe your changes just before your SoB. This means each author sign's off on
> the stuff above their SoB, and its obvious who made which changes.
>
> Search for 'Lucky K Maintainer' in process/submitting-patches.rst for an example.

OK, Fixed.
eed the maximum T0SZ.
> > + *
> > + * Returns 0 on success, and -ENOMEM on failure.
> > + * On success trans_ttbr0 contains page table with idmapped page, t0sz is set to
>
> > + * maxumum T0SZ for this page.
>
> maxumum
>

Ok.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 08/18] arm64: kexec: move relocation function setup
  2020-04-29 17:01   ` [PATCH v9 08/18] arm64: kexec: move relocation function setup James Morse
@ 2021-01-23  1:01     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  1:01 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Selin Dag, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Linux ARM

On Wed, Apr 29, 2020 at 1:01 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > Currently, kernel relocation function is configured in machine_kexec()
> > at the time of kexec reboot by using control_code_page.
> >
> > This operation, however, is more logical to be done during kexec_load,
> > and thus remove from reboot time. Move, setup of this function to
> > newly added machine_kexec_post_load().
>
> This would avoid the need to special-case the cache maintenance, so its a good cleanup...

Yes, the computation should be moved from kexec-reboot to kexec-load
when possible.

>
>
> > Because once MMU is enabled, kexec control page will contain more than
> > relocation kernel, but also vector table, add pointer to the actual
> > function within this page arch.kern_reloc. Currently, it equals to the
> > beginning of page, we will add offsets later, when vector table is
> > added.
>
> If the vector table always comes second, wouldn't this be extra work to hold the value 0?
> You can control the layout of this relocation code, as it has to be written in assembly.
> I don't get why this would be necessary.
>
>
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index ae1bad0156cd..ec71a153cc2d 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -58,6 +59,17 @@ void machine_kexec_cleanup(struct kimage *kimage)
> >       /* Empty routine needed to avoid build errors. */
> >  }
> >
> > +int machine_kexec_post_load(struct kimage *kimage)
> > +{
> > +     void *reloc_code = page_to_virt(kimage->control_code_page);
> > +
> > +     memcpy(reloc_code, arm64_relocate_new_kernel,
> > +            arm64_relocate_new_kernel_size);
> > +     kimage->arch.kern_reloc = __pa(reloc_code);
>
> Could we move the two cache maintenance calls for this area in here too. Keeping it next
> to the modification makes it clearer why it is required.

Yes, I moved it.

>
> In this case we can use flush_icache_range() instead of its __variant because this now
> happens much earlier.

True.

>
>
> > +     return 0;
> > +}
>
> Regardless,
> Reviewed-by: James Morse <james.morse@arm.com>

Thank you for your review.

Pasha

>
>
> Thanks,
>
> James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 10/18] arm64: kexec: cpu_soft_restart change argument types
  2020-04-29 17:01   ` [PATCH v9 10/18] arm64: kexec: cpu_soft_restart change argument types James Morse
@ 2021-01-23  1:14     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  1:14 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Selin Dag, Linux ARM

On Wed, Apr 29, 2020 at 1:01 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > Change argument types from unsigned long to a more descriptive
> > phys_addr_t.
>
> For 'entry', which is a physical addresses, sure...
>
> > diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> > index ed50e9587ad8..38cbd4068019 100644
> > --- a/arch/arm64/kernel/cpu-reset.h
> > +++ b/arch/arm64/kernel/cpu-reset.h
> > @@ -10,17 +10,17 @@
> >
> >  #include <asm/virt.h>
> >
> > -void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
> > -     unsigned long arg0, unsigned long arg1, unsigned long arg2);
>
> > +void __cpu_soft_restart(phys_addr_t el2_switch, phys_addr_t entry,
> > +                     phys_addr_t arg0, phys_addr_t arg1, phys_addr_t arg2);
>
> This looks weird because its re-using the hyp-stub API, because it might call the hyp-stub
> from the idmap. entry is passed in, so this isn't tied to kexec. Without tying it to
> kexec, how do you know arg2 is a physical address?
> I think it tried to be re-usable because 32bit has more users for this.

I will drop this patch. It was intended as a cleanup from suggestions
in earlier versions of this series, but I see it is not really needed.

Thank you,
Pasha

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 13/18] arm64: kexec: add expandable argument to relocation function
  2020-05-07 16:22   ` [PATCH v9 13/18] arm64: kexec: add expandable argument to relocation function James Morse
@ 2021-01-23  2:49     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-23  2:49 UTC (permalink / raw)
  To: James Morse
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Selin Dag, steve.capper, kexec mailing list,
	LKML, James Morris, linux-mm, rfontana, Eric W. Biederman,
	Marc Zyngier, Matthias Brugger, Thomas Gleixner, Will Deacon,
	Linux ARM

On Thu, May 7, 2020 at 12:22 PM James Morse <james.morse@arm.com> wrote:
>
> Hi Pavel,
>
> On 26/03/2020 03:24, Pavel Tatashin wrote:
> > Currently, kexec relocation function (arm64_relocate_new_kernel) accepts
> > the following arguments:
> >
> > head:         start of array that contains relocation information.
> > entry:                entry point for new kernel or purgatory.
> > dtb_mem:      first and only argument to entry.
>
> > The number of arguments cannot be easily expended, because this
> > function is also called from HVC_SOFT_RESTART, which preserves only
> > three arguments. And, also arm64_relocate_new_kernel is written in
> > assembly but called without stack, thus no place to move extra
> > arguments to free registers.
> >
> > Soon, we will need to pass more arguments: once we enable MMU we
> > will need to pass information about page tables.
>
>
> > Another benefit of allowing this function to accept more arguments, is that
> > kernel can actually accept up to 4 arguments (x0-x3), however currently
> > only one is used, but if in the future we will need for more (for example,
> > pass information about when previous kernel exited to have a precise
> > measurement in time spent in purgatory), we won't be easilty do that
> > if arm64_relocate_new_kernel can't accept more arguments.
>
> This is a niche debug hack.
> We really don't want an ABI with purgatory. I think the register values it gets were added
> early for compatibility with kexec_file_load().
>
>
> > So, add a new struct: kern_reloc_arg, and place it in kexec safe page (i.e
> > memory that is not overwritten during relocation).
> > Thus, make arm64_relocate_new_kernel to only take one argument, that
> > contains all the needed information.
>
> Do we really not have enough registers?
>
> The PCS[0] gives you 8 arguments. In this patch you use 6.
>
>
> If this is really about the hyp-stub abi, please state that.

Yes, this is a hypervisor abi limitation. I will improve the commit
log to state it clearly.

> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index cee3be586384..b1122eea627e 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -59,13 +60,35 @@ void machine_kexec_cleanup(struct kimage *kimage)
>
> >  int machine_kexec_post_load(struct kimage *kimage)
> >  {
> >       void *reloc_code = page_to_virt(kimage->control_code_page);
> > +     struct kern_reloc_arg *kern_reloc_arg = kexec_page_alloc(kimage);
> > +
> > +     if (!kern_reloc_arg)
> > +             return -ENOMEM;
> >
> >       memcpy(reloc_code, arm64_relocate_new_kernel,
> >              arm64_relocate_new_kernel_size);
> >       kimage->arch.kern_reloc = __pa(reloc_code);
> > +     kimage->arch.kern_reloc_arg = __pa(kern_reloc_arg);
> > +     kern_reloc_arg->head = kimage->head;
> > +     kern_reloc_arg->entry_addr = kimage->start;
> > +     kern_reloc_arg->kern_arg0 = kimage->arch.dtb_mem;
>
> These kern_reloc_arg values are written via the cacheable linear map.
> They are read in arm64_relocate_new_kernel() where the MMU is disabled an all memory
> access are non-cacheable.
>
> To ensure you read the values you wrote, you must clean kern_reloc_arg to the PoC.

Thank you for catching this, I added:
 __flush_dcache_area(kern_reloc_arg, sizeof (struct kern_reloc_arg));

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v9 15/18] arm64: kexec: kexec EL2 vectors
  2020-04-29 17:35   ` [PATCH v9 15/18] arm64: kexec: kexec EL2 vectors Marc Zyngier
@ 2021-01-25 19:07     ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2021-01-25 19:07 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Sasha Levin, Mark Rutland, Vladimir Murzin, Jonathan Corbet,
	Catalin Marinas, Bhupesh Sharma, steve.capper, kexec mailing list,
	LKML, Selin Dag, James Morris, linux-mm, James Morse,
	Eric W. Biederman, Matthias Brugger, rfontana, Will Deacon,
	Thomas Gleixner, Linux ARM

> > +.macro el1_sync_64
> > +     br      x4                      /* Jump to new world from el2 */
> > +     .fill 31, 4, 0                  /* Set other 31 instr to zeroes */
> > +.endm
>
> The common idiom to write this is to align the beginning of the
> macro, and not to bother about what follows:
>
> .macro whatever
>          .align 7
>          br      x4
> .endm
>
> Specially given that 0 is an undefined instruction, and I really hate to
> see
> those in the actual text. On the contrary, .align generates NOPs.

Fixed that.

>
> > +
> > +.macro invalid_vector label
> > +\label:
> > +     b \label
> > +     .fill 31, 4, 0                  /* Set other 31 instr to zeroes */
> > +.endm
> > +
> > +/* el2 vectors - switch el2 here while we restore the memory image. */
> > +     .align 11
> > +ENTRY(kexec_el2_vectors)
>
> Please see commit 617a2f392c92 ("arm64: kvm: Annotate assembly using
> modern
> annoations"), and follow the same pattern.

Fixed that as well.

Thank you,
Pasha

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2021-01-25 19:08 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20200326032420.27220-1-pasha.tatashin@soleen.com>
     [not found] ` <20200326032420.27220-2-pasha.tatashin@soleen.com>
2020-04-29 17:00   ` [PATCH v9 01/18] arm64: kexec: make dtb_mem always enabled James Morse
2021-01-23  0:17     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-3-pasha.tatashin@soleen.com>
2020-04-29 17:00   ` [PATCH v9 02/18] arm64: hibernate: move page handling function to new trans_pgd.c James Morse
2021-01-23  0:18     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-4-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 03/18] arm64: trans_pgd: make trans_pgd_map_page generic James Morse
2021-01-22 21:52     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-5-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 04/18] arm64: trans_pgd: pass allocator trans_pgd_create_copy James Morse
2021-01-23  0:20     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-6-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 05/18] arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions James Morse
2021-01-23  0:22     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-8-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 07/18] arm64: trans_pgd: hibernate: idmap the single page that holds the copy page routines James Morse
2021-01-23  0:35     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-9-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 08/18] arm64: kexec: move relocation function setup James Morse
2021-01-23  1:01     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-10-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 09/18] arm64: kexec: call kexec_image_info only once James Morse
     [not found] ` <20200326032420.27220-11-pasha.tatashin@soleen.com>
2020-04-29 17:01   ` [PATCH v9 10/18] arm64: kexec: cpu_soft_restart change argument types James Morse
2021-01-23  1:14     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-16-pasha.tatashin@soleen.com>
2020-04-29 17:35   ` [PATCH v9 15/18] arm64: kexec: kexec EL2 vectors Marc Zyngier
2021-01-25 19:07     ` Pavel Tatashin
2020-05-07 16:21   ` James Morse
     [not found] ` <20200326032420.27220-12-pasha.tatashin@soleen.com>
2020-05-07 16:22   ` [PATCH v9 11/18] arm64: kexec: arm64_relocate_new_kernel clean-ups James Morse
     [not found] ` <20200326032420.27220-14-pasha.tatashin@soleen.com>
2020-05-07 16:22   ` [PATCH v9 13/18] arm64: kexec: add expandable argument to relocation function James Morse
2021-01-23  2:49     ` Pavel Tatashin
     [not found] ` <20200326032420.27220-13-pasha.tatashin@soleen.com>
2020-05-07 16:22   ` [PATCH v9 12/18] arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp James Morse
     [not found] ` <20200326032420.27220-15-pasha.tatashin@soleen.com>
2020-05-07 16:22   ` [PATCH v9 14/18] arm64: kexec: offset for relocation function James Morse
     [not found] ` <20200326032420.27220-17-pasha.tatashin@soleen.com>
2020-05-07 16:22   ` [PATCH v9 16/18] arm64: kexec: configure trans_pgd page table for kexec James Morse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox