All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pingfan Liu <piliu@redhat.com>
To: "HAGIO KAZUHITO(萩尾 一仁)" <k-hagio-abat@nec.com>,
	kexec@lists.infradead.org, "Bhupesh Sharma" <bhsharma@redhat.com>
Subject: Re: [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)
Date: Wed, 13 Jan 2021 17:17:01 +0800	[thread overview]
Message-ID: <20210113091007.GA24810@x1pad> (raw)
In-Reply-To: <OSBPR01MB199192C810AC846341D189B4DD390@OSBPR01MB1991.jpnprd01.prod.outlook.com>

On Thu, Sep 24, 2020 at 05:05:45AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> Hi Bhupesh,
> 
> Thank you for the updated patch.
> 
> -----Original Message-----
> > With ARMv8.2-LVA architecture extension availability, arm64 hardware
> > which supports this extension can support upto 52-bit virtual
> > addresses. It is specially useful for having a 52-bit user-space virtual
> > address space while the kernel can still retain 48-bit/52-bit virtual
> > addressing.
> > 
> > Since at the moment we enable the support of this extension in the
> > kernel via a CONFIG flag (CONFIG_ARM64_VA_BITS_52), so there are
> > no clear mechanisms in user-space to determine this CONFIG
> > flag value and use it to determine the kernel-space VA address range
> > values.
> > 
> > 'makedumpfile' can instead use 'TCR_EL1.T1SZ' value from vmcoreinfo
> > which indicates the size offset of the memory region addressed by
> > TTBR1_EL1 (and hence can be used for determining the
> > vabits_actual value).
> > 
> > Using the vmcoreinfo variable exported by kernel commit
> >  bbdbc11804ff ("arm64/crash_core: Export  TCR_EL1.T1SZ in vmcoreinfo"),
> > the user-space can use the following computation for determining whether
> >  an address lies in the linear map range (for newer kernels >= 5.4):
> > 
> >   #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))
> > 
> > Note that for the --mem-usage case though we need to calculate
> > vabits_actual value before the vmcoreinfo read functionality is ready,
> 
> For this, can't we read the TCR_EL1.T1SZ from vmcoreinfo in /proc/kcore's
> ELF note?  I think we can use the common functions used to vmcore with it.
> 
> I'll write a patch to do so if it sounds good.
> 
> > so we can instead read the architecture register ID_AA64MMFR2_EL1
> > directly to see if the underlying hardware supports 52-bit addressing
> > and accordingly set vabits_actual as:
> > 
> >    read_id_aa64mmfr2_el1();
> >    if (hardware supports 52-bit addressing)
> > 	vabits_actual = 52;
> >    else
> > 	vabits_actual = va_bits value calculated via _stext symbol;
> > 
> > Also make sure that the page_offset, is_linear_addr(addr) and __pa()
> > calculations work both for older (< 5.4) and newer kernels (>= 5.4).
> > 
> > I have tested several combinations with both kernel categories
> > [for e.g. with different VA (39, 42, 48 and 52-bit) and PA combinations
> > (48 and 52-bit)] on at-least 3 different boards.
> > 
> > Unfortunately, this means that we need to call 'populate_kernel_version()'
> > earlier 'get_page_offset_arm64()' as 'info->kernel_version' remains
> > uninitialized before its first use otherwise.
> 
> The populate_kernel_version() uses uname(), so this means that there will
> be some cases that makedumpfile doesn't work with vmcores which were
> captured on other kernels than running one.  This is a rather big limitation
> especially to backward-compatibility test, and it would be better to
> avoid changing behavior depending on environment, not on data.
> 
> Is there no room to avoid it?

I have a new idea about it, which avoid any version judgement. Please see the comment inline
> 
> Just an idea, but can we use the OSRELEASE vmcoreinfo in ELF note first
> to determine the kernel version?  It's from init_uts_ns.name.release,
> why can't we use it?
> 
> Thanks,
> Kazu
> 
> > 
> > This patch is in accordance with ARMv8 Architecture Reference Manual
> > 
> > Cc: Kazuhito Hagio <k-hagio at ab.jp.nec.com>
> > Cc: John Donnelly <john.p.donnelly at oracle.com>
> > Cc: kexec at lists.infradead.org
> > Signed-off-by: Bhupesh Sharma <bhsharma at redhat.com>
> > ---
> >  arch/arm64.c   | 233 ++++++++++++++++++++++++++++++++++++++++++-------
> >  common.h       |  10 +++
> >  makedumpfile.c |   4 +-
> >  makedumpfile.h |   6 +-
> >  4 files changed, 218 insertions(+), 35 deletions(-)
> > 
> > diff --git a/arch/arm64.c b/arch/arm64.c
> > index 709e0a506916..ccaa8641ca66 100644
> > --- a/arch/arm64.c
> > +++ b/arch/arm64.c
> > @@ -19,10 +19,23 @@
> > 
> >  #ifdef __aarch64__
> > 
> > +#include <asm/hwcap.h>
> > +#include <sys/auxv.h>
> >  #include "../elf_info.h"
> >  #include "../makedumpfile.h"
> >  #include "../print_info.h"
> > 
> > +/* ID_AA64MMFR2_EL1 related helpers: */
> > +#define ID_AA64MMFR2_LVA_SHIFT	16
> > +#define ID_AA64MMFR2_LVA_MASK	(0xf << ID_AA64MMFR2_LVA_SHIFT)
> > +
> > +/* CPU feature ID registers */
> > +#define get_cpu_ftr(id) ({							\
> > +		unsigned long __val;						\
> > +		asm volatile("mrs %0, " __stringify(id) : "=r" (__val));	\
> > +		__val;								\
> > +})
> > +
> >  typedef struct {
> >  	unsigned long pgd;
> >  } pgd_t;
> > @@ -47,6 +60,7 @@ typedef struct {
> >  static int lpa_52_bit_support_available;
> >  static int pgtable_level;
> >  static int va_bits;
> > +static int vabits_actual;
> >  static unsigned long kimage_voffset;
> > 
> >  #define SZ_4K			4096
> > @@ -58,7 +72,6 @@ static unsigned long kimage_voffset;
> >  #define PAGE_OFFSET_42		((0xffffffffffffffffUL) << 42)
> >  #define PAGE_OFFSET_47		((0xffffffffffffffffUL) << 47)
> >  #define PAGE_OFFSET_48		((0xffffffffffffffffUL) << 48)
> > -#define PAGE_OFFSET_52		((0xffffffffffffffffUL) << 52)
> > 
> >  #define pgd_val(x)		((x).pgd)
> >  #define pud_val(x)		(pgd_val((x).pgd))
> > @@ -219,13 +232,25 @@ pmd_page_paddr(pmd_t pmd)
> >  #define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
> >  #define pte_offset(dir, vaddr) 		(pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
> > 
> > +/*
> > + * The linear kernel range starts at the bottom of the virtual address
> > + * space. Testing the top bit for the start of the region is a
> > + * sufficient check and avoids having to worry about the tag.
> > + */
> > +#define is_linear_addr(addr)	((info->kernel_version < KERNEL_VERSION(5, 4, 0)) ?	\
> > +	(!!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))) : \
> > +	(!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))))
> > +
> >  static unsigned long long
> >  __pa(unsigned long vaddr)
> >  {
> >  	if (kimage_voffset == NOT_FOUND_NUMBER ||
> > -			(vaddr >= PAGE_OFFSET))
> > -		return (vaddr - PAGE_OFFSET + info->phys_base);
> > -	else
> > +			is_linear_addr(vaddr)) {
> > +		if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
> > +			return ((vaddr & ~PAGE_OFFSET) + info->phys_base);
> > +		else
> > +			return (vaddr + info->phys_base - PAGE_OFFSET);
> > +	} else
> >  		return (vaddr - kimage_voffset);
> >  }
> > 
> > @@ -254,6 +279,7 @@ static int calculate_plat_config(void)
> >  			(PAGESIZE() == SZ_64K && va_bits == 42)) {
> >  		pgtable_level = 2;
> >  	} else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
> > +			(PAGESIZE() == SZ_64K && va_bits == 52) ||
> >  			(PAGESIZE() == SZ_4K && va_bits == 39) ||
> >  			(PAGESIZE() == SZ_16K && va_bits == 47)) {
> >  		pgtable_level = 3;
> > @@ -288,8 +314,14 @@ get_phys_base_arm64(void)
> >  		return TRUE;
> >  	}
> > 
> > +	/* Ignore the 1st PT_LOAD */
> >  	if (get_num_pt_loads() && PAGE_OFFSET) {
> > -		for (i = 0;
> > +		/* Note that the following loop starts with i = 1.
> > +		 * This is required to make sure that the following logic
> > +		 * works both for old and newer kernels (with flipped
> > +		 * VA space, i.e. >= 5.4.0)
> > +		 */
> > +		for (i = 1;
> >  		    get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
> >  		    i++) {
> >  			if (virt_start != NOT_KV_ADDR
> > @@ -346,6 +378,139 @@ get_stext_symbol(void)
> >  	return(found ? kallsym : FALSE);
> >  }
> > 
> > +static int
> > +get_va_bits_from_stext_arm64(void)
> > +{
> > +	ulong _stext;
> > +
> > +	_stext = get_stext_symbol();
> > +	if (!_stext) {
> > +		ERRMSG("Can't get the symbol of _stext.\n");
> > +		return FALSE;
> > +	}
> > +
> > +	/* Derive va_bits as per arch/arm64/Kconfig. Note that this is a
> > +	 * best case approximation at the moment, as there can be
> > +	 * inconsistencies in this calculation (for e.g., for
> > +	 * 52-bit kernel VA case, the 48th bit is set in
> > +	 * the _stext symbol).
> > +	 *
> > +	 * So, we need to rely on the vabits_actual symbol in the
> > +	 * vmcoreinfo or read via system register for a accurate value
> > +	 * of the virtual addressing supported by the underlying kernel.
> > +	 */
> > +	if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
> > +		va_bits = 48;
> > +	} else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
> > +		va_bits = 47;
> > +	} else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
> > +		va_bits = 42;
> > +	} else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
> > +		va_bits = 39;
> > +	} else if ((_stext & PAGE_OFFSET_36) == PAGE_OFFSET_36) {
> > +		va_bits = 36;
> > +	} else {
> > +		ERRMSG("Cannot find a proper _stext for calculating VA_BITS\n");
> > +		return FALSE;
> > +	}
> > +
> > +	DEBUG_MSG("va_bits       : %d (approximation via _stext)\n", va_bits);
> > +
> > +	return TRUE;
> > +}
> > +
> > +/* Note that its important to note that the
> > + * ID_AA64MMFR2_EL1 architecture register can be read
> > + * only when we give an .arch hint to the gcc/binutils,
> > + * so we use the gcc construct '__attribute__ ((target ("arch=armv8.2-a")))'
> > + * here which is an .arch directive (see AArch64-Target-selection-directives
> > + * documentation from ARM for details). This is required only for
> > + * this function to make sure it compiles well with gcc/binutils.
> > + */
> > +__attribute__ ((target ("arch=armv8.2-a")))
> > +static unsigned long
> > +read_id_aa64mmfr2_el1(void)
> > +{
> > +	return get_cpu_ftr(ID_AA64MMFR2_EL1);
> > +}
> > +
> > +static int
> > +get_vabits_actual_from_id_aa64mmfr2_el1(void)
> > +{
> > +	int l_vabits_actual;
> > +	unsigned long val;
> > +
> > +	/* Check if ID_AA64MMFR2_EL1 CPU-ID register indicates
> > +	 * ARMv8.2/LVA support:
> > +	 * VARange, bits [19:16]
> > +	 *   From ARMv8.2:
> > +	 *   Indicates support for a larger virtual address.
> > +	 *   Defined values are:
> > +	 *     0b0000 VMSAv8-64 supports 48-bit VAs.
> > +	 *     0b0001 VMSAv8-64 supports 52-bit VAs when using the 64KB
> > +	 *            page size. The other translation granules support
> > +	 *            48-bit VAs.
> > +	 *
> > +	 * See ARMv8 ARM for more details.
> > +	 */
> > +	if (!(getauxval(AT_HWCAP) & HWCAP_CPUID)) {
> > +		ERRMSG("arm64 CPUID registers unavailable.\n");
> > +		return ERROR;
> > +	}
> > +
> > +	val = read_id_aa64mmfr2_el1();
> > +	val = (val & ID_AA64MMFR2_LVA_MASK) > ID_AA64MMFR2_LVA_SHIFT;
> > +
> > +	if ((val == 0x1) && (PAGESIZE() == SZ_64K))
> > +		l_vabits_actual = 52;
> > +	else
> > +		l_vabits_actual = 48;
> > +
> > +	return l_vabits_actual;
> > +}
> > +
> > +static void
> > +get_page_offset_arm64(void)
> > +{
> > +	/* Check if 'vabits_actual' is initialized yet.
> > +	 * If not, our best bet is to read ID_AA64MMFR2_EL1 CPU-ID
> > +	 * register.
> > +	 */
> > +	if (!vabits_actual) {
> > +		vabits_actual = get_vabits_actual_from_id_aa64mmfr2_el1();
> > +		if ((vabits_actual == ERROR) || (vabits_actual != 52)) {
> > +			/* If we cannot read ID_AA64MMFR2_EL1 arch
> > +			 * register or if this register does not indicate
> > +			 * support for a larger virtual address, our last
> > +			 * option is to use the VA_BITS to calculate the
> > +			 * PAGE_OFFSET value, i.e. vabits_actual = VA_BITS.
> > +			 */
> > +			vabits_actual = va_bits;
> > +			DEBUG_MSG("vabits_actual : %d (approximation via va_bits)\n",
> > +					vabits_actual);
> > +		} else
> > +			DEBUG_MSG("vabits_actual : %d (via id_aa64mmfr2_el1)\n",
> > +					vabits_actual);
> > +	}
> > +
> > +	if (!populate_kernel_version()) {
> > +		ERRMSG("Cannot get information about current kernel\n");
> > +		return;
> > +	}
> > +
> > +	/* See arch/arm64/include/asm/memory.h for more details of
> > +	 * the PAGE_OFFSET calculation.
> > +	 */
> > +	if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
> > +		info->page_offset = ((0xffffffffffffffffUL) -
> > +				((1UL) << (vabits_actual - 1)) + 1);
> > +	else
> > +		info->page_offset = (-(1UL << vabits_actual));
> > +

Considering the following related commit order

    b6d00d47e81a arm64: mm: Introduce 52-bit Kernel VAs                        (2)
    ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE
    c8b6d2ccf9b1 arm64: mm: Separate out vmemmap
    c812026c54cf arm64: mm: Logic to make offset_ttbr1 conditional
    5383cc6efed1 arm64: mm: Introduce vabits_actual
    90ec95cda91a arm64: mm: Introduce VA_BITS_MIN
    99426e5e8c9f arm64: dump: De-constify VA_START and KASAN_SHADOW_START
    6bd1d0be0e97 arm64: kasan: Switch to using KASAN_SHADOW_OFFSET
    14c127c957c1 arm64: mm: Flip kernel VA space                               (1)

And
    #define _PAGE_END(va)		(-(UL(1) << ((va) - 1)))
    #define PAGE_OFFSET (((0xffffffffffffffffUL) - ((1UL) << (vabits_actual - 1)) + 1))  //old
    #define PAGE_OFFSET (-(1UL << vabits_actual))                                        //new

before (1), SYMBOL(_text) < PAGE_OFFSET, afterward, SYMBOL(_text) > PAGE_END == "old PAGE_OFFSET"

So the comparasion of kernel version can be replaced by
    if SYMBOL(_text) > PAGE_END
	info->page_offset = new PAGE_OFFSET
    else
	info->page_offset = old PAGE_OFFSET


Any comment?

Thanks,
	Pingfan


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2021-01-13  9:18 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-10  5:33 [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions Bhupesh Sharma
2020-09-10  5:33 ` [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available) Bhupesh Sharma
2020-09-18  5:43   ` HAGIO KAZUHITO(萩尾 一仁)
2020-09-10  5:33 ` [PATCH v5 2/3] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support) Bhupesh Sharma
2020-09-23  5:28   ` HAGIO KAZUHITO(萩尾 一仁)
2020-09-10  5:33 ` [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support) Bhupesh Sharma
2020-09-24  5:05   ` HAGIO KAZUHITO(萩尾 一仁)
2021-01-13  9:17     ` Pingfan Liu [this message]
2021-01-14  0:43       ` HAGIO KAZUHITO(萩尾 一仁)
2020-09-24  5:23 ` [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions HAGIO KAZUHITO(萩尾 一仁)
2020-11-11  0:34   ` HAGIO KAZUHITO(萩尾 一仁)
2020-11-12  6:46     ` Bhupesh Sharma
2020-11-13  6:03       ` HAGIO KAZUHITO(萩尾 一仁)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210113091007.GA24810@x1pad \
    --to=piliu@redhat.com \
    --cc=bhsharma@redhat.com \
    --cc=k-hagio-abat@nec.com \
    --cc=kexec@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.