Kexec Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: James Morse <james.morse@arm.com>
To: Bhupesh Sharma <bhsharma@redhat.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Kexec Mailing List <kexec@lists.infradead.org>,
	Will Deacon <will.deacon@arm.com>,
	AKASHI Takahiro <takahiro.akashi@linaro.org>,
	Bhupesh SHARMA <bhupesh.linux@gmail.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] arm64/mm: Introduce a variable to hold base address of linear region
Date: Tue, 12 Jun 2018 11:12:46 +0100	[thread overview]
Message-ID: <d801005a-5f7a-8fe4-a486-bc6443a406a3@arm.com> (raw)
In-Reply-To: <CACi5LpNdg=Enp66BybNNuj5tQOp3ggyX=tm_SVzr8v2qCRteDw@mail.gmail.com>

Hi Bhupesh, Ard,

On 12/06/18 09:25, Bhupesh Sharma wrote:
> On Tue, Jun 12, 2018 at 12:23 PM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>> On 12 June 2018 at 08:36, Bhupesh Sharma <bhsharma@redhat.com> wrote:
>>> The start of the linear region map on a KASLR enabled ARM64 machine -
>>> which supports a compatible EFI firmware (with EFI_RNG_PROTOCOL
>>> support), is no longer correctly represented by the PAGE_OFFSET macro,
>>> since it is defined as:
>>>
>>>     (UL(1) << (VA_BITS - 1)) + 1)

>> PAGE_OFFSET is the VA of the start of the linear map. The linear map
>> can be sparsely populated with actual memory, regardless of whether
>> KASLR is in effect or not. The only difference in the presence of
>> KASLR is that there may be such a hole at the beginning, but that does
>> not mean the linear map has moved, or that the value of PAGE_OFFSET is
>> now wrong.

>>> So taking an example of a platform with VA_BITS=48, this gives a static
>>> value of:
>>> PAGE_OFFSET = 0xffff800000000000
>>>
>>> However, for the KASLR case, we use the 'memstart_offset_seed'
>>> to randomize the linear region - since 'memstart_addr' indicates the
>>> start of physical RAM, we randomize the same on basis
>>> of 'memstart_offset_seed' value.
>>>
>>> As the PAGE_OFFSET value is used presently by several user space
>>> tools (for e.g. makedumpfile and crash tools) to determine the start
>>> of linear region and hence to read addresses (like PT_NOTE fields) from
>>> '/proc/kcore' for the non-KASLR boot cases, so it would be better to
>>> use 'memblock_start_of_DRAM()' value (converted to virtual) as
>>> the start of linear region for the KASLR cases and default to
>>> the PAGE_OFFSET value for non-KASLR cases to indicate the start of
>>> linear region.

>> Userland code that assumes that the linear map cannot have a hole at
>> the beginning should be fixed.

> That is a separate case (although that needs fixing as well via a
> kernel patch probably as the user-space tools rely on '/proc/iomem'
> contents to determine the first System RAM/reserved range).

This is for kexec-tools generating the kdump vmcore ELF headers in user-space?


> 1. In that particular case (see [1]) the EFI firmware sets the first
> EFI block as EfiReservedMemType:
> 
> Region1: 0x000000000000-0x000000200000 [EfiReservedMemType]
> Region2: 0x000000200000-0x00000021fffff [EfiRuntimeServiceData]
> 
> Since EFI firmware won't return the "EfiReservedMemType" memory to
> Linux kernel, 

(Its linux that makes this choice in
drivers/firmware/efi/arm-init.c::is_usable_memory())


> so the kernel can't get any info about the first mem
> block, and kernel can only see region2 as below:
> 
> efi: Processing EFI memory map:
> efi:   0x000000200000-0x00000021ffff [Runtime Data       |RUN|  |  |
> |  |  |  |   |WB|WT|WC|UC]
> 
> # head -1 /proc/iomem
> 00200000-0021ffff : reserved
> 
> 2a. If we add debug prints to 'arch/arm64/mm/init.c' to print the
> kernel Virtual map we can see that the memory node is set to:
> 
> # dmesg | grep memory
> ..........
> memory  : 0xffff800000200000 - 0xffff801800000000
> 
> 2b. Now if we use kexec-tools to obtain a crash vmcore we can see that
> if we use 'readelf' to get the last program Header from vmcore (logs
> below are for the non-kaslr case):
> 
> # readelf -l vmcore
> 
> ELF Header:
> ........................
> 
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>          FileSiz            MemSiz              Flags  Align
> ..............................................................................................................................................................
>   LOAD        0x0000000076d40000 0xffff80017fe00000 0x0000000180000000
>                 0x0000001680000000 0x0000001680000000  RWE    0
> 
> 3. So if we do a simple calculation:
> 
> (VirtAddr + MemSiz) = 0xffff80017fe00000 + 0x0000001680000000 =
> 0xFFFF8017FFE00000 != 0xffff801800000000.
> 
> which indicates that the end virtual memory nodes are not the same
> between vmlinux and vmcore.

If I've followed this properly: the problem is that to generate the ELF headers
in the post-kdump vmcore, at kdump-load-time kexec-tools has to guess the
virtual addresses of the 'System RAM' regions it can see in /proc/iomem.

The problem you are hitting is an invisible hole at the beginning of RAM,
meaning user-space's guess_phys_to_virt() is off by the size of this hole.

Isn't KASLR a special case for this? You must have to correct for that after
kdump has happened, based on an elf-note in the vmcore. Can't we always do this?


> This happens because the kexec-tools rely on 'proc/iomem' contents
> while 'memstart_addr' is computed as 0 by kernel (as value of
> memblock_start_of_DRAM() < ARM64_MEMSTART_ALIGN).

> Returning back to this patch, this is a generic requirement where we
> need the linear region start/base addresses in user-space applications
> which is used to read addresses which lie in the linear region (for
> e.g. when we read /proc/kcore contents).


>>> I tested this on my qualcomm (which supports EFI_RNG_PROTOCOL)
>>> and apm mustang (which does not support EFI_RNG_PROTOCOL) arm64 boards
>>> and was able to use a modified user space utility (like kexec-tools and
>>> makedumpfile) to determine the start of linear region correctly for
>>> both the KASLR and non-KASLR boot cases.
>>>
>>
>> Can you explain the nature of the changes to the userland code?
> 
> The changes are not to rely on the fixed PAGE_OFFSET macro value for
> determining the base address of the linear region, but rather read the
> ' linear_reg_start_addr' symbol from kernel and use the same both in
> case of KASLR and non-KASLR boots to determine the base of the linear
> region (in [2], I have implemented a test change to kexec-tools to
> read the 'linear_reg_start_addr' symbol which is available on my

Don't use /dev/mem.


> public github tree, I have a similar change available in makedumpfile
> which I have not yet pushed to github, as it implements other features
> as well)

>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>> index 49d99214f43c..bfd0915ecaf8 100644
>>> --- a/arch/arm64/include/asm/memory.h
>>> +++ b/arch/arm64/include/asm/memory.h
>>> @@ -178,6 +178,9 @@ extern s64                  memstart_addr;
>>>  /* PHYS_OFFSET - the physical address of the start of memory. */
>>>  #define PHYS_OFFSET            ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
>>>
>>> +/* the virtual base of the linear region. */
>>> +extern s64                     linear_reg_start_addr;
>>> +
>>>  /* the virtual base of the kernel image (minus TEXT_OFFSET) */
>>>  extern u64                     kimage_vaddr;
>>>
>>> diff --git a/arch/arm64/kernel/arm64ksyms.c b/arch/arm64/kernel/arm64ksyms.c
>>> index d894a20b70b2..a92238ea45ff 100644
>>> --- a/arch/arm64/kernel/arm64ksyms.c
>>> +++ b/arch/arm64/kernel/arm64ksyms.c
>>> @@ -42,6 +42,7 @@ EXPORT_SYMBOL(__arch_copy_in_user);
>>>
>>>         /* physical memory */
>>>  EXPORT_SYMBOL(memstart_addr);
>>> +EXPORT_SYMBOL(linear_reg_start_addr);
>>>
>>>         /* string / mem functions */
>>>  EXPORT_SYMBOL(strchr);
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index 325cfb3b858a..29447adb0eef 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -60,6 +60,7 @@
>>>   * that cannot be mistaken for a real physical address.
>>>   */
>>>  s64 memstart_addr __ro_after_init = -1;
>>> +s64 linear_reg_start_addr __ro_after_init = PAGE_OFFSET;
>>>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>
>>>  #ifdef CONFIG_BLK_DEV_INITRD
>>> @@ -452,6 +453,8 @@ void __init arm64_memblock_init(void)
>>>                 }
>>>         }
>>>
>>> +       linear_reg_start_addr = __phys_to_virt(memblock_start_of_DRAM());

This patch adds a variable that nothing uses, its going to be removed. You can't
depend on reading this via /dev/mem.

Could you add the information you need as an elf-note to the vmcore instead? You
must already pick these up to handle kaslr. (from memory, this is where the
kaslr-offset is described to user-space after we kdump).


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2018-06-12 10:12 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-12  6:36 [PATCH] arm64/mm: Introduce a variable to hold base address of linear region Bhupesh Sharma
2018-06-12  6:53 ` Ard Biesheuvel
2018-06-12  8:25   ` Bhupesh Sharma
2018-06-12 10:12     ` James Morse [this message]
2018-06-13  5:16       ` Bhupesh Sharma
2018-06-13 10:11         ` Will Deacon
2018-06-14  6:23           ` Bhupesh Sharma
2018-06-15 16:52             ` Will Deacon
2018-06-15 20:02               ` Bhupesh Sharma
2018-06-13 10:29         ` James Morse
2018-06-14  7:53           ` Bhupesh Sharma
2018-06-14 16:17             ` James Morse
2018-06-19  3:02               ` Jin, Yanjiang
2018-06-19  8:55                 ` Will Deacon
2018-06-19  9:34                   ` Jin, Yanjiang
2018-06-19  9:40                     ` Will Deacon
2018-06-19  9:57                       ` Jin, Yanjiang
2018-06-19 10:16                         ` James Morse
2018-06-19 10:37                           ` Bhupesh Sharma
2018-06-19 11:26                             ` James Morse
2018-06-19 11:58                               ` Bhupesh Sharma
2018-06-20  2:16                                 ` Jin, Yanjiang
2018-06-20  7:26                                   ` Bhupesh Sharma
2018-06-20 10:06                                     ` James Morse
2018-07-11 13:24                                     ` James Morse
2018-07-11 15:36                                       ` Bhupesh Sharma
2018-07-11 16:24                                         ` Omar Sandoval

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d801005a-5f7a-8fe4-a486-bc6443a406a3@arm.com \
    --to=james.morse@arm.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bhsharma@redhat.com \
    --cc=bhupesh.linux@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=takahiro.akashi@linaro.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox