From: James Morse <james.morse@arm.com>
To: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Kexec Mailing List <kexec@lists.infradead.org>,
	Will Deacon <will.deacon@arm.com>,
	AKASHI Takahiro <takahiro.akashi@linaro.org>,
	Bhupesh SHARMA <bhupesh.linux@gmail.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] arm64/mm: Introduce a variable to hold base address of linear region
Date: Wed, 13 Jun 2018 11:29:17 +0100
Message-ID: <3a05cd0e-466e-4fb6-a78a-4b363e21aaab@arm.com>
In-Reply-To: <CACi5LpOJ7Q7nbo9m_HGqr44+n_iEtreZt+Uxtf2PzcHSzTdKuA@mail.gmail.com>

Hi Bhupesh,

On 13/06/18 06:16, Bhupesh Sharma wrote:
> On Tue, Jun 12, 2018 at 3:42 PM, James Morse <james.morse@arm.com> wrote:
>> On 12/06/18 09:25, Bhupesh Sharma wrote:
>>> On Tue, Jun 12, 2018 at 12:23 PM, Ard Biesheuvel wrote:
>>>> Userland code that assumes that the linear map cannot have a hole at
>>>> the beginning should be fixed.
>>> That is a separate case (although that needs fixing as well via a
>>> kernel patch probably as the user-space tools rely on '/proc/iomem'
>>> contents to determine the first System RAM/reserved range).
>>
>> This is for kexec-tools generating the kdump vmcore ELF headers in user-space?
>
> Yes, but again, I would like to reiterate that the case where I see a
> hole at the start of the System RAM range (as I listed above) is just
> a specific case, which probably deserves a separate patch. The current
> patch though is for a generic issue (please see more details below).
>>> # readelf -l vmcore
>>>
>>> ELF Header:
>>> ........................
>>>
>>> Program Headers:
>>>   Type           Offset             VirtAddr           PhysAddr
>>>                  FileSiz            MemSiz             Flags  Align
>>> ..............................................................................................................................................................
>>>   LOAD           0x0000000076d40000 0xffff80017fe00000 0x0000000180000000
>>>                  0x0000001680000000 0x0000001680000000 RWE    0
>>>
>>> 3. So if we do a simple calculation:
>>>
>>> (VirtAddr + MemSiz) = 0xffff80017fe00000 + 0x0000001680000000 =
>>> 0xFFFF8017FFE00000 != 0xffff801800000000.
>>>
>>> which indicates that the end virtual memory nodes are not the same
>>> between vmlinux and vmcore.
>>
>> If I've followed this properly: the problem is that to generate the ELF headers
>> in the post-kdump vmcore, at kdump-load-time kexec-tools has to guess the
>> virtual addresses of the 'System RAM' regions it can see in /proc/iomem.
>>
>> The problem you are hitting is an invisible hole at the beginning of RAM,
>> meaning user-space's guess_phys_to_virt() is off by the size of this hole.
>>
>> Isn't KASLR a special case for this? You must have to correct for that after
>> kdump has happened, based on an elf-note in the vmcore. Can't we always do this?
>
> No, I hit this issue both for the KASLR and non-KASLR boot cases.

Because in both cases there is a hole at the beginning of the linear map. KASLR
is a special case of this, as the kernel adds a variable-sized hole to do the
randomization.

Surely treating this as one case makes your user-space code simpler.
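
For what it's worth, the numbers in your readelf output look consistent with
this. A quick sketch of the arithmetic (Python only for brevity;
linear_map_skew() is a made-up helper name, and all three constants are copied
from the quoted calculation above). The shortfall comes out as 0x200000, which
is what I would expect if user-space's idea of the base of the linear map were
off by a 2MiB hole:

```python
# Values copied from the readelf output and calculation quoted above.
VMCORE_VIRT_ADDR = 0xFFFF80017FE00000
VMCORE_MEM_SIZE = 0x0000001680000000
# End of the same region as seen from vmlinux, per the quoted calculation.
VMLINUX_END = 0xFFFF801800000000


def linear_map_skew(virt_addr: int, mem_size: int, expected_end: int) -> int:
    """Return how far the vmcore's end address falls short of the kernel's."""
    return expected_end - (virt_addr + mem_size)


skew = linear_map_skew(VMCORE_VIRT_ADDR, VMCORE_MEM_SIZE, VMLINUX_END)
print(hex(skew))
```

If the skew is a single constant for every LOAD segment, one correction applied
after the fact covers both the KASLR and the non-KASLR case.
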
> Fixing this in kernel space seems better to me as the definition of
Is there a kernel bug? Changing the definitions of internal kernel variables for
the benefit of code digging in /proc/kcore|/dev/mem isn't going to fly.
> 'memstart_addr' is that it indicates the start of the physical ram,
> but since in this case there is a hole at the start of the system ram
> visible in Linux (and thus to user-space), but 'memstart_addr' is
> still 0 which seems contradictory at the least. This causes PHYS_OFFSET
> to be 0 as well, which is again contradictory.
>>> This happens because the kexec-tools rely on '/proc/iomem' contents
>>> while 'memstart_addr' is computed as 0 by kernel (as value of
>>> memblock_start_of_DRAM() < ARM64_MEMSTART_ALIGN).
>>
>>> Returning back to this patch, this is a generic requirement where we
>>> need the linear region start/base addresses in user-space applications
>>> which is used to read addresses which lie in the linear region (for
>>> e.g. when we read /proc/kcore contents).
[...]
>> This patch adds a variable that nothing uses, it's going to be removed. You can't
>> depend on reading this via /dev/mem.
>>
>> Could you add the information you need as an elf-note to the vmcore instead? You
>> must already pick these up to handle kaslr. (from memory, this is where the
>> kaslr-offset is described to user-space after we kdump).
> No you are mixing up the two cases (please see above), the issue which
> this patch fixes is for use cases where we don't have the vmcore
> available in case of 'live' debugging via makedumpfile and crash tools
> (we only have '/proc/kcore' or 'vmlinux' available in such cases). I
> detailed the use case in [1] better (in a reply to Ard), I will detail
> the use-case again below:
Okay, so not kdump...
> One specific use case that I am working on at the moment is the
> makedumpfile '--mem-usage', which allows one to see the page numbers
> of current system (1st kernel) in different use (please see
> MAKEDUMPFILE(8) for more details).
https://linux.die.net/man/8/makedumpfile :
| Name: makedumpfile - make a small dumpfile of kdump
... but now we are talking about kdump again ...
> Using this we can know how many pages are dumpable when different
> dump_level is specified when invoking the makedumpfile.
>
> Normally, makedumpfile analyses the contents of '/proc/kcore' (while
> excluding the crashkernel range), and then calculates the page number
> of different kind per vmcoreinfo.
$ apt-get source makedumpfile
$ cd makedumpfile-1.5.3
$ grep -r "kcore" .
$
I suspect there are two pieces of software with the same name here.
> This use case requires directly reading '/proc/kcore', and
> hence the PAGE_OFFSET value is used to determine the base address of
> the linear region, whose value is not static in case of KASLR boot.
Eh? I thought PAGE_OFFSET was a compile-time constant, and it was PHYS_OFFSET
that has a value other than the aligned base of memory for KASLR.
> Another use-case is where the crash-utility uses the PAGE_OFFSET value
> to perform a virtual-to-physical conversion for the address lying in
> the linear region:
In all cases the problem you have is assuming that the first 'System RAM' value
in /proc/iomem is the base of DRAM, which you can then use as PHYS_OFFSET in
your user-space phys2virt() calculation.
What information do you need to make this work?
You can evidently read kernel variables, so why can't you read memstart_addr
and do:

| #define __phys_to_virt(x) \
|	((unsigned long)((x) - memstart_addr) | PAGE_OFFSET)

based on the physical addresses in /proc/iomem, and PAGE_OFFSET pulled out of
the vmlinux?
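
Roughly what I have in mind, as a user-space sketch in Python. The assumptions
are mine: PAGE_OFFSET is the value I would expect for a 48-bit VA kernel (in
practice you would pull it out of the vmlinux), memstart_addr is the 0 reported
earlier in this thread, and the single /proc/iomem line is reconstructed from
the LOAD segment in your readelf output, not copied from a real machine.
system_ram_ranges() is a made-up helper name.

```python
import re

# Assumptions for illustration only, see the text above.
PAGE_OFFSET = 0xFFFF800000000000
MEMSTART_ADDR = 0x0
U64_MASK = 0xFFFFFFFFFFFFFFFF

# Reconstructed from the quoted LOAD segment (PhysAddr and MemSiz);
# a real /proc/iomem has many more lines.
SAMPLE_IOMEM = """\
180000000-17ffffffff : System RAM
"""


def phys_to_virt(phys: int, memstart_addr: int, page_offset: int) -> int:
    """User-space copy of __phys_to_virt(); the mask emulates unsigned long."""
    return ((phys - memstart_addr) & U64_MASK) | page_offset


def system_ram_ranges(iomem_text: str):
    """Yield (start, end) for each top-level 'System RAM' line."""
    pattern = re.compile(r"^([0-9a-f]+)-([0-9a-f]+) : System RAM$")
    for line in iomem_text.splitlines():
        match = pattern.match(line)
        if match:
            yield int(match.group(1), 16), int(match.group(2), 16)


for start, end in system_ram_ranges(SAMPLE_IOMEM):
    print(hex(phys_to_virt(start, MEMSTART_ADDR, PAGE_OFFSET)),
          hex(phys_to_virt(end, MEMSTART_ADDR, PAGE_OFFSET)))
```

Note that nothing here looks at which 'System RAM' entry comes first: the base
comes from memstart_addr, so a hole at the start of RAM does not change the
result.
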
Reading memstart_addr is fragile: it is an internal variable, and we might need
to rename it wednesday_memstart_addr. If user-space needs this value to work
with /proc/{kcore,vmcore} we should expose something like 'p2v_offset' as an
elf-note on those files (it looks like they both have ELF headers).
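
To illustrate how user-space might consume such a note, here is a sketch in
Python. Everything specific to 'p2v_offset' is hypothetical: no such note
exists today, and the note name, the type value and the payload (a 64-bit
little-endian value to add to a physical address) are placeholders I made up.
Only the generic ELF note layout is real: three 32-bit words (namesz, descsz,
type) followed by the name and the descriptor, each padded to 4 bytes.

```python
import struct

# Hypothetical identifiers: placeholders, not an existing kernel interface.
P2V_NOTE_NAME = b"P2V_OFFSET"
P2V_NOTE_TYPE = 0x1234
U64_MASK = 0xFFFFFFFFFFFFFFFF


def _pad4(data: bytes) -> bytes:
    """Pad to the 4-byte alignment used inside ELF note segments."""
    return data + b"\0" * (-len(data) % 4)


def pack_note(name: bytes, note_type: int, desc: bytes) -> bytes:
    """Build one ELF note (little-endian), only used here to make test data."""
    name_z = name + b"\0"
    header = struct.pack("<III", len(name_z), len(desc), note_type)
    return header + _pad4(name_z) + _pad4(desc)


def iter_notes(buf: bytes):
    """Yield (name, type, desc) for each note in a PT_NOTE segment."""
    offset = 0
    while offset + 12 <= len(buf):
        namesz, descsz, note_type = struct.unpack_from("<III", buf, offset)
        offset += 12
        name = buf[offset:offset + namesz].rstrip(b"\0")
        offset += (namesz + 3) & ~3
        desc = buf[offset:offset + descsz]
        offset += (descsz + 3) & ~3
        yield name, note_type, desc


def find_p2v_offset(buf: bytes):
    """Return the hypothetical p2v offset, or None if the note is absent."""
    for name, note_type, desc in iter_notes(buf):
        if name == P2V_NOTE_NAME and note_type == P2V_NOTE_TYPE and len(desc) == 8:
            return struct.unpack("<Q", desc)[0]
    return None


# A dummy unrelated note followed by the hypothetical one.
segment = pack_note(b"CORE", 1, b"\x01\x02\x03") + pack_note(
    P2V_NOTE_NAME, P2V_NOTE_TYPE, struct.pack("<Q", 0xFFFF800000000000))
p2v = find_p2v_offset(segment)
print(hex((0x180000000 + p2v) & U64_MASK))
```

The point is that user-space would then depend on a documented note rather
than on the name of a kernel variable.
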
Thanks,
James