From: bhsharma@redhat.com (Bhupesh Sharma)
To: linux-arm-kernel@lists.infradead.org
Subject: arm64 crashkernel fails to boot on acpi-only machines due to ACPI regions being no longer mapped as NOMAP
Date: Wed, 15 Nov 2017 16:28:55 +0530
Message-ID: <3df4c6c5-0abe-01ee-730d-2edaa5f497d2@redhat.com>
In-Reply-To: <CAKv+Gu_eQ-s0J22tKeHKJme4qXcvxvDkS7vKrNW+o_XtMTkMhQ@mail.gmail.com>
Hi Ard, Akashi,
On 11/14/2017 04:50 PM, Ard Biesheuvel wrote:
> On 13 November 2017 at 09:27, AKASHI Takahiro
> <takahiro.akashi@linaro.org> wrote:
>> Hi,
>>
>> On Fri, Nov 10, 2017 at 05:41:56PM +0530, Bhupesh Sharma wrote:
>>> Resent with Akashi's correct email address.
>>>
>>> On Fri, Nov 10, 2017 at 5:39 PM, Bhupesh Sharma <bhsharma@redhat.com> wrote:
>>>> Hi Ard, Akashi
>>>>
>>>> I have met an issue on an arm64 board using the latest master branch from Linus.
>> (snip)
>>>>
>>>> 8. Also, I think now the crashkernel handling changed by
>>>> e7cd190385d17790cc3eb3821b1094b00aacf325 (arm64: mark reserved
>>>> memblock regions explicitly in iomem), needs to be changed to handle
>>>> the change added by Ard to fix this issue on ACPI only machines.
>>>>
>>>> I have a dirty hack in place, but I would like to have your opinions
>>>> about what can be a more concrete fix to this issue (as we mark these
>>>> regions as System RAM now rather than NOMAP) and I don't have a DTB
>>>> based machine to test on currently.
>>
>> I don't know much about acpi reclaim regions,
>> can you please tell me how your change affects your panic case?
Sorry, I was away yesterday and couldn't get back to you with the details
of the dirty hack. I see Ard has already proposed the following change,
which looks similar to the change I made locally. However, it doesn't
fix the issue completely at my end so far.
Here are more details:
>
> Does this help at all?
>
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 7768423b39d3..61d867647cca 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -213,7 +213,7 @@ static void __init request_standard_resources(void)
>
> for_each_memblock(memory, region) {
> res = alloc_bootmem_low(sizeof(*res));
> - if (memblock_is_nomap(region)) {
> + if (memblock_is_nomap(region) || memblock_is_reserved(region)) {
> res->name = "reserved";
> res->flags = IORESOURCE_MEM;
> } else {
>
.. So, I tried using the 'memblock_is_reserved' check in
'request_standard_resources'. However, as 'memblock_is_reserved' expects a
physical address as its input argument, I changed mine to something like this:
- if (memblock_is_nomap(region)) {
+ if (memblock_is_nomap(region) ||
+     memblock_is_reserved(__pfn_to_phys(memblock_region_reserved_base_pfn(region)))) {
However, I am still hitting the issue, and it is quite a peculiar one.
First, some more background on what is happening on this Huawei Taishan
arm64 board that I have:
1a. I see from the boot logs that one of the ACPI tables (DSDT) is at
physical address 0x39710000:
# dmesg | grep -i "DSDT"
[ 0.000000] ACPI: DSDT 0x0000000039710000 006656 (v02 HISI HIP07
00000000 INTL 20151124)
1b. This DSDT table is correctly marked as ACPI Reclaim Memory. However,
I see that just preceding this entry there is also a 'Boot Code' entry
covering the range '0x0000396c0000-0x00003970ffff':
# dmesg | grep -B 2 -i "ACPI reclaim"
[ 0.000000] efi: 0x000039670000-0x0000396bffff [Runtime Code
|RUN| | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x0000396c0000-0x00003970ffff [Boot Code
| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000039710000-0x00003975ffff [ACPI Reclaim
Memory| | | | | | | | |WB|WT|WC|UC]
2. Now, I am not sure which kernel layer makes the following change (I
am still digging into it), but I see that the 'Boot Code' and
ACPI DSDT table regions are somehow merged into one memblock_region and
appear as the range '396c0000-3975ffff' in the '/proc/iomem' interface:
# cat /proc/iomem | grep -A 2 -B 2 39
00000000-3961ffff : System RAM
00080000-00b6ffff : Kernel code
00cb0000-0167ffff : Kernel data
0e800000-2e7fffff : Crash kernel
39620000-396bffff : reserved
396c0000-3975ffff : System RAM
39760000-3976ffff : reserved
39770000-397affff : reserved
397b0000-3989ffff : reserved
398a0000-398bffff : reserved
398c0000-39d3ffff : reserved
39d40000-3ed2ffff : System RAM
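To check which entry of a listing like the one above covers a given address (for example the DSDT at 0x39710000), a small userspace sketch can help. This is not kexec-tools code; the helper name 'iomem_find' is made up for illustration, and it assumes lines of the form "start-end : name" with hexadecimal, inclusive bounds:

```c
#include <stdio.h>
#include <string.h>

/*
 * Find the /proc/iomem-style entry that covers 'addr'.
 * 'text' holds lines of the form "start-end : name" (hex, end inclusive).
 * Nested (indented) entries follow their parent, so the last match
 * is the innermost one. Returns 1 and fills 'name' on a hit, 0 otherwise.
 */
static int iomem_find(const char *text, unsigned long long addr,
                      char *name, size_t name_len)
{
    int found = 0;

    while (*text) {
        unsigned long long start, end;
        char label[64];
        size_t len = strcspn(text, "\n");

        /* The leading space in the format skips any indentation. */
        if (sscanf(text, " %llx-%llx : %63[^\n]", &start, &end, label) == 3 &&
            addr >= start && addr <= end) {
            snprintf(name, name_len, "%s", label);
            found = 1;
        }
        text += len;
        if (*text == '\n')
            text++;
    }
    return found;
}
```

Fed the three lines around 396c0000 from the listing, a lookup of 0x39710000 lands in the "System RAM" entry, not in a "reserved" one.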
3. As to why this merged region appears as a System RAM area rather
than a RESERVED one, the following code path explains it:
3a. The check we added in 'arch/arm64/kernel/setup.c' does not catch the
ACPI DSDT table, and so does not mark it as 'RESERVED'. This is because
'memblock_is_reserved' calls 'memblock_search' internally, which is
currently implemented as:
static int __init_memblock memblock_search(struct memblock_type *type,
					   phys_addr_t addr)
{
	unsigned int left = 0, right = type->cnt;

	do {
		unsigned int mid = (right + left) / 2;

		if (addr < type->regions[mid].base)
			right = mid;
		else if (addr >= (type->regions[mid].base +
				  type->regions[mid].size))
			left = mid + 1;
		else
			return mid;
	} while (left < right);
	return -1;
}
3b. The 'addr' passed to 'memblock_search' is calculated via
'__pfn_to_phys(memblock_region_memory_base_pfn(region))', which in this
case is 0x396c0000 (see the iomem entry in point 2 above). That is only
the base of the merged region, so the lookup tests that single address
and never sees that part of this memblock is reserved for the ACPI DSDT
entry at 0x39710000.
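The difference between a point query on the region base and a range query over the whole region can be shown with a small userspace model. This is only a sketch: 'struct range', 'addr_is_reserved' and 'range_overlaps_reserved' are hypothetical names, and the reserved list is assumed to hold just the ACPI Reclaim range from the EFI map (0x39710000-0x3975ffff):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one memblock region (not the kernel struct). */
struct range {
    uint64_t base;
    uint64_t size;
};

/* Assumed reserved list: only the ACPI Reclaim range from the EFI map. */
static const struct range reserved[] = {
    { 0x39710000ULL, 0x50000ULL },
};
static const size_t reserved_cnt = sizeof(reserved) / sizeof(reserved[0]);

/* Point query: is this single address inside any reserved range? */
static bool addr_is_reserved(uint64_t addr)
{
    for (size_t i = 0; i < reserved_cnt; i++)
        if (addr >= reserved[i].base &&
            addr - reserved[i].base < reserved[i].size)
            return true;
    return false;
}

/* Range query: does [base, base + size) overlap any reserved range? */
static bool range_overlaps_reserved(uint64_t base, uint64_t size)
{
    for (size_t i = 0; i < reserved_cnt; i++)
        if (base < reserved[i].base + reserved[i].size &&
            reserved[i].base < base + size)
            return true;
    return false;
}
```

With the merged region 0x396c0000-0x3975ffff (size 0xa0000), addr_is_reserved(0x396c0000) is false while range_overlaps_reserved(0x396c0000, 0xa0000) is true. memblock also provides a range-based helper, memblock_is_region_reserved(base, size); whether using it here is the right fix is untested, since it would flag the whole merged region, 'Boot Code' included.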
4. Now, when we run kexec-tools to load a crashdump kernel, it doesn't
find an entry for the ACPI DSDT table in a reserved range (it instead
finds it inside a System RAM range):
# kexec -p /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname
-r`.img --reuse-cmdline -d
...
get_memory_ranges_iomem_cb: 0000000000000000 - 000000003961ffff : System RAM
get_memory_ranges_iomem_cb: 0000000039620000 - 00000000396bffff : reserved
get_memory_ranges_iomem_cb: 00000000396c0000 - 000000003975ffff : System RAM
get_memory_ranges_iomem_cb: 0000000039760000 - 000000003976ffff : reserved
get_memory_ranges_iomem_cb: 0000000039770000 - 00000000397affff : reserved
get_memory_ranges_iomem_cb: 00000000397b0000 - 000000003989ffff : reserved
get_memory_ranges_iomem_cb: 00000000398a0000 - 00000000398bffff : reserved
get_memory_ranges_iomem_cb: 00000000398c0000 - 0000000039d3ffff : reserved
get_memory_ranges_iomem_cb: 0000000039d40000 - 000000003ed2ffff : System RAM
get_memory_ranges_iomem_cb: 000000003ed30000 - 000000003ed5ffff : reserved
get_memory_ranges_iomem_cb: 000000003ed60000 - 000000003fbfffff : System RAM
get_memory_ranges_iomem_cb: 0000001040000000 - 0000001ffbffffff : System RAM
get_memory_ranges_iomem_cb: 0000002000000000 - 0000002ffbffffff : System RAM
get_memory_ranges_iomem_cb: 0000009000000000 - 0000009ffbffffff : System RAM
get_memory_ranges_iomem_cb: 000000a000000000 - 000000affbffffff : System RAM
elf_arm64_probe: Not an ELF executable.
..
5. Now, when a crash is triggered to boot the crashkernel, we see it panic
while trying to access the ACPI tables (note that the logs below have
been snipped for clarity):
# echo c > /proc/sysrq-trigger
...
[ 419.495621] Bye!
...
[ 0.000000] efi: 0x0000396c0000-0x00003970ffff [Boot Code
| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000039710000-0x00003975ffff [ACPI Reclaim
Memory| | | | | | | | |WB|WT|WC|UC]
...
[ 0.000000] ACPI: DSDT 0x0000000039710000 006656 (v02 HISI HIP07
00000000 INTL 20151124)
...
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000010200000-0x00000000301fffff]
[ 0.000000] node 0: [mem 0x0000000039620000-0x00000000396bffff]
[ 0.000000] node 0: [mem 0x0000000039760000-0x000000003976ffff]
[ 0.000000] node 0: [mem 0x00000000397b0000-0x000000003989ffff]
[ 0.000000] node 0: [mem 0x00000000398c0000-0x0000000039d3ffff]
[ 0.000000] node 0: [mem 0x000000003ed30000-0x000000003ed5ffff]
...
[ 0.039309] ACPI: Core revision 20170728
[ 0.044383] Unable to handle kernel paging request at virtual address
ffff000009f10027
[ 0.052386] Mem abort info:
[ 0.055201] Exception class = DABT (current EL), IL = 32 bits
[ 0.061179] SET = 0, FnV = 0
[ 0.064258] EA = 0, S1PTW = 0
[ 0.067424] Data abort info:
[ 0.070326] ISV = 0, ISS = 0x00000021
[ 0.074195] CM = 0, WnR = 0
[ 0.077187] swapper pgtable: 64k pages, 48-bit VAs, pgd =
ffff000009650000
[ 0.084133] [ffff000009f10027] *pgd=00000000301d0003,
*pud=00000000301d0003, *pmd=00000000301c0003, *pte=00e8000039710707
[ 0.095215] Internal error: Oops: 96000021 [#1] SMP
[ 0.100139] Modules linked in:
[ 0.103219] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0+ #30
[ 0.109373] task: ffff000008d05580 task.stack: ffff000008cc0000
[ 0.115356] PC is at acpi_ns_lookup+0x25c/0x3c0
[ 0.119929] LR is at acpi_ds_load1_begin_op+0xa4/0x294
[ 0.125117] pc : [<ffff0000084a862c>] lr : [<ffff00000849d3c0>]
pstate: 60000045
[ 0.132589] sp : ffff000008ccfb40
[ 0.135930] x29: ffff000008ccfb40 x28: ffff000008a9c18c
[ 0.141295] x27: ffff0000088be820 x26: 0000000000000000
[ 0.146659] x25: 000000000000001b x24: 0000000000000001
[ 0.152024] x23: 0000000000000001 x22: ffff000009f10027
[ 0.157389] x21: ffff000008ccfc50 x20: 0000000000000001
[ 0.162753] x19: 000000000000001b x18: 0000000000000005
[ 0.168117] x17: 0000000000000000 x16: 0000000000000000
[ 0.173481] x15: 0000000000000000 x14: 000000000000038e
[ 0.178846] x13: ffffffff00000000 x12: ffffffffffffffff
[ 0.184210] x11: 0000000000000006 x10: 00000000ffffff76
[ 0.189574] x9 : 000000000000005f x8 : ffff800014670140
[ 0.194939] x7 : 0000000000000000 x6 : ffff000008ccfc50
[ 0.200303] x5 : ffff800012d45000 x4 : 0000000000000001
[ 0.205668] x3 : ffff000008ccfbe0 x2 : ffff0000095e3a00
[ 0.211032] x1 : ffff000009f10027 x0 : 0000000000000000
[ 0.216397] Process swapper/0 (pid: 0, stack limit = 0xffff000008cc0000)
[ 0.223166] Call trace:
[ 0.225629] Exception stack(0xffff000008ccfa00 to 0xffff000008ccfb40)
[ 0.232136] fa00: 0000000000000000 ffff000009f10027 ffff0000095e3a00
ffff000008ccfbe0
[ 0.240048] fa20: 0000000000000001 ffff800012d45000 ffff000008ccfc50
0000000000000000
[ 0.247960] fa40: ffff800014670140 000000000000005f 00000000ffffff76
0000000000000006
[ 0.255872] fa60: ffffffffffffffff ffffffff00000000 000000000000038e
0000000000000000
[ 0.263785] fa80: 0000000000000000 0000000000000000 0000000000000005
000000000000001b
[ 0.271697] faa0: 0000000000000001 ffff000008ccfc50 ffff000009f10027
0000000000000001
[ 0.279609] fac0: 0000000000000001 000000000000001b 0000000000000000
ffff0000088be820
[ 0.287521] fae0: ffff000008a9c18c ffff000008ccfb40 ffff00000849d3c0
ffff000008ccfb40
[ 0.295433] fb00: ffff0000084a862c 0000000060000045 ffff000008ccfb40
ffff000008261918
[ 0.303345] fb20: ffffffffffffffff ffff0000087f193c ffff000008ccfb40
ffff0000084a862c
[ 0.311258] [<ffff0000084a862c>] acpi_ns_lookup+0x25c/0x3c0
[ 0.316885] [<ffff00000849d3c0>] acpi_ds_load1_begin_op+0xa4/0x294
[ 0.323128] [<ffff0000084af374>] acpi_ps_build_named_op+0xc4/0x198
[ 0.329371] [<ffff0000084af594>] acpi_ps_create_op+0x14c/0x270
[ 0.335262] [<ffff0000084aee70>] acpi_ps_parse_loop+0x188/0x5c8
[ 0.341241] [<ffff0000084aff10>] acpi_ps_parse_aml+0xb0/0x2b8
[ 0.347044] [<ffff0000084aacd8>] acpi_ns_one_complete_parse+0x144/0x184
[ 0.353726] [<ffff0000084aad60>] acpi_ns_parse_table+0x48/0x68
[ 0.359616] [<ffff0000084aa194>] acpi_ns_load_table+0x4c/0xdc
[ 0.365420] [<ffff0000084b51c0>] acpi_tb_load_namespace+0xe4/0x264
[ 0.371664] [<ffff000008bafd64>] acpi_load_tables+0x48/0xc0
[ 0.377292] [<ffff000008badfd0>] acpi_early_init+0x9c/0xd0
[ 0.382832] [<ffff000008b70d50>] start_kernel+0x3b4/0x43c
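As a sanity check on the oops above, the ESR value (96000021) and the pte value (00e8000039710707) can be decoded by hand. The sketch below is not kernel code and the helper names are made up for illustration; the field positions are taken from the ARMv8 architecture description (EC in ESR bits [31:26], DFSC in bits [5:0], AttrIndx in descriptor bits [4:2]), and the output address mask assumes the 64k page size reported in the log:

```c
#include <stdint.h>

/* ESR_ELx exception class, bits [31:26]. */
static unsigned int esr_ec(uint32_t esr)
{
    return (esr >> 26) & 0x3f;
}

/* Data fault status code, bits [5:0]. */
static unsigned int esr_dfsc(uint32_t esr)
{
    return esr & 0x3f;
}

/* Output address of a page descriptor with 64k pages: bits [47:16]. */
static uint64_t pte_output_addr_64k(uint64_t pte)
{
    return pte & 0x0000ffffffff0000ULL;
}

/* Memory attribute index (AttrIndx), bits [4:2]. */
static unsigned int pte_attr_index(uint64_t pte)
{
    return (pte >> 2) & 0x7;
}
```

For these values, EC is 0x25 (data abort taken at the current EL, matching the log), DFSC is 0x21, and the pte points at physical address 0x39710000, which is the DSDT base. DFSC 0x21 is the alignment fault encoding, and AttrIndx is 1, which in this kernel's arm64 memory type table appears to be a device type (MT_DEVICE_nGnRE). If that reading is right, it would suggest the crashkernel maps the DSDT with device attributes and faults on an unaligned access at offset 0x27, but this is an interpretation of the dump, not something confirmed here.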
So, I am looking at what could be causing the 'Boot Code' and 'ACPI DSDT
table' ranges to be merged into a single region,
'0x0000396c0000-0x00003975ffff', whose DSDT part cannot be detected as
RESERVED using 'memblock_is_reserved' on the region base.
Any pointers?
Regards,
Bhupesh