From: xuwei5@hisilicon.com (Wei Xu)
To: linux-arm-kernel@lists.infradead.org
Subject: KVM guest sometimes failed to boot because of kernel stack overflow if KPTI is enabled on a hisilicon ARM64 platform.
Date: Fri, 22 Jun 2018 23:28:21 +0800
Message-ID: <5B2D1595.6020000@hisilicon.com>
In-Reply-To: <20180622142851.g3r4em3kidx5p3wv@lakrids.cambridge.arm.com>
Hi Mark,
On 2018/6/22 22:28, Mark Rutland wrote:
> On Fri, Jun 22, 2018 at 09:18:27PM +0800, Wei Xu wrote:
>> [ 0.042462] Insufficient stack space to handle exception!
>> [ 0.042464] ESR: 0x96000046 -- DABT (current EL)
>> [ 0.043781] FAR: 0xffff0000093a80e0
>> [ 0.044239] Task stack: [0xffff0000093a8000..0xffff0000093ac000]
> Here, the FAR points somewhere in the task stack, so we're evidently
> faulting on that...
>
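The range check Mark describes can be reproduced by hand. The following is a minimal sketch, not kernel code: the stack bounds are copied from the log in this message, the helper name which_stack is hypothetical, and treating the upper bound as exclusive is an assumption.

```python
# Stack ranges copied from the log in this message, as (low, high) pairs.
# Assumption: the upper bound is exclusive.
STACKS = {
    "task": (0xffff0000093a8000, 0xffff0000093ac000),
    "irq": (0xffff000008000000, 0xffff000008004000),
    "overflow": (0xffff80003efce2f0, 0xffff80003efcf2f0),
}


def which_stack(addr):
    """Return (name, offset from the low end) of the stack holding addr, or None."""
    for name, (low, high) in STACKS.items():
        if low <= addr < high:
            return name, addr - low
    return None


FAR = 0xffff0000093a80e0  # same value as sp in the register dump
print(which_stack(FAR))
```

The faulting address lands 0xe0 bytes above the low end of the 16 KiB task stack, and it is the same value the register dump shows for sp.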
>> [ 0.046967] IRQ stack: [0xffff000008000000..0xffff000008004000]
>> [ 0.053361] Overflow stack: [0xffff80003efce2f0..0xffff80003efcf2f0]
>> [ 0.059754] CPU: 0 PID: 12 Comm: migration/0 Not tainted 4.17.0-45864-g29dcea8-dirty #16
>> [ 0.067946] Hardware name: linux,dummy-virt (DT)
>> [ 0.072644] pstate: 604003c5 (nZCv DAIF +PAN -UAO)
>> [ 0.077480] pc : el1_sync+0x0/0xb0
>> [ 0.080970] lr : kpti_install_ng_mappings+0x120/0x214
>> [ 0.086143] sp : ffff0000093a80e0
>> [ 0.089513] x29: ffff0000093abce0 x28: ffff000008ea9000
>> [ 0.094929] x27: ffff000008ea9000 x26: ffff0000091f7000
>> [ 0.100241] x25: ffff00000906d000 x24: ffff000009191000
>> [ 0.105657] x23: ffff000008ea9000 x22: 0000000041190000
>> [ 0.111448] x21: ffff0000091f7000 x20: 0000000000000000
>> [ 0.116437] x19: ffff000009190000 x18: 000000003455d99d
>> [ 0.121739] x17: 0000000000000001 x16: 00f8000040ffff13
>> [ 0.127155] x15: 000000007eff6000 x14: 000000007eff6000
>> [ 0.132576] x13: 00f800007fe00f11 x12: 000000007eff8000
>> [ 0.137886] x11: 000000007eff8000 x10: 0000000000000000
>> [ 0.143300] x9 : 000000007eff9000 x8 : 000000007eff9000
>> [ 0.148717] x7 : 0000000000000000 x6 : 00000000411f8000
>> [ 0.154028] x5 : 00000000411f8000 x4 : 0000000040a443d4
>> [ 0.159444] x3 : 00000000411f7000 x2 : 00000000411f7000
>> [ 0.164862] x1 : ffff00000906d7b0 x0 : ffff80003da61c00
>> [ 0.170179] Kernel panic - not syncing: kernel stack overflow
>> [ 0.176069] CPU: 0 PID: 12 Comm: migration/0 Not tainted 4.17.0-45864-g29dcea8-dirty #16
>> [ 0.184152] Hardware name: linux,dummy-virt (DT)
>> [ 0.188851] Call trace:
>> [ 0.191380] dump_backtrace+0x0/0x180
>> [ 0.195113] show_stack+0x14/0x1c
>> [ 0.198488] dump_stack+0x90/0xb0
>> [ 0.201862] panic+0x138/0x2a0
>> [ 0.204989] __stack_chk_fail+0x0/0x18
>> [ 0.208836] handle_bad_stack+0x118/0x124
>> [ 0.212927] __bad_stack+0x88/0x8c
>> [ 0.216414] el1_sync+0x0/0xb0
>> [ 0.219544] Unable to handle kernel paging request at virtual address ffff0000093abce0
> Likewise, here we're faulting on an address within the task stack,
> presumably as part of the unwinding process...
>
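The ESR values in this trace can be split into fields to see what kind of abort was taken. This is a rough sketch assuming the ESR_ELx layout from the Armv8-A architecture (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], and for data aborts WnR in ISS bit 6 and DFSC in ISS bits [5:0]). The function name decode_esr and the small lookup table are illustrative only and cover just the codes relevant here.

```python
# Only the fault status codes relevant to this trace are named.
DFSC_NAMES = {
    0x04: "translation fault, level 0",
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR_ELx value into fields; wnr/dfsc only apply to data aborts."""
    iss = esr & 0x1ffffff              # ISS: bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3f,      # exception class: bits [31:26]
        "il": (esr >> 25) & 1,         # 1 means a 32-bit instruction
        "iss": iss,
        "wnr": (iss >> 6) & 1,         # 1 = write, 0 = read
        "dfsc": iss & 0x3f,            # data fault status code
    }


for value in (0x96000046, 0x96000006):
    fields = decode_esr(value)
    print(hex(value), hex(fields["ec"]),
          "write" if fields["wnr"] else "read",
          DFSC_NAMES.get(fields["dfsc"], "other"))
```

Under those assumptions both values decode to EC 0x25 (data abort at the current EL) with a level 2 translation fault, and they differ only in WnR: the first fault (0x96000046) is a write and the later one (0x96000006) is a read.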
>> [ 0.227507] Mem abort info:
>> [ 0.230390] ESR = 0x96000006
>> [ 0.233517] Exception class = DABT (current EL), IL = 32 bits
>> [ 0.239428] SET = 0, FnV = 0
>> [ 0.242555] EA = 0, S1PTW = 0
>> [ 0.245797] Data abort info:
>> [ 0.248795] ISV = 0, ISS = 0x00000006
>> [ 0.252652] CM = 0, WnR = 0
>> [ 0.255769] swapper pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
>> [ 0.262645] [ffff0000093abce0] pgd=00000000411f8803, pud=00000000411f9803, pmd=0000000000000000
> ... and here the PMD for the task stack is all zeroes, so evidently
> that's getting corrupted somehow.
>
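To make the walk concrete, here is a small sketch of how the faulting address splits into table indices for the configuration printed in the log (4k pages, 48-bit VAs, so four levels of 9 bits each), together with a validity check on the descriptors that were dumped. The helper names table_indices and is_valid are hypothetical; the only architectural fact relied on is that bit 0 of a descriptor is the valid bit.

```python
PAGE_SHIFT = 12        # 4k pages
BITS_PER_LEVEL = 9     # 512 entries per table
LEVELS = ("pgd", "pud", "pmd", "pte")


def table_indices(va):
    """Return the index used at each level when walking va."""
    mask = (1 << BITS_PER_LEVEL) - 1
    indices = {}
    for depth, name in enumerate(LEVELS):
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (len(LEVELS) - 1 - depth)
        indices[name] = (va >> shift) & mask
    return indices


def is_valid(descriptor):
    """Bit 0 of a translation table descriptor is the valid bit."""
    return bool(descriptor & 1)


# Values copied from the fault report in this message.
va = 0xffff0000093abce0
dumped = {"pgd": 0x00000000411f8803, "pud": 0x00000000411f9803, "pmd": 0x0}

print(table_indices(va))
for level, descriptor in dumped.items():
    print(level, hex(descriptor), "valid" if is_valid(descriptor) else "invalid")
```

With these inputs the pgd and pud entries are valid and the walk stops at the pmd entry, which is what a level 2 translation fault would report.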
> It appears that the overflow stack (which IIRC is embedded within the
> kernel's data segment, as part of the image mapping), is fine.
>
> I wonder if there's some existing weirdness in the page tables for the
> vmalloc area that causes things to go wrong. Can you please:
>
> * enable ARM64_PTDUMP_DEBUGFS
>
> * boot with kpti=off (with Will's patch to make this work)
>
> * as root, cat /sys/kernel/debug/kernel_page_tables
>
> ... and dump the result here?
Thanks!
Can I do this later since Will's new patch works?
Best Regards,
Wei
> Thanks,
> Mark.