From: xuwei5@hisilicon.com (Wei Xu)
To: linux-arm-kernel@lists.infradead.org
Subject: KVM guest sometimes failed to boot because of kernel stack overflow if KPTI is enabled on a hisilicon ARM64 platform.
Date: Thu, 21 Jun 2018 10:20:42 +0100
Message-ID: <5B2B6DEA.2090100@hisilicon.com>
In-Reply-To: <e701eaa8-dcb9-777c-2211-67ee27b43acb@arm.com>
Hi James,
On 2018/6/21 9:38, James Morse wrote:
> Hi Will, Wei,
>
> On 20/06/18 17:25, Wei Xu wrote:
>> On 2018/6/20 23:54, James Morse wrote:
>> I have disabled CONFIG_ARM64_RAS_EXTN and reverted that commit.
>> But I still got the stack overflow issue sometimes.
>> Do you have more hint?
>
>> The log is as below:
>> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x480fd010]
>> [ 0.000000] Linux version 4.17.0-45865-g2b31fe7-dirty
>
> Could you reproduce this with v4.17? This says there are ~45,000 extra patches,
> and un-committed changes. None of the hashes so far have been commits in
> mainline, so we have no idea what this tree is.
>
I have tried v4.17. The log is below, and it can also be found in the first
mail of this thread.
[ 0.000000] Linux version 4.17.0-45864-g29dcea8-dirty
(joyx@Turing-Arch-b) (gcc version 4.9.1 20140505 (prerelease) (crosstool-NG
linaro-1.13.1-4.9-2014.05 - Linaro GCC 4.9-2014.05)) #6 SMP PREEMPT Fri Jun
15 21:39:52 CST 2018
I will try v4.17.2 and v4.18-rc1.
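For reference, the release strings in these logs can be split into their parts. The sketch below is illustrative only: it assumes the suffix follows the usual `-<commits>-g<hash>[-dirty]` layout derived from `git describe`, and the function name `parse_release` is made up for this example.

```python
import re

# Assumed layout: <base>[-<commits since tag>-g<abbreviated hash>][-dirty]
RELEASE_RE = re.compile(
    r"(?P<base>\d+\.\d+\.\d+(?:-rc\d+)?)"
    r"(?:-(?P<ahead>\d+)-g(?P<commit>[0-9a-f]+))?"
    r"(?P<dirty>-dirty)?"
)

def parse_release(release):
    """Split a kernel release string into base version, commit count, hash, dirty flag."""
    m = RELEASE_RE.fullmatch(release)
    if m is None:
        raise ValueError("unrecognised release string: %r" % release)
    return {
        "base": m.group("base"),
        "commits_ahead": int(m.group("ahead")) if m.group("ahead") else 0,
        "commit": m.group("commit"),
        "dirty": m.group("dirty") is not None,
    }

print(parse_release("4.17.0-45864-g29dcea8-dirty"))
```

A non-zero commit count or a set dirty flag means the build is not the plain tagged release, which matches how James reads the earlier log.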
>
>> (joyx@Turing-Arch-b) (gcc version 4.9.1 20140505 (prerelease) (crosstool-NG
>> linaro-1.13.1-4.9-2014.05 - Linaro GCC 4.9-2014.05)) #10 SMP PREEMPT Wed Jun 20
>> 23:59:05 CST 2018
>
>> [ 0.000000] CPU0: using LPI pending table @0x000000007d860000
>> [ 0.000000] GIC: PPI11 is secure or misconfigured
>> [ 0.000000] arch_timer: WARNING: Invalid trigger for IRQ3, assuming level low
>> [ 0.000000] arch_timer: WARNING: Please fix your firmware
>> [ 0.000000] arch_timer: cp15 timer(s) running at 100.00MHz (virt).
>
> (No idea what these mean, but I doubt they are relevant)
>
I will try with mainline qemu 2.12.0.
Thanks!
Best Regards,
Wei
>
>> [ 0.042421] Insufficient stack space to handle exception!
>> [ 0.042423] ESR: 0x96000046 -- DABT (current EL)
>> [ 0.043730] FAR: 0xffff0000093a80e0
>> [ 0.044714] Task stack: [0xffff0000093a8000..0xffff0000093ac000]
>
> This was a level 2 translation fault on a write, to an address that is within
> the stack....
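For reference, this reading of the ESR and FAR values can be checked by hand. The sketch below is illustrative only: it assumes the standard ESR_ELx layout (EC in bits [31:26], ISS in bits [24:0], and for data aborts WnR in bit 6 and DFSC in bits [5:0]). The names `decode_data_abort` and `stack_offset` are made up for this example and are not kernel functions.

```python
# Only the fault-status groups needed here; each group encodes the level in its low two bits.
FAULT_KINDS = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}
# Only the two data-abort exception classes are named.
EC_NAMES = {0x24: "DABT (lower EL)", 0x25: "DABT (current EL)"}

def decode_data_abort(esr):
    """Decode the EC, WnR and DFSC fields of a data-abort syndrome value."""
    ec = (esr >> 26) & 0x3F
    iss = esr & 0x1FFFFFF
    dfsc = iss & 0x3F
    kind = FAULT_KINDS.get(dfsc >> 2)
    return {
        "class": EC_NAMES.get(ec, "other (0x%02x)" % ec),
        "write": bool(iss & (1 << 6)),
        "fault": "%s, level %d" % (kind, dfsc & 0x3) if kind else "other (0x%02x)" % dfsc,
    }

def stack_offset(far, low, high):
    """Return the offset of far from low if low <= far < high, else None."""
    return far - low if low <= far < high else None

print(decode_data_abort(0x96000046))
print(stack_offset(0xffff0000093a80e0, 0xffff0000093a8000, 0xffff0000093ac000))
```

With the values from the log, the faulting address lies 0xe0 bytes above the lowest address of the 16 KiB task stack.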
>
>
>> [ 0.051113] IRQ stack: [0xffff000008000000..0xffff000008004000]
>> [ 0.057610] Overflow stack: [0xffff80003efce2f0..0xffff80003efcf2f0]
>> [ 0.064003] CPU: 0 PID: 12 Comm: migration/0 Not tainted 4.17.0-45865-g2b31fe7-dirty #10
>> [ 0.072201] Hardware name: linux,dummy-virt (DT)
>
>> [ 0.076797] pstate: 604003c5 (nZCv DAIF +PAN -UAO)
>> [ 0.081727] pc : el1_sync+0x0/0xb0
>
> ... from the vectors.
>
>
>> [ 0.085217] lr : kpti_install_ng_mappings+0x120/0x214
>
> What I think is happening is: we come out of the kpti idmap with the stack
> unmapped. Shortly after we access the stack, which faults. el1_sync faults as
> well when it tries to push the registers to the stack, and we keep going until
> we overflow the stack.
>
> I can't reproduce this with kvmtool or qemu in the model.
>
>
> Thanks,
>
> James
>
> .
>