From: Will Deacon <will.deacon@arm.com>
To: Wei Xu <xuwei5@hisilicon.com>
Cc: James Morse <james.morse@arm.com>,
mark.rutland@arm.com, catalin.marinas@arm.com,
Linuxarm <linuxarm@huawei.com>,
Zhangyi ac <zhangyi.ac@huawei.com>,
suzuki.poulose@arm.com, marc.zyngier@arm.com,
"Xiongfanggou (James)" <james.xiong@huawei.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, dave.martin@arm.com,
"Liyuan (Larry, Turing Solution)" <Larry.T@huawei.com>,
libeijian@hisilicon.com
Subject: Re: KVM guest sometimes failed to boot because of kernel stack overflow if KPTI is enabled on a hisilicon ARM64 platform.
Date: Tue, 26 Jun 2018 18:47:46 +0100
Message-ID: <20180626174746.GO23375@arm.com>
In-Reply-To: <5B3274FC.7000206@hisilicon.com>
Hi Wei,
On Wed, Jun 27, 2018 at 01:16:44AM +0800, Wei Xu wrote:
> Today I tried the kernel 4.18-rc2(defconfig, no change on top) with qemu
> 2.12.0.
> The guest sometimes still failed to boot. But the crash reason is different.
> Could you please share any hint?
> Thanks!
>
> The guest boot log is as below:
> ===========================
>
> estuary:/$ ./qemu-system-aarch64 -machine virt,kernel_irqchip=on,gic-version=3 \
>     -cpu host -enable-kvm -smp 1 -m 1024 -kernel ./Image-4.18-joyx \
>     -initrd ../mini-rootfs-arm64.cpio.gz -nographic \
>     -append "rdinit=init console=ttyAMA0 earlycon=pl011,0x9000000"
>
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x480fd010]
> [ 0.000000] Linux version 4.18.0-rc2-58583-g7daf201-dirty
I'm still suspicious that this is 4.18-rc2 with "no change on top", given
the version string above!
> [ 0.048119] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000288
> [ 0.048991] Mem abort info:
> [ 0.049267] ESR = 0x96000004
> [ 0.049567] Exception class = DABT (current EL), IL = 32 bits
> [ 0.050146] SET = 0, FnV = 0
> [ 0.050446] EA = 0, S1PTW = 0
> [ 0.050754] Data abort info:
> [ 0.051038] ISV = 0, ISS = 0x00000004
> [ 0.051921] CM = 0, WnR = 0
> [ 0.054936] [0000000000000288] user address but active_mm is swapper
> [ 0.061427] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [ 0.067080] Modules linked in:
> [ 0.070206] CPU: 0 PID: 13 Comm: migration/0 Not tainted 4.18.0-rc2-58583-g7daf201-dirty #20
> [ 0.078745] Hardware name: linux,dummy-virt (DT)
> [ 0.083433] pstate: 60400085 (nZCv daIf +PAN -UAO)
> [ 0.088258] pc : kpti_install_ng_mappings+0x154/0x214
> [ 0.093319] lr : kpti_install_ng_mappings+0x120/0x214
> [ 0.098483] sp : ffff0000093fbce0
> [ 0.101854] x29: ffff0000093fbce0 x28: ffff000008ee5000
> [ 0.107263] x27: ffff000008ee5000 x26: ffff00000923b000
> [ 0.112568] x25: ffff0000090ac000 x24: ffff0000091d9000
> [ 0.117983] x23: ffff000008ee5000 x22: 00000000411d8000
> [ 0.123392] x21: ffff00000923b000 x20: 0000000000000000
> [ 0.128801] x19: ffff0000091d8000 x18: 000000003455d99d
> [ 0.134209] x17: 0000000000000001 x16: 00f8000040ffff13
> [ 0.139513] x15: 000000007dff5000 x14: 000000007dff5000
> [ 0.144920] x13: 00f800007fe00f11 x12: 000000007dff7000
> [ 0.150329] x11: 000000007dff7000 x10: 0000000000000000
> [ 0.155633] x9 : 000000007dff8000 x8 : 000000007dff8000
> [ 0.161042] x7 : 0000000000000000 x6 : 000000004123c000
> [ 0.166451] x5 : 000000004123c000 x4 : 0000000040a5f3d4
> [ 0.171860] x3 : 0000000000000000 x2 : 000000004123b000
> [ 0.177163] x1 : ffff0000090acd88 x0 : ffff80003ca627c0
So looking at the disassembly, we access idmap_t0sz as part of
cpu_install_idmap() and it looks like we push its page address to the
stack:
> 0xffff000008091ffc <+128>: adrp x3, 0xffff000009096000 <early_node_cpu_hwid+1440>
[...]
> 0xffff000008092044 <+200>: str x3, [x29,#96]
Then after we've come back from the asm call, we want to access idmap_t0sz
again as part of cpu_uninstall_idmap() so we pop it back off:
> 0xffff0000080920cc <+336>: ldr x3, [x29,#96]
> 0xffff0000080920d0 <+340>: ldr x0, [x3,#648]
And this access is the one that faults, because we popped off NULL.
So actually, rather than faulting on the stack access, we're managing to
load zeroes from somewhere, so it could still be indicative of page table
corruption for the stack mapping.
If you look at the __idmap_kpti_put_pgtable_ent_ng asm macro, can you try
replacing:

	dc	civac, cur_\()\type\()p

with:

	dc	ivac, cur_\()\type\()p

please? Only do this for the guest kernel, not the host. KVM will upgrade
the invalidate to a clean+invalidate, so it's interesting to see if this has
an effect on the behaviour.
Will