From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: KVM guest sometimes failed to boot because of kernel stack overflow if KPTI is enabled on a hisilicon ARM64 platform.
Date: Fri, 22 Jun 2018 10:23:31 +0100
Message-ID: <20180622092330.GD7601@arm.com>
In-Reply-To: <5B2CB440.8040705@hisilicon.com>

Hi Wei,

On Fri, Jun 22, 2018 at 09:33:04AM +0100, Wei Xu wrote:
> On 2018/6/21 11:54, Will Deacon wrote:
> > On Thu, Jun 21, 2018 at 11:14:28AM +0100, Wei Xu wrote:
> >> On 2018/6/21 10:18, Will Deacon wrote:
> >>> Wei -- does the diff below help at all? Make sure you disable CONFIG_KASAN,
> >>> otherwise your kernel will take an age to boot.
> >>
> >> Yes, amazing! This patch resolved the issue.
> > 
> > Great...
> > 
> >> I have tested 50 times and cannot reproduce the issue any more.
> >> Could you please explain why this patch works?
> > 
> > You might need to ask your CPU design team ;)
> > 
> > Without this patch, the code in idmap_kpti_install_ng_mappings() sets
> > bit 11 in table descriptors so that we can keep track of which parts of
> > the page table we've visited. With this patch, we don't bother tracking
> > and potentially rewalk parts of the page table (which takes a very long
> > time if KASAN is enabled).
> 
> Got it. Thanks!
> 
> > 
> > The architecture documents I've looked at are clear that bit 11 is IGNORED
> > by the CPU, which:
> > 
> >   "Indicates that the architecture guarantees that the bit or field is not
> >    interpreted or modified by hardware."
> > 
> > Please can you double-check that your CPU is indeed ignoring bit 11 in
> > non-leaf (table) descriptors?
> 
> By non-leaf (table) descriptors, do you mean the table descriptors
> in section D4.3.1 "VMSAv8-64 translation table level 0, level 1, and level 2 descriptor formats"
> of the ARM Architecture Reference Manual ARMv8, for ARMv8-A (DDI0487C_a_armv8_arm.pdf)?
> 
> If so, our hardware does ignore it (it neither interprets nor modifies it).

Ok, thanks for checking.

> Is there any other possible reason that could cause this?

Perhaps just writing back the table entries is enough to cause the issue,
although I really can't understand why that would be the case. Can you try
the diff below (without my previous change), please?

Will

--->8

diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 5f9a73a4452c..e2a8e88f95a0 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -216,7 +216,7 @@ ENDPROC(idmap_cpu_replace_ttbr1)
 	.endm
 
 	.macro __idmap_kpti_put_pgtable_ent_ng, type
-	orr	\type, \type, #PTE_NG		// Same bit for blocks and pages
+	eor	\type, \type, #PTE_NG		// Same bit for blocks and pages
 	str	\type, [cur_\()\type\()p]	// Update the entry and ensure it
 	dc	civac, cur_\()\type\()p		// is visible to all CPUs.
 	.endm
@@ -298,6 +298,7 @@ skip_pgd:
 	/* PUD */
 walk_puds:
 	.if CONFIG_PGTABLE_LEVELS > 3
+	eor	pgd, pgd, #PTE_NG
 	pte_to_phys	cur_pudp, pgd
 	add	end_pudp, cur_pudp, #(PTRS_PER_PUD * 8)
 do_pud:	__idmap_kpti_get_pgtable_ent	pud
@@ -319,6 +320,7 @@ next_pud:
 	/* PMD */
 walk_pmds:
 	.if CONFIG_PGTABLE_LEVELS > 2
+	eor	pud, pud, #PTE_NG
 	pte_to_phys	cur_pmdp, pud
 	add	end_pmdp, cur_pmdp, #(PTRS_PER_PMD * 8)
 do_pmd:	__idmap_kpti_get_pgtable_ent	pmd
@@ -339,6 +341,7 @@ next_pmd:
 
 	/* PTE */
 walk_ptes:
+	eor	pmd, pmd, #PTE_NG
 	pte_to_phys	cur_ptep, pmd
 	add	end_ptep, cur_ptep, #(PTRS_PER_PTE * 8)
 do_pte:	__idmap_kpti_get_pgtable_ent	pte
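
For readers following along, here is a minimal C sketch of what the diff
above is meant to test. It is a model, not kernel code: desc_is_table(),
writeback_orr(), writeback_eor() and check_examples() are hypothetical
names, and the example descriptor values are made up. It assumes the usual
VMSAv8-64 encoding (bit 0 valid, bit 1 table at levels 0 to 2) and that
PTE_NG is bit 11, as discussed above. With orr, a walked table entry is
written back with bit 11 set. With the paired eor instructions, the bit is
flipped once in the register before the walk and once more on write-back,
so the table entry should be stored unchanged, while leaf entries still
gain nG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Descriptor bits assumed by this sketch (VMSAv8-64, levels 0 to 2). */
#define DESC_VALID (UINT64_C(1) << 0)  /* bit 0: entry is valid */
#define DESC_TABLE (UINT64_C(1) << 1)  /* bit 1: points to a next-level table */
#define PTE_NG     (UINT64_C(1) << 11) /* bit 11: nG in leaves, IGNORED in tables */

/* Hypothetical helper: valid entry pointing to a next-level table. */
static bool desc_is_table(uint64_t desc)
{
	return (desc & (DESC_VALID | DESC_TABLE)) == (DESC_VALID | DESC_TABLE);
}

/* Behaviour before the diff: orr always sets bit 11 on write-back. */
static uint64_t writeback_orr(uint64_t desc)
{
	return desc | PTE_NG;
}

/*
 * Behaviour with the diff. Assumes bit 11 is clear on entry.
 * A table entry gets one eor at walk_puds/walk_pmds/walk_ptes and a
 * second eor in __idmap_kpti_put_pgtable_ent_ng, so the two cancel.
 * A leaf entry only sees the eor in the put macro, so nG ends up set.
 */
static uint64_t writeback_eor(uint64_t desc, bool walked_as_table)
{
	if (walked_as_table)
		desc ^= PTE_NG;	/* eor before pte_to_phys */
	return desc ^ PTE_NG;	/* eor in the put macro */
}

/* Made-up example values: a table entry and a block entry. */
void check_examples(void)
{
	uint64_t table = UINT64_C(0x40000003);
	uint64_t block = UINT64_C(0x40000001);

	assert(desc_is_table(table) && !desc_is_table(block));
	assert(writeback_orr(table) == (table | PTE_NG));  /* marked as visited */
	assert(writeback_eor(table, true) == table);       /* stored unchanged */
	assert(writeback_eor(block, false) == (block | PTE_NG));
}
```

If the failure still reproduces with the table entries written back
unchanged, that would point at the store and cache maintenance rather than
at the value of bit 11.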
