* [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 @ 2021-10-26 3:48 Reiji Watanabe 2021-10-26 9:06 ` Catalin Marinas 2021-10-26 11:22 ` Robin Murphy 0 siblings, 2 replies; 10+ messages in thread From: Reiji Watanabe @ 2021-10-26 3:48 UTC (permalink / raw) To: Catalin Marinas, Will Deacon Cc: Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata, Reiji Watanabe Currently, clear_page() uses DC ZVA instruction unconditionally. But it should make sure that DCZID_EL0.DZP, which indicates whether or not use of DC ZVA instruction is prohibited, is zero when using the instruction. Use stp as memset does instead when DCZID_EL0.DZP == 1. Signed-off-by: Reiji Watanabe <reijiw@google.com> --- arch/arm64/lib/clear_page.S | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S index b84b179edba3..7ce1bfa4081c 100644 --- a/arch/arm64/lib/clear_page.S +++ b/arch/arm64/lib/clear_page.S @@ -16,6 +16,7 @@ */ SYM_FUNC_START_PI(clear_page) mrs x1, dczid_el0 + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ and w1, w1, #0xf mov x2, #4 lsl x1, x2, x1 @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page) tst x0, #(PAGE_SIZE - 1) b.ne 1b ret + +2: mov x1, #(PAGE_SIZE) + sub x0, x0, #16 /* Pre-bias. */ +3: stp xzr, xzr, [x0, #16] + stp xzr, xzr, [x0, #32] + stp xzr, xzr, [x0, #48] + stp xzr, xzr, [x0, #64]! + subs x1, x1, #64 + b.gt 3b + ret SYM_FUNC_END_PI(clear_page) EXPORT_SYMBOL(clear_page) -- 2.33.0.1079.g6e70778dc9-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-26 3:48 [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 Reiji Watanabe @ 2021-10-26 9:06 ` Catalin Marinas 2021-10-27 1:35 ` Reiji Watanabe 2021-10-26 11:22 ` Robin Murphy 1 sibling, 1 reply; 10+ messages in thread From: Catalin Marinas @ 2021-10-26 9:06 UTC (permalink / raw) To: Reiji Watanabe Cc: Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Mon, Oct 25, 2021 at 08:48:44PM -0700, Reiji Watanabe wrote: > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > should make sure that DCZID_EL0.DZP, which indicates whether or not use > of DC ZVA instruction is prohibited, is zero when using the instruction. > Use stp as memset does instead when DCZID_EL0.DZP == 1. For correctness, this is fine, but have you come across a system that has DZP == 1 (or hypervisor setting HCR_EL2.TDZ)? -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-26 9:06 ` Catalin Marinas @ 2021-10-27 1:35 ` Reiji Watanabe 0 siblings, 0 replies; 10+ messages in thread From: Reiji Watanabe @ 2021-10-27 1:35 UTC (permalink / raw) To: Catalin Marinas Cc: Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Tue, Oct 26, 2021 at 2:06 AM Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Mon, Oct 25, 2021 at 08:48:44PM -0700, Reiji Watanabe wrote: > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > For correctness, this is fine, but have you come across a system that > has DZP == 1 (or hypervisor setting HCR_EL2.TDZ)? We are thinking of setting HCR_EL2.TDZ considering live migration support between two systems with different DCZID_EL0.BS. Thanks, Reiji _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-26 3:48 [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 Reiji Watanabe 2021-10-26 9:06 ` Catalin Marinas @ 2021-10-26 11:22 ` Robin Murphy 2021-10-26 12:23 ` Mark Rutland 1 sibling, 1 reply; 10+ messages in thread From: Robin Murphy @ 2021-10-26 11:22 UTC (permalink / raw) To: Reiji Watanabe, Catalin Marinas, Will Deacon Cc: Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On 2021-10-26 04:48, Reiji Watanabe wrote: > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > should make sure that DCZID_EL0.DZP, which indicates whether or not use > of DC ZVA instruction is prohibited, is zero when using the instruction. > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > --- > arch/arm64/lib/clear_page.S | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > index b84b179edba3..7ce1bfa4081c 100644 > --- a/arch/arm64/lib/clear_page.S > +++ b/arch/arm64/lib/clear_page.S > @@ -16,6 +16,7 @@ > */ > SYM_FUNC_START_PI(clear_page) > mrs x1, dczid_el0 > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ > and w1, w1, #0xf > mov x2, #4 > lsl x1, x2, x1 > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page) > tst x0, #(PAGE_SIZE - 1) > b.ne 1b > ret > + > +2: mov x1, #(PAGE_SIZE) > + sub x0, x0, #16 /* Pre-bias. */ Out of curiosity, what's this for? It's not like we need to worry about PAGE_SIZE or page addresses being misaligned. I don't really see why we'd need a different condition from the DC ZVA loop. Robin. > +3: stp xzr, xzr, [x0, #16] > + stp xzr, xzr, [x0, #32] > + stp xzr, xzr, [x0, #48] > + stp xzr, xzr, [x0, #64]! > + subs x1, x1, #64 > + b.gt 3b > + ret > SYM_FUNC_END_PI(clear_page) > EXPORT_SYMBOL(clear_page) > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-26 11:22 ` Robin Murphy @ 2021-10-26 12:23 ` Mark Rutland 2021-10-27 6:44 ` Reiji Watanabe 0 siblings, 1 reply; 10+ messages in thread From: Mark Rutland @ 2021-10-26 12:23 UTC (permalink / raw) To: Robin Murphy Cc: Reiji Watanabe, Catalin Marinas, Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote: > On 2021-10-26 04:48, Reiji Watanabe wrote: > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > > --- > > arch/arm64/lib/clear_page.S | 11 +++++++++++ > > 1 file changed, 11 insertions(+) > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > > index b84b179edba3..7ce1bfa4081c 100644 > > --- a/arch/arm64/lib/clear_page.S > > +++ b/arch/arm64/lib/clear_page.S > > @@ -16,6 +16,7 @@ > > */ > > SYM_FUNC_START_PI(clear_page) > > mrs x1, dczid_el0 > > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA} are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to s/GVA/ZVA/ here. Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(), which'll need a similar update... > > and w1, w1, #0xf > > mov x2, #4 > > lsl x1, x2, x1 > > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page) > > tst x0, #(PAGE_SIZE - 1) > > b.ne 1b > > ret > > + > > +2: mov x1, #(PAGE_SIZE) > > + sub x0, x0, #16 /* Pre-bias. */ > > Out of curiosity, what's this for? It's not like we need to worry about > PAGE_SIZE or page addresses being misaligned. I don't really see why we'd > need a different condition from the DC ZVA loop. I believe this was copied from arch/arm64/lib/memset.S, in the `.Lnot_short` case, where we have: | .Lnot_short: | sub dst, dst, #16/* Pre-bias. */ | sub count, count, #64 | 1: | stp A_l, A_l, [dst, #16] | stp A_l, A_l, [dst, #32] | stp A_l, A_l, [dst, #48] | stp A_l, A_l, [dst, #64]! | subs count, count, #64 | b.ge 1b > Robin. > > > +3: stp xzr, xzr, [x0, #16] > > + stp xzr, xzr, [x0, #32] > > + stp xzr, xzr, [x0, #48] > > + stp xzr, xzr, [x0, #64]! > > + subs x1, x1, #64 > > + b.gt 3b > > + ret FWIW, I'd also prefer consistency with the existing loop, i.e. 2: stp xzr, xzr, [x0, #0] stp xzr, xzr, [x0, #16] stp xzr, xzr, [x0, #32] stp xzr, xzr, [x0, #48] add x0, x0, #64 tst x0, #(PAGE_SIZE - 1) b.ne 2b ret Thanks, Mark. > > SYM_FUNC_END_PI(clear_page) > > EXPORT_SYMBOL(clear_page) > > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-26 12:23 ` Mark Rutland @ 2021-10-27 6:44 ` Reiji Watanabe 2021-10-27 11:09 ` Mark Rutland 0 siblings, 1 reply; 10+ messages in thread From: Reiji Watanabe @ 2021-10-27 6:44 UTC (permalink / raw) To: Mark Rutland Cc: Robin Murphy, Catalin Marinas, Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote: > > On 2021-10-26 04:48, Reiji Watanabe wrote: > > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > > > --- > > > arch/arm64/lib/clear_page.S | 11 +++++++++++ > > > 1 file changed, 11 insertions(+) > > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > > > index b84b179edba3..7ce1bfa4081c 100644 > > > --- a/arch/arm64/lib/clear_page.S > > > +++ b/arch/arm64/lib/clear_page.S > > > @@ -16,6 +16,7 @@ > > > */ > > > SYM_FUNC_START_PI(clear_page) > > > mrs x1, dczid_el0 > > > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA} > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to > s/GVA/ZVA/ here. Thank you for catching it ! I will fix that. > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(), > which'll need a similar update... Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get updated as well. But, Since I'm not familiar with MTE (and I don't have any plans to use MTE yet), I didn't work on them (I'm not sure how I can test them). I might try to fix them separately later as well when I have time (not so soon most likely though). > > > and w1, w1, #0xf > > > mov x2, #4 > > > lsl x1, x2, x1 > > > @@ -25,5 +26,15 @@ SYM_FUNC_START_PI(clear_page) > > > tst x0, #(PAGE_SIZE - 1) > > > b.ne 1b > > > ret > > > + > > > +2: mov x1, #(PAGE_SIZE) > > > + sub x0, x0, #16 /* Pre-bias. */ > > > > Out of curiosity, what's this for? It's not like we need to worry about > > PAGE_SIZE or page addresses being misaligned. I don't really see why we'd > > need a different condition from the DC ZVA loop. > > I believe this was copied from arch/arm64/lib/memset.S, in the > `.Lnot_short` case, where we have: Yes, I used the same code as memset. I couldn't come up with any other implementation that could get same performance with fewer number of instructions in the loop than that. > | .Lnot_short: > | sub dst, dst, #16/* Pre-bias. */ > | sub count, count, #64 > | 1: > | stp A_l, A_l, [dst, #16] > | stp A_l, A_l, [dst, #32] > | stp A_l, A_l, [dst, #48] > | stp A_l, A_l, [dst, #64]! > | subs count, count, #64 > | b.ge 1b > > > Robin. > > > > > +3: stp xzr, xzr, [x0, #16] > > > + stp xzr, xzr, [x0, #32] > > > + stp xzr, xzr, [x0, #48] > > > + stp xzr, xzr, [x0, #64]! > > > + subs x1, x1, #64 > > > + b.gt 3b > > > + ret > > FWIW, I'd also prefer consistency with the existing loop, i.e. > > 2: stp xzr, xzr, [x0, #0] > stp xzr, xzr, [x0, #16] > stp xzr, xzr, [x0, #32] > stp xzr, xzr, [x0, #48] > add x0, x0, #64 > tst x0, #(PAGE_SIZE - 1) > b.ne 2b > ret Thank you for the suggestion. Yes, I agree that it would be better in terms of consistency with the existing loop. I will fix this as you suggested in the v2 patch. Thanks, Reiji _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-27 6:44 ` Reiji Watanabe @ 2021-10-27 11:09 ` Mark Rutland 2021-10-28 1:49 ` Reiji Watanabe 2021-10-28 7:46 ` Will Deacon 0 siblings, 2 replies; 10+ messages in thread From: Mark Rutland @ 2021-10-27 11:09 UTC (permalink / raw) To: Reiji Watanabe Cc: Robin Murphy, Catalin Marinas, Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote: > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote: > > > On 2021-10-26 04:48, Reiji Watanabe wrote: > > > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > > > > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > > > > --- > > > > arch/arm64/lib/clear_page.S | 11 +++++++++++ > > > > 1 file changed, 11 insertions(+) > > > > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > > > > index b84b179edba3..7ce1bfa4081c 100644 > > > > --- a/arch/arm64/lib/clear_page.S > > > > +++ b/arch/arm64/lib/clear_page.S > > > > @@ -16,6 +16,7 @@ > > > > */ > > > > SYM_FUNC_START_PI(clear_page) > > > > mrs x1, dczid_el0 > > > > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ > > > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA} > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to > > s/GVA/ZVA/ here. > > Thank you for catching it ! I will fix that. > > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(), > > which'll need a similar update... > > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get > updated as well. But, Since I'm not familiar with MTE (and I don't > have any plans to use MTE yet), I didn't work on them (I'm not sure > how I can test them). > I might try to fix them separately later as well when I have time > (not so soon most likely though). My view is that we should either: * Document that we require DCZID_EL0.DZP==0, as is implicitly the case today. * Fix *all* usage of DC {ZVA,GVZ,GZVA} to work with DCZID_EL0.DZP==1. ... otherwise we're just hiding the problem rather than fixing it. QEMU TCG mode has MTE support, so it should be possible to test using that in a configuration such as: -machine virt,virtualization=on,mte=on -cpu max ... then you can hack the EL2 stub code in head.S to initialize HCR_EL2.TDZ=1 before dropping to EL1 (and reporting that the kernel started at EL1). Thanks, Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-27 11:09 ` Mark Rutland @ 2021-10-28 1:49 ` Reiji Watanabe 2021-10-28 7:46 ` Will Deacon 1 sibling, 0 replies; 10+ messages in thread From: Reiji Watanabe @ 2021-10-28 1:49 UTC (permalink / raw) To: Mark Rutland Cc: Robin Murphy, Catalin Marinas, Will Deacon, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Wed, Oct 27, 2021 at 4:09 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote: > > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote: > > > > On 2021-10-26 04:48, Reiji Watanabe wrote: > > > > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > > > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > > > > > > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > > > > > --- > > > > > arch/arm64/lib/clear_page.S | 11 +++++++++++ > > > > > 1 file changed, 11 insertions(+) > > > > > > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > > > > > index b84b179edba3..7ce1bfa4081c 100644 > > > > > --- a/arch/arm64/lib/clear_page.S > > > > > +++ b/arch/arm64/lib/clear_page.S > > > > > @@ -16,6 +16,7 @@ > > > > > */ > > > > > SYM_FUNC_START_PI(clear_page) > > > > > mrs x1, dczid_el0 > > > > > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ > > > > > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA} > > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to > > > s/GVA/ZVA/ here. > > > > Thank you for catching it ! I will fix that. > > > > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(), > > > which'll need a similar update... > > > > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get > > updated as well. But, Since I'm not familiar with MTE (and I don't > > have any plans to use MTE yet), I didn't work on them (I'm not sure > > how I can test them). > > I might try to fix them separately later as well when I have time > > (not so soon most likely though). > > My view is that we should either: > > * Document that we require DCZID_EL0.DZP==0, as is implicitly the case > today. > > * Fix *all* usage of DC {ZVA,GVZ,GZVA} to work with DCZID_EL0.DZP==1. > > ... otherwise we're just hiding the problem rather than fixing it. > > QEMU TCG mode has MTE support, so it should be possible to test using > that in a configuration such as: > > -machine virt,virtualization=on,mte=on -cpu max > > ... then you can hack the EL2 stub code in head.S to initialize > HCR_EL2.TDZ=1 before dropping to EL1 (and reporting that the kernel > started at EL1). Understood. I will work on the MTE fixes and include them into the v2 patch. Thank you so much for all the comments and information. Regards, Reiji _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-27 11:09 ` Mark Rutland 2021-10-28 1:49 ` Reiji Watanabe @ 2021-10-28 7:46 ` Will Deacon 2021-10-28 9:03 ` Mark Rutland 1 sibling, 1 reply; 10+ messages in thread From: Will Deacon @ 2021-10-28 7:46 UTC (permalink / raw) To: Mark Rutland Cc: Reiji Watanabe, Robin Murphy, Catalin Marinas, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Wed, Oct 27, 2021 at 12:09:47PM +0100, Mark Rutland wrote: > On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote: > > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote: > > > > On 2021-10-26 04:48, Reiji Watanabe wrote: > > > > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > > > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > > > > > > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > > > > > --- > > > > > arch/arm64/lib/clear_page.S | 11 +++++++++++ > > > > > 1 file changed, 11 insertions(+) > > > > > > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > > > > > index b84b179edba3..7ce1bfa4081c 100644 > > > > > --- a/arch/arm64/lib/clear_page.S > > > > > +++ b/arch/arm64/lib/clear_page.S > > > > > @@ -16,6 +16,7 @@ > > > > > */ > > > > > SYM_FUNC_START_PI(clear_page) > > > > > mrs x1, dczid_el0 > > > > > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ > > > > > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA} > > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to > > > s/GVA/ZVA/ here. > > > > Thank you for catching it ! I will fix that. > > > > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(), > > > which'll need a similar update... > > > > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get > > updated as well. But, Since I'm not familiar with MTE (and I don't > > have any plans to use MTE yet), I didn't work on them (I'm not sure > > how I can test them). > > I might try to fix them separately later as well when I have time > > (not so soon most likely though). > > My view is that we should either: > > * Document that we require DCZID_EL0.DZP==0, as is implicitly the case > today. I disagree with that. There's nothing wrong in trapping this stuff, as long as you go ahead and emulate it, which is exactly why we aren't checking it at the moment as EL2 should be prepared to handle the trap. The Arm ARM talks about the instructions being "prohibited" but that doesn't mean anything -- the reality is that they trap to EL2. We could document *that* though? Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 2021-10-28 7:46 ` Will Deacon @ 2021-10-28 9:03 ` Mark Rutland 0 siblings, 0 replies; 10+ messages in thread From: Mark Rutland @ 2021-10-28 9:03 UTC (permalink / raw) To: Will Deacon Cc: Reiji Watanabe, Robin Murphy, Catalin Marinas, Marc Zyngier, linux-arm-kernel, Peter Shier, Ricardo Koller, Oliver Upton, Jing Zhang, Raghavendra Rao Anata On Thu, Oct 28, 2021 at 08:46:10AM +0100, Will Deacon wrote: > On Wed, Oct 27, 2021 at 12:09:47PM +0100, Mark Rutland wrote: > > On Tue, Oct 26, 2021 at 11:44:51PM -0700, Reiji Watanabe wrote: > > > On Tue, Oct 26, 2021 at 5:23 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Tue, Oct 26, 2021 at 12:22:20PM +0100, Robin Murphy wrote: > > > > > On 2021-10-26 04:48, Reiji Watanabe wrote: > > > > > > Currently, clear_page() uses DC ZVA instruction unconditionally. But it > > > > > > should make sure that DCZID_EL0.DZP, which indicates whether or not use > > > > > > of DC ZVA instruction is prohibited, is zero when using the instruction. > > > > > > Use stp as memset does instead when DCZID_EL0.DZP == 1. > > > > > > > > > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com> > > > > > > --- > > > > > > arch/arm64/lib/clear_page.S | 11 +++++++++++ > > > > > > 1 file changed, 11 insertions(+) > > > > > > > > > > > > diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S > > > > > > index b84b179edba3..7ce1bfa4081c 100644 > > > > > > --- a/arch/arm64/lib/clear_page.S > > > > > > +++ b/arch/arm64/lib/clear_page.S > > > > > > @@ -16,6 +16,7 @@ > > > > > > */ > > > > > > SYM_FUNC_START_PI(clear_page) > > > > > > mrs x1, dczid_el0 > > > > > > + tbnz x1, #4, 2f /* Branch if DC GVA is prohibited */ > > > > > > > > DCZID_EL0.DZP (AKA DCZID_EL0[4]) says whether all of DC {ZVA,GVA,GZVA} > > > > are prohibited. This loop uses DZ ZVA, not GC GVA, so it'd be nice to > > > > s/GVA/ZVA/ here. > > > > > > Thank you for catching it ! I will fix that. > > > > > > > Howver, `DC GVA` and `DC GZVA` are both used in mte_set_mem_tag_range(), > > > > which'll need a similar update... > > > > > > Yes, I'm aware of that and mte_zero_clear_page_tags() needs to get > > > updated as well. But, Since I'm not familiar with MTE (and I don't > > > have any plans to use MTE yet), I didn't work on them (I'm not sure > > > how I can test them). > > > I might try to fix them separately later as well when I have time > > > (not so soon most likely though). > > > > My view is that we should either: > > > > * Document that we require DCZID_EL0.DZP==0, as is implicitly the case > > today. > > I disagree with that. There's nothing wrong in trapping this stuff, as long > as you go ahead and emulate it, which is exactly why we aren't checking it at > the moment as EL2 should be prepared to handle the trap. The Arm ARM talks > about the instructions being "prohibited" but that doesn't mean anything -- > the reality is that they trap to EL2. TBH, I think trapping DC ZVA rather than forcing it to be UNDEF was an architectural mistake. I think the intent of the "prohibited" wording (and exposure of DCZID_EL0.DZP to EL0 and EL1) is clearly that SW should *avoid* DC ZVA and friends when DCZID_EL0.DZP is set (and libc and the Arm optimized routines have *always* checked that), and performance would be abysmal with emulated DC ZVA, so at minimum we want to avoid hitting emulation in the common case. > We could document *that* though? I guess; though it makes me uneasy since the architecture clearly pushes people to read CTR_EL0.DZP, and heavily implies that there's no need to emulate, so I don't think we have much of a leg to stand on from an architecture PoV. If we need to support this, my preference would be to support DCZID_EL0.DZP==1 by avoiding the trapped instructions entirely. I realise the problem as ever is "what does userspace do?". Thanks, Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-10-28 9:06 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-10-26 3:48 [PATCH] arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 Reiji Watanabe 2021-10-26 9:06 ` Catalin Marinas 2021-10-27 1:35 ` Reiji Watanabe 2021-10-26 11:22 ` Robin Murphy 2021-10-26 12:23 ` Mark Rutland 2021-10-27 6:44 ` Reiji Watanabe 2021-10-27 11:09 ` Mark Rutland 2021-10-28 1:49 ` Reiji Watanabe 2021-10-28 7:46 ` Will Deacon 2021-10-28 9:03 ` Mark Rutland
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.