* [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
@ 2024-08-13 14:28 Shameer Kolothum
2024-08-13 18:20 ` Marc Zyngier
0 siblings, 1 reply; 11+ messages in thread
From: Shameer Kolothum @ 2024-08-13 14:28 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: maz, will, catalin.marinas, oliver.upton, james.morse,
suzuki.poulose, yuzenghui, wangzhou1, linuxarm
KVM exposes the OS Double Lock feature bit to guests but treats guest
accesses to OSDLR_EL1 as RAZ/WI. This breaks guest migration between
systems where support for this feature differs. Make the feature
writable from userspace by setting its mask bit. While at it, set the
mask bits for the other exposed features in the AA64DFR0_EL1 register
as well.
Also update the selftest to cover these fields.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
This is based on the discussion here (thanks to Oliver):
https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
---
arch/arm64/kvm/sys_regs.c | 6 +++++-
tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c90324060436..adb49d681052 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
.get_user = get_id_reg,
.set_user = set_id_aa64dfr0_el1,
.reset = read_sanitised_id_aa64dfr0_el1,
- .val = ID_AA64DFR0_EL1_PMUVer_MASK |
+ .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
+ ID_AA64DFR0_EL1_CTX_CMPs_MASK |
+ ID_AA64DFR0_EL1_WRPs_MASK |
+ ID_AA64DFR0_EL1_BRPs_MASK |
+ ID_AA64DFR0_EL1_PMUVer_MASK |
ID_AA64DFR0_EL1_DebugVer_MASK, },
ID_SANITISED(ID_AA64DFR1_EL1),
ID_UNALLOCATED(5,2),
diff --git a/tools/testing/selftests/kvm/aarch64/set_id_regs.c b/tools/testing/selftests/kvm/aarch64/set_id_regs.c
index d20981663831..1e6b9594daf8 100644
--- a/tools/testing/selftests/kvm/aarch64/set_id_regs.c
+++ b/tools/testing/selftests/kvm/aarch64/set_id_regs.c
@@ -68,6 +68,10 @@ struct test_feature_reg {
}
static const struct reg_ftr_bits ftr_id_aa64dfr0_el1[] = {
+ S_REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, DoubleLock, 0),
+ REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, CTX_CMPs, 0),
+ REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, WRPs, 0),
+ REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, BRPs, 0),
S_REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, PMUVer, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, DebugVer, ID_AA64DFR0_EL1_DebugVer_IMP),
REG_FTR_END,
--
2.45.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
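For readers unfamiliar with the writable-mask idea used by the patch above, here is a minimal sketch in C. It is not KVM's implementation: the helper name id_reg_write_allowed() and the DFR0_* macro names are hypothetical, and only the field positions are taken from the architecture's ID_AA64DFR0_EL1 layout. The sketch models a single rule, namely that a userspace write may only change bits covered by the mask. KVM additionally validates the value of each writable field, which is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64DFR0_EL1 fields are 4 bits wide; positions follow the Arm ARM. */
#define DFR0_DEBUGVER_MASK    (0xfULL << 0)
#define DFR0_PMUVER_MASK      (0xfULL << 8)
#define DFR0_BRPS_MASK        (0xfULL << 12)
#define DFR0_WRPS_MASK        (0xfULL << 20)
#define DFR0_CTX_CMPS_MASK    (0xfULL << 28)
#define DFR0_DOUBLELOCK_MASK  (0xfULL << 36)

/*
 * Hypothetical helper (an assumption for illustration, not a KVM function):
 * a userspace write is acceptable only if every bit that differs from the
 * current register value lies inside the writable mask.
 */
static bool id_reg_write_allowed(uint64_t cur, uint64_t new_val,
				 uint64_t writable_mask)
{
	return ((cur ^ new_val) & ~writable_mask) == 0;
}
```

With only PMUVer and DebugVer in the mask, a write that changes DoubleLock is refused. Adding DFR0_DOUBLELOCK_MASK to the mask lets the same write through, which is what a VMM needs in order to present the same value on both ends of a migration.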
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-08-13 14:28 [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace Shameer Kolothum
@ 2024-08-13 18:20 ` Marc Zyngier
2024-08-14 9:17 ` Shameerali Kolothum Thodi
0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2024-08-13 18:20 UTC (permalink / raw)
To: Shameer Kolothum
Cc: kvmarm, linux-arm-kernel, will, catalin.marinas, oliver.upton,
james.morse, suzuki.poulose, yuzenghui, wangzhou1, linuxarm
On Tue, 13 Aug 2024 15:28:35 +0100,
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
>
> KVM exposes the OS double lock feature bit to Guests but returns
> RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
> systems where this feature support differ. Add support to make this
> feature writable from userspace by setting the mask bit. While at it,
> set the mask bits for other exposed features in the AA64DFR0_EL1
> register as well.
>
> Also update the selftest to cover these fields.
>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> This is based on the discussion here(Thanks to Oliver),
> https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
> ---
> arch/arm64/kvm/sys_regs.c | 6 +++++-
> tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index c90324060436..adb49d681052 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> .get_user = get_id_reg,
> .set_user = set_id_aa64dfr0_el1,
> .reset = read_sanitised_id_aa64dfr0_el1,
> - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
> + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
> + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
> + ID_AA64DFR0_EL1_WRPs_MASK |
> + ID_AA64DFR0_EL1_BRPs_MASK |
I think this is going to cause some trouble.
The issue is that context-aware breakpoints are the highest-numbered
breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
types and linking of breakpoints"). So if you reduce the number of
normal breakpoints, you shift the context-aware ones down, and
everything breaks.
I really don't see how you can safely do that without completely
changing the way we handle the debug registers.
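The index shift described above can be illustrated with a small C sketch. The helper names (dfr0_make(), dfr0_first_ctx_bp(), and so on) are hypothetical and exist only for this illustration. The sketch assumes the architectural encoding in which the BRPs and CTX_CMPs fields of ID_AA64DFR0_EL1 each hold a count minus one, and that the context-aware breakpoints occupy the highest-numbered slots.

```c
#include <assert.h>
#include <stdint.h>

/* ID_AA64DFR0_EL1 field positions; each count field holds "number minus one". */
#define DFR0_BRPS_SHIFT      12
#define DFR0_CTX_CMPS_SHIFT  28
#define DFR0_FIELD_MASK      0xfULL

/* Number of breakpoints advertised by a DFR0 value. */
static unsigned int dfr0_num_brps(uint64_t dfr0)
{
	return (unsigned int)((dfr0 >> DFR0_BRPS_SHIFT) & DFR0_FIELD_MASK) + 1;
}

/* Number of context-aware breakpoints advertised by a DFR0 value. */
static unsigned int dfr0_num_ctx_cmps(uint64_t dfr0)
{
	return (unsigned int)((dfr0 >> DFR0_CTX_CMPS_SHIFT) & DFR0_FIELD_MASK) + 1;
}

/* Index of the lowest-numbered context-aware breakpoint. */
static unsigned int dfr0_first_ctx_bp(uint64_t dfr0)
{
	unsigned int brps = dfr0_num_brps(dfr0);
	unsigned int ctx = dfr0_num_ctx_cmps(dfr0);

	assert(ctx <= brps);
	return brps - ctx;
}

/* Hypothetical helper: build a DFR0 value containing only the two counts. */
static uint64_t dfr0_make(unsigned int brps, unsigned int ctx_cmps)
{
	assert(brps >= 1 && brps <= 16);
	assert(ctx_cmps >= 1 && ctx_cmps <= 16);
	return ((uint64_t)(brps - 1) << DFR0_BRPS_SHIFT) |
	       ((uint64_t)(ctx_cmps - 1) << DFR0_CTX_CMPS_SHIFT);
}
```

For example, on hardware with 6 breakpoints of which 2 are context-aware, dfr0_first_ctx_bp(dfr0_make(6, 2)) is 4. If userspace lowers BRPs so that the guest sees 4 breakpoints, the guest computes index 2 as its first context-aware breakpoint, while hardware breakpoint 2 is a normal one. That mismatch is the problem raised above.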
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-08-13 18:20 ` Marc Zyngier
@ 2024-08-14 9:17 ` Shameerali Kolothum Thodi
2024-08-15 8:32 ` Marc Zyngier
2024-11-26 17:00 ` Sebastian Ott
0 siblings, 2 replies; 11+ messages in thread
From: Shameerali Kolothum Thodi @ 2024-08-14 9:17 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
will@kernel.org, catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
> -----Original Message-----
> From: Marc Zyngier <maz@kernel.org>
> Sent: Tuesday, August 13, 2024 7:21 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: kvmarm@lists.linux.dev; linux-arm-kernel@lists.infradead.org;
> will@kernel.org; catalin.marinas@arm.com; oliver.upton@linux.dev;
> james.morse@arm.com; suzuki.poulose@arm.com; yuzenghui
> <yuzenghui@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Linuxarm <linuxarm@huawei.com>
> Subject: Re: [PATCH] KVM: arm64: Make the exposed feature bits in
> AA64DFR0_EL1 writable from userspace
>
> On Tue, 13 Aug 2024 15:28:35 +0100,
> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> >
> > KVM exposes the OS double lock feature bit to Guests but returns
> > RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
> > systems where this feature support differ. Add support to make this
> > feature writable from userspace by setting the mask bit. While at it,
> > set the mask bits for other exposed features in the AA64DFR0_EL1
> > register as well.
> >
> > Also update the selftest to cover these fields.
> >
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > ---
> > This is based on the discussion here(Thanks to Oliver),
> > https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
> > ---
> > arch/arm64/kvm/sys_regs.c | 6 +++++-
> > tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
> > 2 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index c90324060436..adb49d681052 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[]
> = {
> > .get_user = get_id_reg,
> > .set_user = set_id_aa64dfr0_el1,
> > .reset = read_sanitised_id_aa64dfr0_el1,
> > - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
> > + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
> > + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
> > + ID_AA64DFR0_EL1_WRPs_MASK |
> > + ID_AA64DFR0_EL1_BRPs_MASK |
>
>
> I think this is going to cause some troubles.
>
> The issue is that context-aware breakpoints are the highest-numbered
> breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
> types and linking of breakpoints"). So if you reduce the number of
> normal breakpoints, you shift the context-aware ones down, and
> everything breaks.
Thanks Marc for explaining this. I was not aware of this one.
> I really don't see how you can safely do that without completely
> changing the way we handle the debug registers.
Looks like Reiji attempted to do this a while back:
https://lore.kernel.org/kvm/20220419065544.3616948-13-reijiw@google.com/
I guess that one is trying to address the problem you described above, right?
Though it is not clear to me what happened to those patches afterwards.
Coming back to this patch, we don't have a requirement right now to make the
breakpoints writable for migration. The only concern is the OS Double Lock feature.
I'm not sure whether anyone has a high-priority requirement to make the other
features writable. Would it be acceptable if I resend this patch with just OS Double Lock
being writable? (Sorry if I sound selfish, but at least some progress can be made soon.)
Thanks,
Shameer
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-08-14 9:17 ` Shameerali Kolothum Thodi
@ 2024-08-15 8:32 ` Marc Zyngier
2024-11-26 17:00 ` Sebastian Ott
1 sibling, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2024-08-15 8:32 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
will@kernel.org, catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
On Wed, 14 Aug 2024 10:17:10 +0100,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Marc Zyngier <maz@kernel.org>
> > Sent: Tuesday, August 13, 2024 7:21 PM
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> > Cc: kvmarm@lists.linux.dev; linux-arm-kernel@lists.infradead.org;
> > will@kernel.org; catalin.marinas@arm.com; oliver.upton@linux.dev;
> > james.morse@arm.com; suzuki.poulose@arm.com; yuzenghui
> > <yuzenghui@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> > Linuxarm <linuxarm@huawei.com>
> > Subject: Re: [PATCH] KVM: arm64: Make the exposed feature bits in
> > AA64DFR0_EL1 writable from userspace
> >
> > On Tue, 13 Aug 2024 15:28:35 +0100,
> > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> > >
> > > KVM exposes the OS double lock feature bit to Guests but returns
> > > RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
> > > systems where this feature support differ. Add support to make this
> > > feature writable from userspace by setting the mask bit. While at it,
> > > set the mask bits for other exposed features in the AA64DFR0_EL1
> > > register as well.
> > >
> > > Also update the selftest to cover these fields.
> > >
> > > Signed-off-by: Shameer Kolothum
> > <shameerali.kolothum.thodi@huawei.com>
> > > ---
> > > This is based on the discussion here(Thanks to Oliver),
> > > https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
> > > ---
> > > arch/arm64/kvm/sys_regs.c | 6 +++++-
> > > tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
> > > 2 files changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > > index c90324060436..adb49d681052 100644
> > > --- a/arch/arm64/kvm/sys_regs.c
> > > +++ b/arch/arm64/kvm/sys_regs.c
> > > @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[]
> > = {
> > > .get_user = get_id_reg,
> > > .set_user = set_id_aa64dfr0_el1,
> > > .reset = read_sanitised_id_aa64dfr0_el1,
> > > - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
> > > + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
> > > + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
> > > + ID_AA64DFR0_EL1_WRPs_MASK |
> > > + ID_AA64DFR0_EL1_BRPs_MASK |
> >
> >
> > I think this is going to cause some troubles.
> >
> > The issue is that context-aware breakpoints are the highest-numbered
> > breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
> > types and linking of breakpoints"). So if you reduce the number of
> > normal breakpoints, you shift the context-aware ones down, and
> > everything breaks.
>
> Thanks Marc for explaining this. I was not aware of this one.
Yeah, that's a pretty annoying shortcoming of the architecture. There
is an effort to try and address it, but not sure when that will be
fixed.
>
> > I really don't see how you can safely do that without completely
> > changing the way we handle the debug registers.
>
> Looks like Reji has attempted to do this a while back,
> https://lore.kernel.org/kvm/20220419065544.3616948-13-reijiw@google.com/
>
> I guess that one is trying to address the problem you described
> above, right? Though, not clear to me what happened afterwards to
> these patches in the series.
>
> Coming back to this patch, we don't have a requirement now to make
> the breakpoints writable for migration. The only concern is OS
> Double lock feature. Not sure anyone has a high priority
> requirement to make the other features writable or not. Will it be
> acceptable if I resent this patch with just OS Double Lock being
> writable?(Sorry If I sound selfish, but at least some progress can
> be made soon).
I think you can keep the other two fields, as they are
independent. You could add a comment indicating why we can't just let
userspace change this field.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-08-14 9:17 ` Shameerali Kolothum Thodi
2024-08-15 8:32 ` Marc Zyngier
@ 2024-11-26 17:00 ` Sebastian Ott
2024-11-26 19:29 ` Marc Zyngier
1 sibling, 1 reply; 11+ messages in thread
From: Sebastian Ott @ 2024-11-26 17:00 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Marc Zyngier, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
Hi,
On Wed, 14 Aug 2024, Shameerali Kolothum Thodi wrote:
>>
>> On Tue, 13 Aug 2024 15:28:35 +0100,
>> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
>>>
>>> KVM exposes the OS double lock feature bit to Guests but returns
>>> RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
>>> systems where this feature support differ. Add support to make this
>>> feature writable from userspace by setting the mask bit. While at it,
>>> set the mask bits for other exposed features in the AA64DFR0_EL1
>>> register as well.
>>>
>>> Also update the selftest to cover these fields.
>>>
>>> Signed-off-by: Shameer Kolothum
>> <shameerali.kolothum.thodi@huawei.com>
>>> ---
>>> This is based on the discussion here(Thanks to Oliver),
>>> https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
>>> ---
>>> arch/arm64/kvm/sys_regs.c | 6 +++++-
>>> tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
>>> 2 files changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>> index c90324060436..adb49d681052 100644
>>> --- a/arch/arm64/kvm/sys_regs.c
>>> +++ b/arch/arm64/kvm/sys_regs.c
>>> @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[]
>> = {
>>> .get_user = get_id_reg,
>>> .set_user = set_id_aa64dfr0_el1,
>>> .reset = read_sanitised_id_aa64dfr0_el1,
>>> - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
>>> + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
>>> + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
>>> + ID_AA64DFR0_EL1_WRPs_MASK |
>>> + ID_AA64DFR0_EL1_BRPs_MASK |
>>
>>
>> I think this is going to cause some troubles.
>>
>> The issue is that context-aware breakpoints are the highest-numbered
>> breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
>> types and linking of breakpoints"). So if you reduce the number of
>> normal breakpoints, you shift the context-aware ones down, and
>> everything breaks.
>
> Thanks Marc for explaining this. I was not aware of this one.
>
>> I really don't see how you can safely do that without completely
>> changing the way we handle the debug registers.
>
> Looks like Reji has attempted to do this a while back,
> https://lore.kernel.org/kvm/20220419065544.3616948-13-reijiw@google.com/
>
I've got two machines that differ in the number of breakpoints and
it would be nice to be able to migrate between these. Is anything
preventing us from trapping the access and making sure the correct
breakpoint is used? Is anyone working on this? If not I'd like to
give it a shot.
Thanks,
Sebastian
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-11-26 17:00 ` Sebastian Ott
@ 2024-11-26 19:29 ` Marc Zyngier
2024-11-27 17:53 ` Sebastian Ott
2024-11-28 9:31 ` Eric Auger
0 siblings, 2 replies; 11+ messages in thread
From: Marc Zyngier @ 2024-11-26 19:29 UTC (permalink / raw)
To: Sebastian Ott
Cc: Shameerali Kolothum Thodi, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
On Tue, 26 Nov 2024 17:00:35 +0000,
Sebastian Ott <sebott@redhat.com> wrote:
>
> Hi,
>
> On Wed, 14 Aug 2024, Shameerali Kolothum Thodi wrote:
> >>
> >> On Tue, 13 Aug 2024 15:28:35 +0100,
> >> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
> >>>
> >>> KVM exposes the OS double lock feature bit to Guests but returns
> >>> RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
> >>> systems where this feature support differ. Add support to make this
> >>> feature writable from userspace by setting the mask bit. While at it,
> >>> set the mask bits for other exposed features in the AA64DFR0_EL1
> >>> register as well.
> >>>
> >>> Also update the selftest to cover these fields.
> >>>
> >>> Signed-off-by: Shameer Kolothum
> >> <shameerali.kolothum.thodi@huawei.com>
> >>> ---
> >>> This is based on the discussion here(Thanks to Oliver),
> >>> https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
> >>> ---
> >>> arch/arm64/kvm/sys_regs.c | 6 +++++-
> >>> tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
> >>> 2 files changed, 9 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> >>> index c90324060436..adb49d681052 100644
> >>> --- a/arch/arm64/kvm/sys_regs.c
> >>> +++ b/arch/arm64/kvm/sys_regs.c
> >>> @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[]
> >> = {
> >>> .get_user = get_id_reg,
> >>> .set_user = set_id_aa64dfr0_el1,
> >>> .reset = read_sanitised_id_aa64dfr0_el1,
> >>> - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
> >>> + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
> >>> + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
> >>> + ID_AA64DFR0_EL1_WRPs_MASK |
> >>> + ID_AA64DFR0_EL1_BRPs_MASK |
> >>
> >>
> >> I think this is going to cause some troubles.
> >>
> >> The issue is that context-aware breakpoints are the highest-numbered
> >> breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
> >> types and linking of breakpoints"). So if you reduce the number of
> >> normal breakpoints, you shift the context-aware ones down, and
> >> everything breaks.
> >
> > Thanks Marc for explaining this. I was not aware of this one.
> >
> >> I really don't see how you can safely do that without completely
> >> changing the way we handle the debug registers.
> >
> > Looks like Reji has attempted to do this a while back,
> > https://lore.kernel.org/kvm/20220419065544.3616948-13-reijiw@google.com/
> >
>
> I've got two machines that differ in the number of breakpoints and
> it would be nice to be able to migrate between these. Is anything
Is that the *only* thing that differs? Do they have the same number of
context-aware breakpoints?
> preventing us from trapping the access and make sure the correct
> breakpoint is used? Is anyone working on this? If not I'd like to
> give it a shot.
Not only trapping. You also need to handle some interesting parts of
the architecture, such as the breakpoint linking fun.
But if we are to go down that road, I really want to restrict that to
implementations that have FEAT_FGT. Because otherwise we need to trap
and emulate *everything*, instead of just the breakpoint registers.
And that would be pretty bad from a performance perspective.
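To make the remapping and linking concern concrete, here is a minimal C sketch of the index translation a trap handler would need. Everything in it is an assumption made for illustration: struct bp_layout, guest_bp_to_host() and the bcr_*_lbn() helpers are hypothetical names, not existing KVM code. The only architectural facts relied upon are that context-aware breakpoints are the highest-numbered ones and that DBGBCR<n>_EL1.LBN (bits [19:16]) holds the number of the linked breakpoint. The mapping keeps guest normal breakpoints at the same index and moves guest context-aware breakpoints to the top of the host range. It also assumes that a host context-aware breakpoint can serve as a normal address-matching breakpoint.

```c
#include <stdint.h>

/* DBGBCR<n>_EL1.LBN: linked breakpoint number, bits [19:16]. */
#define DBGBCR_LBN_SHIFT  16
#define DBGBCR_LBN_MASK   (0xfULL << DBGBCR_LBN_SHIFT)

/* Hypothetical description of the breakpoint layout on both sides. */
struct bp_layout {
	unsigned int guest_brps; /* breakpoints advertised to the guest */
	unsigned int guest_ctx;  /* of which context-aware (highest-numbered) */
	unsigned int host_brps;  /* breakpoints implemented by the host */
	unsigned int host_ctx;   /* of which context-aware (highest-numbered) */
};

/*
 * Translate a guest breakpoint index into a host index.
 * Returns -1 if the layout or the index cannot be mapped.
 */
static int guest_bp_to_host(const struct bp_layout *l, unsigned int idx)
{
	unsigned int guest_normal;

	if (l->guest_ctx > l->guest_brps || l->host_ctx > l->host_brps ||
	    l->guest_brps > l->host_brps || l->guest_ctx > l->host_ctx ||
	    idx >= l->guest_brps)
		return -1;

	guest_normal = l->guest_brps - l->guest_ctx;
	if (idx < guest_normal)
		return (int)idx; /* normal breakpoint: identity mapping */

	/* Context-aware breakpoint: place it at the top of the host range. */
	return (int)(l->host_brps - l->guest_ctx + (idx - guest_normal));
}

/* Extract the linked breakpoint number from a DBGBCR value. */
static unsigned int bcr_get_lbn(uint64_t bcr)
{
	return (unsigned int)((bcr & DBGBCR_LBN_MASK) >> DBGBCR_LBN_SHIFT);
}

/* Return a copy of the DBGBCR value with LBN replaced. */
static uint64_t bcr_set_lbn(uint64_t bcr, unsigned int lbn)
{
	return (bcr & ~DBGBCR_LBN_MASK) |
	       (((uint64_t)lbn << DBGBCR_LBN_SHIFT) & DBGBCR_LBN_MASK);
}
```

With a guest view of 4 breakpoints (2 context-aware) on a host with 6 (2 context-aware), guest indices 0 and 1 map to host 0 and 1, and guest indices 2 and 3 map to host 4 and 5. Any guest DBGBCR write that links to guest breakpoint 2 would therefore need its LBN rewritten to 4 before reaching hardware, and the reverse translation would be needed on reads. This is only the bookkeeping part; it says nothing about the trapping cost discussed above.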
Another thing is that this only works because the breakpoint number
is not reported in ESR_ELx. The moment we offer this migration
"feature", we are painting ourselves into a corner, should the
architecture ever evolve to something less... bizarre.
Finally, who is going to ensure this keeps working in the foreseeable
future? Because while this is nice, that's not what gets deployed in
production, as it leads to unpredictable performance. My take is that
this thing will eventually bitrot and die.
So, do we *really* want to go down that road?
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-11-26 19:29 ` Marc Zyngier
@ 2024-11-27 17:53 ` Sebastian Ott
2024-11-28 9:31 ` Eric Auger
1 sibling, 0 replies; 11+ messages in thread
From: Sebastian Ott @ 2024-11-27 17:53 UTC (permalink / raw)
To: Marc Zyngier
Cc: Shameerali Kolothum Thodi, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
On Tue, 26 Nov 2024, Marc Zyngier wrote:
> On Tue, 26 Nov 2024 17:00:35 +0000,
> Sebastian Ott <sebott@redhat.com> wrote:
>> On Wed, 14 Aug 2024, Shameerali Kolothum Thodi wrote:
>>>>
>>>> On Tue, 13 Aug 2024 15:28:35 +0100,
>>>> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
>>>>>
>>>>> KVM exposes the OS double lock feature bit to Guests but returns
>>>>> RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
>>>>> systems where this feature support differ. Add support to make this
>>>>> feature writable from userspace by setting the mask bit. While at it,
>>>>> set the mask bits for other exposed features in the AA64DFR0_EL1
>>>>> register as well.
>>>>>
>>>>> Also update the selftest to cover these fields.
>>>>>
>>>>> Signed-off-by: Shameer Kolothum
>>>> <shameerali.kolothum.thodi@huawei.com>
>>>>> ---
>>>>> This is based on the discussion here(Thanks to Oliver),
>>>>> https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
>>>>> ---
>>>>> arch/arm64/kvm/sys_regs.c | 6 +++++-
>>>>> tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
>>>>> 2 files changed, 9 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>>>> index c90324060436..adb49d681052 100644
>>>>> --- a/arch/arm64/kvm/sys_regs.c
>>>>> +++ b/arch/arm64/kvm/sys_regs.c
>>>>> @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[]
>>>> = {
>>>>> .get_user = get_id_reg,
>>>>> .set_user = set_id_aa64dfr0_el1,
>>>>> .reset = read_sanitised_id_aa64dfr0_el1,
>>>>> - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
>>>>> + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
>>>>> + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
>>>>> + ID_AA64DFR0_EL1_WRPs_MASK |
>>>>> + ID_AA64DFR0_EL1_BRPs_MASK |
>>>>
>>>>
>>>> I think this is going to cause some troubles.
>>>>
>>>> The issue is that context-aware breakpoints are the highest-numbered
>>>> breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
>>>> types and linking of breakpoints"). So if you reduce the number of
>>>> normal breakpoints, you shift the context-aware ones down, and
>>>> everything breaks.
>>>
>>> Thanks Marc for explaining this. I was not aware of this one.
>>>
>>>> I really don't see how you can safely do that without completely
>>>> changing the way we handle the debug registers.
>>>
>>> Looks like Reji has attempted to do this a while back,
>>> https://lore.kernel.org/kvm/20220419065544.3616948-13-reijiw@google.com/
>>>
>>
>> I've got two machines that differ in the number of breakpoints and
>> it would be nice to be able to migrate between these. Is anything
>
> Is that the *only* thing that differ? Do the have the same number of
> context-aware breakpoints?
It's the only diff in DFR0 - CTX_CMPs is the same. There are diffs in
other ID regs as well but these are already writable.
>> preventing us from trapping the access and make sure the correct
>> breakpoint is used? Is anyone working on this? If not I'd like to
>> give it a shot.
>
> Not only trapping. You also need to handle some interesting parts of
> the architecture, such as the breakpoint linking fun.
Ugh, and I was thinking this might be straightforward ;-(
> But if we are to go down that road, I really want to restrict that to
> implementations that have FEAT_FGT. Because otherwise we need to trap
> and emulate *everything*, instead of just the breakpoint registers.
> And that would be pretty bad from a performance perspective.
OK, understood.
> Another thing is that this only works because there is no report of
> the breakpoint number in ESR_ELx. The moment we offering this
> migration "feature", we are painting ourselves in a corner, should the
> architecture ever evolve to something less... bizarre.
>
> Finally, who is going to ensure this keeps working in the foreseeable
> future? Because while this is nice, that's not what gets deployed in
> production, as it leads to unpredictable performances. My take is that
> this thing will eventually bitrot and die.
>
> So, do we *really* want to go down that road?
Thanks a lot for the pointers! I'll do some more digging to figure out
what needs to be done and whether that's actually worth it.
Sebastian
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-11-26 19:29 ` Marc Zyngier
2024-11-27 17:53 ` Sebastian Ott
@ 2024-11-28 9:31 ` Eric Auger
2024-12-01 12:21 ` Marc Zyngier
1 sibling, 1 reply; 11+ messages in thread
From: Eric Auger @ 2024-11-28 9:31 UTC (permalink / raw)
To: Marc Zyngier, Sebastian Ott
Cc: Shameerali Kolothum Thodi, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
Hi Marc,
On 11/26/24 20:29, Marc Zyngier wrote:
> On Tue, 26 Nov 2024 17:00:35 +0000,
> Sebastian Ott <sebott@redhat.com> wrote:
>>
>> Hi,
>>
>> On Wed, 14 Aug 2024, Shameerali Kolothum Thodi wrote:
>>>>
>>>> On Tue, 13 Aug 2024 15:28:35 +0100,
>>>> Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:
>>>>>
>>>>> KVM exposes the OS double lock feature bit to Guests but returns
>>>>> RAZ/WI on Guest OSDLR_EL1 access. This breaks Guest migration between
>>>>> systems where this feature support differ. Add support to make this
>>>>> feature writable from userspace by setting the mask bit. While at it,
>>>>> set the mask bits for other exposed features in the AA64DFR0_EL1
>>>>> register as well.
>>>>>
>>>>> Also update the selftest to cover these fields.
>>>>>
>>>>> Signed-off-by: Shameer Kolothum
>>>> <shameerali.kolothum.thodi@huawei.com>
>>>>> ---
>>>>> This is based on the discussion here(Thanks to Oliver),
>>>>> https://lore.kernel.org/all/ZrVSlbVwnaMDShah@linux.dev/
>>>>> ---
>>>>> arch/arm64/kvm/sys_regs.c | 6 +++++-
>>>>> tools/testing/selftests/kvm/aarch64/set_id_regs.c | 4 ++++
>>>>> 2 files changed, 9 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>>>> index c90324060436..adb49d681052 100644
>>>>> --- a/arch/arm64/kvm/sys_regs.c
>>>>> +++ b/arch/arm64/kvm/sys_regs.c
>>>>> @@ -2376,7 +2376,11 @@ static const struct sys_reg_desc sys_reg_descs[]
>>>> = {
>>>>> .get_user = get_id_reg,
>>>>> .set_user = set_id_aa64dfr0_el1,
>>>>> .reset = read_sanitised_id_aa64dfr0_el1,
>>>>> - .val = ID_AA64DFR0_EL1_PMUVer_MASK |
>>>>> + .val = ID_AA64DFR0_EL1_DoubleLock_MASK |
>>>>> + ID_AA64DFR0_EL1_CTX_CMPs_MASK |
>>>>> + ID_AA64DFR0_EL1_WRPs_MASK |
>>>>> + ID_AA64DFR0_EL1_BRPs_MASK |
>>>>
>>>>
>>>> I think this is going to cause some troubles.
>>>>
>>>> The issue is that context-aware breakpoints are the highest-numbered
>>>> breakpoints, right after the normal breakpoints (D2.8.3 "Breakpoint
>>>> types and linking of breakpoints"). So if you reduce the number of
>>>> normal breakpoints, you shift the context-aware ones down, and
>>>> everything breaks.
>>>
>>> Thanks Marc for explaining this. I was not aware of this one.
>>>
>>>> I really don't see how you can safely do that without completely
>>>> changing the way we handle the debug registers.
>>>
>>> Looks like Reji has attempted to do this a while back,
>>> https://lore.kernel.org/kvm/20220419065544.3616948-13-reijiw@google.com/
>>>
>>
>> I've got two machines that differ in the number of breakpoints and
>> it would be nice to be able to migrate between these. Is anything
>
> Is that the *only* thing that differ? Do the have the same number of
> context-aware breakpoints?
>
>> preventing us from trapping the access and make sure the correct
>> breakpoint is used? Is anyone working on this? If not I'd like to
>> give it a shot.
>
> Not only trapping. You also need to handle some interesting parts of
> the architecture, such as the breakpoint linking fun.
>
> But if we are to go down that road, I really want to restrict that to
> implementations that have FEAT_FGT. Because otherwise we need to trap
> and emulate *everything*, instead of just the breakpoint registers.
> And that would be pretty bad from a performance perspective.
>
> Another thing is that this only works because there is no report of
> the breakpoint number in ESR_ELx. The moment we offering this
> migration "feature", we are painting ourselves in a corner, should the
> architecture ever evolve to something less... bizarre.
>
> Finally, who is going to ensure this keeps working in the foreseeable
> future? Because while this is nice, that's not what gets deployed in
> production, as it leads to unpredictable performances. My take is that
> this thing will eventually bitrot and die.
In the context of our work to define QEMU vCPU models for Arm
(https://lore.kernel.org/all/20241025101959.601048-1-eric.auger@redhat.com/),
our current approach is to try migrating between the modern hardware we
have access to. The case above is migration between AmpereOne and Grace,
which should both be prevalent systems. Do you think it makes no sense
at all to try migrating between those, although this may be challenging?
Other cases we have looked at are migration within the Ampere Altra Max
system family (which should hopefully be fine now that Sebastian's
CTR_EL0 work is upstream) and migration between Graviton hosts. With
regard to Ampere Altra Max to AmpereOne, Oliver pointed out the cntfrq
issue, which is blocking.
Do you think we should restrict our studies to systems that are
"closer" to each other in terms of Arm spec revision? We thought that
migration between AmpereOne and Grace would be an interesting POC and
not totally irrelevant in terms of the industry.
Thanks
Eric
>
> So, do we *really* want to go down that road?
>
> M.
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-11-28 9:31 ` Eric Auger
@ 2024-12-01 12:21 ` Marc Zyngier
2024-12-02 8:03 ` Eric Auger
0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2024-12-01 12:21 UTC (permalink / raw)
To: Eric Auger
Cc: Sebastian Ott, Shameerali Kolothum Thodi, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
Hey Eric,
On Thu, 28 Nov 2024 09:31:08 +0000,
Eric Auger <eauger@redhat.com> wrote:
>
> Hi Marc,
>
> On 11/26/24 20:29, Marc Zyngier wrote:
> > Finally, who is going to ensure this keeps working in the foreseeable
> > future? Because while this is nice, that's not what gets deployed in
> > production, as it leads to unpredictable performance. My take is that
> > this thing will eventually bitrot and die.
> In the context of our work to define qemu vcpu models for ARM
> (https://lore.kernel.org/all/20241025101959.601048-1-eric.auger@redhat.com/)
> , our current approach is to try migrating between modern HW we have
> access to. The case above is migration between AmpereOne and Grace, which
> should both be prevalent systems. Do you think it makes no sense
> at all to try migrating between those, although this may be challenging?
I don't mind the challenge. But I'm worried this is something that
looks like a reasonable idea that doesn't get any traction in
practice.
And the example you mention is pretty striking: who in their right
mind would migrate between these two systems? If you deploy a Grace
system, that's because you are making use of the GPU, and your VM is
likely to require it. Conversely, if you run on an Ampere system, you
don't want to use a valuable (read: bloody expensive) slot on a Grace
machine.
> Other cases we have looked at are migration within the Ampere Altra Max
> system family (which should hopefully be fine now that we have the CTR_EL0
> work from Sebastian upstream) and migration between Graviton hosts. Wrt Ampere
> Altra Max to AmpereOne, Oliver pointed out the cntfrq issue which is
> blocking.
>
> Do you think we should restrict our studies to systems which are
> "closer" to each other in terms of ARM spec rev? We thought that
> migration between AmpereOne and Grace would be an interesting POC and
> not totally irrelevant in terms of industry.
These two implementations may be close in terms of CPU features. But
as systems, they are massively different, and I very much doubt they
have the same deployment story. If they have one at all.
The Graviton story may have more traction, but these folks have their
own way of doing things, and in my experience do not give upstream
much consideration.
To sum it up, I'm not opposed to this work. But if we are going to
carry this sort of complex emulation, I want someone to step up and
promise that they will test it for the next 10 years, at the very
least. Because I'm very unlikely to ever have access to any of these
machines, let alone both, and I don't see people using it in practice.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-12-01 12:21 ` Marc Zyngier
@ 2024-12-02 8:03 ` Eric Auger
2024-12-02 9:11 ` Marc Zyngier
0 siblings, 1 reply; 11+ messages in thread
From: Eric Auger @ 2024-12-02 8:03 UTC (permalink / raw)
To: Marc Zyngier
Cc: Sebastian Ott, Shameerali Kolothum Thodi, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
Hi Marc,
On 12/1/24 13:21, Marc Zyngier wrote:
> Hey Eric,
>
> On Thu, 28 Nov 2024 09:31:08 +0000,
> Eric Auger <eauger@redhat.com> wrote:
>>
>> Hi Marc,
>>
>> On 11/26/24 20:29, Marc Zyngier wrote:
>>> Finally, who is going to ensure this keeps working in the foreseeable
>>> future? Because while this is nice, that's not what gets deployed in
>>> production, as it leads to unpredictable performance. My take is that
>>> this thing will eventually bitrot and die.
>> In the context of our work to define qemu vcpu models for ARM
>> (https://lore.kernel.org/all/20241025101959.601048-1-eric.auger@redhat.com/)
>> , our current approach is to try migrating between modern HW we have
>> access to. The case above is migration between AmpereOne and Grace, which
>> should both be prevalent systems. Do you think it makes no sense
>> at all to try migrating between those, although this may be challenging?
>
> I don't mind the challenge. But I'm worried this is something that
> looks like a reasonable idea that doesn't get any traction in
> practice.
>
> And the example you mention is pretty striking: who in their right
> mind would migrate between these two systems? If you deploy a Grace
> system, that's because you are making use of the GPU, and your VM is
> likely to require it. Conversely, if you run on an Ampere system, you
> don't want to use a valuable (read: bloody expensive) slot on a Grace
> machine.
Yes, I acknowledge it is a totally valid point from a use case and cost
point of view. I was expecting maybe some interest in migrating between
AmpereOne and Grace-Grace for farm enhancement, but most probably it is
marginal.
Defining [qemu] named vcpu models looks pretty difficult then, because
we don't have much relevant and accessible HW to test with, taking into
account such non-technical considerations. Besides migration within a
CPU family I don't see much.
>
>> Other cases we have looked at are migration within the Ampere Altra Max
>> system family (which should hopefully be fine now that we have the CTR_EL0
>> work from Sebastian upstream) and migration between Graviton hosts. Wrt Ampere
>> Altra Max to AmpereOne, Oliver pointed out the cntfrq issue which is
>> blocking.
>>
>> Do you think we should restrict our studies to systems which are
>> "closer" to each other in terms of ARM spec rev? We thought that
>> migration between AmpereOne and Grace would be an interesting POC and
>> not totally irrelevant in terms of industry.
>
> These two implementations may be close in terms of CPU features. But
> as systems, they are massively different, and I very much doubt they
> have the same deployment story. If they have one at all.
OK thank you for sharing your point of view.
>
> The Graviton story may have more traction, but these folks have their
> own way of doing things, and in my experience do not give upstream
> much consideration.
OK thanks
>
> To sum it up, I'm not opposed to this work. But if we are going to
> carry this sort of complex emulation, I want someone to step up and
> promise that they will test it for the next 10 years, at the very
> least. Because I'm very unlikely to ever have access to any of these
> machines, let alone both, and I don't see people using it in practice.
understood!
Thanks
Eric
>
> Thanks,
>
> M.
>
* Re: [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace
2024-12-02 8:03 ` Eric Auger
@ 2024-12-02 9:11 ` Marc Zyngier
0 siblings, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2024-12-02 9:11 UTC (permalink / raw)
To: Eric Auger
Cc: Sebastian Ott, Shameerali Kolothum Thodi, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org, will@kernel.org,
catalin.marinas@arm.com, oliver.upton@linux.dev,
james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui,
Wangzhou (B), Linuxarm, reijiw@google.com
On Mon, 02 Dec 2024 08:03:26 +0000,
Eric Auger <eauger@redhat.com> wrote:
>
> Hi Marc,
>
> On 12/1/24 13:21, Marc Zyngier wrote:
> > Hey Eric,
> >
> > On Thu, 28 Nov 2024 09:31:08 +0000,
> > Eric Auger <eauger@redhat.com> wrote:
> >>
> >> Hi Marc,
> >>
> >> On 11/26/24 20:29, Marc Zyngier wrote:
> >>> Finally, who is going to ensure this keeps working in the foreseeable
> >>> future? Because while this is nice, that's not what gets deployed in
> >>> production, as it leads to unpredictable performance. My take is that
> >>> this thing will eventually bitrot and die.
> >> In the context of our work to define qemu vcpu models for ARM
> >> (https://lore.kernel.org/all/20241025101959.601048-1-eric.auger@redhat.com/)
> >> , our current approach is to try migrating between modern HW we have
> >> access to. The case above is migration between AmpereOne and Grace, which
> >> should both be prevalent systems. Do you think it makes no sense
> >> at all to try migrating between those, although this may be challenging?
> >
> > I don't mind the challenge. But I'm worried this is something that
> > looks like a reasonable idea that doesn't get any traction in
> > practice.
> >
> > And the example you mention is pretty striking: who in their right
> > mind would migrate between these two systems? If you deploy a Grace
> > system, that's because you are making use of the GPU, and your VM is
> > likely to require it. Conversely, if you run on an Ampere system, you
> > don't want to use a valuable (read: bloody expensive) slot on a Grace
> > machine.
>
> Yes, I acknowledge it is a totally valid point from a use case and cost
> point of view. I was expecting maybe some interest in migrating between
> AmpereOne and Grace-Grace for farm enhancement, but most probably it is
> marginal.
That's my understanding as well. Cloud vendors tend to have
homogeneous systems, and those who don't are realising that they shot
themselves in the foot (and in that case, the debug infrastructure is
the least of their worries).
> Defining [qemu] named vcpu models looks pretty difficult then, because
> we don't have much relevant and accessible HW to test with, taking into
> account such non-technical considerations. Besides migration within a
> CPU family I don't see much.
That's basically my point.
Another thing to consider is that there is an effort on the SBSA
front to standardise which breakpoint numbers are context-aware, which
would side-step the need for emulation in the long run (once the
current crop of HW is gone and forgotten, which should be in the next
3 months or so ;-).
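To make the numbering problem concrete, here is a minimal sketch (not KVM code) of how the breakpoint layout falls out of ID_AA64DFR0_EL1. It assumes the architectural field positions (BRPs at bits [15:12], CTX_CMPs at bits [31:28], each encoded as "count minus one") and the rule that the context-aware breakpoints are the highest-numbered ones. The helper names, including same_bp_layout(), are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field positions in ID_AA64DFR0_EL1; each field is 4 bits wide. */
#define DFR0_BRPS_SHIFT		12
#define DFR0_CTX_CMPS_SHIFT	28

/* Hypothetical helper: extract one 4-bit ID register field. */
static inline unsigned int dfr0_field(uint64_t dfr0, unsigned int shift)
{
	return (unsigned int)((dfr0 >> shift) & 0xf);
}

/* BRPs encodes "number of breakpoints - 1". */
static inline unsigned int nr_breakpoints(uint64_t dfr0)
{
	return dfr0_field(dfr0, DFR0_BRPS_SHIFT) + 1;
}

/* CTX_CMPs encodes "number of context-aware breakpoints - 1". */
static inline unsigned int nr_ctx_breakpoints(uint64_t dfr0)
{
	return dfr0_field(dfr0, DFR0_CTX_CMPS_SHIFT) + 1;
}

/*
 * Context-aware breakpoints occupy the highest-numbered slots, so the
 * index of the first one depends on both fields.
 */
static inline unsigned int first_ctx_breakpoint(uint64_t dfr0)
{
	/* A valid register value never has more context-aware BPs than BPs. */
	assert(nr_ctx_breakpoints(dfr0) <= nr_breakpoints(dfr0));
	return nr_breakpoints(dfr0) - nr_ctx_breakpoints(dfr0);
}

/*
 * Hypothetical check: a guest sees the same breakpoint numbering on two
 * hosts only if both the total count and the first context-aware index
 * match.
 */
static inline bool same_bp_layout(uint64_t src, uint64_t dst)
{
	return nr_breakpoints(src) == nr_breakpoints(dst) &&
	       first_ctx_breakpoint(src) == first_ctx_breakpoint(dst);
}
```

With six breakpoints on both hosts but two context-aware ones on the source and four on the destination, the first context-aware index is 4 on one side and 2 on the other, so same_bp_layout() returns false. That mismatch is what the emulation has to hide from the guest.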
But if there was a *credible* user coming out and saying that they
depended on this sort of feature to be supported now and forever, then
I'd be more supportive.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Thread overview: 11+ messages
2024-08-13 14:28 [PATCH] KVM: arm64: Make the exposed feature bits in AA64DFR0_EL1 writable from userspace Shameer Kolothum
2024-08-13 18:20 ` Marc Zyngier
2024-08-14 9:17 ` Shameerali Kolothum Thodi
2024-08-15 8:32 ` Marc Zyngier
2024-11-26 17:00 ` Sebastian Ott
2024-11-26 19:29 ` Marc Zyngier
2024-11-27 17:53 ` Sebastian Ott
2024-11-28 9:31 ` Eric Auger
2024-12-01 12:21 ` Marc Zyngier
2024-12-02 8:03 ` Eric Auger
2024-12-02 9:11 ` Marc Zyngier