* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-12 16:35 ` Andrew Cooper
@ 2023-09-12 16:36 ` Andrew Cooper
2023-09-13 8:08 ` Roger Pau Monné
2023-09-13 6:18 ` Jan Beulich
2023-09-13 7:50 ` [PATCH] " Roger Pau Monné
2 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2023-09-12 16:36 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel
Cc: Jan Beulich, Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour
On 12/09/2023 5:35 pm, Andrew Cooper wrote:
> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
>> OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is
>> set, and will also attempt to unconditionally access HWCR if the TSC is
>> reported as Invariant.
>>
>> The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a
>> (bogus) warning message, but doing so at the cost of OpenBSD not booting is not
>> a suitable solution.
>>
>> In order to fix expose an empty HWCR.
> At first I was thinking a straight up revert, but AMD's CPUID Faulting
> is an architectural bit in here so it's worth keeping the register around.
>
>> Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests')
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> ---
>> Not sure whether we want to expose something when is_cpufreq_controller() is
>> true, seeing as there's a special wrmsr handler for the same MSR in that case.
>> Likely should be done for PV only, but also likely quite bogus.
>>
>> Missing reported by as the issue came from the QubesOS tracker.
> Well - we can at least have a:
>
> Link: https://github.com/QubesOS/qubes-issues/issues/8502
>
> in the commit message, and it's probably worth asking Solène / Marek
> (both CC'd) if they want a Reported-by tag.
>
>> ---
>> xen/arch/x86/msr.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
>> index 3f0450259cdf..964d500ff8a1 100644
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>> case MSR_K8_HWCR:
>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>> goto gp_fault;
>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
>> - ? K8_HWCR_TSC_FREQ_SEL : 0;
>> + /*
>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
>> + * also poke at PSTATE0.
>> + */
> While this is true, the justification for removing this is because
> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>
> Also because it's addition without writing into the migration stream was
> bogus irrespective of the specifics of the bit.
>
> I'm still of the opinion that it's buggy for OpenBSD to be looking at
> model specific bits when virtualised, but given my latest reading of the
> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
> can see TSC_FREQ_SEL.
>
> In some theoretical future where the toolstack better understands MSRs
> and (non)migratable VMs (which is the QubesOS usecase), then it would in
> principle be fine to construct a VM which can see the host TSC_FREQ_SEL
> and PSTATE* values.
>
> Preferably with an adjusted comment, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>
Sorry - I meant to be clearer here. I'd suggest just deleting the
comment and leaving an unconditional return of 0 (which will become
conditional when we wire up CPUID Faulting).
MSR_HWCR *is* an architectural MSR on any 64bit AMD system, so shouldn't
fault.
~Andrew
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-12 16:36 ` Andrew Cooper
@ 2023-09-13 8:08 ` Roger Pau Monné
2023-09-13 10:11 ` Andrew Cooper
0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2023-09-13 8:08 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Jan Beulich, Wei Liu, solene,
Marek Marczykowski-Górecki, Demi Marie Obenour
On Tue, Sep 12, 2023 at 05:36:53PM +0100, Andrew Cooper wrote:
> On 12/09/2023 5:35 pm, Andrew Cooper wrote:
> > On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
> >> OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is
> >> set, and will also attempt to unconditionally access HWCR if the TSC is
> >> reported as Invariant.
> >>
> >> The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a
> >> (bogus) warning message, but doing so at the cost of OpenBSD not booting is not
> >> a suitable solution.
> >>
> >> In order to fix expose an empty HWCR.
> > At first I was thinking a straight up revert, but AMD's CPUID Faulting
> > is an architectural bit in here so it's worth keeping the register around.
> >
> >> Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests')
> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >> ---
> >> Not sure whether we want to expose something when is_cpufreq_controller() is
> >> true, seeing as there's a special wrmsr handler for the same MSR in that case.
> >> Likely should be done for PV only, but also likely quite bogus.
> >>
> >> Missing reported by as the issue came from the QubesOS tracker.
> > Well - we can at least have a:
> >
> > Link: https://github.com/QubesOS/qubes-issues/issues/8502
> >
> > in the commit message, and it's probably worth asking Solène / Marek
> > (both CC'd) if they want a Reported-by tag.
> >
> >> ---
> >> xen/arch/x86/msr.c | 8 ++++++--
> >> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> >> index 3f0450259cdf..964d500ff8a1 100644
> >> --- a/xen/arch/x86/msr.c
> >> +++ b/xen/arch/x86/msr.c
> >> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >> case MSR_K8_HWCR:
> >> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> >> goto gp_fault;
> >> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
> >> - ? K8_HWCR_TSC_FREQ_SEL : 0;
> >> + /*
> >> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
> >> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
> >> + * also poke at PSTATE0.
> >> + */
> > While this is true, the justification for removing this is because
> > TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
> >
> > Also because it's addition without writing into the migration stream was
> > bogus irrespective of the specifics of the bit.
> >
> > I'm still of the opinion that it's buggy for OpenBSD to be looking at
> > model specific bits when virtualised, but given my latest reading of the
> > AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
> > can see TSC_FREQ_SEL.
> >
> > In some theoretical future where the toolstack better understands MSRs
> > and (non)migratable VMs (which is the QubesOS usecase), then it would in
> > principle be fine to construct a VM which can see the host TSC_FREQ_SEL
> > and PSTATE* values.
> >
> > Preferably with an adjusted comment, Reviewed-by: Andrew Cooper
> > <andrew.cooper3@citrix.com>
>
> Sorry - I meant to be clearer here. I'd suggest just deleting the
> comment and leaving an unconditional return of 0 (which will become
> conditional when we wire up CPUID Faulting).
>
> MSR_HWCR *is* an architectural MSR on any 64bit AMD system, so shouldn't
> fault.
Hm, I think it's worth to at least keep a note that if TSC_FREQ_SEL is
exposed PSTATE0 must also be exposed to prevent OpenBSD 7.3 from
panicking.
Thanks, Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 8:08 ` Roger Pau Monné
@ 2023-09-13 10:11 ` Andrew Cooper
0 siblings, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2023-09-13 10:11 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel, Jan Beulich, Wei Liu, solene,
Marek Marczykowski-Górecki, Demi Marie Obenour
On 13/09/2023 9:08 am, Roger Pau Monné wrote:
> On Tue, Sep 12, 2023 at 05:36:53PM +0100, Andrew Cooper wrote:
>> On 12/09/2023 5:35 pm, Andrew Cooper wrote:
>>> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
>>>> OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is
>>>> set, and will also attempt to unconditionally access HWCR if the TSC is
>>>> reported as Invariant.
>>>>
>>>> The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a
>>>> (bogus) warning message, but doing so at the cost of OpenBSD not booting is not
>>>> a suitable solution.
>>>>
>>>> In order to fix expose an empty HWCR.
>>> At first I was thinking a straight up revert, but AMD's CPUID Faulting
>>> is an architectural bit in here so it's worth keeping the register around.
>>>
>>>> Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests')
>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>> ---
>>>> Not sure whether we want to expose something when is_cpufreq_controller() is
>>>> true, seeing as there's a special wrmsr handler for the same MSR in that case.
>>>> Likely should be done for PV only, but also likely quite bogus.
>>>>
>>>> Missing reported by as the issue came from the QubesOS tracker.
>>> Well - we can at least have a:
>>>
>>> Link: https://github.com/QubesOS/qubes-issues/issues/8502
>>>
>>> in the commit message, and it's probably worth asking Solène / Marek
>>> (both CC'd) if they want a Reported-by tag.
>>>
>>>> ---
>>>> xen/arch/x86/msr.c | 8 ++++++--
>>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
>>>> index 3f0450259cdf..964d500ff8a1 100644
>>>> --- a/xen/arch/x86/msr.c
>>>> +++ b/xen/arch/x86/msr.c
>>>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>>> case MSR_K8_HWCR:
>>>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>>>> goto gp_fault;
>>>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
>>>> - ? K8_HWCR_TSC_FREQ_SEL : 0;
>>>> + /*
>>>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
>>>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
>>>> + * also poke at PSTATE0.
>>>> + */
>>> While this is true, the justification for removing this is because
>>> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>>>
>>> Also because it's addition without writing into the migration stream was
>>> bogus irrespective of the specifics of the bit.
>>>
>>> I'm still of the opinion that it's buggy for OpenBSD to be looking at
>>> model specific bits when virtualised, but given my latest reading of the
>>> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
>>> can see TSC_FREQ_SEL.
>>>
>>> In some theoretical future where the toolstack better understands MSRs
>>> and (non)migratable VMs (which is the QubesOS usecase), then it would in
>>> principle be fine to construct a VM which can see the host TSC_FREQ_SEL
>>> and PSTATE* values.
>>>
>>> Preferably with an adjusted comment, Reviewed-by: Andrew Cooper
>>> <andrew.cooper3@citrix.com>
>> Sorry - I meant to be clearer here. I'd suggest just deleting the
>> comment and leaving an unconditional return of 0 (which will become
>> conditional when we wire up CPUID Faulting).
>>
>> MSR_HWCR *is* an architectural MSR on any 64bit AMD system, so shouldn't
>> fault.
> Hm, I think it's worth to at least keep a note that if TSC_FREQ_SEL is
> exposed PSTATE0 must also be exposed to prevent OpenBSD 7.3 from
> panicking.
But there's nothing OpenBSD 7.3 specific about it.
Any software which sees this bit is permitted (expected even) to edit
the pstate registers.
You know why it's called frequency select? Because firmware is supposed
to program the systemwide frequency/voltage to the user's request for
whether the system wants to run cooler, or be overclocked.
Putting OpenBSD 7.3 in the commit message is absolutely relevant to why
we're making this change now, but it's not relevant to why we have
concluded that TSC_FREQ_SEL is bogus to expose to guests.
~Andrew
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-12 16:35 ` Andrew Cooper
2023-09-12 16:36 ` Andrew Cooper
@ 2023-09-13 6:18 ` Jan Beulich
2023-09-13 8:50 ` Roger Pau Monné
2023-09-13 11:02 ` [Xen PATCH] " Andrew Cooper
2023-09-13 7:50 ` [PATCH] " Roger Pau Monné
2 siblings, 2 replies; 18+ messages in thread
From: Jan Beulich @ 2023-09-13 6:18 UTC (permalink / raw)
To: Andrew Cooper
Cc: Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour, Roger Pau Monne, xen-devel
On 12.09.2023 18:35, Andrew Cooper wrote:
> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>> case MSR_K8_HWCR:
>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>> goto gp_fault;
>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
>> - ? K8_HWCR_TSC_FREQ_SEL : 0;
>> + /*
>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
>> + * also poke at PSTATE0.
>> + */
>
> While this is true, the justification for removing this is because
> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>
> Also because it's addition without writing into the migration stream was
> bogus irrespective of the specifics of the bit.
>
> I'm still of the opinion that it's buggy for OpenBSD to be looking at
> model specific bits when virtualised, but given my latest reading of the
> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
> can see TSC_FREQ_SEL.
I'm afraid I'm still at a loss seeing what wording in which PPR makes you
see a connection. If there was a connection, I'd like to ask that we at
least properly consider exposing PSTATE0 (or if necessary all PSTATEn) as
well, with zero value (i.e. in particular with the valid bit clear),
rather than exposing a r/o bit with the wrong value, upsetting Linux.
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 6:18 ` Jan Beulich
@ 2023-09-13 8:50 ` Roger Pau Monné
2023-09-13 11:02 ` [Xen PATCH] " Andrew Cooper
1 sibling, 0 replies; 18+ messages in thread
From: Roger Pau Monné @ 2023-09-13 8:50 UTC (permalink / raw)
To: Jan Beulich
Cc: Andrew Cooper, Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour, xen-devel
On Wed, Sep 13, 2023 at 08:18:57AM +0200, Jan Beulich wrote:
> On 12.09.2023 18:35, Andrew Cooper wrote:
> > On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
> >> --- a/xen/arch/x86/msr.c
> >> +++ b/xen/arch/x86/msr.c
> >> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >> case MSR_K8_HWCR:
> >> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> >> goto gp_fault;
> >> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
> >> - ? K8_HWCR_TSC_FREQ_SEL : 0;
> >> + /*
> >> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
> >> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
> >> + * also poke at PSTATE0.
> >> + */
> >
> > While this is true, the justification for removing this is because
> > TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
> >
> > Also because it's addition without writing into the migration stream was
> > bogus irrespective of the specifics of the bit.
> >
> > I'm still of the opinion that it's buggy for OpenBSD to be looking at
> > model specific bits when virtualised, but given my latest reading of the
> > AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
> > can see TSC_FREQ_SEL.
>
> I'm afraid I'm still at a loss seeing what wording in which PPR makes you
> see a connection. If there was a connection, I'd like to ask that we at
> least properly consider exposing PSTATE0 (or if necessary all PSTATEn) as
> well, with zero value (i.e. in particular with the valid bit clear),
> rather than exposing a r/o bit with the wrong value, upsetting Linux.
But PSTATEn is also non-architectural, so the bit meaning could in
theory change between models.
There seems to be no bit anywhere that signals whether PSTATEn is
available, as it's my reading that you can have PSTATEn without the
architectural PSTATE_{LIMIT,CTRL,STATUS} MSRs that are signaled by
HW_PSTATE CPUID bit.
Thanks, Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [Xen PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 6:18 ` Jan Beulich
2023-09-13 8:50 ` Roger Pau Monné
@ 2023-09-13 11:02 ` Andrew Cooper
2023-09-13 13:37 ` Jan Beulich
2023-09-13 20:31 ` Thomas Gleixner
1 sibling, 2 replies; 18+ messages in thread
From: Andrew Cooper @ 2023-09-13 11:02 UTC (permalink / raw)
To: Jan Beulich
Cc: Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour, Roger Pau Monne, xen-devel,
the arch/x86 maintainers
On 13/09/2023 7:18 am, Jan Beulich wrote:
> On 12.09.2023 18:35, Andrew Cooper wrote:
>> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/msr.c
>>> +++ b/xen/arch/x86/msr.c
>>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>> case MSR_K8_HWCR:
>>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>>> goto gp_fault;
>>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
>>> - ? K8_HWCR_TSC_FREQ_SEL : 0;
>>> + /*
>>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
>>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
>>> + * also poke at PSTATE0.
>>> + */
>> While this is true, the justification for removing this is because
>> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>>
>> Also because it's addition without writing into the migration stream was
>> bogus irrespective of the specifics of the bit.
>>
>> I'm still of the opinion that it's buggy for OpenBSD to be looking at
>> model specific bits when virtualised, but given my latest reading of the
>> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
>> can see TSC_FREQ_SEL.
> I'm afraid I'm still at a loss seeing what wording in which PPR makes you
> see a connection. If there was a connection, I'd like to ask that we at
> least properly consider exposing PSTATE0 (or if necessary all PSTATEn) as
> well, with zero value (i.e. in particular with the valid bit clear),
> rather than exposing a r/o bit with the wrong value, upsetting Linux.
This mess is caused by the erroneous advertising of a model specific
bit, and there's no way in hell that giving the guest even more model
specific data is going to fix it.
The PSTATE MSRs are entirely model specific, fully read/write, and the
Enable bit is not an enable bit; its a "not valid yet" bit that firmware
is required to adjust to be consistent across the coherency fabric.
Linux is simply wrong with it's printk() under virt, and wants adjusting.
~Andrew
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xen PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 11:02 ` [Xen PATCH] " Andrew Cooper
@ 2023-09-13 13:37 ` Jan Beulich
2023-09-13 20:31 ` Thomas Gleixner
1 sibling, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2023-09-13 13:37 UTC (permalink / raw)
To: Andrew Cooper
Cc: Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour, Roger Pau Monne, xen-devel,
the arch/x86 maintainers
On 13.09.2023 13:02, Andrew Cooper wrote:
> On 13/09/2023 7:18 am, Jan Beulich wrote:
>> On 12.09.2023 18:35, Andrew Cooper wrote:
>>> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
>>>> --- a/xen/arch/x86/msr.c
>>>> +++ b/xen/arch/x86/msr.c
>>>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>>> case MSR_K8_HWCR:
>>>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>>>> goto gp_fault;
>>>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
>>>> - ? K8_HWCR_TSC_FREQ_SEL : 0;
>>>> + /*
>>>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
>>>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
>>>> + * also poke at PSTATE0.
>>>> + */
>>> While this is true, the justification for removing this is because
>>> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>>>
>>> Also because it's addition without writing into the migration stream was
>>> bogus irrespective of the specifics of the bit.
>>>
>>> I'm still of the opinion that it's buggy for OpenBSD to be looking at
>>> model specific bits when virtualised, but given my latest reading of the
>>> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
>>> can see TSC_FREQ_SEL.
>> I'm afraid I'm still at a loss seeing what wording in which PPR makes you
>> see a connection. If there was a connection, I'd like to ask that we at
>> least properly consider exposing PSTATE0 (or if necessary all PSTATEn) as
>> well, with zero value (i.e. in particular with the valid bit clear),
>> rather than exposing a r/o bit with the wrong value, upsetting Linux.
>
> This mess is caused by the erroneous advertising of a model specific
> bit, and there's no way in hell that giving the guest even more model
> specific data is going to fix it.
>
> The PSTATE MSRs are entirely model specific, fully read/write, and the
> Enable bit is not an enable bit; its a "not valid yet" bit that firmware
> is required to adjust to be consistent across the coherency fabric.
>
> Linux is simply wrong with it's printk() under virt, and wants adjusting.
Assuming Roger agrees with this statement, then I think this is another
item wanting mentioning in the description. I then still wouldn't ack
the change, but I also wouldn't object to it anymore.
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xen PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 11:02 ` [Xen PATCH] " Andrew Cooper
2023-09-13 13:37 ` Jan Beulich
@ 2023-09-13 20:31 ` Thomas Gleixner
1 sibling, 0 replies; 18+ messages in thread
From: Thomas Gleixner @ 2023-09-13 20:31 UTC (permalink / raw)
To: Andrew Cooper, Jan Beulich
Cc: Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour, Roger Pau Monne, xen-devel,
the arch/x86 maintainers
On Wed, Sep 13 2023 at 12:02, Andrew Cooper wrote:
> The PSTATE MSRs are entirely model specific, fully read/write, and the
> Enable bit is not an enable bit; its a "not valid yet" bit that firmware
> is required to adjust to be consistent across the coherency fabric.
>
> Linux is simply wrong with it's printk() under virt, and wants adjusting.
No objections from my side.
Thanks,
tglx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-12 16:35 ` Andrew Cooper
2023-09-12 16:36 ` Andrew Cooper
2023-09-13 6:18 ` Jan Beulich
@ 2023-09-13 7:50 ` Roger Pau Monné
2023-09-13 10:50 ` Andrew Cooper
2 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2023-09-13 7:50 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Jan Beulich, Wei Liu, solene,
Marek Marczykowski-Górecki, Demi Marie Obenour
On Tue, Sep 12, 2023 at 05:35:15PM +0100, Andrew Cooper wrote:
> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
> > OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is
> > set, and will also attempt to unconditionally access HWCR if the TSC is
> > reported as Invariant.
> >
> > The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a
> > (bogus) warning message, but doing so at the cost of OpenBSD not booting is not
> > a suitable solution.
> >
> > In order to fix expose an empty HWCR.
>
> At first I was thinking a straight up revert, but AMD's CPUID Faulting
> is an architectural bit in here so it's worth keeping the register around.
A straight up revert won't work, because (as you notice below)
HWCR is architectural, so accesses must not fault.
> >
> > Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Not sure whether we want to expose something when is_cpufreq_controller() is
> > true, seeing as there's a special wrmsr handler for the same MSR in that case.
> > Likely should be done for PV only, but also likely quite bogus.
> >
> > Missing reported by as the issue came from the QubesOS tracker.
>
> Well - we can at least have a:
>
> Link: https://github.com/QubesOS/qubes-issues/issues/8502
Sure.
> in the commit message, and it's probably worth asking Solène / Marek
> (both CC'd) if they want a Reported-by tag.
I'm happy to add a Reported-by tag, just didn't have an email to use.
> > ---
> > xen/arch/x86/msr.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> > index 3f0450259cdf..964d500ff8a1 100644
> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> > case MSR_K8_HWCR:
> > if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> > goto gp_fault;
> > - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
> > - ? K8_HWCR_TSC_FREQ_SEL : 0;
> > + /*
> > + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
> > + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
> > + * also poke at PSTATE0.
> > + */
>
> While this is true, the justification for removing this is because
> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>
> Also because it's addition without writing into the migration stream was
> bogus irrespective of the specifics of the bit.
>
> I'm still of the opinion that it's buggy for OpenBSD to be looking at
> model specific bits when virtualised,
Well, I think we can argue that an OS is free to ignore the CPUID HV
bit and still boot on Xen (even if that leads to non-ideal decisions).
> but given my latest reading of the
> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
> can see TSC_FREQ_SEL.
Hm, there's no written down note that TSC_FREQ_SEL implies PSTATE0 to
be available (and PSTATE0 is not an architectural MSR), but I can see
how a guest can expect to fetch the P0 frequency if it sees
TSC_FREQ_SEL. It might have been more fail safe to check for
PSTATE_LIMIT not faulting before attempting to access PSTATE0.
> In some theoretical future where the toolstack better understands MSRs
> and (non)migratable VMs (which is the QubesOS usecase), then it would in
> principle be fine to construct a VM which can see the host TSC_FREQ_SEL
> and PSTATE* values.
>
> Preferably with an adjusted comment, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>
Thanks, will reply to other comments before taking the RB and
resending.
Roger.
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 7:50 ` [PATCH] " Roger Pau Monné
@ 2023-09-13 10:50 ` Andrew Cooper
2023-09-13 13:27 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2023-09-13 10:50 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel, Jan Beulich, Wei Liu, solene,
Marek Marczykowski-Górecki, Demi Marie Obenour
On 13/09/2023 8:50 am, Roger Pau Monné wrote:
> On Tue, Sep 12, 2023 at 05:35:15PM +0100, Andrew Cooper wrote:
>> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
>>> ---
>>> xen/arch/x86/msr.c | 8 ++++++--
>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
>>> index 3f0450259cdf..964d500ff8a1 100644
>>> --- a/xen/arch/x86/msr.c
>>> +++ b/xen/arch/x86/msr.c
>>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>> case MSR_K8_HWCR:
>>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>>> goto gp_fault;
>>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
>>> - ? K8_HWCR_TSC_FREQ_SEL : 0;
>>> + /*
>>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
>>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
>>> + * also poke at PSTATE0.
>>> + */
>> While this is true, the justification for removing this is because
>> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
>>
>> Also because it's addition without writing into the migration stream was
>> bogus irrespective of the specifics of the bit.
>>
>> I'm still of the opinion that it's buggy for OpenBSD to be looking at
>> model specific bits when virtualised,
> Well, I think we can argue that an OS is free to ignore the CPUID HV
> bit and still boot on Xen (even if that leads to non-ideal decisions).
An OS which keeps itself to architectural details that we do our very
best to be correct with, ought to function even if it ignores the HV bit.
But we're (deliberately) not doing model-specific-accurate emulations of
a specific platform. An OS which ignores details about the environment
it is operating in gets to keep the faults/malfunctions when it does
something illegal.
>> but given my latest reading of the
>> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
>> can see TSC_FREQ_SEL.
> Hm, there's no written down note that TSC_FREQ_SEL implies PSTATE0 to
> be available (and PSTATE0 is not an architectural MSR), but I can see
> how a guest can expect to fetch the P0 frequency if it sees
> TSC_FREQ_SEL.
The PPR is a reference of mostly autogenerated details and misc notes,
put together in a non- hand-write way, unlike the older BKWGs.
Lots of the information elided from public and partner-NDA versions is
"see TICKET/LINK for rational" type comments.
It is not a spec - it is a reference (the clue is even in the name)
aimed at people already familiar with the area. Do not fall into the
trap of thinking it it can be read as a spec.
~Andrew
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests
2023-09-13 10:50 ` Andrew Cooper
@ 2023-09-13 13:27 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2023-09-13 13:27 UTC (permalink / raw)
To: Andrew Cooper
Cc: xen-devel, Wei Liu, solene, Marek Marczykowski-Górecki,
Demi Marie Obenour, Roger Pau Monné
On 13.09.2023 12:50, Andrew Cooper wrote:
> On 13/09/2023 8:50 am, Roger Pau Monné wrote:
>> Hm, there's no written down note that TSC_FREQ_SEL implies PSTATE0 to
>> be available (and PSTATE0 is not an architectural MSR), but I can see
>> how a guest can expect to fetch the P0 frequency if it sees
>> TSC_FREQ_SEL.
>
> The PPR is a reference of mostly autogenerated details and misc notes,
> put together in a non- hand-write way, unlike the older BKWGs.
>
> Lots of the information elided from public and partner-NDA versions is
> "see TICKET/LINK for rational" type comments.
>
> It is not a spec - it is a reference (the clue is even in the name)
> aimed at people already familiar with the area. Do not fall into the
> trap of thinking it it can be read as a spec.
But then where is it written down that the bit set implies the PSTATEn
MSRs to exist?
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread