* [PATCH 0/2] x86/boot: More watchdog fixes
@ 2024-03-19 14:48 Andrew Cooper
2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Andrew Cooper @ 2024-03-19 14:48 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné
Andrew Cooper (2):
x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly
x86/boot: Support the watchdog on newer AMD systems
xen/arch/x86/nmi.c | 36 ++++++++++++++++--------------------
1 file changed, 16 insertions(+), 20 deletions(-)
base-commit: d2276b86e5eb8dd2617d917f7b49cdd1f29ac299
prerequisite-patch-id: 9595e30e0dde42b57dbad93a40db2cd338b26a58
--
2.30.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly
2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper
@ 2024-03-19 14:48 ` Andrew Cooper
2024-03-19 16:49 ` Jan Beulich
2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper
2024-03-19 16:53 ` [PATCH 0/2] x86/boot: More watchdog fixes Jan Beulich
2 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2024-03-19 14:48 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné
Right now, if the user requests the watchdog on the command line,
setup_apic_nmi_watchdog() will blindly assume that setting up the watchdog
worked. Reuse nmi_perfctr_msr to identify when the watchdog has been set up.
Rearrange setup_p6_watchdog() to not set nmi_perfctr_msr until the sanity
checks are performed. Turn setup_p4_watchdog() into a void function, matching
the others.
If the watchdog isn't set up, inform the user and override to NMI_NONE, which
will prevent check_nmi_watchdog() from claiming that all CPUs are stuck.
e.g.:
(XEN) alt table ffff82d040697c38 -> ffff82d0406a97f0
(XEN) Failed to configure NMI watchdog
(XEN) Brought up 512 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
This is all horrible code.
---
xen/arch/x86/nmi.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index 8994c50cb5e4..33f77a92047f 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -319,8 +319,6 @@ static void setup_p6_watchdog(unsigned counter)
{
unsigned int evntsel;
- nmi_perfctr_msr = MSR_P6_PERFCTR(0);
-
if ( !nmi_p6_event_width && current_cpu_data.cpuid_level >= 0xa )
nmi_p6_event_width = MASK_EXTR(cpuid_eax(0xa), P6_EVENT_WIDTH_MASK);
if ( !nmi_p6_event_width )
@@ -330,6 +328,8 @@ static void setup_p6_watchdog(unsigned counter)
nmi_p6_event_width > BITS_PER_LONG )
return;
+ nmi_perfctr_msr = MSR_P6_PERFCTR(0);
+
clear_msr_range(MSR_P6_EVNTSEL(0), 2);
clear_msr_range(MSR_P6_PERFCTR(0), 2);
@@ -345,13 +345,13 @@ static void setup_p6_watchdog(unsigned counter)
wrmsr(MSR_P6_EVNTSEL(0), evntsel, 0);
}
-static int setup_p4_watchdog(void)
+static void setup_p4_watchdog(void)
{
uint64_t misc_enable;
rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
if (!(misc_enable & MSR_IA32_MISC_ENABLE_PERF_AVAIL))
- return 0;
+ return;
nmi_perfctr_msr = MSR_P4_IQ_PERFCTR0;
nmi_p4_cccr_val = P4_NMI_IQ_CCCR0;
@@ -374,13 +374,12 @@ static int setup_p4_watchdog(void)
clear_msr_range(0x3E0, 2);
clear_msr_range(MSR_P4_BPU_CCCR0, 18);
clear_msr_range(MSR_P4_BPU_PERFCTR0, 18);
-
+
wrmsrl(MSR_P4_CRU_ESCR0, P4_NMI_CRU_ESCR0);
wrmsrl(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0 & ~P4_CCCR_ENABLE);
write_watchdog_counter("P4_IQ_COUNTER0");
apic_write(APIC_LVTPC, APIC_DM_NMI);
wrmsrl(MSR_P4_IQ_CCCR0, nmi_p4_cccr_val);
- return 1;
}
void setup_apic_nmi_watchdog(void)
@@ -395,8 +394,6 @@ void setup_apic_nmi_watchdog(void)
case 0xf ... 0x19:
setup_k7_watchdog();
break;
- default:
- return;
}
break;
case X86_VENDOR_INTEL:
@@ -407,14 +404,16 @@ void setup_apic_nmi_watchdog(void)
: CORE_EVENT_CPU_CLOCKS_NOT_HALTED);
break;
case 15:
- if (!setup_p4_watchdog())
- return;
+ setup_p4_watchdog();
break;
- default:
- return;
}
break;
- default:
+ }
+
+ if ( nmi_perfctr_msr == 0 )
+ {
+ printk(XENLOG_WARNING "Failed to configure NMI watchdog\n");
+ nmi_watchdog = NMI_NONE;
return;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems
2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper
2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper
@ 2024-03-19 14:48 ` Andrew Cooper
2024-03-19 16:51 ` Jan Beulich
2024-03-19 16:53 ` [PATCH 0/2] x86/boot: More watchdog fixes Jan Beulich
2 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2024-03-19 14:48 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné
The MSRs used by setup_k7_watchdog() are architectural in 64bit. The Unit
Select (0x76, cycles not in halt state) isn't, but it hasn't changed in 23
years, making this a trend likely to continue.
Drop the family check. If the Unit Select does happen to change meaning in
the future, check_nmi_watchdog() will still notice the watchdog not operating
as expected.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/nmi.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index 33f77a92047f..f6329cb0270e 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -387,15 +387,12 @@ void setup_apic_nmi_watchdog(void)
if ( nmi_watchdog == NMI_NONE )
return;
- switch (boot_cpu_data.x86_vendor) {
+ switch ( boot_cpu_data.x86_vendor )
+ {
case X86_VENDOR_AMD:
- switch (boot_cpu_data.x86) {
- case 6:
- case 0xf ... 0x19:
- setup_k7_watchdog();
- break;
- }
+ setup_k7_watchdog();
break;
+
case X86_VENDOR_INTEL:
switch (boot_cpu_data.x86) {
case 6:
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly
2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper
@ 2024-03-19 16:49 ` Jan Beulich
0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2024-03-19 16:49 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel
On 19.03.2024 15:48, Andrew Cooper wrote:
> Right now, if the user requests the watchdog on the command line,
> setup_apic_nmi_watchdog() will blindly assume that setting up the watchdog
> worked. Reuse nmi_perfctr_msr to identify when the watchdog has been set up.
>
> Rearrange setup_p6_watchdog() to not set nmi_perfctr_msr until the sanity
> checks are performed. Turn setup_p4_watchdog() into a void function, matching
> the others.
>
> If the watchdog isn't set up, inform the user and override to NMI_NONE, which
> will prevent check_nmi_watchdog() from claiming that all CPUs are stuck.
>
> e.g.:
>
> (XEN) alt table ffff82d040697c38 -> ffff82d0406a97f0
> (XEN) Failed to configure NMI watchdog
> (XEN) Brought up 512 CPUs
> (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems
2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper
@ 2024-03-19 16:51 ` Jan Beulich
2024-03-19 16:54 ` Andrew Cooper
0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2024-03-19 16:51 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel
On 19.03.2024 15:48, Andrew Cooper wrote:
> The MSRs used by setup_k7_watchdog() are architectural in 64bit. The Unit
> Select (0x76, cycles not in halt state) isn't, but it hasn't changed in 23
> years, making this a trend likely to continue.
>
> Drop the family check. If the Unit Select does happen to change meaning in
> the future, check_nmi_watchdog() will still notice the watchdog not operating
> as expected.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
> --- a/xen/arch/x86/nmi.c
> +++ b/xen/arch/x86/nmi.c
> @@ -387,15 +387,12 @@ void setup_apic_nmi_watchdog(void)
> if ( nmi_watchdog == NMI_NONE )
> return;
>
> - switch (boot_cpu_data.x86_vendor) {
> + switch ( boot_cpu_data.x86_vendor )
> + {
> case X86_VENDOR_AMD:
> - switch (boot_cpu_data.x86) {
> - case 6:
Just to mention it - this case label has been dead code anyway for about 10
years.
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/2] x86/boot: More watchdog fixes
2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper
2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper
2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper
@ 2024-03-19 16:53 ` Jan Beulich
2 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2024-03-19 16:53 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel
On 19.03.2024 15:48, Andrew Cooper wrote:
> Andrew Cooper (2):
> x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly
> x86/boot: Support the watchdog on newer AMD systems
>
> xen/arch/x86/nmi.c | 36 ++++++++++++++++--------------------
> 1 file changed, 16 insertions(+), 20 deletions(-)
With the R-b-s provided, may I ask that in return you also look at my
watchdog related patches ("x86: adjust initial setting of watchdog kind"
and "x86: increase NMI timer frequency if necessary")? They've been
sitting there for soon 2 months ...
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems
2024-03-19 16:51 ` Jan Beulich
@ 2024-03-19 16:54 ` Andrew Cooper
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Cooper @ 2024-03-19 16:54 UTC (permalink / raw)
To: Jan Beulich; +Cc: Roger Pau Monné, Xen-devel
On 19/03/2024 4:51 pm, Jan Beulich wrote:
> On 19.03.2024 15:48, Andrew Cooper wrote:
>> The MSRs used by setup_k7_watchdog() are architectural in 64bit. The Unit
>> Select (0x76, cycles not in halt state) isn't, but it hasn't changed in 23
>> years, making this a trend likely to continue.
>>
>> Drop the family check. If the Unit Select does happen to change meaning in
>> the future, check_nmi_watchdog() will still notice the watchdog not operating
>> as expected.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Thanks.
>
>> --- a/xen/arch/x86/nmi.c
>> +++ b/xen/arch/x86/nmi.c
>> @@ -387,15 +387,12 @@ void setup_apic_nmi_watchdog(void)
>> if ( nmi_watchdog == NMI_NONE )
>> return;
>>
>> - switch (boot_cpu_data.x86_vendor) {
>> + switch ( boot_cpu_data.x86_vendor )
>> + {
>> case X86_VENDOR_AMD:
>> - switch (boot_cpu_data.x86) {
>> - case 6:
> Just to mention it - this case label has been dead code anyway for about 10
> years.
Yeah... I noticed that after writing the commit message, although it's
not very easy to slip in.
Also it's 25 years since the K7 was released (June 1999), because I
apparently can't count.
~Andrew
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-03-19 16:54 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper
2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper
2024-03-19 16:49 ` Jan Beulich
2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper
2024-03-19 16:51 ` Jan Beulich
2024-03-19 16:54 ` Andrew Cooper
2024-03-19 16:53 ` [PATCH 0/2] x86/boot: More watchdog fixes Jan Beulich
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.