* [PATCH 0/2] x86/boot: More watchdog fixes
@ 2024-03-19 14:48 Andrew Cooper
2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Andrew Cooper @ 2024-03-19 14:48 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné
Andrew Cooper (2):
x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly
x86/boot: Support the watchdog on newer AMD systems
xen/arch/x86/nmi.c | 36 ++++++++++++++++--------------------
1 file changed, 16 insertions(+), 20 deletions(-)
base-commit: d2276b86e5eb8dd2617d917f7b49cdd1f29ac299
prerequisite-patch-id: 9595e30e0dde42b57dbad93a40db2cd338b26a58
--
2.30.2
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly 2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper @ 2024-03-19 14:48 ` Andrew Cooper 2024-03-19 16:49 ` Jan Beulich 2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper 2024-03-19 16:53 ` [PATCH 0/2] x86/boot: More watchdog fixes Jan Beulich 2 siblings, 1 reply; 7+ messages in thread From: Andrew Cooper @ 2024-03-19 14:48 UTC (permalink / raw) To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné Right now, if the user requests the watchdog on the command line, setup_apic_nmi_watchdog() will blindly assume that setting up the watchdog worked. Reuse nmi_perfctr_msr to identify when the watchdog has been set up. Rearrange setup_p6_watchdog() to not set nmi_perfctr_msr until the sanity checks are performed. Turn setup_p4_watchdog() into a void function, matching the others. If the watchdog isn't set up, inform the user and override to NMI_NONE, which will prevent check_nmi_watchdog() from claiming that all CPUs are stuck. e.g.: (XEN) alt table ffff82d040697c38 -> ffff82d0406a97f0 (XEN) Failed to configure NMI watchdog (XEN) Brought up 512 CPUs (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> This is all horrible code. --- xen/arch/x86/nmi.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c index 8994c50cb5e4..33f77a92047f 100644 --- a/xen/arch/x86/nmi.c +++ b/xen/arch/x86/nmi.c @@ -319,8 +319,6 @@ static void setup_p6_watchdog(unsigned counter) { unsigned int evntsel; - nmi_perfctr_msr = MSR_P6_PERFCTR(0); - if ( !nmi_p6_event_width && current_cpu_data.cpuid_level >= 0xa ) nmi_p6_event_width = MASK_EXTR(cpuid_eax(0xa), P6_EVENT_WIDTH_MASK); if ( !nmi_p6_event_width ) @@ -330,6 +328,8 @@ static void setup_p6_watchdog(unsigned counter) nmi_p6_event_width > BITS_PER_LONG ) return; + nmi_perfctr_msr = MSR_P6_PERFCTR(0); + clear_msr_range(MSR_P6_EVNTSEL(0), 2); clear_msr_range(MSR_P6_PERFCTR(0), 2); @@ -345,13 +345,13 @@ static void setup_p6_watchdog(unsigned counter) wrmsr(MSR_P6_EVNTSEL(0), evntsel, 0); } -static int setup_p4_watchdog(void) +static void setup_p4_watchdog(void) { uint64_t misc_enable; rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable); if (!(misc_enable & MSR_IA32_MISC_ENABLE_PERF_AVAIL)) - return 0; + return; nmi_perfctr_msr = MSR_P4_IQ_PERFCTR0; nmi_p4_cccr_val = P4_NMI_IQ_CCCR0; @@ -374,13 +374,12 @@ static int setup_p4_watchdog(void) clear_msr_range(0x3E0, 2); clear_msr_range(MSR_P4_BPU_CCCR0, 18); clear_msr_range(MSR_P4_BPU_PERFCTR0, 18); - + wrmsrl(MSR_P4_CRU_ESCR0, P4_NMI_CRU_ESCR0); wrmsrl(MSR_P4_IQ_CCCR0, P4_NMI_IQ_CCCR0 & ~P4_CCCR_ENABLE); write_watchdog_counter("P4_IQ_COUNTER0"); apic_write(APIC_LVTPC, APIC_DM_NMI); wrmsrl(MSR_P4_IQ_CCCR0, nmi_p4_cccr_val); - return 1; } void setup_apic_nmi_watchdog(void) @@ -395,8 +394,6 @@ void setup_apic_nmi_watchdog(void) case 0xf ... 0x19: setup_k7_watchdog(); break; - default: - return; } break; case X86_VENDOR_INTEL: @@ -407,14 +404,16 @@ void setup_apic_nmi_watchdog(void) : CORE_EVENT_CPU_CLOCKS_NOT_HALTED); break; case 15: - if (!setup_p4_watchdog()) - return; + setup_p4_watchdog(); break; - default: - return; } break; - default: + } + + if ( nmi_perfctr_msr == 0 ) + { + printk(XENLOG_WARNING "Failed to configure NMI watchdog\n"); + nmi_watchdog = NMI_NONE; return; } -- 2.30.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly 2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper @ 2024-03-19 16:49 ` Jan Beulich 0 siblings, 0 replies; 7+ messages in thread From: Jan Beulich @ 2024-03-19 16:49 UTC (permalink / raw) To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel On 19.03.2024 15:48, Andrew Cooper wrote: > Right now, if the user requests the watchdog on the command line, > setup_apic_nmi_watchdog() will blindly assume that setting up the watchdog > worked. Reuse nmi_perfctr_msr to identify when the watchdog has been set up. > > Rearrange setup_p6_watchdog() to not set nmi_perfctr_msr until the sanity > checks are performed. Turn setup_p4_watchdog() into a void function, matching > the others. > > If the watchdog isn't set up, inform the user and override to NMI_NONE, which > will prevent check_nmi_watchdog() from claiming that all CPUs are stuck. > > e.g.: > > (XEN) alt table ffff82d040697c38 -> ffff82d0406a97f0 > (XEN) Failed to configure NMI watchdog > (XEN) Brought up 512 CPUs > (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems 2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper 2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper @ 2024-03-19 14:48 ` Andrew Cooper 2024-03-19 16:51 ` Jan Beulich 2024-03-19 16:53 ` [PATCH 0/2] x86/boot: More watchdog fixes Jan Beulich 2 siblings, 1 reply; 7+ messages in thread From: Andrew Cooper @ 2024-03-19 14:48 UTC (permalink / raw) To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné The MSRs used by setup_k7_watchdog() are architectural in 64bit. The Unit Select (0x76, cycles not in halt state) isn't, but it hasn't changed in 23 years, making this a trend likely to continue. Drop the family check. If the Unit Select does happen to change meaning in the future, check_nmi_watchdog() will still notice the watchdog not operating as expected. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/nmi.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c index 33f77a92047f..f6329cb0270e 100644 --- a/xen/arch/x86/nmi.c +++ b/xen/arch/x86/nmi.c @@ -387,15 +387,12 @@ void setup_apic_nmi_watchdog(void) if ( nmi_watchdog == NMI_NONE ) return; - switch (boot_cpu_data.x86_vendor) { + switch ( boot_cpu_data.x86_vendor ) + { case X86_VENDOR_AMD: - switch (boot_cpu_data.x86) { - case 6: - case 0xf ... 0x19: - setup_k7_watchdog(); - break; - } + setup_k7_watchdog(); break; + case X86_VENDOR_INTEL: switch (boot_cpu_data.x86) { case 6: -- 2.30.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems 2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper @ 2024-03-19 16:51 ` Jan Beulich 2024-03-19 16:54 ` Andrew Cooper 0 siblings, 1 reply; 7+ messages in thread From: Jan Beulich @ 2024-03-19 16:51 UTC (permalink / raw) To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel On 19.03.2024 15:48, Andrew Cooper wrote: > The MSRs used by setup_k7_watchdog() are architectural in 64bit. The Unit > Select (0x76, cycles not in halt state) isn't, but it hasn't changed in 23 > years, making this a trend likely to continue. > > Drop the family check. If the Unit Select does happen to change meaning in > the future, check_nmi_watchdog() will still notice the watchdog not operating > as expected. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> > --- a/xen/arch/x86/nmi.c > +++ b/xen/arch/x86/nmi.c > @@ -387,15 +387,12 @@ void setup_apic_nmi_watchdog(void) > if ( nmi_watchdog == NMI_NONE ) > return; > > - switch (boot_cpu_data.x86_vendor) { > + switch ( boot_cpu_data.x86_vendor ) > + { > case X86_VENDOR_AMD: > - switch (boot_cpu_data.x86) { > - case 6: Just to mention it - this case label has been dead code anyway for about 10 years. Jan ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems 2024-03-19 16:51 ` Jan Beulich @ 2024-03-19 16:54 ` Andrew Cooper 0 siblings, 0 replies; 7+ messages in thread From: Andrew Cooper @ 2024-03-19 16:54 UTC (permalink / raw) To: Jan Beulich; +Cc: Roger Pau Monné, Xen-devel On 19/03/2024 4:51 pm, Jan Beulich wrote: > On 19.03.2024 15:48, Andrew Cooper wrote: >> The MSRs used by setup_k7_watchdog() are architectural in 64bit. The Unit >> Select (0x76, cycles not in halt state) isn't, but it hasn't changed in 23 >> years, making this a trend likely to continue. >> >> Drop the family check. If the Unit Select does happen to change meaning in >> the future, check_nmi_watchdog() will still notice the watchdog not operating >> as expected. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Reviewed-by: Jan Beulich <jbeulich@suse.com> Thanks. > >> --- a/xen/arch/x86/nmi.c >> +++ b/xen/arch/x86/nmi.c >> @@ -387,15 +387,12 @@ void setup_apic_nmi_watchdog(void) >> if ( nmi_watchdog == NMI_NONE ) >> return; >> >> - switch (boot_cpu_data.x86_vendor) { >> + switch ( boot_cpu_data.x86_vendor ) >> + { >> case X86_VENDOR_AMD: >> - switch (boot_cpu_data.x86) { >> - case 6: > Just to mention it - this case label has been dead code anyway for about 10 > years. Yeah... I noticed that after writing the commit message, although it's not very easy to slip in. Also it's 25 years since the K7 was released (June 1999), because I apparently can't count. ~Andrew ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/2] x86/boot: More watchdog fixes 2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper 2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper 2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper @ 2024-03-19 16:53 ` Jan Beulich 2 siblings, 0 replies; 7+ messages in thread From: Jan Beulich @ 2024-03-19 16:53 UTC (permalink / raw) To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel On 19.03.2024 15:48, Andrew Cooper wrote: > Andrew Cooper (2): > x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly > x86/boot: Support the watchdog on newer AMD systems > > xen/arch/x86/nmi.c | 36 ++++++++++++++++-------------------- > 1 file changed, 16 insertions(+), 20 deletions(-) With the R-b-s provided, may I ask that in return you also look at my watchdog related patches ("x86: adjust initial setting of watchdog kind" and "x86: increase NMI timer frequency if necessary")? They've been sitting there for soon 2 months ... Jan ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-03-19 16:54 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-19 14:48 [PATCH 0/2] x86/boot: More watchdog fixes Andrew Cooper 2024-03-19 14:48 ` [PATCH 1/2] x86/boot: Fix setup_apic_nmi_watchdog() to fail more cleanly Andrew Cooper 2024-03-19 16:49 ` Jan Beulich 2024-03-19 14:48 ` [PATCH 2/2] x86/boot: Support the watchdog on newer AMD systems Andrew Cooper 2024-03-19 16:51 ` Jan Beulich 2024-03-19 16:54 ` Andrew Cooper 2024-03-19 16:53 ` [PATCH 0/2] x86/boot: More watchdog fixes Jan Beulich
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.