* [PATCH 0/2] watchdog/hpwdt: Cleanup Claiming NMI
@ 2023-11-09 2:44 Jerry Hoemann
2023-11-09 2:44 ` [PATCH 1/2] watchdog/hpwdt: Only claim UNKNOWN NMI if from iLO Jerry Hoemann
From: Jerry Hoemann @ 2023-11-09 2:44 UTC
To: linux, wim; +Cc: linux-watchdog, linux-kernel, Jerry Hoemann
In addition to being a watchdog, hpwdt participates in error
containment on ProLiant systems.
On legacy platforms (Gen 8/Gen 9 and earlier), fatal IO errors would be
signaled as an IO CHECK NMI with the expectation that hpwdt would be present
to receive the NMI and crash the system, thus containing the error.
A problem was that hpwdt did not discriminate enough in accepting NMIs.
This could lead to problems if an NMI generated for another subsystem
was not claimed by that subsystem, became UNKNOWN, and was then claimed
by hpwdt. Application profiling was one such example. While the profiling
issue was fixed, hpwdt should avoid claiming NMIs not intended for it.
In the iLO 5 time frame, checks were added to make hpwdt more selective
in claiming NMIs. This patchset cleans up the checks and enables them
for future versions of iLO.
Jerry Hoemann (2):
watchdog/hpwdt: Only claim UNKNOWN NMI if from iLO
watchdog/hpwdt: Remove checks on ilo5
drivers/watchdog/hpwdt.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
--
2.41.0
* [PATCH 1/2] watchdog/hpwdt: Only claim UNKNOWN NMI if from iLO
From: Jerry Hoemann @ 2023-11-09 2:44 UTC
To: linux, wim; +Cc: linux-watchdog, linux-kernel, Jerry Hoemann
Do not claim NMIs that are neither watchdog NMIs nor errors, as doing
so could cause unnecessary crashes.
The code does this, but only for iLO5.
The intent was to preserve legacy (Gen8/9 and earlier) semantics of
using hpwdt for error containment, as hardware/firmware would signal
fatal IO errors as an NMI with the expectation of hpwdt crashing
the system.
But these IO errors should be received by hpwdt as an NMI_IO_CHECK. So
the current code is overly permissive, and the test should not be limited
to iLO 5.
This enables this protection for future iLO versions not matching current
PCI IDs.
Fixes: 62290a5c194b ("watchdog: hpwdt: Claim NMIs generated by iLO5")
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
---
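For illustration only, the effect of this change on the first check in
hpwdt_pretimeout() can be modeled as a small standalone C sketch. This is
not driver code: the enum names and both functions are hypothetical
stand-ins, only the one condition touched by the diff is modeled, and
FALLS_THROUGH simply means the handler continues past this check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NMI reasons the handler may be called with. */
enum nmi_reason { REASON_UNKNOWN, REASON_SERR, REASON_IO_CHECK };

/* DECLINED models "return NMI_DONE"; FALLS_THROUGH models continuing on. */
enum outcome { DECLINED, FALLS_THROUGH };

/* Before the patch: an UNKNOWN NMI not flagged by iLO is declined only on iLO 5. */
enum outcome claim_before(bool ilo5, enum nmi_reason reason, bool mynmi)
{
	if (ilo5 && reason == REASON_UNKNOWN && !mynmi)
		return DECLINED;
	return FALLS_THROUGH;
}

/* After the patch: the same NMI is declined regardless of iLO generation. */
enum outcome claim_after(enum nmi_reason reason, bool mynmi)
{
	if (reason == REASON_UNKNOWN && !mynmi)
		return DECLINED;
	return FALLS_THROUGH;
}
```

In this model, the only inputs whose outcome changes are UNKNOWN NMIs
without iLO status on a system where ilo5 is false. IO CHECK and iLO-flagged
NMIs fall through exactly as before.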
drivers/watchdog/hpwdt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index f79f932bca14..79ed1626d8ea 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -178,7 +178,7 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
"3. OA Forward Progress Log\n"
"4. iLO Event Log";
- if (ilo5 && ulReason == NMI_UNKNOWN && !mynmi)
+ if (ulReason == NMI_UNKNOWN && !mynmi)
return NMI_DONE;
if (ilo5 && !pretimeout && !mynmi)
--
2.41.0
* [PATCH 2/2] watchdog/hpwdt: Remove checks on ilo5
From: Jerry Hoemann @ 2023-11-09 2:44 UTC
To: linux, wim; +Cc: linux-watchdog, linux-kernel, Jerry Hoemann
This test doesn't really do much.
ProLiant systems recent enough to have iLO 5 no longer send watchdog NMIs
as IO CHECK. They are presented to hpwdt_pretimeout as UNKNOWN, which is
covered by the preceding if statement.
The test could have been useful in the ERROR cases to validate that FW set
the nmistat register correctly, but as the default value of "pretimeout" is
true, this test was almost always skipped during platform validation.
Without this if statement, we can remove all references to the variable ilo5.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
---
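As a rough illustration of the redundancy argument, the two checks can be
modeled in standalone C and compared over every input combination. This is
a sketch, not driver code: the enum and all three functions are
hypothetical, and it assumes the handler is only invoked for the UNKNOWN,
SERR and IO CHECK reasons.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NMI reasons the handler may be called with. */
enum nmi_reason { REASON_UNKNOWN, REASON_SERR, REASON_IO_CHECK };

/* Models both early returns as they stand before this patch. */
bool declined_with_check(enum nmi_reason reason, bool ilo5,
			 bool pretimeout, bool mynmi)
{
	if (reason == REASON_UNKNOWN && !mynmi)
		return true;
	if (ilo5 && !pretimeout && !mynmi)
		return true;
	return false;
}

/* Models the single early return that remains after this patch. */
bool declined_without_check(enum nmi_reason reason, bool mynmi)
{
	return reason == REASON_UNKNOWN && !mynmi;
}

/* Counts the (ilo5, pretimeout, mynmi) combinations where the two disagree. */
int count_disagreements(enum nmi_reason reason)
{
	int count = 0;

	for (int ilo5 = 0; ilo5 <= 1; ilo5++)
		for (int pretimeout = 0; pretimeout <= 1; pretimeout++)
			for (int mynmi = 0; mynmi <= 1; mynmi++)
				if (declined_with_check(reason, ilo5, pretimeout, mynmi) !=
				    declined_without_check(reason, mynmi))
					count++;
	return count;
}
```

In this model the removed check never changes the outcome for UNKNOWN NMIs.
For SERR and IO CHECK it matters only when ilo5 is set, pretimeout is off,
and iLO did not flag the NMI, which is the rarely exercised case described
above.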
drivers/watchdog/hpwdt.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 79ed1626d8ea..138dc8d8ca3d 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -33,7 +33,6 @@
#define DEFAULT_MARGIN 30
#define PRETIMEOUT_SEC 9
-static bool ilo5;
static unsigned int soft_margin = DEFAULT_MARGIN; /* in seconds */
static bool nowayout = WATCHDOG_NOWAYOUT;
static bool pretimeout = IS_ENABLED(CONFIG_HPWDT_NMI_DECODING);
@@ -181,9 +180,6 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
if (ulReason == NMI_UNKNOWN && !mynmi)
return NMI_DONE;
- if (ilo5 && !pretimeout && !mynmi)
- return NMI_DONE;
-
if (kdumptimeout < 0)
hpwdt_stop();
else if (kdumptimeout == 0)
@@ -363,9 +359,6 @@ static int hpwdt_init_one(struct pci_dev *dev,
pretimeout ? "on" : "off");
dev_info(&dev->dev, "kdumptimeout: %d.\n", kdumptimeout);
- if (dev->subsystem_vendor == PCI_VENDOR_ID_HP_3PAR)
- ilo5 = true;
-
return 0;
error_wd_register:
--
2.41.0
* Re: [PATCH 0/2] watchdog/hpwdt: Cleanup Claiming NMI
From: Jerry Hoemann @ 2023-11-27 3:16 UTC
To: linux, wim; +Cc: linux-watchdog, linux-kernel
On Wed, Nov 08, 2023 at 07:44:05PM -0700, Jerry Hoemann wrote:
> In addition to being a watchdog, hpwdt participates in error
> containment on ProLiant systems.
>
> On legacy platforms (Gen 8/Gen 9 and earlier), fatal IO errors would be
> signaled as an IO CHECK NMI with the expectation that hpwdt would be present
> to receive the NMI and crash the system, thus containing the error.
>
> A problem was that hpwdt did not discriminate enough in accepting NMIs.
> This could lead to problems if an NMI generated for another subsystem
> was not claimed by that subsystem, became UNKNOWN, and was then claimed
> by hpwdt. Application profiling was one such example. While the profiling
> issue was fixed, hpwdt should avoid claiming NMIs not intended for it.
>
> In the iLO 5 time frame, checks were added to make hpwdt more selective
> in claiming NMIs. This patchset cleans up the checks and enables them
> for future versions of iLO.
>
Hi Guenter,
Was there a problem with this patch set?
Thanks
Jerry
>
> Jerry Hoemann (2):
> watchdog/hpwdt: Only claim UNKNOWN NMI if from iLO
> watchdog/hpwdt: Remove checks on ilo5
>
> drivers/watchdog/hpwdt.c | 9 +--------
> 1 file changed, 1 insertion(+), 8 deletions(-)
>
> --
> 2.41.0
--
-----------------------------------------------------------------------------
Jerry Hoemann Software Engineer Hewlett Packard Enterprise
-----------------------------------------------------------------------------