* [PATCH v3 0/2] Reduce CPU consumption after panic
@ 2025-04-29 15:06 carlos.bilbao
2025-04-29 15:06 ` [PATCH v3 1/2] panic: Allow for dynamic custom behavior " carlos.bilbao
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: carlos.bilbao @ 2025-04-29 15:06 UTC
To: tglx, seanjc, jan.glauber
Cc: bilbao, pmladek, akpm, jani.nikula, linux-kernel, gregkh,
takakura, john.ogness, Carlos Bilbao
From: Carlos Bilbao <carlos.bilbao@kernel.org>
Provide a priority-based mechanism to set the behavior of the kernel at
the post-panic stage -- the current default is a waste of CPU except for
cases with a console that generates insightful output.
In the v1 cover letter [1], I illustrated the potential savings with an
experiment on VMs, where halting reduced CPU usage by more than 70%. The
main delta of v2 [2] was that, instead of a weak function that archs can
override, we provided a flexible priority-based mechanism (following
suggestions by Sean Christopherson), panic_set_handling().
Compared to v2 [2], the main changes in this third version are that (1) we
don't set a default function for panic_halt() and (2) we provide a comment
for the x86 implementation describing the check for console.
[1] https://lore.kernel.org/all/20250326151204.67898-1-carlos.bilbao@kernel.org/
[2] https://lore.kernel.org/all/20250428215952.1332985-1-carlos.bilbao@kernel.org/
Carlos:
panic: Allow for dynamic custom behavior after panic
x86/panic: Add x86_panic_handler as default post-panic behavior
---
arch/x86/kernel/setup.c | 17 +++++++++++++++++
include/linux/panic.h | 2 ++
kernel/panic.c | 22 ++++++++++++++++++++++
3 files changed, 41 insertions(+)
* [PATCH v3 1/2] panic: Allow for dynamic custom behavior after panic
2025-04-29 15:06 [PATCH v3 0/2] Reduce CPU consumption after panic carlos.bilbao
@ 2025-04-29 15:06 ` carlos.bilbao
2025-04-29 15:06 ` [PATCH v3 2/2] x86/panic: Add x86_panic_handler as default post-panic behavior carlos.bilbao
2025-04-29 20:39 ` [PATCH v3 0/2] Reduce CPU consumption after panic Andrew Morton
2 siblings, 0 replies; 16+ messages in thread
From: carlos.bilbao @ 2025-04-29 15:06 UTC
To: tglx, seanjc, jan.glauber
Cc: bilbao, pmladek, akpm, jani.nikula, linux-kernel, gregkh,
takakura, john.ogness, Carlos Bilbao
From: Carlos Bilbao <carlos.bilbao@kernel.org>
Introduce panic_set_handling() to allow overriding the default post-panic
behavior with a priority-based mechanism.
Signed-off-by: Carlos Bilbao (DigitalOcean) <carlos.bilbao@kernel.org>
Reviewed-by: Jan Glauber (DigitalOcean) <jan.glauber@gmail.com>
---
include/linux/panic.h | 2 ++
kernel/panic.c | 22 ++++++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/include/linux/panic.h b/include/linux/panic.h
index 2494d51707ef..cf8d4a944407 100644
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -98,4 +98,6 @@ extern void add_taint(unsigned flag, enum lockdep_ok);
extern int test_taint(unsigned flag);
extern unsigned long get_taint(void);
+void panic_set_handling(void (*fn)(void), int priority);
+
#endif /* _LINUX_PANIC_H */
diff --git a/kernel/panic.c b/kernel/panic.c
index a3889f38153d..559304546f2e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -276,6 +276,25 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
crash_smp_send_stop();
}
+/*
+ * If set, this function is called after a kernel panic is handled. It can
+ * be assigned using panic_set_handling(), which supports priority-based
+ * logic. For example, specific architectures may provide a default handler
+ * (priority 0) that halts the system to conserve CPU resources.
+ */
+static void (*panic_halt)(void);
+
+static int panic_halt_priority;
+
+void panic_set_handling(void (*fn)(void), int priority)
+{
+ if (panic_halt && priority <= panic_halt_priority)
+ return;
+
+ panic_halt_priority = priority;
+ panic_halt = fn;
+}
+
/**
* panic - halt the system
* @fmt: The text string to print
@@ -467,6 +486,9 @@ void panic(const char *fmt, ...)
console_flush_on_panic(CONSOLE_FLUSH_PENDING);
nbcon_atomic_flush_unsafe();
+ if (panic_halt)
+ panic_halt();
+
local_irq_enable();
for (i = 0; ; i += PANIC_TIMER_STEP) {
touch_softlockup_watchdog();
--
2.47.1
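The priority rule in panic_set_handling() above can be tried out in
isolation. What follows is only a minimal userspace sketch, not kernel code:
the registration logic is copied from the diff, while the handler names
(arch_default, other_default, device_override), the handler_id enum and
run_post_panic() are hypothetical and exist purely for illustration.

```c
/* Userspace model of the registration logic from the patch above. */

/* Hypothetical identifiers so the effect of each handler can be observed. */
enum handler_id { HANDLER_NONE, HANDLER_ARCH, HANDLER_OTHER, HANDLER_DEVICE };

static enum handler_id last_handler = HANDLER_NONE;

/* Mirrors the static state the patch adds to kernel/panic.c. */
static void (*panic_halt)(void);
static int panic_halt_priority;

/*
 * Same logic as the patch: the first registration is always accepted;
 * afterwards only a strictly higher priority replaces the handler.
 */
void panic_set_handling(void (*fn)(void), int priority)
{
	if (panic_halt && priority <= panic_halt_priority)
		return;

	panic_halt_priority = priority;
	panic_halt = fn;
}

/* Hypothetical handlers; a real one would halt instead of recording an id. */
void arch_default(void)    { last_handler = HANDLER_ARCH; }
void other_default(void)   { last_handler = HANDLER_OTHER; }
void device_override(void) { last_handler = HANDLER_DEVICE; }

/* Stand-in for the hook added to panic(): call the handler if one is set. */
enum handler_id run_post_panic(void)
{
	last_handler = HANDLER_NONE;
	if (panic_halt)
		panic_halt();
	return last_handler;
}
```

In this model, registering arch_default at priority 0 and then other_default
at priority 0 leaves arch_default in place, because an equal priority does
not win; registering device_override at priority 1 replaces it. One
consequence worth noting is that a later registration at the same priority
is silently ignored, so the caller gets no indication of which handler is
active.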
* [PATCH v3 2/2] x86/panic: Add x86_panic_handler as default post-panic behavior
2025-04-29 15:06 [PATCH v3 0/2] Reduce CPU consumption after panic carlos.bilbao
2025-04-29 15:06 ` [PATCH v3 1/2] panic: Allow for dynamic custom behavior " carlos.bilbao
@ 2025-04-29 15:06 ` carlos.bilbao
2025-04-29 20:39 ` [PATCH v3 0/2] Reduce CPU consumption after panic Andrew Morton
2 siblings, 0 replies; 16+ messages in thread
From: carlos.bilbao @ 2025-04-29 15:06 UTC
To: tglx, seanjc, jan.glauber
Cc: bilbao, pmladek, akpm, jani.nikula, linux-kernel, gregkh,
takakura, john.ogness, Carlos Bilbao
From: Carlos Bilbao <carlos.bilbao@kernel.org>
Add x86_panic_handler() as the default post-panic behavior for x86,
registered via panic_set_handling(). Instead of entering the busy-wait
loop, it halts when no other CPU is writing to the console, saving CPU
cycles.
Signed-off-by: Carlos Bilbao (DigitalOcean) <carlos.bilbao@kernel.org>
Reviewed-by: Jan Glauber (DigitalOcean) <jan.glauber@gmail.com>
---
arch/x86/kernel/setup.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 9d2a13b37833..abca4a9b5e0a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -16,6 +16,7 @@
#include <linux/initrd.h>
#include <linux/iscsi_ibft.h>
#include <linux/memblock.h>
+#include <linux/panic.h>
#include <linux/panic_notifier.h>
#include <linux/pci.h>
#include <linux/root_dev.h>
@@ -837,6 +838,20 @@ static void __init x86_report_nx(void)
}
}
+/*
+ * Halt the CPU to save resources after panic is handled. If
+ * console_trylock() succeeds, no other CPU is currently writing to the
+ * console
+ *
+ */
+static void x86_panic_handler(void)
+{
+ if (console_trylock()) {
+ console_unlock();
+ safe_halt();
+ }
+}
+
/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures
@@ -1252,6 +1267,8 @@ void __init setup_arch(char **cmdline_p)
#endif
unwind_init();
+
+ panic_set_handling(x86_panic_handler, 0);
}
#ifdef CONFIG_X86_32
--
2.47.1
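The control flow of x86_panic_handler() above (halt only if the console lock
can be taken) can be modelled outside the kernel. The sketch below is an
assumption-laden userspace model, not the kernel implementation: a C11
atomic_flag stands in for the console lock, and every model_* name and
halt_count is hypothetical. It only illustrates which path is taken when the
lock is free versus held.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the console lock; NOT the kernel's console lock. */
static atomic_flag console_busy = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
bool model_console_trylock(void)
{
	return !atomic_flag_test_and_set(&console_busy);
}

void model_console_unlock(void)
{
	atomic_flag_clear(&console_busy);
}

/* Counts how often the modelled CPU would have halted. */
static int halt_count;

/* Stand-in for safe_halt(); a real halt cannot be shown in userspace. */
static void model_halt(void)
{
	halt_count++;
}

/*
 * Same shape as x86_panic_handler() in the patch: halt only when the
 * console lock can be acquired, i.e. nobody else is writing to the console.
 * Returns true if it halted, false if the caller would fall through to the
 * regular post-panic loop.
 */
bool model_panic_handler(void)
{
	if (model_console_trylock()) {
		model_console_unlock();
		model_halt();
		return true;
	}
	return false;
}
```

With the lock free, model_panic_handler() halts; while another context holds
the lock, it returns without halting and the caller continues into the
existing loop. The check is a snapshot: the lock is released before the halt,
so it says nothing about console activity that starts afterwards.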
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 20:39 ` [PATCH v3 0/2] Reduce CPU consumption after panic Andrew Morton
@ 2025-04-29 20:17 ` Carlos Bilbao
2025-04-29 22:53 ` Andrew Morton
2025-04-29 21:06 ` Peter Zijlstra
1 sibling, 1 reply; 16+ messages in thread
From: Carlos Bilbao @ 2025-04-29 20:17 UTC
To: Andrew Morton, carlos.bilbao
Cc: tglx, seanjc, jan.glauber, pmladek, jani.nikula, linux-kernel,
gregkh, takakura, john.ogness, x86
Hey Andrew,
On 4/29/25 15:39, Andrew Morton wrote:
> (cc more x86 people)
>
> On Tue, 29 Apr 2025 10:06:36 -0500 carlos.bilbao@kernel.org wrote:
>
>> From: Carlos Bilbao <carlos.bilbao@kernel.org>
>>
>> Provide a priority-based mechanism to set the behavior of the kernel at
>> the post-panic stage -- the current default is a waste of CPU except for
>> cases with console that generate insightful output.
>>
>> In v1 cover letter [1], I illustrated the potential to reduce unnecessary
>> CPU resources with an experiment with VMs, reducing more than 70% of CPU
>> usage. The main delta of v2 [2] was that, instead of a weak function that
>> archs can overwrite, we provided a flexible priority-based mechanism
>> (following suggestions by Sean Christopherson), panic_set_handling().
>>
>
> An effect of this is that the blinky light will never again occur on
> any x86, I think? I don't know what might be the effects of changing such
> longstanding behavior.
Yep, someone pointed this out before. I don't think it's super relevant?
Also, in the second patch, I added a check to see that there's no console
output left to be flushed.
>
> Also, why was the `priority' feature added? It has no effect in this
> patchset.
>
This was done to allow for flexibility, for example, if panic devices
wish to override the default panic behavior. Other benefits of such
flexibility (as opposed to, for example, a weak function that archs can
override) were outlined by Sean here:
https://lore.kernel.org/lkml/20250326151204.67898-1-carlos.bilbao@kernel.org/T/#m93704ff5cb32ade8b8187764aab56403bbd2b331
Thanks,
Carlos
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 21:06 ` Peter Zijlstra
@ 2025-04-29 20:32 ` Carlos Bilbao
2025-04-29 22:10 ` Peter Zijlstra
0 siblings, 1 reply; 16+ messages in thread
From: Carlos Bilbao @ 2025-04-29 20:32 UTC
To: Peter Zijlstra, Andrew Morton
Cc: carlos.bilbao, tglx, seanjc, jan.glauber, pmladek, jani.nikula,
linux-kernel, gregkh, takakura, john.ogness, x86
Hey Peter,
On 4/29/25 16:06, Peter Zijlstra wrote:
> On Tue, Apr 29, 2025 at 01:39:41PM -0700, Andrew Morton wrote:
>> (cc more x86 people)
>>
>> On Tue, 29 Apr 2025 10:06:36 -0500 carlos.bilbao@kernel.org wrote:
>>
>>> From: Carlos Bilbao <carlos.bilbao@kernel.org>
>>>
>>> Provide a priority-based mechanism to set the behavior of the kernel at
>>> the post-panic stage -- the current default is a waste of CPU except for
>>> cases with console that generate insightful output.
>>>
>>> In v1 cover letter [1], I illustrated the potential to reduce unnecessary
>>> CPU resources with an experiment with VMs, reducing more than 70% of CPU
>>> usage. The main delta of v2 [2] was that, instead of a weak function that
>>> archs can overwrite, we provided a flexible priority-based mechanism
>>> (following suggestions by Sean Christopherson), panic_set_handling().
>>>
>>
>> An effect of this is that the blinky light will never again occur on
>> any x86, I think? I don't know what might be the effects of changing such
>> longstanding behavior.
>>
>> Also, why was the `priority' feature added? It has no effect in this
>> patchset.
>
> It does what now, and why?
>
> Not being copied on anything, the first reaction is, it's panic, your
> machine is dead, who cares about power etc.
Thanks for taking the time to look into my patch set!
Yes, the machine is effectively dead, but as things stand today,
it's still drawing resources unnecessarily.
Who cares? An example, as mentioned in the cover letter, is Linux running
in VMs. Imagine a scenario where customers are billed based on CPU usage --
having panicked VMs spinning in useless loops wastes their money. In shared
envs, those wasted cycles could be used by other processes/VMs. But this
is as much about the cloud as it is for laptops/embedded/anywhere -- Linux
should avoid wasting resources wherever possible.
Thanks,
Carlos
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 15:06 [PATCH v3 0/2] Reduce CPU consumption after panic carlos.bilbao
2025-04-29 15:06 ` [PATCH v3 1/2] panic: Allow for dynamic custom behavior " carlos.bilbao
2025-04-29 15:06 ` [PATCH v3 2/2] x86/panic: Add x86_panic_handler as default post-panic behavior carlos.bilbao
@ 2025-04-29 20:39 ` Andrew Morton
2025-04-29 20:17 ` Carlos Bilbao
2025-04-29 21:06 ` Peter Zijlstra
2 siblings, 2 replies; 16+ messages in thread
From: Andrew Morton @ 2025-04-29 20:39 UTC
To: carlos.bilbao
Cc: tglx, seanjc, jan.glauber, bilbao, pmladek, jani.nikula,
linux-kernel, gregkh, takakura, john.ogness, x86
(cc more x86 people)
On Tue, 29 Apr 2025 10:06:36 -0500 carlos.bilbao@kernel.org wrote:
> From: Carlos Bilbao <carlos.bilbao@kernel.org>
>
> Provide a priority-based mechanism to set the behavior of the kernel at
> the post-panic stage -- the current default is a waste of CPU except for
> cases with console that generate insightful output.
>
> In v1 cover letter [1], I illustrated the potential to reduce unnecessary
> CPU resources with an experiment with VMs, reducing more than 70% of CPU
> usage. The main delta of v2 [2] was that, instead of a weak function that
> archs can overwrite, we provided a flexible priority-based mechanism
> (following suggestions by Sean Christopherson), panic_set_handling().
>
An effect of this is that the blinky light will never again occur on
any x86, I think? I don't know what might be the effects of changing such
longstanding behavior.
Also, why was the `priority' feature added? It has no effect in this
patchset.
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 22:10 ` Peter Zijlstra
@ 2025-04-29 20:52 ` Carlos Bilbao
2025-04-30 8:48 ` Peter Zijlstra
0 siblings, 1 reply; 16+ messages in thread
From: Carlos Bilbao @ 2025-04-29 20:52 UTC
To: Peter Zijlstra
Cc: Andrew Morton, carlos.bilbao, tglx, seanjc, jan.glauber, pmladek,
jani.nikula, linux-kernel, gregkh, takakura, john.ogness, x86
Hello,
On 4/29/25 17:10, Peter Zijlstra wrote:
> On Tue, Apr 29, 2025 at 03:32:56PM -0500, Carlos Bilbao wrote:
>
>> Yes, the machine is effectively dead, but as things stand today,
>> it's still drawing resources unnecessarily.
>>
>> Who cares? An example, as mentioned in the cover letter, is Linux running
>
> Ah, see, I didn't have no cover letter, only akpm's reply.
>
>> in VMs. Imagine a scenario where customers are billed based on CPU usage --
>> having panicked VMs spinning in useless loops wastes their money. In shared
>> envs, those wasted cycles could be used by other processes/VMs. But this
>> is as much about the cloud as it is for laptops/embedded/anywhere -- Linux
>> should avoid wasting resources wherever possible.
>
> So I don't really buy the laptop and embedded case, people tend to look
> at laptops when open, and get very impatient when they don't respond.
> Embedded things really should have a watchdog.
>
> Also, should you not be using panic_timeout to auto reboot your machine
> in all these cases?
>
> In any case, the VM nonsense, do they not have a virtual watchdog to
> 'reap' crashed VMs or something?
The key word here is "should." Should embedded systems have a watchdog?
Maybe. Should I have auto reboot set? Maybe. Perhaps I don’t want to reboot
until I’ve root-caused the crash. But my patch set isn’t about “shoulds.”
What I’m discussing here is (1) the default Linux behavior, and (2)
providing people with the flexibility to do what THEY think they should do,
not what you think they should do.
Thanks,
Carlos
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 20:39 ` [PATCH v3 0/2] Reduce CPU consumption after panic Andrew Morton
2025-04-29 20:17 ` Carlos Bilbao
@ 2025-04-29 21:06 ` Peter Zijlstra
2025-04-29 20:32 ` Carlos Bilbao
1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2025-04-29 21:06 UTC
To: Andrew Morton
Cc: carlos.bilbao, tglx, seanjc, jan.glauber, bilbao, pmladek,
jani.nikula, linux-kernel, gregkh, takakura, john.ogness, x86
On Tue, Apr 29, 2025 at 01:39:41PM -0700, Andrew Morton wrote:
> (cc more x86 people)
>
> On Tue, 29 Apr 2025 10:06:36 -0500 carlos.bilbao@kernel.org wrote:
>
> > From: Carlos Bilbao <carlos.bilbao@kernel.org>
> >
> > Provide a priority-based mechanism to set the behavior of the kernel at
> > the post-panic stage -- the current default is a waste of CPU except for
> > cases with console that generate insightful output.
> >
> > In v1 cover letter [1], I illustrated the potential to reduce unnecessary
> > CPU resources with an experiment with VMs, reducing more than 70% of CPU
> > usage. The main delta of v2 [2] was that, instead of a weak function that
> > archs can overwrite, we provided a flexible priority-based mechanism
> > (following suggestions by Sean Christopherson), panic_set_handling().
> >
>
> An effect of this is that the blinky light will never again occur on
> any x86, I think? I don't know what might be the effects of changing such
> longstanding behavior.
>
> Also, why was the `priority' feature added? It has no effect in this
> patchset.
It does what now, and why?
Not being copied on anything, the first reaction is, it's panic, your
machine is dead, who cares about power etc.
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 22:53 ` Andrew Morton
@ 2025-04-29 21:39 ` Carlos Bilbao
0 siblings, 0 replies; 16+ messages in thread
From: Carlos Bilbao @ 2025-04-29 21:39 UTC
To: Andrew Morton
Cc: carlos.bilbao, tglx, seanjc, jan.glauber, pmladek, jani.nikula,
linux-kernel, gregkh, takakura, john.ogness, x86
Hey Andrew,
On 4/29/25 17:53, Andrew Morton wrote:
> On Tue, 29 Apr 2025 15:17:33 -0500 Carlos Bilbao <bilbao@vt.edu> wrote:
>
>> Hey Andrew,
>>
>> On 4/29/25 15:39, Andrew Morton wrote:
>>> (cc more x86 people)
>>>
>>> On Tue, 29 Apr 2025 10:06:36 -0500 carlos.bilbao@kernel.org wrote:
>>>
>>>> From: Carlos Bilbao <carlos.bilbao@kernel.org>
>>>>
>>>> Provide a priority-based mechanism to set the behavior of the kernel at
>>>> the post-panic stage -- the current default is a waste of CPU except for
>>>> cases with console that generate insightful output.
>>>>
>>>> In v1 cover letter [1], I illustrated the potential to reduce unnecessary
>>>> CPU resources with an experiment with VMs, reducing more than 70% of CPU
>>>> usage. The main delta of v2 [2] was that, instead of a weak function that
>>>> archs can overwrite, we provided a flexible priority-based mechanism
>>>> (following suggestions by Sean Christopherson), panic_set_handling().
>>>>
>>>
>>> An effect of this is that the blinky light will never again occur on
>>> any x86, I think? I don't know what might be the effects of changing such
>>> longstanding behavior.
>>
>> Yep, someone pointed this out before. I don't think it's super relevant?
>
> Why not? It's an alteration in very longstanding behavior - nobody
> knows who will be affected by this and how they will be affected.
It’s difficult for me to imagine how someone might be negatively impacted,
but I understand that it could happen.
>
>> Also, in the second patch, I added a check to see that there's no console
>> output left to be flushed.
>
> It's unclear how this affects such considerations. Please fully
> changelog all these things.
>
>>
>>>
>>> Also, why was the `priority' feature added? It has no effect in this
>>> patchset.
>>>
>>
>> This was done to allow for flexibility, for example, if panic devices
>> wish to override the default panic behavior.
>
> There are no such callers. We can add this feature later, if a need is
> demonstrated.
I think you'd then prefer what I originally proposed:
https://lore.kernel.org/lkml/20250326151204.67898-1-carlos.bilbao@kernel.org/T/
IMHO it's true that this feature might not be necessary ATM, but as Sean
pointed out, it could be useful in the future. I don't have strong
preferences either way. Would you be happier with the current v3 approach
if we add comments to the code explaining the purpose of the priority
feature?
>
>> Other benefits of such
>> flexibility (as opposed to, for example, a weak function that archs can
>> override) were outlined by Sean here:
>>
>> https://lore.kernel.org/lkml/20250326151204.67898-1-carlos.bilbao@kernel.org/T/#m93704ff5cb32ade8b8187764aab56403bbd2b331
>
> Again, please fully describe these matters in changelogging and code
> comments.
Thanks,
Carlos
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 20:32 ` Carlos Bilbao
@ 2025-04-29 22:10 ` Peter Zijlstra
2025-04-29 20:52 ` Carlos Bilbao
0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2025-04-29 22:10 UTC
To: Carlos Bilbao
Cc: Andrew Morton, carlos.bilbao, tglx, seanjc, jan.glauber, pmladek,
jani.nikula, linux-kernel, gregkh, takakura, john.ogness, x86
On Tue, Apr 29, 2025 at 03:32:56PM -0500, Carlos Bilbao wrote:
> Yes, the machine is effectively dead, but as things stand today,
> it's still drawing resources unnecessarily.
>
> Who cares? An example, as mentioned in the cover letter, is Linux running
Ah, see, I didn't have no cover letter, only akpm's reply.
> in VMs. Imagine a scenario where customers are billed based on CPU usage --
> having panicked VMs spinning in useless loops wastes their money. In shared
> envs, those wasted cycles could be used by other processes/VMs. But this
> is as much about the cloud as it is for laptops/embedded/anywhere -- Linux
> should avoid wasting resources wherever possible.
So I don't really buy the laptop and embedded case, people tend to look
at laptops when open, and get very impatient when they don't respond.
Embedded things really should have a watchdog.
Also, should you not be using panic_timeout to auto reboot your machine
in all these cases?
In any case, the VM nonsense, do they not have a virtual watchdog to
'reap' crashed VMs or something?
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 20:17 ` Carlos Bilbao
@ 2025-04-29 22:53 ` Andrew Morton
2025-04-29 21:39 ` Carlos Bilbao
0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2025-04-29 22:53 UTC
To: Carlos Bilbao
Cc: carlos.bilbao, tglx, seanjc, jan.glauber, pmladek, jani.nikula,
linux-kernel, gregkh, takakura, john.ogness, x86
On Tue, 29 Apr 2025 15:17:33 -0500 Carlos Bilbao <bilbao@vt.edu> wrote:
> Hey Andrew,
>
> On 4/29/25 15:39, Andrew Morton wrote:
> > (cc more x86 people)
> >
> > On Tue, 29 Apr 2025 10:06:36 -0500 carlos.bilbao@kernel.org wrote:
> >
> >> From: Carlos Bilbao <carlos.bilbao@kernel.org>
> >>
> >> Provide a priority-based mechanism to set the behavior of the kernel at
> >> the post-panic stage -- the current default is a waste of CPU except for
> >> cases with console that generate insightful output.
> >>
> >> In v1 cover letter [1], I illustrated the potential to reduce unnecessary
> >> CPU resources with an experiment with VMs, reducing more than 70% of CPU
> >> usage. The main delta of v2 [2] was that, instead of a weak function that
> >> archs can overwrite, we provided a flexible priority-based mechanism
> >> (following suggestions by Sean Christopherson), panic_set_handling().
> >>
> >
> > An effect of this is that the blinky light will never again occur on
> > any x86, I think? I don't know what might be the effects of changing such
> > longstanding behavior.
>
> Yep, someone pointed this out before. I don't think it's super relevant?
Why not? It's an alteration in very longstanding behavior - nobody
knows who will be affected by this and how they will be affected.
> Also, in the second patch, I added a check to see that there's no console
> output left to be flushed.
It's unclear how this affects such considerations. Please fully
changelog all these things.
>
> >
> > Also, why was the `priority' feature added? It has no effect in this
> > patchset.
> >
>
> This was done to allow for flexibility, for example, if panic devices
> wish to override the default panic behavior.
There are no such callers. We can add this feature later, if a need is
demonstrated.
> Other benefits of such
> flexibility (as opposed to, for example, a weak function that archs can
> override) were outlined by Sean here:
>
> https://lore.kernel.org/lkml/20250326151204.67898-1-carlos.bilbao@kernel.org/T/#m93704ff5cb32ade8b8187764aab56403bbd2b331
Again, please fully describe these matters in changelogging and code
comments.
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-29 20:52 ` Carlos Bilbao
@ 2025-04-30 8:48 ` Peter Zijlstra
2025-04-30 15:59 ` Sean Christopherson
2025-04-30 18:54 ` Carlos Bilbao
0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2025-04-30 8:48 UTC
To: Carlos Bilbao
Cc: Andrew Morton, carlos.bilbao, tglx, seanjc, jan.glauber, pmladek,
jani.nikula, linux-kernel, gregkh, takakura, john.ogness, x86
On Tue, Apr 29, 2025 at 03:52:05PM -0500, Carlos Bilbao wrote:
> Hello,
>
> On 4/29/25 17:10, Peter Zijlstra wrote:
> > On Tue, Apr 29, 2025 at 03:32:56PM -0500, Carlos Bilbao wrote:
> >
> >> Yes, the machine is effectively dead, but as things stand today,
> >> it's still drawing resources unnecessarily.
> >>
> >> Who cares? An example, as mentioned in the cover letter, is Linux running
> >
> > Ah, see, I didn't have no cover letter, only akpm's reply.
> >
> >> in VMs. Imagine a scenario where customers are billed based on CPU usage --
> >> having panicked VMs spinning in useless loops wastes their money. In shared
> >> envs, those wasted cycles could be used by other processes/VMs. But this
> >> is as much about the cloud as it is for laptops/embedded/anywhere -- Linux
> >> should avoid wasting resources wherever possible.
> >
> > So I don't really buy the laptop and embedded case, people tend to look
> > at laptops when open, and get very impatient when they don't respond.
> > Embedded things really should have a watchdog.
> >
> > Also, should you not be using panic_timeout to auto reboot your machine
> > in all these cases?
> >
> > In any case, the VM nonsense, do they not have a virtual watchdog to
> > 'reap' crashed VMs or something?
>
> The key word here is "should." Should embedded systems have a watchdog?
> Maybe. Should I have auto reboot set? Maybe. Perhaps I don’t want to reboot
> until I’ve root-caused the crash.
Install a kdump kernel, or log your serial line :-)
> But my patch set isn’t about “shoulds.”
> What I’m discussing here is (1) the default Linux behavior,
Well, the default behaviour works for the 'your own physical machine'
thing just fine -- and that has always been the default use-case.
Nobody is going to be sitting there staring at a panic screen for ages.
All the other weirdo cases like embedded and VMs, they're just that,
weirdos and they can keep their pieces :-)
> and (2)
> providing people with the flexibility to do what THEY think they should do,
> not what you think they should do.
Well, there are a ton of options already. Like said, we have watchdogs,
reboots, crash kernels and all sorts. Why do we need more?
All that said... the default more or less does for(;;) { mdelay(100) },
if you have a modern chip that should not end up using much power at
all. That should end up in delay_halt_tpause() or delay_halt_mwaitx()
(depending on you being on Intel or AMD). And spend most of its time in
deep idle states.
Is something not working?
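
For readers without kernel/panic.c at hand, the default loop described above
can be approximated in userspace. This is only a sketch under stated
assumptions: the loop shape follows the context lines of patch 1/2
(for (i = 0; ; i += PANIC_TIMER_STEP)), the constants 100 and 18 are assumed
values for PANIC_TIMER_STEP and PANIC_BLINK_SPD, the delay is replaced by a
counter, the loop is bounded so that it terminates, and all model_* names are
hypothetical.

```c
/* Assumed values; in the kernel these are macros in kernel/panic.c. */
#define PANIC_TIMER_STEP 100	/* simulated milliseconds per iteration */
#define PANIC_BLINK_SPD  18

static int blink_calls;
static long simulated_ms;

/* Stand-in for the blink callback; returns the extra delay it used (ms). */
static long model_blink(int state)
{
	(void)state;
	blink_calls++;
	return 0;
}

/* Stand-in for mdelay(): advance a counter instead of actually waiting. */
static void model_mdelay(long ms)
{
	simulated_ms += ms;
}

/*
 * Bounded model of the post-panic loop: delay in fixed steps and toggle
 * the blink state at a fixed interval. Returns the number of blink calls
 * made within total_ms of simulated time.
 */
int model_panic_loop(long total_ms)
{
	long i, i_next = 0;
	int state = 0;

	blink_calls = 0;
	simulated_ms = 0;
	for (i = 0; i < total_ms; i += PANIC_TIMER_STEP) {
		if (i >= i_next) {
			i += model_blink(state ^= 1);
			i_next = i + 3600 / PANIC_BLINK_SPD;
		}
		model_mdelay(PANIC_TIMER_STEP);
	}
	return blink_calls;
}
```

The model makes the two concerns in this thread visible side by side: the
loop never blocks (it only delays in steps), and it is also the only place
where the blink callback gets driven, so a handler that halts before the loop
skips the blinking as well.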
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-30 8:48 ` Peter Zijlstra
@ 2025-04-30 15:59 ` Sean Christopherson
2025-04-30 18:54 ` Carlos Bilbao
1 sibling, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2025-04-30 15:59 UTC
To: Peter Zijlstra
Cc: Carlos Bilbao, Andrew Morton, carlos.bilbao, tglx, jan.glauber,
pmladek, jani.nikula, linux-kernel, gregkh, takakura, john.ogness,
x86
On Wed, Apr 30, 2025, Peter Zijlstra wrote:
> All that said... the default more or less does for(;;) { mdelay(100) },
> if you have a modern chip that should not end up using much power at
> all. That should end up in delay_halt_tpause() or delay_halt_mwaitx()
> (depending on you being on Intel or AMD). And spend most of its time in
> deep idle states.
>
> Is something not working?
The motivation is to coerce vCPUs into yielding the physical CPU so that a
different vCPU can be scheduled in when the host is oversubscribed. IMO, that's
firmly a "host" problem to solve, where the solution might involve educating
customers for their own benefit[*].
I am indifferent as to whether or not the kernel halts during panic(); my
suggestions/feedback in earlier versions were purely to not make any behavior
specific to VMs. I.e. I am strongly opposed to implementing behavior that kicks
in only when running as a guest.
[*] from https://lore.kernel.org/all/Z_lDzyXJ8JKqOyzs@google.com:
: On Fri, Apr 11, 2025 at 9:31 AM Sean Christopherson <seanjc@google.com> wrote:
: > > On Wed 2025-03-26 10:12:03, carlos.bilbao@kernel.org wrote:
: > > > After handling a panic, the kernel enters a busy-wait loop, unnecessarily
: > > > consuming CPU and potentially impacting other workloads including other
: > > > guest VMs in the case of virtualized setups.
: >
: > Impacting other guests isn't the guest kernel's problem. If the host has heavily
: > overcommitted CPUs and can't meet SLOs because VMs are panicking and not rebooting,
: > that's a host problem.
: >
: > This could become a customer problem if they're getting billed based on CPU usage,
: > but I don't know that simply doing HLT is the best solution. E.g. advising the
: > customer to configure their kernels to kexec into a kdump kernel or to reboot
: > on panic, seems like it would provide a better overall experience for most.
: >
: > QEMU (assuming y'all use QEMU) also supports a pvpanic device, so unless the VM
: > and/or customer is using a funky setup, the host should already know the guest
: > has panicked. At that point, the host can make appropriate scheduling decisions,
: > e.g. userspace can simply stop running the VM after a certain timeout, throttle
: > it, jail it, etc.
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-30 8:48 ` Peter Zijlstra
2025-04-30 15:59 ` Sean Christopherson
@ 2025-04-30 18:54 ` Carlos Bilbao
2025-05-01 8:55 ` Peter Zijlstra
1 sibling, 1 reply; 16+ messages in thread
From: Carlos Bilbao @ 2025-04-30 18:54 UTC
To: Peter Zijlstra, Andrew Morton, seanjc
Cc: carlos.bilbao, tglx, jan.glauber, pmladek, jani.nikula,
linux-kernel, gregkh, takakura, john.ogness, x86
Hello,
On 4/30/25 03:48, Peter Zijlstra wrote:
> On Tue, Apr 29, 2025 at 03:52:05PM -0500, Carlos Bilbao wrote:
>> Hello,
>>
>> On 4/29/25 17:10, Peter Zijlstra wrote:
>>> On Tue, Apr 29, 2025 at 03:32:56PM -0500, Carlos Bilbao wrote:
>>>
>>>> Yes, the machine is effectively dead, but as things stand today,
>>>> it's still drawing resources unnecessarily.
>>>>
>>>> Who cares? An example, as mentioned in the cover letter, is Linux running
>>>
>>> Ah, see, I didn't have no cover letter, only akpm's reply.
>>>
>>>> in VMs. Imagine a scenario where customers are billed based on CPU usage --
>>>> having panicked VMs spinning in useless loops wastes their money. In shared
>>>> envs, those wasted cycles could be used by other processes/VMs. But this
>>>> is as much about the cloud as it is for laptops/embedded/anywhere -- Linux
>>>> should avoid wasting resources wherever possible.
>>>
>>> So I don't really buy the laptop and embedded case, people tend to look
>>> at laptops when open, and get very impatient when they don't respond.
>>> Embedded things really should have a watchdog.
>>>
>>> Also, should you not be using panic_timeout to auto reboot your machine
>>> in all these cases?
>>>
>>> In any case, the VM nonsense, do they not have a virtual watchdog to
>>> 'reap' crashed VMs or something?
>>
>> The key word here is "should." Should embedded systems have a watchdog?
>> Maybe. Should I have auto reboot set? Maybe. Perhaps I don’t want to reboot
>> until I’ve root-caused the crash.
>
> Install a kdump kernel, or log your serial line :-)
>
>> But my patch set isn’t about “shoulds.”
>> What I’m discussing here is (1) the default Linux behavior,
>
> Well, the default behaviour works for the 'your own physical machine'
> thing just fine -- and that has always been the default use-case.
>
> Nobody is going to be sitting there staring at a panic screen for ages.
>
> All the other weirdo cases like embedded and VMs, they're just that,
> weirdos and they can keep their pieces :-)
>
>> and (2)
>> providing people with the flexibility to do what THEY think they should do,
>> not what you think they should do.
>
> Well, there are a ton of options already. Like said, we have watchdogs,
> reboots, crash kernels and all sorts. Why do we need more?
>
> All that said... the default more or less does for(;;) { mdelay(100) },
> if you have a modern chip that should not end up using much power at
> all. That should end up in delay_halt_tpause() or delay_halt_mwaitx()
> (depending on you being on Intel or AMD). And spend most of its time in
> deep idle states.
>
> Is something not working?
Well, in my experiments, that’s not what happened -- halting the CPU in VMs
reduced CPU usage by around 70%.
How would folks feel about adding something like
/proc/sys/kernel/halt_after_panic, disabled by default? It would help in
the Linux use cases I care about (e.g., virtualized environments), without
affecting others.
Thanks,
Carlos
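
A minimal sketch of how the opt-in knob proposed above could gate the post-panic behavior. This is a user-space model for illustration only, not kernel code and not part of the posted patches: the `panic_policy` struct, the `post_panic_action` enum, and both function names are hypothetical. The only details taken from the thread are the proposed name `halt_after_panic`, its disabled-by-default state, and the description of the current default as roughly `for(;;) { mdelay(100) }`.

```c
#include <stdbool.h>

/* Hypothetical model of the two post-panic behaviors discussed in the thread. */
enum post_panic_action {
    POST_PANIC_DELAY_LOOP, /* current default: roughly for (;;) { mdelay(100); } */
    POST_PANIC_HALT,       /* opt-in: halt the CPU (HLT causes a VMEXIT in a VM) */
};

/* Hypothetical container for the proposed halt_after_panic setting. */
struct panic_policy {
    bool halt_after_panic;
};

/* The proposal is "disabled by default", so the default policy keeps the delay loop. */
struct panic_policy panic_policy_default(void)
{
    struct panic_policy p = { .halt_after_panic = false };
    return p;
}

/* Select the post-panic behavior from the policy; nothing changes unless opted in. */
enum post_panic_action post_panic_action(const struct panic_policy *p)
{
    return p->halt_after_panic ? POST_PANIC_HALT : POST_PANIC_DELAY_LOOP;
}
```

With the default policy the delay loop is kept, so existing bare-metal setups would see no change; only a guest that sets the knob would take the halt path.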
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-04-30 18:54 ` Carlos Bilbao
@ 2025-05-01 8:55 ` Peter Zijlstra
2025-05-07 19:49 ` Carlos Bilbao
0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2025-05-01 8:55 UTC (permalink / raw)
To: Carlos Bilbao
Cc: Andrew Morton, seanjc, carlos.bilbao, tglx, jan.glauber, pmladek,
jani.nikula, linux-kernel, gregkh, takakura, john.ogness, x86
On Wed, Apr 30, 2025 at 01:54:11PM -0500, Carlos Bilbao wrote:
> > All that said... the default more or less does for(;;) { mdelay(100) },
> > if you have a modern chip that should not end up using much power at
> > all. That should end up in delay_halt_tpause() or delay_halt_mwaitx()
> > (depending on you being on Intel or AMD). And spend most of its time in
> > deep idle states.
> >
> > Is something not working?
>
> Well, in my experiments, that’s not what happened -- halting the CPU in VMs
> reduced CPU usage by around 70%.
Because you're doing VMs, and VMs create problems where there weren't
any before. IOW you get to keep the pieces.
Specifically, VMs do VMEXIT on HLT and this is what's working for you.
On real hardware though, HLT gets you C1, while both TPAUSE and MWAITX
can probably get you deeper C states. As such, HLT is probably a
regression on power.
> How would folks feel about adding something like
> /proc/sys/kernel/halt_after_panic, disabled by default? It would help in
> the Linux use cases I care about (e.g., virtualized environments), without
> affecting others.
What's wrong with any of the existing options? Fact remains you need to
configure your VMs properly.
* Re: [PATCH v3 0/2] Reduce CPU consumption after panic
2025-05-01 8:55 ` Peter Zijlstra
@ 2025-05-07 19:49 ` Carlos Bilbao
0 siblings, 0 replies; 16+ messages in thread
From: Carlos Bilbao @ 2025-05-07 19:49 UTC (permalink / raw)
To: Peter Zijlstra, Carlos Bilbao
Cc: Andrew Morton, seanjc, carlos.bilbao, tglx, jan.glauber, pmladek,
jani.nikula, linux-kernel, gregkh, takakura, john.ogness, x86
Hello Peter,
On 5/1/25 03:55, Peter Zijlstra wrote:
> On Wed, Apr 30, 2025 at 01:54:11PM -0500, Carlos Bilbao wrote:
>
>>> All that said... the default more or less does for(;;) { mdelay(100) },
>>> if you have a modern chip that should not end up using much power at
>>> all. That should end up in delay_halt_tpause() or delay_halt_mwaitx()
>>> (depending on you being on Intel or AMD). And spend most of its time in
>>> deep idle states.
>>>
>>> Is something not working?
>> Well, in my experiments, that’s not what happened -- halting the CPU in VMs
>> reduced CPU usage by around 70%.
> Because you're doing VMs, and VMs create problems where there weren't
> any before. IOW you get to keep the pieces.
>
> Specifically, VMs do VMEXIT on HLT and this is what's working for you.
>
> On real hardware though, HLT gets you C1, while both TPAUSE and MWAITX
> can probably get you deeper C states. As such, HLT is probably a
> regression on power.
That's a good point -- wouldn't TPAUSE achieve what I was trying to
accomplish with HLT? Assuming it's supported and wouldn't just raise #UD.
>
>> How would folks feel about adding something like
>> /proc/sys/kernel/halt_after_panic, disabled by default? It would help in
>> the Linux use cases I care about (e.g., virtualized environments), without
>> affecting others.
> What's wrong with any of the existing options? Fact remains you need to
> configure your VMs properly.
See, that's the problem -- they're not _my_ VMs. They're the VMs of cloud users,
who are ultimately responsible for configuring their kernels however they
want. We can try to educate them, as some maintainers have suggested to me,
but many people either don't know what the kernel is or don't care -- they
just trust that Linux will have sensible defaults. I get your point that
VM-specific problems shouldn't burden the broader kernel ecosystem, but I’d
still like to think about whether there's something we can do to improve the
situation for VMs post-panic without negatively impacting other use cases.
Thanks,
Carlos