From: Carlos Bilbao <carlos.bilbao.osdev@gmail.com>
To: pmladek@suse.com, Andrew Morton <akpm@linux-foundation.org>,
	jani.nikula@intel.com, open list <linux-kernel@vger.kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	takakura@valinux.co.jp, john.ogness@linutronix.de
Cc: jglauber@digitalocean.com
Subject: [RFC] panic: reduce CPU consumption when finished handling panic
Date: Fri, 21 Mar 2025 08:01:52 -0500
Message-ID: <7b3a0288-20f9-42cf-af81-e10ad2d04b27@gmail.com>
In-Reply-To: <f2272f04-510e-4c92-be5e-fedcbb445eb0@gmail.com>

Hello again,


I thought it would be helpful to share some numbers to support my claim,
along with a couple of ideas to improve the patch. Below are the perf stats
collected on the hypervisor after triggering a panic on a guest running
kernel v5.15 (details of the experiment are at the end of this message).


Samples: 55K of event 'cycles:P', Event count (approx.): 36090772574
Overhead  Command          Shared Object            Symbol
  42.20%  CPU 5/KVM        [kernel.kallsyms]        [k] vmx_vmexit
  19.07%  CPU 5/KVM        [kernel.kallsyms]        [k] vmx_spec_ctrl_restore_host
   9.73%  CPU 5/KVM        [kernel.kallsyms]        [k] vmx_vcpu_enter_exit
   3.60%  CPU 5/KVM        [kernel.kallsyms]        [k] __flush_smp_call_function_queue
   2.91%  CPU 5/KVM        [kernel.kallsyms]        [k] vmx_vcpu_run
   2.85%  CPU 5/KVM        [kernel.kallsyms]        [k] native_irq_return_iret
   2.67%  CPU 5/KVM        [kernel.kallsyms]        [k] native_flush_tlb_one_user
   2.16%  CPU 5/KVM        [kernel.kallsyms]        [k] llist_reverse_order
   2.10%  CPU 5/KVM        [kernel.kallsyms]        [k] __srcu_read_lock
   2.08%  CPU 5/KVM        [kernel.kallsyms]        [k] flush_tlb_func
   1.52%  CPU 5/KVM        [kernel.kallsyms]        [k] vcpu_enter_guest.constprop.0
   1.50%  CPU 5/KVM        [kernel.kallsyms]        [k] native_apic_msr_eoi
   1.01%  CPU 5/KVM        [kernel.kallsyms]        [k] clear_bhb_loop
   0.66%  CPU 5/KVM        [kernel.kallsyms]        [k] sysvec_call_function_single


And here are the results for the guest VM after applying my patch:


Samples: 28  of event 'cycles:P', Event count (approx.): 28961952
Overhead  Command          Shared Object            Symbol
  11.03%  qemu-system-x86  [kernel.kallsyms]        [k] task_mm_cid_work
  11.03%  qemu-system-x86  qemu-system-x86_64       [.] 0x0000000000579944
   9.80%  qemu-system-x86  qemu-system-x86_64       [.] 0x000000000056512b
   8.45%  IO mon_iothread  libc.so.6                [.] 0x00000000000a3f12
   8.45%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_mutex_lock
   7.51%  IO mon_iothread  [kernel.kallsyms]        [k] avg_vruntime
   6.65%  IO mon_iothread  libc.so.6                [.] write
   5.93%  IO mon_iothread  [kernel.kallsyms]        [k] security_file_permission
   4.97%  qemu-system-x86  libglib-2.0.so.0.7200.4  [.] g_thread_self
   4.64%  IO mon_iothread  [kernel.kallsyms]        [k] aa_label_sk_perm.part.0
   4.13%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_main_context_release
   3.79%  IO mon_iothread  [kernel.kallsyms]        [k] seccomp_run_filters
   3.42%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_main_context_dispatch
   3.42%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000004edbab
   3.28%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000005999c8
   3.09%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000004e636b
   0.22%  qemu-system-x86  [kernel.kallsyms]        [k] __intel_pmu_enable_all.constprop.0


As you can see, CPU consumption during panic drops significantly with the
proposed change: KVM-related functions (e.g., vmx_vmexit) go from more than
70% of the sampled cycles to virtually nothing. The number of samples also
decreased from 55K to 28, and the event count dropped from 36.09 billion to
28.96 million.


Jan suggested that a better way to implement cpu_halt_end_panic() (perhaps
cpu_halt_after_panic() is a better name) would be to define it as a weak
function in asm-generic, allowing architectures to override it. What do you
think?

Thank you in advance!

Regards,
Carlos

---

Details on the experiment:

- Linux kernel v5.15 (commit 8bb7eca)

- VM guest CPU: Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz

- Command used to collect the samples:
  /usr/bin/perf record -p 2618527 -a sleep 30

- Image: Ubuntu 22.04 (LTS) x64, 8 vCPUs, 16GB / 100GB Disk


Thanks,

Carlos


On 3/17/25 17:01, Carlos Bilbao wrote:
> After the kernel has finished handling a panic, it enters a busy-wait loop.
> But, this unnecessarily consumes CPU power and electricity. Plus, in VMs,
> this negatively impacts the throughput of other VM guests running on the
> same hypervisor.
>
> I propose introducing a function cpu_halt_end_panic() to halt the CPU
> during this state while still allowing interrupts to be processed. See my
> commit below.
>
> Thanks in advance!
>
> Signed-off-by: Carlos Bilbao <carlos.bilbao@kernel.org>
> ---
>  kernel/panic.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index fbc59b3b64d0..c00ccaa698d5 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -276,6 +276,21 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>          crash_smp_send_stop();
>  }
>  
> +static void cpu_halt_end_panic(void)
> +{
> +#ifdef CONFIG_X86
> +    native_safe_halt();
> +#elif defined(CONFIG_ARM)
> +    cpu_do_idle();
> +#else
> +    /*
> +     * Default to a simple busy-wait if no architecture-specific halt is
> +     * defined above
> +     */
> +    mdelay(PANIC_TIMER_STEP);
> +#endif
> +}
> +
>  /**
>   *    panic - halt the system
>   *    @fmt: The text string to print
> @@ -474,7 +489,7 @@ void panic(const char *fmt, ...)
>              i += panic_blink(state ^= 1);
>              i_next = i + 3600 / PANIC_BLINK_SPD;
>          }
> -        mdelay(PANIC_TIMER_STEP);
> +        cpu_halt_end_panic();
>      }
>  }
>  
