From: Bitao Hu <yaoma@linux.alibaba.com>
To: tglx@linutronix.de, dianders@chromium.org, pmladek@suse.com,
akpm@linux-foundation.org, kernelfans@gmail.com,
liusong@linux.alibaba.com, deller@gmx.de, npiggin@gmail.com,
jan.kiszka@siemens.com, kbingham@kernel.org
Cc: linux-mips@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
kvm@vger.kernel.org, Bitao Hu <yaoma@linux.alibaba.com>
Subject: [PATCHv8 0/2] *** Detect interrupt storm in softlockup ***
Date: Tue, 20 Feb 2024 00:19:18 +0800
Message-ID: <20240219161920.15752-1-yaoma@linux.alibaba.com>

Hi, guys.

I have implemented a low-overhead method for detecting interrupt
storms in softlockup. Please review it; all comments are welcome.

Changes from v7 to v8:
- From Thomas Gleixner, implement statistics within the interrupt
  core code and provide sensible interfaces for the watchdog code.
- Patch #1 remains unchanged. Patch #2 has significant changes
  based on Thomas's suggestions, which is why I have removed
  Liu Song's and Douglas's Reviewed-by from patch #2. Please review
  it again; all comments are welcome.

Changes from v6 to v7:
- Remove "READ_ONCE" in "start_counting_irqs".
- Replace the hard-coded 5 with the "NUM_SAMPLE_PERIODS" macro in
  "set_sample_period".
- Add empty lines to help with reading the code.
- Remove the branch that processes IRQs where "counts_diff = 0".
- Add the Reviewed-by of Liu Song and Douglas.

Changes from v5 to v6:
- Use "./scripts/checkpatch.pl --strict" to get a few extra
  style nits and fix them.
- Squash patch #3 into patch #1, and wrap the help text to
  80 columns.
- Sort existing headers alphabetically in watchdog.c.
- Drop "softlockup_hardirq_cpus"; just read "hardirq_counts"
  and see if it's non-NULL.
- Store "nr_irqs" in a local variable.
- Simplify the calculation of "cpu_diff".

Changes from v4 to v5:
- Rearrange variable placement to make the code look neater.

Changes from v3 to v4:
- Rename some variables and functions to make the code logic
  more readable.
- Change the code location to avoid predeclaring.
- Just swap rather than use a double loop in tabulate_irq_count.
- Since nr_irqs has the potential to grow at runtime, bounds-check
  logic has been implemented.
- Add the SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob.

Changes from v2 to v3:
- From Liu Song, use an enum instead of macros for cpu_stats, shorten
  the name 'idx_to_stat' to 'stats', add 'get_16bit_precesion' instead
  of using right shift operations, and use 'struct irq_counts'.
- From the kernel test robot, use '__this_cpu_read' and
  '__this_cpu_write' instead of accessing a per-cpu array directly,
  in order to avoid this warning:
  'sparse: incorrect type in initializer (different modifiers)'

Changes from v1 to v2:
- From Douglas, optimize the memory of cpustats. With the maximum number
  of CPUs, the footprint is now:
  2 * 8192 * 4 + 1 * 8192 * 5 * 4 + 1 * 8192 = 237,568 bytes.
- From Liu Song, refactor the code format and add necessary comments.
- From Douglas, use interrupt counts instead of interrupt time to
  determine the cause of softlockup.
- Remove the cmdline parameter added in PATCHv1.
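
The cpustats footprint quoted in the v1 to v2 notes can be checked with a
small sketch. The reading of each factor is an assumption, not something
the changelog spells out: a leading element size in bytes, 8192 as the
maximum CPU count, 4 as the number of tracked stats, and 5 as the number
of sample periods. The names MAX_CPUS, NUM_STATS, NUM_SAMPLES and
cpustats_bytes() are hypothetical and do not come from the patch.

```c
/* Hypothetical names; the factors mirror the formula in the changelog. */
enum {
	MAX_CPUS    = 8192,	/* assumed: maximum number of CPUs */
	NUM_STATS   = 4,	/* assumed: stats tracked per CPU */
	NUM_SAMPLES = 5,	/* assumed: sample periods kept per stat */
};

/* Sum the three terms of the formula, in bytes. */
static unsigned long cpustats_bytes(void)
{
	/* 2 * 8192 * 4: assumed 2-byte entries, one per stat per CPU */
	unsigned long prev = 2UL * MAX_CPUS * NUM_STATS;
	/* 1 * 8192 * 5 * 4: assumed 1-byte entries per sample per stat per CPU */
	unsigned long util = 1UL * MAX_CPUS * NUM_SAMPLES * NUM_STATS;
	/* 1 * 8192: assumed 1-byte index per CPU */
	unsigned long tail = 1UL * MAX_CPUS;

	return prev + util + tail;
}
```

Under these assumptions the three terms are 65,536 + 163,840 + 8,192
bytes, which matches the 237,568 bytes stated in the changelog.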

Bitao Hu (2):
  watchdog/softlockup: low-overhead detection of interrupt
  watchdog/softlockup: report the most frequent interrupts

 arch/mips/dec/setup.c                |   2 +-
 arch/parisc/kernel/smp.c             |   2 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c |   2 +-
 include/linux/irqdesc.h              |   9 +-
 include/linux/kernel_stat.h          |   4 +
 kernel/irq/internals.h               |   2 +-
 kernel/irq/irqdesc.c                 |  34 ++++-
 kernel/irq/proc.c                    |   9 +-
 kernel/watchdog.c                    | 213 ++++++++++++++++++++++++++-
 lib/Kconfig.debug                    |  13 ++
 scripts/gdb/linux/interrupts.py      |   6 +-
 11 files changed, 269 insertions(+), 27 deletions(-)

--
2.37.1 (Apple Git-137.1)