From: Bitao Hu <yaoma@linux.alibaba.com>
To: Doug Anderson <dianders@chromium.org>
Cc: akpm@linux-foundation.org, pmladek@suse.com,
kernelfans@gmail.com, liusong@linux.alibaba.com,
linux-kernel@vger.kernel.org, yaoma@linux.alibaba.com
Subject: Re: [PATCHv5 2/3] watchdog/softlockup: report the most frequent interrupts
Date: Wed, 7 Feb 2024 14:18:56 +0800 [thread overview]
Message-ID: <8cf08d4a-fa4b-41cb-8bfa-75387122b194@linux.alibaba.com> (raw)
In-Reply-To: <CAD=FV=V8VcmEDDpaWZi40j5dkP+HyBPFr=D_mKFG-YmXTpa_AA@mail.gmail.com>
Hi,
On 2024/2/7 05:42, Doug Anderson wrote:
> Hi,
>
> On Tue, Feb 6, 2024 at 1:59 AM Bitao Hu <yaoma@linux.alibaba.com> wrote:
>>
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 71d5b6dfa358..26dc1ad86276 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -18,6 +18,9 @@
>> #include <linux/init.h>
>> #include <linux/kernel_stat.h>
>> #include <linux/math64.h>
>> +#include <linux/irq.h>
>> +#include <linux/irqdesc.h>
>> +#include <linux/bitops.h>
>
> These are still not sorted alphabetically. "irq.h" and "irqdesc.h"
> should go between "init.h" and "kernel_stat.h". "bitops.h" is trickier
> because the existing headers are not quite sorted. Probably the best
> would be to fully sort them. They should end up like this:
>
> #include <linux/bitops.h>
> #include <linux/cpu.h>
> #include <linux/init.h>
> #include <linux/irq.h>
> #include <linux/irqdesc.h>
> #include <linux/kernel_stat.h>
> #include <linux/kvm_para.h>
> #include <linux/math64.h>
> #include <linux/mm.h>
> #include <linux/module.h>
> #include <linux/nmi.h>
> #include <linux/stop_machine.h>
> #include <linux/sysctl.h>
> #include <linux/tick.h>
>
> #include <linux/sched/clock.h>
> #include <linux/sched/debug.h>
> #include <linux/sched/isolation.h>
>
> #include <asm/irq_regs.h>
>
Sorry, I misunderstood your point, thinking that they should only be
added between "init.h" and "module.h". I will arrange them in the
alphabetical order as you suggested.
>
>> +static void start_counting_irqs(void)
>> +{
>> + int i;
>> + struct irq_desc *desc;
>> + u32 *counts = __this_cpu_read(hardirq_counts);
>> + int cpu = smp_processor_id();
>> +
>> + if (!test_bit(cpu, softlockup_hardirq_cpus)) {
>
> I don't think you need "softlockup_hardirq_cpus", do you? Just read
> "actual_nr_irqs" and see if it's non-zero? ...or read "hardirq_counts"
> and see if it's non-NULL?
Sure, the existing variables are sufficient for making a determination.
And may be I should swap it to make the decision logic here clearer,
like this (untested)?
bool is_counting_started(void)
{
return !!__this_cpu_read(hardirq_counts);
}
if (!is_counting_started()) {
>
>
>> + /*
>> + * nr_irqs has the potential to grow at runtime. We should read
>> + * it and store locally to avoid array out-of-bounds access.
>> + */
>> + __this_cpu_write(actual_nr_irqs, nr_irqs);
>
> nit: IMO store nr_irqs in a local variable to avoid all of the
> "__this_cpu_read" calls everywhere. Then just write it once from your
> local variable.
OK.
>
>
>> + counts = kmalloc_array(__this_cpu_read(actual_nr_irqs),
>> + sizeof(u32),
>> + GFP_ATOMIC);
>
> should use "kcalloc()" so the array is zeroed. That way if the set of
> non-NULL "desc"s changes between calls you don't end up reading
> uninitialized memory.
OK, I will use "kcalloc()" here.
>
>
>> +static void stop_counting_irqs(void)
>> +{
>> + u32 *counts = __this_cpu_read(hardirq_counts);
>> + int cpu = smp_processor_id();
>> +
>> + if (test_bit(cpu, softlockup_hardirq_cpus)) {
>> + kfree(counts);
>> + counts = NULL;
>> + __this_cpu_write(hardirq_counts, counts);
>
> nit: don't really need to set the local "counts" to NULL. Just:
>
> __this_cpu_write(hardirq_counts, NULL);
>
> ...and actually if you take my advice above and get rid of
> "softlockup_hardirq_cpus" then this function just becomes:
>
> kfree(__this_cpu_read(hardirq_counts));
> __this_cpu_write(hardirq_counts, NULL);
>
> Since kfree() handles when you pass it NULL...
OK.
>
>
>> +static void print_irq_counts(void)
>> +{
>> + int i;
>> + struct irq_desc *desc;
>> + u32 counts_diff;
>> + u32 *counts = __this_cpu_read(hardirq_counts);
>> + int cpu = smp_processor_id();
>> + struct irq_counts irq_counts_sorted[NUM_HARDIRQ_REPORT] = {
>> + {-1, 0}, {-1, 0}, {-1, 0}, {-1, 0},
>> + };
>> +
>> + if (test_bit(cpu, softlockup_hardirq_cpus)) {
>> + for_each_irq_desc(i, desc) {
>> + if (!desc)
>> + continue;
>
> The "if" test above isn't needed. The "for_each_irq_desc()" macro
> already checks for NULL.
Thanks for your reminder.
>
>
>
>> + /*
>> + * We need to bounds-check in case someone on a different CPU
>> + * expanded nr_irqs.
>> + */
>> + if (i < __this_cpu_read(actual_nr_irqs))
>> + counts_diff = desc->kstat_irqs ?
>> + *this_cpu_ptr(desc->kstat_irqs) - counts[i] : 0;
>> + else
>> + counts_diff = desc->kstat_irqs ?
>> + *this_cpu_ptr(desc->kstat_irqs) : 0;
>
> Why do you need to test "kstat_irqs" for 0?
Although "alloc_desc" wil allocate both "desc" and "kstat_irqs" at the
same time, I refer to the usage of "kstat_irqs" in "show_interrupts"
from kernel/irq/proc.c, where it does perform a check.
kernel/irq/proc.c: show_interrupts
for_each_online_cpu(j)
seq_printf(p, "%10u ", desc->kstat_irqs ?
*per_cpu_ptr(desc->kstat_irqs, j) : 0);
I'm not sure why this is necessary, so I copied it as it is. Can we skip
the check in "print_irq_counts"?
> duplicate the math. In other words, I'd expect this (untested):
>
> if (i < __this_cpu_read(actual_nr_irqs))
> count = counts[i];
> else
> count = 0;
> counts_diff = *this_cpu_ptr(desc->kstat_irqs) - count;
Agree.
>
> I guess I'd also put "__this_cpu_read(actual_nr_irqs)" in a local
> variable like you do with counts...
next prev parent reply other threads:[~2024-02-07 6:19 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-06 9:58 [PATCHv5 0/3] *** Detect interrupt storm in softlockup *** Bitao Hu
2024-02-06 9:59 ` [PATCHv5 1/3] watchdog/softlockup: low-overhead detection of interrupt Bitao Hu
2024-02-06 21:41 ` Doug Anderson
2024-02-07 6:18 ` Bitao Hu
2024-02-07 17:13 ` Doug Anderson
2024-02-06 9:59 ` [PATCHv5 2/3] watchdog/softlockup: report the most frequent interrupts Bitao Hu
2024-02-06 21:42 ` Doug Anderson
2024-02-07 6:18 ` Bitao Hu [this message]
2024-02-07 17:14 ` Doug Anderson
2024-02-06 9:59 ` [PATCHv5 3/3] watchdog/softlockup: add SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob Bitao Hu
2024-02-06 21:42 ` Doug Anderson
2024-02-07 6:19 ` Bitao Hu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8cf08d4a-fa4b-41cb-8bfa-75387122b194@linux.alibaba.com \
--to=yaoma@linux.alibaba.com \
--cc=akpm@linux-foundation.org \
--cc=dianders@chromium.org \
--cc=kernelfans@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=liusong@linux.alibaba.com \
--cc=pmladek@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox