* [PATCH 0/2] genirq/proc: Speed up show_interrupts()
@ 2024-05-13 12:05 Adrian Huang
2024-05-13 12:05 ` [PATCH 1/2] genirq/proc: Try to jump over the unallocated irq hole whenever possible Adrian Huang
2024-05-13 12:05 ` [PATCH 2/2] genirq/proc: Refine percpu kstat_irqs access logic Adrian Huang
0 siblings, 2 replies; 4+ messages in thread
From: Adrian Huang @ 2024-05-13 12:05 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-kernel, linux-fsdevel, Jiwei Sun, Adrian Huang
Since there are irq number allocation holes, we can jump over those
holes in order to speed up show_interrupts().
In addition, the percpu kstat_irqs access logic can be refined.
System Configuration
====================
* 2-socket server with 488 cores (HT-enabled).
* The last allocated irq is 508.
* nr_irqs = 8360. The following is from dmesg.
NR_IRQS: 524544, nr_irqs: 8360, preallocated irqs: 16
The biggest hole: 7852 iterations (8360 - 509 + 1) are not necessary.
Test Result
===========
* The following result is the average execution time of ten-time
measurements about `time cat /proc/interrupts`.
no patch (ms) patched (ms) saved
------------- ------------ -------
52.4 47.3 9.7%
Adrian Huang (2):
genirq/proc: Try to jump over the unallocated irq hole whenever
possible
genirq/proc: Refine percpu kstat_irqs access logic
fs/proc/interrupts.c | 6 ++++++
kernel/irq/proc.c | 26 ++++++++++++++++++--------
2 files changed, 24 insertions(+), 8 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 4+ messages in thread* [PATCH 1/2] genirq/proc: Try to jump over the unallocated irq hole whenever possible 2024-05-13 12:05 [PATCH 0/2] genirq/proc: Speed up show_interrupts() Adrian Huang @ 2024-05-13 12:05 ` Adrian Huang 2024-05-13 12:05 ` [PATCH 2/2] genirq/proc: Refine percpu kstat_irqs access logic Adrian Huang 1 sibling, 0 replies; 4+ messages in thread From: Adrian Huang @ 2024-05-13 12:05 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, linux-fsdevel, Jiwei Sun, Adrian Huang From: Adrian Huang <ahuang12@lenovo.com> Current approach blindly iterates the irq number until the number is greater than 'nr_irqs', and checks if each irq is allocated. It is inefficient if the big hole (the last allocated irq number to nr_irqs) is available in the system. The solution is to try jumping over the unallocated irq hole when an unallocated irq is detected. Tested-by: Jiwei Sun <sunjw10@lenovo.com> Signed-off-by: Adrian Huang <ahuang12@lenovo.com> --- fs/proc/interrupts.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/proc/interrupts.c b/fs/proc/interrupts.c index cb0edc7cbf09..111ea8a3c553 100644 --- a/fs/proc/interrupts.c +++ b/fs/proc/interrupts.c @@ -19,6 +19,12 @@ static void *int_seq_next(struct seq_file *f, void *v, loff_t *pos) (*pos)++; if (*pos > nr_irqs) return NULL; + + rcu_read_lock(); + if (!irq_to_desc(*pos)) + *pos = irq_get_next_irq(*pos); + rcu_read_unlock(); + return pos; } -- 2.25.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2] genirq/proc: Refine percpu kstat_irqs access logic 2024-05-13 12:05 [PATCH 0/2] genirq/proc: Speed up show_interrupts() Adrian Huang 2024-05-13 12:05 ` [PATCH 1/2] genirq/proc: Try to jump over the unallocated irq hole whenever possible Adrian Huang @ 2024-05-13 12:05 ` Adrian Huang 2024-05-14 23:04 ` Thomas Gleixner 1 sibling, 1 reply; 4+ messages in thread From: Adrian Huang @ 2024-05-13 12:05 UTC (permalink / raw) To: Thomas Gleixner; +Cc: linux-kernel, linux-fsdevel, Jiwei Sun, Adrian Huang From: Adrian Huang <ahuang12@lenovo.com> There is no need to accumulate all CPUs' kstat_irqs to determine whether the corresponding irq should be printed. Instead, stop the iteration once one of kstat_irqs is nonzero. In addition, no need to check if kstat_irqs address is available for each iteration when printing each CPU irq statistic. Tested-by: Jiwei Sun <sunjw10@lenovo.com> Signed-off-by: Adrian Huang <ahuang12@lenovo.com> --- kernel/irq/proc.c | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c index 623b8136e9af..bfa341fac687 100644 --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -461,7 +461,7 @@ int show_interrupts(struct seq_file *p, void *v) { static int prec; - unsigned long flags, any_count = 0; + unsigned long flags, print_irq = 1; int i = *(loff_t *) v, j; struct irqaction *action; struct irq_desc *desc; @@ -488,18 +488,28 @@ int show_interrupts(struct seq_file *p, void *v) if (!desc || irq_settings_is_hidden(desc)) goto outsparse; - if (desc->kstat_irqs) { - for_each_online_cpu(j) - any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j)); + if ((!desc->action || irq_desc_is_chained(desc)) && desc->kstat_irqs) { + print_irq = 0; + for_each_online_cpu(j) { + if (data_race(*per_cpu_ptr(desc->kstat_irqs, j))) { + print_irq = 1; + break; + } + } } - if ((!desc->action || irq_desc_is_chained(desc)) && !any_count) + if (!print_irq) goto outsparse; seq_printf(p, "%*d: ", prec, i); - for_each_online_cpu(j) - seq_printf(p, "%10u ", desc->kstat_irqs ? - *per_cpu_ptr(desc->kstat_irqs, j) : 0); + + if (desc->kstat_irqs) { + for_each_online_cpu(j) + seq_printf(p, "%10u ", *per_cpu_ptr(desc->kstat_irqs, j)); + } else { + for_each_online_cpu(j) + seq_printf(p, "%10u ", 0); + } raw_spin_lock_irqsave(&desc->lock, flags); if (desc->irq_data.chip) { -- 2.25.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 2/2] genirq/proc: Refine percpu kstat_irqs access logic 2024-05-13 12:05 ` [PATCH 2/2] genirq/proc: Refine percpu kstat_irqs access logic Adrian Huang @ 2024-05-14 23:04 ` Thomas Gleixner 0 siblings, 0 replies; 4+ messages in thread From: Thomas Gleixner @ 2024-05-14 23:04 UTC (permalink / raw) To: Adrian Huang; +Cc: linux-kernel, linux-fsdevel, Jiwei Sun, Adrian Huang On Mon, May 13 2024 at 20:05, Adrian Huang wrote: > @@ -461,7 +461,7 @@ int show_interrupts(struct seq_file *p, void *v) > { > static int prec; > > - unsigned long flags, any_count = 0; > + unsigned long flags, print_irq = 1; What's wrong with making print_irq boolean? > int i = *(loff_t *) v, j; > struct irqaction *action; > struct irq_desc *desc; > @@ -488,18 +488,28 @@ int show_interrupts(struct seq_file *p, void *v) > if (!desc || irq_settings_is_hidden(desc)) > goto outsparse; > > - if (desc->kstat_irqs) { > - for_each_online_cpu(j) > - any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j)); > + if ((!desc->action || irq_desc_is_chained(desc)) && desc->kstat_irqs) { The condition is wrong. Look how the old code evaluated any_count. > + print_irq = 0; > + for_each_online_cpu(j) { > + if (data_race(*per_cpu_ptr(desc->kstat_irqs, j))) { > + print_irq = 1; > + break; > + } > + } Aside of that this code is just fundamentally wrong in several aspects: 1) Interrupts which have no action are completely uninteresting as there is no real information attached, i.e. it shows that there were interrupts on some CPUs, but there is zero information from which device they originated. Especially with sparse interrupts enabled they are usually gone shortly after the last action was removed. 2) Chained interrupts do not have a count at all as they completely evade the core kernel entry points. So all of this can be avoided and the whole nonsense can be reduced to: if (!desc->action || irq_desc_is_chained(desc) || !desc->kstat_irqs) goto outsparse; which in turn allows to convert this: > - for_each_online_cpu(j) > - seq_printf(p, "%10u ", desc->kstat_irqs ? > - *per_cpu_ptr(desc->kstat_irqs, j) : 0); into an unconditional: for_each_online_cpu(j) seq_printf(p, "%10u ", *per_cpu_ptr(desc->kstat_irqs, j)); Thanks, tglx ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-05-14 23:04 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-13 12:05 [PATCH 0/2] genirq/proc: Speed up show_interrupts() Adrian Huang 2024-05-13 12:05 ` [PATCH 1/2] genirq/proc: Try to jump over the unallocated irq hole whenever possible Adrian Huang 2024-05-13 12:05 ` [PATCH 2/2] genirq/proc: Refine percpu kstat_irqs access logic Adrian Huang 2024-05-14 23:04 ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).