* [GIT pull] core/rseq for v7.2-rc1
@ 2026-06-13 21:24 Thomas Gleixner
From: Thomas Gleixner @ 2026-06-13 21:24 UTC
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest core/rseq branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core-rseq-2026-06-13
up to: 5a0daaff6ed9: selftests/rseq: Add config fragment
A trivial update for the RSEQ selftests, which adds a config fragment
containing the config options required to actually run the tests.
Thanks,
tglx
------------------>
Mark Brown (1):
selftests/rseq: Add config fragment
tools/testing/selftests/rseq/config | 3 +++
1 file changed, 3 insertions(+)
create mode 100644 tools/testing/selftests/rseq/config
diff --git a/tools/testing/selftests/rseq/config b/tools/testing/selftests/rseq/config
new file mode 100644
index 000000000000..a64608043ace
--- /dev/null
+++ b/tools/testing/selftests/rseq/config
@@ -0,0 +1,3 @@
+CONFIG_EXPERT=y
+CONFIG_RSEQ=y
+CONFIG_RSEQ_SLICE_EXTENSION=y
* [GIT pull] irq/core for v7.2-rc1
@ 2026-06-13 21:24 ` Thomas Gleixner
From: Thomas Gleixner @ 2026-06-13 21:24 UTC
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest irq/core branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-core-2026-06-13
up to: 8f727615134a: x86/irq: Add missing 's' back to thermal event printout
Interrupt core code changes:
- Rework of /proc/interrupts handling:
/proc/interrupts was subject to micro optimizations for a long time,
but most of the low hanging fruit was left on the table. This rework
addresses the major time consuming issues:
- Count consecutive zeros and emit a string constant instead of
printing a long series of zeros one by one via a format
string.
- Simplify and cache the conditions that decide whether interrupts should be printed
- Use a proper iteration over the interrupt descriptor xarray
instead of walking and testing one by one.
- Provide helper functions for the architecture code to emit the
architecture specific counters
- Convert the counter structure in x86 to an array, which
simplifies the output, and add mechanisms to suppress unused
architecture interrupts, which just occupy space for
nothing. Adopt the new core mechanisms.
This adjusts the gdb scripts related to interrupt counter statistics
to work with the new mechanisms.
- Prevent a string overflow in the /proc/irq/$N/ directory name
creation code.
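
For illustration only, the zero-count handling mentioned in the first item
can be sketched in user space roughly as follows. This is not the kernel
implementation: format_counts(), zero_col and COL_WIDTH are hypothetical
names, and the column layout is assumed to be the "%10u " format that the
old x86 code in the diff below uses for per-CPU counts.

```c
#include <stdio.h>
#include <string.h>

/*
 * One pre-formatted zero column. It is byte-for-byte what "%10u " produces
 * for the value 0: nine spaces, the digit and a trailing space.
 */
static const char zero_col[] = "         0 ";
#define COL_WIDTH (sizeof(zero_col) - 1)

/*
 * Hypothetical helper: format n per-CPU counts into buf. Runs of zeros are
 * only counted while scanning and are flushed as constant strings, so the
 * format string parser only runs for non-zero values.
 *
 * Returns the number of characters written, excluding the terminating NUL.
 * Output is cut at a column boundary if buf is too small.
 */
static size_t format_counts(char *buf, size_t size,
			    const unsigned int *counts, size_t n)
{
	size_t pos = 0, zeros = 0;

	if (!size)
		return 0;

	/* The extra iteration (i == n) flushes a trailing run of zeros. */
	for (size_t i = 0; i <= n; i++) {
		if (i < n && counts[i] == 0) {
			zeros++;
			continue;
		}

		/* Flush the pending run of zeros with plain copies. */
		for (; zeros && pos + COL_WIDTH < size; zeros--) {
			memcpy(buf + pos, zero_col, COL_WIDTH);
			pos += COL_WIDTH;
		}
		if (zeros)
			break;	/* Out of space */

		if (i < n) {
			int len = snprintf(buf + pos, size - pos, "%10u ", counts[i]);

			if (len < 0 || (size_t)len >= size - pos)
				break;	/* Error or truncated */
			pos += (size_t)len;
		}
	}
	buf[pos] = '\0';
	return pos;
}
```

The output is identical to formatting every value with "%10u ". The
difference is that on a mostly idle line the bulk of the columns is
produced by memcpy() rather than by the format string machinery.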
Thanks,
tglx
------------------>
Dmitry Ilvokhin (1):
x86/irq: Optimize interrupts decimals printing
Pengpeng Hou (1):
genirq/proc: Size interrupt directory names for 10-digit interrupt numbers
Thomas Gleixner (16):
genirq/proc: Avoid formatting zero counts in /proc/interrupts
genirq/proc: Utilize irq_desc::tot_count to avoid evaluation
x86/irq: Make irqstats array based
x86/irq: Suppress unlikely interrupt stats by default
x86/irq: Move IOAPIC misrouted and PIC/APIC error counts into irq_stats
scripts/gdb: Update x86 interrupts to the array based storage
genirq: Expose nr_irqs in core code
genirq/manage: Make NMI cleanup RT safe
genirq: Cache the condition for /proc/interrupts exposure
genirq: Calculate precision only when required
genirq/proc: Increase default interrupt number precision to four
genirq: Add rcuref count to struct irq_desc
genirq: Expose irq_find_desc_at_or_after() in core code
genirq/proc: Runtime size the chip name
genirq/proc: Speed up /proc/interrupts iteration
x86/irq: Add missing 's' back to thermal event printout
arch/alpha/kernel/irq.c | 8 +-
arch/arm/kernel/smp.c | 3 +-
arch/arm64/kernel/smp.c | 5 +-
arch/loongarch/kernel/smp.c | 2 +-
arch/riscv/kernel/smp.c | 3 +-
arch/sh/kernel/irq.c | 2 +-
arch/sparc/kernel/irq_32.c | 12 +-
arch/sparc/kernel/irq_64.c | 4 +-
arch/um/kernel/irq.c | 4 +-
arch/x86/events/amd/core.c | 2 +-
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/core.c | 2 +-
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/knc.c | 2 +-
arch/x86/events/intel/p4.c | 2 +-
arch/x86/events/zhaoxin/core.c | 2 +-
arch/x86/hyperv/hv_init.c | 2 +-
arch/x86/include/asm/hardirq.h | 85 ++++++-----
arch/x86/include/asm/hw_irq.h | 4 -
arch/x86/include/asm/mce.h | 3 -
arch/x86/kernel/apic/apic.c | 6 +-
arch/x86/kernel/apic/io_apic.c | 4 +-
arch/x86/kernel/apic/ipi.c | 2 +-
arch/x86/kernel/cpu/acrn.c | 2 +-
arch/x86/kernel/cpu/mce/amd.c | 2 +-
arch/x86/kernel/cpu/mce/core.c | 8 +-
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
arch/x86/kernel/cpu/mshyperv.c | 4 +-
arch/x86/kernel/i8259.c | 2 +-
arch/x86/kernel/irq.c | 281 +++++++++++++++---------------------
arch/x86/kernel/irq_work.c | 2 +-
arch/x86/kernel/kvm.c | 2 +-
arch/x86/kernel/nmi.c | 4 +-
arch/x86/kernel/smp.c | 6 +-
arch/x86/mm/tlb.c | 2 +-
arch/x86/xen/enlighten_hvm.c | 2 +-
arch/x86/xen/enlighten_pv.c | 2 +-
arch/x86/xen/smp.c | 6 +-
arch/x86/xen/smp_pv.c | 2 +-
arch/xtensa/kernel/irq.c | 2 +-
fs/proc/Makefile | 4 +-
fs/proc/stat.c | 4 -
include/linux/interrupt.h | 1 +
include/linux/irq.h | 1 +
include/linux/irqdesc.h | 8 +-
kernel/irq/chip.c | 8 +-
kernel/irq/debugfs.h | 44 ++++++
kernel/irq/internals.h | 64 +++-----
kernel/irq/irqdesc.c | 70 +++++----
kernel/irq/irqdomain.c | 5 +-
kernel/irq/manage.c | 45 +++---
kernel/irq/proc.c | 234 ++++++++++++++++++++++++------
kernel/irq/proc.h | 13 ++
kernel/irq/settings.h | 13 ++
scripts/gdb/linux/interrupts.py | 106 +++++---------
55 files changed, 633 insertions(+), 481 deletions(-)
create mode 100644 kernel/irq/debugfs.h
create mode 100644 kernel/irq/proc.h
diff --git a/arch/alpha/kernel/irq.c b/arch/alpha/kernel/irq.c
index c67047c5d830..4a6a8b1d5a8b 100644
--- a/arch/alpha/kernel/irq.c
+++ b/arch/alpha/kernel/irq.c
@@ -72,16 +72,16 @@ int arch_show_interrupts(struct seq_file *p, int prec)
int j;
#ifdef CONFIG_SMP
- seq_puts(p, "IPI: ");
+ seq_puts(p, " IPI: ");
for_each_online_cpu(j)
seq_printf(p, "%10lu ", cpu_data[j].ipi_count);
seq_putc(p, '\n');
#endif
- seq_puts(p, "PMI: ");
+ seq_puts(p, " PMI: ");
for_each_online_cpu(j)
seq_printf(p, "%10lu ", per_cpu(irq_pmi_count, j));
- seq_puts(p, " Performance Monitoring\n");
- seq_printf(p, "ERR: %10lu\n", irq_err_count);
+ seq_puts(p, " Performance Monitoring\n");
+ seq_printf(p, " ERR: %10lu\n", irq_err_count);
return 0;
}
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 4e8e89a26ca3..b5fb4697bc3f 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -551,8 +551,7 @@ void show_ipi_list(struct seq_file *p, int prec)
if (!ipi_desc[i])
continue;
- seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i,
- prec >= 4 ? " " : "");
+ seq_printf(p, "%*s%u:", prec - 1, "IPI", i);
for_each_online_cpu(cpu)
seq_printf(p, "%10u ", irq_desc_kstat_cpu(ipi_desc[i], cpu));
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1aa324104afb..1d0e0e6a5b92 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -833,11 +833,10 @@ int arch_show_interrupts(struct seq_file *p, int prec)
unsigned int cpu, i;
for (i = 0; i < MAX_IPI; i++) {
- seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i,
- prec >= 4 ? " " : "");
+ seq_printf(p, "%*s%u: ", prec - 1, "IPI", i);
for_each_online_cpu(cpu)
seq_printf(p, "%10u ", irq_desc_kstat_cpu(get_ipi_desc(cpu, i), cpu));
- seq_printf(p, " %s\n", ipi_types[i]);
+ seq_printf(p, " %s\n", ipi_types[i]);
}
seq_printf(p, "%*s: %10lu\n", prec, "Err", irq_err_count);
diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c
index 64a048f1b880..50922610758b 100644
--- a/arch/loongarch/kernel/smp.c
+++ b/arch/loongarch/kernel/smp.c
@@ -88,7 +88,7 @@ void show_ipi_list(struct seq_file *p, int prec)
unsigned int cpu, i;
for (i = 0; i < NR_IPI; i++) {
- seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i, prec >= 4 ? " " : "");
+ seq_printf(p, "%*s%u:", prec - 1, "IPI", i);
for_each_online_cpu(cpu)
seq_put_decimal_ull_width(p, " ", per_cpu(irq_stat, cpu).ipi_irqs[i], 10);
seq_printf(p, " LoongArch %d %s\n", i + 1, ipi_types[i]);
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 5ed5095320e6..fa66f9c97d74 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -226,8 +226,7 @@ void show_ipi_stats(struct seq_file *p, int prec)
unsigned int cpu, i;
for (i = 0; i < IPI_MAX; i++) {
- seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i,
- prec >= 4 ? " " : "");
+ seq_printf(p, "%*s%u:", prec - 1, "IPI", i);
for_each_online_cpu(cpu)
seq_printf(p, "%10u ", irq_desc_kstat_cpu(ipi_desc[i], cpu));
seq_printf(p, " %s\n", ipi_names[i]);
diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c
index 9022d8af9d68..03c39b5da50f 100644
--- a/arch/sh/kernel/irq.c
+++ b/arch/sh/kernel/irq.c
@@ -46,7 +46,7 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "%*s:", prec, "NMI");
for_each_online_cpu(j)
seq_put_decimal_ull_width(p, " ", per_cpu(irq_stat.__nmi_count, j), 10);
- seq_printf(p, " Non-maskable interrupts\n");
+ seq_printf(p, " Non-maskable interrupts\n");
seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count));
diff --git a/arch/sparc/kernel/irq_32.c b/arch/sparc/kernel/irq_32.c
index 5210991429d5..22db727652ba 100644
--- a/arch/sparc/kernel/irq_32.c
+++ b/arch/sparc/kernel/irq_32.c
@@ -199,19 +199,19 @@ int arch_show_interrupts(struct seq_file *p, int prec)
int j;
#ifdef CONFIG_SMP
- seq_printf(p, "RES:");
+ seq_printf(p, "%*s:", prec, "RES");
for_each_online_cpu(j)
seq_put_decimal_ull_width(p, " ", cpu_data(j).irq_resched_count, 10);
- seq_printf(p, " IPI rescheduling interrupts\n");
- seq_printf(p, "CAL:");
+ seq_printf(p, " IPI rescheduling interrupts\n");
+ seq_printf(p, "%*s:", prec, "CAL");
for_each_online_cpu(j)
seq_put_decimal_ull_width(p, " ", cpu_data(j).irq_call_count, 10);
- seq_printf(p, " IPI function call interrupts\n");
+ seq_printf(p, " IPI function call interrupts\n");
#endif
- seq_printf(p, "NMI:");
+ seq_printf(p, "%*s:", prec, "NMI");
for_each_online_cpu(j)
seq_put_decimal_ull_width(p, " ", cpu_data(j).counter, 10);
- seq_printf(p, " Non-maskable interrupts\n");
+ seq_printf(p, " Non-maskable interrupts\n");
return 0;
}
diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c
index c5466a9fd560..3f55c69d5f3b 100644
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -303,10 +303,10 @@ int arch_show_interrupts(struct seq_file *p, int prec)
{
int j;
- seq_printf(p, "NMI:");
+ seq_printf(p, "%*s:", prec, "NMI");
for_each_online_cpu(j)
seq_put_decimal_ull_width(p, " ", cpu_data(j).__nmi_count, 10);
- seq_printf(p, " Non-maskable interrupts\n");
+ seq_printf(p, " Non-maskable interrupts\n");
return 0;
}
diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index 5929d498b65f..ddfd6e9bd8c7 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -716,12 +716,12 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "%*s: ", prec, "RES");
for_each_online_cpu(cpu)
seq_printf(p, "%10u ", irq_stats(cpu)->irq_resched_count);
- seq_puts(p, " Rescheduling interrupts\n");
+ seq_puts(p, " Rescheduling interrupts\n");
seq_printf(p, "%*s: ", prec, "CAL");
for_each_online_cpu(cpu)
seq_printf(p, "%10u ", irq_stats(cpu)->irq_call_count);
- seq_puts(p, " Function call interrupts\n");
+ seq_puts(p, " Function call interrupts\n");
#endif
return 0;
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 0c92ed5f464b..305774b67995 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1032,7 +1032,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
* Unmasking the LVTPC is not required as the Mask (M) bit of the LVT
* PMI entry is not set by the local APIC when a PMC overflow occurs
*/
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
done:
cpuc->enabled = pmu_enabled;
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index e0bd5051db2a..ed18a6d7e1a8 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1600,7 +1600,7 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
handled += perf_ibs_handle_irq(&perf_ibs_op, regs);
if (handled)
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
perf_sample_event_took(sched_clock() - stamp);
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 810ab21ffd99..244ba2018c12 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1750,7 +1750,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
}
if (handled)
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
return handled;
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d9488ade0f8e..2d0802c590c5 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3504,7 +3504,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
int bit;
int handled = 0;
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
/*
* Ignore a range of extra bits in status that do not indicate
diff --git a/arch/x86/events/intel/knc.c b/arch/x86/events/intel/knc.c
index e614baf42926..e887adc108ac 100644
--- a/arch/x86/events/intel/knc.c
+++ b/arch/x86/events/intel/knc.c
@@ -238,7 +238,7 @@ static int knc_pmu_handle_irq(struct pt_regs *regs)
goto done;
}
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
struct perf_event *event = cpuc->events[bit];
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index 02bfdb77158b..5368dc31787c 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -1077,7 +1077,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
}
if (handled)
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
/*
* When dealing with the unmasking of the LVTPC on P4 perf hw, it has
diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index 4bdfcf091200..4bc177badac2 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -373,7 +373,7 @@ static int zhaoxin_pmu_handle_irq(struct pt_regs *regs)
else
zhaoxin_pmu_ack_status(status);
- inc_irq_stat(apic_perf_irqs);
+ inc_perf_irq_stat();
/*
* CondChgd bit 63 doesn't mean any overflow status. Ignore
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 323adc93f2dc..55a8b6de2865 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -219,7 +219,7 @@ static inline bool hv_reenlightenment_available(void)
DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_reenlightenment)
{
apic_eoi();
- inc_irq_stat(irq_hv_reenlightenment_count);
+ inc_irq_stat(HYPERV_REENLIGHTENMENT);
schedule_delayed_work(&hv_reenlightenment_work, HZ/10);
}
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 9314642ae93c..dea60d66d976 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -4,51 +4,64 @@
#include <linux/threads.h>
-typedef struct {
-#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
- u8 kvm_cpu_l1tf_flush_l1d;
-#endif
- unsigned int __nmi_count; /* arch dependent */
+enum irq_stat_counts {
+ IRQ_COUNT_NMI,
#ifdef CONFIG_X86_LOCAL_APIC
- unsigned int apic_timer_irqs; /* arch dependent */
- unsigned int irq_spurious_count;
- unsigned int icr_read_retry_count;
-#endif
-#if IS_ENABLED(CONFIG_KVM)
- unsigned int kvm_posted_intr_ipis;
- unsigned int kvm_posted_intr_wakeup_ipis;
- unsigned int kvm_posted_intr_nested_ipis;
+ IRQ_COUNT_APIC_TIMER,
+ IRQ_COUNT_SPURIOUS,
+ IRQ_COUNT_APIC_PERF,
+ IRQ_COUNT_IRQ_WORK,
+ IRQ_COUNT_ICR_READ_RETRY,
+ IRQ_COUNT_X86_PLATFORM_IPI,
#endif
-#ifdef CONFIG_GUEST_PERF_EVENTS
- unsigned int perf_guest_mediated_pmis;
-#endif
- unsigned int x86_platform_ipis; /* arch dependent */
- unsigned int apic_perf_irqs;
- unsigned int apic_irq_work_irqs;
#ifdef CONFIG_SMP
- unsigned int irq_resched_count;
- unsigned int irq_call_count;
+ IRQ_COUNT_RESCHEDULE,
+ IRQ_COUNT_CALL_FUNCTION,
#endif
- unsigned int irq_tlb_count;
+ IRQ_COUNT_TLB,
#ifdef CONFIG_X86_THERMAL_VECTOR
- unsigned int irq_thermal_count;
+ IRQ_COUNT_THERMAL_APIC,
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
- unsigned int irq_threshold_count;
+ IRQ_COUNT_THRESHOLD_APIC,
#endif
#ifdef CONFIG_X86_MCE_AMD
- unsigned int irq_deferred_error_count;
+ IRQ_COUNT_DEFERRED_ERROR,
+#endif
+#ifdef CONFIG_X86_MCE
+ IRQ_COUNT_MCE_EXCEPTION,
+ IRQ_COUNT_MCE_POLL,
#endif
#ifdef CONFIG_X86_HV_CALLBACK_VECTOR
- unsigned int irq_hv_callback_count;
+ IRQ_COUNT_HYPERVISOR_CALLBACK,
#endif
#if IS_ENABLED(CONFIG_HYPERV)
- unsigned int irq_hv_reenlightenment_count;
- unsigned int hyperv_stimer0_count;
+ IRQ_COUNT_HYPERV_REENLIGHTENMENT,
+ IRQ_COUNT_HYPERV_STIMER0,
+#endif
+#if IS_ENABLED(CONFIG_KVM)
+ IRQ_COUNT_POSTED_INTR,
+ IRQ_COUNT_POSTED_INTR_NESTED,
+ IRQ_COUNT_POSTED_INTR_WAKEUP,
+#endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+ IRQ_COUNT_PERF_GUEST_MEDIATED_PMI,
#endif
#ifdef CONFIG_X86_POSTED_MSI
- unsigned int posted_msi_notification_count;
+ IRQ_COUNT_POSTED_MSI_NOTIFICATION,
+#endif
+ IRQ_COUNT_PIC_APIC_ERROR,
+#ifdef CONFIG_X86_IO_APIC
+ IRQ_COUNT_IOAPIC_MISROUTED,
+#endif
+ IRQ_COUNT_MAX,
+};
+
+typedef struct {
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
+ u8 kvm_cpu_l1tf_flush_l1d;
#endif
+ unsigned int counts[IRQ_COUNT_MAX];
} ____cacheline_aligned irq_cpustat_t;
DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
@@ -58,15 +71,21 @@ DECLARE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
#endif
#define __ARCH_IRQ_STAT
-#define inc_irq_stat(member) this_cpu_inc(irq_stat.member)
+#define inc_irq_stat(index) this_cpu_inc(irq_stat.counts[IRQ_COUNT_##index])
+void irq_stat_inc_and_enable(enum irq_stat_counts which);
+
+#ifdef CONFIG_X86_LOCAL_APIC
+#define inc_perf_irq_stat() inc_irq_stat(APIC_PERF)
+#else
+#define inc_perf_irq_stat() do { } while (0)
+#endif
extern void ack_bad_irq(unsigned int irq);
+#ifdef CONFIG_PROC_FS
extern u64 arch_irq_stat_cpu(unsigned int cpu);
#define arch_irq_stat_cpu arch_irq_stat_cpu
-
-extern u64 arch_irq_stat(void);
-#define arch_irq_stat arch_irq_stat
+#endif
DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
#define local_softirq_pending_ref __softirq_pending
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index cbe19e669080..47727d0b540b 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -110,10 +110,6 @@ static inline void lock_vector_lock(void) {}
static inline void unlock_vector_lock(void) {}
#endif
-/* Statistics */
-extern atomic_t irq_err_count;
-extern atomic_t irq_mis_count;
-
extern void elcr_set_level_irq(unsigned int irq);
extern char irq_entries_start[];
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 0175d39a5856..e575b702063d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -291,9 +291,6 @@ bool mce_is_memory_error(struct mce *m);
bool mce_is_correctable(struct mce *m);
bool mce_usable_address(struct mce *m);
-DECLARE_PER_CPU(unsigned, mce_exception_count);
-DECLARE_PER_CPU(unsigned, mce_poll_count);
-
typedef DECLARE_BITMAP(mce_banks_t, MAX_NR_BANKS);
DECLARE_PER_CPU(mce_banks_t, mce_poll_banks);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 639904911444..3eeebc2c5a1d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1045,7 +1045,7 @@ static void local_apic_timer_interrupt(void)
/*
* the NMI deadlock-detector uses this.
*/
- inc_irq_stat(apic_timer_irqs);
+ inc_irq_stat(APIC_TIMER);
evt->event_handler(evt);
}
@@ -2114,7 +2114,7 @@ static noinline void handle_spurious_interrupt(u8 vector)
trace_spurious_apic_entry(vector);
- inc_irq_stat(irq_spurious_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_SPURIOUS);
/*
* If this is a spurious interrupt then do not acknowledge
@@ -2186,7 +2186,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_error_interrupt)
apic_write(APIC_ESR, 0);
v = apic_read(APIC_ESR);
apic_eoi();
- atomic_inc(&irq_err_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_PIC_APIC_ERROR);
apic_pr_debug("APIC error on CPU%d: %02x", smp_processor_id(), v);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 352ed5558cbc..7d7175d01228 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1575,8 +1575,6 @@ static unsigned int startup_ioapic_irq(struct irq_data *data)
return was_pending;
}
-atomic_t irq_mis_count;
-
#ifdef CONFIG_GENERIC_PENDING_IRQ
static bool io_apic_level_ack_pending(struct mp_chip_data *data)
{
@@ -1713,7 +1711,7 @@ static void ioapic_ack_level(struct irq_data *irq_data)
* at the cpu.
*/
if (!(v & (1 << (i & 0x1f)))) {
- atomic_inc(&irq_mis_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_IOAPIC_MISROUTED);
eoi_ioapic_pin(cfg->vector, irq_data->chip_data);
}
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 98a57cb4aa86..c627bee3b14f 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -120,7 +120,7 @@ u32 apic_mem_wait_icr_idle_timeout(void)
for (cnt = 0; cnt < 1000; cnt++) {
if (!(apic_read(APIC_ICR) & APIC_ICR_BUSY))
return 0;
- inc_irq_stat(icr_read_retry_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_ICR_READ_RETRY);
udelay(100);
}
return APIC_ICR_BUSY;
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index 2c5b51aad91a..dc119af83524 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -52,7 +52,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_acrn_hv_callback)
* HYPERVISOR_CALLBACK_VECTOR.
*/
apic_eoi();
- inc_irq_stat(irq_hv_callback_count);
+ inc_irq_stat(HYPERVISOR_CALLBACK);
if (acrn_intr_handler)
acrn_intr_handler();
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6605a0224659..222fa9cb181b 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -850,7 +850,7 @@ bool amd_mce_usable_address(struct mce *m)
DEFINE_IDTENTRY_SYSVEC(sysvec_deferred_error)
{
trace_deferred_error_apic_entry(DEFERRED_ERROR_VECTOR);
- inc_irq_stat(irq_deferred_error_count);
+ inc_irq_stat(DEFERRED_ERROR);
deferred_error_int_vector();
trace_deferred_error_apic_exit(DEFERRED_ERROR_VECTOR);
apic_eoi();
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 8dd424ac5de8..77cad8d57e16 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -67,8 +67,6 @@ static DEFINE_MUTEX(mce_sysfs_mutex);
#define SPINUNIT 100 /* 100ns */
-DEFINE_PER_CPU(unsigned, mce_exception_count);
-
DEFINE_PER_CPU_READ_MOSTLY(unsigned int, mce_num_banks);
DEFINE_PER_CPU_READ_MOSTLY(struct mce_bank[MAX_NR_BANKS], mce_banks_array);
@@ -716,8 +714,6 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)
}
}
-DEFINE_PER_CPU(unsigned, mce_poll_count);
-
/*
* We have three scenarios for checking for Deferred errors:
*
@@ -820,7 +816,7 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
struct mce *m;
int i;
- this_cpu_inc(mce_poll_count);
+ inc_irq_stat(MCE_POLL);
mce_gather_info(&err, NULL);
m = &err.m;
@@ -1595,7 +1591,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
*/
lmce = 1;
- this_cpu_inc(mce_exception_count);
+ inc_irq_stat(MCE_EXCEPTION);
mce_gather_info(&err, regs);
m = &err.m;
diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
index 0d13c9ffcba0..6c370d5af5bd 100644
--- a/arch/x86/kernel/cpu/mce/threshold.c
+++ b/arch/x86/kernel/cpu/mce/threshold.c
@@ -37,7 +37,7 @@ void (*mce_threshold_vector)(void) = default_threshold_interrupt;
DEFINE_IDTENTRY_SYSVEC(sysvec_threshold)
{
trace_threshold_apic_entry(THRESHOLD_APIC_VECTOR);
- inc_irq_stat(irq_threshold_count);
+ inc_irq_stat(THRESHOLD_APIC);
mce_threshold_vector();
trace_threshold_apic_exit(THRESHOLD_APIC_VECTOR);
apic_eoi();
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index b5b6a58b67b0..9381102e884a 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -154,7 +154,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_callback)
{
struct pt_regs *old_regs = set_irq_regs(regs);
- inc_irq_stat(irq_hv_callback_count);
+ inc_irq_stat(HYPERVISOR_CALLBACK);
if (mshv_handler)
mshv_handler();
@@ -193,7 +193,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_stimer0)
{
struct pt_regs *old_regs = set_irq_regs(regs);
- inc_irq_stat(hyperv_stimer0_count);
+ inc_irq_stat(HYPERV_STIMER0);
if (hv_stimer0_handler)
hv_stimer0_handler();
add_interrupt_randomness(HYPERV_STIMER0_VECTOR);
diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index f67063df6723..f7a86b94a0dd 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -214,7 +214,7 @@ static void mask_and_ack_8259A(struct irq_data *data)
"spurious 8259A interrupt: IRQ%d.\n", irq);
spurious_irq_mask |= irqmask;
}
- atomic_inc(&irq_err_count);
+ irq_stat_inc_and_enable(IRQ_COUNT_PIC_APIC_ERROR);
/*
* Theoretically we do not have to handle this IRQ,
* but in Linux this does not cause problems and is
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index ec77be217eaf..30122f0b3af9 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -39,8 +39,6 @@ EXPORT_PER_CPU_SYMBOL(__softirq_pending);
DEFINE_PER_CPU_CACHE_HOT(struct irq_stack *, hardirq_stack_ptr);
-atomic_t irq_err_count;
-
/*
* 'what should we do if we get a hw irq event on an illegal vector'.
* each architecture has to answer this themselves.
@@ -62,198 +60,147 @@ void ack_bad_irq(unsigned int irq)
apic_eoi();
}
-#define irq_stats(x) (&per_cpu(irq_stat, x))
-/*
- * /proc/interrupts printing for arch specific interrupts
- */
-int arch_show_interrupts(struct seq_file *p, int prec)
-{
- int j;
+struct irq_stat_info {
+ unsigned int skip_vector;
+ const char *symbol;
+ const char *text;
+};
+
+#define DEFAULT_SUPPRESSED_VECTOR UINT_MAX
+
+#define ISS(idx, sym, txt) [IRQ_COUNT_##idx] = { .symbol = sym, .text = txt }
- seq_printf(p, "%*s: ", prec, "NMI");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->__nmi_count);
- seq_puts(p, " Non-maskable interrupts\n");
+#define ITS(idx, sym, txt) [IRQ_COUNT_##idx] = \
+ { .skip_vector = idx## _VECTOR, .symbol = sym, .text = txt }
+
+#define IDS(idx, sym, txt) [IRQ_COUNT_##idx] = \
+ { .skip_vector = DEFAULT_SUPPRESSED_VECTOR, .symbol = sym, .text = txt }
+
+static const struct irq_stat_info irq_stat_info[IRQ_COUNT_MAX] = {
+ ISS(NMI, "NMI", " Non-maskable interrupts\n"),
#ifdef CONFIG_X86_LOCAL_APIC
- seq_printf(p, "%*s: ", prec, "LOC");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->apic_timer_irqs);
- seq_puts(p, " Local timer interrupts\n");
-
- seq_printf(p, "%*s: ", prec, "SPU");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_spurious_count);
- seq_puts(p, " Spurious interrupts\n");
- seq_printf(p, "%*s: ", prec, "PMI");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->apic_perf_irqs);
- seq_puts(p, " Performance monitoring interrupts\n");
- seq_printf(p, "%*s: ", prec, "IWI");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->apic_irq_work_irqs);
- seq_puts(p, " IRQ work interrupts\n");
- seq_printf(p, "%*s: ", prec, "RTR");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->icr_read_retry_count);
- seq_puts(p, " APIC ICR read retries\n");
- if (x86_platform_ipi_callback) {
- seq_printf(p, "%*s: ", prec, "PLT");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->x86_platform_ipis);
- seq_puts(p, " Platform interrupts\n");
- }
+ ISS(APIC_TIMER, "LOC", " Local timer interrupts\n"),
+ IDS(SPURIOUS, "SPU", " Spurious interrupts\n"),
+ ISS(APIC_PERF, "PMI", " Performance monitoring interrupts\n"),
+ ISS(IRQ_WORK, "IWI", " IRQ work interrupts\n"),
+ IDS(ICR_READ_RETRY, "RTR", " APIC ICR read retries\n"),
+ ISS(X86_PLATFORM_IPI, "PLT", " Platform interrupts\n"),
#endif
#ifdef CONFIG_SMP
- seq_printf(p, "%*s: ", prec, "RES");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_resched_count);
- seq_puts(p, " Rescheduling interrupts\n");
- seq_printf(p, "%*s: ", prec, "CAL");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_call_count);
- seq_puts(p, " Function call interrupts\n");
- seq_printf(p, "%*s: ", prec, "TLB");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_tlb_count);
- seq_puts(p, " TLB shootdowns\n");
+ ISS(RESCHEDULE, "RES", " Rescheduling interrupts\n"),
+ ISS(CALL_FUNCTION, "CAL", " Function call interrupts\n"),
#endif
+ ISS(TLB, "TLB", " TLB shootdowns\n"),
#ifdef CONFIG_X86_THERMAL_VECTOR
- seq_printf(p, "%*s: ", prec, "TRM");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_thermal_count);
- seq_puts(p, " Thermal event interrupts\n");
+ ISS(THERMAL_APIC, "TRM", " Thermal event interrupts\n"),
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
- seq_printf(p, "%*s: ", prec, "THR");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_threshold_count);
- seq_puts(p, " Threshold APIC interrupts\n");
+ ISS(THRESHOLD_APIC, "THR", " Threshold APIC interrupts\n"),
#endif
#ifdef CONFIG_X86_MCE_AMD
- seq_printf(p, "%*s: ", prec, "DFR");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->irq_deferred_error_count);
- seq_puts(p, " Deferred Error APIC interrupts\n");
+ ISS(DEFERRED_ERROR, "DFR", " Deferred Error APIC interrupts\n"),
#endif
#ifdef CONFIG_X86_MCE
- seq_printf(p, "%*s: ", prec, "MCE");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", per_cpu(mce_exception_count, j));
- seq_puts(p, " Machine check exceptions\n");
- seq_printf(p, "%*s: ", prec, "MCP");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", per_cpu(mce_poll_count, j));
- seq_puts(p, " Machine check polls\n");
+ ISS(MCE_EXCEPTION, "MCE", " Machine check exceptions\n"),
+ ISS(MCE_POLL, "MCP", " Machine check polls\n"),
#endif
#ifdef CONFIG_X86_HV_CALLBACK_VECTOR
- if (test_bit(HYPERVISOR_CALLBACK_VECTOR, system_vectors)) {
- seq_printf(p, "%*s: ", prec, "HYP");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->irq_hv_callback_count);
- seq_puts(p, " Hypervisor callback interrupts\n");
- }
+ ITS(HYPERVISOR_CALLBACK, "HYP", " Hypervisor callback interrupts\n"),
#endif
#if IS_ENABLED(CONFIG_HYPERV)
- if (test_bit(HYPERV_REENLIGHTENMENT_VECTOR, system_vectors)) {
- seq_printf(p, "%*s: ", prec, "HRE");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->irq_hv_reenlightenment_count);
- seq_puts(p, " Hyper-V reenlightenment interrupts\n");
- }
- if (test_bit(HYPERV_STIMER0_VECTOR, system_vectors)) {
- seq_printf(p, "%*s: ", prec, "HVS");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->hyperv_stimer0_count);
- seq_puts(p, " Hyper-V stimer0 interrupts\n");
- }
-#endif
- seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count));
-#if defined(CONFIG_X86_IO_APIC)
- seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
+ ITS(HYPERV_REENLIGHTENMENT, "HRE", " Hyper-V reenlightenment interrupts\n"),
+ ITS(HYPERV_STIMER0, "HVS", " Hyper-V stimer0 interrupts\n"),
#endif
#if IS_ENABLED(CONFIG_KVM)
- seq_printf(p, "%*s: ", prec, "PIN");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_ipis);
- seq_puts(p, " Posted-interrupt notification event\n");
-
- seq_printf(p, "%*s: ", prec, "NPI");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->kvm_posted_intr_nested_ipis);
- seq_puts(p, " Nested posted-interrupt event\n");
-
- seq_printf(p, "%*s: ", prec, "PIW");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->kvm_posted_intr_wakeup_ipis);
- seq_puts(p, " Posted-interrupt wakeup event\n");
+ ITS(POSTED_INTR, "PIN", " Posted-interrupt notification event\n"),
+ ITS(POSTED_INTR_NESTED, "NPI", " Nested posted-interrupt event\n"),
+ ITS(POSTED_INTR_WAKEUP, "PIW", " Posted-interrupt wakeup event\n"),
#endif
#ifdef CONFIG_GUEST_PERF_EVENTS
- seq_printf(p, "%*s: ", prec, "VPMI");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->perf_guest_mediated_pmis);
- seq_puts(p, " Perf Guest Mediated PMI\n");
+ ISS(PERF_GUEST_MEDIATED_PMI, "VPMI", " Perf Guest Mediated PMI\n"),
#endif
#ifdef CONFIG_X86_POSTED_MSI
- seq_printf(p, "%*s: ", prec, "PMN");
- for_each_online_cpu(j)
- seq_printf(p, "%10u ",
- irq_stats(j)->posted_msi_notification_count);
- seq_puts(p, " Posted MSI notification event\n");
+ ISS(POSTED_MSI_NOTIFICATION, "PMN", " Posted MSI notification event\n"),
#endif
- return 0;
-}
+ IDS(PIC_APIC_ERROR, "ERR", " PIC/APIC error interrupts\n"),
+#ifdef CONFIG_X86_IO_APIC
+ IDS(IOAPIC_MISROUTED, "MIS", " Misrouted IO/APIC interrupts\n"),
+#endif
+};
-/*
- * /proc/stat helpers
- */
-u64 arch_irq_stat_cpu(unsigned int cpu)
+static DECLARE_BITMAP(irq_stat_count_show, IRQ_COUNT_MAX) __read_mostly;
+
+static int __init irq_init_stats(void)
{
- u64 sum = irq_stats(cpu)->__nmi_count;
+ const struct irq_stat_info *info = irq_stat_info;
+
+ for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++, info++) {
+ if (!info->skip_vector || (info->skip_vector != DEFAULT_SUPPRESSED_VECTOR &&
+ test_bit(info->skip_vector, system_vectors)))
+ set_bit(i, irq_stat_count_show);
+ }
#ifdef CONFIG_X86_LOCAL_APIC
- sum += irq_stats(cpu)->apic_timer_irqs;
- sum += irq_stats(cpu)->irq_spurious_count;
- sum += irq_stats(cpu)->apic_perf_irqs;
- sum += irq_stats(cpu)->apic_irq_work_irqs;
- sum += irq_stats(cpu)->icr_read_retry_count;
- if (x86_platform_ipi_callback)
- sum += irq_stats(cpu)->x86_platform_ipis;
-#endif
-#ifdef CONFIG_SMP
- sum += irq_stats(cpu)->irq_resched_count;
- sum += irq_stats(cpu)->irq_call_count;
-#endif
-#ifdef CONFIG_X86_THERMAL_VECTOR
- sum += irq_stats(cpu)->irq_thermal_count;
-#endif
-#ifdef CONFIG_X86_MCE_THRESHOLD
- sum += irq_stats(cpu)->irq_threshold_count;
+ if (!x86_platform_ipi_callback)
+ clear_bit(IRQ_COUNT_X86_PLATFORM_IPI, irq_stat_count_show);
#endif
-#ifdef CONFIG_X86_HV_CALLBACK_VECTOR
- sum += irq_stats(cpu)->irq_hv_callback_count;
-#endif
-#if IS_ENABLED(CONFIG_HYPERV)
- sum += irq_stats(cpu)->irq_hv_reenlightenment_count;
- sum += irq_stats(cpu)->hyperv_stimer0_count;
+
+#ifdef CONFIG_X86_POSTED_MSI
+ if (!posted_msi_enabled())
+ clear_bit(IRQ_COUNT_POSTED_MSI_NOTIFICATION, irq_stat_count_show);
#endif
-#ifdef CONFIG_X86_MCE
- sum += per_cpu(mce_exception_count, cpu);
- sum += per_cpu(mce_poll_count, cpu);
+
+#ifdef CONFIG_X86_MCE_AMD
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
+ boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
+ clear_bit(IRQ_COUNT_DEFERRED_ERROR, irq_stat_count_show);
#endif
- return sum;
+ return 0;
+}
+late_initcall(irq_init_stats);
+
+/*
+ * Used for default disabled counters to increment the stats and to enable the
+ * entry for /proc/interrupts output.
+ */
+void irq_stat_inc_and_enable(enum irq_stat_counts which)
+{
+ this_cpu_inc(irq_stat.counts[which]);
+ set_bit(which, irq_stat_count_show);
+}
+
+#ifdef CONFIG_PROC_FS
+/*
+ * /proc/interrupts printing for arch specific interrupts
+ */
+int arch_show_interrupts(struct seq_file *p, int prec)
+{
+ const struct irq_stat_info *info = irq_stat_info;
+
+ for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++, info++) {
+ if (!test_bit(i, irq_stat_count_show))
+ continue;
+
+ seq_printf(p, "%*s:", prec, info->symbol);
+ irq_proc_emit_counts(p, &irq_stat.counts[i]);
+ seq_puts(p, info->text);
+ }
+ return 0;
}
-u64 arch_irq_stat(void)
+/*
+ * /proc/stat helpers
+ */
+u64 arch_irq_stat_cpu(unsigned int cpu)
{
- u64 sum = atomic_read(&irq_err_count);
+ irq_cpustat_t *p = per_cpu_ptr(&irq_stat, cpu);
+ u64 sum = 0;
+
+ for (unsigned int i = 0; i < ARRAY_SIZE(irq_stat_info); i++)
+ sum += p->counts[i];
return sum;
}
+#endif /* CONFIG_PROC_FS */
static __always_inline void handle_irq(struct irq_desc *desc,
struct pt_regs *regs)
@@ -338,7 +285,7 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
#ifdef CONFIG_X86_LOCAL_APIC
/* Function pointer for generic interrupt vector handling */
-void (*x86_platform_ipi_callback)(void) = NULL;
+void (*x86_platform_ipi_callback)(void) __ro_after_init = NULL;
/*
* Handler for X86_PLATFORM_IPI_VECTOR.
*/
@@ -348,7 +295,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_x86_platform_ipi)
apic_eoi();
trace_x86_platform_ipi_entry(X86_PLATFORM_IPI_VECTOR);
- inc_irq_stat(x86_platform_ipis);
+ inc_irq_stat(X86_PLATFORM_IPI);
if (x86_platform_ipi_callback)
x86_platform_ipi_callback();
trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR);
@@ -363,7 +310,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_x86_platform_ipi)
DEFINE_IDTENTRY_SYSVEC(sysvec_perf_guest_mediated_pmi_handler)
{
apic_eoi();
- inc_irq_stat(perf_guest_mediated_pmis);
+ inc_irq_stat(PERF_GUEST_MEDIATED_PMI);
perf_guest_handle_mediated_pmi();
}
#endif
@@ -389,7 +336,7 @@ EXPORT_SYMBOL_FOR_KVM(kvm_set_posted_intr_wakeup_handler);
DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_ipi)
{
apic_eoi();
- inc_irq_stat(kvm_posted_intr_ipis);
+ inc_irq_stat(POSTED_INTR);
}
/*
@@ -398,7 +345,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_ipi)
DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_posted_intr_wakeup_ipi)
{
apic_eoi();
- inc_irq_stat(kvm_posted_intr_wakeup_ipis);
+ inc_irq_stat(POSTED_INTR_WAKEUP);
kvm_posted_intr_wakeup_handler();
}
@@ -408,7 +355,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_posted_intr_wakeup_ipi)
DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_nested_ipi)
{
apic_eoi();
- inc_irq_stat(kvm_posted_intr_nested_ipis);
+ inc_irq_stat(POSTED_INTR_NESTED);
}
#endif
@@ -482,7 +429,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
/* Mark the handler active for intel_ack_posted_msi_irq() */
__this_cpu_write(posted_msi_handler_active, true);
- inc_irq_stat(posted_msi_notification_count);
+ inc_irq_stat(POSTED_MSI_NOTIFICATION);
irq_enter();
/*
@@ -577,7 +524,7 @@ static void smp_thermal_vector(void)
DEFINE_IDTENTRY_SYSVEC(sysvec_thermal)
{
trace_thermal_apic_entry(THERMAL_APIC_VECTOR);
- inc_irq_stat(irq_thermal_count);
+ inc_irq_stat(THERMAL_APIC);
smp_thermal_vector();
trace_thermal_apic_exit(THERMAL_APIC_VECTOR);
apic_eoi();
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index b0a24deab4a1..308c62411ff4 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -18,7 +18,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_irq_work)
{
apic_eoi();
trace_irq_work_entry(IRQ_WORK_VECTOR);
- inc_irq_stat(apic_irq_work_irqs);
+ inc_irq_stat(IRQ_WORK);
irq_work_run();
trace_irq_work_exit(IRQ_WORK_VECTOR);
}
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 29226d112029..d1f3f320168c 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -304,7 +304,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
apic_eoi();
- inc_irq_stat(irq_hv_callback_count);
+ inc_irq_stat(HYPERVISOR_CALLBACK);
if (__this_cpu_read(async_pf_enabled)) {
token = __this_cpu_read(apf_reason.token);
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 3d239ed12744..a7f56e7638d8 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -576,7 +576,7 @@ DEFINE_IDTENTRY_RAW(exc_nmi)
irq_state = irqentry_nmi_enter(regs);
- inc_irq_stat(__nmi_count);
+ inc_irq_stat(NMI);
if (IS_ENABLED(CONFIG_NMI_CHECK_CPU) && ignore_nmis) {
WRITE_ONCE(nsp->idt_ignored, nsp->idt_ignored + 1);
@@ -725,7 +725,7 @@ DEFINE_FREDENTRY_NMI(exc_nmi)
irq_state = irqentry_nmi_enter(regs);
- inc_irq_stat(__nmi_count);
+ inc_irq_stat(NMI);
default_do_nmi(regs);
irqentry_nmi_exit(regs, irq_state);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index cbf95fe2b207..985103cab16c 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -250,7 +250,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_reschedule_ipi)
{
apic_eoi();
trace_reschedule_entry(RESCHEDULE_VECTOR);
- inc_irq_stat(irq_resched_count);
+ inc_irq_stat(RESCHEDULE);
scheduler_ipi();
trace_reschedule_exit(RESCHEDULE_VECTOR);
}
@@ -259,7 +259,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_call_function)
{
apic_eoi();
trace_call_function_entry(CALL_FUNCTION_VECTOR);
- inc_irq_stat(irq_call_count);
+ inc_irq_stat(CALL_FUNCTION);
generic_smp_call_function_interrupt();
trace_call_function_exit(CALL_FUNCTION_VECTOR);
}
@@ -268,7 +268,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_call_function_single)
{
apic_eoi();
trace_call_function_single_entry(CALL_FUNCTION_SINGLE_VECTOR);
- inc_irq_stat(irq_call_count);
+ inc_irq_stat(CALL_FUNCTION);
generic_smp_call_function_single_interrupt();
trace_call_function_single_exit(CALL_FUNCTION_SINGLE_VECTOR);
}
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index af43d177087e..4c045f844492 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1123,7 +1123,7 @@ static void flush_tlb_func(void *info)
VM_WARN_ON(!irqs_disabled());
if (!local) {
- inc_irq_stat(irq_tlb_count);
+ inc_irq_stat(TLB);
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
}
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index 2f9fa27e5a3c..6c4eac4ff13a 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -125,7 +125,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_xen_hvm_callback)
if (xen_percpu_upcall)
apic_eoi();
- inc_irq_stat(irq_hv_callback_count);
+ inc_irq_stat(HYPERVISOR_CALLBACK);
xen_evtchn_do_upcall();
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index ed2d7a3756ce..ea19428d5da0 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -728,7 +728,7 @@ static void __xen_pv_evtchn_do_upcall(struct pt_regs *regs)
{
struct pt_regs *old_regs = set_irq_regs(regs);
- inc_irq_stat(irq_hv_callback_count);
+ inc_irq_stat(HYPERVISOR_CALLBACK);
xen_evtchn_do_upcall();
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 05f92c812ac8..05ee0d3b0874 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -23,7 +23,7 @@ static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id);
*/
static irqreturn_t xen_reschedule_interrupt(int irq, void *dev_id)
{
- inc_irq_stat(irq_resched_count);
+ inc_irq_stat(RESCHEDULE);
scheduler_ipi();
return IRQ_HANDLED;
@@ -254,7 +254,7 @@ void xen_send_IPI_allbutself(int vector)
static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id)
{
generic_smp_call_function_interrupt();
- inc_irq_stat(irq_call_count);
+ inc_irq_stat(CALL_FUNCTION);
return IRQ_HANDLED;
}
@@ -262,7 +262,7 @@ static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id)
static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id)
{
generic_smp_call_function_single_interrupt();
- inc_irq_stat(irq_call_count);
+ inc_irq_stat(CALL_FUNCTION);
return IRQ_HANDLED;
}
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index db9b8e222b38..c2812f8177bb 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -400,7 +400,7 @@ static void xen_pv_stop_other_cpus(int wait)
static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id)
{
irq_work_run();
- inc_irq_stat(apic_irq_work_irqs);
+ inc_irq_stat(IRQ_WORK);
return IRQ_HANDLED;
}
diff --git a/arch/xtensa/kernel/irq.c b/arch/xtensa/kernel/irq.c
index b1e410f6b5ab..6f01f530868b 100644
--- a/arch/xtensa/kernel/irq.c
+++ b/arch/xtensa/kernel/irq.c
@@ -59,7 +59,7 @@ int arch_show_interrupts(struct seq_file *p, int prec)
seq_printf(p, "%*s:", prec, "NMI");
for_each_online_cpu(cpu)
seq_printf(p, " %10lu", per_cpu(nmi_count, cpu));
- seq_puts(p, " Non-maskable interrupts\n");
+ seq_puts(p, " Non-maskable interrupts\n");
#endif
return 0;
}
diff --git a/fs/proc/Makefile b/fs/proc/Makefile
index 7b4db9c56e6a..8bc615ff84e5 100644
--- a/fs/proc/Makefile
+++ b/fs/proc/Makefile
@@ -16,7 +16,9 @@ proc-y += cmdline.o
proc-y += consoles.o
proc-y += cpuinfo.o
proc-y += devices.o
-proc-y += interrupts.o
+ifneq ($(CONFIG_GENERIC_IRQ_SHOW),y)
+proc-y += interrupts.o
+endif
proc-y += loadavg.o
proc-y += meminfo.o
proc-y += stat.o
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 8b444e862319..20c3df9a9b80 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -18,9 +18,6 @@
#ifndef arch_irq_stat_cpu
#define arch_irq_stat_cpu(cpu) 0
#endif
-#ifndef arch_irq_stat
-#define arch_irq_stat() 0
-#endif
u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
{
@@ -122,7 +119,6 @@ static int show_stat(struct seq_file *p, void *v)
sum_softirq += softirq_stat;
}
}
- sum += arch_irq_stat();
seq_put_decimal_ull(p, "cpu ", nsec_to_clock_t(user));
seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice));
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 6cd26ffb0505..3bf969ad8fe0 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -864,6 +864,7 @@ static inline void init_irq_proc(void)
struct seq_file;
int show_interrupts(struct seq_file *p, void *v);
int arch_show_interrupts(struct seq_file *p, int prec);
+void irq_proc_emit_counts(struct seq_file *p, unsigned int __percpu *cnts);
extern int early_irq_init(void);
extern int arch_probe_nr_irqs(void);
diff --git a/include/linux/irq.h b/include/linux/irq.h
index efa514ee562f..f485369b1b4f 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -103,6 +103,7 @@ enum {
IRQ_DISABLE_UNLAZY = (1 << 19),
IRQ_HIDDEN = (1 << 20),
IRQ_NO_DEBUG = (1 << 21),
+ IRQ_RESERVED = (1 << 22),
};
#define IRQF_MODIFY_MASK \
diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index dae9a9b93665..8080db17c1b1 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -52,8 +52,8 @@ struct irq_redirect {
* @depth: disable-depth, for nested irq_disable() calls
* @wake_depth: enable depth, for multiple irq_set_irq_wake() callers
* @tot_count: stats field for non-percpu irqs
- * @irq_count: stats field to detect stalled irqs
* @last_unhandled: aging timer for unhandled count
+ * @irq_count: stats field to detect stalled irqs
* @irqs_unhandled: stats field for spurious unhandled interrupts
* @threads_handled: stats field for deferred spurious detection of threaded handlers
* @threads_handled_last: comparator field for deferred spurious detection of threaded handlers
@@ -70,6 +70,7 @@ struct irq_redirect {
* IRQF_NO_SUSPEND set
* @force_resume_depth: number of irqactions on a irq descriptor with
* IRQF_FORCE_RESUME set
+ * @refcnt: Reference count mainly for /proc/interrupts
* @rcu: rcu head for delayed free
* @kobj: kobject used to represent this struct in sysfs
* @request_mutex: mutex to protect request/free before locking desc->lock
@@ -87,9 +88,9 @@ struct irq_desc {
unsigned int core_internal_state__do_not_mess_with_it;
unsigned int depth; /* nested irq disables */
unsigned int wake_depth; /* nested wake enables */
- unsigned int tot_count;
- unsigned int irq_count; /* For detecting broken IRQs */
+ unsigned long tot_count;
unsigned long last_unhandled; /* Aging timer for unhandled count */
+ unsigned int irq_count; /* For detecting broken IRQs */
unsigned int irqs_unhandled;
atomic_t threads_handled;
int threads_handled_last;
@@ -119,6 +120,7 @@ struct irq_desc {
struct dentry *debugfs_file;
const char *dev_name;
#endif
+ rcuref_t refcnt;
#ifdef CONFIG_SPARSE_IRQ
struct rcu_head rcu;
struct kobject kobj;
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6c9b1dc4e7d4..934777925f5c 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,9 +47,11 @@ int irq_set_chip(unsigned int irq, const struct irq_chip *chip)
scoped_irqdesc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
ret = 0;
}
- /* For !CONFIG_SPARSE_IRQ make the irq show up in allocated_irqs. */
- if (!ret)
+ if (!ret) {
+ /* For !CONFIG_SPARSE_IRQ make the irq show up in allocated_irqs. */
irq_mark_irq(irq);
+ irq_proc_update_chip(chip);
+ }
return ret;
}
EXPORT_SYMBOL(irq_set_chip);
@@ -1007,6 +1009,7 @@ __irq_do_set_handler(struct irq_desc *desc, irq_flow_handler_t handle,
WARN_ON(irq_chip_pm_get(irq_desc_get_irq_data(desc)));
irq_activate_and_startup(desc, IRQ_RESEND);
}
+ irq_proc_update_valid(desc);
}
void __irq_set_handler(unsigned int irq, irq_flow_handler_t handle, int is_chained,
@@ -1067,6 +1070,7 @@ void irq_modify_status(unsigned int irq, unsigned long clr, unsigned long set)
trigger = tmp;
irqd_set(&desc->irq_data, trigger);
+ irq_proc_update_valid(desc);
}
}
EXPORT_SYMBOL_GPL(irq_modify_status);
diff --git a/kernel/irq/debugfs.h b/kernel/irq/debugfs.h
new file mode 100644
index 000000000000..8a9360d5fefb
--- /dev/null
+++ b/kernel/irq/debugfs.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _KERNEL_IRQ_DEBUGFS_H
+#define _KERNEL_IRQ_DEBUGFS_H
+
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+#include <linux/debugfs.h>
+
+struct irq_bit_descr {
+ unsigned int mask;
+ char *name;
+};
+
+#define BIT_MASK_DESCR(m) { .mask = m, .name = #m }
+
+void irq_debug_show_bits(struct seq_file *m, int ind, unsigned int state,
+ const struct irq_bit_descr *sd, int size);
+
+void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc);
+static inline void irq_remove_debugfs_entry(struct irq_desc *desc)
+{
+ debugfs_remove(desc->debugfs_file);
+ kfree(desc->dev_name);
+}
+void irq_debugfs_copy_devname(int irq, struct device *dev);
+# ifdef CONFIG_IRQ_DOMAIN
+void irq_domain_debugfs_init(struct dentry *root);
+# else
+static inline void irq_domain_debugfs_init(struct dentry *root)
+{
+}
+# endif
+#else /* CONFIG_GENERIC_IRQ_DEBUGFS */
+static inline void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *d)
+{
+}
+static inline void irq_remove_debugfs_entry(struct irq_desc *d)
+{
+}
+static inline void irq_debugfs_copy_devname(int irq, struct device *dev)
+{
+}
+#endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
+
+#endif
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 9412e57056f5..f9c099d45a64 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -9,8 +9,12 @@
#include <linux/irqdesc.h>
#include <linux/kernel_stat.h>
#include <linux/pm_runtime.h>
+#include <linux/rcuref.h>
#include <linux/sched/clock.h>
+#include "debugfs.h"
+#include "proc.h"
+
#ifdef CONFIG_SPARSE_IRQ
# define MAX_SPARSE_IRQS INT_MAX
#else
@@ -21,6 +25,7 @@
extern bool noirqdebug;
extern int irq_poll_cpu;
+extern unsigned int total_nr_irqs;
extern struct irqaction chained_action;
@@ -100,9 +105,23 @@ extern void unmask_irq(struct irq_desc *desc);
extern void unmask_threaded_irq(struct irq_desc *desc);
#ifdef CONFIG_SPARSE_IRQ
-static inline void irq_mark_irq(unsigned int irq) { }
+static __always_inline void irq_mark_irq(unsigned int irq) { }
+void irq_desc_free_rcu(struct irq_desc *desc);
+
+static __always_inline bool irq_desc_get_ref(struct irq_desc *desc)
+{
+ return rcuref_get(&desc->refcnt);
+}
+
+static __always_inline void irq_desc_put_ref(struct irq_desc *desc)
+{
+ if (rcuref_put(&desc->refcnt))
+ irq_desc_free_rcu(desc);
+}
#else
extern void irq_mark_irq(unsigned int irq);
+static __always_inline bool irq_desc_get_ref(struct irq_desc *desc) { return true; }
+static __always_inline void irq_desc_put_ref(struct irq_desc *desc) { }
#endif
irqreturn_t __handle_irq_event_percpu(struct irq_desc *desc);
@@ -122,6 +141,7 @@ extern void register_irq_proc(unsigned int irq, struct irq_desc *desc);
extern void unregister_irq_proc(unsigned int irq, struct irq_desc *desc);
extern void register_handler_proc(unsigned int irq, struct irqaction *action);
extern void unregister_handler_proc(unsigned int irq, struct irqaction *action);
+void irq_proc_update_valid(struct irq_desc *desc);
#else
static inline void register_irq_proc(unsigned int irq, struct irq_desc *desc) { }
static inline void unregister_irq_proc(unsigned int irq, struct irq_desc *desc) { }
@@ -129,8 +149,11 @@ static inline void register_handler_proc(unsigned int irq,
struct irqaction *action) { }
static inline void unregister_handler_proc(unsigned int irq,
struct irqaction *action) { }
+static inline void irq_proc_update_valid(struct irq_desc *desc) { }
#endif
+struct irq_desc *irq_find_desc_at_or_after(unsigned int offset);
+
extern bool irq_can_set_affinity_usr(unsigned int irq);
extern int irq_do_set_affinity(struct irq_data *data,
@@ -372,42 +395,3 @@ static inline struct irq_data *irqd_get_parent_data(struct irq_data *irqd)
return NULL;
#endif
}
-
-#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
-#include <linux/debugfs.h>
-
-struct irq_bit_descr {
- unsigned int mask;
- char *name;
-};
-
-#define BIT_MASK_DESCR(m) { .mask = m, .name = #m }
-
-void irq_debug_show_bits(struct seq_file *m, int ind, unsigned int state,
- const struct irq_bit_descr *sd, int size);
-
-void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc);
-static inline void irq_remove_debugfs_entry(struct irq_desc *desc)
-{
- debugfs_remove(desc->debugfs_file);
- kfree(desc->dev_name);
-}
-void irq_debugfs_copy_devname(int irq, struct device *dev);
-# ifdef CONFIG_IRQ_DOMAIN
-void irq_domain_debugfs_init(struct dentry *root);
-# else
-static inline void irq_domain_debugfs_init(struct dentry *root)
-{
-}
-# endif
-#else /* CONFIG_GENERIC_IRQ_DEBUGFS */
-static inline void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *d)
-{
-}
-static inline void irq_remove_debugfs_entry(struct irq_desc *d)
-{
-}
-static inline void irq_debugfs_copy_devname(int irq, struct device *dev)
-{
-}
-#endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 7173b8b634f2..80ef4e27dcf4 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -137,17 +137,18 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
desc->tot_count = 0;
desc->name = NULL;
desc->owner = owner;
+ rcuref_init(&desc->refcnt, 1);
desc_smp_init(desc, node, affinity);
}
-static unsigned int nr_irqs = NR_IRQS;
+unsigned int total_nr_irqs __read_mostly = NR_IRQS;
/**
* irq_get_nr_irqs() - Number of interrupts supported by the system.
*/
unsigned int irq_get_nr_irqs(void)
{
- return nr_irqs;
+ return total_nr_irqs;
}
EXPORT_SYMBOL_GPL(irq_get_nr_irqs);
@@ -157,13 +158,12 @@ EXPORT_SYMBOL_GPL(irq_get_nr_irqs);
*
* Return: @nr.
*/
-unsigned int irq_set_nr_irqs(unsigned int nr)
+unsigned int __init irq_set_nr_irqs(unsigned int nr)
{
- nr_irqs = nr;
-
+ total_nr_irqs = nr;
+ irq_proc_calc_prec();
return nr;
}
-EXPORT_SYMBOL_GPL(irq_set_nr_irqs);
static DEFINE_MUTEX(sparse_irq_lock);
static struct maple_tree sparse_irqs = MTREE_INIT_EXT(sparse_irqs,
@@ -181,15 +181,12 @@ static int irq_find_free_area(unsigned int from, unsigned int cnt)
return mas.index;
}
-static unsigned int irq_find_at_or_after(unsigned int offset)
+struct irq_desc *irq_find_desc_at_or_after(unsigned int offset)
{
unsigned long index = offset;
- struct irq_desc *desc;
-
- guard(rcu)();
- desc = mt_find(&sparse_irqs, &index, nr_irqs);
- return desc ? irq_desc_get_irq(desc) : nr_irqs;
+ lockdep_assert_in_rcu_read_lock();
+ return mt_find(&sparse_irqs, &index, total_nr_irqs);
}
static void irq_insert_desc(unsigned int irq, struct irq_desc *desc)
@@ -466,6 +463,17 @@ static void delayed_free_desc(struct rcu_head *rhp)
kobject_put(&desc->kobj);
}
+void irq_desc_free_rcu(struct irq_desc *desc)
+{
+ /*
+ * We free the descriptor, masks and stat fields via RCU. That
+ * allows demultiplex interrupts to do rcu based management of
+ * the child interrupts.
+ * This also allows us to use rcu in kstat_irqs_usr().
+ */
+ call_rcu(&desc->rcu, delayed_free_desc);
+}
+
static void free_desc(unsigned int irq)
{
struct irq_desc *desc = irq_to_desc(irq);
@@ -484,14 +492,7 @@ static void free_desc(unsigned int irq)
*/
irq_sysfs_del(desc);
delete_irq_desc(irq);
-
- /*
- * We free the descriptor, masks and stat fields via RCU. That
- * allows demultiplex interrupts to do rcu based management of
- * the child interrupts.
- * This also allows us to use rcu in kstat_irqs_usr().
- */
- call_rcu(&desc->rcu, delayed_free_desc);
+ irq_desc_put_ref(desc);
}
static int alloc_descs(unsigned int start, unsigned int cnt, int node,
@@ -543,7 +544,8 @@ static bool irq_expand_nr_irqs(unsigned int nr)
{
if (nr > MAX_SPARSE_IRQS)
return false;
- nr_irqs = nr;
+ total_nr_irqs = nr;
+ irq_proc_calc_prec();
return true;
}
@@ -557,21 +559,22 @@ int __init early_irq_init(void)
/* Let arch update nr_irqs and return the nr of preallocated irqs */
initcnt = arch_probe_nr_irqs();
printk(KERN_INFO "NR_IRQS: %d, nr_irqs: %d, preallocated irqs: %d\n",
- NR_IRQS, nr_irqs, initcnt);
+ NR_IRQS, total_nr_irqs, initcnt);
- if (WARN_ON(nr_irqs > MAX_SPARSE_IRQS))
- nr_irqs = MAX_SPARSE_IRQS;
+ if (WARN_ON(total_nr_irqs > MAX_SPARSE_IRQS))
+ total_nr_irqs = MAX_SPARSE_IRQS;
if (WARN_ON(initcnt > MAX_SPARSE_IRQS))
initcnt = MAX_SPARSE_IRQS;
- if (initcnt > nr_irqs)
- nr_irqs = initcnt;
+ if (initcnt > total_nr_irqs)
+ total_nr_irqs = initcnt;
for (i = 0; i < initcnt; i++) {
desc = alloc_desc(i, node, 0, NULL, NULL);
irq_insert_desc(i, desc);
}
+ irq_proc_calc_prec();
return arch_early_irq_init();
}
@@ -592,7 +595,7 @@ int __init early_irq_init(void)
init_irq_default_affinity();
- printk(KERN_INFO "NR_IRQS: %d\n", NR_IRQS);
+ pr_info("NR_IRQS: %d\n", NR_IRQS);
count = ARRAY_SIZE(irq_desc);
@@ -602,6 +605,7 @@ int __init early_irq_init(void)
goto __free_desc_res;
}
+ irq_proc_calc_prec();
return arch_early_irq_init();
__free_desc_res:
@@ -862,7 +866,7 @@ void irq_free_descs(unsigned int from, unsigned int cnt)
{
int i;
- if (from >= nr_irqs || (from + cnt) > nr_irqs)
+ if (from >= total_nr_irqs || (from + cnt) > total_nr_irqs)
return;
guard(mutex)(&sparse_irq_lock);
@@ -911,7 +915,7 @@ int __ref __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int no
if (irq >=0 && start != irq)
return -EEXIST;
- if (start + cnt > nr_irqs) {
+ if (start + cnt > total_nr_irqs) {
if (!irq_expand_nr_irqs(start + cnt))
return -ENOMEM;
}
@@ -923,11 +927,15 @@ EXPORT_SYMBOL_GPL(__irq_alloc_descs);
* irq_get_next_irq - get next allocated irq number
* @offset: where to start the search
*
- * Returns next irq number after offset or nr_irqs if none is found.
+ * Returns next irq number after offset or total_nr_irqs if none is found.
*/
unsigned int irq_get_next_irq(unsigned int offset)
{
- return irq_find_at_or_after(offset);
+ struct irq_desc *desc;
+
+ guard(rcu)();
+ desc = irq_find_desc_at_or_after(offset);
+ return desc ? irq_desc_get_irq(desc) : total_nr_irqs;
}
struct irq_desc *__irq_get_desc_lock(unsigned int irq, unsigned long *flags, bool bus,
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cc93abf009e8..f15c9f1223bb 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -20,6 +20,8 @@
#include <linux/smp.h>
#include <linux/fs.h>
+#include "proc.h"
+
static LIST_HEAD(irq_domain_list);
static DEFINE_MUTEX(irq_domain_mutex);
@@ -1532,6 +1534,7 @@ int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
irq_data->chip = (struct irq_chip *)(chip ? chip : &no_irq_chip);
irq_data->chip_data = chip_data;
+ irq_proc_update_chip(chip);
return 0;
}
EXPORT_SYMBOL_GPL(irq_domain_set_hwirq_and_chip);
@@ -2081,7 +2084,7 @@ static void irq_domain_free_one_irq(struct irq_domain *domain, unsigned int virq
#endif /* CONFIG_IRQ_DOMAIN_HIERARCHY */
#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
-#include "internals.h"
+#include "debugfs.h"
static struct dentry *domain_dir;
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 2e8072437826..7eb07e3bdb4c 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1802,6 +1802,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
__enable_irq(desc);
}
+ irq_proc_update_valid(desc);
raw_spin_unlock_irqrestore(&desc->lock, flags);
chip_bus_sync_unlock(desc);
mutex_unlock(&desc->request_mutex);
@@ -1906,6 +1907,7 @@ static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id)
desc->affinity_hint = NULL;
#endif
+ irq_proc_update_valid(desc);
raw_spin_unlock_irqrestore(&desc->lock, flags);
/*
* Drop bus_lock here so the changes which were done in the chip
@@ -2026,24 +2028,32 @@ const void *free_irq(unsigned int irq, void *dev_id)
}
EXPORT_SYMBOL(free_irq);
-/* This function must be called with desc->lock held */
static const void *__cleanup_nmi(unsigned int irq, struct irq_desc *desc)
{
+ struct irqaction *action = NULL;
const char *devname = NULL;
- desc->istate &= ~IRQS_NMI;
+ scoped_guard(raw_spinlock_irqsave, &desc->lock) {
+ irq_nmi_teardown(desc);
- if (!WARN_ON(desc->action == NULL)) {
- irq_pm_remove_action(desc, desc->action);
- devname = desc->action->name;
- unregister_handler_proc(irq, desc->action);
+ desc->istate &= ~IRQS_NMI;
- kfree(desc->action);
+ if (!WARN_ON(desc->action == NULL)) {
+ action = desc->action;
+ irq_pm_remove_action(desc, action);
+ devname = action->name;
+ }
desc->action = NULL;
+
+ irq_settings_clr_disable_unlazy(desc);
+ irq_shutdown_and_deactivate(desc);
}
- irq_settings_clr_disable_unlazy(desc);
- irq_shutdown_and_deactivate(desc);
+ irq_proc_update_valid(desc);
+
+ if (action)
+ unregister_handler_proc(irq, action);
+ kfree(action);
irq_release_resources(desc);
@@ -2067,8 +2077,6 @@ const void *free_nmi(unsigned int irq, void *dev_id)
if (WARN_ON(desc->depth == 0))
disable_nmi_nosync(irq);
- guard(raw_spinlock_irqsave)(&desc->lock);
- irq_nmi_teardown(desc);
return __cleanup_nmi(irq, desc);
}
@@ -2318,13 +2326,14 @@ int request_nmi(unsigned int irq, irq_handler_t handler,
/* Setup NMI state */
desc->istate |= IRQS_NMI;
retval = irq_nmi_setup(desc);
- if (retval) {
- __cleanup_nmi(irq, desc);
- return -EINVAL;
- }
- return 0;
}
+ if (retval) {
+ __cleanup_nmi(irq, desc);
+ return -EINVAL;
+ }
+ return 0;
+
err_irq_setup:
irq_chip_pm_put(&desc->irq_data);
err_out:
@@ -2428,8 +2437,10 @@ static struct irqaction *__free_percpu_irq(unsigned int irq, void __percpu *dev_
*action_ptr = action->next;
/* Demote from NMI if we killed the last action */
- if (!desc->action)
+ if (!desc->action) {
desc->istate &= ~IRQS_NMI;
+ irq_proc_update_valid(desc);
+ }
}
unregister_handler_proc(irq, action);
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index b0999a4f1f68..1b835725f7b1 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -10,6 +10,7 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/interrupt.h>
+#include <linux/kernel.h>
#include <linux/kernel_stat.h>
#include <linux/mutex.h>
#include <linux/string.h>
@@ -326,7 +327,7 @@ void register_handler_proc(unsigned int irq, struct irqaction *action)
#undef MAX_NAMELEN
-#define MAX_NAMELEN 10
+#define MAX_NAMELEN 11
void register_irq_proc(unsigned int irq, struct irq_desc *desc)
{
@@ -348,7 +349,7 @@ void register_irq_proc(unsigned int irq, struct irq_desc *desc)
return;
/* create /proc/irq/1234 */
- sprintf(name, "%u", irq);
+ snprintf(name, MAX_NAMELEN, "%u", irq);
desc->dir = proc_mkdir(name, root_irq_dir);
if (!desc->dir)
return;
@@ -401,7 +402,7 @@ void unregister_irq_proc(unsigned int irq, struct irq_desc *desc)
#endif
remove_proc_entry("spurious", desc->dir);
- sprintf(name, "%u", irq);
+ snprintf(name, MAX_NAMELEN, "%u", irq);
remove_proc_entry(name, root_irq_dir);
}
@@ -439,77 +440,159 @@ void init_irq_proc(void)
register_irq_proc(irq, desc);
}
+void irq_proc_update_valid(struct irq_desc *desc)
+{
+ u32 set = _IRQ_PROC_VALID;
+
+ if (irq_settings_is_hidden(desc) || irq_desc_is_chained(desc) || !desc->action)
+ set = 0;
+
+ irq_settings_update_proc_valid(desc, set);
+}
+
#ifdef CONFIG_GENERIC_IRQ_SHOW
+#define ARCH_PROC_IRQDESC ((void *)0x00001111)
+
int __weak arch_show_interrupts(struct seq_file *p, int prec)
{
return 0;
}
+static DEFINE_RAW_SPINLOCK(irq_proc_constraints_lock);
+
+static struct irq_proc_constraints {
+ bool print_header;
+ unsigned int num_prec;
+ unsigned int chip_width;
+} irq_proc_constraints __read_mostly = {
+ .num_prec = 4,
+ .chip_width = 8,
+};
+
#ifndef ACTUAL_NR_IRQS
-# define ACTUAL_NR_IRQS irq_get_nr_irqs()
+# define ACTUAL_NR_IRQS total_nr_irqs
#endif
-int show_interrupts(struct seq_file *p, void *v)
+void irq_proc_calc_prec(void)
{
- const unsigned int nr_irqs = irq_get_nr_irqs();
- static int prec;
+ unsigned int prec, n;
- int i = *(loff_t *) v, j;
- struct irqaction *action;
- struct irq_desc *desc;
+ for (prec = 4, n = 10000; prec < 10 && n <= total_nr_irqs; ++prec)
+ n *= 10;
+
+ guard(raw_spinlock_irqsave)(&irq_proc_constraints_lock);
+ if (prec > irq_proc_constraints.num_prec)
+ WRITE_ONCE(irq_proc_constraints.num_prec, prec);
+}
+
+void irq_proc_update_chip(const struct irq_chip *chip)
+{
+ unsigned int len = chip && chip->name ? strlen(chip->name) : 0;
+
+ if (!len || len <= READ_ONCE(irq_proc_constraints.chip_width))
+ return;
+
+ /* Can be invoked from interrupt disabled contexts */
+ guard(raw_spinlock_irqsave)(&irq_proc_constraints_lock);
+ if (len > irq_proc_constraints.chip_width)
+ WRITE_ONCE(irq_proc_constraints.chip_width, len);
+}
+
+/* Same as seq_put_decimal_ull_width(p, " ", cnt, 10) */
+#define ZSTR1 "          0"
+#define ZSTR1_LEN (sizeof(ZSTR1) - 1)
+#define ZSTR16 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 \
+ ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1 ZSTR1
+#define ZSTR256 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 \
+ ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16 ZSTR16
+
+static inline void irq_proc_emit_zero_counts(struct seq_file *p, unsigned int zeros)
+{
+ if (!zeros)
+ return;
+
+ for (unsigned int n = min(zeros, 256); n; zeros -= n, n = min(zeros, 256))
+ seq_write(p, ZSTR256, n * ZSTR1_LEN);
+}
+
+static inline unsigned int irq_proc_emit_count(struct seq_file *p, unsigned int cnt,
+ unsigned int zeros)
+{
+ if (!cnt)
+ return zeros + 1;
- if (i > ACTUAL_NR_IRQS)
- return 0;
+ irq_proc_emit_zero_counts(p, zeros);
+ seq_put_decimal_ull_width(p, " ", cnt, 10);
+ return 0;
+}
- if (i == ACTUAL_NR_IRQS)
- return arch_show_interrupts(p, prec);
+void irq_proc_emit_counts(struct seq_file *p, unsigned int __percpu *cnts)
+{
+ unsigned int cpu, zeros = 0;
- /* print header and calculate the width of the first column */
- if (i == 0) {
- for (prec = 3, j = 1000; prec < 10 && j <= nr_irqs; ++prec)
- j *= 10;
+ for_each_online_cpu(cpu)
+ zeros = irq_proc_emit_count(p, per_cpu(*cnts, cpu), zeros);
+ irq_proc_emit_zero_counts(p, zeros);
+}
- seq_printf(p, "%*s", prec + 8, "");
- for_each_online_cpu(j)
- seq_printf(p, "CPU%-8d", j);
+static int irq_seq_show(struct seq_file *p, void *v)
+{
+ struct irq_proc_constraints *constr = p->private;
+ struct irq_desc *desc = v;
+ struct irqaction *action;
+
+ /* Print header for the first interrupt? */
+ if (constr->print_header) {
+ unsigned int cpu;
+
+ seq_printf(p, "%*s", constr->num_prec + 8, "");
+ for_each_online_cpu(cpu)
+ seq_printf(p, "CPU%-8d", cpu);
seq_putc(p, '\n');
+ constr->print_header = false;
}
- guard(rcu)();
- desc = irq_to_desc(i);
- if (!desc || irq_settings_is_hidden(desc))
- return 0;
+ if (desc == ARCH_PROC_IRQDESC)
+ return arch_show_interrupts(p, constr->num_prec);
- if (!desc->action || irq_desc_is_chained(desc) || !desc->kstat_irqs)
- return 0;
+ seq_put_decimal_ull_width(p, "", irq_desc_get_irq(desc), constr->num_prec);
+ seq_putc(p, ':');
- seq_printf(p, "%*d:", prec, i);
- for_each_online_cpu(j) {
- unsigned int cnt = desc->kstat_irqs ? per_cpu(desc->kstat_irqs->cnt, j) : 0;
+ /*
+ * Always output per CPU interrupts. Output device interrupts only when
+ * desc::tot_count is not zero.
+ */
+ if (irq_settings_is_per_cpu(desc) || irq_settings_is_per_cpu_devid(desc) ||
+ data_race(desc->tot_count))
+ irq_proc_emit_counts(p, &desc->kstat_irqs->cnt);
+ else
+ irq_proc_emit_zero_counts(p, num_online_cpus());
- seq_put_decimal_ull_width(p, " ", cnt, 10);
- }
- seq_putc(p, ' ');
+ /* Enforce a visual gap */
+ seq_write(p, "  ", 2);
guard(raw_spinlock_irq)(&desc->lock);
if (desc->irq_data.chip) {
if (desc->irq_data.chip->irq_print_chip)
desc->irq_data.chip->irq_print_chip(&desc->irq_data, p);
else if (desc->irq_data.chip->name)
- seq_printf(p, "%8s", desc->irq_data.chip->name);
+ seq_printf(p, "%-*s", constr->chip_width, desc->irq_data.chip->name);
else
- seq_printf(p, "%8s", "-");
+ seq_printf(p, "%-*s", constr->chip_width, "-");
} else {
- seq_printf(p, "%8s", "None");
+ seq_printf(p, "%-*s", constr->chip_width, "None");
}
+
+ seq_putc(p, ' ');
if (desc->irq_data.domain)
- seq_printf(p, " %*lu", prec, desc->irq_data.hwirq);
+ seq_put_decimal_ull_width(p, "", desc->irq_data.hwirq, constr->num_prec);
else
- seq_printf(p, " %*s", prec, "");
-#ifdef CONFIG_GENERIC_IRQ_SHOW_LEVEL
- seq_printf(p, " %-8s", irqd_is_level_type(&desc->irq_data) ? "Level" : "Edge");
-#endif
+ seq_printf(p, " %*s", constr->num_prec, "");
+
+ if (IS_ENABLED(CONFIG_GENERIC_IRQ_SHOW_LEVEL))
+ seq_printf(p, " %-8s", irqd_is_level_type(&desc->irq_data) ? "Level" : "Edge");
+
if (desc->name)
seq_printf(p, "-%-8s", desc->name);
@@ -523,4 +606,73 @@ int show_interrupts(struct seq_file *p, void *v)
seq_putc(p, '\n');
return 0;
}
+
+static void *irq_seq_next_desc(loff_t *pos)
+{
+ if (*pos > total_nr_irqs)
+ return NULL;
+
+ guard(rcu)();
+ for (;;) {
+ struct irq_desc *desc = irq_find_desc_at_or_after((unsigned int) *pos);
+
+ if (desc) {
+ *pos = irq_desc_get_irq(desc);
+ /*
+ * If valid for output then try to acquire a reference
+ * count on the descriptor so that it can't be freed
+ * after dropping RCU read lock on return.
+ */
+ if (irq_settings_proc_valid(desc) && irq_desc_get_ref(desc))
+ return desc;
+ (*pos)++;
+ } else {
+ *pos = total_nr_irqs;
+ return ARCH_PROC_IRQDESC;
+ }
+ }
+}
+
+static void *irq_seq_start(struct seq_file *f, loff_t *pos)
+{
+ if (!*pos) {
+ struct irq_proc_constraints *constr = f->private;
+
+ constr->num_prec = READ_ONCE(irq_proc_constraints.num_prec);
+ constr->chip_width = READ_ONCE(irq_proc_constraints.chip_width);
+ constr->print_header = true;
+ }
+ return irq_seq_next_desc(pos);
+}
+
+static void *irq_seq_next(struct seq_file *f, void *v, loff_t *pos)
+{
+ if (v && v != ARCH_PROC_IRQDESC)
+ irq_desc_put_ref(v);
+
+ (*pos)++;
+ return irq_seq_next_desc(pos);
+}
+
+static void irq_seq_stop(struct seq_file *f, void *v)
+{
+ if (v && v != ARCH_PROC_IRQDESC)
+ irq_desc_put_ref(v);
+}
+
+static const struct seq_operations irq_seq_ops = {
+ .start = irq_seq_start,
+ .next = irq_seq_next,
+ .stop = irq_seq_stop,
+ .show = irq_seq_show,
+};
+
+static int __init irq_proc_init(void)
+{
+ proc_create_seq_private("interrupts", 0, NULL, &irq_seq_ops,
+ sizeof(irq_proc_constraints), NULL);
+ return 0;
+}
+fs_initcall(irq_proc_init);
+
#endif
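
The zero-run batching done by irq_proc_emit_count() and
irq_proc_emit_zero_counts() above can be modelled in a few lines of Python.
This is only a sketch, assuming each field is one space followed by the count
right-aligned to ten characters, as the comment above ZSTR1 indicates. The
names emit_counts(), emit_zero_counts(), FIELD_WIDTH and ZERO_BLOCK are
hypothetical and not part of the kernel.

```python
# Illustrative model of the zero-run batching; not kernel code.
FIELD_WIDTH = 10                            # width given to seq_put_decimal_ull_width()
ZERO_FIELD = " " + "0".rjust(FIELD_WIDTH)   # delimiter plus a padded zero (11 chars)
ZERO_BLOCK = ZERO_FIELD * 256               # preformatted block, analogous to ZSTR256


def emit_zero_counts(out, zeros):
    """Append `zeros` zero fields by slicing the preformatted block."""
    while zeros:
        n = min(zeros, 256)
        out.append(ZERO_BLOCK[:n * len(ZERO_FIELD)])
        zeros -= n


def emit_counts(counts):
    """Format per-CPU counts, deferring runs of zeros until they must be flushed."""
    out, zeros = [], 0
    for cnt in counts:
        if cnt == 0:
            zeros += 1                      # remember the zero, emit nothing yet
            continue
        emit_zero_counts(out, zeros)        # flush the pending run in one step
        zeros = 0
        out.append(" " + str(cnt).rjust(FIELD_WIDTH))
    emit_zero_counts(out, zeros)            # trailing zeros
    return "".join(out)
```

The output is byte-for-byte what formatting every count individually would
produce. The batching only reduces the number of formatting calls on machines
where most per-CPU counters are zero.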
diff --git a/kernel/irq/proc.h b/kernel/irq/proc.h
new file mode 100644
index 000000000000..0631d57fbfb7
--- /dev/null
+++ b/kernel/irq/proc.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _KERNEL_IRQ_PROC_H
+#define _KERNEL_IRQ_PROC_H
+
+#if defined(CONFIG_PROC_FS) && defined(CONFIG_GENERIC_IRQ_SHOW)
+void irq_proc_calc_prec(void);
+void irq_proc_update_chip(const struct irq_chip *chip);
+#else
+static inline void irq_proc_calc_prec(void) { }
+static inline void irq_proc_update_chip(const struct irq_chip *chip) { }
+#endif
+
+#endif
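
For reference, the two width calculations declared in this header behave
roughly like the Python sketch below. It mirrors the loop in
irq_proc_calc_prec() and the grow-only update in irq_proc_update_chip(). The
function names calc_prec() and grow_chip_width() are made up for
illustration, and locking is left out entirely.

```python
def calc_prec(total_nr_irqs):
    """Digits reserved for the IRQ number column: at least 4, at most 10."""
    prec, n = 4, 10000
    while prec < 10 and n <= total_nr_irqs:
        prec += 1
        n *= 10
    return prec


def grow_chip_width(current_width, chip_name):
    """Chip name column width; it only ever grows, it never shrinks."""
    length = len(chip_name) if chip_name else 0
    return max(current_width, length)
```

Both values start at their defaults (4 and 8) and are only raised, so a
reader that snapshots them at the start of a /proc/interrupts read sees
widths that fit everything registered up to that point.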
diff --git a/kernel/irq/settings.h b/kernel/irq/settings.h
index 00b3bd127692..0a0c027a5d34 100644
--- a/kernel/irq/settings.h
+++ b/kernel/irq/settings.h
@@ -18,6 +18,7 @@ enum {
_IRQ_DISABLE_UNLAZY = IRQ_DISABLE_UNLAZY,
_IRQ_HIDDEN = IRQ_HIDDEN,
_IRQ_NO_DEBUG = IRQ_NO_DEBUG,
+ _IRQ_PROC_VALID = IRQ_RESERVED,
_IRQF_MODIFY_MASK = IRQF_MODIFY_MASK,
};
@@ -34,6 +35,7 @@ enum {
#define IRQ_DISABLE_UNLAZY GOT_YOU_MORON
#define IRQ_HIDDEN GOT_YOU_MORON
#define IRQ_NO_DEBUG GOT_YOU_MORON
+#define IRQ_RESERVED GOT_YOU_MORON
#undef IRQF_MODIFY_MASK
#define IRQF_MODIFY_MASK GOT_YOU_MORON
@@ -180,3 +182,14 @@ static inline bool irq_settings_no_debug(struct irq_desc *desc)
{
return desc->status_use_accessors & _IRQ_NO_DEBUG;
}
+
+static inline bool irq_settings_proc_valid(struct irq_desc *desc)
+{
+ return desc->status_use_accessors & _IRQ_PROC_VALID;
+}
+
+static inline void irq_settings_update_proc_valid(struct irq_desc *desc, u32 set)
+{
+ desc->status_use_accessors &= ~_IRQ_PROC_VALID;
+ desc->status_use_accessors |= (set & _IRQ_PROC_VALID);
+}
diff --git a/scripts/gdb/linux/interrupts.py b/scripts/gdb/linux/interrupts.py
index f4f715a8f0e3..a68ae91b4531 100644
--- a/scripts/gdb/linux/interrupts.py
+++ b/scripts/gdb/linux/interrupts.py
@@ -20,7 +20,7 @@ def irq_desc_is_chained(desc):
def irqd_is_level(desc):
return desc['irq_data']['common']['state_use_accessors'] & constants.LX_IRQD_LEVEL
-def show_irq_desc(prec, irq):
+def show_irq_desc(prec, chip_width, irq):
text = ""
desc = mapletree.mtree_load(gdb.parse_and_eval("&sparse_irqs"), irq)
@@ -48,7 +48,7 @@ def show_irq_desc(prec, irq):
count = cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt']
else:
count = 0
- text += "%10u" % (count)
+ text += "%10u " % (count)
name = "None"
if desc['irq_data']['chip']:
@@ -58,7 +58,7 @@ def show_irq_desc(prec, irq):
else:
name = "-"
- text += " %8s" % (name)
+ text += " %-*s" % (chip_width, name)
if desc['irq_data']['domain']:
text += " %*lu" % (prec, desc['irq_data']['hwirq'])
@@ -97,64 +97,29 @@ def show_irq_err_count(prec):
text += "%*s: %10u\n" % (prec, "ERR", cnt['counter'])
return text
-def x86_show_irqstat(prec, pfx, field, desc):
- irq_stat = gdb.parse_and_eval("&irq_stat")
+def x86_show_irqstat(prec, pfx, idx, desc):
+ irq_stat = gdb.parse_and_eval("&irq_stat.counts[%d]" %idx)
text = "%*s: " % (prec, pfx)
for cpu in cpus.each_online_cpu():
stat = cpus.per_cpu(irq_stat, cpu)
- text += "%10u " % (stat[field])
- text += " %s\n" % (desc)
- return text
-
-def x86_show_mce(prec, var, pfx, desc):
- pvar = gdb.parse_and_eval(var)
- text = "%*s: " % (prec, pfx)
- for cpu in cpus.each_online_cpu():
- text += "%10u " % (cpus.per_cpu(pvar, cpu).dereference())
- text += " %s\n" % (desc)
+ text += "%10u " % (stat.dereference())
+ text += desc
return text
def x86_show_interupts(prec):
- text = x86_show_irqstat(prec, "NMI", '__nmi_count', 'Non-maskable interrupts')
-
- if constants.LX_CONFIG_X86_LOCAL_APIC:
- text += x86_show_irqstat(prec, "LOC", 'apic_timer_irqs', "Local timer interrupts")
- text += x86_show_irqstat(prec, "SPU", 'irq_spurious_count', "Spurious interrupts")
- text += x86_show_irqstat(prec, "PMI", 'apic_perf_irqs', "Performance monitoring interrupts")
- text += x86_show_irqstat(prec, "IWI", 'apic_irq_work_irqs', "IRQ work interrupts")
- text += x86_show_irqstat(prec, "RTR", 'icr_read_retry_count', "APIC ICR read retries")
- if utils.gdb_eval_or_none("x86_platform_ipi_callback") is not None:
- text += x86_show_irqstat(prec, "PLT", 'x86_platform_ipis', "Platform interrupts")
-
- if constants.LX_CONFIG_SMP:
- text += x86_show_irqstat(prec, "RES", 'irq_resched_count', "Rescheduling interrupts")
- text += x86_show_irqstat(prec, "CAL", 'irq_call_count', "Function call interrupts")
- text += x86_show_irqstat(prec, "TLB", 'irq_tlb_count', "TLB shootdowns")
-
- if constants.LX_CONFIG_X86_THERMAL_VECTOR:
- text += x86_show_irqstat(prec, "TRM", 'irq_thermal_count', "Thermal events interrupts")
-
- if constants.LX_CONFIG_X86_MCE_THRESHOLD:
- text += x86_show_irqstat(prec, "THR", 'irq_threshold_count', "Threshold APIC interrupts")
-
- if constants.LX_CONFIG_X86_MCE_AMD:
- text += x86_show_irqstat(prec, "DFR", 'irq_deferred_error_count', "Deferred Error APIC interrupts")
+ info_type = gdb.lookup_type('struct irq_stat_info')
+ info = gdb.parse_and_eval('irq_stat_info')
+ bitmap = gdb.parse_and_eval('irq_stat_count_show')
+ bitsperlong = 8 * int(bitmap.type.target().sizeof)
- if constants.LX_CONFIG_X86_MCE:
- text += x86_show_mce(prec, "&mce_exception_count", "MCE", "Machine check exceptions")
- text += x86_show_mce(prec, "&mce_poll_count", "MCP", "Machine check polls")
-
- text += show_irq_err_count(prec)
-
- if constants.LX_CONFIG_X86_IO_APIC:
- cnt = utils.gdb_eval_or_none("irq_mis_count")
- if cnt is not None:
- text += "%*s: %10u\n" % (prec, "MIS", cnt['counter'])
-
- if constants.LX_CONFIG_KVM:
- text += x86_show_irqstat(prec, "PIN", 'kvm_posted_intr_ipis', 'Posted-interrupt notification event')
- text += x86_show_irqstat(prec, "NPI", 'kvm_posted_intr_nested_ipis', 'Nested posted-interrupt event')
- text += x86_show_irqstat(prec, "PIW", 'kvm_posted_intr_wakeup_ipis', 'Posted-interrupt wakeup event')
+ text = ""
+ for idx in range(int(info.type.sizeof / info_type.sizeof)):
+ show = bitmap[int(idx / bitsperlong)]
+ if not show & 1 << int(idx % bitsperlong):
+ continue
+ pfx = info[idx]['symbol'].string()
+ desc = info[idx]['text'].string()
+ text += x86_show_irqstat(prec, pfx, idx, desc)
return text
@@ -166,23 +131,19 @@ def arm_common_show_interrupts(prec):
if nr_ipi is None or ipi_desc is None or ipi_types is None:
return text
- if prec >= 4:
- sep = " "
- else:
- sep = ""
-
for ipi in range(nr_ipi):
- text += "%*s%u:%s" % (prec - 1, "IPI", ipi, sep)
+ text += "%*s%u: " % (prec - 1, "IPI", ipi)
desc = ipi_desc[ipi].cast(irq_desc_type.get_type().pointer())
if desc == 0:
continue
for cpu in cpus.each_online_cpu():
- text += "%10u" % (cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt'])
- text += " %s" % (ipi_types[ipi].string())
+ text += "%10u " % (cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt'])
+ text += "%s" % (ipi_types[ipi].string())
text += "\n"
return text
def aarch64_show_interrupts(prec):
+ # Does not work for ARM64 as "ipi_desc" is not available there
text = arm_common_show_interrupts(prec)
text += "%*s: %10lu\n" % (prec, "ERR", gdb.parse_and_eval("irq_err_count"))
return text
@@ -209,12 +170,19 @@ class LxInterruptList(gdb.Command):
super(LxInterruptList, self).__init__("lx-interruptlist", gdb.COMMAND_DATA)
def invoke(self, arg, from_tty):
- nr_irqs = gdb.parse_and_eval("nr_irqs")
- prec = 3
- j = 1000
- while prec < 10 and j <= nr_irqs:
- prec += 1
- j *= 10
+ nr_irqs = gdb.parse_and_eval("total_nr_irqs")
+ constr = utils.gdb_eval_or_none('irq_proc_constraints')
+
+ if constr:
+ prec = int(constr['num_prec'])
+ chip_width = int(constr['chip_width'])
+ else:
+ prec = 4
+ j = 10000
+ while prec < 10 and j <= nr_irqs:
+ prec += 1
+ j *= 10
+ chip_width = 8
gdb.write("%*s" % (prec + 8, ""))
for cpu in cpus.each_online_cpu():
@@ -225,7 +193,7 @@ class LxInterruptList(gdb.Command):
raise gdb.GdbError("Unable to find the sparse IRQ tree, is CONFIG_SPARSE_IRQ enabled?")
for irq in range(nr_irqs):
- gdb.write(show_irq_desc(prec, irq))
+ gdb.write(show_irq_desc(prec, chip_width, irq))
gdb.write(arch_show_interrupts(prec))
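
The bitmap walk in the reworked x86_show_interupts() can be tried outside of
gdb with plain integers. The following is a minimal sketch, assuming the
bitmap is a list of unsigned long words and a 64-bit build. shown_indices()
and its parameters are hypothetical names and do not exist in the gdb
scripts.

```python
def shown_indices(bitmap_words, nr_counters, bits_per_long=64):
    """Return the counter indices whose bit is set in the show bitmap."""
    shown = []
    for idx in range(nr_counters):
        word = bitmap_words[idx // bits_per_long]
        # Same test as "show & 1 << (idx % bitsperlong)" in the script:
        # the shift binds tighter than the bitwise AND.
        if word & (1 << (idx % bits_per_long)):
            shown.append(idx)
    return shown
```

For example, shown_indices([0b101, 0b1], 70) selects counters 0, 2 and 64.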
* [GIT pull] irq/drivers for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
2026-06-13 21:24 ` [GIT pull] irq/core " Thomas Gleixner
@ 2026-06-13 21:24 ` Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
2026-06-13 21:24 ` [GIT pull] irq/msi " Thomas Gleixner
` (7 subsequent siblings)
9 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:24 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest irq/drivers branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-drivers-2026-06-13
up to: a1a35c09241f: irqchip/irq-realtek-rtl: Add multicore support
Interrupt chip driver changes:
- Replace the support for the AST2700-A0 early silicon with a proper
driver for the final A2 production silicon.
- Rename and rework the StarFive JH8100 interrupt controller for the new
JHB100 SoC, as JH8100 was discontinued before production.
- Add support for Amlogic A9 SoCs to the meson-gpio interrupt controller.
- Expand the Econet interrupt controller driver to support MIPS 34Kc
Vectored External Interrupt Controller mode.
- Prevent a NULL pointer dereference in the GICv4 code as the vLPI code
blindly assumes that the ITS was populated. Add the missing sanity check.
- Add support for software-triggered and for error interrupts to the
Renesas RZ/T2H driver.
- Add interrupt redirection support for the LoongArch architecture.
- Add multicore support to the Realtek RTL interrupt driver.
- The usual updates, enhancements and fixes all over the place.
Thanks,
tglx
------------------>
Caleb James DeLisle (2):
dt-bindings: interrupt-controller: econet: Add CPU interrupt mapping
irqchip/econet-en751221: Support MIPS 34Kc VEIC mode
Changhuang Liang (4):
dt-bindings: interrupt-controller: Repurpose binding for unreleased jh8100 for jhb100
irqchip/starfive: Rename jh8100 to jhb100
irqchip/starfive: Use devm_ interfaces to simplify resource release
irqchip/starfive: Implement irq_set_type() and irq_ack() callbacks
Chen Ni (1):
irqchip/starfive: Fix error check for devm_platform_ioremap_resource()
Cosmin Tanislav (2):
irqchip/renesas-rzt2h: Add software-triggered interrupts support
irqchip/renesas-rzt2h: Add error interrupts support
Hans Zhang (1):
irqchip/gic-v3-its: Use FIELD_MODIFY()
Krzysztof Kozlowski (1):
irqchip/qcom: Unify user-visible "Qualcomm" name
Marek Szyprowski (1):
irqchip/exynos-combiner: Remove useless spinlock
Markus Stockhausen (2):
irqchip/irq-realtek-rtl: Add/simplify register helpers
irqchip/irq-realtek-rtl: Add multicore support
Mason Huo (1):
irqchip/starfive: Increase the interrupt source number up to 64
Mostafa Saleh (1):
irqchip/gic-v4: Don't advertise VLPIs if no ITS is probed
Mukesh Ojha (4):
irqchip/qcom-pdc: Split __pdc_enable_intr() into per-version helpers
irqchip/qcom-pdc: Tighten ioremap clamp to single DRV region size
irqchip/qcom-pdc: Add PDC_VERSION() macro to describe version register fields
irqchip/qcom-pdc: Use FIELD_GET() to extract bank index and bit position
Ryan Chen (4):
dt-bindings: interrupt-controller: Describe AST2700-A2 hardware instead of A0
irqchip/ast2700-intc: Add AST2700-A2 support
irqchip/ast2700-intc: Add KUnit tests for route resolution
irqchip/aspeed-intc: Remove AST2700-A0 support
Thomas Huth (1):
irqchip/gic: Replace __ASSEMBLY__ with __ASSEMBLER__
Tianyang Zhang (4):
Docs/LoongArch: Add advanced extended IRQ model
irqchip/loongarch-avec: Prepare for interrupt redirection support
irqchip/loongarch-avec: Return IRQ_SET_MASK_OK_DONE when keep affinity
irqchip/loongarch-ir: Add IR (interrupt redirection) irqchip support
Xianwei Zhao (3):
irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()
dt-bindings: interrupt-controller: Add support for Amlogic A9 SoCs
irqchip/meson-gpio: Add support for Amlogic A9 SoCs
Documentation/arch/loongarch/irq-chip-model.rst | 35 ++
.../amlogic,meson-gpio-intc.yaml | 21 +-
.../interrupt-controller/aspeed,ast2700-intc.yaml | 90 ----
.../aspeed,ast2700-interrupt.yaml | 188 +++++++
.../interrupt-controller/econet,en751221-intc.yaml | 20 +
...,jh8100-intc.yaml => starfive,jhb100-intc.yaml} | 20 +-
.../zh_CN/arch/loongarch/irq-chip-model.rst | 34 ++
MAINTAINERS | 6 +-
arch/arm/include/asm/arch_gicv3.h | 4 +-
drivers/irqchip/.kunitconfig | 5 +
drivers/irqchip/Kconfig | 35 +-
drivers/irqchip/Makefile | 7 +-
drivers/irqchip/exynos-combiner.c | 4 -
drivers/irqchip/irq-aspeed-intc.c | 139 -----
drivers/irqchip/irq-ast2700-intc0-test.c | 473 +++++++++++++++++
drivers/irqchip/irq-ast2700-intc0.c | 582 +++++++++++++++++++++
drivers/irqchip/irq-ast2700-intc1.c | 280 ++++++++++
drivers/irqchip/irq-ast2700.c | 107 ++++
drivers/irqchip/irq-ast2700.h | 48 ++
drivers/irqchip/irq-econet-en751221.c | 186 ++++++-
drivers/irqchip/irq-gic-v3-its.c | 13 +-
drivers/irqchip/irq-loongarch-avec.c | 20 +-
drivers/irqchip/irq-loongarch-ir.c | 537 +++++++++++++++++++
drivers/irqchip/irq-loongson.h | 15 +
drivers/irqchip/irq-meson-gpio.c | 77 +++
drivers/irqchip/irq-realtek-rtl.c | 106 ++--
drivers/irqchip/irq-renesas-rzt2h.c | 268 +++++++++-
drivers/irqchip/irq-starfive-jh8100-intc.c | 207 --------
drivers/irqchip/irq-starfive-jhb100-intc.c | 254 +++++++++
drivers/irqchip/qcom-pdc.c | 63 ++-
include/linux/irqchip/arm-gic-v3.h | 2 +-
include/linux/irqchip/arm-gic.h | 4 +-
32 files changed, 3277 insertions(+), 573 deletions(-)
delete mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-intc.yaml
create mode 100644 Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-interrupt.yaml
rename Documentation/devicetree/bindings/interrupt-controller/{starfive,jh8100-intc.yaml => starfive,jhb100-intc.yaml} (68%)
create mode 100644 drivers/irqchip/.kunitconfig
delete mode 100644 drivers/irqchip/irq-aspeed-intc.c
create mode 100644 drivers/irqchip/irq-ast2700-intc0-test.c
create mode 100644 drivers/irqchip/irq-ast2700-intc0.c
create mode 100644 drivers/irqchip/irq-ast2700-intc1.c
create mode 100644 drivers/irqchip/irq-ast2700.c
create mode 100644 drivers/irqchip/irq-ast2700.h
create mode 100644 drivers/irqchip/irq-loongarch-ir.c
delete mode 100644 drivers/irqchip/irq-starfive-jh8100-intc.c
create mode 100644 drivers/irqchip/irq-starfive-jhb100-intc.c
diff --git a/Documentation/arch/loongarch/irq-chip-model.rst b/Documentation/arch/loongarch/irq-chip-model.rst
index 8f5c3345109e..774d40dc6a7e 100644
--- a/Documentation/arch/loongarch/irq-chip-model.rst
+++ b/Documentation/arch/loongarch/irq-chip-model.rst
@@ -181,6 +181,41 @@ go to PCH-PIC/PCH-LPC and gathered by EIOINTC, and then go to CPUINTC directly::
| Devices |
+---------+
+Advanced Extended IRQ model (with redirection)
+==============================================
+
+In this model, IPI (Inter-Processor Interrupt) and CPU Local Timer interrupt go
+to CPUINTC directly, CPU UARTS interrupts go to LIOINTC, PCH-MSI interrupts go
+to REDIRECT for remapping it to AVECINTC, and then go to CPUINTC directly, while
+all other devices interrupts go to PCH-PIC/PCH-LPC and gathered by EIOINTC, and
+then go to CPUINTC directly::
+
+     +-----+     +-----------------------+     +-------+
+     | IPI | --> |        CPUINTC        | <-- | Timer |
+     +-----+     +-----------------------+     +-------+
+                  ^          ^          ^
+                  |          |          |
+                  |    +----------+     |
+           +---------+ | AVECINTC | +---------+     +-------+
+           | EIOINTC | +----------+ | LIOINTC | <-- | UARTs |
+           +---------+ | REDIRECT | +---------+     +-------+
+                ^      +----------+
+                |            ^
+                |            |
+           +---------+  +---------+
+           | PCH-PIC |  | PCH-MSI |
+           +---------+  +---------+
+             ^     ^           ^
+             |     |           |
+     +---------+ +---------+ +---------+
+     | Devices | | PCH-LPC | | Devices |
+     +---------+ +---------+ +---------+
+                      ^
+                      |
+                 +---------+
+                 | Devices |
+                 +---------+
+
ACPI-related definitions
========================
diff --git a/Documentation/devicetree/bindings/interrupt-controller/amlogic,meson-gpio-intc.yaml b/Documentation/devicetree/bindings/interrupt-controller/amlogic,meson-gpio-intc.yaml
index d0fad930de9d..d26671913e89 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/amlogic,meson-gpio-intc.yaml
+++ b/Documentation/devicetree/bindings/interrupt-controller/amlogic,meson-gpio-intc.yaml
@@ -38,6 +38,8 @@ properties:
- amlogic,a4-gpio-intc
- amlogic,a4-gpio-ao-intc
- amlogic,a5-gpio-intc
+ - amlogic,a9-gpio-intc
+ - amlogic,a9-gpio-ao-intc
- amlogic,c3-gpio-intc
- amlogic,s6-gpio-intc
- amlogic,s7-gpio-intc
@@ -56,7 +58,7 @@ properties:
amlogic,channel-interrupts:
description: Array with the upstream hwirq numbers
minItems: 2
- maxItems: 12
+ maxItems: 20
$ref: /schemas/types.yaml#/definitions/uint32-array
required:
@@ -76,9 +78,20 @@ then:
amlogic,channel-interrupts:
maxItems: 2
else:
- properties:
- amlogic,channel-interrupts:
- minItems: 8
+ if:
+ properties:
+ compatible:
+ contains:
+ const: amlogic,a9-gpio-ao-intc
+ then:
+ properties:
+ amlogic,channel-interrupts:
+ minItems: 20
+ else:
+ properties:
+ amlogic,channel-interrupts:
+ minItems: 8
+ maxItems: 12
additionalProperties: false
diff --git a/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-intc.yaml b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-intc.yaml
deleted file mode 100644
index 258d21fe6e35..000000000000
--- a/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-intc.yaml
+++ /dev/null
@@ -1,90 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
-%YAML 1.2
----
-$id: http://devicetree.org/schemas/interrupt-controller/aspeed,ast2700-intc.yaml#
-$schema: http://devicetree.org/meta-schemas/core.yaml#
-
-title: Aspeed AST2700 Interrupt Controller
-
-description:
- This interrupt controller hardware is second level interrupt controller that
- is hooked to a parent interrupt controller. It's useful to combine multiple
- interrupt sources into 1 interrupt to parent interrupt controller.
-
-maintainers:
- - Kevin Chen <kevin_chen@aspeedtech.com>
-
-properties:
- compatible:
- enum:
- - aspeed,ast2700-intc-ic
-
- reg:
- maxItems: 1
-
- interrupt-controller: true
-
- '#interrupt-cells':
- const: 1
- description:
- The first cell is the IRQ number, the second cell is the trigger
- type as defined in interrupt.txt in this directory.
-
- interrupts:
- minItems: 1
- maxItems: 10
- description: |
- Depend to which INTC0 or INTC1 used.
- INTC0 and INTC1 are two kinds of interrupt controller with enable and raw
- status registers for use.
- INTC0 is used to assert GIC if interrupt in INTC1 asserted.
- INTC1 is used to assert INTC0 if interrupt of modules asserted.
- +-----+ +-------+ +---------+---module0
- | GIC |---| INTC0 |--+--| INTC1_0 |---module2
- | | | | | | |---...
- +-----+ +-------+ | +---------+---module31
- |
- | +---------+---module0
- +---| INTC1_1 |---module2
- | | |---...
- | +---------+---module31
- ...
- | +---------+---module0
- +---| INTC1_5 |---module2
- | |---...
- +---------+---module31
-
-required:
- - compatible
- - reg
- - interrupt-controller
- - '#interrupt-cells'
- - interrupts
-
-additionalProperties: false
-
-examples:
- - |
- #include <dt-bindings/interrupt-controller/arm-gic.h>
-
- bus {
- #address-cells = <2>;
- #size-cells = <2>;
-
- interrupt-controller@12101b00 {
- compatible = "aspeed,ast2700-intc-ic";
- reg = <0 0x12101b00 0 0x10>;
- #interrupt-cells = <1>;
- interrupt-controller;
- interrupts = <GIC_SPI 192 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 193 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 194 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 195 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 196 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 197 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 198 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 199 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 200 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 201 IRQ_TYPE_LEVEL_HIGH>;
- };
- };
diff --git a/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-interrupt.yaml b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-interrupt.yaml
new file mode 100644
index 000000000000..a62f0fd2435b
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/aspeed,ast2700-interrupt.yaml
@@ -0,0 +1,188 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/aspeed,ast2700-interrupt.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: ASPEED AST2700 Interrupt Controllers (INTC0/INTC1)
+
+description: |
+ The ASPEED AST2700 SoC integrates two interrupt controller designs:
+
+ - INTC0: Primary controller that routes interrupt sources to upstream,
+ processor-specific interrupt controllers
+
+ - INTC1: Secondary controller whose interrupt outputs feed into INTC0
+
+ The SoC contains four processors to which interrupts can be routed:
+
+ - PSP: Primary Service Processor (Cortex-A35)
+ - SSP: Secondary Service Processor (Cortex-M4)
+ - TSP: Tertiary Service Processor (Cortex-M4)
+ - BMCU: Boot MCU (a RISC-V microcontroller)
+
+ The following diagram illustrates the overall architecture of the
+ ASPEED AST2700 interrupt controllers:
+
+ +-----------+ +-----------+
+ | INTC0 | | INTC1(0) |
+ +-----------+ +-----------+
+ | Router | +-----------+ | Router |
+ | out int | +Peripheral + | out int |
+ +-----------+ | 0 0 <-+Controllers+ | INTM | +-----------+
+ |PSP GIC <-|---+ . . | +-----------+ | . . <-+Peripheral +
+ +-----------+ | . . | | . . | +Controllers+
+ +-----------+ | . . | | . . | +-----------+
+ |SSP NVIC <-|---+ . . <----------------+ . . |
+ +-----------+ | . . | | . . |
+ +-----------+ | . . <-------- | . . |
+ |TSP NVIC <-|---+ . . | | ----+ . . |
+ +-----------+ | . . | | | | O P |
+ | . . | | | +-----------+
+ | . . <---- | --------------------
+ | . . | | | +-----------+ |
+ | M N | | ---------+ INTC1(1) | |
+ +-----------+ | +-----------+ |
+ | . |
+ | +-----------+ |
+ -------------+ INTC1(N) | |
+ +-----------+ |
+ +--------------+ |
+ + BMCU APLIC <-+---------------------------------------------
+ +--------------+
+
+ INTC0 supports:
+ - 128 local peripheral interrupt inputs
+ - Fan-in from up to three INTC1 instances via banked interrupt lines (INTM)
+ - Local peripheral interrupt outputs
+ - Merged interrupt outputs
+ - Software interrupt outputs (SWINT)
+ - Configurable interrupt routes targeting the PSP, SSP, and TSP
+
+ INTC1 supports:
+ - 192 local peripheral interrupt inputs
+ - Banked interrupt outputs (INTM, 5 x 6 banks x 32 interrupts per bank)
+ - Configurable interrupt routes targeting the PSP, SSP, TSP, and BMCU
+
+ One INTC1 instance is always present, on the SoC's IO die. A further two
+ instances may be attached to the SoC's one INTC0 instance via LTPI (LVDS
+ Tunneling Protocol & Interface).
+
+ Interrupt numbering model
+ -------------------------
+ The binding uses a controller-local numbering model. Peripheral device
+ nodes use the INTCx local interrupt number (hwirq) in their 'interrupts' or
+ 'interrupts-extended' properties.
+
+ For AST2700, INTC0 exposes the following (inclusive) input ranges:
+
+ - 000..479: Independent interrupts
+ - 480..489: INTM0-INTM9
+ - 490..499: INTM10-INTM19
+ - 500..509: INTM20-INTM29
+ - 510..519: INTM30-INTM39
+ - 520..529: INTM40-INTM49
+
+ INTC0's (inclusive) output ranges are as follows:
+
+ - 000..127: 1:1 local peripheral interrupt output to PSP
+ - 144..151: Software interrupts from the SSP output to PSP
+ - 152..159: Software interrupts from the TSP output to PSP
+ - 192..201: INTM0-INTM9 banked outputs to PSP
+ - 208..217: INTM30-INTM39 banked outputs to PSP
+ - 224..233: INTM40-INTM49 banked outputs to PSP
+ - 256..383: 1:1 local peripheral interrupt output to SSP
+ - 384..393: INTM10-INTM19 banked outputs to SSP
+ - 400..407: Software interrupts from the PSP output to SSP
+ - 408..415: Software interrupts from the TSP output to SSP
+ - 426..553: 1:1 local peripheral interrupt output to TSP
+ - 554..563: INTM20-INTM29 banked outputs to TSP
+ - 570..577: Software interrupts from the PSP output to TSP
+ - 578..585: Software interrupts from the SSP output to TSP
+
+ Inputs and outputs for INTC1 instances are context-dependent. However, for the
+ first instance of INTC1, the (inclusive) output ranges are:
+
+ - 00..05: INTM0-INTM5
+ - 10..15: INTM10-INTM15
+ - 20..25: INTM20-INTM25
+ - 30..35: INTM30-INTM35
+ - 40..45: INTM40-INTM45
+ - 50..50: BootMCU
+
+maintainers:
+ - Ryan Chen <ryan_chen@aspeedtech.com>
+ - Andrew Jeffery <andrew@codeconstruct.com.au>
+
+properties:
+ compatible:
+ enum:
+ - aspeed,ast2700-intc0
+ - aspeed,ast2700-intc1
+
+ reg:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ '#interrupt-cells':
+ const: 1
+ description: Single cell encoding the INTC local interrupt number (hwirq).
+
+ aspeed,interrupt-ranges:
+ description: |
+ Describes how ranges of controller output pins are routed to a parent
+ interrupt controller.
+
+ Each range entry is encoded as:
+
+ <out count phandle parent-specifier...>
+
+ where:
+ - out: First controller interrupt output index in the range.
+ - count: Number of consecutive controller interrupt outputs and parent
+ interrupt inputs in this range.
+ - phandle: Phandle to the parent interrupt controller node.
+ - parent-specifier: Interrupt specifier, as defined by the parent
+ interrupt controller binding.
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ minItems: 3
+ items:
+ description: Range descriptors with a parent interrupt specifier.
+
+required:
+ - compatible
+ - reg
+ - interrupt-controller
+ - '#interrupt-cells'
+ - aspeed,interrupt-ranges
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+ interrupt-controller@12100000 {
+ compatible = "aspeed,ast2700-intc0";
+ reg = <0x12100000 0x3b00>;
+ interrupt-parent = <&gic>;
+ interrupt-controller;
+ #interrupt-cells = <1>;
+
+ aspeed,interrupt-ranges =
+ <0 128 &gic GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
+ <144 8 &gic GIC_SPI 144 IRQ_TYPE_LEVEL_HIGH>,
+ <152 8 &gic GIC_SPI 152 IRQ_TYPE_LEVEL_HIGH>,
+ <192 10 &gic GIC_SPI 192 IRQ_TYPE_LEVEL_HIGH>,
+ <208 10 &gic GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>,
+ <224 10 &gic GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>,
+ <256 128 &ssp_nvic 0 0>,
+ <384 10 &ssp_nvic 160 0>,
+ <400 8 &ssp_nvic 144 0>,
+ <408 8 &ssp_nvic 152 0>,
+ <426 128 &tsp_nvic 0 0>,
+ <554 10 &tsp_nvic 160 0>,
+ <570 8 &tsp_nvic 144 0>,
+ <578 8 &tsp_nvic 152 0>;
+ };
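Editorial illustration, not part of the patch: the range encoding described in the binding above can be sketched in plain C. This is a minimal sketch under stated assumptions. The struct and function names are hypothetical, and each entry is reduced to <out count base>, where "base" is assumed to be the parent input that corresponds to the first output of the range. The table reuses three of the &ssp_nvic entries from the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one aspeed,interrupt-ranges entry. */
struct range_entry {
	uint32_t out;   /* first controller output index in the range */
	uint32_t count; /* number of consecutive outputs and parent inputs */
	uint32_t base;  /* assumption: parent input that matches 'out' */
};

/* Three of the &ssp_nvic entries from the binding example. */
static const struct range_entry ssp_ranges[] = {
	{ 256, 128, 0 },
	{ 384, 10, 160 },
	{ 400, 8, 144 },
};

/*
 * Translate a controller output index into a parent input.
 * Returns 0 and stores the result in *input, or -1 if no range covers 'out'.
 */
static int map_output(const struct range_entry *r, size_t n,
		      uint32_t out, uint32_t *input)
{
	for (size_t i = 0; i < n; i++) {
		/* The lower-bound check comes first so the subtraction cannot wrap. */
		if (out >= r[i].out && out - r[i].out < r[i].count) {
			*input = r[i].base + (out - r[i].out);
			return 0;
		}
	}
	return -1;
}
```

With this table, output 386 falls into the second entry and would map to parent input 160 + (386 - 384). An output that no entry covers, such as 255, is reported as an error rather than guessed.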
diff --git a/Documentation/devicetree/bindings/interrupt-controller/econet,en751221-intc.yaml b/Documentation/devicetree/bindings/interrupt-controller/econet,en751221-intc.yaml
index 5536319c49c3..44c09785e6bb 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/econet,en751221-intc.yaml
+++ b/Documentation/devicetree/bindings/interrupt-controller/econet,en751221-intc.yaml
@@ -52,6 +52,25 @@ properties:
- description: primary per-CPU IRQ
- description: shadow IRQ number
+ econet,cpu-interrupt-map:
+ $ref: /schemas/types.yaml#/definitions/uint32-matrix
+ description:
+ When running in VEIC mode, the hardware re-routes interrupts from the
+ CPU interrupt controller core to the "external" interrupt controller
+ (this device). It then prioritizes them and sends them back to the CPU
+ along with its own interrupts. The CPU hardware handles interrupts using
+ a special dispatch table (the normal interrupt handler is not invoked).
+ In this interrupt controller, the CPU interrupts are renumbered as they
+ are merged with this controller's own hardware interrupts.
+
+ This is the inverse of an interrupt-map, mapping which interrupts from
+ this controller must be routed back to the CPU interrupt domain for
+ correct handling there.
+ items:
+ items:
+ - description: The interrupt number as received in this controller
+ - description: The interrupt number to be dispatched on the CPU intc
+
required:
- compatible
- reg
@@ -74,5 +93,6 @@ examples:
interrupts = <2>;
econet,shadow-interrupts = <7 2>, <8 3>, <13 12>, <30 29>;
+ econet,cpu-interrupt-map = <7 0>, <8 1>;
};
...
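Editorial illustration, not part of the patch: the econet,cpu-interrupt-map property above is a list of pairs, so the reverse mapping it describes amounts to a table lookup. The following is a minimal sketch using the two pairs from the example, <7 0> and <8 1>. The struct and function names are hypothetical and do not reflect how the driver stores the map.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation of one <intc-irq cpu-irq> pair. */
struct cpu_irq_map {
	uint32_t intc_irq; /* interrupt number as received in this controller */
	uint32_t cpu_irq;  /* interrupt number to be dispatched on the CPU intc */
};

/* The two pairs from the binding example. */
static const struct cpu_irq_map example_map[] = {
	{ 7, 0 },
	{ 8, 1 },
};

/*
 * Returns the CPU intc interrupt number for 'intc_irq', or -1 if the
 * interrupt is not listed and therefore stays with this controller.
 */
static int lookup_cpu_irq(const struct cpu_irq_map *map, size_t n,
			  uint32_t intc_irq)
{
	for (size_t i = 0; i < n; i++) {
		if (map[i].intc_irq == intc_irq)
			return (int)map[i].cpu_irq;
	}
	return -1;
}
```

An interrupt that does not appear in the map, for example 13 from the shadow-interrupts list, returns -1 in this sketch and would be handled as one of the controller's own interrupts.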
diff --git a/Documentation/devicetree/bindings/interrupt-controller/starfive,jh8100-intc.yaml b/Documentation/devicetree/bindings/interrupt-controller/starfive,jhb100-intc.yaml
similarity index 68%
rename from Documentation/devicetree/bindings/interrupt-controller/starfive,jh8100-intc.yaml
rename to Documentation/devicetree/bindings/interrupt-controller/starfive,jhb100-intc.yaml
index ada5788602d6..d8a0a3862ae2 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/starfive,jh8100-intc.yaml
+++ b/Documentation/devicetree/bindings/interrupt-controller/starfive,jhb100-intc.yaml
@@ -1,13 +1,13 @@
# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
%YAML 1.2
---
-$id: http://devicetree.org/schemas/interrupt-controller/starfive,jh8100-intc.yaml#
+$id: http://devicetree.org/schemas/interrupt-controller/starfive,jhb100-intc.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: StarFive External Interrupt Controller
description:
- StarFive SoC JH8100 contain a external interrupt controller. It can be used
+ The StarFive JHB100 SoC contains an external interrupt controller. It can be used
to handle high-level input interrupt signals. It also send the output
interrupt signal to RISC-V PLIC.
@@ -16,19 +16,11 @@ maintainers:
properties:
compatible:
- const: starfive,jh8100-intc
+ const: starfive,jhb100-intc
reg:
maxItems: 1
- clocks:
- description: APB clock for the interrupt controller
- maxItems: 1
-
- resets:
- description: APB reset for the interrupt controller
- maxItems: 1
-
interrupts:
maxItems: 1
@@ -40,8 +32,6 @@ properties:
required:
- compatible
- reg
- - clocks
- - resets
- interrupts
- interrupt-controller
- "#interrupt-cells"
@@ -51,10 +41,8 @@ additionalProperties: false
examples:
- |
interrupt-controller@12260000 {
- compatible = "starfive,jh8100-intc";
+ compatible = "starfive,jhb100-intc";
reg = <0x12260000 0x10000>;
- clocks = <&syscrg_ne 76>;
- resets = <&syscrg_ne 13>;
interrupts = <45>;
interrupt-controller;
#interrupt-cells = <1>;
diff --git a/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst b/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
index d4ff80de47b6..87b58aee92e1 100644
--- a/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
+++ b/Documentation/translations/zh_CN/arch/loongarch/irq-chip-model.rst
@@ -174,6 +174,40 @@ CPU串口(UARTs)中断发送到LIOINTC,PCH-MSI中断发送到AVECINTC,
| Devices |
+---------+
+高级扩展IRQ模型 (带重定向)
+==========================
+
+在这种模型里面,IPI(Inter-Processor Interrupt)和CPU本地时钟中断直接发送到CPUINTC,
+CPU串口(UARTs)中断发送到LIOINTC,PCH-MSI中断首先发送到REDIRECT模块,完成重定向后发
+送到AVECINTC,而后通过AVECINTC直接送达CPUINTC,而其他所有设备的中断则分别发送到所连
+接的PCH-PIC/PCH-LPC,然后由EIOINTC统一收集,再直接到达CPUINTC::
+
+ +-----+ +-----------------------+ +-------+
+ | IPI | --> | CPUINTC | <-- | Timer |
+ +-----+ +-----------------------+ +-------+
+ ^ ^ ^
+ | | |
+ | +----------+ |
+ +---------+ | AVECINTC | +---------+ +-------+
+ | EIOINTC | +----------+ | LIOINTC | <-- | UARTs |
+ +---------+ | REDIRECT | +---------+ +-------+
+ ^ +----------+
+ | ^
+ | |
+ +---------+ +---------+
+ | PCH-PIC | | PCH-MSI |
+ +---------+ +---------+
+ ^ ^ ^
+ | | |
+ +---------+ +---------+ +---------+
+ | Devices | | PCH-LPC | | Devices |
+ +---------+ +---------+ +---------+
+ ^
+ |
+ +---------+
+ | Devices |
+ +---------+
+
ACPI相关的定义
==============
diff --git a/MAINTAINERS b/MAINTAINERS
index 2fb1c75afd16..73af7d7788ef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -25533,11 +25533,11 @@ F: Documentation/devicetree/bindings/phy/starfive,jh7110-usb-phy.yaml
F: drivers/phy/starfive/phy-jh7110-pcie.c
F: drivers/phy/starfive/phy-jh7110-usb.c
-STARFIVE JH8100 EXTERNAL INTERRUPT CONTROLLER DRIVER
+STARFIVE JHB100 EXTERNAL INTERRUPT CONTROLLER DRIVER
M: Changhuang Liang <changhuang.liang@starfivetech.com>
S: Supported
-F: Documentation/devicetree/bindings/interrupt-controller/starfive,jh8100-intc.yaml
-F: drivers/irqchip/irq-starfive-jh8100-intc.c
+F: Documentation/devicetree/bindings/interrupt-controller/starfive,jhb100-intc.yaml
+F: drivers/irqchip/irq-starfive-jhb100-intc.c
STATIC BRANCH/CALL
M: Peter Zijlstra <peterz@infradead.org>
diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h
index 311e83038bdb..847590df7551 100644
--- a/arch/arm/include/asm/arch_gicv3.h
+++ b/arch/arm/include/asm/arch_gicv3.h
@@ -7,7 +7,7 @@
#ifndef __ASM_ARCH_GICV3_H
#define __ASM_ARCH_GICV3_H
-#ifndef __ASSEMBLY__
+#ifndef __ASSEMBLER__
#include <linux/io.h>
#include <linux/io-64-nonatomic-lo-hi.h>
@@ -257,5 +257,5 @@ static inline bool gic_has_relaxed_pmr_sync(void)
return false;
}
-#endif /* !__ASSEMBLY__ */
+#endif /* !__ASSEMBLER__ */
#endif /* !__ASM_ARCH_GICV3_H */
diff --git a/drivers/irqchip/.kunitconfig b/drivers/irqchip/.kunitconfig
new file mode 100644
index 000000000000..00a12703f635
--- /dev/null
+++ b/drivers/irqchip/.kunitconfig
@@ -0,0 +1,5 @@
+CONFIG_KUNIT=y
+CONFIG_OF=y
+CONFIG_COMPILE_TEST=y
+CONFIG_ASPEED_AST2700_INTC=y
+CONFIG_ASPEED_AST2700_INTC_TEST=y
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index e755a2a05209..fdf27cf529fc 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -110,6 +110,29 @@ config AL_FIC
help
Support Amazon's Annapurna Labs Fabric Interrupt Controller.
+config ASPEED_AST2700_INTC
+ bool "ASPEED AST2700 Interrupt Controller support"
+ depends on OF
+ depends on ARCH_ASPEED || COMPILE_TEST
+ select IRQ_DOMAIN_HIERARCHY
+ help
+ Enable support for the ASPEED AST2700 interrupt controller.
+ This driver handles interrupt routing and merges interrupt
+ sources towards upstream parent interrupt controllers.
+
+ If unsure, say N.
+
+config ASPEED_AST2700_INTC_TEST
+ bool "Tests for the ASPEED AST2700 Interrupt Controller"
+ depends on ASPEED_AST2700_INTC && KUNIT=y
+ default KUNIT_ALL_TESTS
+ help
+ Enable KUnit tests for AST2700 INTC route resolution.
+ The tests exercise error handling and route selection paths.
+ This option is intended for test builds.
+
+ If unsure, say N.
+
config ATMEL_AIC_IRQ
bool
select GENERIC_IRQ_CHIP
@@ -476,7 +499,7 @@ config STM32_EXTI
select GENERIC_IRQ_CHIP
config QCOM_IRQ_COMBINER
- bool "QCOM IRQ combiner support"
+ bool "Qualcomm IRQ combiner support"
depends on ARCH_QCOM && ACPI
select IRQ_DOMAIN_HIERARCHY
help
@@ -509,7 +532,7 @@ config GOLDFISH_PIC
for Goldfish based virtual platforms.
config QCOM_PDC
- tristate "QCOM PDC"
+ tristate "Qualcomm PDC"
depends on ARCH_QCOM
select IRQ_DOMAIN_HIERARCHY
help
@@ -517,7 +540,7 @@ config QCOM_PDC
IRQs for Qualcomm Technologies Inc (QTI) mobile chips.
config QCOM_MPM
- tristate "QCOM MPM"
+ tristate "Qualcomm MPM"
depends on ARCH_QCOM
depends on MAILBOX
select IRQ_DOMAIN_HIERARCHY
@@ -654,13 +677,13 @@ config SIFIVE_PLIC
select IRQ_DOMAIN_HIERARCHY
select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
-config STARFIVE_JH8100_INTC
- bool "StarFive JH8100 External Interrupt Controller"
+config STARFIVE_JHB100_INTC
+ bool "StarFive JHB100 External Interrupt Controller"
depends on ARCH_STARFIVE || COMPILE_TEST
default ARCH_STARFIVE
select IRQ_DOMAIN_HIERARCHY
help
- This enables support for the INTC chip found in StarFive JH8100
+ This enables support for the INTC chip found in StarFive JHB100
SoC.
If you don't know what to do here, say Y.
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 26aa3b6ec99f..72cdcc9caa16 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -89,8 +89,9 @@ obj-$(CONFIG_MVEBU_PIC) += irq-mvebu-pic.o
obj-$(CONFIG_MVEBU_SEI) += irq-mvebu-sei.o
obj-$(CONFIG_LS_EXTIRQ) += irq-ls-extirq.o
obj-$(CONFIG_LS_SCFG_MSI) += irq-ls-scfg-msi.o
+obj-$(CONFIG_ASPEED_AST2700_INTC) += irq-ast2700.o irq-ast2700-intc0.o irq-ast2700-intc1.o
+obj-$(CONFIG_ASPEED_AST2700_INTC_TEST) += irq-ast2700-intc0-test.o
obj-$(CONFIG_ARCH_ASPEED) += irq-aspeed-vic.o irq-aspeed-i2c-ic.o irq-aspeed-scu-ic.o
-obj-$(CONFIG_ARCH_ASPEED) += irq-aspeed-intc.o
obj-$(CONFIG_STM32MP_EXTI) += irq-stm32mp-exti.o
obj-$(CONFIG_STM32_EXTI) += irq-stm32-exti.o
obj-$(CONFIG_QCOM_IRQ_COMBINER) += qcom-irq-combiner.o
@@ -108,7 +109,7 @@ obj-$(CONFIG_RISCV_APLIC_MSI) += irq-riscv-aplic-msi.o
obj-$(CONFIG_RISCV_IMSIC) += irq-riscv-imsic-state.o irq-riscv-imsic-early.o irq-riscv-imsic-platform.o
obj-$(CONFIG_RISCV_RPMI_SYSMSI) += irq-riscv-rpmi-sysmsi.o
obj-$(CONFIG_SIFIVE_PLIC) += irq-sifive-plic.o
-obj-$(CONFIG_STARFIVE_JH8100_INTC) += irq-starfive-jh8100-intc.o
+obj-$(CONFIG_STARFIVE_JHB100_INTC) += irq-starfive-jhb100-intc.o
obj-$(CONFIG_ACLINT_SSWI) += irq-aclint-sswi.o
obj-$(CONFIG_IMX_IRQSTEER) += irq-imx-irqsteer.o
obj-$(CONFIG_IMX_INTMUX) += irq-imx-intmux.o
@@ -119,7 +120,7 @@ obj-$(CONFIG_LS1X_IRQ) += irq-ls1x.o
obj-$(CONFIG_TI_SCI_INTR_IRQCHIP) += irq-ti-sci-intr.o
obj-$(CONFIG_TI_SCI_INTA_IRQCHIP) += irq-ti-sci-inta.o
obj-$(CONFIG_TI_PRUSS_INTC) += irq-pruss-intc.o
-obj-$(CONFIG_IRQ_LOONGARCH_CPU) += irq-loongarch-cpu.o irq-loongarch-avec.o
+obj-$(CONFIG_IRQ_LOONGARCH_CPU) += irq-loongarch-cpu.o irq-loongarch-avec.o irq-loongarch-ir.o
obj-$(CONFIG_LOONGSON_LIOINTC) += irq-loongson-liointc.o
obj-$(CONFIG_LOONGSON_EIOINTC) += irq-loongson-eiointc.o
obj-$(CONFIG_LOONGSON_HTPIC) += irq-loongson-htpic.o
diff --git a/drivers/irqchip/exynos-combiner.c b/drivers/irqchip/exynos-combiner.c
index 03cafcc5c835..d9d408cb4711 100644
--- a/drivers/irqchip/exynos-combiner.c
+++ b/drivers/irqchip/exynos-combiner.c
@@ -24,8 +24,6 @@
#define IRQ_IN_COMBINER 8
-static DEFINE_RAW_SPINLOCK(irq_controller_lock);
-
struct combiner_chip_data {
unsigned int hwirq_offset;
unsigned int irq_mask;
@@ -72,9 +70,7 @@ static void combiner_handle_cascade_irq(struct irq_desc *desc)
chained_irq_enter(chip, desc);
- raw_spin_lock(&irq_controller_lock);
status = readl_relaxed(chip_data->base + COMBINER_INT_STATUS);
- raw_spin_unlock(&irq_controller_lock);
status &= chip_data->irq_mask;
if (status == 0)
diff --git a/drivers/irqchip/irq-aspeed-intc.c b/drivers/irqchip/irq-aspeed-intc.c
deleted file mode 100644
index 4fb0dd8349da..000000000000
--- a/drivers/irqchip/irq-aspeed-intc.c
+++ /dev/null
@@ -1,139 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Aspeed Interrupt Controller.
- *
- * Copyright (C) 2023 ASPEED Technology Inc.
- */
-
-#include <linux/bitops.h>
-#include <linux/irq.h>
-#include <linux/irqchip.h>
-#include <linux/irqchip/chained_irq.h>
-#include <linux/irqdomain.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/io.h>
-#include <linux/spinlock.h>
-
-#define INTC_INT_ENABLE_REG 0x00
-#define INTC_INT_STATUS_REG 0x04
-#define INTC_IRQS_PER_WORD 32
-
-struct aspeed_intc_ic {
- void __iomem *base;
- raw_spinlock_t gic_lock;
- raw_spinlock_t intc_lock;
- struct irq_domain *irq_domain;
-};
-
-static void aspeed_intc_ic_irq_handler(struct irq_desc *desc)
-{
- struct aspeed_intc_ic *intc_ic = irq_desc_get_handler_data(desc);
- struct irq_chip *chip = irq_desc_get_chip(desc);
-
- chained_irq_enter(chip, desc);
-
- scoped_guard(raw_spinlock, &intc_ic->gic_lock) {
- unsigned long bit, status;
-
- status = readl(intc_ic->base + INTC_INT_STATUS_REG);
- for_each_set_bit(bit, &status, INTC_IRQS_PER_WORD) {
- generic_handle_domain_irq(intc_ic->irq_domain, bit);
- writel(BIT(bit), intc_ic->base + INTC_INT_STATUS_REG);
- }
- }
-
- chained_irq_exit(chip, desc);
-}
-
-static void aspeed_intc_irq_mask(struct irq_data *data)
-{
- struct aspeed_intc_ic *intc_ic = irq_data_get_irq_chip_data(data);
- unsigned int mask = readl(intc_ic->base + INTC_INT_ENABLE_REG) & ~BIT(data->hwirq);
-
- guard(raw_spinlock)(&intc_ic->intc_lock);
- writel(mask, intc_ic->base + INTC_INT_ENABLE_REG);
-}
-
-static void aspeed_intc_irq_unmask(struct irq_data *data)
-{
- struct aspeed_intc_ic *intc_ic = irq_data_get_irq_chip_data(data);
- unsigned int unmask = readl(intc_ic->base + INTC_INT_ENABLE_REG) | BIT(data->hwirq);
-
- guard(raw_spinlock)(&intc_ic->intc_lock);
- writel(unmask, intc_ic->base + INTC_INT_ENABLE_REG);
-}
-
-static struct irq_chip aspeed_intc_chip = {
- .name = "ASPEED INTC",
- .irq_mask = aspeed_intc_irq_mask,
- .irq_unmask = aspeed_intc_irq_unmask,
-};
-
-static int aspeed_intc_ic_map_irq_domain(struct irq_domain *domain, unsigned int irq,
- irq_hw_number_t hwirq)
-{
- irq_set_chip_and_handler(irq, &aspeed_intc_chip, handle_level_irq);
- irq_set_chip_data(irq, domain->host_data);
-
- return 0;
-}
-
-static const struct irq_domain_ops aspeed_intc_ic_irq_domain_ops = {
- .map = aspeed_intc_ic_map_irq_domain,
-};
-
-static int __init aspeed_intc_ic_of_init(struct device_node *node,
- struct device_node *parent)
-{
- struct aspeed_intc_ic *intc_ic;
- int irq, i, ret = 0;
-
- intc_ic = kzalloc_obj(*intc_ic);
- if (!intc_ic)
- return -ENOMEM;
-
- intc_ic->base = of_iomap(node, 0);
- if (!intc_ic->base) {
- pr_err("Failed to iomap intc_ic base\n");
- ret = -ENOMEM;
- goto err_free_ic;
- }
- writel(0xffffffff, intc_ic->base + INTC_INT_STATUS_REG);
- writel(0x0, intc_ic->base + INTC_INT_ENABLE_REG);
-
- intc_ic->irq_domain = irq_domain_create_linear(of_fwnode_handle(node), INTC_IRQS_PER_WORD,
- &aspeed_intc_ic_irq_domain_ops, intc_ic);
- if (!intc_ic->irq_domain) {
- ret = -ENOMEM;
- goto err_iounmap;
- }
-
- raw_spin_lock_init(&intc_ic->gic_lock);
- raw_spin_lock_init(&intc_ic->intc_lock);
-
- /* Check all the irq numbers valid. If not, unmaps all the base and frees the data. */
- for (i = 0; i < of_irq_count(node); i++) {
- irq = irq_of_parse_and_map(node, i);
- if (!irq) {
- pr_err("Failed to get irq number\n");
- ret = -EINVAL;
- goto err_iounmap;
- }
- }
-
- for (i = 0; i < of_irq_count(node); i++) {
- irq = irq_of_parse_and_map(node, i);
- irq_set_chained_handler_and_data(irq, aspeed_intc_ic_irq_handler, intc_ic);
- }
-
- return 0;
-
-err_iounmap:
- iounmap(intc_ic->base);
-err_free_ic:
- kfree(intc_ic);
- return ret;
-}
-
-IRQCHIP_DECLARE(ast2700_intc_ic, "aspeed,ast2700-intc-ic", aspeed_intc_ic_of_init);
diff --git a/drivers/irqchip/irq-ast2700-intc0-test.c b/drivers/irqchip/irq-ast2700-intc0-test.c
new file mode 100644
index 000000000000..d49784509ac7
--- /dev/null
+++ b/drivers/irqchip/irq-ast2700-intc0-test.c
@@ -0,0 +1,473 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2026 Code Construct
+ */
+#include <kunit/test.h>
+
+#include "irq-ast2700.h"
+
+static void aspeed_intc0_resolve_route_bad_args(struct kunit *test)
+{
+ static const struct aspeed_intc_interrupt_range c1ranges[] = { 0 };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ const struct irq_domain c0domain = { 0 };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(NULL, 0, c1outs, 0, c1ranges, NULL);
+ KUNIT_EXPECT_EQ(test, rc, -EINVAL);
+
+ rc = aspeed_intc0_resolve_route(&c0domain, 0, c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_EQ(test, rc, -ENOENT);
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ 0, c1ranges, &resolved);
+ KUNIT_EXPECT_EQ(test, rc, -ENOENT);
+}
+
+static int gicv3_fwnode_read_string_array(const struct fwnode_handle *fwnode,
+ const char *propname, const char **val, size_t nval)
+{
+ if (!propname)
+ return -EINVAL;
+
+ if (!val)
+ return 1;
+
+ if (WARN_ON(nval != 1))
+ return -EOVERFLOW;
+
+ *val = "arm,gic-v3";
+ return 1;
+}
+
+static const struct fwnode_operations arm_gicv3_fwnode_ops = {
+ .property_read_string_array = gicv3_fwnode_read_string_array,
+};
+
+static void aspeed_intc_resolve_route_invalid_c0domain(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &arm_gicv3_fwnode_ops },
+ };
+ const struct irq_domain c0domain = { .fwnode = &intc0_node.fwnode };
+ static const struct aspeed_intc_interrupt_range c1ranges[] = { 0 };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_NE(test, rc, 0);
+}
+
+static int
+aspeed_intc0_fwnode_read_string_array(const struct fwnode_handle *fwnode_handle,
+ const char *propname, const char **val,
+ size_t nval)
+{
+ if (!propname)
+ return -EINVAL;
+
+ if (!val)
+ return 1;
+
+ if (WARN_ON(nval != 1))
+ return -EOVERFLOW;
+
+ *val = "aspeed,ast2700-intc0";
+ return nval;
+}
+
+static const struct fwnode_operations intc0_fwnode_ops = {
+ .property_read_string_array = aspeed_intc0_fwnode_read_string_array,
+};
+
+static void
+aspeed_intc0_resolve_route_c1i1o1c0i1o1_connected(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 128 }
+ }
+ }
+ };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 128,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = { 0 },
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = { .ranges = intc0_ranges, .nranges = ARRAY_SIZE(intc0_ranges), }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_EQ(test, rc, 0);
+ KUNIT_EXPECT_EQ(test, resolved.start, 0);
+ KUNIT_EXPECT_EQ(test, resolved.count, 1);
+ KUNIT_EXPECT_EQ(test, resolved.upstream.param[0], 128);
+}
+
+static void
+aspeed_intc0_resolve_route_c1i1o1c0i1o1_disconnected(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 128 }
+ }
+ }
+ };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 129,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = { 0 },
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = {
+ .ranges = intc0_ranges,
+ .nranges = ARRAY_SIZE(intc0_ranges),
+ }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_NE(test, rc, 0);
+}
+
+static void aspeed_intc0_resolve_route_c1i1o1mc0i1o1(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 480 }
+ }
+ }
+ };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 192,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = { 0 },
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = {
+ .ranges = intc0_ranges,
+ .nranges = ARRAY_SIZE(intc0_ranges),
+ }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_EQ(test, rc, 0);
+ KUNIT_EXPECT_EQ(test, resolved.start, 0);
+ KUNIT_EXPECT_EQ(test, resolved.count, 1);
+ KUNIT_EXPECT_EQ(test, resolved.upstream.param[0], 480);
+}
+
+static void aspeed_intc0_resolve_route_c1i2o2mc0i1o1(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 480 }
+ }
+ },
+ {
+ .start = 1,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 510 }
+ }
+ }
+ };
+ static const u32 c1outs[] = { 1 };
+ struct aspeed_intc_interrupt_range resolved;
+ static struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 208,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = { 0 },
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = {
+ .ranges = intc0_ranges,
+ .nranges = ARRAY_SIZE(intc0_ranges),
+ }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_EQ(test, rc, 0);
+ KUNIT_EXPECT_EQ(test, resolved.start, 1);
+ KUNIT_EXPECT_EQ(test, resolved.count, 1);
+ KUNIT_EXPECT_EQ(test, resolved.upstream.param[0], 510);
+}
+
+static void aspeed_intc0_resolve_route_c1i1o1mc0i2o1(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 510 }
+ }
+ },
+ };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ static struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 192,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = {0},
+ }
+ },
+ {
+ .start = 208,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = {0},
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = {
+ .ranges = intc0_ranges,
+ .nranges = ARRAY_SIZE(intc0_ranges),
+ }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_EQ(test, rc, 0);
+ KUNIT_EXPECT_EQ(test, resolved.start, 0);
+ KUNIT_EXPECT_EQ(test, resolved.count, 1);
+ KUNIT_EXPECT_EQ(test, resolved.upstream.param[0], 510);
+}
+
+static void aspeed_intc0_resolve_route_c1i1o2mc0i1o1_invalid(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 1,
+ .param = { 480 }
+ }
+ }
+ };
+ static const u32 c1outs[] = {
+ AST2700_INTC_INVALID_ROUTE, 0
+ };
+ struct aspeed_intc_interrupt_range resolved;
+ struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 192,
+ .count = 1,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = { 0 },
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = {
+ .ranges = intc0_ranges,
+ .nranges = ARRAY_SIZE(intc0_ranges),
+ }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_EQ(test, rc, 1);
+ KUNIT_EXPECT_EQ(test, resolved.start, 0);
+ KUNIT_EXPECT_EQ(test, resolved.count, 1);
+ KUNIT_EXPECT_EQ(test, resolved.upstream.param[0], 480);
+}
+
+static void
+aspeed_intc0_resolve_route_c1i1o1mc0i1o1_bad_range_upstream(struct kunit *test)
+{
+ struct device_node intc0_node = {
+ .fwnode = { .ops = &intc0_fwnode_ops },
+ };
+ struct aspeed_intc_interrupt_range c1ranges[] = {
+ {
+ .start = 0,
+ .count = 1,
+ .upstream = {
+ .fwnode = &intc0_node.fwnode,
+ .param_count = 0,
+ .param = { 0 }
+ }
+ }
+ };
+ static const u32 c1outs[] = { 0 };
+ struct aspeed_intc_interrupt_range resolved;
+ struct aspeed_intc_interrupt_range intc0_ranges[] = {
+ {
+ .start = 0,
+ .count = 0,
+ .upstream = {
+ .fwnode = NULL,
+ .param_count = 0,
+ .param = { 0 },
+ }
+ }
+ };
+ struct aspeed_intc0 intc0 = {
+ .ranges = {
+ .ranges = intc0_ranges,
+ .nranges = ARRAY_SIZE(intc0_ranges),
+ }
+ };
+ const struct irq_domain c0domain = {
+ .host_data = &intc0,
+ .fwnode = &intc0_node.fwnode
+ };
+ int rc;
+
+ rc = aspeed_intc0_resolve_route(&c0domain, ARRAY_SIZE(c1outs), c1outs,
+ ARRAY_SIZE(c1ranges), c1ranges,
+ &resolved);
+ KUNIT_EXPECT_NE(test, rc, 0);
+}
+
+static struct kunit_case ast2700_intc0_test_cases[] = {
+ KUNIT_CASE(aspeed_intc0_resolve_route_bad_args),
+ KUNIT_CASE(aspeed_intc_resolve_route_invalid_c0domain),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i1o1c0i1o1_connected),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i1o1c0i1o1_disconnected),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i1o1mc0i1o1),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i2o2mc0i1o1),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i1o1mc0i2o1),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i1o2mc0i1o1_invalid),
+ KUNIT_CASE(aspeed_intc0_resolve_route_c1i1o1mc0i1o1_bad_range_upstream),
+ {},
+};
+
+static struct kunit_suite ast2700_intc0_test_suite = {
+ .name = "ast2700-intc0",
+ .test_cases = ast2700_intc0_test_cases,
+};
+
+kunit_test_suite(ast2700_intc0_test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/drivers/irqchip/irq-ast2700-intc0.c b/drivers/irqchip/irq-ast2700-intc0.c
new file mode 100644
index 000000000000..14b8b88f1179
--- /dev/null
+++ b/drivers/irqchip/irq-ast2700-intc0.c
@@ -0,0 +1,582 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Aspeed AST2700 Interrupt Controller.
+ *
+ * Copyright (C) 2026 ASPEED Technology Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/fwnode.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqdomain.h>
+#include <linux/kconfig.h>
+#include <linux/of.h>
+#include <linux/of_irq.h>
+#include <linux/overflow.h>
+#include <linux/property.h>
+#include <linux/spinlock.h>
+
+#include "irq-ast2700.h"
+
+#define INT_NUM 480
+#define INTM_NUM 50
+#define SWINT_NUM 16
+
+#define INTM_BASE (INT_NUM)
+#define SWINT_BASE (INT_NUM + INTM_NUM)
+#define INT0_NUM (INT_NUM + INTM_NUM + SWINT_NUM)
+
+#define INTC0_IN_NUM 480
+#define INTC0_ROUTE_NUM 5
+#define INTC0_INTM_NUM 50
+#define INTC0_ROUTE_BITS 3
+
+#define GIC_P2P_SPI_END 128
+#define INTC0_SWINT_OUT_BASE 144
+
+#define INTC0_SWINT_IER 0x10
+#define INTC0_SWINT_ISR 0x14
+#define INTC0_INTBANKX_IER 0x1000
+#define INTC0_INTBANK_SIZE 0x100
+#define INTC0_INTBANK_GROUPS 11
+#define INTC0_INTBANKS_PER_GRP 3
+#define INTC0_INTMX_IER 0x1b00
+#define INTC0_INTMX_ISR 0x1b04
+#define INTC0_INTMX_BANK_SIZE 0x10
+#define INTC0_INTM_BANK_NUM 3
+#define INTC0_IRQS_PER_BANK 32
+#define INTM_IRQS_PER_BANK 10
+#define INTC0_SEL_BASE 0x200
+#define INTC0_SEL_BANK_SIZE 0x4
+#define INTC0_SEL_ROUTE_SIZE 0x100
+
+static void aspeed_swint_irq_mask(struct irq_data *data)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ int bit = data->hwirq - SWINT_BASE;
+ u32 ier;
+
+ guard(raw_spinlock)(&intc0->intc_lock);
+ ier = readl(intc0->base + INTC0_SWINT_IER) & ~BIT(bit);
+ writel(ier, intc0->base + INTC0_SWINT_IER);
+ irq_chip_mask_parent(data);
+}
+
+static void aspeed_swint_irq_unmask(struct irq_data *data)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ int bit = data->hwirq - SWINT_BASE;
+ u32 ier;
+
+ guard(raw_spinlock)(&intc0->intc_lock);
+ ier = readl(intc0->base + INTC0_SWINT_IER) | BIT(bit);
+ writel(ier, intc0->base + INTC0_SWINT_IER);
+ irq_chip_unmask_parent(data);
+}
+
+static void aspeed_swint_irq_eoi(struct irq_data *data)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ int bit = data->hwirq - SWINT_BASE;
+
+ writel(BIT(bit), intc0->base + INTC0_SWINT_ISR);
+ irq_chip_eoi_parent(data);
+}
+
+static struct irq_chip aspeed_swint_chip = {
+ .name = "ast2700-swint",
+ .irq_eoi = aspeed_swint_irq_eoi,
+ .irq_mask = aspeed_swint_irq_mask,
+ .irq_unmask = aspeed_swint_irq_unmask,
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+ .flags = IRQCHIP_SET_TYPE_MASKED,
+};
+
+static void aspeed_intc0_irq_mask(struct irq_data *data)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ int bank = (data->hwirq - INTM_BASE) / INTM_IRQS_PER_BANK;
+ int bit = (data->hwirq - INTM_BASE) % INTM_IRQS_PER_BANK;
+ u32 ier;
+
+ guard(raw_spinlock)(&intc0->intc_lock);
+ ier = readl(intc0->base + INTC0_INTMX_IER + bank * INTC0_INTMX_BANK_SIZE) & ~BIT(bit);
+ writel(ier, intc0->base + INTC0_INTMX_IER + bank * INTC0_INTMX_BANK_SIZE);
+ irq_chip_mask_parent(data);
+}
+
+static void aspeed_intc0_irq_unmask(struct irq_data *data)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ int bank = (data->hwirq - INTM_BASE) / INTM_IRQS_PER_BANK;
+ int bit = (data->hwirq - INTM_BASE) % INTM_IRQS_PER_BANK;
+ u32 ier;
+
+ guard(raw_spinlock)(&intc0->intc_lock);
+ ier = readl(intc0->base + INTC0_INTMX_IER + bank * INTC0_INTMX_BANK_SIZE) | BIT(bit);
+ writel(ier, intc0->base + INTC0_INTMX_IER + bank * INTC0_INTMX_BANK_SIZE);
+ irq_chip_unmask_parent(data);
+}
+
+static void aspeed_intc0_irq_eoi(struct irq_data *data)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ int bank = (data->hwirq - INTM_BASE) / INTM_IRQS_PER_BANK;
+ int bit = (data->hwirq - INTM_BASE) % INTM_IRQS_PER_BANK;
+
+ writel(BIT(bit), intc0->base + INTC0_INTMX_ISR + bank * INTC0_INTMX_BANK_SIZE);
+ irq_chip_eoi_parent(data);
+}
+
+static struct irq_chip aspeed_intm_chip = {
+ .name = "ast2700-intmerge",
+ .irq_eoi = aspeed_intc0_irq_eoi,
+ .irq_mask = aspeed_intc0_irq_mask,
+ .irq_unmask = aspeed_intc0_irq_unmask,
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+ .flags = IRQCHIP_SET_TYPE_MASKED,
+};
+
+static struct irq_chip linear_intr_irq_chip = {
+ .name = "ast2700-int",
+ .irq_eoi = irq_chip_eoi_parent,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+ .flags = IRQCHIP_SET_TYPE_MASKED,
+};
+
+static const u32 aspeed_intc0_routes[INTC0_IN_NUM / INTC0_IRQS_PER_BANK][INTC0_ROUTE_NUM] = {
+ { 0, 256, 426, AST2700_INTC_INVALID_ROUTE, AST2700_INTC_INVALID_ROUTE },
+ { 32, 288, 458, AST2700_INTC_INVALID_ROUTE, AST2700_INTC_INVALID_ROUTE },
+ { 64, 320, 490, AST2700_INTC_INVALID_ROUTE, AST2700_INTC_INVALID_ROUTE },
+ { 96, 352, 522, AST2700_INTC_INVALID_ROUTE, AST2700_INTC_INVALID_ROUTE },
+ { 128, 384, 554, 160, 176 },
+ { 129, 385, 555, 161, 177 },
+ { 130, 386, 556, 162, 178 },
+ { 131, 387, 557, 163, 179 },
+ { 132, 388, 558, 164, 180 },
+ { 133, 544, 714, 165, 181 },
+ { 134, 545, 715, 166, 182 },
+ { 135, 546, 706, 167, 183 },
+ { 136, 547, 707, 168, 184 },
+ { 137, 548, 708, 169, 185 },
+ { 138, 549, 709, 170, 186 },
+};
+
+static const u32 aspeed_intc0_intm_routes[INTC0_INTM_NUM / INTM_IRQS_PER_BANK] = {
+ 192, 416, 586, 208, 224
+};
+
+static int resolve_input_from_child_ranges(const struct aspeed_intc0 *intc0,
+ const struct aspeed_intc_interrupt_range *range,
+ u32 outpin, u32 *input)
+{
+ u32 offset, base;
+
+ if (!in_range32(outpin, range->start, range->count))
+ return -ENOENT;
+
+ if (range->upstream.param_count == 0)
+ return -EINVAL;
+
+ base = range->upstream.param[ASPEED_INTC_RANGES_BASE];
+ offset = outpin - range->start;
+ if (check_add_overflow(base, offset, input)) {
+ dev_warn(intc0->dev, "%s: Arithmetic overflow for input derivation: %u + %u\n",
+ __func__, base, offset);
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static int resolve_parent_range_for_output(const struct aspeed_intc0 *intc0,
+ const struct fwnode_handle *parent, u32 output,
+ struct aspeed_intc_interrupt_range *resolved)
+{
+ for (size_t i = 0; i < intc0->ranges.nranges; i++) {
+ struct aspeed_intc_interrupt_range range = intc0->ranges.ranges[i];
+
+ if (!in_range32(output, range.start, range.count))
+ continue;
+
+ if (range.upstream.fwnode != parent)
+ continue;
+
+ if (resolved) {
+ resolved->start = output;
+ resolved->count = 1;
+ resolved->upstream = range.upstream;
+ resolved->upstream.param[ASPEED_INTC_RANGES_COUNT] +=
+ output - range.start;
+ }
+
+ return 0;
+ }
+
+ return -ENOENT;
+}
+
+static int resolve_parent_route_for_input(const struct aspeed_intc0 *intc0,
+ const struct fwnode_handle *parent, u32 input,
+ struct aspeed_intc_interrupt_range *resolved)
+{
+ int rc = -ENOENT;
+ u32 c0o;
+
+ if (input < INT_NUM) {
+ static_assert(INTC0_ROUTE_NUM < INT_MAX, "Broken cast");
+ for (size_t i = 0; rc == -ENOENT && i < INTC0_ROUTE_NUM; i++) {
+ c0o = aspeed_intc0_routes[input / INTC0_IRQS_PER_BANK][i];
+ if (c0o == AST2700_INTC_INVALID_ROUTE)
+ continue;
+
+ if (input < GIC_P2P_SPI_END)
+ c0o += input % INTC0_IRQS_PER_BANK;
+
+ rc = resolve_parent_range_for_output(intc0, parent, c0o, resolved);
+ if (!rc)
+ return (int)i;
+ }
+ } else if (input < (INT_NUM + INTM_NUM)) {
+ c0o = aspeed_intc0_intm_routes[(input - INT_NUM) / INTM_IRQS_PER_BANK];
+ c0o += ((input - INT_NUM) % INTM_IRQS_PER_BANK);
+ return resolve_parent_range_for_output(intc0, parent, c0o, resolved);
+ } else if (input < (INT_NUM + INTM_NUM + SWINT_NUM)) {
+ c0o = input - SWINT_BASE + INTC0_SWINT_OUT_BASE;
+ return resolve_parent_range_for_output(intc0, parent, c0o, resolved);
+ } else {
+ return -ENOENT;
+ }
+
+ return rc;
+}
+
+/**
+ * aspeed_intc0_resolve_route - Determine the necessary interrupt output at intc1
+ * @c0domain: The pointer to intc0's irq_domain
+ * @nc1outs: The number of valid intc1 outputs available for the input
+ * @c1outs: The array of available intc1 output indices for the input
+ * @nc1ranges: The number of interrupt range entries for intc1
+ * @c1ranges: The array of configured intc1 interrupt ranges
+ * @resolved: The fully resolved range entry after applying the resolution
+ * algorithm
+ *
+ * Returns: The intc1 route index associated with the intc1 output identified in
+ * @resolved on success. Otherwise, a negative errno value.
+ *
+ * The AST2700 interrupt architecture allows any peripheral interrupt source
+ * to be routed to one of up to four processors running in the SoC. A processor
+ * binding a driver for a peripheral that requests an interrupt is (without
+ * further design and effort) the destination for the requested interrupt.
+ *
+ * Routing a peripheral interrupt to its destination processor requires
+ * coordination between INTC0 on the CPU die and one or more INTC1 instances.
+ * At least one INTC1 instance exists in the SoC on the IO-die, however up
+ * to two more instances may be integrated via LTPI (LVDS Tunneling Protocol
+ * & Interface).
+ *
+ * Between the multiple destinations, various route constraints, and the
+ * devicetree binding design, some information that's needed at INTC1 instances
+ * to route inbound interrupts correctly to the destination processor is only
+ * available at INTC0.
+ *
+ * aspeed_intc0_resolve_route() is to be invoked by INTC1 driver instances to
+ * perform the route resolution. The implementation in INTC0 allows INTC0 to
+ * encapsulate the information used to perform route selection, and provides it
+ * with an opportunity to apply policy as part of the selection process. Such
+ * policy may, for instance, choose to de-prioritise some interrupts destined
+ * for the PSP (Primary Service Processor) GIC.
+ */
+int aspeed_intc0_resolve_route(const struct irq_domain *c0domain, size_t nc1outs,
+ const u32 *c1outs, size_t nc1ranges,
+ const struct aspeed_intc_interrupt_range *c1ranges,
+ struct aspeed_intc_interrupt_range *resolved)
+{
+ struct fwnode_handle *parent_fwnode;
+ struct aspeed_intc0 *intc0;
+ int ret;
+
+ if (!c0domain || !resolved)
+ return -EINVAL;
+
+ if (nc1outs > INT_MAX)
+ return -EINVAL;
+
+ if (nc1outs == 0 || nc1ranges == 0)
+ return -ENOENT;
+
+ if (!IS_ENABLED(CONFIG_ASPEED_AST2700_INTC_TEST) &&
+ !fwnode_device_is_compatible(c0domain->fwnode, "aspeed,ast2700-intc0"))
+ return -ENODEV;
+
+ intc0 = c0domain->host_data;
+ if (!intc0)
+ return -EINVAL;
+
+ parent_fwnode = of_fwnode_handle(intc0->parent);
+
+ for (size_t i = 0; i < nc1outs; i++) {
+ u32 c1o = c1outs[i];
+
+ if (c1o == AST2700_INTC_INVALID_ROUTE)
+ continue;
+
+ for (size_t j = 0; j < nc1ranges; j++) {
+ struct aspeed_intc_interrupt_range c1r = c1ranges[j];
+ u32 input;
+
+ /*
+ * Range match for intc1 output pin
+ *
+ * For the purpose of testing, assume a failed match is still a match;
+ * this saves a bunch of mess in the test fixtures
+ */
+ if (!(c0domain == c1r.domain ||
+ IS_ENABLED(CONFIG_ASPEED_AST2700_INTC_TEST)))
+ continue;
+
+ ret = resolve_input_from_child_ranges(intc0, &c1r, c1o, &input);
+ if (ret)
+ continue;
+
+ /*
+ * INTC1 should never request routes for peripheral interrupt sources
+ * directly attached to INTC0.
+ */
+ if (input < GIC_P2P_SPI_END)
+ continue;
+
+ ret = resolve_parent_route_for_input(intc0, parent_fwnode, input, NULL);
+ if (ret < 0)
+ continue;
+
+ /* Route resolution succeeded */
+ resolved->start = c1o;
+ resolved->count = 1;
+ resolved->upstream = c1r.upstream;
+ resolved->upstream.param[ASPEED_INTC_RANGES_BASE] = input;
+ /* Cast protected by prior test against nc1outs */
+ return (int)i;
+ }
+ }
+
+ return -ENOENT;
+}
+
+static int aspeed_intc0_irq_domain_map(struct irq_domain *domain,
+ unsigned int irq, irq_hw_number_t hwirq)
+{
+ if (hwirq < GIC_P2P_SPI_END)
+ irq_set_chip_and_handler(irq, &linear_intr_irq_chip, handle_level_irq);
+ else if (hwirq < INTM_BASE)
+ return -EINVAL;
+ else if (hwirq < SWINT_BASE)
+ irq_set_chip_and_handler(irq, &aspeed_intm_chip, handle_level_irq);
+ else if (hwirq < INT0_NUM)
+ irq_set_chip_and_handler(irq, &aspeed_swint_chip, handle_level_irq);
+ else
+ return -EINVAL;
+
+ irq_set_chip_data(irq, domain->host_data);
+ return 0;
+}
+
+static int aspeed_intc0_irq_domain_translate(struct irq_domain *domain,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ if (fwspec->param_count != 1)
+ return -EINVAL;
+
+ *hwirq = fwspec->param[0];
+ *type = IRQ_TYPE_NONE;
+ return 0;
+}
+
+static int aspeed_intc0_irq_domain_alloc(struct irq_domain *domain,
+ unsigned int virq,
+ unsigned int nr_irqs, void *data)
+{
+ struct aspeed_intc0 *intc0 = domain->host_data;
+ struct aspeed_intc_interrupt_range resolved;
+ struct irq_fwspec *fwspec = data;
+ struct irq_fwspec parent_fwspec;
+ struct irq_chip *chip;
+ unsigned long hwirq;
+ unsigned int type;
+ int ret;
+
+ ret = aspeed_intc0_irq_domain_translate(domain, fwspec, &hwirq, &type);
+ if (ret)
+ return ret;
+
+ if (hwirq >= GIC_P2P_SPI_END && hwirq < INT_NUM)
+ return -EINVAL;
+
+ if (hwirq < INTM_BASE)
+ chip = &linear_intr_irq_chip;
+ else if (hwirq < SWINT_BASE)
+ chip = &aspeed_intm_chip;
+ else
+ chip = &aspeed_swint_chip;
+
+ ret = resolve_parent_route_for_input(intc0, domain->parent->fwnode,
+ (u32)hwirq, &resolved);
+ if (ret)
+ return ret;
+
+ parent_fwspec = resolved.upstream;
+ ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs,
+ &parent_fwspec);
+ if (ret)
+ return ret;
+
+ for (int i = 0; i < nr_irqs; ++i, ++hwirq, ++virq) {
+ ret = irq_domain_set_hwirq_and_chip(domain, virq, hwirq, chip,
+ domain->host_data);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int aspeed_intc0_irq_domain_activate(struct irq_domain *domain,
+ struct irq_data *data, bool reserve)
+{
+ struct aspeed_intc0 *intc0 = irq_data_get_irq_chip_data(data);
+ unsigned long hwirq = data->hwirq;
+ int route, bank, bit;
+ u32 mask;
+
+ if (hwirq >= INT0_NUM)
+ return -EINVAL;
+
+ if (in_range32(hwirq, INTM_BASE, INTM_NUM + SWINT_NUM))
+ return 0;
+
+ bank = hwirq / INTC0_IRQS_PER_BANK;
+ bit = hwirq % INTC0_IRQS_PER_BANK;
+ mask = BIT(bit);
+
+ route = resolve_parent_route_for_input(intc0, intc0->local->parent->fwnode,
+ hwirq, NULL);
+ if (route < 0)
+ return route;
+
+ guard(raw_spinlock)(&intc0->intc_lock);
+ for (int i = 0; i < INTC0_ROUTE_BITS; i++) {
+ void __iomem *sel = intc0->base + INTC0_SEL_BASE +
+ (bank * INTC0_SEL_BANK_SIZE) +
+ (INTC0_SEL_ROUTE_SIZE * i);
+ u32 reg = readl(sel);
+
+ if (route & BIT(i))
+ reg |= mask;
+ else
+ reg &= ~mask;
+
+ writel(reg, sel);
+ if (readl(sel) != reg)
+ return -EACCES;
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aspeed_intc0_irq_domain_ops = {
+ .translate = aspeed_intc0_irq_domain_translate,
+ .activate = aspeed_intc0_irq_domain_activate,
+ .alloc = aspeed_intc0_irq_domain_alloc,
+ .free = irq_domain_free_irqs_common,
+ .map = aspeed_intc0_irq_domain_map,
+};
+
+static void aspeed_intc0_disable_swint(struct aspeed_intc0 *intc0)
+{
+ writel(0, intc0->base + INTC0_SWINT_IER);
+}
+
+static void aspeed_intc0_disable_intbank(struct aspeed_intc0 *intc0)
+{
+ for (int i = 0; i < INTC0_INTBANK_GROUPS; i++) {
+ for (int j = 0; j < INTC0_INTBANKS_PER_GRP; j++) {
+ u32 base = INTC0_INTBANKX_IER +
+ (INTC0_INTBANK_SIZE * i) +
+ (INTC0_INTMX_BANK_SIZE * j);
+
+ writel(0, intc0->base + base);
+ }
+ }
+}
+
+static void aspeed_intc0_disable_intm(struct aspeed_intc0 *intc0)
+{
+ for (int i = 0; i < INTC0_INTM_BANK_NUM; i++)
+ writel(0, intc0->base + INTC0_INTMX_IER + (INTC0_INTMX_BANK_SIZE * i));
+}
+
+static int aspeed_intc0_probe(struct platform_device *pdev,
+ struct device_node *parent)
+{
+ struct device_node *node = pdev->dev.of_node;
+ struct irq_domain *parent_domain;
+ struct aspeed_intc0 *intc0;
+ int ret;
+
+ if (!parent) {
+ pr_err("missing parent interrupt node\n");
+ return -ENODEV;
+ }
+
+ intc0 = devm_kzalloc(&pdev->dev, sizeof(*intc0), GFP_KERNEL);
+ if (!intc0)
+ return -ENOMEM;
+
+ intc0->dev = &pdev->dev;
+ intc0->parent = parent;
+ intc0->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(intc0->base))
+ return PTR_ERR(intc0->base);
+
+ aspeed_intc0_disable_swint(intc0);
+ aspeed_intc0_disable_intbank(intc0);
+ aspeed_intc0_disable_intm(intc0);
+
+ raw_spin_lock_init(&intc0->intc_lock);
+
+ parent_domain = irq_find_host(parent);
+ if (!parent_domain) {
+ pr_err("unable to obtain parent domain\n");
+ return -ENODEV;
+ }
+
+ if (!of_device_is_compatible(parent, "arm,gic-v3"))
+ return -ENODEV;
+
+ intc0->local = irq_domain_create_hierarchy(parent_domain, 0, INT0_NUM,
+ of_fwnode_handle(node),
+ &aspeed_intc0_irq_domain_ops,
+ intc0);
+ if (!intc0->local)
+ return -ENOMEM;
+
+ ret = aspeed_intc_populate_ranges(&pdev->dev, &intc0->ranges);
+ if (ret < 0) {
+ irq_domain_remove(intc0->local);
+ return ret;
+ }
+
+ return 0;
+}
+
+IRQCHIP_PLATFORM_DRIVER_BEGIN(ast2700_intc0)
+IRQCHIP_MATCH("aspeed,ast2700-intc0", aspeed_intc0_probe)
+IRQCHIP_PLATFORM_DRIVER_END(ast2700_intc0)
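
The output-to-input translation done by resolve_input_from_child_ranges() in the intc0 driver above (range check, then base plus offset with an overflow check) can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: struct range, in_range_u32() and resolve_input() are hypothetical stand-ins for the driver's types and the kernel's in_range32()/check_add_overflow() helpers, and __builtin_add_overflow() is a GCC/Clang builtin.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's range descriptor. */
struct range {
	uint32_t start;	/* first output pin covered by the range */
	uint32_t count;	/* number of pins in the range */
	uint32_t base;	/* upstream input that 'start' maps to */
};

static bool in_range_u32(uint32_t val, uint32_t start, uint32_t len)
{
	/* Unsigned subtraction wraps, so this also rejects val < start. */
	return (uint32_t)(val - start) < len;
}

/* Translate an output pin into the upstream input it is wired to. */
static int resolve_input(const struct range *r, uint32_t outpin, uint32_t *input)
{
	assert(r && input);

	if (!in_range_u32(outpin, r->start, r->count))
		return -ENOENT;

	/* The builtin returns true when base + offset does not fit in 32 bits. */
	if (__builtin_add_overflow(r->base, outpin - r->start, input))
		return -EINVAL;

	return 0;
}
```

With a range of { .start = 10, .count = 4, .base = 480 }, pin 12 resolves to input 482, while pins 9 and 14 fall outside the range and yield -ENOENT.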
diff --git a/drivers/irqchip/irq-ast2700-intc1.c b/drivers/irqchip/irq-ast2700-intc1.c
new file mode 100644
index 000000000000..59e8f0d5ddcd
--- /dev/null
+++ b/drivers/irqchip/irq-ast2700-intc1.c
@@ -0,0 +1,280 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Aspeed AST2700 Interrupt Controller.
+ *
+ * Copyright (C) 2026 ASPEED Technology Inc.
+ */
+
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqdomain.h>
+#include <linux/of.h>
+#include <linux/of_irq.h>
+#include <linux/spinlock.h>
+
+#include "irq-ast2700.h"
+
+#define INTC1_IER 0x100
+#define INTC1_ISR 0x104
+#define INTC1_BANK_SIZE 0x10
+#define INTC1_SEL_BASE 0x80
+#define INTC1_SEL_BANK_SIZE 0x4
+#define INTC1_SEL_ROUTE_SIZE 0x20
+#define INTC1_IRQS_PER_BANK 32
+#define INTC1_BANK_NUM 6
+#define INTC1_ROUTE_NUM 7
+#define INTC1_IN_NUM 192
+#define INTC1_BOOTMCU_ROUTE 6
+#define INTC1_ROUTE_SELECTOR_BITS 3
+#define INTC1_ROUTE_IRQS_PER_GROUP 32
+#define INTC1_ROUTE_SHIFT 5
+
+struct aspeed_intc1 {
+ struct device *dev;
+ void __iomem *base;
+ raw_spinlock_t intc_lock;
+ struct irq_domain *local;
+ struct irq_domain *upstream;
+ struct aspeed_intc_interrupt_ranges ranges;
+};
+
+static void aspeed_intc1_disable_int(struct aspeed_intc1 *intc1)
+{
+ for (int i = 0; i < INTC1_BANK_NUM; i++)
+ writel(0, intc1->base + INTC1_IER + (INTC1_BANK_SIZE * i));
+}
+
+static void aspeed_intc1_irq_handler(struct irq_desc *desc)
+{
+ struct aspeed_intc1 *intc1 = irq_desc_get_handler_data(desc);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ unsigned long bit, status;
+
+ chained_irq_enter(chip, desc);
+
+ for (int bank = 0; bank < INTC1_BANK_NUM; bank++) {
+ status = readl(intc1->base + INTC1_ISR + (INTC1_BANK_SIZE * bank));
+ if (!status)
+ continue;
+
+ for_each_set_bit(bit, &status, INTC1_IRQS_PER_BANK) {
+ generic_handle_domain_irq(intc1->local, (bank * INTC1_IRQS_PER_BANK) + bit);
+ writel(BIT(bit), intc1->base + INTC1_ISR + (INTC1_BANK_SIZE * bank));
+ }
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static void aspeed_intc1_irq_mask(struct irq_data *data)
+{
+ struct aspeed_intc1 *intc1 = irq_data_get_irq_chip_data(data);
+ int bank = data->hwirq / INTC1_IRQS_PER_BANK;
+ int bit = data->hwirq % INTC1_IRQS_PER_BANK;
+ u32 ier;
+
+ guard(raw_spinlock)(&intc1->intc_lock);
+ ier = readl(intc1->base + INTC1_IER + (INTC1_BANK_SIZE * bank)) & ~BIT(bit);
+ writel(ier, intc1->base + INTC1_IER + (INTC1_BANK_SIZE * bank));
+}
+
+static void aspeed_intc1_irq_unmask(struct irq_data *data)
+{
+ struct aspeed_intc1 *intc1 = irq_data_get_irq_chip_data(data);
+ int bank = data->hwirq / INTC1_IRQS_PER_BANK;
+ int bit = data->hwirq % INTC1_IRQS_PER_BANK;
+ u32 ier;
+
+ guard(raw_spinlock)(&intc1->intc_lock);
+ ier = readl(intc1->base + INTC1_IER + (INTC1_BANK_SIZE * bank)) | BIT(bit);
+ writel(ier, intc1->base + INTC1_IER + (INTC1_BANK_SIZE * bank));
+}
+
+static struct irq_chip aspeed_intc_chip = {
+ .name = "ASPEED INTC1",
+ .irq_mask = aspeed_intc1_irq_mask,
+ .irq_unmask = aspeed_intc1_irq_unmask,
+};
+
+static int aspeed_intc1_irq_domain_translate(struct irq_domain *domain,
+ struct irq_fwspec *fwspec,
+ unsigned long *hwirq,
+ unsigned int *type)
+{
+ if (fwspec->param_count != 1)
+ return -EINVAL;
+
+ *hwirq = fwspec->param[0];
+ *type = IRQ_TYPE_LEVEL_HIGH;
+ return 0;
+}
+
+static int aspeed_intc1_map_irq_domain(struct irq_domain *domain,
+ unsigned int irq,
+ irq_hw_number_t hwirq)
+{
+ irq_domain_set_info(domain, irq, hwirq, &aspeed_intc_chip,
+ domain->host_data, handle_level_irq, NULL, NULL);
+ return 0;
+}
+
+/*
+ * In-bound interrupts are progressively merged into one out-bound interrupt in
+ * groups of 32. Apply this fact to compress the route table in corresponding
+ * groups of 32.
+ */
+static const u32
+aspeed_intc1_routes[INTC1_IN_NUM / INTC1_ROUTE_IRQS_PER_GROUP][INTC1_ROUTE_NUM] = {
+ { 0, AST2700_INTC_INVALID_ROUTE, 10, 20, 30, 40, 50 },
+ { 1, AST2700_INTC_INVALID_ROUTE, 11, 21, 31, 41, 50 },
+ { 2, AST2700_INTC_INVALID_ROUTE, 12, 22, 32, 42, 50 },
+ { 3, AST2700_INTC_INVALID_ROUTE, 13, 23, 33, 43, 50 },
+ { 4, AST2700_INTC_INVALID_ROUTE, 14, 24, 34, 44, 50 },
+ { 5, AST2700_INTC_INVALID_ROUTE, 15, 25, 35, 45, 50 },
+};
+
+static int aspeed_intc1_irq_domain_activate(struct irq_domain *domain,
+ struct irq_data *data, bool reserve)
+{
+ struct aspeed_intc1 *intc1 = irq_data_get_irq_chip_data(data);
+ struct aspeed_intc_interrupt_range resolved;
+ int rc, bank, bit;
+ u32 mask;
+
+ if (WARN_ON_ONCE((data->hwirq >> INTC1_ROUTE_SHIFT) >= ARRAY_SIZE(aspeed_intc1_routes)))
+ return -EINVAL;
+
+ /*
+ * Route resolution may fail if the upstream is the BootMCU APLIC node, or
+ * anything except a valid intc0 driver instance
+ */
+ rc = aspeed_intc0_resolve_route(intc1->upstream, INTC1_ROUTE_NUM,
+ aspeed_intc1_routes[data->hwirq >> INTC1_ROUTE_SHIFT],
+ intc1->ranges.nranges,
+ intc1->ranges.ranges, &resolved);
+ if (rc < 0) {
+ if (!fwnode_device_is_compatible(intc1->upstream->fwnode, "riscv,aplic")) {
+ dev_warn(intc1->dev,
+ "Failed to resolve interrupt route for hwirq %lu in domain %s\n",
+ data->hwirq, domain->name);
+ return rc;
+ }
+ rc = INTC1_BOOTMCU_ROUTE;
+ }
+
+ bank = data->hwirq / INTC1_IRQS_PER_BANK;
+ bit = data->hwirq % INTC1_IRQS_PER_BANK;
+ mask = BIT(bit);
+
+ guard(raw_spinlock)(&intc1->intc_lock);
+ for (int i = 0; i < INTC1_ROUTE_SELECTOR_BITS; i++) {
+ void __iomem *sel = intc1->base + INTC1_SEL_BASE +
+ (bank * INTC1_SEL_BANK_SIZE) +
+ (INTC1_SEL_ROUTE_SIZE * i);
+ u32 reg = readl(sel);
+
+ if (rc & BIT(i))
+ reg |= mask;
+ else
+ reg &= ~mask;
+
+ writel(reg, sel);
+ if (readl(sel) != reg)
+ return -EACCES;
+ }
+
+ return 0;
+}
+
+static const struct irq_domain_ops aspeed_intc1_irq_domain_ops = {
+ .map = aspeed_intc1_map_irq_domain,
+ .translate = aspeed_intc1_irq_domain_translate,
+ .activate = aspeed_intc1_irq_domain_activate,
+};
+
+static void aspeed_intc1_request_interrupts(struct aspeed_intc1 *intc1)
+{
+ for (unsigned int i = 0; i < intc1->ranges.nranges; i++) {
+ struct aspeed_intc_interrupt_range *r =
+ &intc1->ranges.ranges[i];
+
+ if (intc1->upstream != r->domain)
+ continue;
+
+ for (u32 k = 0; k < r->count; k++) {
+ struct of_phandle_args parent_irq;
+ int irq;
+
+ parent_irq.np = to_of_node(r->upstream.fwnode);
+ parent_irq.args_count = 1;
+ parent_irq.args[0] =
+ intc1->ranges.ranges[i].upstream.param[ASPEED_INTC_RANGES_BASE] + k;
+
+ irq = irq_create_of_mapping(&parent_irq);
+ if (!irq)
+ continue;
+
+ irq_set_chained_handler_and_data(irq,
+ aspeed_intc1_irq_handler, intc1);
+ }
+ }
+}
+
+static int aspeed_intc1_probe(struct platform_device *pdev,
+ struct device_node *parent)
+{
+ struct device_node *node = pdev->dev.of_node;
+ struct aspeed_intc1 *intc1;
+ struct irq_domain *host;
+ int ret;
+
+ if (!parent) {
+ dev_err(&pdev->dev, "missing parent interrupt node\n");
+ return -ENODEV;
+ }
+
+ if (!of_device_is_compatible(parent, "aspeed,ast2700-intc0"))
+ return -ENODEV;
+
+ host = irq_find_host(parent);
+ if (!host)
+ return -ENODEV;
+
+ intc1 = devm_kzalloc(&pdev->dev, sizeof(*intc1), GFP_KERNEL);
+ if (!intc1)
+ return -ENOMEM;
+
+ intc1->dev = &pdev->dev;
+ intc1->upstream = host;
+ intc1->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(intc1->base))
+ return PTR_ERR(intc1->base);
+
+ aspeed_intc1_disable_int(intc1);
+
+ raw_spin_lock_init(&intc1->intc_lock);
+
+ intc1->local = irq_domain_create_linear(of_fwnode_handle(node),
+ INTC1_BANK_NUM * INTC1_IRQS_PER_BANK,
+ &aspeed_intc1_irq_domain_ops, intc1);
+ if (!intc1->local)
+ return -ENOMEM;
+
+ ret = aspeed_intc_populate_ranges(&pdev->dev, &intc1->ranges);
+ if (ret < 0) {
+ irq_domain_remove(intc1->local);
+ return ret;
+ }
+
+ aspeed_intc1_request_interrupts(intc1);
+
+ return 0;
+}
+
+IRQCHIP_PLATFORM_DRIVER_BEGIN(ast2700_intc1)
+IRQCHIP_MATCH("aspeed,ast2700-intc1", aspeed_intc1_probe)
+IRQCHIP_PLATFORM_DRIVER_END(ast2700_intc1)
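
Both activate callbacks above program a route by spreading the bits of the route index over several selector registers, one register per index bit, with each interrupt owning the same bit position in every register. A minimal userspace sketch of that encoding, assuming three selector bits as in INTC1_ROUTE_SELECTOR_BITS; set_route(), get_route() and the sel[] array are hypothetical stand-ins for the MMIO accesses and are not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: mirrors INTC1_ROUTE_SELECTOR_BITS. */
#define ROUTE_BITS 3

/*
 * Model of one bank's selector registers: sel[i] holds bit i of the
 * route index for each of the 32 interrupts in the bank.
 */
static void set_route(uint32_t sel[ROUTE_BITS], unsigned int bit, unsigned int route)
{
	uint32_t mask;

	assert(bit < 32);
	mask = UINT32_C(1) << bit;

	for (int i = 0; i < ROUTE_BITS; i++) {
		if (route & (1U << i))
			sel[i] |= mask;
		else
			sel[i] &= ~mask;
	}
}

/* Reassemble the route index of one interrupt from the selector registers. */
static unsigned int get_route(const uint32_t sel[ROUTE_BITS], unsigned int bit)
{
	unsigned int route = 0;

	assert(bit < 32);

	for (int i = 0; i < ROUTE_BITS; i++)
		route |= ((sel[i] >> bit) & 1U) << i;

	return route;
}
```

Selecting route 5 (binary 101) for interrupt bit 3 sets bit 3 in sel[0] and sel[2] and clears it in sel[1]; the routes of the other interrupts in the bank are left alone. The driver additionally reads each register back after writing it and returns -EACCES if the value did not stick, which this model has no need for.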
diff --git a/drivers/irqchip/irq-ast2700.c b/drivers/irqchip/irq-ast2700.c
new file mode 100644
index 000000000000..1e4c4a624dbf
--- /dev/null
+++ b/drivers/irqchip/irq-ast2700.c
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Aspeed AST2700 Interrupt Controller.
+ *
+ * Copyright (C) 2026 ASPEED Technology Inc.
+ */
+#include "irq-ast2700.h"
+
+#define ASPEED_INTC_RANGE_FIXED_CELLS 3U
+#define ASPEED_INTC_RANGE_OFF_START 0U
+#define ASPEED_INTC_RANGE_OFF_COUNT 1U
+#define ASPEED_INTC_RANGE_OFF_PHANDLE 2U
+
+/**
+ * aspeed_intc_populate_ranges() - Parse the "aspeed,interrupt-ranges" property
+ * @dev: Device owning the interrupt controller node.
+ * @ranges: Destination for parsed range descriptors.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int aspeed_intc_populate_ranges(struct device *dev,
+ struct aspeed_intc_interrupt_ranges *ranges)
+{
+ struct aspeed_intc_interrupt_range *arr;
+ const __be32 *pvs, *pve;
+ struct device_node *dn;
+ int len;
+
+ if (!dev || !ranges)
+ return -EINVAL;
+
+ dn = dev->of_node;
+
+ pvs = of_get_property(dn, "aspeed,interrupt-ranges", &len);
+ if (!pvs)
+ return -EINVAL;
+
+ if (len % sizeof(__be32))
+ return -EINVAL;
+
+ /* Over-estimate the range entry count for now */
+ ranges->ranges = devm_kmalloc_array(dev,
+ len / (ASPEED_INTC_RANGE_FIXED_CELLS * sizeof(__be32)),
+ sizeof(*ranges->ranges),
+ GFP_KERNEL);
+ if (!ranges->ranges)
+ return -ENOMEM;
+
+ pve = pvs + (len / sizeof(__be32));
+ for (unsigned int i = 0; pve - pvs >= ASPEED_INTC_RANGE_FIXED_CELLS; i++) {
+ struct aspeed_intc_interrupt_range *r;
+ struct device_node *target;
+ u32 target_cells;
+
+ target = of_find_node_by_phandle(be32_to_cpu(pvs[ASPEED_INTC_RANGE_OFF_PHANDLE]));
+ if (!target)
+ return -EINVAL;
+
+ if (of_property_read_u32(target, "#interrupt-cells",
+ &target_cells)) {
+ of_node_put(target);
+ return -EINVAL;
+ }
+
+ if (!target_cells || target_cells > IRQ_DOMAIN_IRQ_SPEC_PARAMS) {
+ of_node_put(target);
+ return -EINVAL;
+ }
+
+ if (pve - pvs < ASPEED_INTC_RANGE_FIXED_CELLS + target_cells) {
+ of_node_put(target);
+ return -EINVAL;
+ }
+
+ r = &ranges->ranges[i];
+ r->start = be32_to_cpu(pvs[ASPEED_INTC_RANGE_OFF_START]);
+ r->count = be32_to_cpu(pvs[ASPEED_INTC_RANGE_OFF_COUNT]);
+
+ {
+ struct of_phandle_args args = {
+ .np = target,
+ .args_count = target_cells,
+ };
+
+ for (u32 j = 0; j < target_cells; j++)
+ args.args[j] = be32_to_cpu(pvs[ASPEED_INTC_RANGE_FIXED_CELLS + j]);
+
+ of_phandle_args_to_fwspec(target, args.args,
+ args.args_count,
+ &r->upstream);
+ }
+
+ of_node_put(target);
+ r->domain = irq_find_matching_fwspec(&r->upstream, DOMAIN_BUS_ANY);
+ pvs += ASPEED_INTC_RANGE_FIXED_CELLS + target_cells;
+ ranges->nranges++;
+ }
+
+ /* Re-fit the range array now we know the entry count */
+ arr = devm_krealloc_array(dev, ranges->ranges, ranges->nranges,
+ sizeof(*ranges->ranges), GFP_KERNEL);
+ if (!arr)
+ return -ENOMEM;
+ ranges->ranges = arr;
+
+ return 0;
+}
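
aspeed_intc_populate_ranges() above walks a property made of variable-length records: three fixed cells (start, count, phandle) followed by as many specifier cells as the target's "#interrupt-cells" demands. The walk can be sketched in userspace roughly as below. The names are hypothetical: cells_for() stands in for the devicetree lookup, MAX_PARAMS for IRQ_DOMAIN_IRQ_SPEC_PARAMS, and the cells are assumed to be already converted to CPU byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_CELLS 3	/* start, count, phandle */
#define MAX_PARAMS 16	/* hypothetical upper bound on specifier cells */

struct range {
	uint32_t start, count, phandle;
	uint32_t nparams;
	uint32_t params[MAX_PARAMS];
};

/* Hypothetical stand-in for reading "#interrupt-cells" from the target node. */
static uint32_t cells_for(uint32_t phandle)
{
	return phandle == 1 ? 3 : 1;
}

/* Returns the number of parsed records, or -1 on malformed input. */
static int parse_ranges(const uint32_t *cells, size_t ncells,
			struct range *out, size_t max)
{
	const uint32_t *p = cells, *end = cells + ncells;
	size_t n = 0;

	assert(cells && out);

	while ((size_t)(end - p) >= FIXED_CELLS) {
		uint32_t tc = cells_for(p[2]);

		/* Reject empty, oversized or truncated specifiers. */
		if (!tc || tc > MAX_PARAMS || n == max ||
		    (size_t)(end - p) < FIXED_CELLS + tc)
			return -1;

		out[n].start = p[0];
		out[n].count = p[1];
		out[n].phandle = p[2];
		out[n].nparams = tc;
		for (uint32_t j = 0; j < tc; j++)
			out[n].params[j] = p[FIXED_CELLS + j];

		/* Advance by the record's actual length, which varies per target. */
		p += FIXED_CELLS + tc;
		n++;
	}

	return (int)n;
}
```

Because the record length depends on the target, the entry count is only known after the walk, which is why the driver over-allocates the array first and shrinks it afterwards.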
diff --git a/drivers/irqchip/irq-ast2700.h b/drivers/irqchip/irq-ast2700.h
new file mode 100644
index 000000000000..318296638445
--- /dev/null
+++ b/drivers/irqchip/irq-ast2700.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Aspeed AST2700 Interrupt Controller.
+ *
+ * Copyright (C) 2026 ASPEED Technology Inc.
+ */
+#ifndef DRIVERS_IRQCHIP_AST2700
+#define DRIVERS_IRQCHIP_AST2700
+
+#include <linux/device.h>
+#include <linux/irqdomain.h>
+
+#define AST2700_INTC_INVALID_ROUTE (~0U)
+#define ASPEED_INTC_RANGES_BASE 0U
+#define ASPEED_INTC_RANGES_COUNT 1U
+
+struct aspeed_intc_interrupt_range {
+ u32 start;
+ u32 count;
+ struct irq_fwspec upstream;
+ struct irq_domain *domain;
+};
+
+struct aspeed_intc_interrupt_ranges {
+ struct aspeed_intc_interrupt_range *ranges;
+ unsigned int nranges;
+};
+
+struct aspeed_intc0 {
+ struct device *dev;
+ void __iomem *base;
+ raw_spinlock_t intc_lock;
+ struct irq_domain *local;
+ struct device_node *parent;
+ struct aspeed_intc_interrupt_ranges ranges;
+};
+
+int aspeed_intc_populate_ranges(struct device *dev,
+ struct aspeed_intc_interrupt_ranges *ranges);
+
+int aspeed_intc0_resolve_route(const struct irq_domain *c0domain,
+ size_t nc1outs,
+ const u32 *c1outs,
+ size_t nc1ranges,
+ const struct aspeed_intc_interrupt_range *c1ranges,
+ struct aspeed_intc_interrupt_range *resolved);
+
+#endif
diff --git a/drivers/irqchip/irq-econet-en751221.c b/drivers/irqchip/irq-econet-en751221.c
index d83d5eb12795..2ca5d901866f 100644
--- a/drivers/irqchip/irq-econet-en751221.c
+++ b/drivers/irqchip/irq-econet-en751221.c
@@ -30,6 +30,8 @@
#include <linux/irqchip.h>
#include <linux/irqchip/chained_irq.h>
+#include <asm/setup.h>
+
#define IRQ_COUNT 40
#define NOT_PERCPU 0xff
@@ -41,15 +43,19 @@
#define REG_PENDING1 0x54
/**
- * @membase: Base address of the interrupt controller registers
- * @interrupt_shadows: Array of all interrupts, for each value,
- * - NOT_PERCPU: This interrupt is not per-cpu, so it has no shadow
- * - IS_SHADOW: This interrupt is a shadow of another per-cpu interrupt
- * - else: This is a per-cpu interrupt whose shadow is the value
+ * @membase: Base address of the interrupt controller registers
+ * @domain: The irq_domain for direct dispatch
+ * @ipi_domain: The irq_domain for inter-processor dispatch
+ * @interrupt_shadows: Array of all interrupts, for each value,
+ * - NOT_PERCPU: This interrupt is not per-cpu, so it has no shadow
+ * - IS_SHADOW: This interrupt is a shadow of another per-cpu interrupt
+ * - else: This is a per-cpu interrupt whose shadow is the value
*/
static struct {
- void __iomem *membase;
- u8 interrupt_shadows[IRQ_COUNT];
+ void __iomem *membase;
+ struct irq_domain *domain;
+ struct irq_domain *ipi_domain;
+ u8 interrupt_shadows[IRQ_COUNT];
} econet_intc __ro_after_init;
static DEFINE_RAW_SPINLOCK(irq_lock);
@@ -150,6 +156,56 @@ static void econet_intc_from_parent(struct irq_desc *desc)
chained_irq_exit(chip, desc);
}
+/*
+ * When in VEIC mode, the CPU jumps to a handler in the vector table.
+ * The only way to know which interrupt is being triggered is from the vector table offset that
+ * has been jumped to. Reading REG_PENDING(0|1) will tell you which interrupts are currently
+ * pending in the intc, but that will not tell you which one the intc wants you to process
+ * right now. And if you are not processing the exact interrupt that the intc wants you to be
+ * processing, you might be on the wrong VPE. You can't tell which VPE any given REG_PENDING
+ * interrupt is intended for (shadow IRQ numbers are for masking only, they never flag as
+ * pending).
+ *
+ * Consequently, this little ritual of generating n handler functions and registering one per
+ * interrupt is unavoidable.
+ */
+#define X(irq) \
+ static void econet_irq_dispatch ## irq (void) \
+ { \
+ do_domain_IRQ(econet_intc.domain, irq); \
+ }
+
+ X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7) X(8) X(9)
+X(10) X(11) X(12) X(13) X(14) X(15) X(16) X(17) X(18) X(19)
+X(20) X(21) X(22) X(23) X(24) X(25) X(26) X(27) X(28) X(29)
+X(30) X(31) X(32) X(33) X(34) X(35) X(36) X(37) X(38) X(39)
+
+#undef X
+#define X(irq) econet_irq_dispatch ## irq,
+
+static void (* const econet_irq_dispatchers[])(void) = {
+ X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7) X(8) X(9)
+ X(10) X(11) X(12) X(13) X(14) X(15) X(16) X(17) X(18) X(19)
+ X(20) X(21) X(22) X(23) X(24) X(25) X(26) X(27) X(28) X(29)
+ X(30) X(31) X(32) X(33) X(34) X(35) X(36) X(37) X(38) X(39)
+};
+
+/* Likewise, we do the same for the 2 IPI IRQs so that we can route them back */
+static void econet_cpu_dispatch0(void)
+{
+ do_domain_IRQ(econet_intc.ipi_domain, 0);
+}
+
+static void econet_cpu_dispatch1(void)
+{
+ do_domain_IRQ(econet_intc.ipi_domain, 1);
+}
+
+static void (* const econet_cpu_dispatchers[])(void) = {
+ econet_cpu_dispatch0,
+ econet_cpu_dispatch1,
+};
+
static const struct irq_chip econet_irq_chip;
static int econet_intc_map(struct irq_domain *d, u32 irq, irq_hw_number_t hwirq)
@@ -174,6 +230,10 @@ static int econet_intc_map(struct irq_domain *d, u32 irq, irq_hw_number_t hwirq)
}
irq_set_chip_data(irq, NULL);
+
+ if (cpu_has_veic)
+ set_vi_handler(hwirq + 1, econet_irq_dispatchers[hwirq]);
+
return 0;
}
@@ -249,6 +309,100 @@ static int __init get_shadow_interrupts(struct device_node *node)
return 0;
}
+/**
+ * econet_cpu_init() - configure routing of CPU interrupts to the correct domain.
+ * @node: The devicetree node of this interrupt controller.
+ *
+ * Interrupts that originate from the CPU are unconditionally unmasked here and are re-routed back
+ * to the IPI irq_domain in the CPU intc. Masking still takes place but the CPU intc is in charge
+ * of it, using the mask bits of the c0_status register.
+ *
+ * Note that because IP2 ... IP7 are repurposed as Interrupt Priority Level, only the two IPI
+ * interrupts are actually supported.
+ */
+static int __init econet_cpu_init(struct device_node *node)
+{
+ const char *field = "econet,cpu-interrupt-map";
+ struct device_node *parent_intc;
+ int map_size;
+ u32 mask;
+
+ map_size = of_property_count_u32_elems(node, field);
+
+ if (map_size <= 0) {
+ return 0;
+ } else if (map_size % 2) {
+ pr_err("%pOF: %s count is odd, ignoring\n", node, field);
+ return 0;
+ }
+
+ u32 *maps __free(kfree) = kmalloc_array(map_size, sizeof(u32), GFP_KERNEL);
+ if (!maps)
+ return -ENOMEM;
+
+ if (of_property_read_u32_array(node, field, maps, map_size)) {
+ pr_err("%pOF: Failed to read %s\n", node, field);
+ return -EINVAL;
+ }
+
+ /* Validation */
+ for (int i = 0; i < map_size; i += 2) {
+ u32 receive = maps[i];
+ u32 dispatch = maps[i + 1];
+ u8 shadow;
+
+ if (receive >= IRQ_COUNT) {
+ pr_err("%pOF: Entry %d:%d in %s (%u) is out of bounds\n",
+ node, i, 0, field, receive);
+ return -EINVAL;
+ }
+
+ shadow = econet_intc.interrupt_shadows[receive];
+ if (shadow != NOT_PERCPU && shadow >= IRQ_COUNT) {
+ pr_err("%pOF: Entry %d:%d in %s (%u) has invalid shadow (%d)\n",
+ node, i, 0, field, receive, shadow);
+ return -EINVAL;
+ }
+
+ if (dispatch >= ARRAY_SIZE(econet_cpu_dispatchers)) {
+ pr_err("%pOF: Entry %d:%d in %s (%u) is out of bounds; only IPI interrupts are supported\n",
+ node, i, 1, field, dispatch);
+ return -EINVAL;
+ }
+ }
+
+ parent_intc = of_irq_find_parent(node);
+ if (!parent_intc) {
+ pr_err("%pOF: Failed to find parent %s\n", node, "IRQ device");
+ return -ENODEV;
+ }
+
+ econet_intc.ipi_domain = irq_find_matching_host(parent_intc, DOMAIN_BUS_IPI);
+ if (!econet_intc.ipi_domain) {
+ pr_err("%pOF: Failed to find parent %s\n", node, "IPI domain");
+ return -ENODEV;
+ }
+
+ mask = 0;
+ for (int i = 0; i < map_size; i += 2) {
+ u32 receive = maps[i];
+ u32 dispatch = maps[i + 1];
+ u8 shadow;
+
+ set_vi_handler(receive + 1, econet_cpu_dispatchers[dispatch]);
+
+ mask |= BIT(receive);
+
+ shadow = econet_intc.interrupt_shadows[receive];
+ if (shadow != NOT_PERCPU)
+ mask |= BIT(shadow);
+ }
+
+ econet_wreg(REG_MASK0, mask, mask);
+
+ return 0;
+}
+
static int __init econet_intc_of_init(struct device_node *node, struct device_node *parent)
{
struct irq_domain *domain;
@@ -294,7 +448,23 @@ static int __init econet_intc_of_init(struct device_node *node, struct device_no
goto err_unmap;
}
- irq_set_chained_handler_and_data(irq, econet_intc_from_parent, domain);
+ /*
+ * 34K Manual (MD00534) Section 6.3.1.3 rev 1.13 page 136:
+ * In VEIC mode, IP2 ... IP7 are repurposed as Interrupt Priority Level. The controller
+ * will filter incoming interrupts whose priority is lower than the IPL number. Therefore
+ * we must not set any of these bits. We avoid setting IP2 by not actually chaining this
+ * intc to the CPU intc.
+ */
+ if (cpu_has_veic) {
+ ret = econet_cpu_init(node);
+
+ if (ret)
+ return ret;
+ } else {
+ irq_set_chained_handler_and_data(irq, econet_intc_from_parent, domain);
+ }
+
+ econet_intc.domain = domain;
return 0;
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 291d7668cc8d..b57d81ad33a0 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -4784,8 +4784,7 @@ static bool __maybe_unused its_enable_quirk_cavium_22375(void *data)
struct its_node *its = data;
/* erratum 22375: only alloc 8MB table size (20 bits) */
- its->typer &= ~GITS_TYPER_DEVBITS;
- its->typer |= FIELD_PREP(GITS_TYPER_DEVBITS, 20 - 1);
+ FIELD_MODIFY(GITS_TYPER_DEVBITS, &its->typer, 20 - 1);
its->flags |= ITS_FLAGS_WORKAROUND_CAVIUM_22375;
return true;
@@ -4805,8 +4804,7 @@ static bool __maybe_unused its_enable_quirk_qdf2400_e0065(void *data)
struct its_node *its = data;
/* On QDF2400, the size of the ITE is 16Bytes */
- its->typer &= ~GITS_TYPER_ITT_ENTRY_SIZE;
- its->typer |= FIELD_PREP(GITS_TYPER_ITT_ENTRY_SIZE, 16 - 1);
+ FIELD_MODIFY(GITS_TYPER_ITT_ENTRY_SIZE, &its->typer, 16 - 1);
return true;
}
@@ -4840,10 +4838,8 @@ static bool __maybe_unused its_enable_quirk_socionext_synquacer(void *data)
its->get_msi_base = its_irq_get_msi_base_pre_its;
ids = ilog2(pre_its_window[1]) - 2;
- if (device_ids(its) > ids) {
- its->typer &= ~GITS_TYPER_DEVBITS;
- its->typer |= FIELD_PREP(GITS_TYPER_DEVBITS, ids - 1);
- }
+ if (device_ids(its) > ids)
+ FIELD_MODIFY(GITS_TYPER_DEVBITS, &its->typer, ids - 1);
/* the pre-ITS breaks isolation, so disable MSI remapping */
its->msi_domain_flags &= ~IRQ_DOMAIN_FLAG_ISOLATED_MSI;
@@ -5837,6 +5833,7 @@ int __init its_init(struct fwnode_handle *handle, struct rdists *rdists,
its_acpi_probe();
if (list_empty(&its_nodes)) {
+ rdists->has_vlpis = false;
pr_warn("ITS: No ITS available, not enabling LPIs\n");
return -ENXIO;
}
diff --git a/drivers/irqchip/irq-loongarch-avec.c b/drivers/irqchip/irq-loongarch-avec.c
index 758262fd5bd6..53d7d23af9bb 100644
--- a/drivers/irqchip/irq-loongarch-avec.c
+++ b/drivers/irqchip/irq-loongarch-avec.c
@@ -24,7 +24,6 @@
#define VECTORS_PER_REG 64
#define IRR_VECTOR_MASK 0xffUL
#define IRR_INVALID_MASK 0x80000000UL
-#define AVEC_MSG_OFFSET 0x100000
#ifdef CONFIG_SMP
struct pending_list {
@@ -47,15 +46,6 @@ struct avecintc_chip {
static struct avecintc_chip loongarch_avec;
-struct avecintc_data {
- struct list_head entry;
- unsigned int cpu;
- unsigned int vec;
- unsigned int prev_cpu;
- unsigned int prev_vec;
- unsigned int moving;
-};
-
static inline void avecintc_enable(void)
{
#ifdef CONFIG_MACH_LOONGSON64
@@ -87,7 +77,7 @@ static inline void pending_list_init(int cpu)
INIT_LIST_HEAD(&plist->head);
}
-static void avecintc_sync(struct avecintc_data *adata)
+void avecintc_sync(struct avecintc_data *adata)
{
struct pending_list *plist;
@@ -111,7 +101,7 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
return -EBUSY;
if (cpu_online(adata->cpu) && cpumask_test_cpu(adata->cpu, dest))
- return 0;
+ return IRQ_SET_MASK_OK_DONE;
cpumask_and(&intersect_mask, dest, cpu_online_mask);
@@ -123,7 +113,8 @@ static int avecintc_set_affinity(struct irq_data *data, const struct cpumask *de
adata->cpu = cpu;
adata->vec = vector;
per_cpu_ptr(irq_map, adata->cpu)[adata->vec] = irq_data_to_desc(data);
- avecintc_sync(adata);
+ if (!cpu_has_redirectint)
+ avecintc_sync(adata);
}
irq_data_update_effective_affinity(data, cpumask_of(cpu));
@@ -415,6 +406,9 @@ static int __init pch_msi_parse_madt(union acpi_subtable_headers *header,
static inline int __init acpi_cascade_irqdomain_init(void)
{
+ if (cpu_has_redirectint)
+ return redirect_acpi_init(loongarch_avec.domain);
+
return acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, pch_msi_parse_madt, 1);
}
diff --git a/drivers/irqchip/irq-loongarch-ir.c b/drivers/irqchip/irq-loongarch-ir.c
new file mode 100644
index 000000000000..21c649a89a70
--- /dev/null
+++ b/drivers/irqchip/irq-loongarch-ir.c
@@ -0,0 +1,537 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2024-2026 Loongson Technologies, Inc.
+ */
+#define pr_fmt(fmt) "redirect: " fmt
+
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/irq-msi-lib.h>
+#include <linux/irqdomain.h>
+#include <linux/kernel.h>
+#include <linux/msi.h>
+#include <linux/spinlock.h>
+
+#include <asm/irq.h>
+#include <asm/loongarch.h>
+#include <asm/loongson.h>
+#include <asm/numa.h>
+#include <asm/setup.h>
+
+#include "irq-loongson.h"
+
+#define LOONGARCH_IOCSR_REDIRECT_CFG 0x15e0
+#define LOONGARCH_IOCSR_REDIRECT_TBR 0x15e8 /* IRT BASE REG */
+#define LOONGARCH_IOCSR_REDIRECT_CQB 0x15f0 /* IRT CACHE QUEUE BASE */
+#define LOONGARCH_IOCSR_REDIRECT_CQH 0x15f8 /* IRT CACHE QUEUE HEAD, 32bit */
+#define LOONGARCH_IOCSR_REDIRECT_CQT 0x15fc /* IRT CACHE QUEUE TAIL, 32bit */
+
+#define CQB_ADDR_MASK GENMASK_U64(47, 12)
+#define CQB_SIZE_MASK 0xf
+
+#define GPID_ADDR_MASK GENMASK_U64(47, 6)
+#define GPID_ADDR_SHIFT 6
+
+#define INVALID_INDEX 0
+#define CFG_DISABLE_IDLE 2
+
+#define MAX_IR_ENGINES 16
+
+struct redirect_entry {
+ struct {
+ u64 valid : 1,
+ res1 : 5,
+ gpid : 42,
+ res2 : 8,
+ vector : 8;
+ } lo;
+ u64 hi;
+};
+
+#define IRD_ENTRY_SIZE sizeof(struct redirect_entry)
+#define IRD_ENTRIES SZ_64K
+#define IRD_TABLE_PAGE_ORDER get_order(IRD_ENTRIES * IRD_ENTRY_SIZE)
+
+struct redirect_cmd {
+ union {
+ u64 cmd_info;
+ struct {
+ u64 res1 : 4,
+ type : 1,
+ need_notice : 1,
+ pad1 : 2,
+ index : 16,
+ pad2 : 40;
+ } index;
+ };
+ u64 notice_addr;
+};
+
+#define IRD_CMD_SIZE sizeof(struct redirect_cmd)
+#define INV_QUEUE_SIZE SZ_4K
+#define INV_QUEUE_PAGE_ORDER get_order(INV_QUEUE_SIZE * IRD_CMD_SIZE)
+
+struct redirect_gpid {
+ u64 pir[4]; /* Pending interrupt requested */
+ u8 en : 1, /* Doorbell */
+ res1 : 7;
+ u8 irqnum;
+ u16 res2;
+ u32 dstcpu;
+ u32 rsvd[6];
+};
+
+struct redirect_table {
+ struct redirect_entry *table;
+ unsigned long *bitmap;
+ raw_spinlock_t lock;
+};
+
+struct redirect_queue {
+ struct redirect_cmd *cmd_base;
+ int head;
+ int tail;
+ raw_spinlock_t lock;
+};
+
+struct redirect_desc {
+ struct redirect_table ird_table;
+ struct redirect_queue inv_queue;
+ int node;
+};
+
+struct redirect_item {
+ int index;
+ struct redirect_desc *irde;
+ struct redirect_gpid *gpid;
+};
+
+static struct irq_domain *redirect_domain;
+static struct redirect_desc redirect_descs[MAX_IR_ENGINES];
+
+static phys_addr_t msi_base_addr;
+static phys_addr_t redirect_reg_base = LOONGSON_REG_BASE;
+
+#ifdef CONFIG_32BIT
+
+#define REDIRECT_REG(reg, node) \
+ ((void __iomem *)(IO_BASE | redirect_reg_base | (reg)))
+
+#else
+
+#define REDIRECT_REG(reg, node) \
+ ((void __iomem *)(IO_BASE | redirect_reg_base | (u64)(node) << NODE_ADDRSPACE_SHIFT | (reg)))
+
+#endif
+
+static inline u32 redirect_read_reg32(u32 node, u32 reg)
+{
+ return readl(REDIRECT_REG(reg, node));
+}
+
+static inline void redirect_write_reg32(u32 node, u32 val, u32 reg)
+{
+ writel(val, REDIRECT_REG(reg, node));
+}
+
+static inline void redirect_write_reg64(u32 node, u64 val, u32 reg)
+{
+ writeq(val, REDIRECT_REG(reg, node));
+}
+
+static inline struct redirect_entry *item_get_entry(struct redirect_item *item)
+{
+ return item->irde->ird_table.table + item->index;
+}
+
+static inline bool invalid_queue_is_full(int node, u32 *tail)
+{
+ u32 head = redirect_read_reg32(node, LOONGARCH_IOCSR_REDIRECT_CQH);
+
+ *tail = redirect_read_reg32(node, LOONGARCH_IOCSR_REDIRECT_CQT);
+
+ return head == ((*tail + 1) % INV_QUEUE_SIZE);
+}
+
+static void invalid_enqueue(struct redirect_item *item, struct redirect_cmd *cmd)
+{
+ struct redirect_queue *inv_queue = &item->irde->inv_queue;
+ u32 tail;
+
+ guard(raw_spinlock_irqsave)(&inv_queue->lock);
+
+ while (invalid_queue_is_full(item->irde->node, &tail))
+ cpu_relax();
+
+ memcpy(&inv_queue->cmd_base[tail], cmd, sizeof(*cmd));
+
+ redirect_write_reg32(item->irde->node, (tail + 1) % INV_QUEUE_SIZE, LOONGARCH_IOCSR_REDIRECT_CQT);
+}
+
+static void irde_invalidate_entry(struct redirect_item *item)
+{
+ struct redirect_cmd cmd;
+ u64 raddr = 0;
+
+ cmd.cmd_info = 0;
+ cmd.index.type = INVALID_INDEX;
+ cmd.index.need_notice = 1;
+ cmd.index.index = item->index;
+ cmd.notice_addr = (u64)(__pa(&raddr));
+
+ invalid_enqueue(item, &cmd);
+
+ /*
+ * The CPU must wait here for the command to complete. Completion is
+ * signalled when the invalidation queue writes a non-zero value to the
+ * address passed in cmd.notice_addr.
+ */
+ while (!raddr)
+ cpu_relax();
+}
+
+static inline struct avecintc_data *irq_data_get_avec_data(struct irq_data *data)
+{
+ return data->parent_data->chip_data;
+}
+
+static int redirect_table_alloc(int node, u32 nr_irqs)
+{
+ struct redirect_table *ird_table = &redirect_descs[node].ird_table;
+ int index, order = 0;
+
+ if (nr_irqs > 1) {
+ nr_irqs = __roundup_pow_of_two(nr_irqs);
+ order = ilog2(nr_irqs);
+ }
+
+ guard(raw_spinlock_irqsave)(&ird_table->lock);
+
+ index = bitmap_find_free_region(ird_table->bitmap, IRD_ENTRIES, order);
+ if (index < 0) {
+ pr_err("No redirect entry to use\n");
+ return -EINVAL;
+ }
+
+ return index;
+}
+
+static void redirect_table_free(struct redirect_item *item)
+{
+ struct redirect_table *ird_table = &item->irde->ird_table;
+ struct redirect_entry *entry = item_get_entry(item);
+
+ memset(entry, 0, sizeof(*entry));
+
+ scoped_guard(raw_spinlock_irq, &ird_table->lock)
+ clear_bit(item->index, ird_table->bitmap);
+
+ kfree(item->gpid);
+
+ irde_invalidate_entry(item);
+}
+
+static inline void redirect_domain_prepare_entry(struct redirect_item *item,
+ struct avecintc_data *adata)
+{
+ struct redirect_entry *entry = item_get_entry(item);
+
+ item->gpid->en = 1;
+ item->gpid->dstcpu = adata->cpu;
+ item->gpid->irqnum = adata->vec;
+
+ entry->lo.valid = 1;
+ entry->lo.vector = 0xff;
+ entry->lo.gpid = ((unsigned long)item->gpid & GPID_ADDR_MASK) >> GPID_ADDR_SHIFT;
+}
+
+static void redirect_free_resources(struct irq_domain *domain,
+ unsigned int virq, unsigned int nr_irqs)
+{
+ for (int i = 0; i < nr_irqs; i++) {
+ struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq + i);
+
+ if (irq_data && irq_data->chip_data) {
+ struct redirect_item *item = irq_data->chip_data;
+
+ redirect_table_free(item);
+ kfree(item);
+ }
+ }
+}
+
+#ifdef CONFIG_SMP
+static int redirect_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force)
+{
+ struct avecintc_data *adata = irq_data_get_avec_data(data);
+ struct redirect_item *item = data->chip_data;
+ int ret;
+
+ ret = irq_chip_set_affinity_parent(data, dest, force);
+ switch (ret) {
+ case IRQ_SET_MASK_OK:
+ break;
+ case IRQ_SET_MASK_OK_DONE:
+ return ret;
+ default:
+ pr_err("IRDE: set_affinity error %d\n", ret);
+ return ret;
+ }
+
+ redirect_domain_prepare_entry(item, adata);
+ irde_invalidate_entry(item);
+ avecintc_sync(adata);
+
+ return IRQ_SET_MASK_OK;
+}
+#endif
+
+static void redirect_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
+{
+ struct redirect_item *item = irq_data_get_irq_chip_data(d);
+
+ msg->address_hi = 0x0;
+ msg->address_lo = (msi_base_addr | 1 << 2);
+ msg->data = item->index;
+}
+
+static struct irq_chip loongarch_redirect_chip = {
+ .name = "REDIRECT",
+ .irq_ack = irq_chip_ack_parent,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+#ifdef CONFIG_SMP
+ .irq_set_affinity = redirect_set_affinity,
+#endif
+ .irq_compose_msi_msg = redirect_compose_msi_msg,
+};
+
+static int redirect_domain_alloc(struct irq_domain *domain, unsigned int virq,
+ unsigned int nr_irqs, void *arg)
+{
+ msi_alloc_info_t *info = arg;
+ int ret, i, node, index;
+
+ node = dev_to_node(info->desc->dev);
+
+ ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
+ if (ret < 0)
+ return ret;
+
+ index = redirect_table_alloc(node, nr_irqs);
+ if (index < 0) {
+ pr_err("Alloc redirect table entry failed\n");
+ return -EINVAL;
+ }
+
+ for (i = 0; i < nr_irqs; i++) {
+ struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq + i);
+ struct redirect_item *item;
+
+ item = kzalloc(sizeof(*item), GFP_KERNEL);
+ if (!item) {
+ pr_err("Alloc redirect descriptor failed\n");
+ goto out_free_resources;
+ }
+ item->irde = &redirect_descs[node];
+
+ /*
+ * Only bits 47:6 of the GPID are passed to the controller, so
+ * 64-byte alignment must be guaranteed. The GPID is sized to 64
+ * bytes so that the kzalloc'ed object is aligned to its size.
+ */
+ static_assert(sizeof(*item->gpid) == 64);
+ item->gpid = kzalloc_node(sizeof(*item->gpid), GFP_KERNEL, node);
+ if (!item->gpid) {
+ pr_err("Alloc redirect GPID failed\n");
+ goto out_free_resources;
+ }
+ item->index = index + i;
+
+ irq_data->chip_data = item;
+ irq_data->chip = &loongarch_redirect_chip;
+
+ redirect_domain_prepare_entry(item, irq_data_get_avec_data(irq_data));
+ }
+
+ return 0;
+
+out_free_resources:
+ redirect_free_resources(domain, virq, nr_irqs);
+ irq_domain_free_irqs_common(domain, virq, nr_irqs);
+
+ return -ENOMEM;
+}
+
+static void redirect_domain_free(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs)
+{
+ redirect_free_resources(domain, virq, nr_irqs);
+ return irq_domain_free_irqs_common(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops redirect_domain_ops = {
+ .alloc = redirect_domain_alloc,
+ .free = redirect_domain_free,
+ .select = msi_lib_irq_domain_select,
+};
+
+static int redirect_table_init(struct redirect_desc *irde)
+{
+ struct redirect_table *ird_table = &irde->ird_table;
+ unsigned long *bitmap;
+ struct folio *folio;
+
+ folio = __folio_alloc_node(GFP_KERNEL | __GFP_ZERO, IRD_TABLE_PAGE_ORDER, irde->node);
+ if (!folio) {
+ pr_err("Node [%d] redirect table alloc pages failed!\n", irde->node);
+ return -ENOMEM;
+ }
+ ird_table->table = folio_address(folio);
+
+ bitmap = bitmap_zalloc(IRD_ENTRIES, GFP_KERNEL);
+ if (!bitmap) {
+ pr_err("Node [%d] redirect table bitmap alloc pages failed!\n", irde->node);
+ folio_put(folio);
+ ird_table->table = NULL;
+ return -ENOMEM;
+ }
+ ird_table->bitmap = bitmap;
+
+ raw_spin_lock_init(&ird_table->lock);
+
+ return 0;
+}
+
+static int redirect_queue_init(struct redirect_desc *irde)
+{
+ struct redirect_queue *inv_queue = &irde->inv_queue;
+ struct folio *folio;
+
+ folio = __folio_alloc_node(GFP_KERNEL | __GFP_ZERO, INV_QUEUE_PAGE_ORDER, irde->node);
+ if (!folio) {
+ pr_err("Node [%d] invalid queue alloc pages failed!\n", irde->node);
+ return -ENOMEM;
+ }
+
+ inv_queue->cmd_base = folio_address(folio);
+ inv_queue->head = 0;
+ inv_queue->tail = 0;
+ raw_spin_lock_init(&inv_queue->lock);
+
+ return 0;
+}
+
+static void redirect_irde_cfg(struct redirect_desc *irde)
+{
+ redirect_write_reg64(irde->node, CFG_DISABLE_IDLE, LOONGARCH_IOCSR_REDIRECT_CFG);
+ redirect_write_reg64(irde->node, __pa(irde->ird_table.table), LOONGARCH_IOCSR_REDIRECT_TBR);
+ redirect_write_reg32(irde->node, 0, LOONGARCH_IOCSR_REDIRECT_CQH);
+ redirect_write_reg32(irde->node, 0, LOONGARCH_IOCSR_REDIRECT_CQT);
+ redirect_write_reg64(irde->node, ((unsigned long)irde->inv_queue.cmd_base & CQB_ADDR_MASK) |
+ CQB_SIZE_MASK, LOONGARCH_IOCSR_REDIRECT_CQB);
+}
+
+static void __init redirect_irde_free(struct redirect_desc *irde)
+{
+ struct redirect_table *ird_table = &irde->ird_table;
+ struct redirect_queue *inv_queue = &irde->inv_queue;
+
+ if (ird_table->table) {
+ folio_put(virt_to_folio(ird_table->table));
+ ird_table->table = NULL;
+ }
+
+ if (ird_table->bitmap) {
+ bitmap_free(ird_table->bitmap);
+ ird_table->bitmap = NULL;
+ }
+
+ if (inv_queue->cmd_base) {
+ folio_put(virt_to_folio(inv_queue->cmd_base));
+ inv_queue->cmd_base = NULL;
+ }
+}
+
+static int __init redirect_irde_init(int node)
+{
+ struct redirect_desc *irde = &redirect_descs[node];
+ int ret;
+
+ irde->node = node;
+
+ ret = redirect_table_init(irde);
+ if (ret)
+ return ret;
+
+ ret = redirect_queue_init(irde);
+ if (ret) {
+ redirect_irde_free(irde);
+ return ret;
+ }
+
+ redirect_irde_cfg(irde);
+
+ return 0;
+}
+
+static int __init pch_msi_parse_madt(union acpi_subtable_headers *header, const unsigned long end)
+{
+ struct acpi_madt_msi_pic *pchmsi_entry = (struct acpi_madt_msi_pic *)header;
+
+ msi_base_addr = pchmsi_entry->msg_address - AVEC_MSG_OFFSET;
+
+ return pch_msi_acpi_init_avec(redirect_domain);
+}
+
+static int __init acpi_cascade_irqdomain_init(void)
+{
+ return acpi_table_parse_madt(ACPI_MADT_TYPE_MSI_PIC, pch_msi_parse_madt, 1);
+}
+
+int __init redirect_acpi_init(struct irq_domain *parent)
+{
+ struct fwnode_handle *fwnode;
+ int ret = -EINVAL, node;
+
+ fwnode = irq_domain_alloc_named_fwnode("redirect");
+ if (!fwnode) {
+ pr_err("Unable to alloc redirect domain handle\n");
+ goto fail;
+ }
+
+ redirect_domain = irq_domain_create_hierarchy(parent, 0, IRD_ENTRIES, fwnode,
+ &redirect_domain_ops, redirect_descs);
+ if (!redirect_domain) {
+ pr_err("Unable to alloc redirect domain\n");
+ goto out_free_fwnode;
+ }
+
+ for_each_node_mask(node, node_possible_map) {
+ ret = redirect_irde_init(node);
+ if (ret)
+ goto out_clear_irde;
+ }
+
+ ret = acpi_cascade_irqdomain_init();
+ if (ret < 0) {
+ pr_err("Failed to cascade IRQ domain, ret=%d\n", ret);
+ goto out_clear_irde;
+ }
+
+ pr_info("init succeeded\n");
+
+ return 0;
+
+out_clear_irde:
+ for_each_node_mask(node, node_possible_map) {
+ redirect_irde_free(&redirect_descs[node]);
+ }
+ irq_domain_remove(redirect_domain);
+out_free_fwnode:
+ irq_domain_free_fwnode(fwnode);
+fail:
+ return ret;
+}
diff --git a/drivers/irqchip/irq-loongson.h b/drivers/irqchip/irq-loongson.h
index 11fa138d1f44..dd37cd7f453d 100644
--- a/drivers/irqchip/irq-loongson.h
+++ b/drivers/irqchip/irq-loongson.h
@@ -6,6 +6,17 @@
#ifndef _DRIVERS_IRQCHIP_IRQ_LOONGSON_H
#define _DRIVERS_IRQCHIP_IRQ_LOONGSON_H
+#define AVEC_MSG_OFFSET 0x100000
+
+struct avecintc_data {
+ struct list_head entry;
+ unsigned int cpu;
+ unsigned int vec;
+ unsigned int prev_cpu;
+ unsigned int prev_vec;
+ unsigned int moving;
+};
+
int find_pch_pic(u32 gsi);
int liointc_acpi_init(struct irq_domain *parent,
@@ -14,6 +25,8 @@ int eiointc_acpi_init(struct irq_domain *parent,
struct acpi_madt_eio_pic *acpi_eiointc);
int avecintc_acpi_init(struct irq_domain *parent);
+int redirect_acpi_init(struct irq_domain *parent);
+
int htvec_acpi_init(struct irq_domain *parent,
struct acpi_madt_ht_pic *acpi_htvec);
int pch_lpc_acpi_init(struct irq_domain *parent,
@@ -24,4 +37,6 @@ int pch_msi_acpi_init(struct irq_domain *parent,
struct acpi_madt_msi_pic *acpi_pchmsi);
int pch_msi_acpi_init_avec(struct irq_domain *parent);
+void avecintc_sync(struct avecintc_data *adata);
+
#endif /* _DRIVERS_IRQCHIP_IRQ_LOONGSON_H */
diff --git a/drivers/irqchip/irq-meson-gpio.c b/drivers/irqchip/irq-meson-gpio.c
index 74a376ef452e..91a9c337fe6d 100644
--- a/drivers/irqchip/irq-meson-gpio.c
+++ b/drivers/irqchip/irq-meson-gpio.c
@@ -27,6 +27,10 @@
/* use for A1 like chips */
#define REG_PIN_A1_SEL 0x04
+/* use for A9 like chips */
+#define REG_A9_AO_POL 0x00
+#define REG_A9_AO_EDGE 0x30
+
/*
* Note: The S905X3 datasheet reports that BOTH_EDGE is controlled by
* bits 24 to 31. Tests on the actual HW show that these bits are
@@ -53,6 +57,8 @@ static void meson_a1_gpio_irq_sel_pin(struct meson_gpio_irq_controller *ctl,
static void meson_a1_gpio_irq_init(struct meson_gpio_irq_controller *ctl);
static int meson8_gpio_irq_set_type(struct meson_gpio_irq_controller *ctl,
unsigned int type, u32 *channel_hwirq);
+static int meson_a9_ao_gpio_irq_set_type(struct meson_gpio_irq_controller *ctl,
+ unsigned int type, u32 *channel_hwirq);
static int meson_s4_gpio_irq_set_type(struct meson_gpio_irq_controller *ctl,
unsigned int type, u32 *channel_hwirq);
@@ -116,6 +122,18 @@ struct meson_gpio_irq_params {
.pin_sel_mask = 0xff, \
.nr_channels = 2, \
+#define INIT_MESON_A9_AO_COMMON_DATA(irqs) \
+ INIT_MESON_COMMON(irqs, meson_a1_gpio_irq_init, \
+ meson_a1_gpio_irq_sel_pin, \
+ meson_a9_ao_gpio_irq_set_type) \
+ .support_edge_both = true, \
+ .edge_both_offset = 0, \
+ .edge_single_offset = 0, \
+ .edge_pol_reg = 0x2c, \
+ .pol_low_offset = 0, \
+ .pin_sel_mask = 0xff, \
+ .nr_channels = 20, \
+
#define INIT_MESON_S4_COMMON_DATA(irqs) \
INIT_MESON_COMMON(irqs, meson_a1_gpio_irq_init, \
meson_a1_gpio_irq_sel_pin, \
@@ -170,6 +188,14 @@ static const struct meson_gpio_irq_params a5_params = {
INIT_MESON_S4_COMMON_DATA(99)
};
+static const struct meson_gpio_irq_params a9_params = {
+ INIT_MESON_S4_COMMON_DATA(96)
+};
+
+static const struct meson_gpio_irq_params a9_ao_params = {
+ INIT_MESON_A9_AO_COMMON_DATA(39)
+};
+
static const struct meson_gpio_irq_params s4_params = {
INIT_MESON_S4_COMMON_DATA(82)
};
@@ -203,6 +229,8 @@ static const struct of_device_id meson_irq_gpio_matches[] __maybe_unused = {
{ .compatible = "amlogic,a4-gpio-ao-intc", .data = &a4_ao_params },
{ .compatible = "amlogic,a4-gpio-intc", .data = &a4_params },
{ .compatible = "amlogic,a5-gpio-intc", .data = &a5_params },
+ { .compatible = "amlogic,a9-gpio-ao-intc", .data = &a9_ao_params },
+ { .compatible = "amlogic,a9-gpio-intc", .data = &a9_params },
{ .compatible = "amlogic,s6-gpio-intc", .data = &s6_params },
{ .compatible = "amlogic,s7-gpio-intc", .data = &s7_params },
{ .compatible = "amlogic,s7d-gpio-intc", .data = &s7_params },
@@ -375,6 +403,55 @@ static int meson8_gpio_irq_set_type(struct meson_gpio_irq_controller *ctl,
return 0;
}
+/*
+ * gpio irq relative registers for a9_ao
+ * -PADCTRL_GPIO_IRQ_CTRL0
+ * bit[31]: enable/disable all the irq lines
+ * bit[0-19]: polarity trigger
+ *
+ * -PADCTRL_GPIO_IRQ_CTRL[X]
+ * bit[0-5]: 6 bits to choose gpio source for irq line 2*[X] - 2
+ * bit[16-21]:6 bits to choose gpio source for irq line 2*[X] - 1
+ * where X = 1-10
+ *
+ * -PADCTRL_GPIO_IRQ_CTRL[11]
+ * bit[0-19]: both edge trigger
+ *
+ * -PADCTRL_GPIO_IRQ_CTRL[12]
+ * bit[0-19]: single edge trigger
+ */
+static int meson_a9_ao_gpio_irq_set_type(struct meson_gpio_irq_controller *ctl,
+ unsigned int type, u32 *channel_hwirq)
+{
+ const struct meson_gpio_irq_params *params = ctl->params;
+ unsigned int idx;
+ u32 val;
+
+ idx = meson_gpio_irq_get_channel_idx(ctl, channel_hwirq);
+
+ type &= IRQ_TYPE_SENSE_MASK;
+
+ meson_gpio_irq_update_bits(ctl, params->edge_pol_reg, BIT(idx), 0);
+
+ if (type == IRQ_TYPE_EDGE_BOTH) {
+ val = BIT(ctl->params->edge_both_offset + idx);
+ meson_gpio_irq_update_bits(ctl, params->edge_pol_reg, val, val);
+ return 0;
+ }
+
+ val = 0;
+ if (type & (IRQ_TYPE_LEVEL_LOW | IRQ_TYPE_EDGE_FALLING))
+ val = BIT(idx);
+ meson_gpio_irq_update_bits(ctl, REG_A9_AO_POL, BIT(idx), val);
+
+ val = 0;
+ if (type & (IRQ_TYPE_EDGE_RISING | IRQ_TYPE_EDGE_FALLING))
+ val = BIT(idx);
+ meson_gpio_irq_update_bits(ctl, REG_A9_AO_EDGE, BIT(idx), val);
+
+ return 0;
+}
+
/*
* gpio irq relative registers for s4
* -PADCTRL_GPIO_IRQ_CTRL0
diff --git a/drivers/irqchip/irq-realtek-rtl.c b/drivers/irqchip/irq-realtek-rtl.c
index 942c1f8c363d..2ae3be7fa633 100644
--- a/drivers/irqchip/irq-realtek-rtl.c
+++ b/drivers/irqchip/irq-realtek-rtl.c
@@ -23,10 +23,10 @@
#define RTL_ICTL_NUM_INPUTS 32
-#define REG(x) (realtek_ictl_base + x)
+#define REG(cpu, x) (realtek_ictl_base[cpu] + x)
static DEFINE_RAW_SPINLOCK(irq_lock);
-static void __iomem *realtek_ictl_base;
+static void __iomem *realtek_ictl_base[NR_CPUS];
/*
* IRR0-IRR3 store 4 bits per interrupt, but Realtek uses inverted numbering,
@@ -37,10 +37,29 @@ static void __iomem *realtek_ictl_base;
#define IRR_OFFSET(idx) (4 * (3 - (idx * 4) / 32))
#define IRR_SHIFT(idx) ((idx * 4) % 32)
-static void write_irr(void __iomem *irr0, int idx, u32 value)
+static inline void enable_gimr(unsigned int cpu, unsigned int hw_irq)
{
- unsigned int offset = IRR_OFFSET(idx);
- unsigned int shift = IRR_SHIFT(idx);
+ u32 gimr;
+
+ gimr = readl(REG(cpu, RTL_ICTL_GIMR));
+ gimr |= BIT(hw_irq);
+ writel(gimr, REG(cpu, RTL_ICTL_GIMR));
+}
+
+static inline void disable_gimr(unsigned int cpu, unsigned int hw_irq)
+{
+ u32 gimr;
+
+ gimr = readl(REG(cpu, RTL_ICTL_GIMR));
+ gimr &= ~BIT(hw_irq);
+ writel(gimr, REG(cpu, RTL_ICTL_GIMR));
+}
+
+static void write_irr(unsigned int cpu, int hw_irq, u32 value)
+{
+ void __iomem *irr0 = REG(cpu, RTL_ICTL_IRR0);
+ unsigned int offset = IRR_OFFSET(hw_irq);
+ unsigned int shift = IRR_SHIFT(hw_irq);
u32 irr;
irr = readl(irr0 + offset) & ~(0xf << shift);
@@ -50,47 +69,51 @@ static void write_irr(void __iomem *irr0, int idx, u32 value)
static void realtek_ictl_unmask_irq(struct irq_data *i)
{
- unsigned long flags;
- u32 value;
+ unsigned int cpu;
- raw_spin_lock_irqsave(&irq_lock, flags);
+ guard(raw_spinlock)(&irq_lock);
+ for_each_cpu(cpu, irq_data_get_effective_affinity_mask(i))
+ enable_gimr(cpu, i->hwirq);
+}
- value = readl(REG(RTL_ICTL_GIMR));
- value |= BIT(i->hwirq);
- writel(value, REG(RTL_ICTL_GIMR));
+static void realtek_ictl_mask_irq(struct irq_data *i)
+{
+ unsigned int cpu;
- raw_spin_unlock_irqrestore(&irq_lock, flags);
+ guard(raw_spinlock)(&irq_lock);
+ for_each_cpu(cpu, irq_data_get_effective_affinity_mask(i))
+ disable_gimr(cpu, i->hwirq);
}
-static void realtek_ictl_mask_irq(struct irq_data *i)
+static int realtek_ictl_irq_affinity(struct irq_data *i, const struct cpumask *dest, bool force)
{
- unsigned long flags;
- u32 value;
+ if (!irqd_irq_masked(i))
+ realtek_ictl_mask_irq(i);
- raw_spin_lock_irqsave(&irq_lock, flags);
+ irq_data_update_effective_affinity(i, dest);
- value = readl(REG(RTL_ICTL_GIMR));
- value &= ~BIT(i->hwirq);
- writel(value, REG(RTL_ICTL_GIMR));
+ if (!irqd_irq_masked(i))
+ realtek_ictl_unmask_irq(i);
- raw_spin_unlock_irqrestore(&irq_lock, flags);
+ return IRQ_SET_MASK_OK;
}
static struct irq_chip realtek_ictl_irq = {
- .name = "realtek-rtl-intc",
- .irq_mask = realtek_ictl_mask_irq,
- .irq_unmask = realtek_ictl_unmask_irq,
+ .name = "realtek-rtl-intc",
+ .irq_mask = realtek_ictl_mask_irq,
+ .irq_unmask = realtek_ictl_unmask_irq,
+ .irq_set_affinity = realtek_ictl_irq_affinity,
};
static int intc_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hw)
{
- unsigned long flags;
+ unsigned int cpu;
irq_set_chip_and_handler(irq, &realtek_ictl_irq, handle_level_irq);
- raw_spin_lock_irqsave(&irq_lock, flags);
- write_irr(REG(RTL_ICTL_IRR0), hw, 1);
- raw_spin_unlock_irqrestore(&irq_lock, flags);
+ guard(raw_spinlock_irqsave)(&irq_lock);
+ for_each_present_cpu(cpu)
+ write_irr(cpu, hw, 1);
return 0;
}
@@ -103,12 +126,13 @@ static const struct irq_domain_ops irq_domain_ops = {
static void realtek_irq_dispatch(struct irq_desc *desc)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
+ unsigned int cpu = smp_processor_id();
struct irq_domain *domain;
unsigned long pending;
unsigned int soc_int;
chained_irq_enter(chip, desc);
- pending = readl(REG(RTL_ICTL_GIMR)) & readl(REG(RTL_ICTL_GISR));
+ pending = readl(REG(cpu, RTL_ICTL_GIMR)) & readl(REG(cpu, RTL_ICTL_GISR));
if (unlikely(!pending)) {
spurious_interrupt();
@@ -116,7 +140,7 @@ static void realtek_irq_dispatch(struct irq_desc *desc)
}
domain = irq_desc_get_handler_data(desc);
- for_each_set_bit(soc_int, &pending, 32)
+ for_each_set_bit(soc_int, &pending, RTL_ICTL_NUM_INPUTS)
generic_handle_domain_irq(domain, soc_int);
out:
@@ -127,17 +151,19 @@ static int __init realtek_rtl_of_init(struct device_node *node, struct device_no
{
struct of_phandle_args oirq;
struct irq_domain *domain;
- unsigned int soc_irq;
- int parent_irq;
-
- realtek_ictl_base = of_iomap(node, 0);
- if (!realtek_ictl_base)
- return -ENXIO;
-
- /* Disable all cascaded interrupts and clear routing */
- writel(0, REG(RTL_ICTL_GIMR));
- for (soc_irq = 0; soc_irq < RTL_ICTL_NUM_INPUTS; soc_irq++)
- write_irr(REG(RTL_ICTL_IRR0), soc_irq, 0);
+ int cpu, parent_irq;
+
+ for_each_present_cpu(cpu) {
+ realtek_ictl_base[cpu] = of_iomap(node, cpu);
+ if (!realtek_ictl_base[cpu])
+ return -ENXIO;
+
+ /* Disable all cascaded interrupts and clear routing */
+ for (unsigned int hw_irq = 0; hw_irq < RTL_ICTL_NUM_INPUTS; hw_irq++) {
+ disable_gimr(cpu, hw_irq);
+ write_irr(cpu, hw_irq, 0);
+ }
+ }
if (WARN_ON(!of_irq_count(node))) {
/*
diff --git a/drivers/irqchip/irq-renesas-rzt2h.c b/drivers/irqchip/irq-renesas-rzt2h.c
index ecb69da55508..e06264add3cc 100644
--- a/drivers/irqchip/irq-renesas-rzt2h.c
+++ b/drivers/irqchip/irq-renesas-rzt2h.c
@@ -2,6 +2,7 @@
#include <linux/bitfield.h>
#include <linux/err.h>
+#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/irqchip.h>
#include <linux/irqchip/irq-renesas-rzt2h.h>
@@ -30,16 +31,44 @@
RZT2H_ICU_IRQ_S_COUNT)
#define RZT2H_ICU_SEI_COUNT 1
+#define RZT2H_ICU_CA55_ERR_START (RZT2H_ICU_SEI_START + \
+ RZT2H_ICU_SEI_COUNT)
+#define RZT2H_ICU_CA55_ERR_COUNT 2
+
+#define RZT2H_ICU_CR52_ERR_START (RZT2H_ICU_CA55_ERR_START + \
+ RZT2H_ICU_CA55_ERR_COUNT)
+#define RZT2H_ICU_CR52_ERR_COUNT 4
+
+#define RZT2H_ICU_PERI_ERR_START (RZT2H_ICU_CR52_ERR_START + \
+ RZT2H_ICU_CR52_ERR_COUNT)
+#define RZT2H_ICU_PERI_ERR_COUNT 2
+
+#define RZT2H_ICU_DSMIF_ERR_START (RZT2H_ICU_PERI_ERR_START + \
+ RZT2H_ICU_PERI_ERR_COUNT)
+#define RZT2H_ICU_DSMIF_ERR_COUNT 2
+
+#define RZT2H_ICU_ENCIF_ERR_START (RZT2H_ICU_DSMIF_ERR_START + \
+ RZT2H_ICU_DSMIF_ERR_COUNT)
+#define RZT2H_ICU_ENCIF_ERR_COUNT 2
+
#define RZT2H_ICU_NUM_IRQ (RZT2H_ICU_INTCPU_NS_COUNT + \
RZT2H_ICU_INTCPU_S_COUNT + \
RZT2H_ICU_IRQ_NS_COUNT + \
RZT2H_ICU_IRQ_S_COUNT + \
- RZT2H_ICU_SEI_COUNT)
+ RZT2H_ICU_SEI_COUNT + \
+ RZT2H_ICU_CA55_ERR_COUNT + \
+ RZT2H_ICU_CR52_ERR_COUNT + \
+ RZT2H_ICU_PERI_ERR_COUNT + \
+ RZT2H_ICU_DSMIF_ERR_COUNT + \
+ RZT2H_ICU_ENCIF_ERR_COUNT)
#define RZT2H_ICU_IRQ_IN_RANGE(n, type) \
((n) >= RZT2H_ICU_##type##_START && \
(n) < RZT2H_ICU_##type##_START + RZT2H_ICU_##type##_COUNT)
+#define RZT2H_ICU_SWINT 0x0
+#define RZT2H_ICU_SWINT_IC_MASK(i) BIT(i)
+
#define RZT2H_ICU_PORTNF_MD 0xc
#define RZT2H_ICU_PORTNF_MDi_MASK(i) (GENMASK(1, 0) << ((i) * 2))
#define RZT2H_ICU_PORTNF_MDi_PREP(i, val) (FIELD_PREP(GENMASK(1, 0), val) << ((i) * 2))
@@ -49,6 +78,29 @@
#define RZT2H_ICU_MD_RISING_EDGE 0b10
#define RZT2H_ICU_MD_BOTH_EDGES 0b11
+#define RZT2H_ICU_CA55ERR_E0MSK 0x50
+#define RZT2H_ICU_CA55ERR_CLR 0x60
+#define RZT2H_ICU_CA55ERR_STAT 0x64
+#define RZT2H_ICU_CA55ERR_MASK GENMASK(12, 0)
+
+#define RZT2H_ICU_PERIERR_E0MSKn(n) (0x98 + 0x4 * (n))
+#define RZT2H_ICU_PERIERR_CLRn(n) (0xc8 + 0x4 * (n))
+#define RZT2H_ICU_PERIERR_STAT 0xd4
+#define RZT2H_ICU_PERIERR_NUM 3
+#define RZT2H_ICU_PERIERR_MASK GENMASK(31, 0)
+
+#define RZT2H_ICU_DSMIFERR_E0MSKn(n) (0xe0 + 0x4 * (n))
+#define RZT2H_ICU_DSMIFERR_CLRn(n) (0x1a0 + 0x4 * (n))
+#define RZT2H_ICU_DSMIFERR_STAT 0x1d0
+#define RZT2H_ICU_DSMIFERR_NUM 12
+#define RZT2H_ICU_DSMIFERR_MASK GENMASK(31, 0)
+
+#define RZT2H_ICU_ENCIFERR_E0MSKn(n) (0x200 + 0x4 * (n))
+#define RZT2H_ICU_ENCIFERR_CLRn(n) (0x250 + 0x4 * (n))
+#define RZT2H_ICU_ENCIFERR_STAT 0x264
+#define RZT2H_ICU_ENCIFERR_NUM 5
+#define RZT2H_ICU_ENCIFERR_MASK GENMASK(31, 0)
+
#define RZT2H_ICU_DMACn_RSSELi(n, i) (0x7d0 + 0x18 * (n) + 0x4 * (i))
#define RZT2H_ICU_DMAC_REQ_SELx_MASK(x) (GENMASK(9, 0) << ((x) * 10))
#define RZT2H_ICU_DMAC_REQ_SELx_PREP(x, val) (FIELD_PREP(GENMASK(9, 0), val) << ((x) * 10))
@@ -99,6 +151,12 @@ static inline int rzt2h_icu_irq_to_offset(struct irq_data *d, void __iomem **bas
} else if (RZT2H_ICU_IRQ_IN_RANGE(hwirq, IRQ_S) || RZT2H_ICU_IRQ_IN_RANGE(hwirq, SEI)) {
*offset = hwirq - RZT2H_ICU_IRQ_S_START;
*base = priv->base_s;
+ } else if (RZT2H_ICU_IRQ_IN_RANGE(hwirq, INTCPU_NS)) {
+ *offset = hwirq - RZT2H_ICU_INTCPU_NS_START;
+ *base = priv->base_ns;
+ } else if (RZT2H_ICU_IRQ_IN_RANGE(hwirq, INTCPU_S)) {
+ *offset = hwirq - RZT2H_ICU_INTCPU_S_START;
+ *base = priv->base_s;
} else {
return -EINVAL;
}
@@ -164,6 +222,28 @@ static int rzt2h_icu_set_type(struct irq_data *d, unsigned int type)
return irq_chip_set_type_parent(d, IRQ_TYPE_EDGE_RISING);
}
+static int rzt2h_icu_intcpu_set_irqchip_state(struct irq_data *d, enum irqchip_irq_state which,
+ bool state)
+{
+ unsigned int offset;
+ void __iomem *base;
+ int ret;
+
+ if (which != IRQCHIP_STATE_PENDING)
+ return irq_chip_set_parent_state(d, which, state);
+
+ if (!state)
+ return 0;
+
+ ret = rzt2h_icu_irq_to_offset(d, &base, &offset);
+ if (ret)
+ return ret;
+
+ writel_relaxed(RZT2H_ICU_SWINT_IC_MASK(offset), base + RZT2H_ICU_SWINT);
+
+ return 0;
+}
+
static const struct irq_chip rzt2h_icu_chip = {
.name = "rzt2h-icu",
.irq_mask = irq_chip_mask_parent,
@@ -180,10 +260,27 @@ static const struct irq_chip rzt2h_icu_chip = {
IRQCHIP_SKIP_SET_WAKE,
};
+static const struct irq_chip rzt2h_icu_intcpu_chip = {
+ .name = "rzt2h-icu",
+ .irq_mask = irq_chip_mask_parent,
+ .irq_unmask = irq_chip_unmask_parent,
+ .irq_eoi = irq_chip_eoi_parent,
+ .irq_set_type = irq_chip_set_type_parent,
+ .irq_set_wake = irq_chip_set_wake_parent,
+ .irq_set_affinity = irq_chip_set_affinity_parent,
+ .irq_retrigger = irq_chip_retrigger_hierarchy,
+ .irq_get_irqchip_state = irq_chip_get_parent_state,
+ .irq_set_irqchip_state = rzt2h_icu_intcpu_set_irqchip_state,
+ .flags = IRQCHIP_MASK_ON_SUSPEND |
+ IRQCHIP_SET_TYPE_MASKED |
+ IRQCHIP_SKIP_SET_WAKE,
+};
+
static int rzt2h_icu_alloc(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs,
void *arg)
{
struct rzt2h_icu_priv *priv = domain->host_data;
+ const struct irq_chip *chip;
irq_hw_number_t hwirq;
unsigned int type;
int ret;
@@ -192,7 +289,12 @@ static int rzt2h_icu_alloc(struct irq_domain *domain, unsigned int virq, unsigne
if (ret)
return ret;
- ret = irq_domain_set_hwirq_and_chip(domain, virq, hwirq, &rzt2h_icu_chip, NULL);
+ if (RZT2H_ICU_IRQ_IN_RANGE(hwirq, INTCPU_NS) || RZT2H_ICU_IRQ_IN_RANGE(hwirq, INTCPU_S))
+ chip = &rzt2h_icu_intcpu_chip;
+ else
+ chip = &rzt2h_icu_chip;
+
+ ret = irq_domain_set_hwirq_and_chip(domain, virq, hwirq, chip, NULL);
if (ret)
return ret;
@@ -222,6 +324,155 @@ static int rzt2h_icu_parse_interrupts(struct rzt2h_icu_priv *priv, struct device
return 0;
}
+static irqreturn_t rzt2h_icu_intcpu_irq(int irq, void *data)
+{
+ unsigned int intcpu = (uintptr_t)data;
+
+ pr_info("INTCPU%u software interrupt\n", intcpu);
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t rzt2h_icu_err_irq(struct rzt2h_icu_priv *priv, const char *name,
+ unsigned int num, u32 stat_base, u32 clr_base)
+{
+ bool handled = false;
+
+ for (unsigned int n = 0; n < num; n++) {
+ u32 stat = readl(priv->base_ns + stat_base + n * 0x4);
+
+ if (!stat)
+ continue;
+
+ handled = true;
+
+ pr_err("rzt2h-icu: %s error n=%u status=0x%08x\n", name, n, stat);
+
+ writel_relaxed(stat, priv->base_ns + clr_base + n * 0x4);
+ }
+
+ return handled ? IRQ_HANDLED : IRQ_NONE;
+}
+
+static irqreturn_t rzt2h_icu_ca55_err_irq(int irq, void *data)
+{
+ return rzt2h_icu_err_irq(data, "CA55", 1, RZT2H_ICU_CA55ERR_STAT, RZT2H_ICU_CA55ERR_CLR);
+}
+
+static irqreturn_t rzt2h_icu_peri_err_irq(int irq, void *data)
+{
+ return rzt2h_icu_err_irq(data, "peripheral", RZT2H_ICU_PERIERR_NUM, RZT2H_ICU_PERIERR_STAT,
+ RZT2H_ICU_PERIERR_CLRn(0));
+}
+
+static irqreturn_t rzt2h_icu_dsmif_err_irq(int irq, void *data)
+{
+ return rzt2h_icu_err_irq(data, "DSMIF", RZT2H_ICU_DSMIFERR_NUM, RZT2H_ICU_DSMIFERR_STAT,
+ RZT2H_ICU_DSMIFERR_CLRn(0));
+}
+
+static irqreturn_t rzt2h_icu_encif_err_irq(int irq, void *data)
+{
+ return rzt2h_icu_err_irq(data, "ENCIF", RZT2H_ICU_ENCIFERR_NUM, RZT2H_ICU_ENCIFERR_STAT,
+ RZT2H_ICU_ENCIFERR_CLRn(0));
+}
+
+static int rzt2h_icu_request_irqs(struct platform_device *pdev, struct irq_domain *irq_domain,
+ unsigned int start, unsigned int count, irq_handler_t handler,
+ void *data)
+{
+ struct device *dev = &pdev->dev;
+ unsigned int offset, virq;
+ struct irq_fwspec fwspec;
+ int ret;
+
+ for (offset = start; offset < start + count; offset++) {
+ fwspec.fwnode = irq_domain->fwnode;
+ fwspec.param_count = 2;
+ fwspec.param[0] = offset;
+ fwspec.param[1] = IRQ_TYPE_EDGE_RISING;
+
+ virq = irq_create_fwspec_mapping(&fwspec);
+ if (!virq)
+ return dev_err_probe(dev, -EINVAL, "Failed to create IRQ %u mapping\n", offset);
+
+ ret = devm_request_irq(dev, virq, handler, 0, dev_name(dev),
+ data ?: (void *)(uintptr_t)offset);
+ if (ret)
+ return dev_err_probe(dev, ret, "Failed to request IRQ %u\n", offset);
+ }
+
+ return 0;
+}
+
+static int rzt2h_icu_setup_irqs(struct platform_device *pdev, struct irq_domain *irq_domain)
+{
+ struct rzt2h_icu_priv *priv = platform_get_drvdata(pdev);
+ unsigned int n;
+ int ret;
+
+ if (IS_ENABLED(CONFIG_GENERIC_IRQ_INJECTION)) {
+ ret = rzt2h_icu_request_irqs(pdev, irq_domain, RZT2H_ICU_INTCPU_NS_START,
+ RZT2H_ICU_INTCPU_NS_COUNT, rzt2h_icu_intcpu_irq, NULL);
+ if (ret)
+ return ret;
+
+ ret = rzt2h_icu_request_irqs(pdev, irq_domain, RZT2H_ICU_INTCPU_S_START,
+ RZT2H_ICU_INTCPU_S_COUNT, rzt2h_icu_intcpu_irq, NULL);
+ if (ret)
+ return ret;
+ }
+
+ /*
+ * There are two error interrupts and two error masks that can be used
+ * separately for each error type. It would not be very useful to
+ * receive two interrupts for the same error, so use only the first one.
+ */
+
+ ret = rzt2h_icu_request_irqs(pdev, irq_domain, RZT2H_ICU_CA55_ERR_START, 1,
+ rzt2h_icu_ca55_err_irq, priv);
+ if (ret)
+ return ret;
+
+ ret = rzt2h_icu_request_irqs(pdev, irq_domain, RZT2H_ICU_PERI_ERR_START, 1,
+ rzt2h_icu_peri_err_irq, priv);
+ if (ret)
+ return ret;
+
+ ret = rzt2h_icu_request_irqs(pdev, irq_domain, RZT2H_ICU_DSMIF_ERR_START, 1,
+ rzt2h_icu_dsmif_err_irq, priv);
+ if (ret)
+ return ret;
+
+ ret = rzt2h_icu_request_irqs(pdev, irq_domain, RZT2H_ICU_ENCIF_ERR_START, 1,
+ rzt2h_icu_encif_err_irq, priv);
+ if (ret)
+ return ret;
+
+ /* Clear and unmask CA55 error events */
+ writel_relaxed(RZT2H_ICU_CA55ERR_MASK, priv->base_ns + RZT2H_ICU_CA55ERR_CLR);
+ writel_relaxed(0, priv->base_ns + RZT2H_ICU_CA55ERR_E0MSK);
+
+ /* Clear and unmask peripheral error events */
+ for (n = 0; n < RZT2H_ICU_PERIERR_NUM; n++) {
+ writel_relaxed(RZT2H_ICU_PERIERR_MASK, priv->base_ns + RZT2H_ICU_PERIERR_CLRn(n));
+ writel_relaxed(0, priv->base_ns + RZT2H_ICU_PERIERR_E0MSKn(n));
+ }
+
+ /* Clear and unmask DSMIF error events */
+ for (n = 0; n < RZT2H_ICU_DSMIFERR_NUM; n++) {
+ writel_relaxed(RZT2H_ICU_DSMIFERR_MASK, priv->base_ns + RZT2H_ICU_DSMIFERR_CLRn(n));
+ writel_relaxed(0, priv->base_ns + RZT2H_ICU_DSMIFERR_E0MSKn(n));
+ }
+
+ /* Clear and unmask ENCIF error events */
+ for (n = 0; n < RZT2H_ICU_ENCIFERR_NUM; n++) {
+ writel_relaxed(RZT2H_ICU_ENCIFERR_MASK, priv->base_ns + RZT2H_ICU_ENCIFERR_CLRn(n));
+ writel_relaxed(0, priv->base_ns + RZT2H_ICU_ENCIFERR_E0MSKn(n));
+ }
+
+ return 0;
+}
+
static int rzt2h_icu_init(struct platform_device *pdev, struct device_node *parent)
{
struct irq_domain *irq_domain, *parent_domain;
@@ -265,11 +516,20 @@ static int rzt2h_icu_init(struct platform_device *pdev, struct device_node *pare
irq_domain = irq_domain_create_hierarchy(parent_domain, 0, RZT2H_ICU_NUM_IRQ,
dev_fwnode(dev), &rzt2h_icu_domain_ops, priv);
if (!irq_domain) {
- pm_runtime_put_sync(dev);
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto err_pm_put;
}
+ ret = rzt2h_icu_setup_irqs(pdev, irq_domain);
+ if (ret)
+ goto err_irq_domain_free;
return 0;
+
+err_irq_domain_free:
+ irq_domain_remove(irq_domain);
+err_pm_put:
+ pm_runtime_put_sync(dev);
+ return ret;
}
IRQCHIP_PLATFORM_DRIVER_BEGIN(rzt2h_icu)
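
For readers following the rzt2h_icu_err_irq() logic above, here is a minimal user-space sketch of the same scan-and-clear loop. It is not the driver code: the register file is simulated in memory, the names fake_icu, fake_clr_write and fake_err_irq are hypothetical, and write-1-to-clear behaviour of the CLR registers is an assumption inferred from how the driver writes the observed status back.

```c
#include <stdbool.h>
#include <stdint.h>

#define ERR_NUM 3	/* hypothetical number of status/clear register pairs */

/* Simulated register file: stat[] holds latched error status bits. */
struct fake_icu {
	uint32_t stat[ERR_NUM];
};

/* Assumed write-1-to-clear semantics: each set bit in val clears that status bit. */
static void fake_clr_write(struct fake_icu *icu, unsigned int n, uint32_t val)
{
	icu->stat[n] &= ~val;
}

/*
 * Scan every status register, clear exactly the bits that were observed,
 * and report whether anything was pending (true ~ IRQ_HANDLED,
 * false ~ IRQ_NONE).
 */
static bool fake_err_irq(struct fake_icu *icu)
{
	bool handled = false;

	for (unsigned int n = 0; n < ERR_NUM; n++) {
		uint32_t stat = icu->stat[n];

		if (!stat)
			continue;

		handled = true;
		fake_clr_write(icu, n, stat);
	}

	return handled;
}
```

Writing back only the bits that were read means an error that latches between the read and the clear is not lost; it stays pending for the next invocation.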
diff --git a/drivers/irqchip/irq-starfive-jh8100-intc.c b/drivers/irqchip/irq-starfive-jh8100-intc.c
deleted file mode 100644
index bb62ef363d0b..000000000000
--- a/drivers/irqchip/irq-starfive-jh8100-intc.c
+++ /dev/null
@@ -1,207 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * StarFive JH8100 External Interrupt Controller driver
- *
- * Copyright (C) 2023 StarFive Technology Co., Ltd.
- *
- * Author: Changhuang Liang <changhuang.liang@starfivetech.com>
- */
-
-#define pr_fmt(fmt) "irq-starfive-jh8100: " fmt
-
-#include <linux/bitops.h>
-#include <linux/clk.h>
-#include <linux/irq.h>
-#include <linux/irqchip.h>
-#include <linux/irqchip/chained_irq.h>
-#include <linux/irqdomain.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/reset.h>
-#include <linux/spinlock.h>
-
-#define STARFIVE_INTC_SRC0_CLEAR 0x10
-#define STARFIVE_INTC_SRC0_MASK 0x14
-#define STARFIVE_INTC_SRC0_INT 0x1c
-
-#define STARFIVE_INTC_SRC_IRQ_NUM 32
-
-struct starfive_irq_chip {
- void __iomem *base;
- struct irq_domain *domain;
- raw_spinlock_t lock;
-};
-
-static void starfive_intc_bit_set(struct starfive_irq_chip *irqc,
- u32 reg, u32 bit_mask)
-{
- u32 value;
-
- value = ioread32(irqc->base + reg);
- value |= bit_mask;
- iowrite32(value, irqc->base + reg);
-}
-
-static void starfive_intc_bit_clear(struct starfive_irq_chip *irqc,
- u32 reg, u32 bit_mask)
-{
- u32 value;
-
- value = ioread32(irqc->base + reg);
- value &= ~bit_mask;
- iowrite32(value, irqc->base + reg);
-}
-
-static void starfive_intc_unmask(struct irq_data *d)
-{
- struct starfive_irq_chip *irqc = irq_data_get_irq_chip_data(d);
-
- raw_spin_lock(&irqc->lock);
- starfive_intc_bit_clear(irqc, STARFIVE_INTC_SRC0_MASK, BIT(d->hwirq));
- raw_spin_unlock(&irqc->lock);
-}
-
-static void starfive_intc_mask(struct irq_data *d)
-{
- struct starfive_irq_chip *irqc = irq_data_get_irq_chip_data(d);
-
- raw_spin_lock(&irqc->lock);
- starfive_intc_bit_set(irqc, STARFIVE_INTC_SRC0_MASK, BIT(d->hwirq));
- raw_spin_unlock(&irqc->lock);
-}
-
-static struct irq_chip intc_dev = {
- .name = "StarFive JH8100 INTC",
- .irq_unmask = starfive_intc_unmask,
- .irq_mask = starfive_intc_mask,
-};
-
-static int starfive_intc_map(struct irq_domain *d, unsigned int irq,
- irq_hw_number_t hwirq)
-{
- irq_domain_set_info(d, irq, hwirq, &intc_dev, d->host_data,
- handle_level_irq, NULL, NULL);
-
- return 0;
-}
-
-static const struct irq_domain_ops starfive_intc_domain_ops = {
- .xlate = irq_domain_xlate_onecell,
- .map = starfive_intc_map,
-};
-
-static void starfive_intc_irq_handler(struct irq_desc *desc)
-{
- struct starfive_irq_chip *irqc = irq_data_get_irq_handler_data(&desc->irq_data);
- struct irq_chip *chip = irq_desc_get_chip(desc);
- unsigned long value;
- int hwirq;
-
- chained_irq_enter(chip, desc);
-
- value = ioread32(irqc->base + STARFIVE_INTC_SRC0_INT);
- while (value) {
- hwirq = ffs(value) - 1;
-
- generic_handle_domain_irq(irqc->domain, hwirq);
-
- starfive_intc_bit_set(irqc, STARFIVE_INTC_SRC0_CLEAR, BIT(hwirq));
- starfive_intc_bit_clear(irqc, STARFIVE_INTC_SRC0_CLEAR, BIT(hwirq));
-
- __clear_bit(hwirq, &value);
- }
-
- chained_irq_exit(chip, desc);
-}
-
-static int starfive_intc_probe(struct platform_device *pdev, struct device_node *parent)
-{
- struct device_node *intc = pdev->dev.of_node;
- struct starfive_irq_chip *irqc;
- struct reset_control *rst;
- struct clk *clk;
- int parent_irq;
- int ret;
-
- irqc = kzalloc_obj(*irqc);
- if (!irqc)
- return -ENOMEM;
-
- irqc->base = of_iomap(intc, 0);
- if (!irqc->base) {
- pr_err("Unable to map registers\n");
- ret = -ENXIO;
- goto err_free;
- }
-
- rst = of_reset_control_get_exclusive(intc, NULL);
- if (IS_ERR(rst)) {
- pr_err("Unable to get reset control %pe\n", rst);
- ret = PTR_ERR(rst);
- goto err_unmap;
- }
-
- clk = of_clk_get(intc, 0);
- if (IS_ERR(clk)) {
- pr_err("Unable to get clock %pe\n", clk);
- ret = PTR_ERR(clk);
- goto err_reset_put;
- }
-
- ret = reset_control_deassert(rst);
- if (ret)
- goto err_clk_put;
-
- ret = clk_prepare_enable(clk);
- if (ret)
- goto err_reset_assert;
-
- raw_spin_lock_init(&irqc->lock);
-
- irqc->domain = irq_domain_create_linear(of_fwnode_handle(intc), STARFIVE_INTC_SRC_IRQ_NUM,
- &starfive_intc_domain_ops, irqc);
- if (!irqc->domain) {
- pr_err("Unable to create IRQ domain\n");
- ret = -EINVAL;
- goto err_clk_disable;
- }
-
- parent_irq = of_irq_get(intc, 0);
- if (parent_irq < 0) {
- pr_err("Failed to get main IRQ: %d\n", parent_irq);
- ret = parent_irq;
- goto err_remove_domain;
- }
-
- irq_set_chained_handler_and_data(parent_irq, starfive_intc_irq_handler,
- irqc);
-
- pr_info("Interrupt controller register, nr_irqs %d\n",
- STARFIVE_INTC_SRC_IRQ_NUM);
-
- return 0;
-
-err_remove_domain:
- irq_domain_remove(irqc->domain);
-err_clk_disable:
- clk_disable_unprepare(clk);
-err_reset_assert:
- reset_control_assert(rst);
-err_clk_put:
- clk_put(clk);
-err_reset_put:
- reset_control_put(rst);
-err_unmap:
- iounmap(irqc->base);
-err_free:
- kfree(irqc);
- return ret;
-}
-
-IRQCHIP_PLATFORM_DRIVER_BEGIN(starfive_intc)
-IRQCHIP_MATCH("starfive,jh8100-intc", starfive_intc_probe)
-IRQCHIP_PLATFORM_DRIVER_END(starfive_intc)
-
-MODULE_DESCRIPTION("StarFive JH8100 External Interrupt Controller");
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Changhuang Liang <changhuang.liang@starfivetech.com>");
diff --git a/drivers/irqchip/irq-starfive-jhb100-intc.c b/drivers/irqchip/irq-starfive-jhb100-intc.c
new file mode 100644
index 000000000000..838885b02f34
--- /dev/null
+++ b/drivers/irqchip/irq-starfive-jhb100-intc.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * StarFive JHB100 External Interrupt Controller driver
+ *
+ * Copyright (C) 2023 StarFive Technology Co., Ltd.
+ *
+ * Author: Changhuang Liang <changhuang.liang@starfivetech.com>
+ */
+
+#include <linux/bitops.h>
+#include <linux/cleanup.h>
+#include <linux/clk.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqdomain.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/reset.h>
+#include <linux/spinlock.h>
+
+#define STARFIVE_INTC_SRC_TYPE(n) (0x04 + ((n) * 0x20))
+#define STARFIVE_INTC_SRC_CLEAR(n) (0x10 + ((n) * 0x20))
+#define STARFIVE_INTC_SRC_MASK(n) (0x14 + ((n) * 0x20))
+#define STARFIVE_INTC_SRC_INT(n) (0x1c + ((n) * 0x20))
+
+#define STARFIVE_INTC_TRIGGER_MASK 0x3
+#define STARFIVE_INTC_TRIGGER_HIGH 0
+#define STARFIVE_INTC_TRIGGER_LOW 1
+#define STARFIVE_INTC_TRIGGER_POSEDGE 2
+#define STARFIVE_INTC_TRIGGER_NEGEDGE 3
+
+#define STARFIVE_INTC_NUM 2
+#define STARFIVE_INTC_SRC_IRQ_NUM 32
+#define STARFIVE_INTC_TYPE_NUM 16
+
+struct starfive_irq_chip {
+ void __iomem *base;
+ struct irq_domain *domain;
+ raw_spinlock_t lock;
+};
+
+static void starfive_intc_mod(struct starfive_irq_chip *irqc, u32 reg, u32 mask, u32 data)
+{
+ u32 value;
+
+ value = ioread32(irqc->base + reg) & ~mask;
+ data &= mask;
+ data |= value;
+ iowrite32(data, irqc->base + reg);
+}
+
+static void starfive_intc_bit_set(struct starfive_irq_chip *irqc,
+ u32 reg, u32 bit_mask)
+{
+ u32 value;
+
+ value = ioread32(irqc->base + reg);
+ value |= bit_mask;
+ iowrite32(value, irqc->base + reg);
+}
+
+static void starfive_intc_bit_clear(struct starfive_irq_chip *irqc,
+ u32 reg, u32 bit_mask)
+{
+ u32 value;
+
+ value = ioread32(irqc->base + reg);
+ value &= ~bit_mask;
+ iowrite32(value, irqc->base + reg);
+}
+
+static void starfive_intc_unmask(struct irq_data *d)
+{
+ struct starfive_irq_chip *irqc = irq_data_get_irq_chip_data(d);
+ int i, bitpos;
+
+ i = d->hwirq / STARFIVE_INTC_SRC_IRQ_NUM;
+ bitpos = d->hwirq % STARFIVE_INTC_SRC_IRQ_NUM;
+
+ guard(raw_spinlock)(&irqc->lock);
+ starfive_intc_bit_clear(irqc, STARFIVE_INTC_SRC_MASK(i), BIT(bitpos));
+}
+
+static void starfive_intc_mask(struct irq_data *d)
+{
+ struct starfive_irq_chip *irqc = irq_data_get_irq_chip_data(d);
+ int i, bitpos;
+
+ i = d->hwirq / STARFIVE_INTC_SRC_IRQ_NUM;
+ bitpos = d->hwirq % STARFIVE_INTC_SRC_IRQ_NUM;
+
+ guard(raw_spinlock)(&irqc->lock);
+ starfive_intc_bit_set(irqc, STARFIVE_INTC_SRC_MASK(i), BIT(bitpos));
+}
+
+static void starfive_intc_ack(struct irq_data *d)
+{
+ /* for handle_edge_irq, nothing to do */
+}
+
+static int starfive_intc_set_type(struct irq_data *d, unsigned int type)
+{
+ struct starfive_irq_chip *irqc = irq_data_get_irq_chip_data(d);
+ u32 i, bitpos, ty_pos, ty_shift, trigger, typeval;
+ irq_flow_handler_t handler;
+
+ i = d->hwirq / STARFIVE_INTC_SRC_IRQ_NUM;
+ bitpos = d->hwirq % STARFIVE_INTC_SRC_IRQ_NUM;
+ ty_pos = bitpos / STARFIVE_INTC_TYPE_NUM;
+ ty_shift = (bitpos % STARFIVE_INTC_TYPE_NUM) * 2;
+
+ switch (type) {
+ case IRQF_TRIGGER_LOW:
+ trigger = STARFIVE_INTC_TRIGGER_LOW;
+ handler = handle_level_irq;
+ break;
+ case IRQF_TRIGGER_HIGH:
+ trigger = STARFIVE_INTC_TRIGGER_HIGH;
+ handler = handle_level_irq;
+ break;
+ case IRQF_TRIGGER_FALLING:
+ trigger = STARFIVE_INTC_TRIGGER_NEGEDGE;
+ handler = handle_edge_irq;
+ break;
+ case IRQF_TRIGGER_RISING:
+ trigger = STARFIVE_INTC_TRIGGER_POSEDGE;
+ handler = handle_edge_irq;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ irq_set_handler_locked(d, handler);
+ typeval = trigger << ty_shift;
+
+ guard(raw_spinlock)(&irqc->lock);
+
+ starfive_intc_mod(irqc, STARFIVE_INTC_SRC_TYPE(i) + 4 * ty_pos,
+ STARFIVE_INTC_TRIGGER_MASK << ty_shift, typeval);
+
+ /* Once the type is updated, clear interrupt can help to reset the type value */
+ starfive_intc_bit_set(irqc, STARFIVE_INTC_SRC_CLEAR(i), BIT(bitpos));
+ starfive_intc_bit_clear(irqc, STARFIVE_INTC_SRC_CLEAR(i), BIT(bitpos));
+
+ return 0;
+}
+
+static struct irq_chip intc_dev = {
+ .name = "StarFive JHB100 INTC",
+ .irq_unmask = starfive_intc_unmask,
+ .irq_mask = starfive_intc_mask,
+ .irq_ack = starfive_intc_ack,
+ .irq_set_type = starfive_intc_set_type,
+};
+
+static int starfive_intc_map(struct irq_domain *d, unsigned int irq,
+ irq_hw_number_t hwirq)
+{
+ irq_domain_set_info(d, irq, hwirq, &intc_dev, d->host_data,
+ handle_level_irq, NULL, NULL);
+
+ return 0;
+}
+
+static const struct irq_domain_ops starfive_intc_domain_ops = {
+ .xlate = irq_domain_xlate_onecell,
+ .map = starfive_intc_map,
+};
+
+static void starfive_intc_irq_handler(struct irq_desc *desc)
+{
+ struct starfive_irq_chip *irqc = irq_data_get_irq_handler_data(&desc->irq_data);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ unsigned long value;
+ int hwirq;
+
+ chained_irq_enter(chip, desc);
+
+ for (int i = 0; i < STARFIVE_INTC_NUM; i++) {
+ value = ioread32(irqc->base + STARFIVE_INTC_SRC_INT(i));
+ while (value) {
+ hwirq = ffs(value) - 1;
+
+ generic_handle_domain_irq(irqc->domain,
+ hwirq + i * STARFIVE_INTC_SRC_IRQ_NUM);
+
+ starfive_intc_bit_set(irqc, STARFIVE_INTC_SRC_CLEAR(i), BIT(hwirq));
+ starfive_intc_bit_clear(irqc, STARFIVE_INTC_SRC_CLEAR(i), BIT(hwirq));
+
+ __clear_bit(hwirq, &value);
+ }
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static int starfive_intc_probe(struct platform_device *pdev, struct device_node *parent)
+{
+ struct device_node *intc = pdev->dev.of_node;
+ struct reset_control *rst;
+ struct clk *clk;
+ int parent_irq;
+
+ struct starfive_irq_chip *irqc __free(kfree) = kzalloc_obj(*irqc);
+ if (!irqc)
+ return -ENOMEM;
+
+ irqc->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(irqc->base))
+ return dev_err_probe(&pdev->dev, PTR_ERR(irqc->base), "unable to map registers\n");
+
+ rst = devm_reset_control_get_optional_exclusive_deasserted(&pdev->dev, NULL);
+ if (IS_ERR(rst))
+ return dev_err_probe(&pdev->dev, PTR_ERR(rst),
+ "Unable to get and deassert reset control\n");
+
+ clk = devm_clk_get_optional_enabled(&pdev->dev, NULL);
+ if (IS_ERR(clk))
+ return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Unable to get and enable clock\n");
+
+
+ raw_spin_lock_init(&irqc->lock);
+
+ irqc->domain = irq_domain_create_linear(of_fwnode_handle(intc),
+ STARFIVE_INTC_SRC_IRQ_NUM * STARFIVE_INTC_NUM,
+ &starfive_intc_domain_ops, irqc);
+ if (!irqc->domain)
+ return dev_err_probe(&pdev->dev, -EINVAL, "Unable to create IRQ domain\n");
+
+ parent_irq = of_irq_get(intc, 0);
+ if (parent_irq < 0) {
+ irq_domain_remove(irqc->domain);
+ return dev_err_probe(&pdev->dev, parent_irq, "Failed to get main IRQ\n");
+ }
+
+ irq_set_chained_handler_and_data(parent_irq, starfive_intc_irq_handler,
+ irqc);
+
+ dev_info(&pdev->dev, "Interrupt controller register, nr_irqs %d\n",
+ STARFIVE_INTC_SRC_IRQ_NUM * STARFIVE_INTC_NUM);
+
+ retain_and_null_ptr(irqc);
+ return 0;
+}
+
+IRQCHIP_PLATFORM_DRIVER_BEGIN(starfive_intc)
+IRQCHIP_MATCH("starfive,jhb100-intc", starfive_intc_probe)
+IRQCHIP_PLATFORM_DRIVER_END(starfive_intc)
+
+MODULE_DESCRIPTION("StarFive JHB100 External Interrupt Controller");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Changhuang Liang <changhuang.liang@starfivetech.com>");
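
As an illustration of the index arithmetic in starfive_intc_set_type() above, the following user-space sketch computes which type register and which 2-bit field a given hwirq lands in. The constants mirror the defines in the patch; struct type_field, type_field_for() and mod_value() are hypothetical names introduced only for this sketch.

```c
#include <stdint.h>

#define SRC_IRQ_NUM	32	/* interrupts per source bank */
#define TYPE_NUM	16	/* 2-bit type fields per 32-bit type register */
#define SRC_TYPE(n)	(0x04 + ((n) * 0x20))

struct type_field {
	uint32_t reg;	/* byte offset of the type register */
	uint32_t shift;	/* bit position of the 2-bit trigger field */
};

/* Map a hardware interrupt number to its type register and field shift. */
static struct type_field type_field_for(uint32_t hwirq)
{
	uint32_t bank = hwirq / SRC_IRQ_NUM;
	uint32_t bitpos = hwirq % SRC_IRQ_NUM;
	struct type_field f = {
		.reg = SRC_TYPE(bank) + 4 * (bitpos / TYPE_NUM),
		.shift = (bitpos % TYPE_NUM) * 2,
	};

	return f;
}

/* Value-only version of the read-modify-write done by starfive_intc_mod(). */
static uint32_t mod_value(uint32_t old, uint32_t mask, uint32_t data)
{
	return (old & ~mask) | (data & mask);
}
```

For example, hwirq 40 falls in bank 1 at bit position 8, so its trigger type lives at register offset 0x24, bits 17:16.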
diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c
index 32b77fa93f73..2014dbb0bc43 100644
--- a/drivers/irqchip/qcom-pdc.c
+++ b/drivers/irqchip/qcom-pdc.c
@@ -3,6 +3,7 @@
* Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
*/
+#include <linux/bitfield.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/interrupt.h>
@@ -21,22 +22,30 @@
#include <linux/types.h>
#define PDC_MAX_GPIO_IRQS 256
-#define PDC_DRV_OFFSET 0x10000
+#define PDC_DRV_SIZE 0x10000
/* Valid only on HW version < 3.2 */
#define IRQ_ENABLE_BANK 0x10
#define IRQ_ENABLE_BANK_MAX (IRQ_ENABLE_BANK + BITS_TO_BYTES(PDC_MAX_GPIO_IRQS))
+#define IRQ_ENABLE_BANK_INDEX_MASK GENMASK(31, 5)
+#define IRQ_ENABLE_BANK_BIT_MASK GENMASK(4, 0)
#define IRQ_i_CFG 0x110
/* Valid only on HW version >= 3.2 */
#define IRQ_i_CFG_IRQ_ENABLE 3
-#define IRQ_i_CFG_TYPE_MASK GENMASK(2, 0)
+#define IRQ_i_CFG_TYPE_MASK GENMASK(2, 0)
-#define PDC_VERSION_REG 0x1000
+#define PDC_VERSION_REG 0x1000
+#define PDC_VERSION_MAJOR GENMASK(23, 16)
+#define PDC_VERSION_MINOR GENMASK(15, 8)
+#define PDC_VERSION_STEP GENMASK(7, 0)
+#define PDC_VERSION(maj, min, step) (FIELD_PREP(PDC_VERSION_MAJOR, (maj)) | \
+ FIELD_PREP(PDC_VERSION_MINOR, (min)) | \
+ FIELD_PREP(PDC_VERSION_STEP, (step)))
/* Notable PDC versions */
-#define PDC_VERSION_3_2 0x30200
+#define PDC_VERSION_3_2 PDC_VERSION(3, 2, 0)
struct pdc_pin_region {
u32 pin_base;
@@ -97,28 +106,37 @@ static void pdc_x1e_irq_enable_write(u32 bank, u32 enable)
pdc_base_reg_write(base, IRQ_ENABLE_BANK, bank, enable);
}
-static void __pdc_enable_intr(int pin_out, bool on)
+static void pdc_enable_intr_bank(int pin_out, bool on)
{
unsigned long enable;
+ u32 index, mask;
- if (pdc_version < PDC_VERSION_3_2) {
- u32 index, mask;
+ index = FIELD_GET(IRQ_ENABLE_BANK_INDEX_MASK, pin_out);
+ mask = FIELD_GET(IRQ_ENABLE_BANK_BIT_MASK, pin_out);
- index = pin_out / 32;
- mask = pin_out % 32;
+ enable = pdc_reg_read(IRQ_ENABLE_BANK, index);
+ __assign_bit(mask, &enable, on);
- enable = pdc_reg_read(IRQ_ENABLE_BANK, index);
- __assign_bit(mask, &enable, on);
+ if (pdc_x1e_quirk)
+ pdc_x1e_irq_enable_write(index, enable);
+ else
+ pdc_reg_write(IRQ_ENABLE_BANK, index, enable);
+}
- if (pdc_x1e_quirk)
- pdc_x1e_irq_enable_write(index, enable);
- else
- pdc_reg_write(IRQ_ENABLE_BANK, index, enable);
- } else {
- enable = pdc_reg_read(IRQ_i_CFG, pin_out);
- __assign_bit(IRQ_i_CFG_IRQ_ENABLE, &enable, on);
- pdc_reg_write(IRQ_i_CFG, pin_out, enable);
- }
+static void pdc_enable_intr_cfg(int pin_out, bool on)
+{
+ unsigned long enable = pdc_reg_read(IRQ_i_CFG, pin_out);
+
+ __assign_bit(IRQ_i_CFG_IRQ_ENABLE, &enable, on);
+ pdc_reg_write(IRQ_i_CFG, pin_out, enable);
+}
+
+static void __pdc_enable_intr(int pin_out, bool on)
+{
+ if (pdc_version < PDC_VERSION_3_2)
+ pdc_enable_intr_bank(pin_out, on);
+ else
+ pdc_enable_intr_cfg(pin_out, on);
}
static void pdc_enable_intr(struct irq_data *d, bool on)
@@ -348,7 +366,6 @@ static int pdc_setup_pin_mapping(struct device_node *np)
return 0;
}
-#define QCOM_PDC_SIZE 0x30000
static int qcom_pdc_probe(struct platform_device *pdev, struct device_node *parent)
{
@@ -362,7 +379,7 @@ static int qcom_pdc_probe(struct platform_device *pdev, struct device_node *pare
if (of_address_to_resource(node, 0, &res))
return -EINVAL;
- res_size = max_t(resource_size_t, resource_size(&res), QCOM_PDC_SIZE);
+ res_size = max_t(resource_size_t, resource_size(&res), PDC_DRV_SIZE);
if (res_size > resource_size(&res))
pr_warn("%pOF: invalid reg size, please fix DT\n", node);
@@ -375,7 +392,7 @@ static int qcom_pdc_probe(struct platform_device *pdev, struct device_node *pare
* region with the expected offset to preserve support for old DTs.
*/
if (of_device_is_compatible(node, "qcom,x1e80100-pdc")) {
- pdc_prev_base = ioremap(res.start - PDC_DRV_OFFSET, IRQ_ENABLE_BANK_MAX);
+ pdc_prev_base = ioremap(res.start - PDC_DRV_SIZE, IRQ_ENABLE_BANK_MAX);
if (!pdc_prev_base) {
pr_err("%pOF: unable to map previous PDC DRV region\n", node);
return -ENXIO;
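
The qcom-pdc changes above swap open-coded arithmetic for bitfield helpers. The sketch below, written with plain shifts instead of the kernel's GENMASK()/FIELD_PREP()/FIELD_GET(), is meant to show that the new forms should be equivalent to the old ones for non-negative pin numbers. The function names are hypothetical stand-ins, not the driver's.

```c
#include <stdint.h>

/* Stand-in for PDC_VERSION(maj, min, step): major in 23:16, minor in 15:8, step in 7:0. */
static uint32_t pdc_version(uint32_t maj, uint32_t min, uint32_t step)
{
	return ((maj & 0xff) << 16) | ((min & 0xff) << 8) | (step & 0xff);
}

/* Stand-in for FIELD_GET(GENMASK(31, 5), pin_out): same as pin_out / 32. */
static uint32_t bank_index(uint32_t pin_out)
{
	return pin_out >> 5;
}

/* Stand-in for FIELD_GET(GENMASK(4, 0), pin_out): same as pin_out % 32. */
static uint32_t bank_bit(uint32_t pin_out)
{
	return pin_out & 0x1f;
}
```

With this encoding, version 3.2.0 comes out as 0x30200, which matches the literal that PDC_VERSION_3_2 used before the change.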
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 0225121f3013..ea5fd2374ebe 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -604,7 +604,7 @@
#include <asm/arch_gicv3.h>
-#ifndef __ASSEMBLY__
+#ifndef __ASSEMBLER__
/*
* We need a value to serve as a irq-type for LPIs. Choose one that will
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index d45fa19f9e47..849386dc5ec8 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -131,7 +131,7 @@
#define GICV_PMR_PRIORITY_SHIFT 3
#define GICV_PMR_PRIORITY_MASK (0x1f << GICV_PMR_PRIORITY_SHIFT)
-#ifndef __ASSEMBLY__
+#ifndef __ASSEMBLER__
#include <linux/irqdomain.h>
@@ -162,5 +162,5 @@ int gic_get_cpu_id(unsigned int cpu);
void gic_migrate_target(unsigned int new_cpu_id);
unsigned long gic_get_sgir_physaddr(void);
-#endif /* __ASSEMBLY */
+#endif /* __ASSEMBLER__ */
#endif
* [GIT pull] irq/msi for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
2026-06-13 21:24 ` [GIT pull] irq/core " Thomas Gleixner
2026-06-13 21:24 ` [GIT pull] irq/drivers " Thomas Gleixner
@ 2026-06-13 21:24 ` Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
2026-06-13 21:25 ` [GIT pull] smp/core " Thomas Gleixner
` (6 subsequent siblings)
9 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:24 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest irq/msi branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-msi-2026-06-13
up to: 3661d5f40376: genirq/msi: Fix typos in msi_domain_ops comment
A trivial update to the MSI interrupt subsystem, which fixes a couple of
typos.
Thanks,
tglx
------------------>
Miles Krause (1):
genirq/msi: Fix typos in msi_domain_ops comment
include/linux/msi.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/msi.h b/include/linux/msi.h
index fa41eed62868..a4613de11960 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -444,7 +444,7 @@ struct msi_domain_info;
*
* @domain_alloc_irqs, @domain_free_irqs can be used to override the
* default allocation/free functions (__msi_domain_alloc/free_irqs). This
- * is initially for a wrapper around XENs seperate MSI universe which can't
+ * is initially for a wrapper around XEN's separate MSI universe which can't
* be wrapped into the regular irq domains concepts by mere mortals. This
* allows to universally use msi_domain_alloc/free_irqs without having to
* special case XEN all over the place.
* [GIT pull] smp/core for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
` (2 preceding siblings ...)
2026-06-13 21:24 ` [GIT pull] irq/msi " Thomas Gleixner
@ 2026-06-13 21:25 ` Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
2026-06-13 21:25 ` [GIT pull] timers/clocksource " Thomas Gleixner
` (5 subsequent siblings)
9 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest smp/core branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-2026-06-13
up to: 9c91efd1d63e: cpu: Add lockdep_is_cpus_held()/lockdep_is_cpus_write_held() stubs for !CONFIG_HOTPLUG_CPU
Two small updates to the SMP/hotplug subsystem:
- Add cpuhplock.h to the maintained files
- Provide the missing stubs for lockdep_is_cpus_held() and
lockdep_is_cpus_write_held() so the usage sites can be simplified.
Thanks,
tglx
------------------>
Reinette Chatre (2):
MAINTAINERS: Add include/linux/cpuhplock.h to CPU HOTPLUG area
cpu: Add lockdep_is_cpus_held()/lockdep_is_cpus_write_held() stubs for !CONFIG_HOTPLUG_CPU
MAINTAINERS | 1 +
include/linux/cpuhplock.h | 7 ++++---
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 9ec290e38b44..f25e2d33a7d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6676,6 +6676,7 @@ P: Documentation/process/maintainer-tip.rst
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
F: include/linux/cpu.h
F: include/linux/cpuhotplug.h
+F: include/linux/cpuhplock.h
F: include/linux/smpboot.h
F: kernel/cpu.c
F: kernel/smpboot.*
diff --git a/include/linux/cpuhplock.h b/include/linux/cpuhplock.h
index 286b3ab92e15..42f6a095ba5b 100644
--- a/include/linux/cpuhplock.h
+++ b/include/linux/cpuhplock.h
@@ -12,9 +12,6 @@
struct device;
-extern int lockdep_is_cpus_held(void);
-extern int lockdep_is_cpus_write_held(void);
-
#ifdef CONFIG_HOTPLUG_CPU
void cpus_write_lock(void);
void cpus_write_unlock(void);
@@ -22,6 +19,8 @@ void cpus_read_lock(void);
void cpus_read_unlock(void);
int cpus_read_trylock(void);
void lockdep_assert_cpus_held(void);
+int lockdep_is_cpus_held(void);
+int lockdep_is_cpus_write_held(void);
void cpu_hotplug_disable_offlining(void);
void cpu_hotplug_disable(void);
void cpu_hotplug_enable(void);
@@ -38,6 +37,8 @@ static inline void cpus_read_lock(void) { }
static inline void cpus_read_unlock(void) { }
static inline int cpus_read_trylock(void) { return true; }
static inline void lockdep_assert_cpus_held(void) { }
+static inline int lockdep_is_cpus_held(void) { return 1; }
+static inline int lockdep_is_cpus_write_held(void) { return 1; }
static inline void cpu_hotplug_disable_offlining(void) { }
static inline void cpu_hotplug_disable(void) { }
static inline void cpu_hotplug_enable(void) { }
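
To illustrate why the stubs above can simplify usage sites, here is a small user-space model of the pattern. It is a sketch under assumptions: MODEL_HOTPLUG_CPU stands in for CONFIG_HOTPLUG_CPU, model_lock_held stands in for real lockdep state, and caller_precondition_ok() is a hypothetical call site, not one taken from the kernel.

```c
#include <stdbool.h>

#ifdef MODEL_HOTPLUG_CPU
/* Hotplug enabled: the answer depends on (modelled) lock state. */
static bool model_lock_held;

static int lockdep_is_cpus_held(void)
{
	return model_lock_held;
}
#else
/* Hotplug disabled: there is no lock to take, so always report "held". */
static inline int lockdep_is_cpus_held(void)
{
	return 1;
}
#endif

/* Hypothetical call site: the check needs no #ifdef of its own. */
static bool caller_precondition_ok(void)
{
	return lockdep_is_cpus_held() != 0;
}
```

Returning 1 from the stub keeps assertions of the form "the lock must be held" trivially true when hotplug is compiled out, so callers can use one code path for both configurations.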
* [GIT pull] timers/clocksource for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
` (3 preceding siblings ...)
2026-06-13 21:25 ` [GIT pull] smp/core " Thomas Gleixner
@ 2026-06-13 21:25 ` Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
2026-06-13 21:25 ` [GIT pull] timers/core " Thomas Gleixner
` (4 subsequent siblings)
9 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest timers/clocksource branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-clocksource-2026-06-13
up to: c66494c79ede: Merge tag 'timers-v7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/daniel.lezcano/linux into timers/clocksource
Updates for clocksource/clockevent drivers:
- Add devm helpers for clocksources, which simplify driver teardown and
probe failure handling.
- More module conversion work
- Update the support for the ARM EL2 virtual timer including the required
ACPI changes.
- Add clockevent and clocksource support for the TI Dual Mode Timer
- Fix the support for multiple watchdog instances in the TEGRA186 driver
- Add D1 timer support to the SUN5I driver
- The usual devicetree updates, cleanups and small fixes all over the place
Thanks,
tglx
------------------>
Chen Ni (1):
clocksource/drivers/sun5i: Handle error returns from devm_reset_control_get_optional_exclusive()
Cosmin Tanislav (2):
dt-bindings: timer: renesas,rz-mtu3: Remove TCIU8 interrupt
dt-bindings: timer: renesas,rz-mtu3: document RZ/{T2H,N2H}
Daniel Lezcano (3):
clocksource/drivers/mmio: Make the code compatible with modules
clocksource/drivers/timer-of: Make the code compatible with modules
clocksource: Add devm_clocksource_register_*() helpers
Enric Balletbo i Serra (1):
clocksource: move NXP timer selection to drivers/clocksource
Frank Li (1):
dt-bindings: timer: fsl,imxgpt: add compatible string fsl,imx25-epit
Kartik Rajput (4):
clocksource/drivers/timer-tegra186: Fix support for multiple watchdog instances
clocksource/drivers/timer-tegra186: Correct num_wdts for Tegra186 and Tegra234
clocksource/drivers/timer-tegra186: Register all accessible watchdog timers
clocksource/drivers/timer-tegra186: Reserve and service a kernel watchdog
Krzysztof Kozlowski (1):
clocksource/drivers/timer-rtl-otto: Make rttm_cs variable static
Ley Foon Tan (1):
dt-bindings: timer: Add StarFive JHB100 clint
Marc Zyngier (4):
ACPI: GTDT: Account for GTDTv3 size when walking the platform timer descriptors
ACPI: GTDT: Parse information related to the EL2 virtual timer
clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
dt-bindings: timer: arm,arch_timer: Fix requirements for interrupt description
Markus Schneider-Pargmann (TI) (3):
clocksource/drivers/timer-ti-dm: Fix property name in comment
clocksource/drivers/timer-ti-dm: Add clocksource support
clocksource/drivers/timer-ti-dm: Add clockevent support
Michal Piekos (2):
dt-bindings: timer: allwinner,sun5i-a13-hstimer: add H616 and D1
clocksource/drivers/sun5i: Add D1 hstimer support
Nick Hu (1):
dt-bindings: timer: Remove sifive,fine-ctr-bits property
.../timer/allwinner,sun5i-a13-hstimer.yaml | 9 +-
.../devicetree/bindings/timer/arm,arch_timer.yaml | 21 +-
.../devicetree/bindings/timer/fsl,imxgpt.yaml | 1 +
.../devicetree/bindings/timer/renesas,rz-mtu3.yaml | 26 ++-
.../devicetree/bindings/timer/sifive,clint.yaml | 17 +-
arch/arm/mach-imx/Kconfig | 21 --
drivers/acpi/arm64/gtdt.c | 42 +++-
drivers/clocksource/Kconfig | 31 +++
drivers/clocksource/arm_arch_timer.c | 55 +++---
drivers/clocksource/mmio.c | 11 +-
drivers/clocksource/timer-of.c | 24 +--
drivers/clocksource/timer-of.h | 5 +-
drivers/clocksource/timer-rtl-otto.c | 2 +-
drivers/clocksource/timer-sun5i.c | 87 +++++++--
drivers/clocksource/timer-tegra186.c | 122 ++++++++++--
drivers/clocksource/timer-ti-dm-systimer.c | 2 +-
drivers/clocksource/timer-ti-dm.c | 217 +++++++++++++++++++++
include/linux/clocksource.h | 15 ++
kernel/time/clocksource.c | 20 ++
19 files changed, 591 insertions(+), 137 deletions(-)
diff --git a/Documentation/devicetree/bindings/timer/allwinner,sun5i-a13-hstimer.yaml b/Documentation/devicetree/bindings/timer/allwinner,sun5i-a13-hstimer.yaml
index f1853daec2f9..3e2725c56995 100644
--- a/Documentation/devicetree/bindings/timer/allwinner,sun5i-a13-hstimer.yaml
+++ b/Documentation/devicetree/bindings/timer/allwinner,sun5i-a13-hstimer.yaml
@@ -15,9 +15,13 @@ properties:
oneOf:
- const: allwinner,sun5i-a13-hstimer
- const: allwinner,sun7i-a20-hstimer
+ - const: allwinner,sun20i-d1-hstimer
- items:
- const: allwinner,sun6i-a31-hstimer
- const: allwinner,sun7i-a20-hstimer
+ - items:
+ - const: allwinner,sun50i-h616-hstimer
+ - const: allwinner,sun20i-d1-hstimer
reg:
maxItems: 1
@@ -45,7 +49,10 @@ required:
if:
properties:
compatible:
- const: allwinner,sun5i-a13-hstimer
+ anyOf:
+ - const: allwinner,sun5i-a13-hstimer
+ - contains:
+ const: allwinner,sun20i-d1-hstimer
then:
properties:
diff --git a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
index c5fc3b6c8bd0..c65e48a155ab 100644
--- a/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
+++ b/Documentation/devicetree/bindings/timer/arm,arch_timer.yaml
@@ -10,13 +10,8 @@ maintainers:
- Marc Zyngier <marc.zyngier@arm.com>
- Mark Rutland <mark.rutland@arm.com>
description: |+
- ARM cores may have a per-core architected timer, which provides per-cpu timers,
- or a memory mapped architected timer, which provides up to 8 frames with a
- physical and optional virtual timer per frame.
-
- The per-core architected timer is attached to a GIC to deliver its
- per-processor interrupts via PPIs. The memory mapped timer is attached to a GIC
- to deliver its interrupts via SPIs.
+ The per-core architected timer is expected to deliver per-CPU interrupts
+ (commonly to a GIC to deliver its per-processor interrupts as PPIs).
properties:
compatible:
@@ -33,13 +28,13 @@ properties:
- const: arm,armv7-timer
interrupts:
- minItems: 1
+ minItems: 2
items:
- - description: secure timer irq
- - description: non-secure timer irq
- - description: virtual timer irq
- - description: hypervisor timer irq
- - description: hypervisor virtual timer irq
+ - description: EL1 secure physical timer irq, if EL3 is implemented
+ - description: EL1 non-secure physical timer irq
+ - description: EL1 virtual timer irq
+ - description: EL2 physical timer irq, if EL2 is implemented
+ - description: EL2 virtual timer irq, if FEAT_VHE is implemented
interrupt-names:
oneOf:
diff --git a/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml b/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml
index 9898dc7ea97b..6d41fb120379 100644
--- a/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml
+++ b/Documentation/devicetree/bindings/timer/fsl,imxgpt.yaml
@@ -14,6 +14,7 @@ properties:
oneOf:
- const: fsl,imx1-gpt
- const: fsl,imx21-gpt
+ - const: fsl,imx25-epit
- items:
- const: fsl,imx27-gpt
- const: fsl,imx21-gpt
diff --git a/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml b/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml
index 3ad10c5b66ba..ecff2912d812 100644
--- a/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml
+++ b/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml
@@ -112,6 +112,8 @@ properties:
- renesas,r9a07g043-mtu3 # RZ/{G2UL,Five}
- renesas,r9a07g044-mtu3 # RZ/G2{L,LC}
- renesas,r9a07g054-mtu3 # RZ/V2L
+ - renesas,r9a09g077-mtu3 # RZ/T2H
+ - renesas,r9a09g087-mtu3 # RZ/N2H
- const: renesas,rz-mtu3
reg:
@@ -162,7 +164,6 @@ properties:
- description: MTU8.TGRC input capture/compare match
- description: MTU8.TGRD input capture/compare match
- description: MTU8.TCNT overflow
- - description: MTU8.TCNT underflow
interrupt-names:
items:
@@ -209,7 +210,6 @@ properties:
- const: tgic8
- const: tgid8
- const: tciv8
- - const: tciu8
clocks:
maxItems: 1
@@ -233,7 +233,22 @@ required:
- interrupt-names
- clocks
- power-domains
- - resets
+
+allOf:
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - renesas,r9a07g043-mtu3
+ - renesas,r9a07g044-mtu3
+ - renesas,r9a07g054-mtu3
+ then:
+ required:
+ - resets
+ else:
+ properties:
+ resets: false
additionalProperties: false
@@ -287,8 +302,7 @@ examples:
<GIC_SPI 209 IRQ_TYPE_EDGE_RISING>,
<GIC_SPI 210 IRQ_TYPE_EDGE_RISING>,
<GIC_SPI 211 IRQ_TYPE_EDGE_RISING>,
- <GIC_SPI 212 IRQ_TYPE_EDGE_RISING>,
- <GIC_SPI 213 IRQ_TYPE_EDGE_RISING>;
+ <GIC_SPI 212 IRQ_TYPE_EDGE_RISING>;
interrupt-names = "tgia0", "tgib0", "tgic0", "tgid0", "tciv0", "tgie0",
"tgif0",
"tgia1", "tgib1", "tciv1", "tciu1",
@@ -298,7 +312,7 @@ examples:
"tgiu5", "tgiv5", "tgiw5",
"tgia6", "tgib6", "tgic6", "tgid6", "tciv6",
"tgia7", "tgib7", "tgic7", "tgid7", "tciv7",
- "tgia8", "tgib8", "tgic8", "tgid8", "tciv8", "tciu8";
+ "tgia8", "tgib8", "tgic8", "tgid8", "tciv8";
clocks = <&cpg CPG_MOD R9A07G044_MTU_X_MCK_MTU3>;
power-domains = <&cpg>;
resets = <&cpg R9A07G044_MTU_X_PRESET_MTU3>;
diff --git a/Documentation/devicetree/bindings/timer/sifive,clint.yaml b/Documentation/devicetree/bindings/timer/sifive,clint.yaml
index 3c16b260db04..67cea8edb59f 100644
--- a/Documentation/devicetree/bindings/timer/sifive,clint.yaml
+++ b/Documentation/devicetree/bindings/timer/sifive,clint.yaml
@@ -38,6 +38,7 @@ properties:
- starfive,jh7100-clint # StarFive JH7100
- starfive,jh7110-clint # StarFive JH7110
- starfive,jh8100-clint # StarFive JH8100
+ - starfive,jhb100-clint # StarFive JHB100
- tenstorrent,blackhole-clint # Tenstorrent Blackhole
- const: sifive,clint0 # SiFive CLINT v0 IP block
- items:
@@ -72,22 +73,6 @@ properties:
minItems: 1
maxItems: 4095
- sifive,fine-ctr-bits:
- maximum: 15
- description: The width in bits of the fine counter.
-
-if:
- properties:
- compatible:
- contains:
- const: sifive,clint2
-then:
- required:
- - sifive,fine-ctr-bits
-else:
- properties:
- sifive,fine-ctr-bits: false
-
additionalProperties: false
required:
diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig
index 6ea1bd55acf8..a361840d7a04 100644
--- a/arch/arm/mach-imx/Kconfig
+++ b/arch/arm/mach-imx/Kconfig
@@ -227,27 +227,6 @@ config SOC_VF610
help
This enables support for Freescale Vybrid VF610 processor.
-choice
- prompt "Clocksource for scheduler clock"
- depends on SOC_VF610
- default VF_USE_ARM_GLOBAL_TIMER
-
- config VF_USE_ARM_GLOBAL_TIMER
- bool "Use ARM Global Timer"
- depends on ARCH_MULTI_V7
- select ARM_GLOBAL_TIMER
- select CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK
- help
- Use the ARM Global Timer as clocksource
-
- config VF_USE_PIT_TIMER
- bool "Use PIT timer"
- select NXP_PIT_TIMER
- help
- Use SoC Periodic Interrupt Timer (PIT) as clocksource
-
-endchoice
-
endif
endif
diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c
index ffc867bac2d6..00158c8aa6d9 100644
--- a/drivers/acpi/arm64/gtdt.c
+++ b/drivers/acpi/arm64/gtdt.c
@@ -34,14 +34,33 @@ struct acpi_gtdt_descriptor {
void *platform_timer;
};
+struct gtdt_v3 {
+ struct acpi_table_gtdt gtdt_v2;
+ struct acpi_gtdt_el2 el2_vtimer;
+};
+
static struct acpi_gtdt_descriptor acpi_gtdt_desc __initdata;
+static __init struct acpi_gtdt_el2 *gtdt_to_el2_vtimer(struct acpi_table_gtdt *gtdt)
+{
+ if (gtdt->header.revision < 3)
+ return NULL;
+
+ return &container_of(gtdt, struct gtdt_v3, gtdt_v2)->el2_vtimer;
+}
+
static __init bool platform_timer_valid(void *platform_timer)
{
struct acpi_gtdt_header *gh = platform_timer;
+ void *platform_timer_begin;
+
+ if (acpi_gtdt_desc.gtdt->header.revision >= 3)
+ platform_timer_begin = container_of(acpi_gtdt_desc.gtdt, struct gtdt_v3, gtdt_v2) + 1;
+ else
+ platform_timer_begin = acpi_gtdt_desc.gtdt + 1;
- return (platform_timer >= (void *)(acpi_gtdt_desc.gtdt + 1) &&
- platform_timer < acpi_gtdt_desc.gtdt_end &&
+ return (platform_timer >= platform_timer_begin &&
+ platform_timer + sizeof(*gh) <= acpi_gtdt_desc.gtdt_end &&
gh->length != 0 &&
platform_timer + gh->length <= acpi_gtdt_desc.gtdt_end);
}
@@ -101,6 +120,7 @@ static int __init map_gt_gsi(u32 interrupt, u32 flags)
int __init acpi_gtdt_map_ppi(int type)
{
struct acpi_table_gtdt *gtdt = acpi_gtdt_desc.gtdt;
+ struct acpi_gtdt_el2 *el2_vtimer = gtdt_to_el2_vtimer(gtdt);
switch (type) {
case ARCH_TIMER_PHYS_NONSECURE_PPI:
@@ -113,6 +133,12 @@ int __init acpi_gtdt_map_ppi(int type)
case ARCH_TIMER_HYP_PPI:
return map_gt_gsi(gtdt->non_secure_el2_interrupt,
gtdt->non_secure_el2_flags);
+ case ARCH_TIMER_HYP_VIRT_PPI:
+ if (el2_vtimer && el2_vtimer->virtual_el2_timer_gsiv)
+ return map_gt_gsi(el2_vtimer->virtual_el2_timer_gsiv,
+ el2_vtimer->virtual_el2_timer_flags);
+
+ return 0;
default:
pr_err("Failed to map timer interrupt: invalid type.\n");
}
@@ -130,6 +156,7 @@ int __init acpi_gtdt_map_ppi(int type)
bool __init acpi_gtdt_c3stop(int type)
{
struct acpi_table_gtdt *gtdt = acpi_gtdt_desc.gtdt;
+ struct acpi_gtdt_el2 *el2_vtimer = gtdt_to_el2_vtimer(gtdt);
switch (type) {
case ARCH_TIMER_PHYS_NONSECURE_PPI:
@@ -141,6 +168,10 @@ bool __init acpi_gtdt_c3stop(int type)
case ARCH_TIMER_HYP_PPI:
return !(gtdt->non_secure_el2_flags & ACPI_GTDT_ALWAYS_ON);
+ case ARCH_TIMER_HYP_VIRT_PPI:
+ return el2_vtimer && el2_vtimer->virtual_el2_timer_gsiv &&
+ !(el2_vtimer->virtual_el2_timer_flags & ACPI_GTDT_ALWAYS_ON);
+
default:
pr_err("Failed to get c3stop info: invalid type.\n");
}
@@ -166,6 +197,13 @@ int __init acpi_gtdt_init(struct acpi_table_header *table,
u32 cnt = 0;
gtdt = container_of(table, struct acpi_table_gtdt, header);
+
+ if ((gtdt->header.revision >= 3 && gtdt->header.length < sizeof(struct gtdt_v3)) ||
+ (gtdt->header.revision == 2 && gtdt->header.length < sizeof(*gtdt))) {
+ pr_err(FW_BUG "GTDT with invalid size %d\n", gtdt->header.length);
+ return -EINVAL;
+ }
+
acpi_gtdt_desc.gtdt = gtdt;
acpi_gtdt_desc.gtdt_end = (void *)table + table->length;
acpi_gtdt_desc.platform_timer = NULL;
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index d1a33a231a44..d9c76dd443f8 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -793,4 +793,35 @@ config RTK_SYSTIMER
this option only when building for a Realtek platform or for compilation
testing.
+choice
+ prompt "NXP clocksource for scheduler clock"
+ depends on SOC_VF610 || ARCH_S32
+ # Default to Global Timer for Vybrid (32-bit)
+ default VF_USE_ARM_GLOBAL_TIMER if SOC_VF610
+ # Default to None for S32 (64-bit)
+ default VF_TIMER_NONE if ARCH_S32
+
+ config VF_USE_ARM_GLOBAL_TIMER
+ bool "Use NXP Vybrid Global Timer"
+ depends on ARCH_MULTI_V7 && SOC_VF610
+ select ARM_GLOBAL_TIMER
+ select CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK
+ help
+ Use the NXP Vybrid Global Timer as clocksource.
+
+ config VF_USE_PIT_TIMER
+ bool "Use NXP PIT timer"
+ select NXP_PIT_TIMER
+ help
+ Use NXP Periodic Interrupt Timer (PIT) as clocksource.
+
+ config VF_TIMER_NONE
+ bool "None (Use standard Arch Timer)"
+ depends on ARCH_S32
+ help
+ Do not use any specific NXP timer driver. Use the standard
+ ARM Architected Timer instead.
+
+endchoice
+
endmenu
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 90aeff44a276..4adf756423de 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -688,6 +688,7 @@ static void __arch_timer_setup(struct clock_event_device *clk)
clk->irq = arch_timer_ppi[arch_timer_uses_ppi];
switch (arch_timer_uses_ppi) {
case ARCH_TIMER_VIRT_PPI:
+ case ARCH_TIMER_HYP_VIRT_PPI:
clk->set_state_shutdown = arch_timer_shutdown_virt;
clk->set_state_oneshot_stopped = arch_timer_shutdown_virt;
sne = erratum_handler(set_next_event_virt);
@@ -879,7 +880,7 @@ static void __init arch_timer_banner(void)
pr_info("cp15 timer running at %lu.%02luMHz (%s).\n",
(unsigned long)arch_timer_rate / 1000000,
(unsigned long)(arch_timer_rate / 10000) % 100,
- (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) ? "virt" : "phys");
+ arch_timer_ppi_names[arch_timer_uses_ppi]);
}
u32 arch_timer_get_rate(void)
@@ -912,7 +913,8 @@ static void __init arch_counter_register(void)
int width;
if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
- arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
+ arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI ||
+ arch_timer_uses_ppi == ARCH_TIMER_HYP_VIRT_PPI) {
if (arch_timer_counter_has_wa()) {
rd = arch_counter_get_cntvct_stable;
scr = raw_counter_get_cntvct_stable;
@@ -1023,6 +1025,7 @@ static int __init arch_timer_register(void)
ppi = arch_timer_ppi[arch_timer_uses_ppi];
switch (arch_timer_uses_ppi) {
case ARCH_TIMER_VIRT_PPI:
+ case ARCH_TIMER_HYP_VIRT_PPI:
err = request_percpu_irq(ppi, arch_timer_handler_virt,
"arch_timer", arch_timer_evt);
break;
@@ -1090,25 +1093,34 @@ static int __init arch_timer_common_init(void)
/**
* arch_timer_select_ppi() - Select suitable PPI for the current system.
*
- * If HYP mode is available, we know that the physical timer
- * has been configured to be accessible from PL1. Use it, so
- * that a guest can use the virtual timer instead.
+ * On AArch32, if HYP mode is available, we know that the physical
+ * timer has been configured to be accessible from PL1. Use it, so
+ * that a guest can use the virtual timer instead (though KVM host
+ * support has long been removed).
*
- * On ARMv8.1 with VH extensions, the kernel runs in HYP. VHE
- * accesses to CNTP_*_EL1 registers are silently redirected to
- * their CNTHP_*_EL2 counterparts, and use a different PPI
- * number.
+ * On ARMv8.1 with FEAT_VHE, the kernel runs in EL2. Accesses to
+ * CNTV_*_EL1 registers are silently redirected to their CNTHV_*_EL2
+ * counterparts, and the timer uses a different PPI number. Similar
+ * things happen when using the EL2 physical timer. Note that a bunch
+ * of DTs out there omit the virtual EL2 timer, so fall back gracefully
+ * to the physical timer.
+ *
+ * Without VHE, if no interrupt is provided for the virtual timer, we'll
+ * have to stick to the physical timer. It'd better be accessible...
*
- * If no interrupt provided for virtual timer, we'll have to
- * stick to the physical timer. It'd better be accessible...
* For arm64 we never use the secure interrupt.
*
* Return: a suitable PPI type for the current system.
*/
static enum arch_timer_ppi_nr __init arch_timer_select_ppi(void)
{
- if (is_kernel_in_hyp_mode())
+ if (is_kernel_in_hyp_mode()) {
+ if (arch_timer_ppi[ARCH_TIMER_HYP_VIRT_PPI])
+ return ARCH_TIMER_HYP_VIRT_PPI;
+
+ pr_warn_once(FW_BUG "VHE-capable CPU without EL2 virtual timer interrupt\n");
return ARCH_TIMER_HYP_PPI;
+ }
if (!is_hyp_mode_available() && arch_timer_ppi[ARCH_TIMER_VIRT_PPI])
return ARCH_TIMER_VIRT_PPI;
@@ -1200,14 +1212,9 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table)
if (ret)
return ret;
- arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI] =
- acpi_gtdt_map_ppi(ARCH_TIMER_PHYS_NONSECURE_PPI);
-
- arch_timer_ppi[ARCH_TIMER_VIRT_PPI] =
- acpi_gtdt_map_ppi(ARCH_TIMER_VIRT_PPI);
-
- arch_timer_ppi[ARCH_TIMER_HYP_PPI] =
- acpi_gtdt_map_ppi(ARCH_TIMER_HYP_PPI);
+ /* The GTDT parser can't be bothered with the secure timer */
+ for (int i = ARCH_TIMER_PHYS_NONSECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++)
+ arch_timer_ppi[i] = acpi_gtdt_map_ppi(i);
arch_timer_populate_kvm_info();
@@ -1253,10 +1260,14 @@ int kvm_arch_ptp_get_crosststamp(u64 *cycle, struct timespec64 *ts,
if (!IS_ENABLED(CONFIG_HAVE_ARM_SMCCC_DISCOVERY))
return -EOPNOTSUPP;
- if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
+ switch (arch_timer_uses_ppi) {
+ case ARCH_TIMER_VIRT_PPI:
+ case ARCH_TIMER_HYP_VIRT_PPI:
ptp_counter = KVM_PTP_VIRT_COUNTER;
- else
+ break;
+ default:
ptp_counter = KVM_PTP_PHYS_COUNTER;
+ }
arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
ptp_counter, &hvc_res);
diff --git a/drivers/clocksource/mmio.c b/drivers/clocksource/mmio.c
index cd5fbf49ac29..0fee8edb837a 100644
--- a/drivers/clocksource/mmio.c
+++ b/drivers/clocksource/mmio.c
@@ -21,21 +21,25 @@ u64 clocksource_mmio_readl_up(struct clocksource *c)
{
return (u64)readl_relaxed(to_mmio_clksrc(c)->reg);
}
+EXPORT_SYMBOL_GPL(clocksource_mmio_readl_up);
u64 clocksource_mmio_readl_down(struct clocksource *c)
{
return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
}
+EXPORT_SYMBOL_GPL(clocksource_mmio_readl_down);
u64 clocksource_mmio_readw_up(struct clocksource *c)
{
return (u64)readw_relaxed(to_mmio_clksrc(c)->reg);
}
+EXPORT_SYMBOL_GPL(clocksource_mmio_readw_up);
u64 clocksource_mmio_readw_down(struct clocksource *c)
{
return ~(u64)readw_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
}
+EXPORT_SYMBOL_GPL(clocksource_mmio_readw_down);
/**
* clocksource_mmio_init - Initialize a simple mmio based clocksource
@@ -46,9 +50,9 @@ u64 clocksource_mmio_readw_down(struct clocksource *c)
* @bits: Number of valid bits
* @read: One of clocksource_mmio_read*() above
*/
-int __init clocksource_mmio_init(void __iomem *base, const char *name,
- unsigned long hz, int rating, unsigned bits,
- u64 (*read)(struct clocksource *))
+int clocksource_mmio_init(void __iomem *base, const char *name,
+ unsigned long hz, int rating, unsigned bits,
+ u64 (*read)(struct clocksource *))
{
struct clocksource_mmio *cs;
@@ -68,3 +72,4 @@ int __init clocksource_mmio_init(void __iomem *base, const char *name,
return clocksource_register_hz(&cs->clksrc, hz);
}
+EXPORT_SYMBOL_GPL(clocksource_mmio_init);
diff --git a/drivers/clocksource/timer-of.c b/drivers/clocksource/timer-of.c
index 420202bf76e4..ba63433211b0 100644
--- a/drivers/clocksource/timer-of.c
+++ b/drivers/clocksource/timer-of.c
@@ -19,7 +19,7 @@
*
* Free the irq resource
*/
-static __init void timer_of_irq_exit(struct of_timer_irq *of_irq)
+static void timer_of_irq_exit(struct of_timer_irq *of_irq)
{
struct timer_of *to = container_of(of_irq, struct timer_of, of_irq);
@@ -41,8 +41,8 @@ static __init void timer_of_irq_exit(struct of_timer_irq *of_irq)
*
* Returns 0 on success, < 0 otherwise
*/
-static __init int timer_of_irq_init(struct device_node *np,
- struct of_timer_irq *of_irq)
+static int timer_of_irq_init(struct device_node *np,
+ struct of_timer_irq *of_irq)
{
int ret;
struct timer_of *to = container_of(of_irq, struct timer_of, of_irq);
@@ -82,7 +82,7 @@ static __init int timer_of_irq_init(struct device_node *np,
*
* Disables and releases the refcount on the clk
*/
-static __init void timer_of_clk_exit(struct of_timer_clk *of_clk)
+static void timer_of_clk_exit(struct of_timer_clk *of_clk)
{
of_clk->rate = 0;
clk_disable_unprepare(of_clk->clk);
@@ -98,8 +98,8 @@ static __init void timer_of_clk_exit(struct of_timer_clk *of_clk)
*
* Returns 0 on success, < 0 otherwise
*/
-static __init int timer_of_clk_init(struct device_node *np,
- struct of_timer_clk *of_clk)
+static int timer_of_clk_init(struct device_node *np,
+ struct of_timer_clk *of_clk)
{
int ret;
@@ -137,13 +137,13 @@ static __init int timer_of_clk_init(struct device_node *np,
goto out;
}
-static __init void timer_of_base_exit(struct of_timer_base *of_base)
+static void timer_of_base_exit(struct of_timer_base *of_base)
{
iounmap(of_base->base);
}
-static __init int timer_of_base_init(struct device_node *np,
- struct of_timer_base *of_base)
+static int timer_of_base_init(struct device_node *np,
+ struct of_timer_base *of_base)
{
of_base->base = of_base->name ?
of_io_request_and_map(np, of_base->index, of_base->name) :
@@ -156,7 +156,7 @@ static __init int timer_of_base_init(struct device_node *np,
return 0;
}
-int __init timer_of_init(struct device_node *np, struct timer_of *to)
+int timer_of_init(struct device_node *np, struct timer_of *to)
{
int ret = -EINVAL;
int flags = 0;
@@ -200,6 +200,7 @@ int __init timer_of_init(struct device_node *np, struct timer_of *to)
timer_of_base_exit(&to->of_base);
return ret;
}
+EXPORT_SYMBOL_GPL(timer_of_init);
/**
* timer_of_cleanup - release timer_of resources
@@ -208,7 +209,7 @@ int __init timer_of_init(struct device_node *np, struct timer_of *to)
* Release the resources that has been used in timer_of_init().
* This function should be called in init error cases
*/
-void __init timer_of_cleanup(struct timer_of *to)
+void timer_of_cleanup(struct timer_of *to)
{
if (to->flags & TIMER_OF_IRQ)
timer_of_irq_exit(&to->of_irq);
@@ -219,3 +220,4 @@ void __init timer_of_cleanup(struct timer_of *to)
if (to->flags & TIMER_OF_BASE)
timer_of_base_exit(&to->of_base);
}
+EXPORT_SYMBOL_GPL(timer_of_cleanup);
diff --git a/drivers/clocksource/timer-of.h b/drivers/clocksource/timer-of.h
index 01a2c6b7db06..74a632b85b47 100644
--- a/drivers/clocksource/timer-of.h
+++ b/drivers/clocksource/timer-of.h
@@ -65,9 +65,8 @@ static inline unsigned long timer_of_period(struct timer_of *to)
return to->of_clk.period;
}
-extern int __init timer_of_init(struct device_node *np,
- struct timer_of *to);
+int timer_of_init(struct device_node *np, struct timer_of *to);
-extern void __init timer_of_cleanup(struct timer_of *to);
+void timer_of_cleanup(struct timer_of *to);
#endif
diff --git a/drivers/clocksource/timer-rtl-otto.c b/drivers/clocksource/timer-rtl-otto.c
index 6113d2fdd4de..dd236a7babee 100644
--- a/drivers/clocksource/timer-rtl-otto.c
+++ b/drivers/clocksource/timer-rtl-otto.c
@@ -225,7 +225,7 @@ static int rttm_enable_clocksource(struct clocksource *cs)
return 0;
}
-struct rttm_cs rttm_cs = {
+static struct rttm_cs rttm_cs = {
.to = {
.flags = TIMER_OF_BASE | TIMER_OF_CLOCK,
},
diff --git a/drivers/clocksource/timer-sun5i.c b/drivers/clocksource/timer-sun5i.c
index f827d3f98f60..6ab300d22621 100644
--- a/drivers/clocksource/timer-sun5i.c
+++ b/drivers/clocksource/timer-sun5i.c
@@ -18,21 +18,30 @@
#include <linux/slab.h>
#include <linux/platform_device.h>
-#define TIMER_IRQ_EN_REG 0x00
+#define TIMER_IRQ_EN_REG 0x00
#define TIMER_IRQ_EN(val) BIT(val)
-#define TIMER_IRQ_ST_REG 0x04
-#define TIMER_CTL_REG(val) (0x20 * (val) + 0x10)
+#define TIMER_IRQ_ST_REG 0x04
+#define TIMER_CTL_REG(val, offset) (0x20 * (val) + 0x10 + (offset))
#define TIMER_CTL_ENABLE BIT(0)
#define TIMER_CTL_RELOAD BIT(1)
#define TIMER_CTL_CLK_PRES(val) (((val) & 0x7) << 4)
#define TIMER_CTL_ONESHOT BIT(7)
-#define TIMER_INTVAL_LO_REG(val) (0x20 * (val) + 0x14)
-#define TIMER_INTVAL_HI_REG(val) (0x20 * (val) + 0x18)
-#define TIMER_CNTVAL_LO_REG(val) (0x20 * (val) + 0x1c)
-#define TIMER_CNTVAL_HI_REG(val) (0x20 * (val) + 0x20)
+#define TIMER_INTVAL_LO_REG(val, offset) (0x20 * (val) + 0x14 + (offset))
+#define TIMER_INTVAL_HI_REG(val, offset) (0x20 * (val) + 0x18 + (offset))
+#define TIMER_CNTVAL_LO_REG(val, offset) (0x20 * (val) + 0x1c + (offset))
+#define TIMER_CNTVAL_HI_REG(val, offset) (0x20 * (val) + 0x20 + (offset))
#define TIMER_SYNC_TICKS 3
+/**
+ * struct sunxi_timer_quirks - Differences between SoC variants.
+ *
+ * @from_ctl_base_offset: offset applied from ctl register onwards
+ */
+struct sunxi_timer_quirks {
+ u32 from_ctl_base_offset;
+};
+
struct sun5i_timer {
void __iomem *base;
struct clk *clk;
@@ -40,6 +49,7 @@ struct sun5i_timer {
u32 ticks_per_jiffy;
struct clocksource clksrc;
struct clock_event_device clkevt;
+ const struct sunxi_timer_quirks *quirks;
};
#define nb_to_sun5i_timer(x) \
@@ -57,28 +67,36 @@ struct sun5i_timer {
*/
static void sun5i_clkevt_sync(struct sun5i_timer *ce)
{
- u32 old = readl(ce->base + TIMER_CNTVAL_LO_REG(1));
+ u32 offset = ce->quirks->from_ctl_base_offset;
+ u32 old = readl(ce->base + TIMER_CNTVAL_LO_REG(1, offset));
- while ((old - readl(ce->base + TIMER_CNTVAL_LO_REG(1))) < TIMER_SYNC_TICKS)
+ while ((old - readl(ce->base + TIMER_CNTVAL_LO_REG(1, offset))) <
+ TIMER_SYNC_TICKS)
cpu_relax();
}
static void sun5i_clkevt_time_stop(struct sun5i_timer *ce, u8 timer)
{
- u32 val = readl(ce->base + TIMER_CTL_REG(timer));
- writel(val & ~TIMER_CTL_ENABLE, ce->base + TIMER_CTL_REG(timer));
+ u32 offset = ce->quirks->from_ctl_base_offset;
+ u32 val = readl(ce->base + TIMER_CTL_REG(timer, offset));
+
+ writel(val & ~TIMER_CTL_ENABLE,
+ ce->base + TIMER_CTL_REG(timer, offset));
sun5i_clkevt_sync(ce);
}
static void sun5i_clkevt_time_setup(struct sun5i_timer *ce, u8 timer, u32 delay)
{
- writel(delay, ce->base + TIMER_INTVAL_LO_REG(timer));
+ u32 offset = ce->quirks->from_ctl_base_offset;
+
+ writel(delay, ce->base + TIMER_INTVAL_LO_REG(timer, offset));
}
static void sun5i_clkevt_time_start(struct sun5i_timer *ce, u8 timer, bool periodic)
{
- u32 val = readl(ce->base + TIMER_CTL_REG(timer));
+ u32 offset = ce->quirks->from_ctl_base_offset;
+ u32 val = readl(ce->base + TIMER_CTL_REG(timer, offset));
if (periodic)
val &= ~TIMER_CTL_ONESHOT;
@@ -86,7 +104,7 @@ static void sun5i_clkevt_time_start(struct sun5i_timer *ce, u8 timer, bool perio
val |= TIMER_CTL_ONESHOT;
writel(val | TIMER_CTL_ENABLE | TIMER_CTL_RELOAD,
- ce->base + TIMER_CTL_REG(timer));
+ ce->base + TIMER_CTL_REG(timer, offset));
}
static int sun5i_clkevt_shutdown(struct clock_event_device *clkevt)
@@ -141,8 +159,9 @@ static irqreturn_t sun5i_timer_interrupt(int irq, void *dev_id)
static u64 sun5i_clksrc_read(struct clocksource *clksrc)
{
struct sun5i_timer *cs = clksrc_to_sun5i_timer(clksrc);
+ u32 offset = cs->quirks->from_ctl_base_offset;
- return ~readl(cs->base + TIMER_CNTVAL_LO_REG(1));
+ return ~readl(cs->base + TIMER_CNTVAL_LO_REG(1, offset));
}
static int sun5i_rate_cb(struct notifier_block *nb,
@@ -173,12 +192,13 @@ static int sun5i_setup_clocksource(struct platform_device *pdev,
unsigned long rate)
{
struct sun5i_timer *cs = platform_get_drvdata(pdev);
+ u32 offset = cs->quirks->from_ctl_base_offset;
void __iomem *base = cs->base;
int ret;
- writel(~0, base + TIMER_INTVAL_LO_REG(1));
+ writel(~0, base + TIMER_INTVAL_LO_REG(1, offset));
writel(TIMER_CTL_ENABLE | TIMER_CTL_RELOAD,
- base + TIMER_CTL_REG(1));
+ base + TIMER_CTL_REG(1, offset));
cs->clksrc.name = pdev->dev.of_node->name;
cs->clksrc.rating = 340;
@@ -237,6 +257,7 @@ static int sun5i_setup_clockevent(struct platform_device *pdev,
static int sun5i_timer_probe(struct platform_device *pdev)
{
+ const struct sunxi_timer_quirks *quirks;
struct device *dev = &pdev->dev;
struct sun5i_timer *st;
struct reset_control *rstc;
@@ -273,11 +294,18 @@ static int sun5i_timer_probe(struct platform_device *pdev)
return -EINVAL;
}
+ quirks = of_device_get_match_data(&pdev->dev);
+ if (!quirks) {
+ dev_err(&pdev->dev, "Failed to determine the quirks to use\n");
+ return -ENODEV;
+ }
+
st->base = timer_base;
st->ticks_per_jiffy = DIV_ROUND_UP(rate, HZ);
st->clk = clk;
st->clk_rate_cb.notifier_call = sun5i_rate_cb;
st->clk_rate_cb.next = NULL;
+ st->quirks = quirks;
ret = devm_clk_notifier_register(dev, clk, &st->clk_rate_cb);
if (ret) {
@@ -286,6 +314,9 @@ static int sun5i_timer_probe(struct platform_device *pdev)
}
rstc = devm_reset_control_get_optional_exclusive(dev, NULL);
+ if (IS_ERR(rstc))
+ return dev_err_probe(dev, PTR_ERR(rstc),
+ "failed to get reset\n");
if (rstc)
reset_control_deassert(rstc);
@@ -311,9 +342,27 @@ static void sun5i_timer_remove(struct platform_device *pdev)
clocksource_unregister(&st->clksrc);
}
+static const struct sunxi_timer_quirks sun5i_sun7i_hstimer_quirks = {
+ .from_ctl_base_offset = 0x0,
+};
+
+static const struct sunxi_timer_quirks sun20i_d1_hstimer_quirks = {
+ .from_ctl_base_offset = 0x10,
+};
+
static const struct of_device_id sun5i_timer_of_match[] = {
- { .compatible = "allwinner,sun5i-a13-hstimer" },
- { .compatible = "allwinner,sun7i-a20-hstimer" },
+ {
+ .compatible = "allwinner,sun5i-a13-hstimer",
+ .data = &sun5i_sun7i_hstimer_quirks,
+ },
+ {
+ .compatible = "allwinner,sun7i-a20-hstimer",
+ .data = &sun5i_sun7i_hstimer_quirks,
+ },
+ {
+ .compatible = "allwinner,sun20i-d1-hstimer",
+ .data = &sun20i_d1_hstimer_quirks,
+ },
{},
};
MODULE_DEVICE_TABLE(of, sun5i_timer_of_match);
diff --git a/drivers/clocksource/timer-tegra186.c b/drivers/clocksource/timer-tegra186.c
index 355558893e5f..78600ddeb1c6 100644
--- a/drivers/clocksource/timer-tegra186.c
+++ b/drivers/clocksource/timer-tegra186.c
@@ -57,6 +57,15 @@
#define WDTUR 0x00c
#define WDTUR_UNLOCK_PATTERN 0x0000c45a
+#define TEGRA186_KERNEL_WDT_TIMEOUT 120
+
+/* WDT security configuration registers */
+#define WDTSCR(x) (0xf02c + (x) * 4)
+#define WDTSCR_SEC_WEN BIT(28)
+#define WDTSCR_SEC_REN BIT(27)
+#define WDTSCR_SEC_G1W BIT(9)
+#define WDTSCR_SEC_G1R BIT(1)
+
struct tegra186_timer_soc {
unsigned int num_timers;
unsigned int num_wdts;
@@ -75,6 +84,7 @@ struct tegra186_wdt {
void __iomem *regs;
unsigned int index;
bool locked;
+ bool is_kernel_wdt;
struct tegra186_tmr *tmr;
};
@@ -89,7 +99,7 @@ struct tegra186_timer {
struct device *dev;
void __iomem *regs;
- struct tegra186_wdt *wdt;
+ struct tegra186_wdt **wdts;
struct clocksource usec;
struct clocksource tsc;
struct clocksource osc;
@@ -149,7 +159,8 @@ static void tegra186_wdt_enable(struct tegra186_wdt *wdt)
u32 value;
/* unmask hardware IRQ, this may have been lost across powergate */
- value = TKEIE_WDT_MASK(wdt->index, 1);
+ value = readl(tegra->regs + TKEIE(wdt->tmr->hwirq));
+ value |= TKEIE_WDT_MASK(wdt->index, 1);
writel(value, tegra->regs + TKEIE(wdt->tmr->hwirq));
/* clear interrupt */
@@ -174,6 +185,10 @@ static void tegra186_wdt_enable(struct tegra186_wdt *wdt)
value &= ~WDTCR_PERIOD_MASK;
value |= WDTCR_PERIOD(1);
+ /* enable local interrupt for kernel watchdog */
+ if (wdt->is_kernel_wdt)
+ value |= WDTCR_LOCAL_INT_ENABLE;
+
/* enable system POR reset */
value |= WDTCR_SYSTEM_POR_RESET_ENABLE;
@@ -211,6 +226,16 @@ static int tegra186_wdt_ping(struct watchdog_device *wdd)
return 0;
}
+static irqreturn_t tegra186_wdt_irq(int irq, void *data)
+{
+ struct tegra186_wdt *wdt = data;
+
+ tegra186_wdt_disable(wdt);
+ tegra186_wdt_enable(wdt);
+
+ return IRQ_HANDLED;
+}
+
static int tegra186_wdt_set_timeout(struct watchdog_device *wdd,
unsigned int timeout)
{
@@ -297,6 +322,23 @@ static const struct watchdog_ops tegra186_wdt_ops = {
.get_timeleft = tegra186_wdt_get_timeleft,
};
+static bool tegra186_wdt_is_accessible(struct tegra186_timer *tegra, unsigned int index)
+{
+ u32 value;
+
+ value = readl_relaxed(tegra->regs + WDTSCR(index));
+
+ /* Check OS write access if write blocking is enabled. */
+ if ((value & WDTSCR_SEC_WEN) && !(value & WDTSCR_SEC_G1W))
+ return false;
+
+ /* Check OS read access if read blocking is enabled. */
+ if ((value & WDTSCR_SEC_REN) && !(value & WDTSCR_SEC_G1R))
+ return false;
+
+ return true;
+}
+
static struct tegra186_wdt *tegra186_wdt_create(struct tegra186_timer *tegra,
unsigned int index)
{
@@ -336,10 +378,6 @@ static struct tegra186_wdt *tegra186_wdt_create(struct tegra186_timer *tegra,
if (err < 0)
return ERR_PTR(err);
- err = devm_watchdog_register_device(tegra->dev, &wdt->base);
- if (err < 0)
- return ERR_PTR(err);
-
return wdt;
}
@@ -421,8 +459,11 @@ static int tegra186_timer_usec_init(struct tegra186_timer *tegra)
static int tegra186_timer_probe(struct platform_device *pdev)
{
+ struct tegra186_wdt *kernel_wdt = NULL;
struct device *dev = &pdev->dev;
struct tegra186_timer *tegra;
+ unsigned int i;
+ int irq;
int err;
tegra = devm_kzalloc(dev, sizeof(*tegra), GFP_KERNEL);
@@ -441,12 +482,33 @@ static int tegra186_timer_probe(struct platform_device *pdev)
if (err < 0)
return err;
- /* create a watchdog using a preconfigured timer */
- tegra->wdt = tegra186_wdt_create(tegra, 0);
- if (IS_ERR(tegra->wdt)) {
- err = PTR_ERR(tegra->wdt);
- dev_err(dev, "failed to create WDT: %d\n", err);
- return err;
+ irq = err;
+
+ tegra->wdts = devm_kcalloc(dev, tegra->soc->num_wdts, sizeof(*tegra->wdts), GFP_KERNEL);
+ if (!tegra->wdts)
+ return -ENOMEM;
+
+ for (i = 0; i < tegra->soc->num_wdts; i++) {
+ if (!tegra186_wdt_is_accessible(tegra, i)) {
+ dev_warn(dev, "WDT%u is not accessible\n", i);
+ continue;
+ }
+
+ tegra->wdts[i] = tegra186_wdt_create(tegra, i);
+ if (IS_ERR(tegra->wdts[i]))
+ return dev_err_probe(dev, PTR_ERR(tegra->wdts[i]),
+ "failed to create WDT%u\n", i);
+
+ /* Reserve the first accessible WDT for the Kernel. */
+ if (!kernel_wdt) {
+ kernel_wdt = tegra->wdts[i];
+ kernel_wdt->is_kernel_wdt = true;
+ } else {
+ err = devm_watchdog_register_device(dev, &tegra->wdts[i]->base);
+ if (err < 0)
+ return dev_err_probe(dev, err,
+ "failed to register WDT%u\n", i);
+ }
}
err = tegra186_timer_tsc_init(tegra);
@@ -467,8 +529,22 @@ static int tegra186_timer_probe(struct platform_device *pdev)
goto unregister_osc;
}
+ if (kernel_wdt) {
+ err = devm_request_irq(dev, irq, tegra186_wdt_irq, 0,
+ dev_name(dev), kernel_wdt);
+ if (err < 0) {
+ dev_err(dev, "failed to request kernel WDT IRQ: %d\n", err);
+ goto unregister_usec;
+ }
+
+ tegra186_wdt_set_timeout(&kernel_wdt->base, TEGRA186_KERNEL_WDT_TIMEOUT);
+ tegra186_wdt_enable(kernel_wdt);
+ }
+
return 0;
+unregister_usec:
+ clocksource_unregister(&tegra->usec);
unregister_osc:
clocksource_unregister(&tegra->osc);
unregister_tsc:
@@ -488,9 +564,14 @@ static void tegra186_timer_remove(struct platform_device *pdev)
static int __maybe_unused tegra186_timer_suspend(struct device *dev)
{
struct tegra186_timer *tegra = dev_get_drvdata(dev);
+ unsigned int i;
- if (watchdog_active(&tegra->wdt->base))
- tegra186_wdt_disable(tegra->wdt);
+ for (i = 0; i < tegra->soc->num_wdts; i++) {
+ struct tegra186_wdt *wdt = tegra->wdts[i];
+
+ if (wdt && (wdt->is_kernel_wdt || watchdog_active(&wdt->base)))
+ tegra186_wdt_disable(wdt);
+ }
return 0;
}
@@ -498,9 +579,14 @@ static int __maybe_unused tegra186_timer_suspend(struct device *dev)
static int __maybe_unused tegra186_timer_resume(struct device *dev)
{
struct tegra186_timer *tegra = dev_get_drvdata(dev);
+ unsigned int i;
- if (watchdog_active(&tegra->wdt->base))
- tegra186_wdt_enable(tegra->wdt);
+ for (i = 0; i < tegra->soc->num_wdts; i++) {
+ struct tegra186_wdt *wdt = tegra->wdts[i];
+
+ if (wdt && (wdt->is_kernel_wdt || watchdog_active(&wdt->base)))
+ tegra186_wdt_enable(wdt);
+ }
return 0;
}
@@ -510,12 +596,12 @@ static SIMPLE_DEV_PM_OPS(tegra186_timer_pm_ops, tegra186_timer_suspend,
static const struct tegra186_timer_soc tegra186_timer = {
.num_timers = 10,
- .num_wdts = 3,
+ .num_wdts = 2,
};
static const struct tegra186_timer_soc tegra234_timer = {
.num_timers = 16,
- .num_wdts = 3,
+ .num_wdts = 2,
};
static const struct of_device_id tegra186_timer_of_match[] = {
diff --git a/drivers/clocksource/timer-ti-dm-systimer.c b/drivers/clocksource/timer-ti-dm-systimer.c
index eb0dfe4b9b7c..3804c1234522 100644
--- a/drivers/clocksource/timer-ti-dm-systimer.c
+++ b/drivers/clocksource/timer-ti-dm-systimer.c
@@ -226,7 +226,7 @@ static bool __init dmtimer_is_preferred(struct device_node *np)
* Some omap3 boards with unreliable oscillator must not use the counter_32k
* or dmtimer1 with 32 KiHz source. Additionally, the boards with unreliable
* oscillator should really set counter_32k as disabled, and delete dmtimer1
- * ti,always-on property, but let's not count on it. For these quirky cases,
+ * ti,timer-alwon property, but let's not count on it. For these quirky cases,
* we prefer using the always-on secure dmtimer12 with the internal 32 KiHz
* clock as the clocksource, and any available dmtimer as clockevent.
*
diff --git a/drivers/clocksource/timer-ti-dm.c b/drivers/clocksource/timer-ti-dm.c
index 793e7cdcb1b1..bd06afb7d522 100644
--- a/drivers/clocksource/timer-ti-dm.c
+++ b/drivers/clocksource/timer-ti-dm.c
@@ -20,8 +20,11 @@
#include <linux/clk.h>
#include <linux/clk-provider.h>
+#include <linux/clocksource.h>
+#include <linux/clockchips.h>
#include <linux/cpu_pm.h>
#include <linux/module.h>
+#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/device.h>
#include <linux/err.h>
@@ -29,6 +32,7 @@
#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/platform_data/dmtimer-omap.h>
+#include <linux/sched_clock.h>
#include <clocksource/timer-ti-dm.h>
#include <linux/delay.h>
@@ -148,6 +152,21 @@ static u32 omap_reserved_systimers;
static LIST_HEAD(omap_timer_list);
static DEFINE_SPINLOCK(dm_timer_lock);
+struct dmtimer_clocksource {
+ struct clocksource dev;
+ struct dmtimer *timer;
+ unsigned int loadval;
+};
+
+struct omap_dm_timer_clockevent {
+ struct clock_event_device dev;
+ struct dmtimer *timer;
+ u32 period;
+};
+
+static bool omap_dm_timer_clockevent_setup;
+static void __iomem *omap_dm_timer_sched_clock_counter;
+
enum {
REQUEST_ANY = 0,
REQUEST_BY_ID,
@@ -1185,6 +1204,192 @@ static const struct dev_pm_ops omap_dm_timer_pm_ops = {
static const struct of_device_id omap_timer_match[];
+static struct dmtimer_clocksource *omap_dm_timer_to_clocksource(struct clocksource *cs)
+{
+ return container_of(cs, struct dmtimer_clocksource, dev);
+}
+
+static u64 omap_dm_timer_read_cycles(struct clocksource *cs)
+{
+ struct dmtimer_clocksource *clksrc = omap_dm_timer_to_clocksource(cs);
+ struct dmtimer *timer = clksrc->timer;
+
+ return (u64)__omap_dm_timer_read_counter(timer);
+}
+
+static u64 notrace omap_dm_timer_read_sched_clock(void)
+{
+ /* Posted mode is not active here, so we can read directly */
+ return readl_relaxed(omap_dm_timer_sched_clock_counter);
+}
+
+static void omap_dm_timer_clocksource_suspend(struct clocksource *cs)
+{
+ struct dmtimer_clocksource *clksrc = omap_dm_timer_to_clocksource(cs);
+ struct dmtimer *timer = clksrc->timer;
+
+ clksrc->loadval = __omap_dm_timer_read_counter(timer);
+ __omap_dm_timer_stop(timer);
+}
+
+static void omap_dm_timer_clocksource_resume(struct clocksource *cs)
+{
+ struct dmtimer_clocksource *clksrc = omap_dm_timer_to_clocksource(cs);
+ struct dmtimer *timer = clksrc->timer;
+
+ dmtimer_write(timer, OMAP_TIMER_COUNTER_REG, clksrc->loadval);
+ dmtimer_write(timer, OMAP_TIMER_CTRL_REG, OMAP_TIMER_CTRL_ST | OMAP_TIMER_CTRL_AR);
+}
+
+static void omap_dm_timer_clocksource_unregister(void *data)
+{
+ struct clocksource *cs = data;
+
+ clocksource_unregister(cs);
+}
+
+static int omap_dm_timer_setup_clocksource(struct dmtimer *timer)
+{
+ struct device *dev = &timer->pdev->dev;
+ struct dmtimer_clocksource *clksrc;
+ int err;
+
+ __omap_dm_timer_init_regs(timer);
+
+ timer->reserved = 1;
+
+ clksrc = devm_kzalloc(dev, sizeof(*clksrc), GFP_KERNEL);
+ if (!clksrc)
+ return -ENOMEM;
+
+ clksrc->timer = timer;
+
+ clksrc->dev.name = "omap_dm_timer";
+ clksrc->dev.rating = 300;
+ clksrc->dev.read = omap_dm_timer_read_cycles;
+ clksrc->dev.mask = CLOCKSOURCE_MASK(32);
+ clksrc->dev.flags = CLOCK_SOURCE_IS_CONTINUOUS;
+ clksrc->dev.suspend = omap_dm_timer_clocksource_suspend;
+ clksrc->dev.resume = omap_dm_timer_clocksource_resume;
+
+ dmtimer_write(timer, OMAP_TIMER_COUNTER_REG, 0);
+ dmtimer_write(timer, OMAP_TIMER_LOAD_REG, 0);
+ dmtimer_write(timer, OMAP_TIMER_CTRL_REG, OMAP_TIMER_CTRL_ST | OMAP_TIMER_CTRL_AR);
+
+ omap_dm_timer_sched_clock_counter = timer->func_base + _OMAP_TIMER_COUNTER_OFFSET;
+ sched_clock_register(omap_dm_timer_read_sched_clock, 32, timer->fclk_rate);
+
+ err = clocksource_register_hz(&clksrc->dev, timer->fclk_rate);
+ if (err)
+ return dev_err_probe(dev, err, "Could not register as clocksource\n");
+
+ err = devm_add_action_or_reset(dev, omap_dm_timer_clocksource_unregister, &clksrc->dev);
+ if (err)
+ return dev_err_probe(dev, err, "Could not register clocksource_unregister action\n");
+
+ return 0;
+}
+
+static struct omap_dm_timer_clockevent *to_dm_timer_clockevent(struct clock_event_device *evt)
+{
+ return container_of(evt, struct omap_dm_timer_clockevent, dev);
+}
+
+static int omap_dm_timer_evt_set_next_event(unsigned long cycles,
+ struct clock_event_device *evt)
+{
+ struct omap_dm_timer_clockevent *clkevt = to_dm_timer_clockevent(evt);
+ struct dmtimer *timer = clkevt->timer;
+
+ dmtimer_write(timer, OMAP_TIMER_COUNTER_REG, 0xffffffff - cycles);
+ dmtimer_write(timer, OMAP_TIMER_CTRL_REG, OMAP_TIMER_CTRL_ST);
+
+ return 0;
+}
+
+static int omap_dm_timer_evt_shutdown(struct clock_event_device *evt)
+{
+ struct omap_dm_timer_clockevent *clkevt = to_dm_timer_clockevent(evt);
+ struct dmtimer *timer = clkevt->timer;
+
+ __omap_dm_timer_stop(timer);
+
+ return 0;
+}
+
+static int omap_dm_timer_evt_set_periodic(struct clock_event_device *evt)
+{
+ struct omap_dm_timer_clockevent *clkevt = to_dm_timer_clockevent(evt);
+ struct dmtimer *timer = clkevt->timer;
+
+ omap_dm_timer_evt_shutdown(evt);
+
+ omap_dm_timer_set_load(&timer->cookie, clkevt->period);
+ dmtimer_write(timer, OMAP_TIMER_COUNTER_REG, clkevt->period);
+ dmtimer_write(timer, OMAP_TIMER_CTRL_REG,
+ OMAP_TIMER_CTRL_AR | OMAP_TIMER_CTRL_ST);
+
+ return 0;
+}
+
+static irqreturn_t omap_dm_timer_evt_interrupt(int irq, void *dev_id)
+{
+ struct omap_dm_timer_clockevent *clkevt = dev_id;
+ struct dmtimer *timer = clkevt->timer;
+
+ __omap_dm_timer_write_status(timer, OMAP_TIMER_INT_OVERFLOW);
+
+ clkevt->dev.event_handler(&clkevt->dev);
+
+ return IRQ_HANDLED;
+}
+
+static int omap_dm_timer_setup_clockevent(struct dmtimer *timer)
+{
+ struct device *dev = &timer->pdev->dev;
+ struct omap_dm_timer_clockevent *clkevt;
+ int ret;
+
+ clkevt = devm_kzalloc(dev, sizeof(*clkevt), GFP_KERNEL);
+ if (!clkevt)
+ return -ENOMEM;
+
+ timer->reserved = 1;
+ clkevt->timer = timer;
+
+ clkevt->dev.name = "omap_dm_timer";
+ clkevt->dev.features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
+ clkevt->dev.rating = 300;
+ clkevt->dev.set_next_event = omap_dm_timer_evt_set_next_event;
+ clkevt->dev.set_state_shutdown = omap_dm_timer_evt_shutdown;
+ clkevt->dev.set_state_periodic = omap_dm_timer_evt_set_periodic;
+ clkevt->dev.set_state_oneshot = omap_dm_timer_evt_shutdown;
+ clkevt->dev.set_state_oneshot_stopped = omap_dm_timer_evt_shutdown;
+ clkevt->dev.tick_resume = omap_dm_timer_evt_shutdown;
+ clkevt->dev.cpumask = cpu_possible_mask;
+ clkevt->period = 0xffffffff - DIV_ROUND_CLOSEST(timer->fclk_rate, HZ);
+
+ __omap_dm_timer_init_regs(timer);
+ __omap_dm_timer_stop(timer);
+ __omap_dm_timer_enable_posted(timer);
+
+ ret = devm_request_irq(dev, timer->irq, omap_dm_timer_evt_interrupt,
+ IRQF_TIMER, "omap_dm_timer_clockevent", clkevt);
+ if (ret) {
+ dev_err(dev, "Failed to request interrupt: %d\n", ret);
+ return ret;
+ }
+
+ __omap_dm_timer_int_enable(timer, OMAP_TIMER_INT_OVERFLOW);
+
+ clockevents_config_and_register(&clkevt->dev, timer->fclk_rate,
+ 3,
+ 0xffffffff);
+
+ omap_dm_timer_clockevent_setup = true;
+ return 0;
+}
+
/**
* omap_dm_timer_probe - probe function called for every registered device
* @pdev: pointer to current timer platform device
@@ -1272,6 +1477,18 @@ static int omap_dm_timer_probe(struct platform_device *pdev)
timer->pdev = pdev;
+ if (timer->capability & OMAP_TIMER_ALWON && !IS_ERR_OR_NULL(timer->fclk)) {
+ if (!omap_dm_timer_sched_clock_counter) {
+ ret = omap_dm_timer_setup_clocksource(timer);
+ if (ret)
+ return ret;
+ } else if (!omap_dm_timer_clockevent_setup) {
+ ret = omap_dm_timer_setup_clockevent(timer);
+ if (ret)
+ return ret;
+ }
+ }
+
pm_runtime_enable(dev);
if (!timer->reserved) {
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7c38190b10bf..c5b34c16602e 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -236,6 +236,9 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
*/
extern int
__clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
+extern int
+__devm_clocksource_register_scale(struct device *dev, struct clocksource *cs,
+ u32 scale, u32 freq);
extern void
__clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq);
@@ -258,6 +261,18 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
return __clocksource_register_scale(cs, 1000, khz);
}
+static inline int devm_clocksource_register_hz(struct device *dev,
+ struct clocksource *cs, u32 hz)
+{
+ return __devm_clocksource_register_scale(dev, cs, 1, hz);
+}
+
+static inline int devm_clocksource_register_khz(struct device *dev,
+ struct clocksource *cs, u32 khz)
+{
+ return __devm_clocksource_register_scale(dev, cs, 1000, khz);
+}
+
static inline void __clocksource_update_freq_hz(struct clocksource *cs, u32 hz)
{
__clocksource_update_freq_scale(cs, 1, hz);
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index baee13a1f87f..313f6c88148e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -1338,6 +1338,26 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
}
EXPORT_SYMBOL_GPL(__clocksource_register_scale);
+static void __devm_clocksource_unregister(void *data)
+{
+ struct clocksource *cs = data;
+
+ clocksource_unregister(cs);
+}
+
+int __devm_clocksource_register_scale(struct device *dev, struct clocksource *cs,
+ u32 scale, u32 freq)
+{
+ int ret;
+
+ ret = __clocksource_register_scale(cs, scale, freq);
+ if (ret)
+ return ret;
+
+ return devm_add_action_or_reset(dev, __devm_clocksource_unregister, cs);
+}
+EXPORT_SYMBOL_GPL(__devm_clocksource_register_scale);
+
/*
* Unbind clocksource @cs. Called with clocksource_mutex held
*/
* [GIT pull] timers/core for v7.2-rc1
From: Thomas Gleixner @ 2026-06-13 21:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest timers/core branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-core-2026-06-13
up to: 87bd2ad568e1: posix-cpu-timers: Fix pid refcount leak in do_cpu_nanosleep() error path
Updates for the time/timer core subsystem:
- Harden the user space controllable hrtimer interfaces further to
protect against unprivileged DoS attempts by arming timers in the past.
- Add per-capacity hierarchies to the timer migration code to prevent
timer migration across different capacity domains. This code was
disabled at the last minute as there is a pathological problem with SoCs
which advertise a larger number of capacity domains. The problem is
under investigation and the code won't be active before v7.3. Disabling
it turned out to be less intrusive than a full revert as it preserves
the preparatory steps and allows people to work on the final resolution.
- Export time namespace functionality as a recent user can be built as a
module.
- Initialize the jiffies clocksource before using it. The recent
hardening against time moving backward requires that the related
members of struct clocksource have been initialized, otherwise it
clamps the readout to 0, which makes time stand still and causes boot
delays.
- Fix a more than twenty year old PID reference count leak in an error
path of the POSIX CPU timer code.
- The usual small fixes, improvements and cleanups all over the place.
Note: There is a trivial conflict against the timers/clocksource
branch. That branch introduces devm helpers which touch the same area
of code as the timers/core branch. The resolution is to keep the
timers/clocksource changes.
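
For readers who want a picture of the hrtimer hardening mentioned in the
first item above: the patches below add start functions that report
whether the timer was actually queued, and callers such as timerfd run
the expiry handling themselves when it was not. The following is only a
minimal user-space sketch of that calling convention. All names in it
(model_timer, model_timer_start_user, model_timer_expired, model_arm)
are hypothetical and not kernel APIs, and treating "expires == now" as
already expired is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a timer; not a kernel structure. */
struct model_timer {
	int64_t      expires;	/* absolute expiry time in ns */
	bool         queued;	/* true while the timer is armed */
	unsigned int fired;	/* how often the expiry path ran */
};

/*
 * Arm @t for the absolute time @expires.
 *
 * Returns true if the timer was queued. Returns false if @expires is
 * not in the future relative to @now; in that case nothing is queued
 * and the caller has to handle the expiry itself.
 */
static bool model_timer_start_user(struct model_timer *t, int64_t now,
				   int64_t expires)
{
	t->expires = expires;
	t->queued = expires > now;	/* assumption: equal counts as expired */
	return t->queued;
}

/* Expiry handling, invoked by the caller for an already expired timer. */
static void model_timer_expired(struct model_timer *t)
{
	assert(!t->queued);
	t->fired++;
}

/* Caller pattern: start the timer, handle expiry inline if needed. */
static void model_arm(struct model_timer *t, int64_t now, int64_t expires)
{
	if (!model_timer_start_user(t, now, expires))
		model_timer_expired(t);
}
```

In this model a timer armed in the past never enters the queue, so a
caller that repeatedly arms expired timers only pays for its own inline
handling.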
Thanks,
tglx
------------------>
Frederic Weisbecker (8):
timers/migration: Abstract out hierarchy to prepare for CPU capacity awareness
timers/migration: Track CPUs in a hierarchy
timers/migration: Split per-capacity hierarchies
timers/migration: Handle capacity in connect tracepoints
scripts/timers: Add timer_migration_tree.py
timers/migration: Fix hotplug migrator selection target on asymetric capacity machines
timers/migration: Deactivate per-capacity hierarchies under nohz_full
timers/migration: Temporarily disable per capacity hierarchies
Gitle Mikkelsen (1):
timers: Fix flseep() typo in kernel-doc comment
John Stultz (1):
selftests/posix_timers: Use CLOCK_THREAD_CPUTIME_ID for ITIMER_PROF measurements
Maoyi Xie (2):
time/namespace: Export init_time_ns and do_timens_ktime_to_host()
ntsync: Honour caller's time namespace for absolute MONOTONIC timeouts
Rosen Penev (1):
timers/migration: Turn tmigr_hierarchy level_list into a flexible array
Thomas Gleixner (13):
hrtimer: Provide hrtimer_start_range_ns_user()
hrtimer: Use hrtimer_start_expires_user() for hrtimer sleepers
posix-timers: Expand timer_[re]arm() callbacks with a boolean return value
posix-timers: Handle the timer_[re]arm() return value
posix-timers: Switch to hrtimer_start_expires_user()
alarmtimer: Provide alarm_start_timer()
alarmtimer: Convert posix timer functions to alarm_start_timer()
fs/timerfd: Use the new alarm/hrtimer functions
power: supply: charger-manager: Switch to alarm_start_timer()
netfilter: xt_IDLETIMER: Switch to alarm_start_timer()
alarmtimer: Remove unused interfaces
hrtimer: Fix the bogus return type of __hrtimer_start_range_ns()
time/jiffies: Register jiffies clocksource before usage
Thomas Weißschuh (2):
clocksource: Clean up clocksource_update_freq() functions
hrtimer: Return ktime_t from hrtimer_get_next_event()/hrtimer_next_event_without()
WenTao Liang (1):
posix-cpu-timers: Fix pid refcount leak in do_cpu_nanosleep() error path
Zhan Xusheng (2):
alarmtimer: Remove stale return description from alarm_handle_timer()
timers/migration: Update stale @online doc to @available
drivers/misc/ntsync.c | 3 +
drivers/power/supply/charger-manager.c | 16 +-
fs/timerfd.c | 117 +++++++------
include/linux/alarmtimer.h | 9 +-
include/linux/clocksource.h | 12 --
include/linux/delay.h | 2 +-
include/linux/hrtimer.h | 24 ++-
include/trace/events/timer.h | 13 ++
include/trace/events/timer_migration.h | 24 +--
kernel/time/alarmtimer.c | 72 ++++----
kernel/time/clocksource.c | 9 +-
kernel/time/hrtimer.c | 152 +++++++++++++---
kernel/time/jiffies.c | 11 +-
kernel/time/namespace.c | 2 +
kernel/time/posix-cpu-timers.c | 19 +-
kernel/time/posix-timers.c | 35 ++--
kernel/time/posix-timers.h | 4 +-
kernel/time/tick-sched.c | 3 +-
kernel/time/timer.c | 2 +-
kernel/time/timer_migration.c | 241 +++++++++++++++++++-------
kernel/time/timer_migration.h | 36 +++-
net/netfilter/xt_IDLETIMER.c | 24 ++-
scripts/timer_migration_tree.py | 122 +++++++++++++
tools/testing/selftests/timers/posix_timers.c | 55 +++---
24 files changed, 717 insertions(+), 290 deletions(-)
create mode 100755 scripts/timer_migration_tree.py
diff --git a/drivers/misc/ntsync.c b/drivers/misc/ntsync.c
index 30af282262ef..02c9d1192812 100644
--- a/drivers/misc/ntsync.c
+++ b/drivers/misc/ntsync.c
@@ -19,6 +19,7 @@
#include <linux/sched/signal.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/time_namespace.h>
#include <uapi/linux/ntsync.h>
#define NTSYNC_NAME "ntsync"
@@ -836,6 +837,8 @@ static int ntsync_schedule(const struct ntsync_q *q, const struct ntsync_wait_ar
if (args->flags & NTSYNC_WAIT_REALTIME)
clock = CLOCK_REALTIME;
+ else
+ timeout = timens_ktime_to_host(clock, timeout);
do {
if (signal_pending(current)) {
diff --git a/drivers/power/supply/charger-manager.c b/drivers/power/supply/charger-manager.c
index c49e0e4d02f7..1b0239c59114 100644
--- a/drivers/power/supply/charger-manager.c
+++ b/drivers/power/supply/charger-manager.c
@@ -881,26 +881,22 @@ static bool cm_setup_timer(void)
mutex_unlock(&cm_list_mtx);
if (timer_req && cm_timer) {
- ktime_t now, add;
-
/*
* Set alarm with the polling interval (wakeup_ms)
* The alarm time should be NOW + CM_RTC_SMALL or later.
*/
- if (wakeup_ms == UINT_MAX ||
- wakeup_ms < CM_RTC_SMALL * MSEC_PER_SEC)
+ if (wakeup_ms == UINT_MAX || wakeup_ms < CM_RTC_SMALL * MSEC_PER_SEC)
wakeup_ms = 2 * CM_RTC_SMALL * MSEC_PER_SEC;
pr_info("Charger Manager wakeup timer: %u ms\n", wakeup_ms);
- now = ktime_get_boottime();
- add = ktime_set(wakeup_ms / MSEC_PER_SEC,
- (wakeup_ms % MSEC_PER_SEC) * NSEC_PER_MSEC);
- alarm_start(cm_timer, ktime_add(now, add));
-
cm_suspend_duration_ms = wakeup_ms;
- return true;
+ /*
+ * The timer should always be queued as the timeout is at least
+ * two seconds out. Handle it correctly nevertheless.
+ */
+ return alarm_start_timer(cm_timer, ktime_add_ms(0, wakeup_ms), true);
}
return false;
}
diff --git a/fs/timerfd.c b/fs/timerfd.c
index 73104f36bcae..fe845af0b74e 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -55,6 +55,15 @@ static inline bool isalarm(struct timerfd_ctx *ctx)
ctx->clockid == CLOCK_BOOTTIME_ALARM;
}
+static void __timerfd_triggered(struct timerfd_ctx *ctx)
+{
+ lockdep_assert_held(&ctx->wqh.lock);
+
+ ctx->expired = 1;
+ ctx->ticks++;
+ wake_up_locked_poll(&ctx->wqh, EPOLLIN);
+}
+
/*
* This gets called when the timer event triggers. We set the "expired"
* flag, but we do not re-arm the timer (in case it's necessary,
@@ -62,13 +71,8 @@ static inline bool isalarm(struct timerfd_ctx *ctx)
*/
static void timerfd_triggered(struct timerfd_ctx *ctx)
{
- unsigned long flags;
-
- spin_lock_irqsave(&ctx->wqh.lock, flags);
- ctx->expired = 1;
- ctx->ticks++;
- wake_up_locked_poll(&ctx->wqh, EPOLLIN);
- spin_unlock_irqrestore(&ctx->wqh.lock, flags);
+ guard(spinlock_irqsave)(&ctx->wqh.lock);
+ __timerfd_triggered(ctx);
}
static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr)
@@ -184,15 +188,54 @@ static ktime_t timerfd_get_remaining(struct timerfd_ctx *ctx)
return remaining < 0 ? 0: remaining;
}
+static void timerfd_alarm_start(struct timerfd_ctx *ctx, ktime_t exp, bool relative)
+{
+ /* Start the timer. If it's expired already, handle the callback. */
+ if (!alarm_start_timer(&ctx->t.alarm, exp, relative))
+ __timerfd_triggered(ctx);
+}
+
+static u64 timerfd_alarm_restart(struct timerfd_ctx *ctx)
+{
+ /* -1 to account for ctx->ticks++ in __timerfd_triggered() */
+ u64 ticks = alarm_forward_now(&ctx->t.alarm, ctx->tintv) - 1;
+
+ timerfd_alarm_start(ctx, alarm_get_expires(&ctx->t.alarm), false);
+ return ticks;
+}
+
+static void timerfd_hrtimer_start(struct timerfd_ctx *ctx, ktime_t exp,
+ const enum hrtimer_mode mode)
+{
+ /* Start the timer. If it's expired already, handle the callback. */
+ if (!hrtimer_start_range_ns_user(&ctx->t.tmr, exp, 0, mode))
+ __timerfd_triggered(ctx);
+}
+
+static u64 timerfd_hrtimer_restart(struct timerfd_ctx *ctx)
+{
+ /* -1 to account for ctx->ticks++ in __timerfd_triggered() */
+ u64 ticks = hrtimer_forward_now(&ctx->t.tmr, ctx->tintv) - 1;
+
+ timerfd_hrtimer_start(ctx, hrtimer_get_expires(&ctx->t.tmr), HRTIMER_MODE_ABS);
+ return ticks;
+}
+
+static u64 timerfd_restart(struct timerfd_ctx *ctx)
+{
+ if (isalarm(ctx))
+ return timerfd_alarm_restart(ctx);
+ return timerfd_hrtimer_restart(ctx);
+}
+
static int timerfd_setup(struct timerfd_ctx *ctx, int flags,
const struct itimerspec64 *ktmr)
{
+ int clockid = ctx->clockid;
enum hrtimer_mode htmode;
ktime_t texp;
- int clockid = ctx->clockid;
- htmode = (flags & TFD_TIMER_ABSTIME) ?
- HRTIMER_MODE_ABS: HRTIMER_MODE_REL;
+ htmode = (flags & TFD_TIMER_ABSTIME) ? HRTIMER_MODE_ABS: HRTIMER_MODE_REL;
texp = timespec64_to_ktime(ktmr->it_value);
ctx->expired = 0;
@@ -206,20 +249,15 @@ static int timerfd_setup(struct timerfd_ctx *ctx, int flags,
timerfd_alarmproc);
} else {
hrtimer_setup(&ctx->t.tmr, timerfd_tmrproc, clockid, htmode);
- hrtimer_set_expires(&ctx->t.tmr, texp);
}
if (texp != 0) {
if (flags & TFD_TIMER_ABSTIME)
texp = timens_ktime_to_host(clockid, texp);
- if (isalarm(ctx)) {
- if (flags & TFD_TIMER_ABSTIME)
- alarm_start(&ctx->t.alarm, texp);
- else
- alarm_start_relative(&ctx->t.alarm, texp);
- } else {
- hrtimer_start(&ctx->t.tmr, texp, htmode);
- }
+ if (isalarm(ctx))
+ timerfd_alarm_start(ctx, texp, !(flags & TFD_TIMER_ABSTIME));
+ else
+ timerfd_hrtimer_start(ctx, texp, htmode);
if (timerfd_canceled(ctx))
return -ECANCELED;
@@ -287,27 +325,19 @@ static ssize_t timerfd_read_iter(struct kiocb *iocb, struct iov_iter *to)
}
if (ctx->ticks) {
- ticks = ctx->ticks;
+ unsigned int expired = ctx->expired;
- if (ctx->expired && ctx->tintv) {
- /*
- * If tintv != 0, this is a periodic timer that
- * needs to be re-armed. We avoid doing it in the timer
- * callback to avoid DoS attacks specifying a very
- * short timer period.
- */
- if (isalarm(ctx)) {
- ticks += alarm_forward_now(
- &ctx->t.alarm, ctx->tintv) - 1;
- alarm_restart(&ctx->t.alarm);
- } else {
- ticks += hrtimer_forward_now(&ctx->t.tmr,
- ctx->tintv) - 1;
- hrtimer_restart(&ctx->t.tmr);
- }
- }
+ ticks = ctx->ticks;
ctx->expired = 0;
ctx->ticks = 0;
+
+ /*
+ * If tintv != 0, this is a periodic timer that needs to be
+ * re-armed. We avoid doing it in the timer callback to avoid
+ * DoS attacks specifying a very short timer period.
+ */
+ if (expired && ctx->tintv)
+ ticks += timerfd_restart(ctx);
}
spin_unlock_irq(&ctx->wqh.lock);
if (ticks) {
@@ -526,18 +556,7 @@ static int do_timerfd_gettime(int ufd, struct itimerspec64 *t)
spin_lock_irq(&ctx->wqh.lock);
if (ctx->expired && ctx->tintv) {
ctx->expired = 0;
-
- if (isalarm(ctx)) {
- ctx->ticks +=
- alarm_forward_now(
- &ctx->t.alarm, ctx->tintv) - 1;
- alarm_restart(&ctx->t.alarm);
- } else {
- ctx->ticks +=
- hrtimer_forward_now(&ctx->t.tmr, ctx->tintv)
- - 1;
- hrtimer_restart(&ctx->t.tmr);
- }
+ ctx->ticks += timerfd_restart(ctx);
}
t->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx));
t->it_interval = ktime_to_timespec64(ctx->tintv);
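
The "- 1" in the timerfd restart helpers above can be easier to follow
with numbers. The following is a minimal user-space sketch of the tick
accounting, assuming the forward step behaves like a simplified
hrtimer_forward(): it moves the expiry past "now" in whole intervals
and returns the number of intervals added. All names (model_periodic,
model_forward, model_triggered, model_read) are hypothetical and the
code is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of a periodic timer; not kernel code. */
struct model_periodic {
	int64_t  expires;	/* absolute expiry in ns */
	int64_t  interval;	/* period in ns, must be > 0 */
	uint64_t ticks;		/* expirations not yet reported */
};

/*
 * Advance @t->expires in steps of @t->interval until it lies after
 * @now. Returns the number of steps taken, 0 if not yet expired.
 */
static uint64_t model_forward(struct model_periodic *t, int64_t now)
{
	int64_t delta = now - t->expires;
	uint64_t overruns;

	assert(t->interval > 0);
	if (delta < 0)
		return 0;

	overruns = (uint64_t)(delta / t->interval) + 1;
	t->expires += (int64_t)overruns * t->interval;
	return overruns;
}

/* Expiry callback model: accounts exactly one tick. */
static void model_triggered(struct model_periodic *t)
{
	t->ticks++;
}

/*
 * Read path model: report the pending ticks and re-arm. The first
 * expiry is already counted by model_triggered(), so the number of
 * overruns is reduced by one before it is added.
 */
static uint64_t model_read(struct model_periodic *t, int64_t now)
{
	uint64_t ticks = t->ticks;
	uint64_t overruns;

	t->ticks = 0;
	if (!ticks)
		return 0;

	overruns = model_forward(t, now);
	if (overruns)
		ticks += overruns - 1;
	return ticks;
}
```

With an expiry at 100, an interval of 10 and a read at 135, the model
reports four ticks (for 100, 110, 120 and 130) and leaves the next
expiry at 140.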
diff --git a/include/linux/alarmtimer.h b/include/linux/alarmtimer.h
index 3ffa5341dce2..2014288ca2f4 100644
--- a/include/linux/alarmtimer.h
+++ b/include/linux/alarmtimer.h
@@ -42,11 +42,14 @@ struct alarm {
void *data;
};
+static __always_inline ktime_t alarm_get_expires(struct alarm *alarm)
+{
+ return alarm->node.expires;
+}
+
void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
void (*function)(struct alarm *, ktime_t));
-void alarm_start(struct alarm *alarm, ktime_t start);
-void alarm_start_relative(struct alarm *alarm, ktime_t start);
-void alarm_restart(struct alarm *alarm);
+bool alarm_start_timer(struct alarm *alarm, ktime_t expires, bool relative);
int alarm_try_to_cancel(struct alarm *alarm);
int alarm_cancel(struct alarm *alarm);
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7c38190b10bf..c61aa458d4ec 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -236,8 +236,6 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
*/
extern int
__clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
-extern void
-__clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq);
/*
* Don't call this unless you are a default clocksource
@@ -258,16 +256,6 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
return __clocksource_register_scale(cs, 1000, khz);
}
-static inline void __clocksource_update_freq_hz(struct clocksource *cs, u32 hz)
-{
- __clocksource_update_freq_scale(cs, 1, hz);
-}
-
-static inline void __clocksource_update_freq_khz(struct clocksource *cs, u32 khz)
-{
- __clocksource_update_freq_scale(cs, 1000, khz);
-}
-
#ifdef CONFIG_ARCH_CLOCKSOURCE_INIT
extern void clocksource_arch_init(struct clocksource *cs);
#else
diff --git a/include/linux/delay.h b/include/linux/delay.h
index 46412c00033a..68b2a69dd24d 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -110,7 +110,7 @@ static const unsigned int max_slack_shift = 2;
* fsleep - flexible sleep which autoselects the best mechanism
* @usecs: requested sleep duration in microseconds
*
- * flseep() selects the best mechanism that will provide maximum 25% slack
+ * fsleep() selects the best mechanism that will provide maximum 25% slack
* to the requested sleep duration. Therefore it uses:
*
* * udelay() loop for sleep durations <= 10 microseconds to avoid hrtimer
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 9ced498fefaa..6862dea0acc5 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -206,6 +206,9 @@ static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
extern void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
u64 range_ns, const enum hrtimer_mode mode);
+extern bool hrtimer_start_range_ns_user(struct hrtimer *timer, ktime_t tim,
+ u64 range_ns, const enum hrtimer_mode mode);
+
/**
* hrtimer_start - (re)start an hrtimer
* @timer: the timer to be added
@@ -223,17 +226,28 @@ static inline void hrtimer_start(struct hrtimer *timer, ktime_t tim,
extern int hrtimer_cancel(struct hrtimer *timer);
extern int hrtimer_try_to_cancel(struct hrtimer *timer);
-static inline void hrtimer_start_expires(struct hrtimer *timer,
- enum hrtimer_mode mode)
+static inline void hrtimer_start_expires(struct hrtimer *timer, enum hrtimer_mode mode)
{
- u64 delta;
ktime_t soft, hard;
+ u64 delta;
+
soft = hrtimer_get_softexpires(timer);
hard = hrtimer_get_expires(timer);
delta = ktime_to_ns(ktime_sub(hard, soft));
hrtimer_start_range_ns(timer, soft, delta, mode);
}
+static inline bool hrtimer_start_expires_user(struct hrtimer *timer, enum hrtimer_mode mode)
+{
+ ktime_t soft, hard;
+ u64 delta;
+
+ soft = hrtimer_get_softexpires(timer);
+ hard = hrtimer_get_expires(timer);
+ delta = ktime_to_ns(ktime_sub(hard, soft));
+ return hrtimer_start_range_ns_user(timer, soft, delta, mode);
+}
+
void hrtimer_sleeper_start_expires(struct hrtimer_sleeper *sl,
enum hrtimer_mode mode);
@@ -254,8 +268,8 @@ static inline ktime_t hrtimer_get_remaining(const struct hrtimer *timer)
return __hrtimer_get_remaining(timer, false);
}
-extern u64 hrtimer_get_next_event(void);
-extern u64 hrtimer_next_event_without(const struct hrtimer *exclude);
+extern ktime_t hrtimer_get_next_event(void);
+extern ktime_t hrtimer_next_event_without(const struct hrtimer *exclude);
extern bool hrtimer_active(const struct hrtimer *timer);
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 07cbb9836b91..ca82fd62dc30 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -298,6 +298,19 @@ DECLARE_EVENT_CLASS(hrtimer_class,
TP_printk("hrtimer=%p", __entry->hrtimer)
);
+/**
+ * hrtimer_start_expired - Invoked when a expired timer was started
+ * @hrtimer: pointer to struct hrtimer
+ *
+ * Preceeded by a hrtimer_start tracepoint.
+ */
+DEFINE_EVENT(hrtimer_class, hrtimer_start_expired,
+
+ TP_PROTO(struct hrtimer *hrtimer),
+
+ TP_ARGS(hrtimer)
+);
+
/**
* hrtimer_expire_exit - called immediately after the hrtimer callback returns
* @hrtimer: pointer to struct hrtimer
diff --git a/include/trace/events/timer_migration.h b/include/trace/events/timer_migration.h
index 61171b13c687..0b135e9301b1 100644
--- a/include/trace/events/timer_migration.h
+++ b/include/trace/events/timer_migration.h
@@ -33,15 +33,16 @@ TRACE_EVENT(tmigr_group_set,
TRACE_EVENT(tmigr_connect_child_parent,
- TP_PROTO(struct tmigr_group *child),
+ TP_PROTO(struct tmigr_hierarchy *hier, struct tmigr_group *child),
- TP_ARGS(child),
+ TP_ARGS(hier, child),
TP_STRUCT__entry(
__field( void *, child )
__field( void *, parent )
__field( unsigned int, lvl )
__field( unsigned int, numa_node )
+ __field( unsigned int, capacity )
__field( unsigned int, num_children )
__field( u32, groupmask )
),
@@ -51,26 +52,28 @@ TRACE_EVENT(tmigr_connect_child_parent,
__entry->parent = child->parent;
__entry->lvl = child->parent->level;
__entry->numa_node = child->parent->numa_node;
+ __entry->capacity = hier->capacity;
__entry->num_children = child->parent->num_children;
__entry->groupmask = child->groupmask;
),
- TP_printk("group=%p groupmask=%0x parent=%p lvl=%d numa=%d num_children=%d",
- __entry->child, __entry->groupmask, __entry->parent,
- __entry->lvl, __entry->numa_node, __entry->num_children)
+ TP_printk("group=%p groupmask=%0x parent=%p lvl=%d numa=%d capacity=%d num_children=%d",
+ __entry->child, __entry->groupmask, __entry->parent, __entry->lvl,
+ __entry->numa_node, __entry->capacity, __entry->num_children)
);
TRACE_EVENT(tmigr_connect_cpu_parent,
- TP_PROTO(struct tmigr_cpu *tmc),
+ TP_PROTO(struct tmigr_hierarchy *hier, struct tmigr_cpu *tmc),
- TP_ARGS(tmc),
+ TP_ARGS(hier, tmc),
TP_STRUCT__entry(
__field( void *, parent )
__field( unsigned int, cpu )
__field( unsigned int, lvl )
__field( unsigned int, numa_node )
+ __field( unsigned int, capacity )
__field( unsigned int, num_children )
__field( u32, groupmask )
),
@@ -80,13 +83,14 @@ TRACE_EVENT(tmigr_connect_cpu_parent,
__entry->cpu = tmc->cpuevt.cpu;
__entry->lvl = tmc->tmgroup->level;
__entry->numa_node = tmc->tmgroup->numa_node;
+ __entry->capacity = hier->capacity;
__entry->num_children = tmc->tmgroup->num_children;
__entry->groupmask = tmc->groupmask;
),
- TP_printk("cpu=%d groupmask=%0x parent=%p lvl=%d numa=%d num_children=%d",
- __entry->cpu, __entry->groupmask, __entry->parent,
- __entry->lvl, __entry->numa_node, __entry->num_children)
+ TP_printk("cpu=%d groupmask=%0x parent=%p lvl=%d numa=%d capacity=%d num_children=%d",
+ __entry->cpu, __entry->groupmask, __entry->parent, __entry->lvl,
+ __entry->numa_node, __entry->capacity, __entry->num_children)
);
DECLARE_EVENT_CLASS(tmigr_group_and_cpu,
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 6e173d70d825..ea5be5870e51 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -337,48 +337,32 @@ void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
EXPORT_SYMBOL_GPL(alarm_init);
/**
- * alarm_start - Sets an absolute alarm to fire
- * @alarm: ptr to alarm to set
- * @start: time to run the alarm
+ * alarm_start_timer - Sets an alarm to fire
+ * @alarm: Pointer to alarm to set
+ * @expires: Expiry time
+ * @relative: True if @expires is relative
+ *
+ * Returns: True if the alarm was queued. False if it already expired
*/
-void alarm_start(struct alarm *alarm, ktime_t start)
+bool alarm_start_timer(struct alarm *alarm, ktime_t expires, bool relative)
{
struct alarm_base *base = &alarm_bases[alarm->type];
- scoped_guard(spinlock_irqsave, &base->lock) {
- alarm->node.expires = start;
- alarmtimer_enqueue(base, alarm);
- hrtimer_start(&alarm->timer, alarm->node.expires, HRTIMER_MODE_ABS);
- }
+ if (relative)
+ expires = ktime_add_safe(expires, base->get_ktime());
trace_alarmtimer_start(alarm, base->get_ktime());
-}
-EXPORT_SYMBOL_GPL(alarm_start);
-
-/**
- * alarm_start_relative - Sets a relative alarm to fire
- * @alarm: ptr to alarm to set
- * @start: time relative to now to run the alarm
- */
-void alarm_start_relative(struct alarm *alarm, ktime_t start)
-{
- struct alarm_base *base = &alarm_bases[alarm->type];
-
- start = ktime_add_safe(start, base->get_ktime());
- alarm_start(alarm, start);
-}
-EXPORT_SYMBOL_GPL(alarm_start_relative);
-
-void alarm_restart(struct alarm *alarm)
-{
- struct alarm_base *base = &alarm_bases[alarm->type];
guard(spinlock_irqsave)(&base->lock);
- hrtimer_set_expires(&alarm->timer, alarm->node.expires);
- hrtimer_restart(&alarm->timer);
+ alarm->node.expires = expires;
alarmtimer_enqueue(base, alarm);
+ if (!hrtimer_start_range_ns_user(&alarm->timer, expires, 0, HRTIMER_MODE_ABS)) {
+ alarmtimer_dequeue(base, alarm);
+ return false;
+ }
+ return true;
}
-EXPORT_SYMBOL_GPL(alarm_restart);
+EXPORT_SYMBOL_GPL(alarm_start_timer);
/**
* alarm_try_to_cancel - Tries to cancel an alarm timer
@@ -512,8 +496,6 @@ static enum alarmtimer_type clock2alarm(clockid_t clockid)
* @now: time at the timer expiration
*
* Posix timer callback for expired alarm timers.
- *
- * Return: whether the timer is to be restarted
*/
static void alarm_handle_timer(struct alarm *alarm, ktime_t now)
{
@@ -527,12 +509,12 @@ static void alarm_handle_timer(struct alarm *alarm, ktime_t now)
* alarm_timer_rearm - Posix timer callback for rearming timer
* @timr: Pointer to the posixtimer data struct
*/
-static void alarm_timer_rearm(struct k_itimer *timr)
+static bool alarm_timer_rearm(struct k_itimer *timr)
{
struct alarm *alarm = &timr->it.alarm.alarmtimer;
timr->it_overrun += alarm_forward_now(alarm, timr->it_interval);
- alarm_start(alarm, alarm->node.expires);
+ return alarm_start_timer(alarm, alarm->node.expires, false);
}
/**
@@ -588,7 +570,7 @@ static void alarm_timer_wait_running(struct k_itimer *timr)
* @absolute: Expiry value is absolute time
* @sigev_none: Posix timer does not deliver signals
*/
-static void alarm_timer_arm(struct k_itimer *timr, ktime_t expires,
+static bool alarm_timer_arm(struct k_itimer *timr, ktime_t expires,
bool absolute, bool sigev_none)
{
struct alarm *alarm = &timr->it.alarm.alarmtimer;
@@ -596,10 +578,16 @@ static void alarm_timer_arm(struct k_itimer *timr, ktime_t expires,
if (!absolute)
expires = ktime_add_safe(expires, base->get_ktime());
- if (sigev_none)
+
+ /*
+ * sigev_none needs to update the expires value and pretend
+ * that the timer is queued
+ */
+ if (sigev_none) {
alarm->node.expires = expires;
- else
- alarm_start(&timr->it.alarm.alarmtimer, expires);
+ return true;
+ }
+ return alarm_start_timer(&timr->it.alarm.alarmtimer, expires, false);
}
/**
@@ -706,7 +694,9 @@ static int alarmtimer_do_nsleep(struct alarm *alarm, ktime_t absexp,
alarm->data = (void *)current;
do {
set_current_state(TASK_INTERRUPTIBLE);
- alarm_start(alarm, absexp);
+ if (!alarm_start_timer(alarm, absexp, false))
+ alarm->data = NULL;
+
if (likely(alarm->data))
schedule();
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index baee13a1f87f..f8d15ed3ec98 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -1222,14 +1222,8 @@ static void clocksource_enqueue(struct clocksource *cs)
* @cs: clocksource to be registered
* @scale: Scale factor multiplied against freq to get clocksource hz
* @freq: clocksource frequency (cycles per second) divided by scale
- *
- * This should only be called from the clocksource->enable() method.
- *
- * This *SHOULD NOT* be called directly! Please use the
- * __clocksource_update_freq_hz() or __clocksource_update_freq_khz() helper
- * functions.
*/
-void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq)
+static void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq)
{
u64 sec;
@@ -1287,7 +1281,6 @@ void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq
pr_info("%s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
}
-EXPORT_SYMBOL_GPL(__clocksource_update_freq_scale);
/**
* __clocksource_register_scale - Used to install new clocksources
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 5bd6efe598f0..638ce623c342 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1352,8 +1352,14 @@ static inline bool hrtimer_keep_base(struct hrtimer *timer, bool is_local, bool
return hrtimer_prefer_local(is_local, is_first, is_pinned);
}
-static bool __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 delta_ns,
- const enum hrtimer_mode mode, struct hrtimer_clock_base *base)
+enum {
+ HRTIMER_REPROGRAM_NONE,
+ HRTIMER_REPROGRAM,
+ HRTIMER_REPROGRAM_FORCE,
+};
+
+static int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 delta_ns,
+ const enum hrtimer_mode mode, struct hrtimer_clock_base *base)
{
struct hrtimer_cpu_base *this_cpu_base = this_cpu_ptr(&hrtimer_bases);
bool is_pinned, first, was_first, keep_base = false;
@@ -1410,7 +1416,7 @@ static bool __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 del
/* If a deferred rearm is pending skip reprogramming the device */
if (cpu_base->deferred_rearm) {
cpu_base->deferred_needs_update = true;
- return false;
+ return HRTIMER_REPROGRAM_NONE;
}
if (!was_first || cpu_base != this_cpu_base) {
@@ -1423,7 +1429,7 @@ static bool __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 del
* callbacks.
*/
if (likely(hrtimer_base_is_online(this_cpu_base)))
- return first;
+ return first ? HRTIMER_REPROGRAM : HRTIMER_REPROGRAM_NONE;
/*
* Timer was enqueued remote because the current base is
@@ -1432,7 +1438,7 @@ static bool __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 del
*/
if (first)
smp_call_function_single_async(cpu_base->cpu, &cpu_base->csd);
- return false;
+ return HRTIMER_REPROGRAM_NONE;
}
/*
@@ -1446,7 +1452,7 @@ static bool __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 del
*/
if (timer->is_lazy) {
if (cpu_base->expires_next <= hrtimer_get_expires(timer))
- return false;
+ return HRTIMER_REPROGRAM_NONE;
}
/*
@@ -1455,8 +1461,24 @@ static bool __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 del
* reprogram the hardware by evaluating the new first expiring
* timer.
*/
- hrtimer_force_reprogram(cpu_base, /* skip_equal */ true);
- return false;
+ return HRTIMER_REPROGRAM_FORCE;
+}
+
+static int hrtimer_start_range_ns_common(struct hrtimer *timer, ktime_t tim,
+ u64 delta_ns, const enum hrtimer_mode mode,
+ struct hrtimer_clock_base *base)
+{
+ /*
+ * Check whether the HRTIMER_MODE_SOFT bit and hrtimer.is_soft
+ * match on CONFIG_PREEMPT_RT = n. With PREEMPT_RT check the hard
+ * expiry mode because unmarked timers are moved to softirq expiry.
+ */
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+ WARN_ON_ONCE(!(mode & HRTIMER_MODE_SOFT) ^ !timer->is_soft);
+ else
+ WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard);
+
+ return __hrtimer_start_range_ns(timer, tim, delta_ns, mode, base);
}
/**
@@ -1476,24 +1498,104 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 delta_ns,
debug_hrtimer_assert_init(timer);
+ base = lock_hrtimer_base(timer, &flags);
+
+ switch (hrtimer_start_range_ns_common(timer, tim, delta_ns, mode, base)) {
+ case HRTIMER_REPROGRAM:
+ hrtimer_reprogram(timer, true);
+ break;
+ case HRTIMER_REPROGRAM_FORCE:
+ hrtimer_force_reprogram(timer->base->cpu_base, 1);
+ break;
+ case HRTIMER_REPROGRAM_NONE:
+ break;
+ }
+
+ unlock_hrtimer_base(timer, &flags);
+}
+EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
+
+static inline bool hrtimer_check_user_timer(struct hrtimer *timer)
+{
+ struct hrtimer_cpu_base *cpu_base = timer->base->cpu_base;
+ ktime_t expires;
+
/*
- * Check whether the HRTIMER_MODE_SOFT bit and hrtimer.is_soft
- * match on CONFIG_PREEMPT_RT = n. With PREEMPT_RT check the hard
- * expiry mode because unmarked timers are moved to softirq expiry.
+ * This uses soft expires because that's the user provided
+ * expiry time, while expires can be further in the future
+ * due to a slack value added to the user expiry time.
*/
- if (!IS_ENABLED(CONFIG_PREEMPT_RT))
- WARN_ON_ONCE(!(mode & HRTIMER_MODE_SOFT) ^ !timer->is_soft);
- else
- WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard);
+ expires = hrtimer_get_softexpires(timer);
+
+ /* Convert to monotonic */
+ expires = ktime_sub(expires, timer->base->offset);
+
+ /*
+ * Check whether this timer will end up as the first expiring timer in
+ * the CPU base. If not, no further checks required as it's then
+ * guaranteed to expire in the future.
+ */
+ if (expires >= cpu_base->expires_next)
+ return true;
+
+ /* Validate that the expiry time is in the future. */
+ if (expires > ktime_get())
+ return true;
+
+ debug_hrtimer_deactivate(timer);
+ __remove_hrtimer(timer, timer->base, HRTIMER_STATE_INACTIVE, false);
+ trace_hrtimer_start_expired(timer);
+ return false;
+}
+
+/**
+ * hrtimer_start_range_ns_user - (re)start a user controlled hrtimer
+ * @timer: the timer to be added
+ * @tim: expiry time
+ * @delta_ns: "slack" range for the timer
+ * @mode: timer mode: absolute (HRTIMER_MODE_ABS) or
+ * relative (HRTIMER_MODE_REL), and pinned (HRTIMER_MODE_PINNED);
+ * softirq based mode is considered for debug purpose only!
+ *
+ * Returns: True when the timer was queued, false if it was already expired
+ *
+ * This function cannot invoke the timer callback for expired timers as it might
+ * be called under a lock which the timer callback needs to acquire. So the
+ * caller has to handle that case.
+ */
+bool hrtimer_start_range_ns_user(struct hrtimer *timer, ktime_t tim,
+ u64 delta_ns, const enum hrtimer_mode mode)
+{
+ struct hrtimer_clock_base *base;
+ unsigned long flags;
+ bool ret = true;
+
+ debug_hrtimer_assert_init(timer);
base = lock_hrtimer_base(timer, &flags);
- if (__hrtimer_start_range_ns(timer, tim, delta_ns, mode, base))
- hrtimer_reprogram(timer, true);
+ switch (hrtimer_start_range_ns_common(timer, tim, delta_ns, mode, base)) {
+ case HRTIMER_REPROGRAM:
+ ret = hrtimer_check_user_timer(timer);
+ if (ret)
+ hrtimer_reprogram(timer, true);
+ break;
+ case HRTIMER_REPROGRAM_FORCE:
+ ret = hrtimer_check_user_timer(timer);
+ /*
+ * The base must always be reevaluated, independent of the
+ * result above because the timer was the first pending timer.
+ */
+ hrtimer_force_reprogram(timer->base->cpu_base, 1);
+ break;
+ case HRTIMER_REPROGRAM_NONE:
+ break;
+ }
unlock_hrtimer_base(timer, &flags);
+ return ret;
}
-EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
+EXPORT_SYMBOL_GPL(hrtimer_start_range_ns_user);
/**
* hrtimer_try_to_cancel - try to deactivate a timer
@@ -1681,10 +1783,10 @@ EXPORT_SYMBOL_GPL(__hrtimer_get_remaining);
*
* Returns the next expiry time or KTIME_MAX if no timer is pending.
*/
-u64 hrtimer_get_next_event(void)
+ktime_t hrtimer_get_next_event(void)
{
struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
- u64 expires = KTIME_MAX;
+ ktime_t expires = KTIME_MAX;
guard(raw_spinlock_irqsave)(&cpu_base->lock);
if (!hrtimer_hres_active(cpu_base))
@@ -1700,10 +1802,10 @@ u64 hrtimer_get_next_event(void)
* Returns the next expiry time over all timers except for the @exclude one or
* KTIME_MAX if none of them is pending.
*/
-u64 hrtimer_next_event_without(const struct hrtimer *exclude)
+ktime_t hrtimer_next_event_without(const struct hrtimer *exclude)
{
struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
- u64 expires = KTIME_MAX;
+ ktime_t expires = KTIME_MAX;
unsigned int active;
guard(raw_spinlock_irqsave)(&cpu_base->lock);
@@ -2213,7 +2315,11 @@ void hrtimer_sleeper_start_expires(struct hrtimer_sleeper *sl, enum hrtimer_mode
if (IS_ENABLED(CONFIG_PREEMPT_RT) && sl->timer.is_hard)
mode |= HRTIMER_MODE_HARD;
- hrtimer_start_expires(&sl->timer, mode);
+ /* If already expired, clear the task pointer and set current state to running */
+ if (!hrtimer_start_expires_user(&sl->timer, mode)) {
+ sl->task = NULL;
+ __set_current_state(TASK_RUNNING);
+ }
}
EXPORT_SYMBOL_GPL(hrtimer_sleeper_start_expires);
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 1c954f330dfe..d51428867a33 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -60,15 +60,14 @@ EXPORT_SYMBOL(get_jiffies_64);
EXPORT_SYMBOL(jiffies);
-static int __init init_jiffies_clocksource(void)
-{
- return __clocksource_register(&clocksource_jiffies);
-}
-
-core_initcall(init_jiffies_clocksource);
+static bool cs_jiffies_registered __initdata;
struct clocksource * __init __weak clocksource_default_clock(void)
{
+ if (!cs_jiffies_registered) {
+ __clocksource_register(&clocksource_jiffies);
+ cs_jiffies_registered = true;
+ }
return &clocksource_jiffies;
}
diff --git a/kernel/time/namespace.c b/kernel/time/namespace.c
index 4bca3f78c8ea..5fa0af66cf3f 100644
--- a/kernel/time/namespace.c
+++ b/kernel/time/namespace.c
@@ -57,6 +57,7 @@ ktime_t do_timens_ktime_to_host(clockid_t clockid, ktime_t tim,
return tim;
}
+EXPORT_SYMBOL_GPL(do_timens_ktime_to_host);
static struct ucounts *inc_time_namespaces(struct user_namespace *ns)
{
@@ -351,6 +352,7 @@ struct time_namespace init_time_ns = {
.user_ns = &init_user_ns,
.frozen_offsets = true,
};
+EXPORT_SYMBOL_GPL(init_time_ns);
void __init time_ns_init(void)
{
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 0de2bb7cbec0..74775b94d11b 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -19,7 +19,7 @@
#include "posix-timers.h"
-static void posix_cpu_timer_rearm(struct k_itimer *timer);
+static bool posix_cpu_timer_rearm(struct k_itimer *timer);
void posix_cputimers_group_init(struct posix_cputimers *pct, u64 cpu_limit)
{
@@ -1011,24 +1011,27 @@ static void check_process_timers(struct task_struct *tsk,
/*
* This is called from the signal code (via posixtimer_rearm)
* when the last timer signal was delivered and we have to reload the timer.
+ *
+ * Return true unconditionally so the core code assumes the timer to be
+ * armed. Otherwise it would requeue the signal.
*/
-static void posix_cpu_timer_rearm(struct k_itimer *timer)
+static bool posix_cpu_timer_rearm(struct k_itimer *timer)
{
clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
- struct task_struct *p;
struct sighand_struct *sighand;
+ struct task_struct *p;
unsigned long flags;
u64 now;
- rcu_read_lock();
+ guard(rcu)();
p = cpu_timer_task_rcu(timer);
if (!p)
- goto out;
+ return true;
/* Protect timer list r/w in arm_timer() */
sighand = lock_task_sighand(p, &flags);
if (unlikely(sighand == NULL))
- goto out;
+ return true;
/*
* Fetch the current sample and update the timer's expiry time.
@@ -1045,8 +1048,7 @@ static void posix_cpu_timer_rearm(struct k_itimer *timer)
*/
arm_timer(timer, p);
unlock_task_sighand(p, &flags);
-out:
- rcu_read_unlock();
+ return true;
}
/**
@@ -1504,6 +1506,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
spin_lock_irq(&timer.it_lock);
error = posix_cpu_timer_set(&timer, flags, &it, NULL);
if (error) {
+ posix_cpu_timer_del(&timer);
spin_unlock_irq(&timer.it_lock);
return error;
}
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 9331e1614124..436ba794cc0b 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -288,16 +288,18 @@ static inline int timer_overrun_to_int(struct k_itimer *timr)
return (int)timr->it_overrun_last;
}
-static void common_hrtimer_rearm(struct k_itimer *timr)
+static bool common_hrtimer_rearm(struct k_itimer *timr)
{
struct hrtimer *timer = &timr->it.real.timer;
timr->it_overrun += hrtimer_forward_now(timer, timr->it_interval);
- hrtimer_restart(timer);
+ return hrtimer_start_expires_user(timer, HRTIMER_MODE_ABS);
}
static bool __posixtimer_deliver_signal(struct kernel_siginfo *info, struct k_itimer *timr)
{
+ bool queued;
+
guard(spinlock)(&timr->it_lock);
/*
@@ -311,12 +313,18 @@ static bool __posixtimer_deliver_signal(struct kernel_siginfo *info, struct k_it
if (!timr->it_interval || WARN_ON_ONCE(timr->it_status != POSIX_TIMER_REQUEUE_PENDING))
return true;
- timr->kclock->timer_rearm(timr);
- timr->it_status = POSIX_TIMER_ARMED;
+ /* timer_rearm() updates timr::it_overrun */
+ queued = timr->kclock->timer_rearm(timr);
+
timr->it_overrun_last = timr->it_overrun;
timr->it_overrun = -1LL;
++timr->it_signal_seq;
info->si_overrun = timer_overrun_to_int(timr);
+
+ if (queued)
+ timr->it_status = POSIX_TIMER_ARMED;
+ else
+ posix_timer_queue_signal(timr);
return true;
}
@@ -795,7 +803,7 @@ SYSCALL_DEFINE1(timer_getoverrun, timer_t, timer_id)
return timer_overrun_to_int(scoped_timer);
}
-static void common_hrtimer_arm(struct k_itimer *timr, ktime_t expires,
+static bool common_hrtimer_arm(struct k_itimer *timr, ktime_t expires,
bool absolute, bool sigev_none)
{
struct hrtimer *timer = &timr->it.real.timer;
@@ -820,8 +828,11 @@ static void common_hrtimer_arm(struct k_itimer *timr, ktime_t expires,
expires = ktime_add_safe(expires, hrtimer_cb_get_time(timer));
hrtimer_set_expires(timer, expires);
- if (!sigev_none)
- hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
+ /* For sigev_none pretend that the timer is queued */
+ if (sigev_none)
+ return true;
+
+ return hrtimer_start_expires_user(timer, HRTIMER_MODE_ABS);
}
static int common_hrtimer_try_to_cancel(struct k_itimer *timr)
@@ -903,9 +914,13 @@ int common_timer_set(struct k_itimer *timr, int flags,
expires = timens_ktime_to_host(timr->it_clock, expires);
sigev_none = timr->it_sigev_notify == SIGEV_NONE;
- kc->timer_arm(timr, expires, flags & TIMER_ABSTIME, sigev_none);
- if (!sigev_none)
- timr->it_status = POSIX_TIMER_ARMED;
+ if (kc->timer_arm(timr, expires, flags & TIMER_ABSTIME, sigev_none)) {
+ if (!sigev_none)
+ timr->it_status = POSIX_TIMER_ARMED;
+ } else {
+ /* Timer was already expired, queue the signal */
+ posix_timer_queue_signal(timr);
+ }
return 0;
}
diff --git a/kernel/time/posix-timers.h b/kernel/time/posix-timers.h
index 7f259e845d24..4ea9611dd716 100644
--- a/kernel/time/posix-timers.h
+++ b/kernel/time/posix-timers.h
@@ -27,11 +27,11 @@ struct k_clock {
int (*timer_del)(struct k_itimer *timr);
void (*timer_get)(struct k_itimer *timr,
struct itimerspec64 *cur_setting);
- void (*timer_rearm)(struct k_itimer *timr);
+ bool (*timer_rearm)(struct k_itimer *timr);
s64 (*timer_forward)(struct k_itimer *timr, ktime_t now);
ktime_t (*timer_remaining)(struct k_itimer *timr, ktime_t now);
int (*timer_try_to_cancel)(struct k_itimer *timr);
- void (*timer_arm)(struct k_itimer *timr, ktime_t expires,
+ bool (*timer_arm)(struct k_itimer *timr, ktime_t expires,
bool absolute, bool sigev_none);
void (*timer_wait_running)(struct k_itimer *timr);
};
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cbbb87a0c6e7..3026a301dff7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1407,8 +1407,7 @@ ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
* If the next highres timer to expire is earlier than 'next_event', the
* idle governor needs to know that.
*/
- next_event = min_t(u64, next_event,
- hrtimer_next_event_without(&ts->sched_timer));
+ next_event = min(next_event, hrtimer_next_event_without(&ts->sched_timer));
return ktime_sub(next_event, now);
}
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 04d928c21aba..655a8c6cd84d 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1932,7 +1932,7 @@ static void timer_recalc_next_expiry(struct timer_base *base)
*/
static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
{
- u64 nextevt = hrtimer_get_next_event();
+ u64 nextevt = ktime_to_ns(hrtimer_get_next_event());
/*
* If high resolution timers are enabled
diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 1d0d3a4058d5..e9d96d96e251 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -102,7 +102,7 @@
* active CPU/group information atomic_try_cmpxchg() is used instead and only
* the per CPU tmigr_cpu->lock is held.
*
- * During the setup of groups tmigr_level_list is required. It is protected by
+ * During the setup of groups, hier->level_list is required. It is protected by
* @tmigr_mutex.
*
* When @timer_base->lock as well as tmigr related locks are required, the lock
@@ -416,13 +416,12 @@
*/
static DEFINE_MUTEX(tmigr_mutex);
-static struct list_head *tmigr_level_list __read_mostly;
+
+static LIST_HEAD(tmigr_hierarchy_list);
static unsigned int tmigr_hierarchy_levels __read_mostly;
static unsigned int tmigr_crossnode_level __read_mostly;
-static struct tmigr_group *tmigr_root;
-
static DEFINE_PER_CPU(struct tmigr_cpu, tmigr_cpu);
/*
@@ -1465,6 +1464,34 @@ static long tmigr_trigger_active(void *unused)
return 0;
}
+static unsigned int tmigr_get_capacity(int cpu)
+{
+ /*
+ * nohz_full CPUs need to make sure there is always an available (online)
+ * and never idle migrator to handle all their global timers. That duty
+ * is served by the timekeeper which then never stops its tick. But the
+ * timekeeper must then belong to the same hierarchy as all the nohz_full
+ * CPUs. Simply turn off capacity awareness when nohz_full is running.
+ */
+ if (tick_nohz_full_enabled() || !IS_ENABLED(CONFIG_BROKEN))
+ return SCHED_CAPACITY_SCALE;
+ else
+ return arch_scale_cpu_capacity(cpu);
+}
+
+static struct tmigr_hierarchy *__tmigr_get_hierarchy(int cpu)
+{
+ unsigned int capacity = tmigr_get_capacity(cpu);
+ struct tmigr_hierarchy *iter;
+
+ list_for_each_entry(iter, &tmigr_hierarchy_list, node) {
+ if (iter->capacity == capacity)
+ return iter;
+ }
+
+ return NULL;
+}
+
static int tmigr_clear_cpu_available(unsigned int cpu)
{
struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu);
@@ -1489,8 +1516,21 @@ static int tmigr_clear_cpu_available(unsigned int cpu)
}
if (firstexp != KTIME_MAX) {
- migrator = cpumask_any(tmigr_available_cpumask);
- work_on_cpu(migrator, tmigr_trigger_active, NULL);
+ struct tmigr_hierarchy *hier = __tmigr_get_hierarchy(cpu);
+
+ if (WARN_ON_ONCE(!hier))
+ return -EINVAL;
+
+ migrator = cpumask_any_and(tmigr_available_cpumask, hier->cpumask);
+ if (migrator < nr_cpu_ids) {
+ work_on_cpu(migrator, tmigr_trigger_active, NULL);
+ } else {
+ /*
+ * If deactivation returned an expiration, it belongs to an available
+ * nohz CPU in the hierarchy.
+ */
+ WARN_ONCE(1, "Expected available CPU in the hierarchy\n");
+ }
}
return 0;
@@ -1653,14 +1693,14 @@ static void tmigr_init_group(struct tmigr_group *group, unsigned int lvl,
group->groupevt.ignore = true;
}
-static struct tmigr_group *tmigr_get_group(int node, unsigned int lvl)
+static struct tmigr_group *tmigr_get_group(struct tmigr_hierarchy *hier, int node, unsigned int lvl)
{
struct tmigr_group *tmp, *group = NULL;
lockdep_assert_held(&tmigr_mutex);
/* Try to attach to an existing group first */
- list_for_each_entry(tmp, &tmigr_level_list[lvl], list) {
+ list_for_each_entry(tmp, &hier->level_list[lvl], list) {
/*
* If @lvl is below the cross NUMA node level, check whether
* this group belongs to the same NUMA node.
@@ -1694,14 +1734,14 @@ static struct tmigr_group *tmigr_get_group(int node, unsigned int lvl)
tmigr_init_group(group, lvl, node);
/* Setup successful. Add it to the hierarchy */
- list_add(&group->list, &tmigr_level_list[lvl]);
+ list_add(&group->list, &hier->level_list[lvl]);
trace_tmigr_group_set(group);
return group;
}
-static bool tmigr_init_root(struct tmigr_group *group, bool activate)
+static bool tmigr_init_root(struct tmigr_hierarchy *hier, struct tmigr_group *group, bool activate)
{
- if (!group->parent && group != tmigr_root) {
+ if (!group->parent && group != hier->root) {
/*
* This is the new top-level, prepare its groupmask in advance
* to avoid accidents where yet another new top-level is
@@ -1717,11 +1757,10 @@ static bool tmigr_init_root(struct tmigr_group *group, bool activate)
}
-static void tmigr_connect_child_parent(struct tmigr_group *child,
- struct tmigr_group *parent,
- bool activate)
+static void tmigr_connect_child_parent(struct tmigr_hierarchy *hier, struct tmigr_group *child,
+ struct tmigr_group *parent, bool activate)
{
- if (tmigr_init_root(parent, activate)) {
+ if (tmigr_init_root(hier, parent, activate)) {
/*
* The previous top level had prepared its groupmask already,
* simply account it in advance as the first child. If some groups
@@ -1754,13 +1793,13 @@ static void tmigr_connect_child_parent(struct tmigr_group *child,
*/
smp_store_release(&child->parent, parent);
- trace_tmigr_connect_child_parent(child);
+ trace_tmigr_connect_child_parent(hier, child);
}
-static int tmigr_setup_groups(unsigned int cpu, unsigned int node,
- struct tmigr_group *start, bool activate)
+static int tmigr_setup_groups(struct tmigr_hierarchy *hier, unsigned int cpu,
+ unsigned int node, struct tmigr_group *start, bool activate)
{
- struct tmigr_group *group, *child, **stack;
+ struct tmigr_group *root = hier->root, *group, *child, **stack;
int i, top = 0, err = 0, start_lvl = 0;
bool root_mismatch = false;
@@ -1773,11 +1812,11 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node,
start_lvl = start->level + 1;
}
- if (tmigr_root)
- root_mismatch = tmigr_root->numa_node != node;
+ if (root)
+ root_mismatch = root->numa_node != node;
for (i = start_lvl; i < tmigr_hierarchy_levels; i++) {
- group = tmigr_get_group(node, i);
+ group = tmigr_get_group(hier, node, i);
if (IS_ERR(group)) {
err = PTR_ERR(group);
i--;
@@ -1799,7 +1838,7 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node,
if (group->parent)
break;
if ((!root_mismatch || i >= tmigr_crossnode_level) &&
- list_is_singular(&tmigr_level_list[i]))
+ list_is_singular(&hier->level_list[i]))
break;
}
@@ -1827,15 +1866,15 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node,
tmc->tmgroup = group;
tmc->groupmask = BIT(group->num_children++);
- tmigr_init_root(group, activate);
+ tmigr_init_root(hier, group, activate);
- trace_tmigr_connect_cpu_parent(tmc);
+ trace_tmigr_connect_cpu_parent(hier, tmc);
/* There are no children that need to be connected */
continue;
} else {
child = stack[i - 1];
- tmigr_connect_child_parent(child, group, activate);
+ tmigr_connect_child_parent(hier, child, group, activate);
}
}
@@ -1891,18 +1930,23 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node,
data.childmask = start->groupmask;
__walk_groups_from(tmigr_active_up, &data, start, start->parent);
}
+ } else if (start) {
+ union tmigr_state state;
+
+ /* Remote activation assumes the whole target's hierarchy is inactive */
+ state.state = atomic_read(&start->migr_state);
+ WARN_ON_ONCE(state.active);
}
/* Root update */
- if (list_is_singular(&tmigr_level_list[top])) {
- group = list_first_entry(&tmigr_level_list[top],
- typeof(*group), list);
+ if (list_is_singular(&hier->level_list[top])) {
+ group = list_first_entry(&hier->level_list[top], typeof(*group), list);
WARN_ON_ONCE(group->parent);
- if (tmigr_root) {
+ if (root) {
/* Old root should be the same or below */
- WARN_ON_ONCE(tmigr_root->level > top);
+ WARN_ON_ONCE(root->level > top);
}
- tmigr_root = group;
+ hier->root = group;
}
out:
kfree(stack);
@@ -1910,34 +1954,123 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node,
return err;
}
+static struct tmigr_hierarchy *tmigr_get_hierarchy(int cpu)
+{
+ struct tmigr_hierarchy *hier;
+
+ hier = __tmigr_get_hierarchy(cpu);
+
+ if (hier)
+ return hier;
+
+ hier = kzalloc_flex(*hier, level_list, tmigr_hierarchy_levels);
+ if (!hier)
+ return ERR_PTR(-ENOMEM);
+
+ hier->cpumask = kzalloc(cpumask_size(), GFP_KERNEL);
+ if (!hier->cpumask) {
+ kfree(hier);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ for (int i = 0; i < tmigr_hierarchy_levels; i++)
+ INIT_LIST_HEAD(&hier->level_list[i]);
+
+ hier->capacity = tmigr_get_capacity(cpu);
+ list_add_tail(&hier->node, &tmigr_hierarchy_list);
+
+ return hier;
+}
+
+static int tmigr_connect_old_root(struct tmigr_hierarchy *hier, int cpu,
+ struct tmigr_group *old_root, bool activate)
+{
+ /*
+ * The target CPU must never do the prepare work, except
+ * on early boot when the boot CPU is the target. Otherwise
+ * it may spuriously activate the old top level group inside
+ * the new one (regardless of whether the old top level group is
+ * active or not) and/or release an uninitialized childmask.
+ */
+ WARN_ON_ONCE(cpu == smp_processor_id());
+ if (activate) {
+ /*
+ * The current CPU is expected to be online in the hierarchy,
+ * otherwise the old root may not be active as expected.
+ */
+ WARN_ON_ONCE(!__this_cpu_read(tmigr_cpu.available));
+ }
+
+ return tmigr_setup_groups(hier, -1, old_root->numa_node, old_root, activate);
+}
+
+static long connect_old_root_work(void *arg)
+{
+ struct tmigr_group *old_root = arg;
+ struct tmigr_hierarchy *hier;
+ int cpu = smp_processor_id();
+
+ hier = __tmigr_get_hierarchy(cpu);
+ if (WARN_ON_ONCE(!hier))
+ return -EINVAL;
+
+ return tmigr_connect_old_root(hier, cpu, old_root, true);
+}
+
static int tmigr_add_cpu(unsigned int cpu)
{
- struct tmigr_group *old_root = tmigr_root;
+ struct tmigr_hierarchy *hier;
+ struct tmigr_group *old_root;
int node = cpu_to_node(cpu);
int ret;
guard(mutex)(&tmigr_mutex);
- ret = tmigr_setup_groups(cpu, node, NULL, false);
+ hier = tmigr_get_hierarchy(cpu);
+ if (IS_ERR(hier))
+ return PTR_ERR(hier);
+
+ old_root = hier->root;
+
+ ret = tmigr_setup_groups(hier, cpu, node, NULL, false);
+
+ if (ret < 0)
+ return ret;
/* Root has changed? Connect the old one to the new */
- if (ret >= 0 && old_root && old_root != tmigr_root) {
- /*
- * The target CPU must never do the prepare work, except
- * on early boot when the boot CPU is the target. Otherwise
- * it may spuriously activate the old top level group inside
- * the new one (nevertheless whether old top level group is
- * active or not) and/or release an uninitialized childmask.
- */
- WARN_ON_ONCE(cpu == raw_smp_processor_id());
- /*
- * The (likely) current CPU is expected to be online in the hierarchy,
- * otherwise the old root may not be active as expected.
- */
- WARN_ON_ONCE(!per_cpu_ptr(&tmigr_cpu, raw_smp_processor_id())->available);
- ret = tmigr_setup_groups(-1, old_root->numa_node, old_root, true);
+ if (old_root && old_root != hier->root) {
+ guard(migrate)();
+
+ if (cpumask_test_cpu(smp_processor_id(), hier->cpumask)) {
+ /*
+ * If the target belongs to the same hierarchy, the old root is expected
+ * to be active. Link and propagate to the new root.
+ */
+ ret = tmigr_connect_old_root(hier, cpu, old_root, true);
+ } else {
+ int target = cpumask_first_and(hier->cpumask, tmigr_available_cpumask);
+
+ if (target < nr_cpu_ids) {
+ /*
+ * If the target doesn't belong to the same hierarchy as the current
+ * CPU, activate from a relevant one to make sure the old root is
+ * active.
+ */
+ ret = work_on_cpu(target, connect_old_root_work, old_root);
+ } else {
+ /*
+ * No other available CPUs in the remote hierarchy. Link the
+ * old root remotely but don't propagate activation since the
+ * old root is not expected to be active.
+ */
+ ret = tmigr_connect_old_root(hier, cpu, old_root, false);
+ }
+ }
}
+ if (ret >= 0)
+ cpumask_set_cpu(cpu, hier->cpumask);
+
return ret;
}
@@ -1970,7 +2103,7 @@ static int tmigr_cpu_prepare(unsigned int cpu)
static int __init tmigr_init(void)
{
- unsigned int cpulvl, nodelvl, cpus_per_node, i;
+ unsigned int cpulvl, nodelvl, cpus_per_node;
unsigned int nnodes = num_possible_nodes();
unsigned int ncpus = num_possible_cpus();
int ret = -ENOMEM;
@@ -2017,14 +2150,6 @@ static int __init tmigr_init(void)
*/
tmigr_crossnode_level = cpulvl;
- tmigr_level_list = kzalloc_objs(struct list_head,
- tmigr_hierarchy_levels);
- if (!tmigr_level_list)
- goto err;
-
- for (i = 0; i < tmigr_hierarchy_levels; i++)
- INIT_LIST_HEAD(&tmigr_level_list[i]);
-
pr_info("Timer migration: %d hierarchy levels; %d children per group;"
" %d crossnode level\n",
tmigr_hierarchy_levels, TMIGR_CHILDREN_PER_GROUP,
diff --git a/kernel/time/timer_migration.h b/kernel/time/timer_migration.h
index 70879cde6fdd..31735dd52327 100644
--- a/kernel/time/timer_migration.h
+++ b/kernel/time/timer_migration.h
@@ -5,6 +5,24 @@
/* Per group capacity. Must be a power of 2! */
#define TMIGR_CHILDREN_PER_GROUP 8
+/**
+ * struct tmigr_hierarchy - a hierarchy associated to a given CPU capacity.
+ * Homogeneous systems have only one hierarchy.
+ * Heterogeneous systems have one hierarchy per CPU capacity.
+ * @cpumask: CPUs belonging to this hierarchy
+ * @root: The current root of the hierarchy
+ * @capacity: CPU capacity associated to this hierarchy
+ * @node: Node in the global hierarchy list
+ * @level_list: Per level lists of tmigr groups
+ */
+struct tmigr_hierarchy {
+ struct cpumask *cpumask;
+ struct tmigr_group *root;
+ unsigned long capacity;
+ struct list_head node;
+ struct list_head level_list[];
+};
+
/**
* struct tmigr_event - a timer event associated to a CPU
* @nextevt: The node to enqueue an event in the parent group queue
@@ -75,15 +93,17 @@ struct tmigr_group {
/**
* struct tmigr_cpu - timer migration per CPU group
* @lock: Lock protecting the tmigr_cpu group information
- * @online: Indicates whether the CPU is online; In deactivate path
- * it is required to know whether the migrator in the top
- * level group is to be set offline, while a timer is
- * pending. Then another online CPU needs to be notified to
- * take over the migrator role. Furthermore the information
- * is required in CPU hotplug path as the CPU is able to go
- * idle before the timer migration hierarchy hotplug AP is
- * reached. During this phase, the CPU has to handle the
+ * @available: Indicates whether the CPU is available for handling
+ * global timers. In the deactivate path it is required to
+ * know whether the migrator in the top level group is to
+ * be set offline, while a timer is pending. Then another
+ * available CPU needs to be notified to take over the
+ * migrator role. Furthermore the information is required
+ * in the CPU hotplug path as the CPU is able to go idle
+ * before the timer migration hierarchy hotplug callback is
+ * reached. During this phase, the CPU has to handle the
* global timers on its own and must not act as a migrator.
+
* @idle: Indicates whether the CPU is idle in the timer migration
* hierarchy
* @remote: Is set when timers of the CPU are expired remotely
diff --git a/net/netfilter/xt_IDLETIMER.c b/net/netfilter/xt_IDLETIMER.c
index 517106165ad2..bfcf2d44e93d 100644
--- a/net/netfilter/xt_IDLETIMER.c
+++ b/net/netfilter/xt_IDLETIMER.c
@@ -115,6 +115,21 @@ static void idletimer_tg_alarmproc(struct alarm *alarm, ktime_t now)
schedule_work(&timer->work);
}
+static void idletimer_start_alarm_ktime(struct idletimer_tg *timer, ktime_t timeout)
+{
+ /*
+ * The timer should always be queued as @timeout should be at least one
+ * second, but handle it correctly in any case. Virt will manage!
+ */
+ if (!alarm_start_timer(&timer->alarm, timeout, true))
+ schedule_work(&timer->work);
+}
+
+static void idletimer_start_alarm_sec(struct idletimer_tg *timer, unsigned int seconds)
+{
+ idletimer_start_alarm_ktime(timer, ktime_set(seconds, 0));
+}
+
static int idletimer_check_sysfs_name(const char *name, unsigned int size)
{
int ret;
@@ -220,12 +235,10 @@ static int idletimer_tg_create_v1(struct idletimer_tg_info_v1 *info)
INIT_WORK(&info->timer->work, idletimer_tg_work);
if (info->timer->timer_type & XT_IDLETIMER_ALARM) {
- ktime_t tout;
alarm_init(&info->timer->alarm, ALARM_BOOTTIME,
idletimer_tg_alarmproc);
info->timer->alarm.data = info->timer;
- tout = ktime_set(info->timeout, 0);
- alarm_start_relative(&info->timer->alarm, tout);
+ idletimer_start_alarm_sec(info->timer, info->timeout);
} else {
timer_setup(&info->timer->timer, idletimer_tg_expired, 0);
mod_timer(&info->timer->timer,
@@ -271,8 +284,7 @@ static unsigned int idletimer_tg_target_v1(struct sk_buff *skb,
info->label, info->timeout);
if (info->timer->timer_type & XT_IDLETIMER_ALARM) {
- ktime_t tout = ktime_set(info->timeout, 0);
- alarm_start_relative(&info->timer->alarm, tout);
+ idletimer_start_alarm_sec(info->timer, info->timeout);
} else {
mod_timer(&info->timer->timer,
secs_to_jiffies(info->timeout) + jiffies);
@@ -384,7 +396,7 @@ static int idletimer_tg_checkentry_v1(const struct xt_tgchk_param *par)
if (ktimespec.tv_sec > 0) {
pr_debug("time_expiry_remaining %lld\n",
ktimespec.tv_sec);
- alarm_start_relative(&info->timer->alarm, tout);
+ idletimer_start_alarm_ktime(info->timer, tout);
}
} else {
mod_timer(&info->timer->timer,
diff --git a/scripts/timer_migration_tree.py b/scripts/timer_migration_tree.py
new file mode 100755
index 000000000000..faac9de854bd
--- /dev/null
+++ b/scripts/timer_migration_tree.py
@@ -0,0 +1,122 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Draw the timer migration tree.
+
+1) Boot with trace_event=tmigr_connect_cpu_parent,tmigr_connect_child_parent
+2) ./timer_migration_tree.py < /sys/kernel/tracing/trace
+"""
+
+import re, sys
+from ete3 import Tree
+
+class Node:
+ def __init__(self, group):
+ self.group = group
+ self.children = []
+ self.parent = None
+ self.num_children = 0
+ self.groupmask = 0
+ self.lvl = -1
+
+ def set_groupmask(self, groupmask):
+ self.groupmask = groupmask
+
+ def set_parent(self, parent):
+ self.parent = parent
+
+ def add_child(self, child):
+ self.children.append(child)
+
+ def set_lvl(self, lvl):
+ self.lvl = lvl
+
+ def set_numa(self, numa):
+ self.numa = numa
+
+ def set_num_children(self, num_children):
+ self.num_children = num_children
+
+ def __repr__(self):
+ if self.parent:
+ parent_grp = self.parent.group
+ else:
+ parent_grp = "-"
+ return "Group: %s mask: %s parent: %s lvl: %d numa: %d num_children: %d" % (self.group, self.groupmask, parent_grp, self.lvl, self.numa, self.num_children)
+
+hierarchies = { }
+
+def get_hierarchy(capacity):
+ if capacity not in hierarchies:
+ hierarchies[capacity] = {}
+ return hierarchies[capacity]
+
+def get_node(capacity, group):
+ hier = get_hierarchy(capacity)
+ if group in hier:
+ return hier[group]
+ else:
+ n = Node(group)
+ hier[group] = n
+ return n
+
+def tmigr_connect_cpu_parent(ts, line):
+ s = re.search("tmigr_connect_cpu_parent: cpu=([0-9]+) groupmask=([0-9a-zA-Z]+) parent=([0-9a-zA-Z]+) lvl=([0-9]+) numa=([-]?[0-9]+) capacity=([-]?[0-9]+) num_children=([0-9]+)", line)
+ if s is None:
+ return False
+ (cpu, groupmask, parent, lvl, numa, capacity, num_children) = (int(s.group(1)), s.group(2), s.group(3), int(s.group(4)), int(s.group(5)), int(s.group(6)), int(s.group(7)))
+ n = get_node(capacity, cpu)
+ p = get_node(capacity, parent)
+ n.set_parent(p)
+ n.set_groupmask(groupmask)
+ n.set_lvl(-1)
+ p.set_lvl(lvl)
+ p.set_numa(numa)
+ n.set_numa(numa)
+ p.set_num_children(num_children)
+ p.add_child(n)
+
+def tmigr_connect_child_parent(ts, line):
+ s = re.search("tmigr_connect_child_parent: group=([0-9a-zA-Z]+) groupmask=([0-9a-zA-Z]+) parent=([0-9a-zA-Z]+) lvl=([0-9]+) numa=([-]?[0-9]+) capacity=([-]?[0-9]+) num_children=([0-9]+)", line)
+ if s is None:
+ return False
+ (group, groupmask, parent, lvl, numa, capacity, num_children) = (s.group(1), s.group(2), s.group(3), int(s.group(4)), int(s.group(5)), int(s.group(6)), int(s.group(7)))
+ n = get_node(capacity, group)
+ p = get_node(capacity, parent)
+ n.set_parent(p)
+ n.set_groupmask(groupmask)
+ p.set_lvl(lvl)
+ p.set_numa(numa)
+ p.set_num_children(num_children)
+ p.add_child(n)
+
+def populate(enode, node):
+ enode = enode.add_child(name = node.group)
+ enode.add_feature("groupmask", "m:%s" % node.groupmask)
+ enode.add_feature("lvl", "lvl:%d" % node.lvl)
+ enode.add_feature("numa", "node %d" % node.numa)
+ enode.add_feature("num_children", "c=%d" % node.num_children)
+ for child in node.children:
+ populate(enode, child)
+
+if __name__ == "__main__":
+ for line in sys.stdin:
+ s = re.search("([0-9]+[.][0-9]{6}): (.+?)$", line, re.S)
+ if s is not None:
+ if tmigr_connect_cpu_parent(float(s.group(1)), s.group(2)):
+ continue
+ if tmigr_connect_child_parent(float(s.group(1)), s.group(2)):
+ continue
+
+ for cap in hierarchies:
+ h = hierarchies[cap]
+ print("Tree for capacity %d" % cap)
+ for k in h:
+ n = h[k]
+ while n.parent != None:
+ n = n.parent
+ root = Tree()
+ populate(root, n)
+ print(root.get_ascii(show_internal=True, attributes=["name", "numa", "lvl"]))
+ break
diff --git a/tools/testing/selftests/timers/posix_timers.c b/tools/testing/selftests/timers/posix_timers.c
index 38512623622a..2f3bac9fc6e8 100644
--- a/tools/testing/selftests/timers/posix_timers.c
+++ b/tools/testing/selftests/timers/posix_timers.c
@@ -78,19 +78,25 @@ static void sig_handler(int nr)
done = 1;
}
+static inline int64_t calcdiff_ns(struct timespec t1, struct timespec t2)
+{
+ int64_t diff;
+
+ diff = NSEC_PER_SEC * (int64_t)((int) t1.tv_sec - (int) t2.tv_sec);
+ diff += ((int) t1.tv_nsec - (int) t2.tv_nsec);
+ return diff;
+}
+
/*
* Check the expected timer expiration matches the GTOD elapsed delta since
* we armed the timer. Keep a 0.5 sec error margin due to various jitter.
*/
-static int check_diff(struct timeval start, struct timeval end)
+static int check_diff(struct timespec start, struct timespec end)
{
- long long diff;
-
- diff = end.tv_usec - start.tv_usec;
- diff += (end.tv_sec - start.tv_sec) * USEC_PER_SEC;
+ long long diff = calcdiff_ns(end, start);
- if (llabs(diff - DELAY * USEC_PER_SEC) > USEC_PER_SEC / 2) {
- printf("Diff too high: %lld..", diff);
+ if (llabs(diff - DELAY * NSEC_PER_SEC) > NSEC_PER_SEC / 2) {
+ printf("Diff too high: %lld ns..", diff);
return -1;
}
@@ -99,22 +105,25 @@ static int check_diff(struct timeval start, struct timeval end)
static void check_itimer(int which, const char *name)
{
- struct timeval start, end;
+ struct timespec start, end;
struct itimerval val = {
.it_value.tv_sec = DELAY,
};
+ int clock_id = CLOCK_REALTIME;
done = 0;
if (which == ITIMER_VIRTUAL)
signal(SIGVTALRM, sig_handler);
- else if (which == ITIMER_PROF)
+ else if (which == ITIMER_PROF) {
+ clock_id = CLOCK_THREAD_CPUTIME_ID;
signal(SIGPROF, sig_handler);
+ }
else if (which == ITIMER_REAL)
signal(SIGALRM, sig_handler);
- if (gettimeofday(&start, NULL) < 0)
- fatal_error(name, "gettimeofday()");
+ if (clock_gettime(clock_id, &start))
+ fatal_error(name, "clock_gettime()");
if (setitimer(which, &val, NULL) < 0)
fatal_error(name, "setitimer()");
@@ -126,18 +135,19 @@ static void check_itimer(int which, const char *name)
else if (which == ITIMER_REAL)
idle_loop();
- if (gettimeofday(&end, NULL) < 0)
- fatal_error(name, "gettimeofday()");
+ if (clock_gettime(clock_id, &end))
+ fatal_error(name, "clock_gettime()");
ksft_test_result(check_diff(start, end) == 0, "%s\n", name);
}
static void check_timer_create(int which, const char *name)
{
- struct timeval start, end;
+ struct timespec start, end;
struct itimerspec val = {
.it_value.tv_sec = DELAY,
};
+ int clock_id = CLOCK_REALTIME;
timer_t id;
done = 0;
@@ -148,16 +158,16 @@ static void check_timer_create(int which, const char *name)
if (signal(SIGALRM, sig_handler) == SIG_ERR)
fatal_error(name, "signal()");
- if (gettimeofday(&start, NULL) < 0)
- fatal_error(name, "gettimeofday()");
+ if (clock_gettime(clock_id, &start))
+ fatal_error(name, "clock_gettime()");
if (timer_settime(id, 0, &val, NULL) < 0)
fatal_error(name, "timer_settime()");
user_loop();
- if (gettimeofday(&end, NULL) < 0)
- fatal_error(name, "gettimeofday()");
+ if (clock_gettime(clock_id, &end))
+ fatal_error(name, "clock_gettime()");
ksft_test_result(check_diff(start, end) == 0,
"timer_create() per %s\n", name);
@@ -445,15 +455,6 @@ static void check_delete(void)
ksft_test_result(!tsig.signals, "check_delete\n");
}
-static inline int64_t calcdiff_ns(struct timespec t1, struct timespec t2)
-{
- int64_t diff;
-
- diff = NSEC_PER_SEC * (int64_t)((int) t1.tv_sec - (int) t2.tv_sec);
- diff += ((int) t1.tv_nsec - (int) t2.tv_nsec);
- return diff;
-}
-
static void check_sigev_none(int which, const char *name)
{
struct timespec start, now;
* [GIT pull] timers/nohz for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
` (5 preceding siblings ...)
2026-06-13 21:25 ` [GIT pull] timers/core " Thomas Gleixner
@ 2026-06-13 21:25 ` Thomas Gleixner
2026-06-15 8:00 ` Ingo Molnar
2026-06-15 8:51 ` pr-tracker-bot
2026-06-13 21:25 ` [GIT pull] timers/ptp " Thomas Gleixner
` (2 subsequent siblings)
9 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest timers/nohz branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-nohz-2026-06-13
up to: 6199f9999a9b: sched/cputime: Handle dyntick-idle steal time correctly
Updates for the NOHZ subsystem:
- Fix a long-standing TOCTOU in get_cpu_sleep_time_us()
- Make the CPU offline NOHZ handling more robust by disabling NOHZ on the
outgoing CPU early instead of creating unneeded state which needs to be
undone.
- Unify idle CPU time accounting instead of having two different
accounting mechanisms. These two mechanisms are not really
independent, and their different properties can in the worst case cause
global idle time to be observed going backwards.
- Consolidate the idle/iowait time retrieval interfaces instead of
converting back and forth between them.
- Make idle interrupt time accounting more robust. The original code
assumes that interrupt time accounting is enabled and therefore stops
elapsing idle time while an interrupt is handled in NOHZ dyntick
state. That assumption is not correct as interrupt time accounting can
be disabled at compile and runtime.
- Fix an accounting error between dyntick idle time and dyntick idle
steal time. The stolen time is not accounted and therefore idle time
becomes inaccurate. The stolen time is now accounted after the fact as
there is no way to predict the steal time upfront.
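
As a rough illustration of the "idle time going backwards" symptom mentioned
above, the following userspace sketch compares two samples of /proc/stat and
reports CPUs whose idle or iowait counters decreased. It is not part of this
pull request; the helper names parse_idle() and backwards() are hypothetical,
and the only assumption about the kernel is the documented /proc/stat layout
(cpuN user nice system idle iowait ...), with values in USER_HZ ticks. A
reported decrease is only a hint: iowait in particular is documented as
unreliable.

```python
def parse_idle(stat_text):
    """Return {cpu_name: (idle, iowait)} in USER_HZ ticks from /proc/stat text."""
    result = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # CPU lines look like "cpu0 user nice system idle iowait ...";
        # the aggregate line uses the bare name "cpu".
        if len(fields) < 6 or not fields[0].startswith("cpu"):
            continue
        result[fields[0]] = (int(fields[4]), int(fields[5]))
    return result


def backwards(before, after):
    """List (cpu, field) pairs whose counter decreased between two samples."""
    found = []
    for cpu, (idle0, iowait0) in before.items():
        if cpu not in after:
            continue  # CPU line missing from the second sample
        idle1, iowait1 = after[cpu]
        if idle1 < idle0:
            found.append((cpu, "idle"))
        if iowait1 < iowait0:
            found.append((cpu, "iowait"))
    return found
```

To use it, read /proc/stat twice with a short sleep in between, pass each
snapshot through parse_idle(), and hand both results to backwards(); an empty
list means no counter was seen decreasing in that sample pair.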
Thanks,
tglx
------------------>
Frederic Weisbecker (15):
tick/sched: Fix TOCTOU in nohz idle time fetch
sched/idle: Handle offlining first in idle loop
sched/cputime: Remove superfluous and error prone kcpustat_field() parameter
sched/cputime: Correctly support generic vtime idle time
powerpc/time: Prepare to stop elapsing in dynticks-idle
s390/time: Prepare to stop elapsing in dynticks-idle
tick/sched: Unify idle cputime accounting
tick/sched: Remove nohz disabled special case in cputime fetch
tick/sched: Move dyntick-idle cputime accounting to cputime code
tick/sched: Remove unused fields
tick/sched: Account tickless idle cputime only when tick is stopped
tick/sched: Consolidate idle time fetching APIs
sched/cputime: Provide get_cpu_[idle|iowait]_time_us() off-case
sched/cputime: Handle idle irqtime gracefully
sched/cputime: Handle dyntick-idle steal time correctly
arch/powerpc/kernel/time.c | 41 +++++
arch/s390/include/asm/idle.h | 2 +
arch/s390/kernel/idle.c | 5 +-
arch/s390/kernel/vtime.c | 75 ++++++++-
drivers/cpufreq/cpufreq.c | 29 +---
drivers/cpufreq/cpufreq_governor.c | 6 +-
drivers/macintosh/rack-meter.c | 2 +-
fs/proc/stat.c | 40 +----
fs/proc/uptime.c | 8 +-
include/linux/kernel_stat.h | 76 +++++++--
include/linux/tick.h | 4 -
include/linux/vtime.h | 22 ++-
kernel/rcu/tree.c | 9 +-
kernel/rcu/tree_stall.h | 7 +-
kernel/sched/core.c | 6 +-
kernel/sched/cputime.c | 308 +++++++++++++++++++++++++++++++------
kernel/sched/idle.c | 13 +-
kernel/time/tick-sched.c | 212 ++++++-------------------
kernel/time/tick-sched.h | 12 --
kernel/time/timer_list.c | 6 +-
scripts/gdb/linux/timerlist.py | 4 -
21 files changed, 529 insertions(+), 358 deletions(-)
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index b4472288e0d4..3460d1a5a97c 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -376,6 +376,47 @@ void vtime_task_switch(struct task_struct *prev)
acct->starttime = acct0->starttime;
}
}
+
+#ifdef CONFIG_NO_HZ_COMMON
+/**
+ * vtime_reset - Fast forward vtime entry clocks
+ *
+ * Called from dynticks idle IRQ entry to fast-forward the clocks to current time
+ * so that the IRQ time is still accounted by vtime while nohz cputime is paused.
+ */
+void vtime_reset(void)
+{
+ struct cpu_accounting_data *acct = get_accounting(current);
+
+ acct->starttime = mftb();
+#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
+ acct->startspurr = read_spurr(acct->starttime);
+#endif
+}
+
+/**
+ * vtime_dyntick_start - Inform vtime about entry to idle-dynticks
+ *
+ * Called when idle enters in dyntick mode. The idle cputime that elapsed so far
+ * is accumulated and the tick subsystem takes over the idle cputime accounting.
+ */
+void vtime_dyntick_start(void)
+{
+ vtime_account_idle(current);
+}
+
+/**
+ * vtime_dyntick_stop - Inform vtime about exit from idle-dynticks
+ *
+ * Called when idle exits from dyntick mode. The vtime entry clocks are
+ * fast-forwarded to the current time so that idle accounting restarts
+ * elapsing from now.
+ */
+void vtime_dyntick_stop(void)
+{
+ vtime_reset();
+}
+#endif /* CONFIG_NO_HZ_COMMON */
#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
void __no_kcsan __delay(unsigned long loops)
diff --git a/arch/s390/include/asm/idle.h b/arch/s390/include/asm/idle.h
index 32536ee34aa0..e4ad09a22400 100644
--- a/arch/s390/include/asm/idle.h
+++ b/arch/s390/include/asm/idle.h
@@ -8,10 +8,12 @@
#ifndef _S390_IDLE_H
#define _S390_IDLE_H
+#include <linux/percpu-defs.h>
#include <linux/types.h>
#include <linux/device.h>
struct s390_idle_data {
+ bool idle_dyntick;
unsigned long idle_count;
unsigned long idle_time;
unsigned long clock_idle_enter;
diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
index 1f1b06b6b4ef..4685d7c5bc51 100644
--- a/arch/s390/kernel/idle.c
+++ b/arch/s390/kernel/idle.c
@@ -31,7 +31,10 @@ void account_idle_time_irq(void)
/* Account time spent with enabled wait psw loaded as idle time. */
__atomic64_add(idle_time, &idle->idle_time);
__atomic64_add_const(1, &idle->idle_count);
- account_idle_time(cputime_to_nsecs(idle_time));
+
+ /* Dyntick idle time accounted by nohz/scheduler */
+ if (!idle->idle_dyntick)
+ account_idle_time(cputime_to_nsecs(idle_time));
}
void noinstr arch_cpu_idle(void)
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index bf48744d0912..d1102a6f80bd 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -17,6 +17,7 @@
#include <asm/vtimer.h>
#include <asm/vtime.h>
#include <asm/cpu_mf.h>
+#include <asm/idle.h>
#include <asm/smp.h>
#include "entry.h"
@@ -110,6 +111,16 @@ static void account_system_index_scaled(struct task_struct *p, u64 cputime,
account_system_index_time(p, cputime_to_nsecs(cputime), index);
}
+static inline void vtime_reset_last_update(struct lowcore *lc)
+{
+ asm volatile(
+ " stpt %0\n" /* Store current cpu timer value */
+ " stckf %1" /* Store current tod clock value */
+ : "=Q" (lc->last_update_timer),
+ "=Q" (lc->last_update_clock)
+ : : "cc");
+}
+
/*
* Update process times based on virtual cpu times stored by entry.S
* to the lowcore fields user_timer, system_timer & steal_clock.
@@ -121,17 +132,16 @@ static int do_account_vtime(struct task_struct *tsk)
timer = lc->last_update_timer;
clock = lc->last_update_clock;
- asm volatile(
- " stpt %0\n" /* Store current cpu timer value */
- " stckf %1" /* Store current tod clock value */
- : "=Q" (lc->last_update_timer),
- "=Q" (lc->last_update_clock)
- : : "cc");
+
+ vtime_reset_last_update(lc);
+
clock = lc->last_update_clock - clock;
timer -= lc->last_update_timer;
if (hardirq_count())
lc->hardirq_timer += timer;
+ else if (in_serving_softirq())
+ lc->softirq_timer += timer;
else
lc->system_timer += timer;
@@ -231,13 +241,62 @@ EXPORT_SYMBOL_GPL(vtime_account_kernel);
void vtime_account_softirq(struct task_struct *tsk)
{
- get_lowcore()->softirq_timer += vtime_delta();
+ if (!__this_cpu_read(s390_idle.idle_dyntick))
+ get_lowcore()->softirq_timer += vtime_delta();
+ else
+ vtime_flush(tsk);
}
void vtime_account_hardirq(struct task_struct *tsk)
{
- get_lowcore()->hardirq_timer += vtime_delta();
+ if (!__this_cpu_read(s390_idle.idle_dyntick)) {
+ get_lowcore()->hardirq_timer += vtime_delta();
+ } else {
+ /*
+ * In dynticks mode, the idle cputime is accounted by the nohz
+ * subsystem. Therefore the s390 timer/clocks are reset on IRQ
+ * entry and steal time must be accounted now.
+ */
+ vtime_flush(tsk);
+ }
+}
+
+#ifdef CONFIG_NO_HZ_COMMON
+/**
+ * vtime_reset - Fast forward vtime entry clocks
+ *
+ * Called from dynticks idle IRQ entry to fast-forward the clocks to current time
+ * so that the IRQ time is still accounted by vtime while nohz cputime is paused.
+ */
+void vtime_reset(void)
+{
+ vtime_reset_last_update(get_lowcore());
+}
+
+/**
+ * vtime_dyntick_start - Inform vtime about entry to idle-dynticks
+ *
+ * Called when idle enters in dyntick mode. The idle cputime that elapsed so far
+ * is flushed and the tick subsystem takes over the idle cputime accounting.
+ */
+void vtime_dyntick_start(void)
+{
+ __this_cpu_write(s390_idle.idle_dyntick, true);
+ vtime_flush(current);
+}
+
+/**
+ * vtime_dyntick_stop - Inform vtime about exit from idle-dynticks
+ *
+ * Called when idle exits from dyntick mode. The vtime entry clocks are
+ * fast-forwarded to the current time and idle accounting resumes.
+ */
+void vtime_dyntick_stop(void)
+{
+ vtime_reset_last_update(get_lowcore());
+ __this_cpu_write(s390_idle.idle_dyntick, false);
}
+#endif /* CONFIG_NO_HZ_COMMON */
/*
* Sorted add to a list. List is linear searched until first bigger
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 44eb1b7e7fc1..dda0d34d3c02 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -130,38 +130,11 @@ struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
}
EXPORT_SYMBOL_GPL(get_governor_parent_kobj);
-static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
-{
- struct kernel_cpustat kcpustat;
- u64 cur_wall_time;
- u64 idle_time;
- u64 busy_time;
-
- cur_wall_time = jiffies64_to_nsecs(get_jiffies_64());
-
- kcpustat_cpu_fetch(&kcpustat, cpu);
-
- busy_time = kcpustat.cpustat[CPUTIME_USER];
- busy_time += kcpustat.cpustat[CPUTIME_SYSTEM];
- busy_time += kcpustat.cpustat[CPUTIME_IRQ];
- busy_time += kcpustat.cpustat[CPUTIME_SOFTIRQ];
- busy_time += kcpustat.cpustat[CPUTIME_STEAL];
- busy_time += kcpustat.cpustat[CPUTIME_NICE];
-
- idle_time = cur_wall_time - busy_time;
- if (wall)
- *wall = div_u64(cur_wall_time, NSEC_PER_USEC);
-
- return div_u64(idle_time, NSEC_PER_USEC);
-}
-
u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy)
{
u64 idle_time = get_cpu_idle_time_us(cpu, io_busy ? wall : NULL);
- if (idle_time == -1ULL)
- return get_cpu_idle_time_jiffy(cpu, wall);
- else if (!io_busy)
+ if (!io_busy)
idle_time += get_cpu_iowait_time_us(cpu, wall);
return idle_time;
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 86f35e451914..3c4a1f9af3ae 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -105,7 +105,7 @@ void gov_update_cpu_data(struct dbs_data *dbs_data)
j_cdbs->prev_cpu_idle = get_cpu_idle_time(j, &j_cdbs->prev_update_time,
dbs_data->io_is_busy);
if (dbs_data->ignore_nice_load)
- j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
+ j_cdbs->prev_cpu_nice = kcpustat_field(CPUTIME_NICE, j);
}
}
}
@@ -165,7 +165,7 @@ unsigned int dbs_update(struct cpufreq_policy *policy)
j_cdbs->prev_cpu_idle = cur_idle_time;
if (ignore_nice) {
- u64 cur_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
+ u64 cur_nice = kcpustat_field(CPUTIME_NICE, j);
idle_time += div_u64(cur_nice - j_cdbs->prev_cpu_nice, NSEC_PER_USEC);
j_cdbs->prev_cpu_nice = cur_nice;
@@ -539,7 +539,7 @@ int cpufreq_dbs_governor_start(struct cpufreq_policy *policy)
j_cdbs->prev_load = 0;
if (ignore_nice)
- j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
+ j_cdbs->prev_cpu_nice = kcpustat_field(CPUTIME_NICE, j);
}
gov->start(policy);
diff --git a/drivers/macintosh/rack-meter.c b/drivers/macintosh/rack-meter.c
index 8a1e2c08b096..26cb93191ede 100644
--- a/drivers/macintosh/rack-meter.c
+++ b/drivers/macintosh/rack-meter.c
@@ -87,7 +87,7 @@ static inline u64 get_cpu_idle_time(unsigned int cpu)
kcpustat->cpustat[CPUTIME_IOWAIT];
if (rackmeter_ignore_nice)
- retval += kcpustat_field(kcpustat, CPUTIME_NICE, cpu);
+ retval += kcpustat_field(CPUTIME_NICE, cpu);
return retval;
}
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 8b444e862319..c00468a83f64 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -22,38 +22,6 @@
#define arch_irq_stat() 0
#endif
-u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
-{
- u64 idle, idle_usecs = -1ULL;
-
- if (cpu_online(cpu))
- idle_usecs = get_cpu_idle_time_us(cpu, NULL);
-
- if (idle_usecs == -1ULL)
- /* !NO_HZ or cpu offline so we can rely on cpustat.idle */
- idle = kcs->cpustat[CPUTIME_IDLE];
- else
- idle = idle_usecs * NSEC_PER_USEC;
-
- return idle;
-}
-
-static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
-{
- u64 iowait, iowait_usecs = -1ULL;
-
- if (cpu_online(cpu))
- iowait_usecs = get_cpu_iowait_time_us(cpu, NULL);
-
- if (iowait_usecs == -1ULL)
- /* !NO_HZ or cpu offline so we can rely on cpustat.iowait */
- iowait = kcs->cpustat[CPUTIME_IOWAIT];
- else
- iowait = iowait_usecs * NSEC_PER_USEC;
-
- return iowait;
-}
-
static void show_irq_gap(struct seq_file *p, unsigned int gap)
{
static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
@@ -105,8 +73,8 @@ static int show_stat(struct seq_file *p, void *v)
user += cpustat[CPUTIME_USER];
nice += cpustat[CPUTIME_NICE];
system += cpustat[CPUTIME_SYSTEM];
- idle += get_idle_time(&kcpustat, i);
- iowait += get_iowait_time(&kcpustat, i);
+ idle += cpustat[CPUTIME_IDLE];
+ iowait += cpustat[CPUTIME_IOWAIT];
irq += cpustat[CPUTIME_IRQ];
softirq += cpustat[CPUTIME_SOFTIRQ];
steal += cpustat[CPUTIME_STEAL];
@@ -146,8 +114,8 @@ static int show_stat(struct seq_file *p, void *v)
user = cpustat[CPUTIME_USER];
nice = cpustat[CPUTIME_NICE];
system = cpustat[CPUTIME_SYSTEM];
- idle = get_idle_time(&kcpustat, i);
- iowait = get_iowait_time(&kcpustat, i);
+ idle = cpustat[CPUTIME_IDLE];
+ iowait = cpustat[CPUTIME_IOWAIT];
irq = cpustat[CPUTIME_IRQ];
softirq = cpustat[CPUTIME_SOFTIRQ];
steal = cpustat[CPUTIME_STEAL];
diff --git a/fs/proc/uptime.c b/fs/proc/uptime.c
index b5343d209381..433aa947cd57 100644
--- a/fs/proc/uptime.c
+++ b/fs/proc/uptime.c
@@ -18,12 +18,8 @@ static int uptime_proc_show(struct seq_file *m, void *v)
int i;
idle_nsec = 0;
- for_each_possible_cpu(i) {
- struct kernel_cpustat kcs;
-
- kcpustat_cpu_fetch(&kcs, i);
- idle_nsec += get_idle_time(&kcs, i);
- }
+ for_each_possible_cpu(i)
+ idle_nsec += kcpustat_field(CPUTIME_IDLE, i);
ktime_get_boottime_ts64(&uptime);
timens_add_boottime(&uptime);
diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index b97ce2df376f..fce1392e2140 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -34,7 +34,14 @@ enum cpu_usage_stat {
};
struct kernel_cpustat {
- u64 cpustat[NR_STATS];
+#ifdef CONFIG_NO_HZ_COMMON
+ bool idle_dyntick;
+ bool idle_elapse;
+ seqcount_t idle_sleeptime_seq;
+ u64 idle_entrytime;
+ u64 idle_stealtime[2];
+#endif
+ u64 cpustat[NR_STATS];
};
struct kernel_stat {
@@ -99,23 +106,68 @@ static inline unsigned long kstat_cpu_irqs_sum(unsigned int cpu)
return kstat_cpu(cpu).irqs_sum;
}
+#ifdef CONFIG_NO_HZ_COMMON
+extern void kcpustat_dyntick_start(u64 now);
+extern void kcpustat_dyntick_stop(u64 now);
+extern void kcpustat_irq_enter(u64 now);
+extern void kcpustat_irq_exit(u64 now);
+extern u64 kcpustat_field_idle(int cpu);
+extern u64 kcpustat_field_iowait(int cpu);
+
+static inline bool kcpustat_idle_dyntick(void)
+{
+ return __this_cpu_read(kernel_cpustat.idle_dyntick);
+}
+#else
+static inline u64 kcpustat_field_idle(int cpu)
+{
+ return kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE];
+}
+static inline u64 kcpustat_field_iowait(int cpu)
+{
+ return kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT];
+}
+
+static inline bool kcpustat_idle_dyntick(void)
+{
+ return false;
+}
+#endif /* CONFIG_NO_HZ_COMMON */
+
+extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
+extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
+
+/* Fetch cputime values when vtime is disabled on a CPU */
+static inline u64 kcpustat_field_default(enum cpu_usage_stat usage, int cpu)
+{
+ if (usage == CPUTIME_IDLE)
+ return kcpustat_field_idle(cpu);
+ if (usage == CPUTIME_IOWAIT)
+ return kcpustat_field_iowait(cpu);
+ return kcpustat_cpu(cpu).cpustat[usage];
+}
+
+static inline void kcpustat_cpu_fetch_default(struct kernel_cpustat *dst, int cpu)
+{
+ *dst = kcpustat_cpu(cpu);
+ dst->cpustat[CPUTIME_IDLE] = kcpustat_field_idle(cpu);
+ dst->cpustat[CPUTIME_IOWAIT] = kcpustat_field_iowait(cpu);
+}
+
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-extern u64 kcpustat_field(struct kernel_cpustat *kcpustat,
- enum cpu_usage_stat usage, int cpu);
+extern u64 kcpustat_field(enum cpu_usage_stat usage, int cpu);
extern void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu);
#else
-static inline u64 kcpustat_field(struct kernel_cpustat *kcpustat,
- enum cpu_usage_stat usage, int cpu)
+static inline u64 kcpustat_field(enum cpu_usage_stat usage, int cpu)
{
- return kcpustat->cpustat[usage];
+ return kcpustat_field_default(usage, cpu);
}
static inline void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu)
{
- *dst = kcpustat_cpu(cpu);
+ kcpustat_cpu_fetch_default(dst, cpu);
}
-
-#endif
+#endif /* !CONFIG_VIRT_CPU_ACCOUNTING_GEN */
extern void account_user_time(struct task_struct *, u64);
extern void account_guest_time(struct task_struct *, u64);
@@ -124,19 +176,17 @@ extern void account_system_index_time(struct task_struct *, u64,
enum cpu_usage_stat);
extern void account_steal_time(u64);
extern void account_idle_time(u64);
-extern u64 get_idle_time(struct kernel_cpustat *kcs, int cpu);
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
static inline void account_process_tick(struct task_struct *tsk, int user)
{
- vtime_flush(tsk);
+ if (!kcpustat_idle_dyntick())
+ vtime_flush(tsk);
}
#else
extern void account_process_tick(struct task_struct *, int user);
#endif
-extern void account_idle_ticks(unsigned long ticks);
-
#ifdef CONFIG_SCHED_CORE
extern void __account_forceidle_time(struct task_struct *tsk, u64 delta);
#endif
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 738007d6f577..1cf4651f09ad 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -139,8 +139,6 @@ extern bool tick_nohz_idle_got_tick(void);
extern ktime_t tick_nohz_get_next_hrtimer(void);
extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
-extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
-extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
#else /* !CONFIG_NO_HZ_COMMON */
#define tick_nohz_enabled (0)
static inline bool tick_nohz_is_active(void) { return false; }
@@ -162,8 +160,6 @@ static inline ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next)
*delta_next = TICK_NSEC;
return *delta_next;
}
-static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
-static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
#endif /* !CONFIG_NO_HZ_COMMON */
/*
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 29dd5b91dd7d..9dc25b04a119 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -10,7 +10,6 @@
*/
#ifdef CONFIG_VIRT_CPU_ACCOUNTING
extern void vtime_account_kernel(struct task_struct *tsk);
-extern void vtime_account_idle(struct task_struct *tsk);
#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
@@ -27,16 +26,33 @@ static inline void vtime_guest_exit(struct task_struct *tsk) { }
static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { }
#endif
+static inline bool vtime_generic_enabled_cpu(int cpu)
+{
+ return context_tracking_enabled_cpu(cpu);
+}
+
+static inline bool vtime_generic_enabled_this_cpu(void)
+{
+ return context_tracking_enabled_this_cpu();
+}
+
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+extern void vtime_account_idle(struct task_struct *tsk);
extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset);
extern void vtime_account_softirq(struct task_struct *tsk);
extern void vtime_account_hardirq(struct task_struct *tsk);
extern void vtime_flush(struct task_struct *tsk);
+extern void vtime_reset(void);
+extern void vtime_dyntick_start(void);
+extern void vtime_dyntick_stop(void);
#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
static inline void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
static inline void vtime_account_softirq(struct task_struct *tsk) { }
static inline void vtime_account_hardirq(struct task_struct *tsk) { }
static inline void vtime_flush(struct task_struct *tsk) { }
+static inline void vtime_reset(void) { }
+static inline void vtime_dyntick_start(void) { }
+static inline void vtime_dyntick_stop(void) { }
#endif
/*
@@ -74,12 +90,12 @@ static inline bool vtime_accounting_enabled(void)
static inline bool vtime_accounting_enabled_cpu(int cpu)
{
- return context_tracking_enabled_cpu(cpu);
+ return vtime_generic_enabled_cpu(cpu);
}
static inline bool vtime_accounting_enabled_this_cpu(void)
{
- return context_tracking_enabled_this_cpu();
+ return vtime_generic_enabled_this_cpu();
}
extern void vtime_task_switch_generic(struct task_struct *prev);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 55df6d37145e..3cbf79bee976 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -969,14 +969,11 @@ static int rcu_watching_snap_recheck(struct rcu_data *rdp)
if (rcu_cpu_stall_cputime && rdp->snap_record.gp_seq != rdp->gp_seq) {
int cpu = rdp->cpu;
struct rcu_snap_record *rsrp;
- struct kernel_cpustat *kcsp;
-
- kcsp = &kcpustat_cpu(cpu);
rsrp = &rdp->snap_record;
- rsrp->cputime_irq = kcpustat_field(kcsp, CPUTIME_IRQ, cpu);
- rsrp->cputime_softirq = kcpustat_field(kcsp, CPUTIME_SOFTIRQ, cpu);
- rsrp->cputime_system = kcpustat_field(kcsp, CPUTIME_SYSTEM, cpu);
+ rsrp->cputime_irq = kcpustat_field(CPUTIME_IRQ, cpu);
+ rsrp->cputime_softirq = kcpustat_field(CPUTIME_SOFTIRQ, cpu);
+ rsrp->cputime_system = kcpustat_field(CPUTIME_SYSTEM, cpu);
rsrp->nr_hardirqs = kstat_cpu_irqs_sum(cpu) + arch_irq_stat_cpu(cpu);
rsrp->nr_softirqs = kstat_cpu_softirqs_sum(cpu);
rsrp->nr_csw = nr_context_switches_cpu(cpu);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index b67532cb8770..cf7ae51cba40 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -479,7 +479,6 @@ static void print_cpu_stat_info(int cpu)
{
struct rcu_snap_record rsr, *rsrp;
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
- struct kernel_cpustat *kcsp = &kcpustat_cpu(cpu);
if (!rcu_cpu_stall_cputime)
return;
@@ -488,9 +487,9 @@ static void print_cpu_stat_info(int cpu)
if (rsrp->gp_seq != rdp->gp_seq)
return;
- rsr.cputime_irq = kcpustat_field(kcsp, CPUTIME_IRQ, cpu);
- rsr.cputime_softirq = kcpustat_field(kcsp, CPUTIME_SOFTIRQ, cpu);
- rsr.cputime_system = kcpustat_field(kcsp, CPUTIME_SYSTEM, cpu);
+ rsr.cputime_irq = kcpustat_field(CPUTIME_IRQ, cpu);
+ rsr.cputime_softirq = kcpustat_field(CPUTIME_SOFTIRQ, cpu);
+ rsr.cputime_system = kcpustat_field(CPUTIME_SYSTEM, cpu);
pr_err("\t hardirqs softirqs csw/system\n");
pr_err("\t number: %8lld %10d %12lld\n",
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b8871449d3c6..d797d6696c58 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5518,7 +5518,11 @@ void sched_exec(void)
}
DEFINE_PER_CPU(struct kernel_stat, kstat);
-DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
+DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat) = {
+#ifdef CONFIG_NO_HZ_COMMON
+ .idle_sleeptime_seq = SEQCNT_ZERO(kernel_cpustat.idle_sleeptime_seq)
+#endif
+};
EXPORT_PER_CPU_SYMBOL(kstat);
EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index fbf31db0d2f3..244b57417240 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -2,6 +2,7 @@
/*
* Simple CPU accounting cgroup controller
*/
+#include <linux/sched/clock.h>
#include <linux/sched/cputime.h>
#include <linux/tsacct_kern.h>
#include "sched.h"
@@ -46,7 +47,8 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
u64_stats_update_begin(&irqtime->sync);
cpustat[idx] += delta;
irqtime->total += delta;
- irqtime->tick_delta += delta;
+ if (!kcpustat_idle_dyntick())
+ irqtime->tick_delta += delta;
u64_stats_update_end(&irqtime->sync);
}
@@ -414,16 +416,219 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
}
}
-static void irqtime_account_idle_ticks(int ticks)
-{
- irqtime_account_process_tick(current, 0, ticks);
-}
#else /* !CONFIG_IRQ_TIME_ACCOUNTING: */
-static inline void irqtime_account_idle_ticks(int ticks) { }
static inline void irqtime_account_process_tick(struct task_struct *p, int user_tick,
int nr_ticks) { }
#endif /* !CONFIG_IRQ_TIME_ACCOUNTING */
+#ifdef CONFIG_NO_HZ_COMMON
+static void kcpustat_idle_stop(struct kernel_cpustat *kc, u64 now)
+{
+ u64 *cpustat = kc->cpustat;
+ u64 delta, steal, steal_delta;
+ int iowait;
+
+ if (!kc->idle_elapse)
+ return;
+
+ iowait = nr_iowait_cpu(smp_processor_id()) > 0;
+ delta = now - kc->idle_entrytime;
+ steal = steal_account_process_time(delta);
+
+ /*
+ * Record the idle time after subtracting the steal time from the
+ * previous update sequence. Don't subtract the steal time from
+ * the current update sequence to avoid readers moving backward.
+ */
+ write_seqcount_begin(&kc->idle_sleeptime_seq);
+ steal_delta = min_t(u64, kc->idle_stealtime[iowait], delta);
+ delta -= steal_delta;
+ kc->idle_stealtime[iowait] -= steal_delta;
+
+ if (iowait)
+ cpustat[CPUTIME_IOWAIT] += delta;
+ else
+ cpustat[CPUTIME_IDLE] += delta;
+
+ kc->idle_stealtime[iowait] += steal;
+ kc->idle_entrytime = now;
+ kc->idle_elapse = false;
+ write_seqcount_end(&kc->idle_sleeptime_seq);
+}
+
+static void kcpustat_idle_start(struct kernel_cpustat *kc, u64 now)
+{
+ /* Irqtime accounting might have been enabled in the middle of the IRQ */
+ if (kc->idle_elapse)
+ return;
+
+ write_seqcount_begin(&kc->idle_sleeptime_seq);
+ kc->idle_entrytime = now;
+ kc->idle_elapse = true;
+ write_seqcount_end(&kc->idle_sleeptime_seq);
+}
+
+void kcpustat_dyntick_stop(u64 now)
+{
+ struct kernel_cpustat *kc = kcpustat_this_cpu;
+
+ if (!vtime_generic_enabled_this_cpu()) {
+ WARN_ON_ONCE(!kc->idle_dyntick);
+ kcpustat_idle_stop(kc, now);
+ kc->idle_dyntick = false;
+ vtime_dyntick_stop();
+ }
+}
+
+void kcpustat_dyntick_start(u64 now)
+{
+ struct kernel_cpustat *kc = kcpustat_this_cpu;
+
+ if (!vtime_generic_enabled_this_cpu()) {
+ vtime_dyntick_start();
+ kc->idle_dyntick = true;
+ kcpustat_idle_start(kc, now);
+ }
+}
+
+void kcpustat_irq_enter(u64 now)
+{
+ struct kernel_cpustat *kc = kcpustat_this_cpu;
+
+ if (!vtime_generic_enabled_this_cpu() &&
+ (irqtime_enabled() || vtime_accounting_enabled_this_cpu()))
+ kcpustat_idle_stop(kc, now);
+}
+
+void kcpustat_irq_exit(u64 now)
+{
+ struct kernel_cpustat *kc = kcpustat_this_cpu;
+
+ /*
+ * Generic vtime already does its own idle accounting.
+ * But irqtime accounting or arch vtime, which also account IRQs,
+ * need to pause nohz accounting. Resume nohz accounting as long
+ * as the irqtime config is enabled to handle the case where irqtime
+ * accounting got runtime disabled in the middle of an IRQ.
+ */
+ if (!vtime_generic_enabled_this_cpu() &&
+ (IS_ENABLED(CONFIG_IRQ_TIME_ACCOUNTING) || vtime_accounting_enabled_this_cpu()))
+ kcpustat_idle_start(kc, now);
+}
+
+static u64 kcpustat_field_dyntick(int cpu, enum cpu_usage_stat idx,
+ bool compute_delta, u64 now)
+{
+ struct kernel_cpustat *kc = &kcpustat_cpu(cpu);
+ int iowait = idx == CPUTIME_IOWAIT;
+ u64 *cpustat = kc->cpustat;
+ unsigned int seq;
+ u64 idle;
+
+ do {
+ seq = read_seqcount_begin(&kc->idle_sleeptime_seq);
+
+ idle = cpustat[idx];
+
+ if (kc->idle_elapse && compute_delta && now > kc->idle_entrytime) {
+ u64 delta = now - kc->idle_entrytime;
+
+ delta -= min_t(u64, kc->idle_stealtime[iowait], delta);
+ idle += delta;
+ }
+ } while (read_seqcount_retry(&kc->idle_sleeptime_seq, seq));
+
+ return idle;
+}
+
+u64 kcpustat_field_idle(int cpu)
+{
+ return kcpustat_field_dyntick(cpu, CPUTIME_IDLE,
+ !nr_iowait_cpu(cpu), ktime_get());
+}
+EXPORT_SYMBOL_GPL(kcpustat_field_idle);
+
+u64 kcpustat_field_iowait(int cpu)
+{
+ return kcpustat_field_dyntick(cpu, CPUTIME_IOWAIT,
+ nr_iowait_cpu(cpu), ktime_get());
+}
+EXPORT_SYMBOL_GPL(kcpustat_field_iowait);
+#else
+static u64 kcpustat_field_dyntick(int cpu, enum cpu_usage_stat idx,
+ bool compute_delta, ktime_t now)
+{
+ return kcpustat_cpu(cpu).cpustat[idx];
+}
+#endif /* CONFIG_NO_HZ_COMMON */
+
+static u64 get_cpu_sleep_time_us(int cpu, enum cpu_usage_stat idx,
+ bool compute_delta, u64 *last_update_time)
+{
+ ktime_t now = ktime_get();
+ u64 res;
+
+ if (vtime_generic_enabled_cpu(cpu))
+ res = kcpustat_field(idx, cpu);
+ else
+ res = kcpustat_field_dyntick(cpu, idx, compute_delta, now);
+
+ do_div(res, NSEC_PER_USEC);
+
+ if (last_update_time)
+ *last_update_time = ktime_to_us(now);
+
+ return res;
+}
+
+/**
+ * get_cpu_idle_time_us - get the total idle time of a CPU
+ * @cpu: CPU number to query
+ * @last_update_time: variable to store update time in. Do not update
+ * counters if NULL.
+ *
+ * Return the cumulative idle time (since boot) for a given
+ * CPU, in microseconds. Note that this is partially broken due to
+ * the counter of iowait tasks that can be remotely updated without
+ * any synchronization. Therefore it is possible to observe backward
+ * values within two consecutive reads.
+ *
+ * This time is measured via accounting rather than sampling,
+ * and is as accurate as ktime_get() is.
+ *
+ * Return: total idle time of the @cpu
+ */
+u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
+{
+ return get_cpu_sleep_time_us(cpu, CPUTIME_IDLE,
+ !nr_iowait_cpu(cpu), last_update_time);
+}
+EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);
+
+/**
+ * get_cpu_iowait_time_us - get the total iowait time of a CPU
+ * @cpu: CPU number to query
+ * @last_update_time: variable to store update time in. Do not update
+ * counters if NULL.
+ *
+ * Return the cumulative iowait time (since boot) for a given
+ * CPU, in microseconds. Note this is partially broken due to
+ * the counter of iowait tasks that can be remotely updated without
+ * any synchronization. Therefore it is possible to observe backward
+ * values within two consecutive reads.
+ *
+ * This time is measured via accounting rather than sampling,
+ * and is as accurate as ktime_get() is.
+ *
+ * Return: total iowait time of @cpu
+ */
+u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
+{
+ return get_cpu_sleep_time_us(cpu, CPUTIME_IOWAIT,
+ nr_iowait_cpu(cpu), last_update_time);
+}
+EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
+
/*
* Use precise platform statistics if available:
*/
@@ -437,11 +642,15 @@ void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
vtime_account_hardirq(tsk);
} else if (pc & SOFTIRQ_OFFSET) {
vtime_account_softirq(tsk);
- } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
- is_idle_task(tsk)) {
- vtime_account_idle(tsk);
+ } else if (!kcpustat_idle_dyntick()) {
+ if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) &&
+ is_idle_task(tsk)) {
+ vtime_account_idle(tsk);
+ } else {
+ vtime_account_kernel(tsk);
+ }
} else {
- vtime_account_kernel(tsk);
+ vtime_reset();
}
}
@@ -483,6 +692,9 @@ void account_process_tick(struct task_struct *p, int user_tick)
if (vtime_accounting_enabled_this_cpu())
return;
+ if (kcpustat_idle_dyntick())
+ return;
+
if (irqtime_enabled()) {
irqtime_account_process_tick(p, user_tick, 1);
return;
@@ -504,29 +716,6 @@ void account_process_tick(struct task_struct *p, int user_tick)
account_idle_time(cputime);
}
-/*
- * Account multiple ticks of idle time.
- * @ticks: number of stolen ticks
- */
-void account_idle_ticks(unsigned long ticks)
-{
- u64 cputime, steal;
-
- if (irqtime_enabled()) {
- irqtime_account_idle_ticks(ticks);
- return;
- }
-
- cputime = ticks * TICK_NSEC;
- steal = steal_account_process_time(ULONG_MAX);
-
- if (steal >= cputime)
- return;
-
- cputime -= steal;
- account_idle_time(cputime);
-}
-
/*
* Adjust tick based cputime random precision against scheduler runtime
* accounting.
@@ -773,9 +962,9 @@ void vtime_guest_exit(struct task_struct *tsk)
}
EXPORT_SYMBOL_GPL(vtime_guest_exit);
-void vtime_account_idle(struct task_struct *tsk)
+static void __vtime_account_idle(struct vtime *vtime)
{
- account_idle_time(get_vtime_delta(&tsk->vtime));
+ account_idle_time(get_vtime_delta(vtime));
}
void vtime_task_switch_generic(struct task_struct *prev)
@@ -784,7 +973,7 @@ void vtime_task_switch_generic(struct task_struct *prev)
write_seqcount_begin(&vtime->seqcount);
if (vtime->state == VTIME_IDLE)
- vtime_account_idle(prev);
+ __vtime_account_idle(vtime);
else
__vtime_account_kernel(prev, vtime);
vtime->state = VTIME_INACTIVE;
@@ -926,6 +1115,7 @@ static int kcpustat_field_vtime(u64 *cpustat,
int cpu, u64 *val)
{
struct vtime *vtime = &tsk->vtime;
+ struct rq *rq = cpu_rq(cpu);
unsigned int seq;
do {
@@ -967,6 +1157,14 @@ static int kcpustat_field_vtime(u64 *cpustat,
if (state == VTIME_GUEST && task_nice(tsk) > 0)
*val += vtime->gtime + vtime_delta(vtime);
break;
+ case CPUTIME_IDLE:
+ if (state == VTIME_IDLE && !atomic_read(&rq->nr_iowait))
+ *val += vtime_delta(vtime);
+ break;
+ case CPUTIME_IOWAIT:
+ if (state == VTIME_IDLE && atomic_read(&rq->nr_iowait) > 0)
+ *val += vtime_delta(vtime);
+ break;
default:
break;
}
@@ -975,16 +1173,15 @@ static int kcpustat_field_vtime(u64 *cpustat,
return 0;
}
-u64 kcpustat_field(struct kernel_cpustat *kcpustat,
- enum cpu_usage_stat usage, int cpu)
+u64 kcpustat_field(enum cpu_usage_stat usage, int cpu)
{
- u64 *cpustat = kcpustat->cpustat;
+ u64 *cpustat = kcpustat_cpu(cpu).cpustat;
u64 val = cpustat[usage];
struct rq *rq;
int err;
- if (!vtime_accounting_enabled_cpu(cpu))
- return val;
+ if (!vtime_generic_enabled_cpu(cpu))
+ return kcpustat_field_default(usage, cpu);
rq = cpu_rq(cpu);
@@ -1030,8 +1227,8 @@ static int kcpustat_cpu_fetch_vtime(struct kernel_cpustat *dst,
*dst = *src;
cpustat = dst->cpustat;
- /* Task is sleeping, dead or idle, nothing to add */
- if (state < VTIME_SYS)
+ /* Task is sleeping or dead, nothing to add */
+ if (state < VTIME_IDLE)
continue;
delta = vtime_delta(vtime);
@@ -1040,15 +1237,17 @@ static int kcpustat_cpu_fetch_vtime(struct kernel_cpustat *dst,
* Task runs either in user (including guest) or kernel space,
* add pending nohz time to the right place.
*/
- if (state == VTIME_SYS) {
+ switch (state) {
+ case VTIME_SYS:
cpustat[CPUTIME_SYSTEM] += vtime->stime + delta;
- } else if (state == VTIME_USER) {
+ break;
+ case VTIME_USER:
if (task_nice(tsk) > 0)
cpustat[CPUTIME_NICE] += vtime->utime + delta;
else
cpustat[CPUTIME_USER] += vtime->utime + delta;
- } else {
- WARN_ON_ONCE(state != VTIME_GUEST);
+ break;
+ case VTIME_GUEST:
if (task_nice(tsk) > 0) {
cpustat[CPUTIME_GUEST_NICE] += vtime->gtime + delta;
cpustat[CPUTIME_NICE] += vtime->gtime + delta;
@@ -1056,6 +1255,15 @@ static int kcpustat_cpu_fetch_vtime(struct kernel_cpustat *dst,
cpustat[CPUTIME_GUEST] += vtime->gtime + delta;
cpustat[CPUTIME_USER] += vtime->gtime + delta;
}
+ break;
+ case VTIME_IDLE:
+ if (atomic_read(&cpu_rq(cpu)->nr_iowait) > 0)
+ cpustat[CPUTIME_IOWAIT] += delta;
+ else
+ cpustat[CPUTIME_IDLE] += delta;
+ break;
+ default:
+ WARN_ON_ONCE(1);
}
} while (read_seqcount_retry(&vtime->seqcount, seq));
@@ -1068,8 +1276,8 @@ void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu)
struct rq *rq;
int err;
- if (!vtime_accounting_enabled_cpu(cpu)) {
- *dst = *src;
+ if (!vtime_generic_enabled_cpu(cpu)) {
+ kcpustat_cpu_fetch_default(dst, cpu);
return;
}
@@ -1082,7 +1290,7 @@ void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu)
curr = rcu_dereference(rq->curr);
if (WARN_ON_ONCE(!curr)) {
rcu_read_unlock();
- *dst = *src;
+ kcpustat_cpu_fetch_default(dst, cpu);
return;
}
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index a83be0c834dd..aa7e3dc59856 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -280,6 +280,14 @@ static void do_idle(void)
int cpu = smp_processor_id();
bool got_tick = false;
+ if (cpu_is_offline(cpu)) {
+ local_irq_disable();
+ /* All per-CPU kernel threads should be done by now. */
+ WARN_ON_ONCE(need_resched());
+ cpuhp_report_idle_dead();
+ arch_cpu_idle_dead();
+ }
+
/*
* Check if we need to update blocked load
*/
@@ -331,11 +339,6 @@ static void do_idle(void)
*/
local_irq_disable();
- if (cpu_is_offline(cpu)) {
- cpuhp_report_idle_dead();
- arch_cpu_idle_dead();
- }
-
arch_cpu_idle_enter();
rcu_nocb_flush_deferred_wakeup();
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cbbb87a0c6e7..c1ee0b256445 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -285,8 +285,6 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
if (IS_ENABLED(CONFIG_NO_HZ_COMMON) &&
tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
touch_softlockup_watchdog_sched();
- if (is_idle_task(current))
- ts->idle_jiffies++;
/*
* In case the current tick fired too early past its expected
* expiration, make sure we don't bypass the next clock reprogramming
@@ -751,119 +749,6 @@ static void tick_nohz_update_jiffies(ktime_t now)
touch_softlockup_watchdog_sched();
}
-static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
-{
- ktime_t delta;
-
- if (WARN_ON_ONCE(!tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE)))
- return;
-
- delta = ktime_sub(now, ts->idle_entrytime);
-
- write_seqcount_begin(&ts->idle_sleeptime_seq);
- if (nr_iowait_cpu(smp_processor_id()) > 0)
- ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
- else
- ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
-
- ts->idle_entrytime = now;
- tick_sched_flag_clear(ts, TS_FLAG_IDLE_ACTIVE);
- write_seqcount_end(&ts->idle_sleeptime_seq);
-
- sched_clock_idle_wakeup_event();
-}
-
-static void tick_nohz_start_idle(struct tick_sched *ts)
-{
- write_seqcount_begin(&ts->idle_sleeptime_seq);
- ts->idle_entrytime = ktime_get();
- tick_sched_flag_set(ts, TS_FLAG_IDLE_ACTIVE);
- write_seqcount_end(&ts->idle_sleeptime_seq);
-
- sched_clock_idle_sleep_event();
-}
-
-static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime,
- bool compute_delta, u64 *last_update_time)
-{
- ktime_t now, idle;
- unsigned int seq;
-
- if (!tick_nohz_active)
- return -1;
-
- now = ktime_get();
- if (last_update_time)
- *last_update_time = ktime_to_us(now);
-
- do {
- seq = read_seqcount_begin(&ts->idle_sleeptime_seq);
-
- if (tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE) && compute_delta) {
- ktime_t delta = ktime_sub(now, ts->idle_entrytime);
-
- idle = ktime_add(*sleeptime, delta);
- } else {
- idle = *sleeptime;
- }
- } while (read_seqcount_retry(&ts->idle_sleeptime_seq, seq));
-
- return ktime_to_us(idle);
-
-}
-
-/**
- * get_cpu_idle_time_us - get the total idle time of a CPU
- * @cpu: CPU number to query
- * @last_update_time: variable to store update time in. Do not update
- * counters if NULL.
- *
- * Return the cumulative idle time (since boot) for a given
- * CPU, in microseconds. Note that this is partially broken due to
- * the counter of iowait tasks that can be remotely updated without
- * any synchronization. Therefore it is possible to observe backward
- * values within two consecutive reads.
- *
- * This time is measured via accounting rather than sampling,
- * and is as accurate as ktime_get() is.
- *
- * Return: -1 if NOHZ is not enabled, else total idle time of the @cpu
- */
-u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
-{
- struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
-
- return get_cpu_sleep_time_us(ts, &ts->idle_sleeptime,
- !nr_iowait_cpu(cpu), last_update_time);
-}
-EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);
-
-/**
- * get_cpu_iowait_time_us - get the total iowait time of a CPU
- * @cpu: CPU number to query
- * @last_update_time: variable to store update time in. Do not update
- * counters if NULL.
- *
- * Return the cumulative iowait time (since boot) for a given
- * CPU, in microseconds. Note this is partially broken due to
- * the counter of iowait tasks that can be remotely updated without
- * any synchronization. Therefore it is possible to observe backward
- * values within two consecutive reads.
- *
- * This time is measured via accounting rather than sampling,
- * and is as accurate as ktime_get() is.
- *
- * Return: -1 if NOHZ is not enabled, else total iowait time of @cpu
- */
-u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
-{
- struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
-
- return get_cpu_sleep_time_us(ts, &ts->iowait_sleeptime,
- nr_iowait_cpu(cpu), last_update_time);
-}
-EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
-
/* Simplified variant of hrtimer_forward_now() */
static ktime_t tick_forward_now(ktime_t expires, ktime_t now)
{
@@ -1273,7 +1158,7 @@ void tick_nohz_idle_stop_tick(void)
ts->idle_expires = expires;
if (!was_stopped && tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
- ts->idle_jiffies = ts->last_jiffies;
+ kcpustat_dyntick_start(ts->idle_entrytime);
nohz_balance_enter_idle(cpu);
}
} else {
@@ -1286,6 +1171,20 @@ void tick_nohz_idle_retain_tick(void)
tick_nohz_retain_tick(this_cpu_ptr(&tick_cpu_sched));
}
+static void tick_nohz_clock_sleep(struct tick_sched *ts)
+{
+ tick_sched_flag_set(ts, TS_FLAG_IDLE_ACTIVE);
+ sched_clock_idle_sleep_event();
+}
+
+static void tick_nohz_clock_wakeup(struct tick_sched *ts)
+{
+ if (tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE)) {
+ tick_sched_flag_clear(ts, TS_FLAG_IDLE_ACTIVE);
+ sched_clock_idle_wakeup_event();
+ }
+}
+
/**
* tick_nohz_idle_enter - prepare for entering idle on the current CPU
*
@@ -1300,11 +1199,10 @@ void tick_nohz_idle_enter(void)
local_irq_disable();
ts = this_cpu_ptr(&tick_cpu_sched);
-
WARN_ON_ONCE(ts->timer_expires_base);
-
tick_sched_flag_set(ts, TS_FLAG_INIDLE);
- tick_nohz_start_idle(ts);
+ ts->idle_entrytime = ktime_get();
+ tick_nohz_clock_sleep(ts);
local_irq_enable();
}
@@ -1332,10 +1230,14 @@ void tick_nohz_irq_exit(void)
{
struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
- if (tick_sched_flag_test(ts, TS_FLAG_INIDLE))
- tick_nohz_start_idle(ts);
- else
+ if (tick_sched_flag_test(ts, TS_FLAG_INIDLE)) {
+ tick_nohz_clock_sleep(ts);
+ ts->idle_entrytime = ktime_get();
+ if (tick_sched_flag_test(ts, TS_FLAG_STOPPED))
+ kcpustat_irq_exit(ts->idle_entrytime);
+ } else {
tick_nohz_full_update_tick(ts);
+ }
}
/**
@@ -1429,36 +1331,20 @@ unsigned long tick_nohz_get_idle_calls_cpu(int cpu)
return ts->idle_calls;
}
-static void tick_nohz_account_idle_time(struct tick_sched *ts,
- ktime_t now)
-{
- unsigned long ticks;
-
- ts->idle_exittime = now;
-
- if (vtime_accounting_enabled_this_cpu())
- return;
- /*
- * We stopped the tick in idle. update_process_times() would miss the
- * time we slept, as it does only a 1 tick accounting.
- * Enforce that this is accounted to idle !
- */
- ticks = jiffies - ts->idle_jiffies;
- /*
- * We might be one off. Do not randomly account a huge number of ticks!
- */
- if (ticks && ticks < LONG_MAX)
- account_idle_ticks(ticks);
-}
-
void tick_nohz_idle_restart_tick(void)
{
struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
- ktime_t now = ktime_get();
- tick_nohz_restart_sched_tick(ts, now);
- tick_nohz_account_idle_time(ts, now);
+ /*
+ * Update entrytime here in case the tick restart is due to temporary
+ * polling on forced broadcast. The tick may be stopped again later within
+ * the same idle trip. The idle_entrytime was updated recently but make sure
+ * no tiny amount of idle time is accounted twice.
+ */
+ ts->idle_entrytime = ktime_get();
+ kcpustat_dyntick_stop(ts->idle_entrytime);
+ tick_nohz_restart_sched_tick(ts, ts->idle_entrytime);
}
}
@@ -1468,8 +1354,6 @@ static void tick_nohz_idle_update_tick(struct tick_sched *ts, ktime_t now)
__tick_nohz_full_update_tick(ts, now);
else
tick_nohz_restart_sched_tick(ts, now);
-
- tick_nohz_account_idle_time(ts, now);
}
/**
@@ -1491,7 +1375,6 @@ static void tick_nohz_idle_update_tick(struct tick_sched *ts, ktime_t now)
void tick_nohz_idle_exit(void)
{
struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
- bool idle_active, tick_stopped;
ktime_t now;
local_irq_disable();
@@ -1500,17 +1383,13 @@ void tick_nohz_idle_exit(void)
WARN_ON_ONCE(ts->timer_expires_base);
tick_sched_flag_clear(ts, TS_FLAG_INIDLE);
- idle_active = tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE);
- tick_stopped = tick_sched_flag_test(ts, TS_FLAG_STOPPED);
+ tick_nohz_clock_wakeup(ts);
- if (idle_active || tick_stopped)
+ if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
now = ktime_get();
-
- if (idle_active)
- tick_nohz_stop_idle(ts, now);
-
- if (tick_stopped)
+ kcpustat_dyntick_stop(now);
tick_nohz_idle_update_tick(ts, now);
+ }
local_irq_enable();
}
@@ -1565,11 +1444,14 @@ static inline void tick_nohz_irq_enter(void)
struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
ktime_t now;
- if (!tick_sched_flag_test(ts, TS_FLAG_STOPPED | TS_FLAG_IDLE_ACTIVE))
+ tick_nohz_clock_wakeup(ts);
+
+ if (!tick_sched_flag_test(ts, TS_FLAG_STOPPED))
return;
+
now = ktime_get();
- if (tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE))
- tick_nohz_stop_idle(ts, now);
+ kcpustat_irq_enter(now);
+
/*
* If all CPUs are idle we may need to update a stale jiffies value.
* Note nohz_full is a special case: a timekeeper is guaranteed to stay
@@ -1577,8 +1459,7 @@ static inline void tick_nohz_irq_enter(void)
* rare case (typically stop machine). So we must make sure we have a
* last resort.
*/
- if (tick_sched_flag_test(ts, TS_FLAG_STOPPED))
- tick_nohz_update_jiffies(now);
+ tick_nohz_update_jiffies(now);
}
#else
@@ -1648,20 +1529,15 @@ void tick_setup_sched_timer(bool hrtimer)
void tick_sched_timer_dying(int cpu)
{
struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
- ktime_t idle_sleeptime, iowait_sleeptime;
unsigned long idle_calls, idle_sleeps;
/* This must happen before hrtimers are migrated! */
if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
hrtimer_cancel(&ts->sched_timer);
- idle_sleeptime = ts->idle_sleeptime;
- iowait_sleeptime = ts->iowait_sleeptime;
idle_calls = ts->idle_calls;
idle_sleeps = ts->idle_sleeps;
memset(ts, 0, sizeof(*ts));
- ts->idle_sleeptime = idle_sleeptime;
- ts->iowait_sleeptime = iowait_sleeptime;
ts->idle_calls = idle_calls;
ts->idle_sleeps = idle_sleeps;
}
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index b4a7822f495d..79b9252047b1 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -44,9 +44,7 @@ struct tick_device {
* to resume the tick timer operation in the timeline
* when the CPU returns from nohz sleep.
* @next_tick: Next tick to be fired when in dynticks mode.
- * @idle_jiffies: jiffies at the entry to idle for idle time accounting
* @idle_waketime: Time when the idle was interrupted
- * @idle_sleeptime_seq: sequence counter for data consistency
* @idle_entrytime: Time when the idle call was entered
* @last_jiffies: Base jiffies snapshot when next event was last computed
* @timer_expires_base: Base time clock monotonic for @timer_expires
@@ -55,9 +53,6 @@ struct tick_device {
* @idle_expires: Next tick in idle, for debugging purpose only
* @idle_calls: Total number of idle calls
* @idle_sleeps: Number of idle calls, where the sched tick was stopped
- * @idle_exittime: Time when the idle state was left
- * @idle_sleeptime: Sum of the time slept in idle with sched tick stopped
- * @iowait_sleeptime: Sum of the time slept in idle with sched tick stopped, with IO outstanding
* @tick_dep_mask: Tick dependency mask - is set, if someone needs the tick
* @check_clocks: Notification mechanism about clocksource changes
*/
@@ -73,12 +68,10 @@ struct tick_sched {
struct hrtimer sched_timer;
ktime_t last_tick;
ktime_t next_tick;
- unsigned long idle_jiffies;
ktime_t idle_waketime;
unsigned int got_idle_tick;
/* Idle entry */
- seqcount_t idle_sleeptime_seq;
ktime_t idle_entrytime;
/* Tick stop */
@@ -90,11 +83,6 @@ struct tick_sched {
unsigned long idle_calls;
unsigned long idle_sleeps;
- /* Idle exit */
- ktime_t idle_exittime;
- ktime_t idle_sleeptime;
- ktime_t iowait_sleeptime;
-
/* Full dynticks handling */
atomic_t tick_dep_mask;
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 427d7ddea3af..514802def1e0 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -152,14 +152,10 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
P_flag(highres, TS_FLAG_HIGHRES);
P_ns(last_tick);
P_flag(tick_stopped, TS_FLAG_STOPPED);
- P(idle_jiffies);
P(idle_calls);
P(idle_sleeps);
P_ns(idle_entrytime);
P_ns(idle_waketime);
- P_ns(idle_exittime);
- P_ns(idle_sleeptime);
- P_ns(iowait_sleeptime);
P(last_jiffies);
P(next_timer);
P_ns(idle_expires);
@@ -256,7 +252,7 @@ static void timer_list_show_tickdevices_header(struct seq_file *m)
static inline void timer_list_header(struct seq_file *m, u64 now)
{
- SEQ_printf(m, "Timer List Version: v0.10\n");
+ SEQ_printf(m, "Timer List Version: v0.11\n");
SEQ_printf(m, "HRTIMER_MAX_CLOCK_BASES: %d\n", HRTIMER_MAX_CLOCK_BASES);
SEQ_printf(m, "now at %Ld nsecs\n", (unsigned long long)now);
SEQ_printf(m, "\n");
diff --git a/scripts/gdb/linux/timerlist.py b/scripts/gdb/linux/timerlist.py
index 9fb3436a217c..744b032e4d38 100644
--- a/scripts/gdb/linux/timerlist.py
+++ b/scripts/gdb/linux/timerlist.py
@@ -90,14 +90,10 @@ def print_cpu(hrtimer_bases, cpu, max_clock_bases):
text += f" .{'nohz':15s}: {int(bool(ts['flags'] & TS_FLAG_NOHZ))}\n"
text += f" .{'last_tick':15s}: {ts['last_tick']}\n"
text += f" .{'tick_stopped':15s}: {int(bool(ts['flags'] & TS_FLAG_STOPPED))}\n"
- text += f" .{'idle_jiffies':15s}: {ts['idle_jiffies']}\n"
text += f" .{'idle_calls':15s}: {ts['idle_calls']}\n"
text += f" .{'idle_sleeps':15s}: {ts['idle_sleeps']}\n"
text += f" .{'idle_entrytime':15s}: {ts['idle_entrytime']} nsecs\n"
text += f" .{'idle_waketime':15s}: {ts['idle_waketime']} nsecs\n"
- text += f" .{'idle_exittime':15s}: {ts['idle_exittime']} nsecs\n"
- text += f" .{'idle_sleeptime':15s}: {ts['idle_sleeptime']} nsecs\n"
- text += f" .{'iowait_sleeptime':15s}: {ts['iowait_sleeptime']} nsecs\n"
text += f" .{'last_jiffies':15s}: {ts['last_jiffies']}\n"
text += f" .{'next_timer':15s}: {ts['next_timer']}\n"
text += f" .{'idle_expires':15s}: {ts['idle_expires']} nsecs\n"
* [GIT pull] timers/ptp for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
` (6 preceding siblings ...)
2026-06-13 21:25 ` [GIT pull] timers/nohz " Thomas Gleixner
@ 2026-06-13 21:25 ` Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
2026-06-13 21:25 ` [GIT pull] timers/vdso " Thomas Gleixner
2026-06-15 8:51 ` [GIT pull] core/rseq " pr-tracker-bot
9 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest timers/ptp branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-ptp-2026-06-13
up to: bc484a509673: ptp: vmclock: Use hw_cycles from snapshot for precise TSC pairing
Updates for NTP/timekeeping and PTP:
- Expand timekeeping snapshot mechanisms
The various snapshot functions are mostly used for PTP to collect
"atomic" snapshots of various involved clocks.
They lack support for the recently introduced AUX clocks and do not
provide the underlying counter value (e.g. TSC) to user space. Exposing
the counter value snapshot allows for better control and steering.
Convert the hard-wired ktime_get_snapshot() to take a clock ID, which
allows the caller to select the clock to be captured along with
CLOCK_MONOTONIC_RAW. Additionally capture the underlying hardware
counter value and the clock source ID of the counter.
Expand the hardware based snapshot capture where devices provide a
mechanism to snapshot the hardware PTP clock and the system counter
(usually via PCI/PTM) to support AUX clocks and also provide the
captured counter value back to the caller and not only the clock
timestamps derived from it.
- Add a new optional read_snapshot() callback to clocksources
That is required to capture atomic snapshots from clocksources which
are derived from TSC with a scaling mechanism (e.g. Hyper-V, KVMclock).
The value pair is handed back in the snapshot structure to the callers,
so they can do the necessary correlations in a more precise way.
This touches usage sites of the affected functions and data structures all
over the tree, but stays fully backwards compatible for the existing user
space exposed interfaces. New PTP IOCTLs will provide access to the
extended functionality in later kernel versions.
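For illustration only, the read_snapshot() dispatch described above can be
modelled in plain user-space C. This is a sketch, not the kernel code: the
struct and enum names mirror the ones in the diff below, but fake_tsc,
fake_scale(), derived_read(), derived_read_snapshot() and
clock_read_snapshot() are hypothetical stand-ins, and the scaling formula is
made up. The caller is assumed to initialise the hardware snapshot before the
call, because the fallback path leaves it untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space model only. The names mirror the patch, the values are invented. */
enum clocksource_ids { CSID_GENERIC = 0, CSID_X86_TSC, CSID_X86_KVM_CLK };

struct clocksource_hw_snapshot {
	uint64_t		hw_cycles;	/* raw hardware counter value */
	enum clocksource_ids	hw_csid;	/* ID of that hardware counter */
};

struct clocksource {
	enum clocksource_ids id;
	uint64_t (*read)(struct clocksource *cs);
	/* Optional: only derived clocksources provide this callback */
	uint64_t (*read_snapshot)(struct clocksource *cs,
				  struct clocksource_hw_snapshot *chs);
};

/* Hypothetical raw counter and scaling standing in for TSC and pvclock math */
static uint64_t fake_tsc = 1000;

static uint64_t fake_scale(uint64_t tsc)
{
	return tsc * 2 + 5;
}

static uint64_t derived_read(struct clocksource *cs)
{
	(void)cs;
	return fake_scale(fake_tsc);
}

/*
 * Read the raw counter exactly once and derive the clocksource value from
 * that same reading, so both values describe the same instant.
 */
static uint64_t derived_read_snapshot(struct clocksource *cs,
				      struct clocksource_hw_snapshot *chs)
{
	uint64_t tsc = fake_tsc;

	(void)cs;
	chs->hw_cycles = tsc;
	chs->hw_csid = CSID_X86_TSC;
	return fake_scale(tsc);
}

/* Prefer read_snapshot() when present, otherwise fall back to read() */
static uint64_t clock_read_snapshot(struct clocksource *cs,
				    struct clocksource_hw_snapshot *chs)
{
	assert(cs && chs);

	if (cs->read_snapshot)
		return cs->read_snapshot(cs, chs);

	/* Fallback: *chs keeps whatever the caller initialised it to */
	return cs->read(cs);
}
```

A consumer that knows the hardware counter ID can then compare hw_csid with
the counter it expects and use hw_cycles directly, which is roughly what the
vmclock change in this series does with the snapshot fields.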
Thanks,
tglx
------------------>
David Woodhouse (4):
timekeeping: Add clocksource read_snapshot() method and hw_cycles to snapshot
clocksource/hyperv: Implement read_snapshot() for TSC page clocksource
x86/kvmclock: Implement read_snapshot() for kvmclock clocksource
ptp: vmclock: Use hw_cycles from snapshot for precise TSC pairing
Thomas Gleixner (24):
timekeeping: Provide ktime_get_snapshot_id()
timekeeping: Use system_time_snapshot::systime/monoraw instead of ::real/raw
pps: generators: Use ktime_get_real_ts64() instead of ktime_get_snapshot()
pps: Convert to ktime_get_snapshot_id()
KVM: arm64: Use ktime_get_snapshot_id() to retrieve CLOCK_BOOTTIME
KVM: arm64: Use ktime_get_snapshot_id() to snapshot CLOCK_REALTIME
ptp: ptp_vmclock: Convert to ktime_get_snapshot_id()
timekeeping: Remove system_time_snapshot::real/boot/raw
timekeeping: Add CLOCK_AUX support for ktime_get_snapshot_id()
timekeeping: Add system_counterval_t to struct system_device_crosststamp
timekeeping: Add CLOCK ID to system_device_crosststamp
wifi: iwlwifi: Adopt PTP cross timestamps to core changes
ice/ptp: Use provided clock ID for history snapshot
igc: Use provided clock ID for history snapshot
net/mlx5: Use provided clock ID for history snapshot
virtio_rtc: Use provided clock ID for history snapshot
timekeeping: Remove ktime_get_snapshot()
timekeeping: Prepare for cross timestamps on arbitrary clock IDs
ptp: Use system_device_crosststamp::sys_systime
wifi: iwlwifi: Use system_device_crosststamp::sys_systime
ALSA: hda/common: Use system_device_crosststamp::sys_systime
timekeeping: Remove system_device_crosststamp::sys_realtime
timekeeping: Add support for AUX clock cross timestamping
ptp: Switch to ktime_get_snapshot_id() for pre/post timestamps
arch/arm64/kvm/hyp_trace.c | 8 +-
arch/arm64/kvm/hypercalls.c | 6 +-
arch/x86/kernel/kvmclock.c | 36 +++-
drivers/clocksource/hyperv_timer.c | 37 +++-
drivers/net/dsa/sja1105/sja1105_main.c | 8 +-
drivers/net/ethernet/intel/ice/ice_ptp.c | 5 +-
drivers/net/ethernet/intel/igc/igc.h | 1 +
drivers/net/ethernet/intel/igc/igc_ptp.c | 4 +-
.../net/ethernet/mellanox/mlx5/core/lib/clock.c | 4 +-
drivers/net/wireless/intel/iwlwifi/mld/ptp.c | 5 +-
drivers/net/wireless/intel/iwlwifi/mvm/ptp.c | 7 +-
drivers/pps/generators/pps_gen-dummy.c | 6 +-
drivers/pps/generators/pps_gen_tio.c | 6 +-
drivers/ptp/ptp_chardev.c | 18 +-
drivers/ptp/ptp_ocp.c | 11 +-
drivers/ptp/ptp_vmclock.c | 29 ++-
drivers/virtio/virtio_rtc_ptp.c | 2 +-
include/linux/clocksource.h | 24 +++
include/linux/pps_kernel.h | 10 +-
include/linux/ptp_clock_kernel.h | 15 +-
include/linux/timekeeping.h | 61 +++---
kernel/time/timekeeping.c | 235 ++++++++++++++-------
sound/hda/common/controller.c | 4 +-
23 files changed, 348 insertions(+), 194 deletions(-)
diff --git a/arch/arm64/kvm/hyp_trace.c b/arch/arm64/kvm/hyp_trace.c
index 8b7f2bf2fba8..822ce32d39ae 100644
--- a/arch/arm64/kvm/hyp_trace.c
+++ b/arch/arm64/kvm/hyp_trace.c
@@ -51,8 +51,8 @@ static void __hyp_clock_work(struct work_struct *work)
hyp_clock = container_of(dwork, struct hyp_trace_clock, work);
- ktime_get_snapshot(&snap);
- boot = ktime_to_ns(snap.boot);
+ ktime_get_snapshot_id(CLOCK_BOOTTIME, &snap);
+ boot = ktime_to_ns(snap.systime);
delta_boot = boot - hyp_clock->boot;
delta_cycles = snap.cycles - hyp_clock->cycles;
@@ -118,9 +118,9 @@ static void hyp_trace_clock_enable(struct hyp_trace_clock *hyp_clock, bool enabl
hyp_clock->running = false;
}
- ktime_get_snapshot(&snap);
+ ktime_get_snapshot_id(CLOCK_BOOTTIME, &snap);
- hyp_clock->boot = ktime_to_ns(snap.boot);
+ hyp_clock->boot = ktime_to_ns(snap.systime);
hyp_clock->cycles = snap.cycles;
hyp_clock->mult = 0;
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 58c5fe7d7572..b11b8821c9fb 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -28,7 +28,7 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
* system time and counter value must captured at the same
* time to keep consistency and precision.
*/
- ktime_get_snapshot(&systime_snapshot);
+ ktime_get_snapshot_id(CLOCK_REALTIME, &systime_snapshot);
/*
* This is only valid if the current clocksource is the
@@ -61,8 +61,8 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
* in the future (about 292 years from 1970, and at that stage
* nobody will give a damn about it).
*/
- val[0] = upper_32_bits(systime_snapshot.real);
- val[1] = lower_32_bits(systime_snapshot.real);
+ val[0] = upper_32_bits(systime_snapshot.systime);
+ val[1] = lower_32_bits(systime_snapshot.systime);
val[2] = upper_32_bits(cycles);
val[3] = lower_32_bits(cycles);
}
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index b5991d53fc0e..cb3d0ca1fa22 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -87,6 +87,27 @@ static u64 kvm_clock_get_cycles(struct clocksource *cs)
return kvm_clock_read();
}
+static u64 kvm_clock_get_cycles_snapshot(struct clocksource *cs,
+ struct clocksource_hw_snapshot *chs)
+{
+ struct pvclock_vcpu_time_info *src;
+ unsigned version;
+ u64 ret, tsc;
+
+ preempt_disable_notrace();
+ src = this_cpu_pvti();
+ do {
+ version = pvclock_read_begin(src);
+ tsc = rdtsc_ordered();
+ ret = __pvclock_read_cycles(src, tsc);
+ } while (pvclock_read_retry(src, version));
+ preempt_enable_notrace();
+
+ chs->hw_cycles = tsc;
+ chs->hw_csid = CSID_X86_TSC;
+ return ret;
+}
+
static noinstr u64 kvm_sched_clock_read(void)
{
return pvclock_clocksource_read_nowd(this_cpu_pvti()) - kvm_sched_clock_offset;
@@ -156,13 +177,14 @@ static int kvm_cs_enable(struct clocksource *cs)
}
static struct clocksource kvm_clock = {
- .name = "kvm-clock",
- .read = kvm_clock_get_cycles,
- .rating = 400,
- .mask = CLOCKSOURCE_MASK(64),
- .flags = CLOCK_SOURCE_IS_CONTINUOUS,
- .id = CSID_X86_KVM_CLK,
- .enable = kvm_cs_enable,
+ .name = "kvm-clock",
+ .read = kvm_clock_get_cycles,
+ .read_snapshot = kvm_clock_get_cycles_snapshot,
+ .rating = 400,
+ .mask = CLOCKSOURCE_MASK(64),
+ .flags = CLOCK_SOURCE_IS_CONTINUOUS,
+ .id = CSID_X86_KVM_CLK,
+ .enable = kvm_cs_enable,
};
static void kvm_register_clock(char *txt)
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index e9f5034a1bc8..df567795d175 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -444,6 +444,22 @@ static u64 notrace read_hv_clock_tsc_cs(struct clocksource *arg)
return read_hv_clock_tsc();
}
+static u64 notrace read_hv_clock_tsc_cs_snapshot(struct clocksource *arg,
+ struct clocksource_hw_snapshot *chs)
+{
+ u64 time;
+
+ if (hv_read_tsc_page_tsc(tsc_page, &chs->hw_cycles, &time)) {
+ chs->hw_csid = CSID_X86_TSC;
+ } else {
+ chs->hw_cycles = 0;
+ chs->hw_csid = CSID_GENERIC;
+ time = read_hv_clock_msr();
+ }
+
+ return time;
+}
+
static u64 noinstr read_hv_sched_clock_tsc(void)
{
return (read_hv_clock_tsc() - hv_sched_clock_offset) *
@@ -492,18 +508,19 @@ static int hv_cs_enable(struct clocksource *cs)
#endif
static struct clocksource hyperv_cs_tsc = {
- .name = "hyperv_clocksource_tsc_page",
- .rating = 500,
- .read = read_hv_clock_tsc_cs,
- .mask = CLOCKSOURCE_MASK(64),
- .flags = CLOCK_SOURCE_IS_CONTINUOUS,
- .suspend= suspend_hv_clock_tsc,
- .resume = resume_hv_clock_tsc,
+ .name = "hyperv_clocksource_tsc_page",
+ .rating = 500,
+ .read = read_hv_clock_tsc_cs,
+ .read_snapshot = read_hv_clock_tsc_cs_snapshot,
+ .mask = CLOCKSOURCE_MASK(64),
+ .flags = CLOCK_SOURCE_IS_CONTINUOUS,
+ .suspend = suspend_hv_clock_tsc,
+ .resume = resume_hv_clock_tsc,
#ifdef HAVE_VDSO_CLOCKMODE_HVCLOCK
- .enable = hv_cs_enable,
- .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
+ .enable = hv_cs_enable,
+ .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
#else
- .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
+ .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
#endif
};
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index c72c2bfdcffb..2697073dbf90 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2310,10 +2310,10 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
goto out;
}
- t1 = timespec64_to_ns(&ptp_sts_before.pre_ts);
- t2 = timespec64_to_ns(&ptp_sts_before.post_ts);
- t3 = timespec64_to_ns(&ptp_sts_after.pre_ts);
- t4 = timespec64_to_ns(&ptp_sts_after.post_ts);
+ t1 = ktime_to_ns(ptp_sts_before.pre_sts.systime);
+ t2 = ktime_to_ns(ptp_sts_before.post_sts.systime);
+ t3 = ktime_to_ns(ptp_sts_after.pre_sts.systime);
+ t4 = ktime_to_ns(ptp_sts_after.post_sts.systime);
/* Mid point, corresponds to pre-reset PTPCLKVAL */
t12 = t1 + (t2 - t1) / 2;
/* Mid point, corresponds to post-reset PTPCLKVAL, aka 0 */
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 36df742c326c..f9e4ec6f7ebb 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -2065,11 +2065,13 @@ static const struct ice_crosststamp_cfg ice_crosststamp_cfg_e830 = {
/**
* struct ice_crosststamp_ctx - Device cross timestamp context
* @snapshot: snapshot of system clocks for historic interpolation
+ * @snapshot_clock_id: System clock ID for @snapshot
* @pf: pointer to the PF private structure
* @cfg: pointer to hardware configuration for cross timestamp
*/
struct ice_crosststamp_ctx {
struct system_time_snapshot snapshot;
+ clockid_t snapshot_clock_id;
struct ice_pf *pf;
const struct ice_crosststamp_cfg *cfg;
};
@@ -2115,7 +2117,7 @@ static int ice_capture_crosststamp(ktime_t *device,
}
/* Snapshot system time for historic interpolation */
- ktime_get_snapshot(&ctx->snapshot);
+ ktime_get_snapshot_id(ctx->snapshot_clock_id, &ctx->snapshot);
/* Program cmd to master timer */
ice_ptp_src_cmd(hw, ICE_PTP_READ_TIME);
@@ -2176,6 +2178,7 @@ static int ice_ptp_getcrosststamp(struct ptp_clock_info *info,
{
struct ice_pf *pf = ptp_info_to_pf(info);
struct ice_crosststamp_ctx ctx = {
+ .snapshot_clock_id = cts->clock_id,
.pf = pf,
};
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 17236813965d..46d625b15f44 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -326,6 +326,7 @@ struct igc_adapter {
struct timespec64 prev_ptp_time; /* Pre-reset PTP clock */
ktime_t ptp_reset_start; /* Reset time in clock mono */
struct system_time_snapshot snapshot;
+ clockid_t snapshot_clock_id;
struct mutex ptm_lock; /* Only allow one PTM transaction at a time */
char fw_version[32];
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
index 3d6b2264164a..b40aba9ab685 100644
--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
+++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
@@ -1049,7 +1049,7 @@ static int igc_phc_get_syncdevicetime(ktime_t *device,
*/
do {
/* Get a snapshot of system clocks to use as historic value. */
- ktime_get_snapshot(&adapter->snapshot);
+ ktime_get_snapshot_id(adapter->snapshot_clock_id, &adapter->snapshot);
igc_ptm_trigger(hw);
@@ -1103,6 +1103,8 @@ static int igc_ptp_getcrosststamp(struct ptp_clock_info *ptp,
/* This blocks until any in progress PTM transactions complete */
mutex_lock(&adapter->ptm_lock);
+ adapter->snapshot_clock_id = cts->clock_id;
+
ret = get_device_system_crosststamp(igc_phc_get_syncdevicetime,
adapter, &adapter->snapshot, cts);
mutex_unlock(&adapter->ptm_lock);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index d785f1b4f2e1..5df786133e4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -340,7 +340,7 @@ static int mlx5_ptp_getcrosststamp(struct ptp_clock_info *ptp,
goto unlock;
}
- ktime_get_snapshot(&history_begin);
+ ktime_get_snapshot_id(cts->clock_id, &history_begin);
err = get_device_system_crosststamp(mlx5_mtctr_syncdevicetime, mdev,
&history_begin, cts);
@@ -366,7 +366,7 @@ static int mlx5_ptp_getcrosscycles(struct ptp_clock_info *ptp,
goto unlock;
}
- ktime_get_snapshot(&history_begin);
+ ktime_get_snapshot_id(cts->clock_id, &history_begin);
err = get_device_system_crosststamp(mlx5_mtctr_syncdevicecyclestime,
mdev, &history_begin, cts);
diff --git a/drivers/net/wireless/intel/iwlwifi/mld/ptp.c b/drivers/net/wireless/intel/iwlwifi/mld/ptp.c
index c65f4b56a327..f829156d42b3 100644
--- a/drivers/net/wireless/intel/iwlwifi/mld/ptp.c
+++ b/drivers/net/wireless/intel/iwlwifi/mld/ptp.c
@@ -250,7 +250,8 @@ iwl_mld_phc_get_crosstimestamp(struct ptp_clock_info *ptp,
/* System (wall) time */
ktime_t sys_time;
- memset(xtstamp, 0, sizeof(struct system_device_crosststamp));
+ if (xtstamp->clock_id != CLOCK_REALTIME)
+ return -ENOTSUPP;
ret = iwl_mld_get_crosstimestamp_fw(mld, &gp2, &sys_time);
if (ret) {
@@ -270,7 +271,7 @@ iwl_mld_phc_get_crosstimestamp(struct ptp_clock_info *ptp,
/* System monotonic raw time is not used */
xtstamp->device = ns_to_ktime(gp2_ns);
- xtstamp->sys_realtime = sys_time;
+ xtstamp->sys_systime = sys_time;
return ret;
}
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c b/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c
index f7b620136c85..bcd6f7cead2a 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c
@@ -160,13 +160,14 @@ iwl_mvm_phc_get_crosstimestamp(struct ptp_clock_info *ptp,
/* System (wall) time */
ktime_t sys_time;
- memset(xtstamp, 0, sizeof(struct system_device_crosststamp));
-
if (!mvm->ptp_data.ptp_clock) {
IWL_ERR(mvm, "No PHC clock registered\n");
return -ENODEV;
}
+ if (xtstamp->clock_id != CLOCK_REALTIME)
+ return -ENOTSUPP;
+
mutex_lock(&mvm->mutex);
if (fw_has_capa(&mvm->fw->ucode_capa, IWL_UCODE_TLV_CAPA_SYNCED_TIME)) {
ret = iwl_mvm_get_crosstimestamp_fw(mvm, &gp2, &sys_time);
@@ -184,7 +185,7 @@ iwl_mvm_phc_get_crosstimestamp(struct ptp_clock_info *ptp,
/* System monotonic raw time is not used */
xtstamp->device = (ktime_t)gp2_ns;
- xtstamp->sys_realtime = sys_time;
+ xtstamp->sys_systime = sys_time;
out:
mutex_unlock(&mvm->mutex);
diff --git a/drivers/pps/generators/pps_gen-dummy.c b/drivers/pps/generators/pps_gen-dummy.c
index 547fa7fe29f4..a4395543c4bb 100644
--- a/drivers/pps/generators/pps_gen-dummy.c
+++ b/drivers/pps/generators/pps_gen-dummy.c
@@ -39,11 +39,7 @@ static void pps_gen_ktimer_event(struct timer_list *unused)
static int pps_gen_dummy_get_time(struct pps_gen_device *pps_gen,
struct timespec64 *time)
{
- struct system_time_snapshot snap;
-
- ktime_get_snapshot(&snap);
- *time = ktime_to_timespec64(snap.real);
-
+ ktime_get_real_ts64(time);
return 0;
}
diff --git a/drivers/pps/generators/pps_gen_tio.c b/drivers/pps/generators/pps_gen_tio.c
index de00a85bfafa..9483d126ada0 100644
--- a/drivers/pps/generators/pps_gen_tio.c
+++ b/drivers/pps/generators/pps_gen_tio.c
@@ -189,11 +189,7 @@ static int pps_tio_gen_enable(struct pps_gen_device *pps_gen, bool enable)
static int pps_tio_get_time(struct pps_gen_device *pps_gen,
struct timespec64 *time)
{
- struct system_time_snapshot snap;
-
- ktime_get_snapshot(&snap);
- *time = ktime_to_timespec64(snap.real);
-
+ ktime_get_real_ts64(time);
return 0;
}
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index c61cf9edac48..dc23cd708cfe 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -317,8 +317,8 @@ typedef int (*ptp_crosststamp_fn)(struct ptp_clock_info *,
static long ptp_sys_offset_precise(struct ptp_clock *ptp, void __user *arg,
ptp_crosststamp_fn crosststamp_fn)
{
+ struct system_device_crosststamp xtstamp = { .clock_id = CLOCK_REALTIME };
struct ptp_sys_offset_precise precise_offset;
- struct system_device_crosststamp xtstamp;
struct timespec64 ts;
int err;
@@ -333,7 +333,7 @@ static long ptp_sys_offset_precise(struct ptp_clock *ptp, void __user *arg,
ts = ktime_to_timespec64(xtstamp.device);
precise_offset.device.sec = ts.tv_sec;
precise_offset.device.nsec = ts.tv_nsec;
- ts = ktime_to_timespec64(xtstamp.sys_realtime);
+ ts = ktime_to_timespec64(xtstamp.sys_systime);
precise_offset.sys_realtime.sec = ts.tv_sec;
precise_offset.sys_realtime.nsec = ts.tv_nsec;
ts = ktime_to_timespec64(xtstamp.sys_monoraw);
@@ -386,15 +386,19 @@ static long ptp_sys_offset_extended(struct ptp_clock *ptp, void __user *arg,
return err;
/* Filter out disabled or unavailable clocks */
- if (sts.pre_ts.tv_sec < 0 || sts.post_ts.tv_sec < 0)
+ if (!sts.pre_sts.valid || !sts.post_sts.valid)
return -EINVAL;
- extoff->ts[i][0].sec = sts.pre_ts.tv_sec;
- extoff->ts[i][0].nsec = sts.pre_ts.tv_nsec;
extoff->ts[i][1].sec = ts.tv_sec;
extoff->ts[i][1].nsec = ts.tv_nsec;
- extoff->ts[i][2].sec = sts.post_ts.tv_sec;
- extoff->ts[i][2].nsec = sts.post_ts.tv_nsec;
+
+ ts = ktime_to_timespec64(sts.pre_sts.systime);
+ extoff->ts[i][0].sec = ts.tv_sec;
+ extoff->ts[i][0].nsec = ts.tv_nsec;
+
+ ts = ktime_to_timespec64(sts.post_sts.systime);
+ extoff->ts[i][2].sec = ts.tv_sec;
+ extoff->ts[i][2].nsec = ts.tv_nsec;
}
return copy_to_user(arg, extoff, sizeof(*extoff)) ? -EFAULT : 0;
diff --git a/drivers/ptp/ptp_ocp.c b/drivers/ptp/ptp_ocp.c
index beacc2ffb166..28b0302c6250 100644
--- a/drivers/ptp/ptp_ocp.c
+++ b/drivers/ptp/ptp_ocp.c
@@ -1491,11 +1491,8 @@ __ptp_ocp_gettime_locked(struct ptp_ocp *bp, struct timespec64 *ts,
}
ptp_read_system_postts(sts);
- if (sts && bp->ts_window_adjust) {
- s64 ns = timespec64_to_ns(&sts->post_ts);
-
- sts->post_ts = ns_to_timespec64(ns - bp->ts_window_adjust);
- }
+ if (sts && bp->ts_window_adjust)
+ sts->post_sts.systime -= bp->ts_window_adjust;
time_ns = ioread32(&bp->reg->time_ns);
time_sec = ioread32(&bp->reg->time_sec);
@@ -4595,8 +4592,8 @@ ptp_ocp_summary_show(struct seq_file *s, void *data)
struct timespec64 sys_ts;
s64 pre_ns, post_ns, ns;
- pre_ns = timespec64_to_ns(&sts.pre_ts);
- post_ns = timespec64_to_ns(&sts.post_ts);
+ pre_ns = ktime_to_ns(sts.pre_sts.systime);
+ post_ns = ktime_to_ns(sts.post_sts.systime);
ns = (pre_ns + post_ns) / 2;
ns += (s64)bp->utc_tai_offset * NSEC_PER_SEC;
sys_ts = ns_to_timespec64(ns);
diff --git a/drivers/ptp/ptp_vmclock.c b/drivers/ptp/ptp_vmclock.c
index 8b630eb916b5..eebdcd5ebc08 100644
--- a/drivers/ptp/ptp_vmclock.c
+++ b/drivers/ptp/ptp_vmclock.c
@@ -101,7 +101,6 @@ static int vmclock_get_crosststamp(struct vmclock_state *st,
struct timespec64 *tspec)
{
ktime_t deadline = ktime_add(ktime_get(), VMCLOCK_MAX_WAIT);
- struct system_time_snapshot systime_snapshot;
uint64_t cycle, delta, seq, frac_sec;
#ifdef CONFIG_X86
@@ -132,17 +131,19 @@ static int vmclock_get_crosststamp(struct vmclock_state *st,
* will be derived from the *same* counter value.
*
* If the system isn't using the same counter, then the value
- * from ktime_get_snapshot() will still be used as pre_ts, and
- * ptp_read_system_postts() is called to populate postts after
- * calling get_cycles().
- *
- * The conversion to timespec64 happens further down, outside
- * the seq_count loop.
+ * from ptp_read_system_prets() will still be used as pre_ts,
+ * and ptp_read_system_postts() is called to populate postts
+ * after calling get_cycles().
*/
if (sts) {
- ktime_get_snapshot(&systime_snapshot);
- if (systime_snapshot.cs_id == st->cs_id) {
- cycle = systime_snapshot.cycles;
+ ptp_read_system_prets(sts);
+ if (sts->pre_sts.cs_id == st->cs_id) {
+ cycle = sts->pre_sts.cycles;
+ sts->post_sts = sts->pre_sts;
+ } else if (sts->pre_sts.hw_csid == st->cs_id &&
+ sts->pre_sts.hw_cycles) {
+ cycle = sts->pre_sts.hw_cycles;
+ sts->post_sts = sts->pre_sts;
} else {
cycle = get_cycles();
ptp_read_system_postts(sts);
@@ -180,12 +181,6 @@ static int vmclock_get_crosststamp(struct vmclock_state *st,
system_counter->cs_id = st->cs_id;
}
- if (sts) {
- sts->pre_ts = ktime_to_timespec64(systime_snapshot.real);
- if (systime_snapshot.cs_id == st->cs_id)
- sts->post_ts = sts->pre_ts;
- }
-
return 0;
}
@@ -272,7 +267,7 @@ static int ptp_vmclock_getcrosststamp(struct ptp_clock_info *ptp,
if (ret == -ENODEV) {
struct system_time_snapshot systime_snapshot;
- ktime_get_snapshot(&systime_snapshot);
+ ktime_get_snapshot_id(CLOCK_REALTIME, &systime_snapshot);
if (systime_snapshot.cs_id == CSID_X86_TSC ||
systime_snapshot.cs_id == CSID_X86_KVM_CLK) {
diff --git a/drivers/virtio/virtio_rtc_ptp.c b/drivers/virtio/virtio_rtc_ptp.c
index f84599950cd4..ff8d834493dc 100644
--- a/drivers/virtio/virtio_rtc_ptp.c
+++ b/drivers/virtio/virtio_rtc_ptp.c
@@ -139,7 +139,7 @@ static int viortc_ptp_getcrosststamp(struct ptp_clock_info *ptp,
if (ret)
return ret;
- ktime_get_snapshot(&history_begin);
+ ktime_get_snapshot_id(xtstamp->clock_id, &history_begin);
if (history_begin.cs_id != cs_id)
return -EOPNOTSUPP;
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7c38190b10bf..6d9ddf1587a2 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -31,6 +31,21 @@ struct module;
#include <vdso/clocksource.h>
+/**
+ * struct clocksource_hw_snapshot - Snapshot for the underlying hardware counter of derived
+ * clocksources like kvmclock or Hyper-V scaled TSC
+ * @hw_cycles: The hardware counter value
+ * @hw_csid: Clocksource ID of the hardware counter
+ *
+ * Such clocksources must implement the read_snapshot() callback and fill in the
+ * hardware counter value, the clocksource ID of the hardware counter and derive
+ * the actual clocksource cycles from @hw_cycles to provide an atomic snapshot
+ */
+struct clocksource_hw_snapshot {
+ u64 hw_cycles;
+ enum clocksource_ids hw_csid;
+};
+
/**
* struct clocksource - hardware abstraction for a free running counter
* Provides mostly state-free accessors to the underlying hardware.
@@ -72,6 +87,14 @@ struct module;
* @flags: Flags describing special properties
* @base: Hardware abstraction for clock on which a clocksource
* is based
+ * @read_snapshot: Extended @read() function for clocksources such as
+ * kvmclock or the Hyper-V scaled TSC where the actual
+ * clocksource value for timekeeping is calculated from an
+ * underlying hardware counter. Returns the timekeeping
+ * relevant cycle value and stores the raw value of the
+ * underlying counter from which it was calculated
+ * including the clocksource ID of that counter in the
+ * clocksource hardware snapshot.
* @enable: Optional function to enable the clocksource
* @disable: Optional function to disable the clocksource
* @suspend: Optional suspend function for the clocksource
@@ -113,6 +136,7 @@ struct clocksource {
unsigned long flags;
struct clocksource_base *base;
+ u64 (*read_snapshot)(struct clocksource *cs, struct clocksource_hw_snapshot *chs);
int (*enable)(struct clocksource *cs);
void (*disable)(struct clocksource *cs);
void (*suspend)(struct clocksource *cs);
diff --git a/include/linux/pps_kernel.h b/include/linux/pps_kernel.h
index aab0aebb529e..9f088c9023b1 100644
--- a/include/linux/pps_kernel.h
+++ b/include/linux/pps_kernel.h
@@ -99,12 +99,14 @@ static inline void timespec_to_pps_ktime(struct pps_ktime *kt,
static inline void pps_get_ts(struct pps_event_time *ts)
{
+#ifdef CONFIG_NTP_PPS
struct system_time_snapshot snap;
- ktime_get_snapshot(&snap);
- ts->ts_real = ktime_to_timespec64(snap.real);
-#ifdef CONFIG_NTP_PPS
- ts->ts_raw = ktime_to_timespec64(snap.raw);
+ ktime_get_snapshot_id(CLOCK_REALTIME, &snap);
+ ts->ts_real = ktime_to_timespec64(snap.systime);
+ ts->ts_raw = ktime_to_timespec64(snap.monoraw);
+#else
+ ktime_get_real_ts64(&ts->ts_real);
#endif
}
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index 884364596dd3..36a27a910595 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -12,6 +12,7 @@
#include <linux/pps_kernel.h>
#include <linux/ptp_clock.h>
#include <linux/timecounter.h>
+#include <linux/timekeeping.h>
#include <linux/skbuff.h>
#define PTP_CLOCK_NAME_LEN 32
@@ -45,13 +46,13 @@ struct system_device_crosststamp;
/**
* struct ptp_system_timestamp - system time corresponding to a PHC timestamp
- * @pre_ts: system timestamp before capturing PHC
- * @post_ts: system timestamp after capturing PHC
- * @clockid: clock-base used for capturing the system timestamps
+ * @pre_sts: system time snapshot before capturing PHC
+ * @post_sts: system time snapshot after capturing PHC
+ * @clockid: clock-base used for capturing the system timestamps
*/
struct ptp_system_timestamp {
- struct timespec64 pre_ts;
- struct timespec64 post_ts;
+ struct system_time_snapshot pre_sts;
+ struct system_time_snapshot post_sts;
clockid_t clockid;
};
@@ -510,13 +511,13 @@ static inline ktime_t ptp_convert_timestamp(const ktime_t *hwtstamp,
static inline void ptp_read_system_prets(struct ptp_system_timestamp *sts)
{
if (sts)
- ktime_get_clock_ts64(sts->clockid, &sts->pre_ts);
+ ktime_get_snapshot_id(sts->clockid, &sts->pre_sts);
}
static inline void ptp_read_system_postts(struct ptp_system_timestamp *sts)
{
if (sts)
- ktime_get_clock_ts64(sts->clockid, &sts->post_ts);
+ ktime_get_snapshot_id(sts->clockid, &sts->post_sts);
}
#endif
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index aee2c1a46e47..984a866d293b 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -276,37 +276,30 @@ static inline bool ktime_get_aux_ts64(clockid_t id, struct timespec64 *kt) { ret
#endif
/**
- * struct system_time_snapshot - simultaneous raw/real time capture with
- * counter value
- * @cycles: Clocksource counter value to produce the system times
- * @real: Realtime system time
- * @boot: Boot time
- * @raw: Monotonic raw system time
- * @cs_id: Clocksource ID
+ * struct system_time_snapshot - Simultaneous time capture of CLOCK_MONOTONIC_RAW,
+ * a selected CLOCK_* and the clocksource counter value
+ * @cycles: Clocksource counter value to produce the system times
+ * @hw_cycles: For derived clocksources, the hardware counter value from
+ * which @cycles was derived
+ * @systime: The system time of the selected CLOCK ID
+ * @monoraw: Monotonic raw system time
+ * @cs_id: Clocksource ID
+ * @hw_csid: Clocksource ID of the underlying hardware counter for derived
+ * clocksources which implement the read_snapshot() callback.
* @clock_was_set_seq: The sequence number of clock-was-set events
* @cs_was_changed_seq: The sequence number of clocksource change events
+ * @valid: True if the snapshot is valid
*/
struct system_time_snapshot {
u64 cycles;
- ktime_t real;
- ktime_t boot;
- ktime_t raw;
+ u64 hw_cycles;
+ ktime_t systime;
+ ktime_t monoraw;
enum clocksource_ids cs_id;
+ enum clocksource_ids hw_csid;
unsigned int clock_was_set_seq;
u8 cs_was_changed_seq;
-};
-
-/**
- * struct system_device_crosststamp - system/device cross-timestamp
- * (synchronized capture)
- * @device: Device time
- * @sys_realtime: Realtime simultaneous with device time
- * @sys_monoraw: Monotonic raw simultaneous with device time
- */
-struct system_device_crosststamp {
- ktime_t device;
- ktime_t sys_realtime;
- ktime_t sys_monoraw;
+ u8 valid;
};
/**
@@ -325,6 +318,23 @@ struct system_counterval_t {
bool use_nsecs;
};
+/**
+ * struct system_device_crosststamp - system/device cross-timestamp
+ * (synchronized capture)
+ * @clock_id: System time Clock ID to capture
+ * @device: Device time
+ * @sys_counter: Clocksource counter value simultaneous with device time
+ * @sys_systime: System time for @clock_id
+ * @sys_monoraw: Monotonic raw simultaneous with device time
+ */
+struct system_device_crosststamp {
+ clockid_t clock_id;
+ ktime_t device;
+ struct system_counterval_t sys_counter;
+ ktime_t sys_systime;
+ ktime_t sys_monoraw;
+};
+
extern bool ktime_real_to_base_clock(ktime_t treal,
enum clocksource_ids base_id, u64 *cycles);
extern bool timekeeping_clocksource_has_base(enum clocksource_ids id);
@@ -341,9 +351,10 @@ extern int get_device_system_crosststamp(
struct system_device_crosststamp *xtstamp);
/*
- * Simultaneously snapshot realtime and monotonic raw clocks
+ * Simultaneously snapshot a given clock with MONOTONIC_RAW and the underlying
+ * clocksource counter value.
*/
-extern void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot);
+extern void ktime_get_snapshot_id(clockid_t clock_id, struct system_time_snapshot *systime_snapshot);
/*
* Persistent clock related interfaces
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c493a4010305..0d5b67f609bb 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -67,6 +67,7 @@ static inline bool tk_is_aux(const struct timekeeper *tk)
{
return tk->id >= TIMEKEEPER_AUX_FIRST && tk->id <= TIMEKEEPER_AUX_LAST;
}
+static inline struct tk_data *aux_get_tk_data(clockid_t id);
#else
static inline bool tk_get_aux_ts64(unsigned int tkid, struct timespec64 *ts)
{
@@ -77,6 +78,10 @@ static inline bool tk_is_aux(const struct timekeeper *tk)
{
return false;
}
+static inline struct tk_data *aux_get_tk_data(clockid_t id)
+{
+ return NULL;
+}
#endif
static inline void tk_update_aux_offs(struct timekeeper *tk, ktime_t offs)
@@ -315,6 +320,7 @@ static __always_inline u64 tk_clock_read(const struct tk_read_base *tkr)
return clock->read(clock);
}
+
static inline void clocksource_disable_inline_read(void) { }
static inline void clocksource_enable_inline_read(void) { }
#endif
@@ -1182,44 +1188,107 @@ noinstr time64_t __ktime_get_real_seconds(void)
return tk->xtime_sec;
}
-/**
- * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks with counter
- * @systime_snapshot: pointer to struct receiving the system time snapshot
- */
-void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
+static inline u64 tk_clock_read_snapshot(const struct tk_read_base *tkr,
+ struct clocksource_hw_snapshot *chs)
{
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct clocksource *clock = READ_ONCE(tkr->clock);
+
+ if (unlikely(clock->read_snapshot))
+ return clock->read_snapshot(clock, chs);
+
+ return clock->read(clock);
+}
+
+
+/**
+ * ktime_get_snapshot_id - Simultaneously snapshot a given clock ID with
+ * CLOCK_MONOTONIC_RAW and the underlying
+ * clocksource counter value.
+ * @clock_id: The clock ID to snapshot
+ * @systime_snapshot: Pointer to struct receiving the system time snapshot
+ */
+void ktime_get_snapshot_id(clockid_t clock_id, struct system_time_snapshot *systime_snapshot)
+{
+ ktime_t base_raw, base_sys, offs_sys, *offs, offs_zero = 0;
+ u64 nsec_raw, nsec_sys, now;
+ struct timekeeper *tk;
+ struct tk_data *tkd;
unsigned int seq;
- ktime_t base_raw;
- ktime_t base_real;
- ktime_t base_boot;
- u64 nsec_raw;
- u64 nsec_real;
- u64 now;
- WARN_ON_ONCE(timekeeping_suspended);
+ /* Invalidate the snapshot for all failure cases */
+ systime_snapshot->valid = false;
+
+ if (WARN_ON_ONCE(timekeeping_suspended))
+ return;
+
+ switch (clock_id) {
+ case CLOCK_REALTIME:
+ tkd = &tk_core;
+ offs = &tk_core.timekeeper.offs_real;
+ break;
+ /* Map RAW to MONOTONIC so the loop below is trivial */
+ case CLOCK_MONOTONIC_RAW:
+ case CLOCK_MONOTONIC:
+ tkd = &tk_core;
+ offs = &offs_zero;
+ break;
+ case CLOCK_BOOTTIME:
+ tkd = &tk_core;
+ offs = &tk_core.timekeeper.offs_boot;
+ break;
+ case CLOCK_AUX ... CLOCK_AUX_LAST:
+ tkd = aux_get_tk_data(clock_id);
+ if (!tkd)
+ return;
+ offs = &tkd->timekeeper.offs_aux;
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ return;
+ }
+
+ tk = &tkd->timekeeper;
do {
- seq = read_seqcount_begin(&tk_core.seq);
- now = tk_clock_read(&tk->tkr_mono);
+ struct clocksource_hw_snapshot chs = { };
+
+ seq = read_seqcount_begin(&tkd->seq);
+
+ /* Aux clocks can be invalid */
+ if (!tk->clock_valid)
+ return;
+
+ now = tk_clock_read_snapshot(&tk->tkr_mono, &chs);
systime_snapshot->cs_id = tk->tkr_mono.clock->id;
+
+ systime_snapshot->hw_cycles = chs.hw_cycles;
+ systime_snapshot->hw_csid = chs.hw_csid;
+
systime_snapshot->cs_was_changed_seq = tk->cs_was_changed_seq;
systime_snapshot->clock_was_set_seq = tk->clock_was_set_seq;
- base_real = ktime_add(tk->tkr_mono.base,
- tk_core.timekeeper.offs_real);
- base_boot = ktime_add(tk->tkr_mono.base,
- tk_core.timekeeper.offs_boot);
+
+ base_sys = tk->tkr_mono.base;
+ offs_sys = *offs;
base_raw = tk->tkr_raw.base;
- nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
- nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
- } while (read_seqcount_retry(&tk_core.seq, seq));
+
+ nsec_sys = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
+ nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
+ } while (read_seqcount_retry(&tkd->seq, seq));
systime_snapshot->cycles = now;
- systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
- systime_snapshot->boot = ktime_add_ns(base_boot, nsec_real);
- systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
+ systime_snapshot->systime = ktime_add_ns(base_sys, offs_sys + nsec_sys);
+ systime_snapshot->monoraw = ktime_add_ns(base_raw, nsec_raw);
+
+ /*
+ * Special case for PTP. Just transfer the raw time into sys,
+ * so the call sites can consistently use snap::systime.
+ */
+ if (clock_id == CLOCK_MONOTONIC_RAW)
+ systime_snapshot->systime = systime_snapshot->monoraw;
+ /* Tell the consumer that this snapshot is valid */
+ systime_snapshot->valid = true;
}
-EXPORT_SYMBOL_GPL(ktime_get_snapshot);
+EXPORT_SYMBOL_GPL(ktime_get_snapshot_id);
/* Scale base by mult/div checking for overflow */
static int scale64_check_overflow(u64 mult, u64 div, u64 *base)
@@ -1262,7 +1331,7 @@ static int adjust_historical_crosststamp(struct system_time_snapshot *history,
struct system_device_crosststamp *ts)
{
struct timekeeper *tk = &tk_core.timekeeper;
- u64 corr_raw, corr_real;
+ u64 corr_raw, corr_sys;
bool interp_forward;
int ret;
@@ -1279,8 +1348,7 @@ static int adjust_historical_crosststamp(struct system_time_snapshot *history,
* Scale the monotonic raw time delta by:
* partial_history_cycles / total_history_cycles
*/
- corr_raw = (u64)ktime_to_ns(
- ktime_sub(ts->sys_monoraw, history->raw));
+ corr_raw = (u64)ktime_to_ns(ktime_sub(ts->sys_monoraw, history->monoraw));
ret = scale64_check_overflow(partial_history_cycles,
total_history_cycles, &corr_raw);
if (ret)
@@ -1288,30 +1356,29 @@ static int adjust_historical_crosststamp(struct system_time_snapshot *history,
/*
* If there is a discontinuity in the history, scale monotonic raw
- * correction by:
- * mult(real)/mult(raw) yielding the realtime correction
- * Otherwise, calculate the realtime correction similar to monotonic
- * raw calculation
+ * correction by:
+ * mult(sys)/mult(raw) yielding the system time correction
+ *
+ * Otherwise, calculate the system time correction similar to monotonic
+ * raw calculation
*/
if (discontinuity) {
- corr_real = mul_u64_u32_div
- (corr_raw, tk->tkr_mono.mult, tk->tkr_raw.mult);
+ corr_sys = mul_u64_u32_div(corr_raw, tk->tkr_mono.mult, tk->tkr_raw.mult);
} else {
- corr_real = (u64)ktime_to_ns(
- ktime_sub(ts->sys_realtime, history->real));
- ret = scale64_check_overflow(partial_history_cycles,
- total_history_cycles, &corr_real);
+ corr_sys = (u64)ktime_to_ns(ktime_sub(ts->sys_systime, history->systime));
+ ret = scale64_check_overflow(partial_history_cycles, total_history_cycles,
+ &corr_sys);
if (ret)
return ret;
}
- /* Fixup monotonic raw and real time time values */
+ /* Fixup monotonic raw and system time values */
if (interp_forward) {
- ts->sys_monoraw = ktime_add_ns(history->raw, corr_raw);
- ts->sys_realtime = ktime_add_ns(history->real, corr_real);
+ ts->sys_monoraw = ktime_add_ns(history->monoraw, corr_raw);
+ ts->sys_systime = ktime_add_ns(history->systime, corr_sys);
} else {
ts->sys_monoraw = ktime_sub_ns(ts->sys_monoraw, corr_raw);
- ts->sys_realtime = ktime_sub_ns(ts->sys_realtime, corr_real);
+ ts->sys_systime = ktime_sub_ns(ts->sys_systime, corr_sys);
}
return 0;
@@ -1368,6 +1435,8 @@ static bool convert_base_to_cs(struct system_counterval_t *scv)
return false;
scv->cycles += base->offset;
+ /* Set the clocksource ID as scv::cycles is now clocksource based */
+ scv->cs_id = cs->id;
return true;
}
@@ -1435,11 +1504,11 @@ EXPORT_SYMBOL_GPL(ktime_real_to_base_clock);
/**
* get_device_system_crosststamp - Synchronously capture system/device timestamp
- * @get_time_fn: Callback to get simultaneous device time and
- * system counter from the device driver
+ * @get_time_fn: Callback to get simultaneous device time and system counter
+ * from the device driver
* @ctx: Context passed to get_time_fn()
- * @history_begin: Historical reference point used to interpolate system
- * time when counter provided by the driver is before the current interval
+ * @history_begin: Historical reference point used to interpolate system time when
+ * the counter value provided by the driver is before the current interval
* @xtstamp: Receives simultaneously captured system and device time
*
* Reads a timestamp from a device and correlates it to system time
@@ -1452,36 +1521,54 @@ int get_device_system_crosststamp(int (*get_time_fn)
struct system_time_snapshot *history_begin,
struct system_device_crosststamp *xtstamp)
{
- struct system_counterval_t system_counterval = {};
- struct timekeeper *tk = &tk_core.timekeeper;
- u64 cycles, now, interval_start;
- unsigned int clock_was_set_seq = 0;
- ktime_t base_real, base_raw;
- u64 nsec_real, nsec_raw;
+ u64 syscnt_cycles, cycles, now, interval_start;
+ unsigned int seq, clock_was_set_seq = 0;
+ ktime_t base_sys, base_raw, *offs;
+ u64 nsec_sys, nsec_raw;
u8 cs_was_changed_seq;
- unsigned int seq;
bool do_interp;
+ struct timekeeper *tk;
+ struct tk_data *tkd;
int ret;
+ switch (xtstamp->clock_id) {
+ case CLOCK_REALTIME:
+ tkd = &tk_core;
+ offs = &tk_core.timekeeper.offs_real;
+ break;
+ case CLOCK_AUX ... CLOCK_AUX_LAST:
+ tkd = aux_get_tk_data(xtstamp->clock_id);
+ if (!tkd)
+ return -ENODEV;
+ offs = &tkd->timekeeper.offs_aux;
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ return -ENODEV;
+ }
+
+ tk = &tkd->timekeeper;
+
do {
- seq = read_seqcount_begin(&tk_core.seq);
+ seq = read_seqcount_begin(&tkd->seq);
/*
* Try to synchronously capture device time and a system
* counter value calling back into the device driver
*/
- ret = get_time_fn(&xtstamp->device, &system_counterval, ctx);
+ ret = get_time_fn(&xtstamp->device, &xtstamp->sys_counter, ctx);
if (ret)
return ret;
/*
* Verify that the clocksource ID associated with the captured
* system counter value is the same as for the currently
- * installed timekeeper clocksource
+ * installed timekeeper clocksource and convert to it.
*/
- if (system_counterval.cs_id == CSID_GENERIC ||
- !convert_base_to_cs(&system_counterval))
+ if (xtstamp->sys_counter.cs_id == CSID_GENERIC ||
+ !convert_base_to_cs(&xtstamp->sys_counter))
return -ENODEV;
- cycles = system_counterval.cycles;
+
+ cycles = syscnt_cycles = xtstamp->sys_counter.cycles;
/*
* Check whether the system counter value provided by the
@@ -1498,15 +1585,14 @@ int get_device_system_crosststamp(int (*get_time_fn)
do_interp = false;
}
- base_real = ktime_add(tk->tkr_mono.base,
- tk_core.timekeeper.offs_real);
+ base_sys = ktime_add(tk->tkr_mono.base, *offs);
base_raw = tk->tkr_raw.base;
- nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, cycles);
+ nsec_sys = timekeeping_cycles_to_ns(&tk->tkr_mono, cycles);
nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, cycles);
- } while (read_seqcount_retry(&tk_core.seq, seq));
+ } while (read_seqcount_retry(&tkd->seq, seq));
- xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
+ xtstamp->sys_systime = ktime_add_ns(base_sys, nsec_sys);
xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
/*
@@ -1523,24 +1609,19 @@ int get_device_system_crosststamp(int (*get_time_fn)
* clocksource change
*/
if (!history_begin ||
- !timestamp_in_interval(history_begin->cycles,
- cycles, system_counterval.cycles) ||
+ !timestamp_in_interval(history_begin->cycles, cycles, syscnt_cycles) ||
history_begin->cs_was_changed_seq != cs_was_changed_seq)
return -EINVAL;
- partial_history_cycles = cycles - system_counterval.cycles;
+
+ partial_history_cycles = cycles - syscnt_cycles;
total_history_cycles = cycles - history_begin->cycles;
- discontinuity =
- history_begin->clock_was_set_seq != clock_was_set_seq;
+ discontinuity = history_begin->clock_was_set_seq != clock_was_set_seq;
- ret = adjust_historical_crosststamp(history_begin,
- partial_history_cycles,
- total_history_cycles,
- discontinuity, xtstamp);
- if (ret)
- return ret;
+ ret = adjust_historical_crosststamp(history_begin, partial_history_cycles,
+ total_history_cycles, discontinuity, xtstamp);
}
- return 0;
+ return ret;
}
EXPORT_SYMBOL_GPL(get_device_system_crosststamp);
diff --git a/sound/hda/common/controller.c b/sound/hda/common/controller.c
index 5934e5cdfdfd..77a67fb9eaf9 100644
--- a/sound/hda/common/controller.c
+++ b/sound/hda/common/controller.c
@@ -489,9 +489,9 @@ static int azx_get_time_info(struct snd_pcm_substream *substream,
struct snd_pcm_audio_tstamp_config *audio_tstamp_config,
struct snd_pcm_audio_tstamp_report *audio_tstamp_report)
{
+ struct system_device_crosststamp xtstamp = { .clock_id = CLOCK_REALTIME };
struct azx_dev *azx_dev = get_azx_dev(substream);
struct snd_pcm_runtime *runtime = substream->runtime;
- struct system_device_crosststamp xtstamp;
int ret;
u64 nsec;
@@ -525,7 +525,7 @@ static int azx_get_time_info(struct snd_pcm_substream *substream,
break;
default:
- *system_ts = ktime_to_timespec64(xtstamp.sys_realtime);
+ *system_ts = ktime_to_timespec64(xtstamp.sys_systime);
break;
}
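For readers following the adjust_historical_crosststamp() changes above, the interpolation can be sketched in userspace C as below. This is a simplified model, not the kernel code: struct snap_model, struct xtstamp_model, scale64() and interpolate() are hypothetical names, the discontinuity branch is left out, and the forward/backward selection rule is an assumption because that part of the function is not shown in the diff. The overflow checks rely on GCC/Clang builtins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used here. */
struct snap_model {
	int64_t systime;	/* ns, time of the requested clock ID */
	int64_t monoraw;	/* ns, CLOCK_MONOTONIC_RAW */
};

struct xtstamp_model {
	int64_t sys_systime;	/* ns, initially taken at the current interval start */
	int64_t sys_monoraw;	/* ns, initially taken at the current interval start */
};

/* Scale *base by mult/div; fail on division by zero or overflow. */
static int scale64(uint64_t mult, uint64_t div, uint64_t *base)
{
	uint64_t quot, rem, hi, lo;

	if (div == 0)
		return -1;
	quot = *base / div;
	rem = *base % div;
	if (__builtin_mul_overflow(quot, mult, &hi) ||
	    __builtin_mul_overflow(rem, mult, &lo) ||
	    __builtin_add_overflow(hi, lo / div, base))
		return -1;
	return 0;
}

/*
 * Scale both time deltas by partial_cycles / total_cycles and move the
 * cross timestamp back (or forward from the history snapshot) accordingly.
 */
static int interpolate(const struct snap_model *history, uint64_t partial_cycles,
		       uint64_t total_cycles, struct xtstamp_model *ts)
{
	uint64_t corr_raw, corr_sys;
	bool forward;

	if (total_cycles == 0)
		return 0;

	/* Assumption: interpolate from whichever end of the history is closer */
	forward = partial_cycles > total_cycles / 2;
	if (forward)
		partial_cycles = total_cycles - partial_cycles;

	corr_raw = (uint64_t)(ts->sys_monoraw - history->monoraw);
	corr_sys = (uint64_t)(ts->sys_systime - history->systime);
	if (scale64(partial_cycles, total_cycles, &corr_raw) ||
	    scale64(partial_cycles, total_cycles, &corr_sys))
		return -1;

	if (forward) {
		ts->sys_monoraw = history->monoraw + (int64_t)corr_raw;
		ts->sys_systime = history->systime + (int64_t)corr_sys;
	} else {
		ts->sys_monoraw -= (int64_t)corr_raw;
		ts->sys_systime -= (int64_t)corr_sys;
	}
	return 0;
}
```

With a history snapshot 1000 cycles before the interval start and a device counter value 250 cycles before it, both timestamps move back by a quarter of the elapsed time. Because the sketch only uses the systime/monoraw pair, the same arithmetic applies whichever clock ID the snapshot was taken for.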
* [GIT pull] timers/vdso for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
` (7 preceding siblings ...)
2026-06-13 21:25 ` [GIT pull] timers/ptp " Thomas Gleixner
@ 2026-06-13 21:25 ` Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
2026-06-15 8:51 ` [GIT pull] core/rseq " pr-tracker-bot
9 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2026-06-13 21:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest timers/vdso branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-vdso-2026-06-13
up to: 8d563bd79047: MIPS: VDSO: Fold MIPS_CLOCK_VSYSCALL into MIPS_GENERIC_GETTIMEOFDAY
A series of updates for the VDSO:
- Remove the redundant CONFIG_GENERIC_TIME_VSYSCALL after converting the
remaining users over.
- Rework and sanitize the MIPS VDSO handling, so that the time-related VDSO
functions are not provided when no VDSO-capable clocksource is available.
Also stop mapping the VDSO data pages unconditionally when they cannot be
used.
Thanks,
tglx
------------------>
Thomas Weißschuh (13):
riscv: vdso: Drop CONFIG_GENERIC_TIME_VSYSCALL guard around syscall fallbacks
vdso/vsyscall: Gate update_vsyscall() behind CONFIG_GENERIC_GETTIMEOFDAY
vdso/treewide: Drop GENERIC_TIME_VSYSCALL
vdso/gettimeofday: Rename __arch_get_vdso_u_timens_data()
MAINTAINERS: Add include/linux/vdso_datastore.h to vDSO block
vdso/datastore: Always provide symbol declarations
MIPS: Introduce Kconfig MIPS_GENERIC_GETTIMEOFDAY
MIPS: VDSO: Only map the data pages when the vDSO is used
MIPS: csrc-r4k: Only use VDSO_CLOCKMODE_R4K when it is a available
clocksource/drivers/mips-gic-timer: Only use VDSO_CLOCKMODE_GIC when it is a available
MIPS: VDSO: Fold MIPS_DISABLE_VDSO into MIPS_GENERIC_GETTIMEOFDAY
MIPS: VDSO: Gate microMIPS restriction on GCC version
MIPS: VDSO: Fold MIPS_CLOCK_VSYSCALL into MIPS_GENERIC_GETTIMEOFDAY
MAINTAINERS | 1 +
arch/arm/mm/Kconfig | 1 -
arch/arm64/Kconfig | 1 -
arch/loongarch/Kconfig | 1 -
arch/mips/Kconfig | 18 ++++++++++--------
arch/mips/kernel/csrc-r4k.c | 2 ++
arch/mips/kernel/vdso.c | 12 +++++++-----
arch/mips/vdso/Kconfig | 6 ------
arch/mips/vdso/Makefile | 7 ++-----
arch/mips/vdso/vdso.lds.S | 4 +---
arch/mips/vdso/vgettimeofday.c | 20 --------------------
arch/powerpc/Kconfig | 1 -
arch/riscv/Kconfig | 1 -
arch/riscv/include/asm/vdso/gettimeofday.h | 8 --------
arch/s390/Kconfig | 1 -
arch/sparc/Kconfig | 1 -
arch/x86/Kconfig | 1 -
drivers/clocksource/mips-gic-timer.c | 2 ++
include/linux/timekeeper_internal.h | 2 +-
include/linux/vdso_datastore.h | 2 +-
kernel/time/Kconfig | 4 ----
lib/vdso/gettimeofday.c | 14 +++++++-------
22 files changed, 34 insertions(+), 76 deletions(-)
delete mode 100644 arch/mips/vdso/Kconfig
diff --git a/MAINTAINERS b/MAINTAINERS
index 9ec290e38b44..65b2336f4fae 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10870,6 +10870,7 @@ L: linux-kernel@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/vdso
F: include/asm-generic/vdso/vsyscall.h
+F: include/linux/vdso_datastore.h
F: include/vdso/
F: kernel/time/namespace_vdso.c
F: kernel/time/vsyscall.c
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 7b27ee9482b3..871bd58d2ccc 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -925,7 +925,6 @@ config VDSO
depends on AEABI && MMU && CPU_V7
default y if ARM_ARCH_TIMER
select HAVE_GENERIC_VDSO
- select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
help
Place in the process address space an ELF shared object
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943..7e331b4f480a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -140,7 +140,6 @@ config ARM64
select GENERIC_PCI_IOMAP
select GENERIC_SCHED_CLOCK
select GENERIC_SMP_IDLE_THREAD
- select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
select HARDIRQS_SW_RESEND
select HAS_IOPORT
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 606597da46b8..3f69c5d7e48e 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -110,7 +110,6 @@ config LOONGARCH
select GENERIC_PCI_IOMAP
select GENERIC_SCHED_CLOCK
select GENERIC_SMP_IDLE_THREAD
- select GENERIC_TIME_VSYSCALL if GENERIC_GETTIMEOFDAY
select GPIOLIB
select HAS_IOPORT
select HAVE_ALIGNED_STRUCT_PAGE if 64BIT
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 4364f3dba688..323ca084e79a 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -38,7 +38,6 @@ config MIPS
select GENERIC_BUILTIN_DTB if BUILTIN_DTB
select GENERIC_CMOS_UPDATE
select GENERIC_CPU_AUTOPROBE
- select GENERIC_GETTIMEOFDAY
select GENERIC_IRQ_PROBE
select GENERIC_IRQ_SHOW
select GENERIC_ISA_DMA if EISA
@@ -51,7 +50,6 @@ config MIPS
select GENERIC_SCHED_CLOCK if !CAVIUM_OCTEON_SOC
select GENERIC_SMP_IDLE_THREAD
select GENERIC_IDLE_POLL_SETUP
- select GENERIC_TIME_VSYSCALL
select GUP_GET_PXX_LOW_HIGH if CPU_MIPS32 && PHYS_ADDR_T_64BIT
select HAS_IOPORT if !NO_IOPORT_MAP || ISA
select HAVE_ARCH_COMPILER_H
@@ -76,7 +74,6 @@ config MIPS
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS
- select HAVE_GENERIC_VDSO
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK
select HAVE_IRQ_TIME_ACCOUNTING
@@ -1136,9 +1133,6 @@ config CSRC_R4K
config CSRC_SB1250
bool
-config MIPS_CLOCK_VSYSCALL
- def_bool CSRC_R4K || CLKSRC_MIPS_GIC
-
config GPIO_TXX9
select GPIOLIB
bool
@@ -3170,6 +3164,16 @@ endmenu
config MIPS_EXTERNAL_TIMER
bool
+config MIPS_GENERIC_GETTIMEOFDAY
+ def_bool y
+ select GENERIC_GETTIMEOFDAY
+ select HAVE_GENERIC_VDSO
+ depends on CSRC_R4K || CLKSRC_MIPS_GIC
+ # GCC (at least up to version 9.2) appears to emit function calls that make use
+ # of the GOT when targeting microMIPS, which we can't use in the VDSO due to
+ # the lack of relocations. As such, we disable the VDSO for microMIPS builds.
+ depends on !(CPU_MICROMIPS && CC_IS_GCC && GCC_VERSION < 90300)
+
menu "CPU Power Management"
if CPU_SUPPORTS_CPUFREQ && MIPS_EXTERNAL_TIMER
@@ -3181,5 +3185,3 @@ source "drivers/cpuidle/Kconfig"
endmenu
source "arch/mips/kvm/Kconfig"
-
-source "arch/mips/vdso/Kconfig"
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index 59eca397f297..241a934543a8 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -126,12 +126,14 @@ int __init init_r4k_clocksource(void)
clocksource_mips.rating = 200;
clocksource_mips.rating += clamp(mips_hpt_frequency / 10000000, 0, 99);
+#ifdef CONFIG_GENERIC_GETTIMEOFDAY
/*
* R2 onwards makes the count accessible to user mode so it can be used
* by the VDSO (HWREna is configured by configure_hwrena()).
*/
if (cpu_has_mips_r2_r6 && rdhwr_count_usable())
clocksource_mips.vdso_clock_mode = VDSO_CLOCKMODE_R4K;
+#endif
clocksource_register_hz(&clocksource_mips, mips_hpt_frequency);
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 2fa4df3e46e4..bd1fc17d3975 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -129,7 +129,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
* This ensures that when the kernel updates the VDSO data userland
* will observe it without requiring cache invalidations.
*/
- if (cpu_has_dc_aliases) {
+ if (cpu_has_dc_aliases && IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO)) {
base = __ALIGN_MASK(base, shm_align_mask);
base += ((unsigned long)vdso_k_time_data - gic_size) & shm_align_mask;
}
@@ -137,10 +137,12 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
data_addr = base + gic_size;
vdso_addr = data_addr + VDSO_NR_PAGES * PAGE_SIZE;
- vma = vdso_install_vvar_mapping(mm, data_addr);
- if (IS_ERR(vma)) {
- ret = PTR_ERR(vma);
- goto out;
+ if (IS_ENABLED(CONFIG_HAVE_GENERIC_VDSO)) {
+ vma = vdso_install_vvar_mapping(mm, data_addr);
+ if (IS_ERR(vma)) {
+ ret = PTR_ERR(vma);
+ goto out;
+ }
}
/* Map GIC user page. */
diff --git a/arch/mips/vdso/Kconfig b/arch/mips/vdso/Kconfig
deleted file mode 100644
index 70140248da72..000000000000
--- a/arch/mips/vdso/Kconfig
+++ /dev/null
@@ -1,6 +0,0 @@
-# GCC (at least up to version 9.2) appears to emit function calls that make use
-# of the GOT when targeting microMIPS, which we can't use in the VDSO due to
-# the lack of relocations. As such, we disable the VDSO for microMIPS builds.
-
-config MIPS_DISABLE_VDSO
- def_bool CPU_MICROMIPS
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 69d4593f64fe..00d3ba2c482a 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -4,7 +4,7 @@
# Include the generic Makefile to check the built vdso.
include $(srctree)/lib/vdso/Makefile.include
-obj-vdso-y := elf.o vgettimeofday.o sigreturn.o
+obj-vdso-y := elf.o sigreturn.o
# Common compiler flags between ABIs.
ccflags-vdso := \
@@ -36,6 +36,7 @@ aflags-vdso := $(ccflags-vdso) \
-D__ASSEMBLY__ -Wa,-gdwarf-2
ifneq ($(c-gettimeofday-y),)
+obj-vdso-y += vgettimeofday.o
CFLAGS_vgettimeofday.o = -include $(c-gettimeofday-y)
# config-n32-o32-env.c prepares the environment to build a 32bit vDSO
@@ -47,10 +48,6 @@ endif
CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE)
-ifdef CONFIG_MIPS_DISABLE_VDSO
- obj-vdso-y := $(filter-out vgettimeofday.o, $(obj-vdso-y))
-endif
-
# VDSO linker flags.
ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
$(filter -E%,$(KBUILD_CFLAGS)) -shared \
diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S
index 5d08be3a6b85..05badf3ae0ff 100644
--- a/arch/mips/vdso/vdso.lds.S
+++ b/arch/mips/vdso/vdso.lds.S
@@ -94,12 +94,10 @@ PHDRS
VERSION
{
LINUX_2.6 {
-#ifndef CONFIG_MIPS_DISABLE_VDSO
+#ifdef CONFIG_GENERIC_GETTIMEOFDAY
global:
__vdso_clock_gettime;
-#ifdef CONFIG_MIPS_CLOCK_VSYSCALL
__vdso_gettimeofday;
-#endif
__vdso_clock_getres;
#if _MIPS_SIM != _MIPS_SIM_ABI64
__vdso_clock_gettime64;
diff --git a/arch/mips/vdso/vgettimeofday.c b/arch/mips/vdso/vgettimeofday.c
index 1d236215e8f6..00f9fcfc327e 100644
--- a/arch/mips/vdso/vgettimeofday.c
+++ b/arch/mips/vdso/vgettimeofday.c
@@ -18,22 +18,12 @@ int __vdso_clock_gettime(clockid_t clock,
return __cvdso_clock_gettime32(clock, ts);
}
-#ifdef CONFIG_MIPS_CLOCK_VSYSCALL
-
-/*
- * This is behind the ifdef so that we don't provide the symbol when there's no
- * possibility of there being a usable clocksource, because there's nothing we
- * can do without it. When libc fails the symbol lookup it should fall back on
- * the standard syscall path.
- */
int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
struct timezone *tz)
{
return __cvdso_gettimeofday(tv, tz);
}
-#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */
-
int __vdso_clock_getres(clockid_t clock_id,
struct old_timespec32 *res)
{
@@ -59,22 +49,12 @@ int __vdso_clock_gettime(clockid_t clock,
return __cvdso_clock_gettime(clock, ts);
}
-#ifdef CONFIG_MIPS_CLOCK_VSYSCALL
-
-/*
- * This is behind the ifdef so that we don't provide the symbol when there's no
- * possibility of there being a usable clocksource, because there's nothing we
- * can do without it. When libc fails the symbol lookup it should fall back on
- * the standard syscall path.
- */
int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
struct timezone *tz)
{
return __cvdso_gettimeofday(tv, tz);
}
-#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */
-
int __vdso_clock_getres(clockid_t clock_id,
struct __kernel_timespec *res)
{
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e93df95b79e7..c99fd8335ddc 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -213,7 +213,6 @@ config PPC
select GENERIC_IRQ_SHOW_LEVEL
select GENERIC_PCI_IOMAP if PCI
select GENERIC_SMP_IDLE_THREAD
- select GENERIC_TIME_VSYSCALL
select HAS_IOPORT if PCI
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c5754942cf85..195ebc21049a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -123,7 +123,6 @@ config RISCV
select GENERIC_PCI_IOMAP
select GENERIC_SCHED_CLOCK
select GENERIC_SMP_IDLE_THREAD
- select GENERIC_TIME_VSYSCALL if GENERIC_GETTIMEOFDAY
select HARDIRQS_SW_RESEND
select HAS_IOPORT if MMU
select HAVE_ALIGNED_STRUCT_PAGE
diff --git a/arch/riscv/include/asm/vdso/gettimeofday.h b/arch/riscv/include/asm/vdso/gettimeofday.h
index 9ec08fa04d35..61cb3cbab143 100644
--- a/arch/riscv/include/asm/vdso/gettimeofday.h
+++ b/arch/riscv/include/asm/vdso/gettimeofday.h
@@ -9,12 +9,6 @@
#include <asm/csr.h>
#include <uapi/linux/time.h>
-/*
- * 32-bit land is lacking generic time vsyscalls as well as the legacy 32-bit
- * time syscalls like gettimeofday. Skip these definitions since on 32-bit.
- */
-#ifdef CONFIG_GENERIC_TIME_VSYSCALL
-
#define VDSO_HAS_CLOCK_GETRES 1
static __always_inline
@@ -66,8 +60,6 @@ int clock_getres_fallback(clockid_t _clkid, struct __kernel_timespec *_ts)
return ret;
}
-#endif /* CONFIG_GENERIC_TIME_VSYSCALL */
-
static __always_inline u64 __arch_get_hw_counter(s32 clock_mode,
const struct vdso_time_data *vd)
{
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index ecbcbb781e40..2a5e78465fb8 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -177,7 +177,6 @@ config S390
select GENERIC_ENTRY
select GENERIC_GETTIMEOFDAY
select GENERIC_SMP_IDLE_THREAD
- select GENERIC_TIME_VSYSCALL
select GENERIC_IOREMAP if PCI
select HAVE_ALIGNED_STRUCT_PAGE
select HAVE_ARCH_AUDITSYSCALL
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a6b787efc2c4..f83d5065c3cf 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -103,7 +103,6 @@ config SPARC64
select HAVE_REGS_AND_STACK_ACCESS_API
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
- select GENERIC_TIME_VSYSCALL
select ARCH_HAS_PTE_SPECIAL
select PCI_DOMAINS if PCI
select ARCH_HAS_GIGANTIC_PAGE
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3f7cb01d69d..43d8105068f4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -180,7 +180,6 @@ config X86
select GENERIC_IRQ_SHOW
select GENERIC_PENDING_IRQ if SMP
select GENERIC_SMP_IDLE_THREAD
- select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
select GENERIC_VDSO_OVERFLOW_PROTECT
select GUP_GET_PXX_LOW_HIGH if X86_PAE
diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index 1501c7db9a8e..a1669266c94d 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -198,7 +198,9 @@ static struct clocksource gic_clocksource = {
.name = "GIC",
.read = gic_hpt_read,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
+#ifdef CONFIG_GENERIC_GETTIMEOFDAY
.vdso_clock_mode = VDSO_CLOCKMODE_GIC,
+#endif
};
static void gic_clocksource_unstable(char *reason)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index e36d11e33e0c..4486dfd5d0de 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -190,7 +190,7 @@ struct timekeeper {
s32 tai_offset;
};
-#ifdef CONFIG_GENERIC_TIME_VSYSCALL
+#ifdef CONFIG_GENERIC_GETTIMEOFDAY
extern void update_vsyscall(struct timekeeper *tk);
extern void update_vsyscall_tz(void);
diff --git a/include/linux/vdso_datastore.h b/include/linux/vdso_datastore.h
index 0b530428db71..3dfba9502d78 100644
--- a/include/linux/vdso_datastore.h
+++ b/include/linux/vdso_datastore.h
@@ -2,12 +2,12 @@
#ifndef _LINUX_VDSO_DATASTORE_H
#define _LINUX_VDSO_DATASTORE_H
-#ifdef CONFIG_HAVE_GENERIC_VDSO
#include <linux/mm_types.h>
extern const struct vm_special_mapping vdso_vvar_mapping;
struct vm_area_struct *vdso_install_vvar_mapping(struct mm_struct *mm, unsigned long addr);
+#ifdef CONFIG_HAVE_GENERIC_VDSO
void __init vdso_setup_data_pages(void);
#else /* !CONFIG_HAVE_GENERIC_VDSO */
static inline void vdso_setup_data_pages(void) { }
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 02aac7c5aa76..d098ac39bde4 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -16,10 +16,6 @@ config ARCH_CLOCKSOURCE_INIT
config ARCH_WANTS_CLOCKSOURCE_READ_INLINE
bool
-# Timekeeping vsyscall support
-config GENERIC_TIME_VSYSCALL
- bool
-
# The generic clock events infrastructure
config GENERIC_CLOCKEVENTS
def_bool !LEGACY_TIMER_TICK
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index da224011fafd..e0f289d3d110 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -126,7 +126,7 @@ bool vdso_get_timestamp(const struct vdso_time_data *vd, const struct vdso_clock
}
static __always_inline
-const struct vdso_time_data *__arch_get_vdso_u_timens_data(const struct vdso_time_data *vd)
+const struct vdso_time_data *vdso_timens_data(const struct vdso_time_data *vd)
{
return (void *)vd + PAGE_SIZE;
}
@@ -135,7 +135,7 @@ static __always_inline
bool do_hres_timens(const struct vdso_time_data *vdns, const struct vdso_clock *vcns,
clockid_t clk, struct __kernel_timespec *ts)
{
- const struct vdso_time_data *vd = __arch_get_vdso_u_timens_data(vdns);
+ const struct vdso_time_data *vd = vdso_timens_data(vdns);
const struct timens_offset *offs = &vcns->offset[clk];
const struct vdso_clock *vc = vd->clock_data;
u32 seq;
@@ -191,7 +191,7 @@ static __always_inline
bool do_coarse_timens(const struct vdso_time_data *vdns, const struct vdso_clock *vcns,
clockid_t clk, struct __kernel_timespec *ts)
{
- const struct vdso_time_data *vd = __arch_get_vdso_u_timens_data(vdns);
+ const struct vdso_time_data *vd = vdso_timens_data(vdns);
const struct timens_offset *offs = &vcns->offset[clk];
const struct vdso_clock *vc = vd->clock_data;
const struct vdso_timestamp *vdso_ts;
@@ -250,7 +250,7 @@ bool do_aux(const struct vdso_time_data *vd, clockid_t clock, struct __kernel_ti
do {
while (vdso_read_begin_timens(vc, &seq)) {
/* Re-read from the real time data page, reload seq by looping */
- vd = __arch_get_vdso_u_timens_data(vd);
+ vd = vdso_timens_data(vd);
vc = &vd->aux_clock_data[idx];
}
@@ -360,7 +360,7 @@ __cvdso_gettimeofday_data(const struct vdso_time_data *vd,
if (unlikely(tz != NULL)) {
if (vdso_is_timens_clock(vc))
- vd = __arch_get_vdso_u_timens_data(vd);
+ vd = vdso_timens_data(vd);
tz->tz_minuteswest = vd[CS_HRES_COARSE].tz_minuteswest;
tz->tz_dsttime = vd[CS_HRES_COARSE].tz_dsttime;
@@ -383,7 +383,7 @@ __cvdso_time_data(const struct vdso_time_data *vd, __kernel_old_time_t *time)
__kernel_old_time_t t;
if (vdso_is_timens_clock(vc)) {
- vd = __arch_get_vdso_u_timens_data(vd);
+ vd = vdso_timens_data(vd);
vc = vd->clock_data;
}
@@ -414,7 +414,7 @@ bool __cvdso_clock_getres_common(const struct vdso_time_data *vd, clockid_t cloc
return false;
if (vdso_is_timens_clock(vc))
- vd = __arch_get_vdso_u_timens_data(vd);
+ vd = vdso_timens_data(vd);
/*
* Convert the clockid to a bitmask and use it to check which
* Re: [GIT pull] timers/nohz for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/nohz " Thomas Gleixner
@ 2026-06-15 8:00 ` Ingo Molnar
2026-06-15 8:51 ` pr-tracker-bot
1 sibling, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2026-06-15 8:00 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
* Thomas Gleixner <tglx@kernel.org> wrote:
> Linus,
>
> please pull the latest timers/nohz branch from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-nohz-2026-06-13
>
> up to: 6199f9999a9b: sched/cputime: Handle dyntick-idle steal time correctly
>
> Updates for the NOHZ subsystem:
Merge note: there's a new conflict with the cpufreq tree which
is already upstream:
Conflicts:
drivers/cpufreq/cpufreq_governor.c
Due to these commits:
080b5c6d9503 ("sched/cputime: Remove superfluous and error prone kcpustat_field() parameter")
24fc5870808d ("cpufreq: governor: Fix stale prev_cpu_nice spike when enabling ignore_nice_load")
It's just overlapping changes, the resolution is to remove the
first argument of all 3 uses of kcpustat_field() from the cpufreq
version of the file, like 080b5c6d9503 does.
Thanks,
Ingo
* Re: [GIT pull] core/rseq for v7.2-rc1
2026-06-13 21:24 [GIT pull] core/rseq for v7.2-rc1 Thomas Gleixner
` (8 preceding siblings ...)
2026-06-13 21:25 ` [GIT pull] timers/vdso " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
9 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:24:45 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core-rseq-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a04c8472b0bc99963283e379f4ca2c775be4949b
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] irq/msi for v7.2-rc1
2026-06-13 21:24 ` [GIT pull] irq/msi " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:24:59 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-msi-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/8f45c6ce4959edee1ed25131fc14ce8bd261ca35
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] smp/core for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] smp/core " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:25:04 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9e94480d81b9eb9bd175499636bf622e5d62176d
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] irq/core for v7.2-rc1
2026-06-13 21:24 ` [GIT pull] irq/core " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:24:50 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-core-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/13e1a6d6a17eb4bca350e5bf59a89a3056c834ca
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] timers/clocksource for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/clocksource " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:25:08 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-clocksource-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f20e2fdaaeb74330a6c5d65af22a8c47409a7a91
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] timers/core for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/core " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
2026-06-15 13:35 ` Oleg Nesterov
1 sibling, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:25:13 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-core-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a60ce761d99ff2d9eefe33374c5f20726465a140
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] timers/vdso for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/vdso " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:25:26 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-vdso-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/186d3c4e92242351afc24d9784f31cb4cd08a4b7
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] timers/nohz for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/nohz " Thomas Gleixner
2026-06-15 8:00 ` Ingo Molnar
@ 2026-06-15 8:51 ` pr-tracker-bot
1 sibling, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:25:17 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-nohz-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a53fcff8fc7530f59a8171824ed586200df724a0
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] irq/drivers for v7.2-rc1
2026-06-13 21:24 ` [GIT pull] irq/drivers " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:24:54 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-drivers-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/857ae5a4459c600d70b9ad64c46a730c428770e2
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] timers/ptp for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/ptp " Thomas Gleixner
@ 2026-06-15 8:51 ` pr-tracker-bot
0 siblings, 0 replies; 22+ messages in thread
From: pr-tracker-bot @ 2026-06-15 8:51 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Sat, 13 Jun 2026 23:25:22 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-ptp-2026-06-13
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/2d6d57f889f3a5e7d19009c560ea2002cdde9fb8
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] timers/core for v7.2-rc1
2026-06-13 21:25 ` [GIT pull] timers/core " Thomas Gleixner
2026-06-15 8:51 ` pr-tracker-bot
@ 2026-06-15 13:35 ` Oleg Nesterov
1 sibling, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2026-06-15 13:35 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
On 06/13, Thomas Gleixner wrote:
>
> Linus,
>
> please pull the latest timers/core branch from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-core-2026-06-13
>
> up to: 87bd2ad568e1: posix-cpu-timers: Fix pid refcount leak in do_cpu_nanosleep() error path
...
> --- a/kernel/time/jiffies.c
> +++ b/kernel/time/jiffies.c
> @@ -60,15 +60,14 @@ EXPORT_SYMBOL(get_jiffies_64);
>
> EXPORT_SYMBOL(jiffies);
>
> -static int __init init_jiffies_clocksource(void)
> -{
> - return __clocksource_register(&clocksource_jiffies);
> -}
> -
> -core_initcall(init_jiffies_clocksource);
> +static bool cs_jiffies_registered __initdata;
>
> struct clocksource * __init __weak clocksource_default_clock(void)
> {
> + if (!cs_jiffies_registered) {
> + __clocksource_register(&clocksource_jiffies);
> + cs_jiffies_registered = true;
> + }
> return &clocksource_jiffies;
> }
It seems that this change is problematic...
timekeeping_init() does
guard(raw_spinlock_irqsave)(&tk_core.lock);
clock = clocksource_default_clock();
and __clocksource_register() -> __clocksource_register_scale() takes
clocksource_mutex.
So I got
=============================
[ BUG: Invalid wait context ]
7.1.0-00977-g7a78e6f6bb02 #237 Not tainted
-----------------------------
swapper/0/0 is trying to lock:
ffffffff820415e8 (clocksource_mutex){....}-{4:4}, at: __clocksource_register_scale+0x186/0x230
other info that might help us debug this:
context-{5:5}
1 lock held by swapper/0/0:
#0: ffffffff8327dfb8 (&tkd->lock){....}-{2:2}, at: timekeeping_init+0x159/0x1f0
at boot time after git pull.
Oleg.
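The report above can be read as a wait-type mismatch: a sleeping lock (clocksource_mutex, {4:4}) is requested while a raw spinlock (&tkd->lock, {2:2}) is held. What follows is a minimal userspace sketch of that rule, assuming the numeric wait types printed in the splat. The names wait_context_ok(), struct held_lock and enum wait_type are hypothetical and chosen for illustration only; this is not lockdep's actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Wait types as printed in the report. The enum names are made up
 * for this sketch; only the numbers come from the splat.
 */
enum wait_type {
	WAIT_SPIN  = 2,	/* {2:2}: &tkd->lock, a raw spinlock */
	WAIT_SLEEP = 4,	/* {4:4}: clocksource_mutex, may sleep */
	WAIT_TASK  = 5,	/* context-{5:5}: as shown for swapper/0 */
};

/* One entry per lock the task currently holds (hypothetical model). */
struct held_lock {
	const char *name;
	enum wait_type inner;
};

/*
 * Assumed rule: the most restrictive (smallest) inner wait type among
 * the current context and all held locks caps what may be acquired
 * next. A lock whose outer wait type exceeds that cap is invalid.
 */
bool wait_context_ok(const struct held_lock *held, size_t nr_held,
		     enum wait_type context, enum wait_type next_outer)
{
	enum wait_type limit = context;

	for (size_t i = 0; i < nr_held; i++) {
		if (held[i].inner < limit)
			limit = held[i].inner;
	}
	return next_outer <= limit;
}
```

In this model, holding tk_core.lock (inner type 2) and asking for clocksource_mutex (outer type 4) yields false, which corresponds to the "Invalid wait context" report; the same request with no locks held yields true.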