* [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}()
@ 2025-05-13 22:42 Brian Norris
2025-05-13 22:42 ` [PATCH 1/2] genirq: Add kunit tests for depth counts Brian Norris
2025-05-13 22:42 ` [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup Brian Norris
0 siblings, 2 replies; 8+ messages in thread
From: Brian Norris @ 2025-05-13 22:42 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris
I'm seeing problems in a driver that:
(a) requests an affinity-managed IRQ (struct
irq_affinity_desc::is_managed == 1);
(b) disables that IRQ (disable_irq()); and
(c) undergoes CPU hotplug for the affined CPU.
When we do the above, the genirq core leaves the IRQ in a different
state than it started -- the kernel IRQ is re-enabled after CPU hot
unplug/plug.
This problem seems to stem from the behavior of irq_shutdown() and
irq_shutdown(): that they assume they always run with an enabled IRQ,
and can simply set depth to 1 and 0 respectively.
I encode my test cases in a few kunit tests in patch 1, and I provide a
blunt attempt at solving the test failures in patch 2. I'm not very
confident in my solution, so please take it with a heavy dose of salt.
Side note: I understand my colleague has reported other issues related
to the same code:
Subject: [PATCH] genirq/PM: Fix IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND if depth > 1
https://lore.kernel.org/lkml/20250512173250.1.If5c00cf9f08732f4af5f104ae59b8785c7f69536@changeid/
We're addressing different problems, but they do happen to hit on some
of the same awkwardness in irq_startup(). These two patches obviously
would need to be reconciled in some way.
Brian Norris (2):
genirq: Add kunit tests for depth counts
genirq: Retain disable depth across irq shutdown/startup
kernel/irq/Kconfig | 10 +++
kernel/irq/Makefile | 1 +
kernel/irq/chip.c | 7 +-
kernel/irq/irq_test.c | 162 ++++++++++++++++++++++++++++++++++++++++++
4 files changed, 178 insertions(+), 2 deletions(-)
create mode 100644 kernel/irq/irq_test.c
--
2.49.0.1045.g170613ef41-goog
^ permalink raw reply [flat|nested] 8+ messages in thread* [PATCH 1/2] genirq: Add kunit tests for depth counts 2025-05-13 22:42 [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}() Brian Norris @ 2025-05-13 22:42 ` Brian Norris 2025-05-14 15:16 ` kernel test robot ` (2 more replies) 2025-05-13 22:42 ` [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup Brian Norris 1 sibling, 3 replies; 8+ messages in thread From: Brian Norris @ 2025-05-13 22:42 UTC (permalink / raw) To: Thomas Gleixner Cc: Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris These tests demonstrate bugs in the irq shutdown/startup code. See the appended test report. In summary, the latter two cases cover: ## shutdown depth: disable_irq() irq_shutdown_and_deactivate() irq_activate_and_startup() <-- BUG: depth is 0 after this enable_irq() ## cpu hotplug: affine IRQ to CPU 1 disable_irq() remove CPU 1 add CPU 1 <-- BUG: depth is 0 after this enable_irq() NB: since one of the tests intersects with CPU hotplug, these tests requires SMP support. They can be easily run with: $ tools/testing/kunit/kunit.py run 'irq_test_cases*' --arch x86_64 --qemu_args '-smp 2' [13:24:21] =============== irq_test_cases (3 subtests) ================ [13:24:21] [PASSED] irq_disable_depth_test [13:24:21] # irq_shutdown_depth_test: EXPECTATION FAILED at kernel/irq/irq_test.c:93 [13:24:21] Expected desc->depth == 1, but [13:24:21] desc->depth == 0 (0x0) [13:24:21] ------------[ cut here ]------------ [13:24:21] Unbalanced enable for IRQ 25 [13:24:21] WARNING: CPU: 1 PID: 34 at kernel/irq/manage.c:792 __enable_irq+0x36/0x60 ... [13:24:21] Call Trace: [13:24:21] <TASK> [13:24:21] enable_irq+0x4a/0x90 [13:24:21] irq_shutdown_depth_test+0x17b/0x3b0 [13:24:21] kunit_try_run_case+0x90/0x120 ... [13:24:21] </TASK> [13:24:21] ---[ end trace 0000000000000000 ]--- [13:24:21] # irq_shutdown_depth_test.speed: slow [13:24:21] [FAILED] irq_shutdown_depth_test [13:24:21] #1 [13:24:21] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:140 [13:24:21] Expected desc->depth == 1, but [13:24:21] desc->depth == 0 (0x0) [13:24:21] ------------[ cut here ]------------ [13:24:21] Unbalanced enable for IRQ 26 [13:24:21] WARNING: CPU: 0 PID: 36 at kernel/irq/manage.c:792 __enable_irq+0x36/0x60 ... [13:24:21] Call Trace: [13:24:21] <TASK> [13:24:21] enable_irq+0x4a/0x90 [13:24:21] irq_cpuhotplug_test+0x28f/0x660 [13:24:21] kunit_try_run_case+0x90/0x120 ... [13:24:21] </TASK> [13:24:21] ---[ end trace 0000000000000000 ]--- [13:24:21] # irq_cpuhotplug_test.speed: slow [13:24:21] [FAILED] irq_cpuhotplug_test [13:24:21] # module: irq_test [13:24:21] # irq_test_cases: pass:1 fail:2 skip:0 total:3 [13:24:21] # Totals: pass:1 fail:2 skip:0 total:3 [13:24:21] ================= [FAILED] irq_test_cases ================== [13:24:21] ============================================================ [13:24:21] Testing complete. Ran 3 tests: passed: 1, failed: 2 Also note that currently, these tests don't fully clean up after themselves, as I didn't yet figure out all the right ways to set up a fake virq and domain for the purpose of unit testing. They contain TODOs to that effect. Signed-off-by: Brian Norris <briannorris@chromium.org> --- kernel/irq/Kconfig | 10 +++ kernel/irq/Makefile | 1 + kernel/irq/irq_test.c | 162 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 173 insertions(+) create mode 100644 kernel/irq/irq_test.c diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 3f02a0e45254..9295dabea4a0 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -144,6 +144,16 @@ config GENERIC_IRQ_DEBUGFS config GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD bool +config IRQ_KUNIT_TEST + tristate "KUnit test for IRQ management APIs" if !KUNIT_ALL_TESTS + depends on KUNIT + default KUNIT_ALL_TESTS + select SMP + help + Enable this option to test IRQ management APIs. + + If unsure, say N. + endmenu config GENERIC_IRQ_MULTI_HANDLER diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile index c0f44c06d69d..6ab3a4055667 100644 --- a/kernel/irq/Makefile +++ b/kernel/irq/Makefile @@ -19,3 +19,4 @@ obj-$(CONFIG_GENERIC_IRQ_IPI_MUX) += ipi-mux.o obj-$(CONFIG_SMP) += affinity.o obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o obj-$(CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR) += matrix.o +obj-$(CONFIG_IRQ_KUNIT_TEST) += irq_test.o diff --git a/kernel/irq/irq_test.c b/kernel/irq/irq_test.c new file mode 100644 index 000000000000..24f4d8e6c433 --- /dev/null +++ b/kernel/irq/irq_test.c @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: LGPL-2.1+ + +#include <linux/cpu.h> +#include <linux/cpumask.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/irqdesc.h> +#include <linux/irqdomain.h> +#include <linux/nodemask.h> +#include <kunit/test.h> + +#include "internals.h" + +static irqreturn_t noop_handler(int, void *) +{ + return IRQ_HANDLED; +} + +static void noop(struct irq_data *data) { } +static unsigned int noop_ret(struct irq_data *data) { return 0; } + +static int noop_affinity(struct irq_data *data, const struct cpumask *dest, bool force) +{ + irq_data_update_effective_affinity(data, dest); + + return 0; +} + +static struct irq_chip fake_irq_chip = { + .name = "fake", + .irq_startup = noop_ret, + .irq_shutdown = noop, + .irq_enable = noop, + .irq_disable = noop, + .irq_ack = noop, + .irq_mask = noop, + .irq_unmask = noop, + .irq_set_affinity = noop_affinity, + .flags = IRQCHIP_SKIP_SET_WAKE, +}; + +static void irq_disable_depth_test(struct kunit *test) +{ + struct irq_desc *desc; + int virq, ret; + + virq = irq_domain_alloc_descs(-1 /*virq*/, 1 /*nr_irqs*/, 0/*hwirq*/, first_online_node/*node*/, NULL); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + /* TODO: free virq? */ +} + +static void irq_shutdown_depth_test(struct kunit *test) +{ + struct irq_desc *desc; + int virq, ret; + + virq = irq_domain_alloc_descs(-1 /*virq*/, 1 /*nr_irqs*/, 0/*hwirq*/, first_online_node/*node*/, NULL); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + irq_shutdown_and_deactivate(desc); + KUNIT_EXPECT_EQ(test, irq_activate_and_startup(desc, IRQ_NORESEND), 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + /* TODO: free virq? */ +} + +static void irq_cpuhotplug_test(struct kunit *test) +{ + struct irq_desc *desc; + struct irq_data *data; + int virq, ret; + struct irq_affinity_desc affinity = { + .is_managed = 1, + }; + + cpumask_copy(&affinity.mask, cpumask_of(1)); + KUNIT_ASSERT_PTR_NE(test, get_cpu_device(1), NULL); + KUNIT_ASSERT_TRUE(test, cpu_is_hotpluggable(1)); + + virq = irq_domain_alloc_descs(-1 /*virq*/, 1 /*nr_irqs*/, 0/*hwirq*/, first_online_node/*node*/, &affinity); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &fake_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + data = irq_desc_get_irq_data(desc); + KUNIT_ASSERT_PTR_NE(test, data, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_TRUE(test, irqd_affinity_is_managed(data)); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + KUNIT_EXPECT_EQ(test, remove_cpu(1), 0); + KUNIT_EXPECT_EQ(test, add_cpu(1), 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + /* TODO: free virq? */ +} + +static struct kunit_case irq_test_cases[] = { + KUNIT_CASE_SLOW(irq_disable_depth_test), + KUNIT_CASE_SLOW(irq_shutdown_depth_test), + KUNIT_CASE_SLOW(irq_cpuhotplug_test), + {} +}; + +static struct kunit_suite irq_test_suite = { + .name = "irq_test_cases", + .test_cases = irq_test_cases, +}; + +kunit_test_suite(irq_test_suite); +MODULE_DESCRIPTION("IRQ unit test suite"); +MODULE_LICENSE("GPL"); -- 2.49.0.1045.g170613ef41-goog ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] genirq: Add kunit tests for depth counts 2025-05-13 22:42 ` [PATCH 1/2] genirq: Add kunit tests for depth counts Brian Norris @ 2025-05-14 15:16 ` kernel test robot 2025-05-14 15:27 ` kernel test robot 2025-05-14 18:05 ` kernel test robot 2 siblings, 0 replies; 8+ messages in thread From: kernel test robot @ 2025-05-14 15:16 UTC (permalink / raw) To: Brian Norris, Thomas Gleixner Cc: llvm, oe-kbuild-all, Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris Hi Brian, kernel test robot noticed the following build warnings: [auto build test WARNING on tip/irq/core] [also build test WARNING on linus/master v6.15-rc6 next-20250514] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Brian-Norris/genirq-Add-kunit-tests-for-depth-counts/20250514-065050 base: tip/irq/core patch link: https://lore.kernel.org/r/20250513224402.864767-2-briannorris%40chromium.org patch subject: [PATCH 1/2] genirq: Add kunit tests for depth counts config: arm64-randconfig-001-20250514 (https://download.01.org/0day-ci/archive/20250514/202505142357.R7Xa1KFW-lkp@intel.com/config) compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250514/202505142357.R7Xa1KFW-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202505142357.R7Xa1KFW-lkp@intel.com/ All warnings (new ones prefixed by >>): >> kernel/irq/irq_test.c:14:36: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions] 14 | static irqreturn_t noop_handler(int, void *) | ^ kernel/irq/irq_test.c:14:44: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions] 14 | static irqreturn_t noop_handler(int, void *) | ^ 2 warnings generated. vim +14 kernel/irq/irq_test.c 13 > 14 static irqreturn_t noop_handler(int, void *) 15 { 16 return IRQ_HANDLED; 17 } 18 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] genirq: Add kunit tests for depth counts 2025-05-13 22:42 ` [PATCH 1/2] genirq: Add kunit tests for depth counts Brian Norris 2025-05-14 15:16 ` kernel test robot @ 2025-05-14 15:27 ` kernel test robot 2025-05-14 18:05 ` kernel test robot 2 siblings, 0 replies; 8+ messages in thread From: kernel test robot @ 2025-05-14 15:27 UTC (permalink / raw) To: Brian Norris, Thomas Gleixner Cc: oe-kbuild-all, Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris Hi Brian, kernel test robot noticed the following build errors: [auto build test ERROR on tip/irq/core] [also build test ERROR on linus/master v6.15-rc6 next-20250514] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Brian-Norris/genirq-Add-kunit-tests-for-depth-counts/20250514-065050 base: tip/irq/core patch link: https://lore.kernel.org/r/20250513224402.864767-2-briannorris%40chromium.org patch subject: [PATCH 1/2] genirq: Add kunit tests for depth counts config: arm-randconfig-003-20250514 (https://download.01.org/0day-ci/archive/20250514/202505142327.mfHUg14q-lkp@intel.com/config) compiler: arm-linux-gnueabi-gcc (GCC) 7.5.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250514/202505142327.mfHUg14q-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202505142327.mfHUg14q-lkp@intel.com/ All error/warnings (new ones prefixed by >>): arch/arm/lib/clearbit.S: Assembler messages: >> arch/arm/lib/clearbit.S:12: Error: architectural extension `mp' is not allowed for the current base architecture >> arch/arm/lib/clearbit.S:12: Error: selected processor does not support `pldw.w [r1]' in Thumb mode -- arch/arm/lib/changebit.S: Assembler messages: >> arch/arm/lib/changebit.S:12: Error: architectural extension `mp' is not allowed for the current base architecture >> arch/arm/lib/changebit.S:12: Error: selected processor does not support `pldw.w [r1]' in Thumb mode -- arch/arm/lib/testclearbit.S: Assembler messages: >> arch/arm/lib/testclearbit.S:12: Error: architectural extension `mp' is not allowed for the current base architecture >> arch/arm/lib/testclearbit.S:12: Error: selected processor does not support `pldw.w [r1]' in Thumb mode arch/arm/lib/testclearbit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture arch/arm/lib/testclearbit.S:15: Error: selected processor does not support `pldw.w [r1]' in Thumb mode -- arch/arm/lib/testsetbit.S: Assembler messages: >> arch/arm/lib/testsetbit.S:12: Error: architectural extension `mp' is not allowed for the current base architecture >> arch/arm/lib/testsetbit.S:12: Error: selected processor does not support `pldw.w [r1]' in Thumb mode arch/arm/lib/testsetbit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture arch/arm/lib/testsetbit.S:15: Error: selected processor does not support `pldw.w [r1]' in Thumb mode -- arch/arm/lib/testchangebit.S: Assembler messages: >> arch/arm/lib/testchangebit.S:12: Error: architectural extension `mp' is not allowed for the current base architecture >> arch/arm/lib/testchangebit.S:12: Error: selected processor does not support `pldw.w [r1]' in Thumb mode arch/arm/lib/testchangebit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture arch/arm/lib/testchangebit.S:15: Error: selected processor does not support `pldw.w [r1]' in Thumb mode -- arch/arm/lib/setbit.S: Assembler messages: >> arch/arm/lib/setbit.S:12: Error: architectural extension `mp' is not allowed for the current base architecture >> arch/arm/lib/setbit.S:12: Error: selected processor does not support `pldw.w [r1]' in Thumb mode -- >> arch/arm/kernel/smp.c:98:22: warning: 'get_arch_pgd' defined but not used [-Wunused-function] static unsigned long get_arch_pgd(pgd_t *pgd) ^~~~~~~~~~~~ /tmp/ccR1vg73.s: Assembler messages: >> /tmp/ccR1vg73.s:470: Error: architectural extension `mp' is not allowed for the current base architecture >> /tmp/ccR1vg73.s:471: Error: selected processor does not support `pldw [r1]' in Thumb mode -- arch/arm/kernel/head-nommu.S: Assembler messages: >> arch/arm/kernel/head-nommu.S:488: Error: ARM register expected -- `str r0,[,#0x98]' arch/arm/kernel/head-nommu.S:495: Error: ARM register expected -- `str r0,[,#0x9c]' >> arch/arm/kernel/head-nommu.S:495: Error: ARM register expected -- `str r5,[,#0xa0]' arch/arm/kernel/head-nommu.S:497: Error: ARM register expected -- `str r0,[,#0x9c]' arch/arm/kernel/head-nommu.S:497: Error: ARM register expected -- `str r5,[,#0xa0]' -- /tmp/ccPRjQyq.s: Assembler messages: /tmp/ccPRjQyq.s:388: Error: architectural extension `mp' is not allowed for the current base architecture >> /tmp/ccPRjQyq.s:389: Error: selected processor does not support `pldw [r4]' in Thumb mode /tmp/ccPRjQyq.s:549: Error: architectural extension `mp' is not allowed for the current base architecture /tmp/ccPRjQyq.s:550: Error: selected processor does not support `pldw [r4]' in Thumb mode -- /tmp/cc0opIaH.s: Assembler messages: /tmp/cc0opIaH.s:1308: Error: architectural extension `mp' is not allowed for the current base architecture >> /tmp/cc0opIaH.s:1309: Error: selected processor does not support `pldw [r3]' in Thumb mode /tmp/cc0opIaH.s:2474: Error: architectural extension `mp' is not allowed for the current base architecture /tmp/cc0opIaH.s:2475: Error: selected processor does not support `pldw [r3]' in Thumb mode /tmp/cc0opIaH.s:2671: Error: architectural extension `mp' is not allowed for the current base architecture /tmp/cc0opIaH.s:2672: Error: selected processor does not support `pldw [r3]' in Thumb mode -- /tmp/ccEJFH9Z.s: Assembler messages: /tmp/ccEJFH9Z.s:206: Error: architectural extension `mp' is not allowed for the current base architecture >> /tmp/ccEJFH9Z.s:207: Error: selected processor does not support `pldw [r5]' in Thumb mode /tmp/ccEJFH9Z.s:233: Error: architectural extension `mp' is not allowed for the current base architecture /tmp/ccEJFH9Z.s:234: Error: selected processor does not support `pldw [r5]' in Thumb mode /tmp/ccEJFH9Z.s:540: Error: architectural extension `mp' is not allowed for the current base architecture >> /tmp/ccEJFH9Z.s:541: Error: selected processor does not support `pldw [r8]' in Thumb mode /tmp/ccEJFH9Z.s:568: Error: architectural extension `mp' is not allowed for the current base architecture /tmp/ccEJFH9Z.s:569: Error: selected processor does not support `pldw [r8]' in Thumb mode .. Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for SMP Depends on [n]: (CPU_V6K [=n] || CPU_V7 [=n]) && HAVE_SMP [=n] && (MMU [=n] || ARM_MPU [=y]) Selected by [y]: - IRQ_KUNIT_TEST [=y] && KUNIT [=y] vim +14 kernel/irq/irq_test.c 13 > 14 static irqreturn_t noop_handler(int, void *) 15 { 16 return IRQ_HANDLED; 17 } 18 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] genirq: Add kunit tests for depth counts 2025-05-13 22:42 ` [PATCH 1/2] genirq: Add kunit tests for depth counts Brian Norris 2025-05-14 15:16 ` kernel test robot 2025-05-14 15:27 ` kernel test robot @ 2025-05-14 18:05 ` kernel test robot 2 siblings, 0 replies; 8+ messages in thread From: kernel test robot @ 2025-05-14 18:05 UTC (permalink / raw) To: Brian Norris, Thomas Gleixner Cc: Paul Gazzillo, Necip Fazil Yildiran, oe-kbuild-all, Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris Hi Brian, kernel test robot noticed the following build warnings: [auto build test WARNING on tip/irq/core] [also build test WARNING on linus/master v6.15-rc6 next-20250514] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Brian-Norris/genirq-Add-kunit-tests-for-depth-counts/20250514-065050 base: tip/irq/core patch link: https://lore.kernel.org/r/20250513224402.864767-2-briannorris%40chromium.org patch subject: [PATCH 1/2] genirq: Add kunit tests for depth counts config: alpha-kismet-CONFIG_SMP-CONFIG_IRQ_KUNIT_TEST-0-0 (https://download.01.org/0day-ci/archive/20250515/202505150113.w3vVNU6s-lkp@intel.com/config) reproduce: (https://download.01.org/0day-ci/archive/20250515/202505150113.w3vVNU6s-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202505150113.w3vVNU6s-lkp@intel.com/ kismet warnings: (new ones prefixed by >>) >> kismet: WARNING: unmet direct dependencies detected for SMP when selected by IRQ_KUNIT_TEST WARNING: unmet direct dependencies detected for SMP Depends on [n]: ALPHA_SABLE [=n] || ALPHA_RAWHIDE [=n] || ALPHA_DP264 [=n] || ALPHA_WILDFIRE [=n] || ALPHA_TITAN [=n] || ALPHA_GENERIC [=n] || ALPHA_SHARK [=n] || ALPHA_MARVEL [=n] Selected by [y]: - IRQ_KUNIT_TEST [=y] && KUNIT [=y] -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup 2025-05-13 22:42 [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}() Brian Norris 2025-05-13 22:42 ` [PATCH 1/2] genirq: Add kunit tests for depth counts Brian Norris @ 2025-05-13 22:42 ` Brian Norris 2025-05-14 7:35 ` Thomas Gleixner 1 sibling, 1 reply; 8+ messages in thread From: Brian Norris @ 2025-05-13 22:42 UTC (permalink / raw) To: Thomas Gleixner Cc: Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris If an IRQ is shut down and restarted while it was already disabled, its depth is clobbered and reset to 0. This can produce unexpected results, as: 1) the consuming driver probably expected it to stay disabled and 2) the kernel starts complaining about "Unbalanced enable for IRQ N" the next time the consumer calls enable_irq() This problem can occur especially for affinity-managed IRQs that are already disabled before CPU hotplug. I captured these failures in kunit tests irq_shutdown_depth_test and irq_cpuhotplug_test. Perform a naive increment/decrement instead of clobbering the count to 0/1. Tested via kunit: tools/testing/kunit/kunit.py run 'irq_test_cases*' --arch x86_64 --qemu_args '-smp 2' Signed-off-by: Brian Norris <briannorris@chromium.org> --- I'm not very confident this is a fully correct fix, as I'm not sure I've grokked all the startup/shutdown logic in the IRQ core. This probably serves better as an example method to pass the tests in patch 1. kernel/irq/chip.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 36cf1b09cc84..cc6d2220ceae 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -272,7 +272,9 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) const struct cpumask *aff = irq_data_get_affinity_mask(d); int ret = 0; - desc->depth = 0; + desc->depth--; + if (desc->depth) + return 0; if (irqd_is_started(d)) { irq_enable(desc); @@ -290,6 +292,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) ret = __irq_startup(desc); break; case IRQ_STARTUP_ABORT: + desc->depth++; irqd_set_managed_shutdown(d); return 0; } @@ -322,7 +325,7 @@ void irq_shutdown(struct irq_desc *desc) { if (irqd_is_started(&desc->irq_data)) { clear_irq_resend(desc); - desc->depth = 1; + desc->depth++; if (desc->irq_data.chip->irq_shutdown) { desc->irq_data.chip->irq_shutdown(&desc->irq_data); irq_state_set_disabled(desc); -- 2.49.0.1045.g170613ef41-goog ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup 2025-05-13 22:42 ` [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup Brian Norris @ 2025-05-14 7:35 ` Thomas Gleixner 2025-05-14 20:16 ` Brian Norris 0 siblings, 1 reply; 8+ messages in thread From: Thomas Gleixner @ 2025-05-14 7:35 UTC (permalink / raw) To: Brian Norris; +Cc: Douglas Anderson, Tsai Sung-Fu, linux-kernel, Brian Norris On Tue, May 13 2025 at 15:42, Brian Norris wrote: > If an IRQ is shut down and restarted while it was already disabled, its > depth is clobbered and reset to 0. This can produce unexpected results, > as: > 1) the consuming driver probably expected it to stay disabled and > 2) the kernel starts complaining about "Unbalanced enable for IRQ N" the > next time the consumer calls enable_irq() > > This problem can occur especially for affinity-managed IRQs that are > already disabled before CPU hotplug. Groan. > I'm not very confident this is a fully correct fix, as I'm not sure I've > grokked all the startup/shutdown logic in the IRQ core. This probably > serves better as an example method to pass the tests in patch 1. It's close enough except for a subtle detail. > @@ -272,7 +272,9 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) > const struct cpumask *aff = irq_data_get_affinity_mask(d); > int ret = 0; > > - desc->depth = 0; > + desc->depth--; > + if (desc->depth) > + return 0; This breaks a request_irq() disable_irq() free_irq() request_irq() sequence. So the only case where the disable depth needs to be preserved is for managed interrupts in the hotunplug -> shutdown -> hotplug -> startup scenario. Making that explicit avoids chasing all other places and sprinkle desc->depth = 1 into them. Something like the uncompiled below should do the trick. Thanks, tglx --- diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 36cf1b09cc84..b88e9d36d933 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -223,6 +223,19 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, return IRQ_STARTUP_ABORT; return IRQ_STARTUP_MANAGED; } + +void irq_startup_managed(struct irq_desc *desc) +{ + /* + * Only start it up when the disable depth is 1, so that a disable, + * hotunplug, hotplug sequence does not end up enabling it during + * hotplug unconditionally. + */ + desc->depth--; + if (!desc->depth) + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); +} + #else static __always_inline int __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, @@ -290,6 +303,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) ret = __irq_startup(desc); break; case IRQ_STARTUP_ABORT: + desc->depth = 1; irqd_set_managed_shutdown(d); return 0; } @@ -322,7 +336,13 @@ void irq_shutdown(struct irq_desc *desc) { if (irqd_is_started(&desc->irq_data)) { clear_irq_resend(desc); - desc->depth = 1; + /* + * Increment disable depth, so that a managed shutdown on + * CPU hotunplug preserves the actual disabled state when the + * CPU comes back online. See irq_startup_managed(). + */ + desc->depth++; + if (desc->irq_data.chip->irq_shutdown) { desc->irq_data.chip->irq_shutdown(&desc->irq_data); irq_state_set_disabled(desc); diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 15a7654eff68..3ed5b1592735 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -219,7 +219,7 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) return; if (irqd_is_managed_and_shutdown(data)) - irq_startup(desc, IRQ_RESEND, IRQ_START_COND); + irq_startup_managed(desc); /* * If the interrupt can only be directed to a single target diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index b0290849c395..8d2b3ac80ef3 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -87,6 +87,7 @@ extern void __enable_irq(struct irq_desc *desc); extern int irq_activate(struct irq_desc *desc); extern int irq_activate_and_startup(struct irq_desc *desc, bool resend); extern int irq_startup(struct irq_desc *desc, bool resend, bool force); +void irq_startup_managed(struct irq_desc *desc); extern void irq_shutdown(struct irq_desc *desc); extern void irq_shutdown_and_deactivate(struct irq_desc *desc); ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup 2025-05-14 7:35 ` Thomas Gleixner @ 2025-05-14 20:16 ` Brian Norris 0 siblings, 0 replies; 8+ messages in thread From: Brian Norris @ 2025-05-14 20:16 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Douglas Anderson, Tsai Sung-Fu, linux-kernel Hi Thomas, On Wed, May 14, 2025 at 09:35:49AM +0200, Thomas Gleixner wrote: > On Tue, May 13 2025 at 15:42, Brian Norris wrote: > > @@ -272,7 +272,9 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) > > const struct cpumask *aff = irq_data_get_affinity_mask(d); > > int ret = 0; > > > > - desc->depth = 0; > > + desc->depth--; > > + if (desc->depth) > > + return 0; > > This breaks a > > request_irq() > disable_irq() > free_irq() > request_irq() > > sequence. Ah, thanks for the callout. I factored that into another unit test in v2. > So the only case where the disable depth needs to be preserved is for > managed interrupts in the hotunplug -> shutdown -> hotplug -> startup > scenario. Making that explicit avoids chasing all other places and > sprinkle desc->depth = 1 into them. Something like the uncompiled below > should do the trick. Seems reasonable, and it works for me. I've incorporated that in v2, although I'm not sure how the attribution should work there. Thanks, Brian ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-05-14 20:16 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-13 22:42 [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}() Brian Norris
2025-05-13 22:42 ` [PATCH 1/2] genirq: Add kunit tests for depth counts Brian Norris
2025-05-14 15:16 ` kernel test robot
2025-05-14 15:27 ` kernel test robot
2025-05-14 18:05 ` kernel test robot
2025-05-13 22:42 ` [PATCH 2/2] genirq: Retain disable depth across irq shutdown/startup Brian Norris
2025-05-14 7:35 ` Thomas Gleixner
2025-05-14 20:16 ` Brian Norris
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.