* [PATCH v2 0/2] genirq: Retain depth for managed IRQs across CPU hotplug @ 2025-05-14 20:13 Brian Norris 2025-05-14 20:13 ` [PATCH v2 1/2] " Brian Norris 2025-05-14 20:13 ` [PATCH v2 2/2] genirq: Add kunit tests for depth counts Brian Norris 0 siblings, 2 replies; 21+ messages in thread From: Brian Norris @ 2025-05-14 20:13 UTC (permalink / raw) To: Thomas Gleixner Cc: Tsai Sung-Fu, Douglas Anderson, linux-kernel, Brian Norris v1: https://lore.kernel.org/lkml/20250513224402.864767-1-briannorris@chromium.org/ [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}() I'm seeing problems in a driver that: (a) requests an affinity-managed IRQ (struct irq_affinity_desc::is_managed == 1); (b) disables that IRQ (disable_irq()); and (c) undergoes CPU hotplug for the affined CPU. When we do the above, the genirq core leaves the IRQ in a different state than it started -- the kernel IRQ is re-enabled after CPU hot unplug/plug. This problem seems to stem from the behavior of irq_shutdown() and irq_shutdown(): that they assume they always run with an enabled IRQ, and can simply set depth to 1 and 0 respectively. I incorporate a fix suggested by Thomas Gleixner in patch 1, and provide some new kunit test cases for this area in patch 2. Side note: I understand my colleague has reported other issues related to the same code: Subject: [PATCH] genirq/PM: Fix IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND if depth > 1 https://lore.kernel.org/lkml/20250512173250.1.If5c00cf9f08732f4af5f104ae59b8785c7f69536@changeid/ We're addressing different problems, but they do happen to hit on some of the same awkwardness in irq_startup(). These two patches would probably need to be reconciled in some way. Changes in v2: * Adapt Thomas Gleixner's alternative solution, to focus only on CPU hotplug cases * add request_irq()/disable_irq()/free_irq()/request_irq() test sequence * clean up more resources in tests * move tests to patch 2 (i.e., after bugs are fixed and tests pass) * adapt to irq_startup_managed() (new API) Brian Norris (2): genirq: Retain depth for managed IRQs across CPU hotplug genirq: Add kunit tests for depth counts kernel/irq/Kconfig | 11 ++ kernel/irq/Makefile | 1 + kernel/irq/chip.c | 22 +++- kernel/irq/cpuhotplug.c | 2 +- kernel/irq/internals.h | 1 + kernel/irq/irq_test.c | 229 ++++++++++++++++++++++++++++++++++++++++ 6 files changed, 264 insertions(+), 2 deletions(-) create mode 100644 kernel/irq/irq_test.c -- 2.49.0.1045.g170613ef41-goog ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-05-14 20:13 [PATCH v2 0/2] genirq: Retain depth for managed IRQs across CPU hotplug Brian Norris @ 2025-05-14 20:13 ` Brian Norris 2025-05-15 14:51 ` [tip: irq/core] genirq: Retain disable depth for managed interrupts " tip-bot2 for Brian Norris 2025-06-06 12:21 ` [PATCH v2 1/2] genirq: Retain depth for managed IRQs " Aleksandrs Vinarskis 2025-05-14 20:13 ` [PATCH v2 2/2] genirq: Add kunit tests for depth counts Brian Norris 1 sibling, 2 replies; 21+ messages in thread From: Brian Norris @ 2025-05-14 20:13 UTC (permalink / raw) To: Thomas Gleixner Cc: Tsai Sung-Fu, Douglas Anderson, linux-kernel, Brian Norris Affinity-managed IRQs may be shut down and restarted during CPU hotunplug/plug, and the IRQ may be left in an unexpected state. Specifically: 1. IRQ affines to CPU N 2. disable_irq() -> depth is 1 3. CPU N goes offline 4. irq_shutdown() -> depth is set to 1 (again) 5. CPU N goes online 6. irq_startup() -> depth is set to 0 (BUG! client expected IRQ is still disabled) 7. enable_irq() -> depth underflow / unbalanced enable_irq() WARN It seems depth only needs preserved for managed IRQs + CPU hotplug, so per Thomas's recommendation, we make that explicit. I add kunit tests that cover some of this in a following patch. Signed-off-by: Brian Norris <briannorris@chromium.org> Co-developed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- Thomas provided a better suggestion than my v1, without fully-formed patch metadata. I've incorporated that as "Co-developed-by". Feel free to suggest something different. Changes in v2: * Adapt Thomas Gleixner's alternative solution, to focus only on CPU hotplug cases kernel/irq/chip.c | 22 +++++++++++++++++++++- kernel/irq/cpuhotplug.c | 2 +- kernel/irq/internals.h | 1 + 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 36cf1b09cc84..ab2bf0de3422 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -223,6 +223,19 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, return IRQ_STARTUP_ABORT; return IRQ_STARTUP_MANAGED; } + +void irq_startup_managed(struct irq_desc *desc) +{ + /* + * Only start it up when the disable depth is 1, so that a disable, + * hotunplug, hotplug sequence does not end up enabling it during + * hotplug unconditionally. + */ + desc->depth--; + if (!desc->depth) + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); +} + #else static __always_inline int __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, @@ -290,6 +303,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) ret = __irq_startup(desc); break; case IRQ_STARTUP_ABORT: + desc->depth = 1; irqd_set_managed_shutdown(d); return 0; } @@ -322,7 +336,13 @@ void irq_shutdown(struct irq_desc *desc) { if (irqd_is_started(&desc->irq_data)) { clear_irq_resend(desc); - desc->depth = 1; + /* + * Increment disable depth, so that a managed shutdown on + * CPU hotunplug preserves the actual disabled state when the + * CPU comes back online. See irq_startup_managed(). + */ + desc->depth++; + if (desc->irq_data.chip->irq_shutdown) { desc->irq_data.chip->irq_shutdown(&desc->irq_data); irq_state_set_disabled(desc); diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 15a7654eff68..3ed5b1592735 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -219,7 +219,7 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) return; if (irqd_is_managed_and_shutdown(data)) - irq_startup(desc, IRQ_RESEND, IRQ_START_COND); + irq_startup_managed(desc); /* * If the interrupt can only be directed to a single target diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index b0290849c395..7111747ecb86 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -87,6 +87,7 @@ extern void __enable_irq(struct irq_desc *desc); extern int irq_activate(struct irq_desc *desc); extern int irq_activate_and_startup(struct irq_desc *desc, bool resend); extern int irq_startup(struct irq_desc *desc, bool resend, bool force); +extern void irq_startup_managed(struct irq_desc *desc); extern void irq_shutdown(struct irq_desc *desc); extern void irq_shutdown_and_deactivate(struct irq_desc *desc); -- 2.49.0.1045.g170613ef41-goog ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [tip: irq/core] genirq: Retain disable depth for managed interrupts across CPU hotplug 2025-05-14 20:13 ` [PATCH v2 1/2] " Brian Norris @ 2025-05-15 14:51 ` tip-bot2 for Brian Norris 2025-06-06 12:21 ` [PATCH v2 1/2] genirq: Retain depth for managed IRQs " Aleksandrs Vinarskis 1 sibling, 0 replies; 21+ messages in thread From: tip-bot2 for Brian Norris @ 2025-05-15 14:51 UTC (permalink / raw) To: linux-tip-commits; +Cc: Thomas Gleixner, Brian Norris, x86, linux-kernel, maz The following commit has been merged into the irq/core branch of tip: Commit-ID: 788019eb559fd0b365f501467ceafce540e377cc Gitweb: https://git.kernel.org/tip/788019eb559fd0b365f501467ceafce540e377cc Author: Brian Norris <briannorris@chromium.org> AuthorDate: Wed, 14 May 2025 13:13:16 -07:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Thu, 15 May 2025 16:44:25 +02:00 genirq: Retain disable depth for managed interrupts across CPU hotplug Affinity-managed interrupts can be shut down and restarted during CPU hotunplug/plug. Thereby the interrupt may be left in an unexpected state. Specifically: 1. Interrupt is affine to CPU N 2. disable_irq() -> depth is 1 3. CPU N goes offline 4. irq_shutdown() -> depth is set to 1 (again) 5. CPU N goes online 6. irq_startup() -> depth is set to 0 (BUG! driver expects that the interrupt still disabled) 7. enable_irq() -> depth underflow / unbalanced enable_irq() warning This is only a problem for managed interrupts and CPU hotplug, all other cases like request()/free()/request() truly needs to reset a possibly stale disable depth value. Provide a startup function, which takes the disable depth into account, and invoked it for the managed interrupts in the CPU hotplug path. This requires to change irq_shutdown() to do a depth increment instead of setting it to 1, which allows to retain the disable depth, but is harmless for the other code paths using irq_startup(), which will still reset the disable depth unconditionally to keep the original correct behaviour. A kunit tests will be added separately to cover some of these aspects. [ tglx: Massaged changelog ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250514201353.3481400-2-briannorris@chromium.org --- kernel/irq/chip.c | 22 +++++++++++++++++++++- kernel/irq/cpuhotplug.c | 2 +- kernel/irq/internals.h | 1 + 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 1d45c84..b0e0a73 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -202,6 +202,19 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, return IRQ_STARTUP_ABORT; return IRQ_STARTUP_MANAGED; } + +void irq_startup_managed(struct irq_desc *desc) +{ + /* + * Only start it up when the disable depth is 1, so that a disable, + * hotunplug, hotplug sequence does not end up enabling it during + * hotplug unconditionally. + */ + desc->depth--; + if (!desc->depth) + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); +} + #else static __always_inline int __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, @@ -269,6 +282,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) ret = __irq_startup(desc); break; case IRQ_STARTUP_ABORT: + desc->depth = 1; irqd_set_managed_shutdown(d); return 0; } @@ -301,7 +315,13 @@ void irq_shutdown(struct irq_desc *desc) { if (irqd_is_started(&desc->irq_data)) { clear_irq_resend(desc); - desc->depth = 1; + /* + * Increment disable depth, so that a managed shutdown on + * CPU hotunplug preserves the actual disabled state when the + * CPU comes back online. See irq_startup_managed(). + */ + desc->depth++; + if (desc->irq_data.chip->irq_shutdown) { desc->irq_data.chip->irq_shutdown(&desc->irq_data); irq_state_set_disabled(desc); diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index e77ca6d..f07529a 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -218,7 +218,7 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) return; if (irqd_is_managed_and_shutdown(data)) - irq_startup(desc, IRQ_RESEND, IRQ_START_COND); + irq_startup_managed(desc); /* * If the interrupt can only be directed to a single target diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 476a20f..aebfe22 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -87,6 +87,7 @@ extern void __enable_irq(struct irq_desc *desc); extern int irq_activate(struct irq_desc *desc); extern int irq_activate_and_startup(struct irq_desc *desc, bool resend); extern int irq_startup(struct irq_desc *desc, bool resend, bool force); +extern void irq_startup_managed(struct irq_desc *desc); extern void irq_shutdown(struct irq_desc *desc); extern void irq_shutdown_and_deactivate(struct irq_desc *desc); ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-05-14 20:13 ` [PATCH v2 1/2] " Brian Norris 2025-05-15 14:51 ` [tip: irq/core] genirq: Retain disable depth for managed interrupts " tip-bot2 for Brian Norris @ 2025-06-06 12:21 ` Aleksandrs Vinarskis 2025-06-09 17:13 ` Brian Norris 1 sibling, 1 reply; 21+ messages in thread From: Aleksandrs Vinarskis @ 2025-06-06 12:21 UTC (permalink / raw) To: Brian Norris, Thomas Gleixner Cc: Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold On 5/14/25 22:13, Brian Norris wrote: > Affinity-managed IRQs may be shut down and restarted during CPU > hotunplug/plug, and the IRQ may be left in an unexpected state. > Specifically: > > 1. IRQ affines to CPU N > 2. disable_irq() -> depth is 1 > 3. CPU N goes offline > 4. irq_shutdown() -> depth is set to 1 (again) > 5. CPU N goes online > 6. irq_startup() -> depth is set to 0 (BUG! client expected IRQ is > still disabled) > 7. enable_irq() -> depth underflow / unbalanced enable_irq() WARN > > It seems depth only needs preserved for managed IRQs + CPU hotplug, so > per Thomas's recommendation, we make that explicit. > > I add kunit tests that cover some of this in a following patch. > > Signed-off-by: Brian Norris <briannorris@chromium.org> > Co-developed-by: Thomas Gleixner <tglx@linutronix.de> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Hi All, It appears that this commit introduces a critical bug observed on at least some Qualcomm Snapdragon X1E/X1P laptops, rendering the suspend function unusable. With this change in place, after successful suspend the device either: 1. Cannot wake up at all. Screen stays black, even though PM has existed suspend (observed by external LEDs controlled by PM) 2. Wakes up eventually after minutes (instead of seconds) with SSD related errors in dmesg. System still exhibits errors eg. UI icons are not properly loaded, WiFi does not (always) connect. Example of SSD errors: ``` [ 45.997238] PM: suspend exit [ 76.276320] nvme nvme0: I/O tag 320 (5140) QID 3 timeout, completion polled [ 104.945562] nvme nvme0: I/O tag 38 (9026) QID 2 timeout, completion polled [ 104.946170] nvme nvme0: I/O tag 147 (8093) QID 4 timeout, completion polled [ 106.354320] nvme nvme0: I/O tag 321 (3141) QID 3 timeout, completion polled [ 136.689693] nvme nvme0: I/O tag 322 (3142) QID 3 timeout, completion polled [ 141.428102] wlP4p1s0: authenticate with 50:64:2b:5f:e3:ba (local address=8c:3b:4a:a6:fa:f3) [ 141.428123] wlP4p1s0: send auth to 50:64:2b:5f:e3:ba (try 1/3) [ 141.433397] wlP4p1s0: authenticated [ 141.434303] wlP4p1s0: associate with 50:64:2b:5f:e3:ba (try 1/3) [ 141.438224] wlP4p1s0: RX AssocResp from 50:64:2b:5f:e3:ba (capab=0x1011 status=0 aid=2) [ 141.451828] wlP4p1s0: associated [ 149.984776] ath11k_pci 0004:01:00.0: msdu_done bit in attention is not set [ 160.635229] ath11k_pci 0004:01:00.0: msdu_done bit in attention is not set [ 165.873389] nvme nvme0: I/O tag 12 (500c) QID 7 timeout, completion polled [ 165.873478] nvme nvme0: I/O tag 526 (120e) QID 8 timeout, completion polled [ 166.026369] nvme nvme0: I/O tag 776 (a308) QID 5 timeout, completion polled [ 166.026406] nvme nvme0: I/O tag 704 (22c0) QID 6 timeout, completion polled [ 166.769312] nvme nvme0: I/O tag 128 (d080) QID 4 timeout, completion polled [ 166.858582] systemd-journald[452]: Time jumped backwards, rotating. [ 196.072359] nvme nvme0: I/O tag 45 (a02d) QID 2 timeout, completion polled [ 196.072429] nvme nvme0: I/O tag 778 (630a) QID 5 timeout, completion polled [ 196.072440] nvme nvme0: I/O tag 705 (82c1) QID 6 timeout, completion polled [ 196.904376] nvme nvme0: I/O tag 346 (215a) QID 3 timeout, completion polled [ 212.970816] nvme nvme0: I/O tag 129 (e081) QID 4 timeout, completion polled ``` This series was merged to linux-next on 20250516 introducing this bug. Reverting commit 788019eb559fd0b3 ("genirq: Retain disable depth for managed interrupts across CPU hotplug") on either `next-20250516` or anything newer eg. `next-20250605` fixes the issue. Tested on Dell XPS 9345 (Snaprdagon X1E-80-100), Asus Zenbook A14 (Snapdragon X1-26-100). Is it possible to have this addressed/patched up/reverted before 6.16-rc1 goes live and introduces the regression? It also appears this series was selected for backporting to 6.6, 6.12, 6.14, 6.15: perhaps this should be postponed/aborted until better solution is found? Thanks in advance, Alex > --- > Thomas provided a better suggestion than my v1, without fully-formed > patch metadata. I've incorporated that as "Co-developed-by". Feel free to > suggest something different. > > Changes in v2: > * Adapt Thomas Gleixner's alternative solution, to focus only on CPU > hotplug cases > > kernel/irq/chip.c | 22 +++++++++++++++++++++- > kernel/irq/cpuhotplug.c | 2 +- > kernel/irq/internals.h | 1 + > 3 files changed, 23 insertions(+), 2 deletions(-) > > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c > index 36cf1b09cc84..ab2bf0de3422 100644 > --- a/kernel/irq/chip.c > +++ b/kernel/irq/chip.c > @@ -223,6 +223,19 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, > return IRQ_STARTUP_ABORT; > return IRQ_STARTUP_MANAGED; > } > + > +void irq_startup_managed(struct irq_desc *desc) > +{ > + /* > + * Only start it up when the disable depth is 1, so that a disable, > + * hotunplug, hotplug sequence does not end up enabling it during > + * hotplug unconditionally. > + */ > + desc->depth--; > + if (!desc->depth) > + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); > +} > + > #else > static __always_inline int > __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, > @@ -290,6 +303,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) > ret = __irq_startup(desc); > break; > case IRQ_STARTUP_ABORT: > + desc->depth = 1; > irqd_set_managed_shutdown(d); > return 0; > } > @@ -322,7 +336,13 @@ void irq_shutdown(struct irq_desc *desc) > { > if (irqd_is_started(&desc->irq_data)) { > clear_irq_resend(desc); > - desc->depth = 1; > + /* > + * Increment disable depth, so that a managed shutdown on > + * CPU hotunplug preserves the actual disabled state when the > + * CPU comes back online. See irq_startup_managed(). > + */ > + desc->depth++; > + > if (desc->irq_data.chip->irq_shutdown) { > desc->irq_data.chip->irq_shutdown(&desc->irq_data); > irq_state_set_disabled(desc); > diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c > index 15a7654eff68..3ed5b1592735 100644 > --- a/kernel/irq/cpuhotplug.c > +++ b/kernel/irq/cpuhotplug.c > @@ -219,7 +219,7 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) > return; > > if (irqd_is_managed_and_shutdown(data)) > - irq_startup(desc, IRQ_RESEND, IRQ_START_COND); > + irq_startup_managed(desc); > > /* > * If the interrupt can only be directed to a single target > diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h > index b0290849c395..7111747ecb86 100644 > --- a/kernel/irq/internals.h > +++ b/kernel/irq/internals.h > @@ -87,6 +87,7 @@ extern void __enable_irq(struct irq_desc *desc); > extern int irq_activate(struct irq_desc *desc); > extern int irq_activate_and_startup(struct irq_desc *desc, bool resend); > extern int irq_startup(struct irq_desc *desc, bool resend, bool force); > +extern void irq_startup_managed(struct irq_desc *desc); > > extern void irq_shutdown(struct irq_desc *desc); > extern void irq_shutdown_and_deactivate(struct irq_desc *desc); ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-06 12:21 ` [PATCH v2 1/2] genirq: Retain depth for managed IRQs " Aleksandrs Vinarskis @ 2025-06-09 17:13 ` Brian Norris 2025-06-09 18:19 ` Aleksandrs Vinarskis 0 siblings, 1 reply; 21+ messages in thread From: Brian Norris @ 2025-06-09 17:13 UTC (permalink / raw) To: Aleksandrs Vinarskis Cc: Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold Hi Alex, On Fri, Jun 06, 2025 at 02:21:54PM +0200, Aleksandrs Vinarskis wrote: > On 5/14/25 22:13, Brian Norris wrote: > > Affinity-managed IRQs may be shut down and restarted during CPU > > hotunplug/plug, and the IRQ may be left in an unexpected state. > > Specifically: [...] > It appears that this commit introduces a critical bug observed on at least > some Qualcomm Snapdragon X1E/X1P laptops, rendering the suspend function > unusable. > > With this change in place, after successful suspend the device either: > 1. Cannot wake up at all. Screen stays black, even though PM has existed > suspend (observed by external LEDs controlled by PM) > > 2. Wakes up eventually after minutes (instead of seconds) with SSD related > errors in dmesg. System still exhibits errors eg. UI icons are not properly > loaded, WiFi does not (always) connect. I'm sorry to hear this has caused regressions. I don't yet know why your particular problems have occurred, but I did notice last week that there were some issues with the patch in question. I wrote a patch which I'll append, and have started (but not completely finished) testing it. Perhaps you could try it out and let me know how it goes? > Is it possible to have this addressed/patched up/reverted before 6.16-rc1 > goes live and introduces the regression? > It also appears this series was selected for backporting to 6.6, 6.12, 6.14, > 6.15: perhaps this should be postponed/aborted until better solution is > found? Regarding stable backports: yes, please. It looks like Johan requested holding this back on stable here: https://lore.kernel.org/all/aELf3QmuEJOlR7Dv@hovoldconsulting.com/ Hopefully we can figure out a mainline solution promptly enough, but revert is also OK if it comes down to it. Below is a patch I'm working with so far. I can submit it as a separate patch if that helps you. Brian --- Subject: [PATCH] genirq: Rebalance managed interrupts across multi-CPU hotplug Commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") intended to only decrement the disable depth once per managed shutdown, but instead it decrements for each CPU hotplug in the affinity mask, until its depth reaches a point where it finally gets re-started. For example, consider: 1. Interrupt is affine to CPU {M,N} 2. disable_irq() -> depth is 1 3. CPU M goes offline -> interrupt migrates to CPU N / depth is still 1 4. CPU N goes offline -> irq_shutdown() / depth is 2 5. CPU N goes online -> irq_restore_affinity_of_irq() -> irqd_is_managed_and_shutdown()==true -> irq_startup_managed() -> depth is 1 6. CPU M goes online -> irq_restore_affinity_of_irq() -> irqd_is_managed_and_shutdown()==true -> irq_startup_managed() -> depth is 0 *** BUG: driver expects the interrupt is still disabled *** -> irq_startup() -> irqd_clr_managed_shutdown() 7. enable_irq() -> depth underflow / unbalanced enable_irq() warning We should clear the managed-shutdown flag at step 6, so that further hotplugs don't cause further imbalance. Fixes: commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") Signed-off-by: Brian Norris <briannorris@chromium.org> --- kernel/irq/chip.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index b0e0a7332993..1af5fe14f3e0 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -175,8 +175,6 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, if (!irqd_affinity_is_managed(d)) return IRQ_STARTUP_NORMAL; - irqd_clr_managed_shutdown(d); - if (!cpumask_intersects(aff, cpu_online_mask)) { /* * Catch code which fiddles with enable_irq() on a managed @@ -205,12 +203,15 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, void irq_startup_managed(struct irq_desc *desc) { + struct irq_data *d = irq_desc_get_irq_data(desc); + /* * Only start it up when the disable depth is 1, so that a disable, * hotunplug, hotplug sequence does not end up enabling it during * hotplug unconditionally. */ desc->depth--; + irqd_clr_managed_shutdown(d); if (!desc->depth) irq_startup(desc, IRQ_RESEND, IRQ_START_COND); } -- 2.50.0.rc0.642.g800a2b2222-goog ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-09 17:13 ` Brian Norris @ 2025-06-09 18:19 ` Aleksandrs Vinarskis 2025-06-10 20:07 ` Brian Norris 0 siblings, 1 reply; 21+ messages in thread From: Aleksandrs Vinarskis @ 2025-06-09 18:19 UTC (permalink / raw) To: Brian Norris Cc: Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold On Mon, 9 Jun 2025 at 19:13, Brian Norris <briannorris@chromium.org> wrote: > > Hi Alex, > > On Fri, Jun 06, 2025 at 02:21:54PM +0200, Aleksandrs Vinarskis wrote: > > On 5/14/25 22:13, Brian Norris wrote: > > > Affinity-managed IRQs may be shut down and restarted during CPU > > > hotunplug/plug, and the IRQ may be left in an unexpected state. > > > Specifically: > > [...] > > > It appears that this commit introduces a critical bug observed on at least > > some Qualcomm Snapdragon X1E/X1P laptops, rendering the suspend function > > unusable. > > > > With this change in place, after successful suspend the device either: > > 1. Cannot wake up at all. Screen stays black, even though PM has existed > > suspend (observed by external LEDs controlled by PM) > > > > 2. Wakes up eventually after minutes (instead of seconds) with SSD related > > errors in dmesg. System still exhibits errors eg. UI icons are not properly > > loaded, WiFi does not (always) connect. > > I'm sorry to hear this has caused regressions. I don't yet know why your > particular problems have occurred, but I did notice last week that there > were some issues with the patch in question. I wrote a patch which I'll > append, and have started (but not completely finished) testing it. > Perhaps you could try it out and let me know how it goes? Hi Brian, I have tested your attached patch in addition to the original one, and unfortunately it did not resolve the problem on either of the two laptops: neither managed to wake up, just like before. Will be happy to promptly test other proposed solutions. Thanks, Alex > > > Is it possible to have this addressed/patched up/reverted before 6.16-rc1 > > goes live and introduces the regression? > > It also appears this series was selected for backporting to 6.6, 6.12, 6.14, > > 6.15: perhaps this should be postponed/aborted until better solution is > > found? > > Regarding stable backports: yes, please. It looks like Johan requested > holding this back on stable here: > > https://lore.kernel.org/all/aELf3QmuEJOlR7Dv@hovoldconsulting.com/ > > Hopefully we can figure out a mainline solution promptly enough, but > revert is also OK if it comes down to it. > > Below is a patch I'm working with so far. I can submit it as a separate > patch if that helps you. > > Brian > > --- > > Subject: [PATCH] genirq: Rebalance managed interrupts across multi-CPU hotplug > > Commit 788019eb559f ("genirq: Retain disable depth for managed > interrupts across CPU hotplug") intended to only decrement the disable > depth once per managed shutdown, but instead it decrements for each CPU > hotplug in the affinity mask, until its depth reaches a point where it > finally gets re-started. > > For example, consider: > > 1. Interrupt is affine to CPU {M,N} > 2. disable_irq() -> depth is 1 > 3. CPU M goes offline -> interrupt migrates to CPU N / depth is still 1 > 4. CPU N goes offline -> irq_shutdown() / depth is 2 > 5. CPU N goes online > -> irq_restore_affinity_of_irq() > -> irqd_is_managed_and_shutdown()==true > -> irq_startup_managed() -> depth is 1 > 6. CPU M goes online > -> irq_restore_affinity_of_irq() > -> irqd_is_managed_and_shutdown()==true > -> irq_startup_managed() -> depth is 0 > *** BUG: driver expects the interrupt is still disabled *** > -> irq_startup() -> irqd_clr_managed_shutdown() > 7. enable_irq() -> depth underflow / unbalanced enable_irq() warning > > We should clear the managed-shutdown flag at step 6, so that further > hotplugs don't cause further imbalance. > > Fixes: commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") > Signed-off-by: Brian Norris <briannorris@chromium.org> > --- > kernel/irq/chip.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c > index b0e0a7332993..1af5fe14f3e0 100644 > --- a/kernel/irq/chip.c > +++ b/kernel/irq/chip.c > @@ -175,8 +175,6 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, > if (!irqd_affinity_is_managed(d)) > return IRQ_STARTUP_NORMAL; > > - irqd_clr_managed_shutdown(d); > - > if (!cpumask_intersects(aff, cpu_online_mask)) { > /* > * Catch code which fiddles with enable_irq() on a managed > @@ -205,12 +203,15 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, > > void irq_startup_managed(struct irq_desc *desc) > { > + struct irq_data *d = irq_desc_get_irq_data(desc); > + > /* > * Only start it up when the disable depth is 1, so that a disable, > * hotunplug, hotplug sequence does not end up enabling it during > * hotplug unconditionally. > */ > desc->depth--; > + irqd_clr_managed_shutdown(d); > if (!desc->depth) > irq_startup(desc, IRQ_RESEND, IRQ_START_COND); > } > -- > 2.50.0.rc0.642.g800a2b2222-goog > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-09 18:19 ` Aleksandrs Vinarskis @ 2025-06-10 20:07 ` Brian Norris 2025-06-11 6:50 ` Thomas Gleixner 2025-06-11 6:56 ` Aleksandrs Vinarskis 0 siblings, 2 replies; 21+ messages in thread From: Brian Norris @ 2025-06-10 20:07 UTC (permalink / raw) To: Aleksandrs Vinarskis Cc: Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold Hi Alex, On Mon, Jun 09, 2025 at 08:19:58PM +0200, Aleksandrs Vinarskis wrote: > On Mon, 9 Jun 2025 at 19:13, Brian Norris <briannorris@chromium.org> wrote: > > On Fri, Jun 06, 2025 at 02:21:54PM +0200, Aleksandrs Vinarskis wrote: > > > It appears that this commit introduces a critical bug observed on at least > > > some Qualcomm Snapdragon X1E/X1P laptops, rendering the suspend function > > > unusable. For my reference, are these laptops represented by the arch/arm64/boot/dts/qcom/x1e80100.dtsi family of device trees? I'm just trying to reason through what sorts of driver(s) are in use here, in case there's something I'm overlooking, as I don't have the laptop in question to test. > > > With this change in place, after successful suspend the device either: > > > 1. Cannot wake up at all. Screen stays black, even though PM has existed > > > suspend (observed by external LEDs controlled by PM) > > > > > > 2. Wakes up eventually after minutes (instead of seconds) with SSD related > > > errors in dmesg. System still exhibits errors eg. UI icons are not properly > > > loaded, WiFi does not (always) connect. FYI, my assumption here based on the log snippets and the patch in question is that "only" the NVMe driver's IRQs are getting b0rked by my change. I could imagine that would produce the above symptoms in most laptop configurations, because failing disk I/O will likely block most wakeup-related activity, and cause all sorts of UI and system daemon (e.g., WiFi supplicant) misbehavior. > > I'm sorry to hear this has caused regressions. I don't yet know why your > > particular problems have occurred, but I did notice last week that there > > were some issues with the patch in question. I wrote a patch which I'll > > append, and have started (but not completely finished) testing it. > > Perhaps you could try it out and let me know how it goes? > > Hi Brian, > > I have tested your attached patch in addition to the original one, and > unfortunately it did not resolve the problem on either of the two > laptops: neither managed to wake up, just like before. > Will be happy to promptly test other proposed solutions. Thanks for the testing. I've found a few problems with my proposed patch, and I've come up with the appended alternative that solves them. Could you give it a try? Also, if it's not too much trouble (and especially if my patch still doesn't help you), could you also provide a more complete kernel log and kernel .config file? (Attachment is fine with me. Or a direct email, if somehow the lists don't like it.) It's possible that would give me more hints as to what's going wrong for you. Thanks, Brian diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index b0e0a7332993..3e873c5ce623 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -205,12 +205,15 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, void irq_startup_managed(struct irq_desc *desc) { + struct irq_data *d = irq_desc_get_irq_data(desc); + /* * Only start it up when the disable depth is 1, so that a disable, * hotunplug, hotplug sequence does not end up enabling it during * hotplug unconditionally. */ desc->depth--; + irqd_clr_managed_shutdown(d); if (!desc->depth) irq_startup(desc, IRQ_RESEND, IRQ_START_COND); } diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index f07529ae4895..755346ea9819 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -210,13 +210,6 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) return; - /* - * Don't restore suspended interrupts here when a system comes back - * from S3. They are reenabled via resume_device_irqs(). - */ - if (desc->istate & IRQS_SUSPENDED) - return; - if (irqd_is_managed_and_shutdown(data)) irq_startup_managed(desc); ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-10 20:07 ` Brian Norris @ 2025-06-11 6:50 ` Thomas Gleixner 2025-06-11 8:50 ` Thomas Gleixner 2025-06-11 6:56 ` Aleksandrs Vinarskis 1 sibling, 1 reply; 21+ messages in thread From: Thomas Gleixner @ 2025-06-11 6:50 UTC (permalink / raw) To: Brian Norris, Aleksandrs Vinarskis Cc: Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold On Tue, Jun 10 2025 at 13:07, Brian Norris wrote: > On Mon, Jun 09, 2025 at 08:19:58PM +0200, Aleksandrs Vinarskis wrote: > > void irq_startup_managed(struct irq_desc *desc) > { > + struct irq_data *d = irq_desc_get_irq_data(desc); > + > /* > * Only start it up when the disable depth is 1, so that a disable, > * hotunplug, hotplug sequence does not end up enabling it during > * hotplug unconditionally. > */ > desc->depth--; > + irqd_clr_managed_shutdown(d); If depth > 0, then it's still shutdown and the subsequent enable operation which brings it down to 0 will take care of it. So what are you trying to solve here? > if (!desc->depth) > irq_startup(desc, IRQ_RESEND, IRQ_START_COND); > } > diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c > index f07529ae4895..755346ea9819 100644 > --- a/kernel/irq/cpuhotplug.c > +++ b/kernel/irq/cpuhotplug.c > @@ -210,13 +210,6 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) > !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) > return; > > - /* > - * Don't restore suspended interrupts here when a system comes back > - * from S3. They are reenabled via resume_device_irqs(). > - */ > - if (desc->istate & IRQS_SUSPENDED) > - return; > - Huch? Care to read: a60dd06af674 ("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity") Thanks, tglx ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-11 6:50 ` Thomas Gleixner @ 2025-06-11 8:50 ` Thomas Gleixner 2025-06-11 18:51 ` Brian Norris 0 siblings, 1 reply; 21+ messages in thread From: Thomas Gleixner @ 2025-06-11 8:50 UTC (permalink / raw) To: Brian Norris, Aleksandrs Vinarskis Cc: Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold On Wed, Jun 11 2025 at 08:50, Thomas Gleixner wrote: > On Tue, Jun 10 2025 at 13:07, Brian Norris wrote: >> On Mon, Jun 09, 2025 at 08:19:58PM +0200, Aleksandrs Vinarskis wrote: >> >> void irq_startup_managed(struct irq_desc *desc) >> { >> + struct irq_data *d = irq_desc_get_irq_data(desc); >> + >> /* >> * Only start it up when the disable depth is 1, so that a disable, >> * hotunplug, hotplug sequence does not end up enabling it during >> * hotplug unconditionally. >> */ >> desc->depth--; >> + irqd_clr_managed_shutdown(d); > > If depth > 0, then it's still shutdown and the subsequent enable > operation which brings it down to 0 will take care of it. So what are > you trying to solve here? I found the previous version which has an explanation for this. That makes sense. You really want this to be: struct irq_data *d = irq_desc_get_irq_data(desc); /* Proper comment */ irqd_clr_managed_shutdown(d); /* * Only start it up when the disable depth is 1, so that a disable, * hotunplug, hotplug sequence does not end up enabling it during * hotplug unconditionally. */ desc->depth--; ... >> if (!desc->depth) >> irq_startup(desc, IRQ_RESEND, IRQ_START_COND); >> } >> diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c >> index f07529ae4895..755346ea9819 100644 >> --- a/kernel/irq/cpuhotplug.c >> +++ b/kernel/irq/cpuhotplug.c >> @@ -210,13 +210,6 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) >> !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) >> return; >> >> - /* >> - * Don't restore suspended interrupts here when a system comes back >> - * from S3. They are reenabled via resume_device_irqs(). >> - */ >> - if (desc->istate & IRQS_SUSPENDED) >> - return; >> - > > Huch? Care to read: > > a60dd06af674 ("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity") Never mind. After staring long enough at it, this should work because suspend increments depth and shutdown does too. Thanks, tglx ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-11 8:50 ` Thomas Gleixner @ 2025-06-11 18:51 ` Brian Norris 0 siblings, 0 replies; 21+ messages in thread From: Brian Norris @ 2025-06-11 18:51 UTC (permalink / raw) To: Thomas Gleixner Cc: Aleksandrs Vinarskis, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold Hi Thomas, On Wed, Jun 11, 2025 at 10:50:32AM +0200, Thomas Gleixner wrote: > On Wed, Jun 11 2025 at 08:50, Thomas Gleixner wrote: > > On Tue, Jun 10 2025 at 13:07, Brian Norris wrote: > >> On Mon, Jun 09, 2025 at 08:19:58PM +0200, Aleksandrs Vinarskis wrote: > >> > >> void irq_startup_managed(struct irq_desc *desc) > >> { > >> + struct irq_data *d = irq_desc_get_irq_data(desc); > >> + > >> /* > >> * Only start it up when the disable depth is 1, so that a disable, > >> * hotunplug, hotplug sequence does not end up enabling it during > >> * hotplug unconditionally. > >> */ > >> desc->depth--; > >> + irqd_clr_managed_shutdown(d); > > > > If depth > 0, then it's still shutdown and the subsequent enable > > operation which brings it down to 0 will take care of it. So what are > > you trying to solve here? > > I found the previous version which has an explanation for this. That Right, I didn't update and carry the explanations here, as I figured I'd format this all as proper patches after getting Alex's testing feedbabck. (I think I'll split to two patches, since there are two distinct bugs I'm fixing in here.) > makes sense. You really want this to be: > > struct irq_data *d = irq_desc_get_irq_data(desc); > > /* Proper comment */ > irqd_clr_managed_shutdown(d); Ack, will add a comment. > /* > * Only start it up when the disable depth is 1, so that a disable, > * hotunplug, hotplug sequence does not end up enabling it during > * hotplug unconditionally. > */ > desc->depth--; > > ... > > >> if (!desc->depth) > >> irq_startup(desc, IRQ_RESEND, IRQ_START_COND); > >> } > >> diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c > >> index f07529ae4895..755346ea9819 100644 > >> --- a/kernel/irq/cpuhotplug.c > >> +++ b/kernel/irq/cpuhotplug.c > >> @@ -210,13 +210,6 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) > >> !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) > >> return; > >> > >> - /* > >> - * Don't restore suspended interrupts here when a system comes back > >> - * from S3. They are reenabled via resume_device_irqs(). > >> - */ > >> - if (desc->istate & IRQS_SUSPENDED) > >> - return; > >> - > > > > Huch? Care to read: > > > > a60dd06af674 ("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity") > > Never mind. After staring long enough at it, this should work because > suspend increments depth and shutdown does too. Yeah, that one definitely deserves an explanation :) I wrote out a commit message already, but didn't include it here yet, as I was only looking for testing. Sorry to have sent you staring so long at it. One snippet: This effectively reverts commit a60dd06af674 ("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity"), because it is replaced by commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug"). IOW, the depth-tracking we added accomplishes the same goal as commit a60dd06af674, but including both causes further bugs/imbalances. Brian ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-10 20:07 ` Brian Norris 2025-06-11 6:50 ` Thomas Gleixner @ 2025-06-11 6:56 ` Aleksandrs Vinarskis 2025-06-11 19:08 ` Brian Norris 1 sibling, 1 reply; 21+ messages in thread From: Aleksandrs Vinarskis @ 2025-06-11 6:56 UTC (permalink / raw) To: Brian Norris Cc: Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold On Tue, 10 Jun 2025 at 22:07, Brian Norris <briannorris@chromium.org> wrote: > > Hi Alex, > > On Mon, Jun 09, 2025 at 08:19:58PM +0200, Aleksandrs Vinarskis wrote: > > On Mon, 9 Jun 2025 at 19:13, Brian Norris <briannorris@chromium.org> wrote: > > > On Fri, Jun 06, 2025 at 02:21:54PM +0200, Aleksandrs Vinarskis wrote: > > > > It appears that this commit introduces a critical bug observed on at least > > > > some Qualcomm Snapdragon X1E/X1P laptops, rendering the suspend function > > > > unusable. > > For my reference, are these laptops represented by the > arch/arm64/boot/dts/qcom/x1e80100.dtsi family of device trees? I'm just > trying to reason through what sorts of driver(s) are in use here, in > case there's something I'm overlooking, as I don't have the laptop in > question to test. Hi, Yes. Dell XPS 9345 is arch/arm64/boot/dts/qcom/x1e80100.dtsi based, and Asus Zenbook A14 is arch/arm64/boot/dts/qcom/x1p42100.dtsi based, which is a derivative but has a slightly different PCIe setup. So far both laptops would behave in the same ways. > > > > > With this change in place, after successful suspend the device either: > > > > 1. Cannot wake up at all. Screen stays black, even though PM has existed > > > > suspend (observed by external LEDs controlled by PM) > > > > > > > > 2. Wakes up eventually after minutes (instead of seconds) with SSD related > > > > errors in dmesg. System still exhibits errors eg. UI icons are not properly > > > > loaded, WiFi does not (always) connect. > > FYI, my assumption here based on the log snippets and the patch in > question is that "only" the NVMe driver's IRQs are getting b0rked by my > change. I could imagine that would produce the above symptoms in most > laptop configurations, because failing disk I/O will likely block most > wakeup-related activity, and cause all sorts of UI and system daemon > (e.g., WiFi supplicant) misbehavior. > > > > I'm sorry to hear this has caused regressions. I don't yet know why your > > > particular problems have occurred, but I did notice last week that there > > > were some issues with the patch in question. I wrote a patch which I'll > > > append, and have started (but not completely finished) testing it. > > > Perhaps you could try it out and let me know how it goes? > > > > Hi Brian, > > > > I have tested your attached patch in addition to the original one, and > > unfortunately it did not resolve the problem on either of the two > > laptops: neither managed to wake up, just like before. > > Will be happy to promptly test other proposed solutions. > > Thanks for the testing. I've found a few problems with my proposed > patch, and I've come up with the appended alternative that solves them. > Could you give it a try? Just tested, and it appears to solve it, though I see some errors on wakeup that I don't remember seeing before. I will test-drive this setup for a day to provide better feedback and confirm if it is related to the fixup or not. > > Also, if it's not too much trouble (and especially if my patch still > doesn't help you), could you also provide a more complete kernel log and > kernel .config file? (Attachment is fine with me. Or a direct email, if > somehow the lists don't like it.) It's possible that would give me more > hints as to what's going wrong for you. I will share the logs with and without the fixup by private email attachment in a bit. Thanks for looking into this, Alex > > Thanks, > Brian > > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c > index b0e0a7332993..3e873c5ce623 100644 > --- a/kernel/irq/chip.c > +++ b/kernel/irq/chip.c > @@ -205,12 +205,15 @@ __irq_startup_managed(struct irq_desc *desc, const struct cpumask *aff, > > void irq_startup_managed(struct irq_desc *desc) > { > + struct irq_data *d = irq_desc_get_irq_data(desc); > + > /* > * Only start it up when the disable depth is 1, so that a disable, > * hotunplug, hotplug sequence does not end up enabling it during > * hotplug unconditionally. > */ > desc->depth--; > + irqd_clr_managed_shutdown(d); > if (!desc->depth) > irq_startup(desc, IRQ_RESEND, IRQ_START_COND); > } > diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c > index f07529ae4895..755346ea9819 100644 > --- a/kernel/irq/cpuhotplug.c > +++ b/kernel/irq/cpuhotplug.c > @@ -210,13 +210,6 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) > !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) > return; > > - /* > - * Don't restore suspended interrupts here when a system comes back > - * from S3. They are reenabled via resume_device_irqs(). > - */ > - if (desc->istate & IRQS_SUSPENDED) > - return; > - > if (irqd_is_managed_and_shutdown(data)) > irq_startup_managed(desc); > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-11 6:56 ` Aleksandrs Vinarskis @ 2025-06-11 19:08 ` Brian Norris 2025-06-12 18:40 ` Brian Norris 0 siblings, 1 reply; 21+ messages in thread From: Brian Norris @ 2025-06-11 19:08 UTC (permalink / raw) To: Aleksandrs Vinarskis Cc: Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold Hi Alex, On Wed, Jun 11, 2025 at 08:56:40AM +0200, Aleksandrs Vinarskis wrote: > Yes. Dell XPS 9345 is arch/arm64/boot/dts/qcom/x1e80100.dtsi based, > and Asus Zenbook A14 is arch/arm64/boot/dts/qcom/x1p42100.dtsi based, > which is a derivative but has a slightly different PCIe setup. So far > both laptops would behave in the same ways. Thanks. So that's what I suspected, a DWC/pcie-qcom PCIe driver, and seemingly standard NVMe on top. pcie-qcom doesn't seem to do anything weird regarding MSIs or affinity, so I wonder why I can't reproduce the same symptoms on other NVMe setups so far. I can see some of the NVMe queue MSIs left disabled (queue 0 / CPU 0 doesn't hit the same bugs, since we don't hotplug CPU 0), but operations seem to function OK even when missing a few queues. Maybe that's an implementation-specific behavior that exhibits differently depending on the exact disk in question. Anyway, I think we've probably honed in on the bugs, so this might just be a curiosity. > > Thanks for the testing. I've found a few problems with my proposed > > patch, and I've come up with the appended alternative that solves them. > > Could you give it a try? > > Just tested, and it appears to solve it, though I see some errors on > wakeup that I don't remember seeing before. I will test-drive this > setup for a day to provide better feedback and confirm if it is > related to the fixup or not. That's promising, I think. Do feel free to forward info if you think there's still a problem though. I'll await your feedback before spinning patches. Brian ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-11 19:08 ` Brian Norris @ 2025-06-12 18:40 ` Brian Norris 2025-06-18 10:17 ` Johan Hovold 0 siblings, 1 reply; 21+ messages in thread From: Brian Norris @ 2025-06-12 18:40 UTC (permalink / raw) To: Aleksandrs Vinarskis Cc: Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Johan Hovold On Wed, Jun 11, 2025 at 12:08:16PM -0700, Brian Norris wrote: > On Wed, Jun 11, 2025 at 08:56:40AM +0200, Aleksandrs Vinarskis wrote: > > Yes. Dell XPS 9345 is arch/arm64/boot/dts/qcom/x1e80100.dtsi based, > > and Asus Zenbook A14 is arch/arm64/boot/dts/qcom/x1p42100.dtsi based, > > which is a derivative but has a slightly different PCIe setup. So far > > both laptops would behave in the same ways. > > Thanks. So that's what I suspected, a DWC/pcie-qcom PCIe driver, and > seemingly standard NVMe on top. pcie-qcom doesn't seem to do anything > weird regarding MSIs or affinity, [...] For the record, I was reminded that DWC/pcie-qcom does not, in fact, support irq_chip::irq_set_affinity(), which could perhaps be a unique factor in his systems' behavior. > > > Thanks for the testing. I've found a few problems with my proposed > > > patch, and I've come up with the appended alternative that solves them. > > > Could you give it a try? > > > > Just tested, and it appears to solve it, though I see some errors on > > wakeup that I don't remember seeing before. I will test-drive this > > setup for a day to provide better feedback and confirm if it is > > related to the fixup or not. > > That's promising, I think. Do feel free to forward info if you think > there's still a problem though. I'll await your feedback before spinning > patches. Alex sent some private feedback, and from what I could tell, there was nothing concerning. The "new" errors are simply about a wakeup attempt interrupting the CPU offlining process, which I believe is normal behavior depending on the wakeup actvity on his laptop (e.g., input devices). I've submitted my fixes here: Subject: [PATCH 6.16 0/2] genirq: Fixes for CPU hotplug / disable-depth regressions https://lore.kernel.org/lkml/20250612183303.3433234-1-briannorris@chromium.org/ Brian ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-12 18:40 ` Brian Norris @ 2025-06-18 10:17 ` Johan Hovold 2025-06-18 17:10 ` Brian Norris 0 siblings, 1 reply; 21+ messages in thread From: Johan Hovold @ 2025-06-18 10:17 UTC (permalink / raw) To: Brian Norris Cc: Aleksandrs Vinarskis, Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel On Thu, Jun 12, 2025 at 11:40:38AM -0700, Brian Norris wrote: > On Wed, Jun 11, 2025 at 12:08:16PM -0700, Brian Norris wrote: > > On Wed, Jun 11, 2025 at 08:56:40AM +0200, Aleksandrs Vinarskis wrote: > > > Yes. Dell XPS 9345 is arch/arm64/boot/dts/qcom/x1e80100.dtsi based, > > > and Asus Zenbook A14 is arch/arm64/boot/dts/qcom/x1p42100.dtsi based, > > > which is a derivative but has a slightly different PCIe setup. So far > > > both laptops would behave in the same ways. > > > > Thanks. So that's what I suspected, a DWC/pcie-qcom PCIe driver, and > > seemingly standard NVMe on top. pcie-qcom doesn't seem to do anything > > weird regarding MSIs or affinity, [...] > > For the record, I was reminded that DWC/pcie-qcom does not, in fact, > support irq_chip::irq_set_affinity(), which could perhaps be a unique > factor in his systems' behavior. No, we use the GIC ITS for the NVMe interrupts on X1E so that should not be involved here. Johan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-18 10:17 ` Johan Hovold @ 2025-06-18 17:10 ` Brian Norris 2025-06-19 8:32 ` Johan Hovold 0 siblings, 1 reply; 21+ messages in thread From: Brian Norris @ 2025-06-18 17:10 UTC (permalink / raw) To: Johan Hovold Cc: Aleksandrs Vinarskis, Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel On Wed, Jun 18, 2025 at 12:17:17PM +0200, Johan Hovold wrote: > On Thu, Jun 12, 2025 at 11:40:38AM -0700, Brian Norris wrote: > > For the record, I was reminded that DWC/pcie-qcom does not, in fact, > > support irq_chip::irq_set_affinity(), which could perhaps be a unique > > factor in his systems' behavior. > > No, we use the GIC ITS for the NVMe interrupts on X1E so that should > not be involved here. Huh, interesting callout. I had looked at trying that previously on another DWC-based platform, for the same reasons noted in that switchover. Anyway, all I can tell you is that Alex's logs clearly showed: set affinity failed(-22) for what looked like the NVMe interrupts during CPU unplug / migration. Brian ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] genirq: Retain depth for managed IRQs across CPU hotplug 2025-06-18 17:10 ` Brian Norris @ 2025-06-19 8:32 ` Johan Hovold 0 siblings, 0 replies; 21+ messages in thread From: Johan Hovold @ 2025-06-19 8:32 UTC (permalink / raw) To: Brian Norris Cc: Aleksandrs Vinarskis, Thomas Gleixner, Tsai Sung-Fu, Douglas Anderson, linux-kernel On Wed, Jun 18, 2025 at 10:10:14AM -0700, Brian Norris wrote: > On Wed, Jun 18, 2025 at 12:17:17PM +0200, Johan Hovold wrote: > > On Thu, Jun 12, 2025 at 11:40:38AM -0700, Brian Norris wrote: > > > For the record, I was reminded that DWC/pcie-qcom does not, in fact, > > > support irq_chip::irq_set_affinity(), which could perhaps be a unique > > > factor in his systems' behavior. > > > > No, we use the GIC ITS for the NVMe interrupts on X1E so that should > > not be involved here. > > Huh, interesting callout. I had looked at trying that previously on > another DWC-based platform, for the same reasons noted in that > switchover. > > Anyway, all I can tell you is that Alex's logs clearly showed: > > set affinity failed(-22) > > for what looked like the NVMe interrupts during CPU unplug / migration. No, those warnings are for wakeup enabled GPIO interrupts (e.g. the lid switch), which are currently partly disabled on this platform pending some rework of the PDC driver: 602cb14e310a ("pinctrl: qcom: x1e80100: Bypass PDC wakeup parent for now") (And with the rework those warnings go away, while the IRQ suspend regression is still there.) Johan ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 2/2] genirq: Add kunit tests for depth counts 2025-05-14 20:13 [PATCH v2 0/2] genirq: Retain depth for managed IRQs across CPU hotplug Brian Norris 2025-05-14 20:13 ` [PATCH v2 1/2] " Brian Norris @ 2025-05-14 20:13 ` Brian Norris 2025-05-15 14:01 ` kernel test robot 2025-05-15 14:51 ` [tip: irq/core] genirq: Add kunit tests for disable " tip-bot2 for Brian Norris 1 sibling, 2 replies; 21+ messages in thread From: Brian Norris @ 2025-05-14 20:13 UTC (permalink / raw) To: Thomas Gleixner Cc: Tsai Sung-Fu, Douglas Anderson, linux-kernel, Brian Norris There have been a few bugs and/or misunderstandings about the reference counting, and startup/shutdown behaviors in the IRQ core and related CPU hotplug code. These 4 test cases try to capture a few interesting cases. * irq_disable_depth_test: basic request/disable/enable sequence * irq_free_disabled_test: request/disable/free/re-request sequence - this catches errors on previous revisions of my work * irq_cpuhotplug_test: exercises managed-affinity IRQ + CPU hotplug. This captures a problematic test case that I've fixed. This test requires CONFIG_SMP and a hotpluggable CPU#1. * irq_shutdown_depth_test: exercises similar behavior from irq_cpuhotplug_test, but directly using irq_*() APIs instead of going through CPU hotplug. This still requires CONFIG_SMP, because managed-affinity is stubbed out (and not all APIs are even present) without it. Note the use of 'imply SMP': ARCH=um doesn't support SMP, and kunit is often exercised there. Thus, 'imply' will force SMP on where possible (such as ARCH=x86_64), but leave it off where it's not. Behavior on various SMP and ARCH configurations: $ tools/testing/kunit/kunit.py run 'irq_test_cases*' --arch x86_64 --qemu_args '-smp 2' [...] [11:12:24] Testing complete. Ran 4 tests: passed: 4 $ tools/testing/kunit/kunit.py run 'irq_test_cases*' --arch x86_64 [...] [11:13:27] [SKIPPED] irq_cpuhotplug_test [11:13:27] ================= [PASSED] irq_test_cases ================== [11:13:27] ============================================================ [11:13:27] Testing complete. Ran 4 tests: passed: 3, skipped: 1 # default: ARCH=um $ tools/testing/kunit/kunit.py run 'irq_test_cases*' [11:14:26] [SKIPPED] irq_shutdown_depth_test [11:14:26] [SKIPPED] irq_cpuhotplug_test [11:14:26] ================= [PASSED] irq_test_cases ================== [11:14:26] ============================================================ [11:14:26] Testing complete. Ran 4 tests: passed: 2, skipped: 2 Without the prior fix ("genirq: Retain depth for managed IRQs across CPU hotplug"), we fail as follows: [11:18:55] =============== irq_test_cases (4 subtests) ================ [11:18:55] [PASSED] irq_disable_depth_test [11:18:55] [PASSED] irq_free_disabled_test [11:18:55] # irq_shutdown_depth_test: EXPECTATION FAILED at kernel/irq/irq_test.c:147 [11:18:55] Expected desc->depth == 1, but [11:18:55] desc->depth == 0 (0x0) [11:18:55] ------------[ cut here ]------------ [11:18:55] Unbalanced enable for IRQ 26 [11:18:55] WARNING: CPU: 1 PID: 36 at kernel/irq/manage.c:792 __enable_irq+0x36/0x60 ... [11:18:55] [FAILED] irq_shutdown_depth_test [11:18:55] #1 [11:18:55] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:202 [11:18:55] Expected irqd_is_activated(data) to be false, but is true [11:18:55] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:203 [11:18:55] Expected irqd_is_started(data) to be false, but is true [11:18:55] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:204 [11:18:55] Expected desc->depth == 1, but [11:18:55] desc->depth == 0 (0x0) [11:18:55] ------------[ cut here ]------------ [11:18:55] Unbalanced enable for IRQ 27 [11:18:55] WARNING: CPU: 0 PID: 38 at kernel/irq/manage.c:792 __enable_irq+0x36/0x60 ... [11:18:55] [FAILED] irq_cpuhotplug_test [11:18:55] # module: irq_test [11:18:55] # irq_test_cases: pass:2 fail:2 skip:0 total:4 [11:18:55] # Totals: pass:2 fail:2 skip:0 total:4 [11:18:55] ================= [FAILED] irq_test_cases ================== [11:18:55] ============================================================ [11:18:55] Testing complete. Ran 4 tests: passed: 2, failed: 2 Signed-off-by: Brian Norris <briannorris@chromium.org> --- Changes in v2: * add request_irq()/disable_irq()/free_irq()/request_irq() test sequence * clean up more resources in tests * move tests to patch 2 (i.e., after bugs are fixed and tests pass) * adapt to irq_startup_managed() (new API) kernel/irq/Kconfig | 11 ++ kernel/irq/Makefile | 1 + kernel/irq/irq_test.c | 229 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 241 insertions(+) create mode 100644 kernel/irq/irq_test.c diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 3f02a0e45254..7429abe5011f 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -144,6 +144,17 @@ config GENERIC_IRQ_DEBUGFS config GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD bool +config IRQ_KUNIT_TEST + tristate "KUnit tests for IRQ management APIs" if !KUNIT_ALL_TESTS + depends on KUNIT + default KUNIT_ALL_TESTS + imply SMP + help + This option enables KUnit tests for the IRQ subsystem API. These are + only for development and testing, not for regular kernel use cases. + + If unsure, say N. + endmenu config GENERIC_IRQ_MULTI_HANDLER diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile index c0f44c06d69d..6ab3a4055667 100644 --- a/kernel/irq/Makefile +++ b/kernel/irq/Makefile @@ -19,3 +19,4 @@ obj-$(CONFIG_GENERIC_IRQ_IPI_MUX) += ipi-mux.o obj-$(CONFIG_SMP) += affinity.o obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o obj-$(CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR) += matrix.o +obj-$(CONFIG_IRQ_KUNIT_TEST) += irq_test.o diff --git a/kernel/irq/irq_test.c b/kernel/irq/irq_test.c new file mode 100644 index 000000000000..5161b56a12f9 --- /dev/null +++ b/kernel/irq/irq_test.c @@ -0,0 +1,229 @@ +// SPDX-License-Identifier: LGPL-2.1+ + +#include <linux/cpu.h> +#include <linux/cpumask.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/irqdesc.h> +#include <linux/irqdomain.h> +#include <linux/nodemask.h> +#include <kunit/test.h> + +#include "internals.h" + +static irqreturn_t noop_handler(int irq, void *data) +{ + return IRQ_HANDLED; +} + +static void noop(struct irq_data *data) { } +static unsigned int noop_ret(struct irq_data *data) { return 0; } + +static int noop_affinity(struct irq_data *data, const struct cpumask *dest, + bool force) +{ + irq_data_update_effective_affinity(data, dest); + + return 0; +} + +static struct irq_chip fake_irq_chip = { + .name = "fake", + .irq_startup = noop_ret, + .irq_shutdown = noop, + .irq_enable = noop, + .irq_disable = noop, + .irq_ack = noop, + .irq_mask = noop, + .irq_unmask = noop, + .irq_set_affinity = noop_affinity, + .flags = IRQCHIP_SKIP_SET_WAKE, +}; + +static void irq_disable_depth_test(struct kunit *test) +{ + struct irq_desc *desc; + int virq, ret; + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, NULL); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static void irq_free_disabled_test(struct kunit *test) +{ + struct irq_desc *desc; + int virq, ret; + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, NULL); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + free_irq(virq, NULL); + KUNIT_EXPECT_GE(test, desc->depth, 1); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static void irq_shutdown_depth_test(struct kunit *test) +{ + struct irq_desc *desc; + struct irq_data *data; + int virq, ret; + struct irq_affinity_desc affinity = { + .is_managed = 1, + .mask = CPU_MASK_ALL, + }; + + if (!IS_ENABLED(CONFIG_SMP)) + kunit_skip(test, "requires CONFIG_SMP for managed shutdown"); + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, &affinity); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + data = irq_desc_get_irq_data(desc); + KUNIT_ASSERT_PTR_NE(test, data, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_TRUE(test, irqd_affinity_is_managed(data)); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + irq_shutdown_and_deactivate(desc); + + KUNIT_EXPECT_FALSE(test, irqd_is_activated(data)); + KUNIT_EXPECT_FALSE(test, irqd_is_started(data)); + + KUNIT_EXPECT_EQ(test, irq_activate(desc), 0); +#ifdef CONFIG_SMP + irq_startup_managed(desc); +#endif + + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static void irq_cpuhotplug_test(struct kunit *test) +{ + struct irq_desc *desc; + struct irq_data *data; + int virq, ret; + struct irq_affinity_desc affinity = { + .is_managed = 1, + }; + + if (!IS_ENABLED(CONFIG_SMP)) + kunit_skip(test, "requires CONFIG_SMP for CPU hotplug"); + if (!get_cpu_device(1)) + kunit_skip(test, "requires more than 1 CPU for CPU hotplug"); + if (!cpu_is_hotpluggable(1)) + kunit_skip(test, "CPU 1 must be hotpluggable"); + + cpumask_copy(&affinity.mask, cpumask_of(1)); + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, &affinity); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &fake_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + data = irq_desc_get_irq_data(desc); + KUNIT_ASSERT_PTR_NE(test, data, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_TRUE(test, irqd_affinity_is_managed(data)); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + KUNIT_EXPECT_EQ(test, remove_cpu(1), 0); + KUNIT_EXPECT_FALSE(test, irqd_is_activated(data)); + KUNIT_EXPECT_FALSE(test, irqd_is_started(data)); + KUNIT_EXPECT_GE(test, desc->depth, 1); + KUNIT_EXPECT_EQ(test, add_cpu(1), 0); + + KUNIT_EXPECT_FALSE(test, irqd_is_activated(data)); + KUNIT_EXPECT_FALSE(test, irqd_is_started(data)); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static struct kunit_case irq_test_cases[] = { + KUNIT_CASE(irq_disable_depth_test), + KUNIT_CASE(irq_free_disabled_test), + KUNIT_CASE(irq_shutdown_depth_test), + KUNIT_CASE(irq_cpuhotplug_test), + {} +}; + +static struct kunit_suite irq_test_suite = { + .name = "irq_test_cases", + .test_cases = irq_test_cases, +}; + +kunit_test_suite(irq_test_suite); +MODULE_DESCRIPTION("IRQ unit test suite"); +MODULE_LICENSE("GPL"); -- 2.49.0.1045.g170613ef41-goog ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] genirq: Add kunit tests for depth counts 2025-05-14 20:13 ` [PATCH v2 2/2] genirq: Add kunit tests for depth counts Brian Norris @ 2025-05-15 14:01 ` kernel test robot 2025-05-15 17:21 ` Brian Norris 2025-05-15 14:51 ` [tip: irq/core] genirq: Add kunit tests for disable " tip-bot2 for Brian Norris 1 sibling, 1 reply; 21+ messages in thread From: kernel test robot @ 2025-05-15 14:01 UTC (permalink / raw) To: Brian Norris, Thomas Gleixner Cc: llvm, oe-kbuild-all, Tsai Sung-Fu, Douglas Anderson, linux-kernel, Brian Norris Hi Brian, kernel test robot noticed the following build errors: [auto build test ERROR on tip/irq/core] [also build test ERROR on linus/master v6.15-rc6 next-20250515] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Brian-Norris/genirq-Retain-depth-for-managed-IRQs-across-CPU-hotplug/20250515-041533 base: tip/irq/core patch link: https://lore.kernel.org/r/20250514201353.3481400-3-briannorris%40chromium.org patch subject: [PATCH v2 2/2] genirq: Add kunit tests for depth counts config: i386-buildonly-randconfig-004-20250515 (https://download.01.org/0day-ci/archive/20250515/202505152136.y04AHovS-lkp@intel.com/config) compiler: clang version 20.1.2 (https://github.com/llvm/llvm-project 58df0ef89dd64126512e4ee27b4ac3fd8ddf6247) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250515/202505152136.y04AHovS-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202505152136.y04AHovS-lkp@intel.com/ All errors (new ones prefixed by >>, old ones prefixed by <<): WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ucs2_string.o ERROR: modpost: "irq_domain_alloc_descs" [kernel/irq/irq_test.ko] undefined! ERROR: modpost: "irq_to_desc" [kernel/irq/irq_test.ko] undefined! ERROR: modpost: "irq_shutdown_and_deactivate" [kernel/irq/irq_test.ko] undefined! >> ERROR: modpost: "irq_activate" [kernel/irq/irq_test.ko] undefined! >> ERROR: modpost: "irq_startup_managed" [kernel/irq/irq_test.ko] undefined! -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] genirq: Add kunit tests for depth counts 2025-05-15 14:01 ` kernel test robot @ 2025-05-15 17:21 ` Brian Norris 2025-05-15 22:24 ` Thomas Gleixner 0 siblings, 1 reply; 21+ messages in thread From: Brian Norris @ 2025-05-15 17:21 UTC (permalink / raw) To: Thomas Gleixner Cc: Thomas Gleixner, llvm, oe-kbuild-all, Tsai Sung-Fu, Douglas Anderson, linux-kernel, kernel test robot Hi Thomas, On Thu, May 15, 2025 at 10:01:18PM +0800, kernel test robot wrote: > patch link: https://lore.kernel.org/r/20250514201353.3481400-3-briannorris%40chromium.org > patch subject: [PATCH v2 2/2] genirq: Add kunit tests for depth counts First of all, thanks for the help, and for applying patch 1. I see that: 1) this bot noticed a trivial problem with patch 2; and 2) I received notification that patch 2 was applied to tip/irq/core, but 3) I can't find it there any more. I'm not sure if #3 is because you dropped it (e.g., due to #1's report) or some other reason, so I'm not sure what to do next. Possibilities: (a) send the trivial fix separately, as a fixup (against what tree?) (b) resend an improved patch 2 on its own, against tip/irq/core (c) just drop it, because you have deeper reasons to not want these tests. I'm fine with anything you'd like, although I do think there's value in providing unit tests for corner cases like this. See below for the trivial fix, for the record. I can send it separately if you'd like. > config: i386-buildonly-randconfig-004-20250515 (https://download.01.org/0day-ci/archive/20250515/202505152136.y04AHovS-lkp@intel.com/config) > compiler: clang version 20.1.2 (https://github.com/llvm/llvm-project 58df0ef89dd64126512e4ee27b4ac3fd8ddf6247) > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250515/202505152136.y04AHovS-lkp@intel.com/reproduce) > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > the same patch/commit), kindly add following tags > | Reported-by: kernel test robot <lkp@intel.com> > | Closes: https://lore.kernel.org/oe-kbuild-all/202505152136.y04AHovS-lkp@intel.com/ > > All errors (new ones prefixed by >>, old ones prefixed by <<): > > WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ucs2_string.o > ERROR: modpost: "irq_domain_alloc_descs" [kernel/irq/irq_test.ko] undefined! > ERROR: modpost: "irq_to_desc" [kernel/irq/irq_test.ko] undefined! > ERROR: modpost: "irq_shutdown_and_deactivate" [kernel/irq/irq_test.ko] undefined! > >> ERROR: modpost: "irq_activate" [kernel/irq/irq_test.ko] undefined! > >> ERROR: modpost: "irq_startup_managed" [kernel/irq/irq_test.ko] undefined! The test Kconfig symbol should be bool, not tristate. Some of the functions required for the test are non-modular. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505152136.y04AHovS-lkp@intel.com/ Signed-off-by: Brian Norris <briannorris@chromium.org> --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -145,7 +145,7 @@ config GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD bool config IRQ_KUNIT_TEST - tristate "KUnit tests for IRQ management APIs" if !KUNIT_ALL_TESTS + bool "KUnit tests for IRQ management APIs" if !KUNIT_ALL_TESTS depends on KUNIT default KUNIT_ALL_TESTS imply SMP ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] genirq: Add kunit tests for depth counts 2025-05-15 17:21 ` Brian Norris @ 2025-05-15 22:24 ` Thomas Gleixner 0 siblings, 0 replies; 21+ messages in thread From: Thomas Gleixner @ 2025-05-15 22:24 UTC (permalink / raw) To: Brian Norris Cc: llvm, oe-kbuild-all, Tsai Sung-Fu, Douglas Anderson, linux-kernel, kernel test robot On Thu, May 15 2025 at 10:21, Brian Norris wrote: > On Thu, May 15, 2025 at 10:01:18PM +0800, kernel test robot wrote: >> patch link: https://lore.kernel.org/r/20250514201353.3481400-3-briannorris%40chromium.org >> patch subject: [PATCH v2 2/2] genirq: Add kunit tests for depth counts > > First of all, thanks for the help, and for applying patch 1. I see that: > 1) this bot noticed a trivial problem with patch 2; and > 2) I received notification that patch 2 was applied to tip/irq/core, but > 3) I can't find it there any more. > > I'm not sure if #3 is because you dropped it (e.g., due to #1's report) > or some other reason, so I'm not sure what to do next. Possibilities: #3 because I dropped it. > (a) send the trivial fix separately, as a fixup (against what tree?) > (b) resend an improved patch 2 on its own, against tip/irq/core > (c) just drop it, because you have deeper reasons to not want these > tests. #b please Thanks, tglx ^ permalink raw reply [flat|nested] 21+ messages in thread
* [tip: irq/core] genirq: Add kunit tests for disable depth counts 2025-05-14 20:13 ` [PATCH v2 2/2] genirq: Add kunit tests for depth counts Brian Norris 2025-05-15 14:01 ` kernel test robot @ 2025-05-15 14:51 ` tip-bot2 for Brian Norris 1 sibling, 0 replies; 21+ messages in thread From: tip-bot2 for Brian Norris @ 2025-05-15 14:51 UTC (permalink / raw) To: linux-tip-commits; +Cc: Brian Norris, Thomas Gleixner, x86, linux-kernel, maz The following commit has been merged into the irq/core branch of tip: Commit-ID: 334c36eeb5df010ea1d954a96b6eba42a15a1d4a Gitweb: https://git.kernel.org/tip/334c36eeb5df010ea1d954a96b6eba42a15a1d4a Author: Brian Norris <briannorris@chromium.org> AuthorDate: Wed, 14 May 2025 13:13:17 -07:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Thu, 15 May 2025 16:44:25 +02:00 genirq: Add kunit tests for disable depth counts There have been a few bugs and/or misunderstandings about the reference counting, and startup/shutdown behaviors in the interrupt core and the related CPU hotplug code. These 4 test cases try to capture a few interesting cases. * irq_disable_depth_test: basic request/disable/enable sequence * irq_free_disabled_test: request/disable/free/re-request sequence - this catches errors on previous revisions of the fix for the hotplug case. * irq_cpuhotplug_test: exercises managed-affinity IRQ + CPU hotplug. This captures a problematic test case for a bug in that code path, which failed to retain the disable depth across the managed interrupt hotunplug/hotplug path. This test requires CONFIG_SMP and a hotpluggable CPU#1. * irq_shutdown_depth_test: exercises similar behavior as irq_cpuhotplug_test, but directly uses the irq_*() APIs instead of going through CPU hotplug. This still requires CONFIG_SMP, because managed-affinity is stubbed out (and not all APIs are even present) without it. Note the use of 'imply SMP': ARCH=um doesn't support SMP, and kunit is often exercised there. Thus, 'imply' will force SMP on where possible (such as ARCH=x86_64), but leave it off where it's not. Behavior on various SMP and ARCH configurations: $ tools/testing/kunit/kunit.py run 'irq_test_cases*' --arch x86_64 --qemu_args '-smp 2' [...] [11:12:24] Testing complete. Ran 4 tests: passed: 4 $ tools/testing/kunit/kunit.py run 'irq_test_cases*' --arch x86_64 [...] [11:13:27] [SKIPPED] irq_cpuhotplug_test [11:13:27] ================= [PASSED] irq_test_cases ================== [11:13:27] ============================================================ [11:13:27] Testing complete. Ran 4 tests: passed: 3, skipped: 1 # default: ARCH=um $ tools/testing/kunit/kunit.py run 'irq_test_cases*' [11:14:26] [SKIPPED] irq_shutdown_depth_test [11:14:26] [SKIPPED] irq_cpuhotplug_test [11:14:26] ================= [PASSED] irq_test_cases ================== [11:14:26] ============================================================ [11:14:26] Testing complete. Ran 4 tests: passed: 2, skipped: 2 Without the prior fix ("genirq: Retain disable depth for managed interrupts across CPU hotplug"), this fails as follows: [11:18:55] =============== irq_test_cases (4 subtests) ================ [11:18:55] [PASSED] irq_disable_depth_test [11:18:55] [PASSED] irq_free_disabled_test [11:18:55] # irq_shutdown_depth_test: EXPECTATION FAILED at kernel/irq/irq_test.c:147 [11:18:55] Expected desc->depth == 1, but [11:18:55] desc->depth == 0 (0x0) [11:18:55] ------------[ cut here ]------------ [11:18:55] Unbalanced enable for IRQ 26 [11:18:55] WARNING: CPU: 1 PID: 36 at kernel/irq/manage.c:792 __enable_irq+0x36/0x60 ... [11:18:55] [FAILED] irq_shutdown_depth_test [11:18:55] #1 [11:18:55] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:202 [11:18:55] Expected irqd_is_activated(data) to be false, but is true [11:18:55] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:203 [11:18:55] Expected irqd_is_started(data) to be false, but is true [11:18:55] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:204 [11:18:55] Expected desc->depth == 1, but [11:18:55] desc->depth == 0 (0x0) [11:18:55] ------------[ cut here ]------------ [11:18:55] Unbalanced enable for IRQ 27 [11:18:55] WARNING: CPU: 0 PID: 38 at kernel/irq/manage.c:792 __enable_irq+0x36/0x60 ... [11:18:55] [FAILED] irq_cpuhotplug_test [11:18:55] # module: irq_test [11:18:55] # irq_test_cases: pass:2 fail:2 skip:0 total:4 [11:18:55] # Totals: pass:2 fail:2 skip:0 total:4 [11:18:55] ================= [FAILED] irq_test_cases ================== [11:18:55] ============================================================ [11:18:55] Testing complete. Ran 4 tests: passed: 2, failed: 2 Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250514201353.3481400-3-briannorris@chromium.org --- kernel/irq/Kconfig | 11 ++- kernel/irq/Makefile | 1 +- kernel/irq/irq_test.c | 229 +++++++++++++++++++++++++++++++++++++++++- 3 files changed, 241 insertions(+) create mode 100644 kernel/irq/irq_test.c diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 3f02a0e..7429abe 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -144,6 +144,17 @@ config GENERIC_IRQ_DEBUGFS config GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD bool +config IRQ_KUNIT_TEST + tristate "KUnit tests for IRQ management APIs" if !KUNIT_ALL_TESTS + depends on KUNIT + default KUNIT_ALL_TESTS + imply SMP + help + This option enables KUnit tests for the IRQ subsystem API. These are + only for development and testing, not for regular kernel use cases. + + If unsure, say N. + endmenu config GENERIC_IRQ_MULTI_HANDLER diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile index c0f44c0..6ab3a40 100644 --- a/kernel/irq/Makefile +++ b/kernel/irq/Makefile @@ -19,3 +19,4 @@ obj-$(CONFIG_GENERIC_IRQ_IPI_MUX) += ipi-mux.o obj-$(CONFIG_SMP) += affinity.o obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o obj-$(CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR) += matrix.o +obj-$(CONFIG_IRQ_KUNIT_TEST) += irq_test.o diff --git a/kernel/irq/irq_test.c b/kernel/irq/irq_test.c new file mode 100644 index 0000000..5161b56 --- /dev/null +++ b/kernel/irq/irq_test.c @@ -0,0 +1,229 @@ +// SPDX-License-Identifier: LGPL-2.1+ + +#include <linux/cpu.h> +#include <linux/cpumask.h> +#include <linux/interrupt.h> +#include <linux/irq.h> +#include <linux/irqdesc.h> +#include <linux/irqdomain.h> +#include <linux/nodemask.h> +#include <kunit/test.h> + +#include "internals.h" + +static irqreturn_t noop_handler(int irq, void *data) +{ + return IRQ_HANDLED; +} + +static void noop(struct irq_data *data) { } +static unsigned int noop_ret(struct irq_data *data) { return 0; } + +static int noop_affinity(struct irq_data *data, const struct cpumask *dest, + bool force) +{ + irq_data_update_effective_affinity(data, dest); + + return 0; +} + +static struct irq_chip fake_irq_chip = { + .name = "fake", + .irq_startup = noop_ret, + .irq_shutdown = noop, + .irq_enable = noop, + .irq_disable = noop, + .irq_ack = noop, + .irq_mask = noop, + .irq_unmask = noop, + .irq_set_affinity = noop_affinity, + .flags = IRQCHIP_SKIP_SET_WAKE, +}; + +static void irq_disable_depth_test(struct kunit *test) +{ + struct irq_desc *desc; + int virq, ret; + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, NULL); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static void irq_free_disabled_test(struct kunit *test) +{ + struct irq_desc *desc; + int virq, ret; + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, NULL); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + free_irq(virq, NULL); + KUNIT_EXPECT_GE(test, desc->depth, 1); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static void irq_shutdown_depth_test(struct kunit *test) +{ + struct irq_desc *desc; + struct irq_data *data; + int virq, ret; + struct irq_affinity_desc affinity = { + .is_managed = 1, + .mask = CPU_MASK_ALL, + }; + + if (!IS_ENABLED(CONFIG_SMP)) + kunit_skip(test, "requires CONFIG_SMP for managed shutdown"); + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, &affinity); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &dummy_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + data = irq_desc_get_irq_data(desc); + KUNIT_ASSERT_PTR_NE(test, data, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_TRUE(test, irqd_affinity_is_managed(data)); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + irq_shutdown_and_deactivate(desc); + + KUNIT_EXPECT_FALSE(test, irqd_is_activated(data)); + KUNIT_EXPECT_FALSE(test, irqd_is_started(data)); + + KUNIT_EXPECT_EQ(test, irq_activate(desc), 0); +#ifdef CONFIG_SMP + irq_startup_managed(desc); +#endif + + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static void irq_cpuhotplug_test(struct kunit *test) +{ + struct irq_desc *desc; + struct irq_data *data; + int virq, ret; + struct irq_affinity_desc affinity = { + .is_managed = 1, + }; + + if (!IS_ENABLED(CONFIG_SMP)) + kunit_skip(test, "requires CONFIG_SMP for CPU hotplug"); + if (!get_cpu_device(1)) + kunit_skip(test, "requires more than 1 CPU for CPU hotplug"); + if (!cpu_is_hotpluggable(1)) + kunit_skip(test, "CPU 1 must be hotpluggable"); + + cpumask_copy(&affinity.mask, cpumask_of(1)); + + virq = irq_domain_alloc_descs(-1, 1, 0, NUMA_NO_NODE, &affinity); + KUNIT_ASSERT_GE(test, virq, 0); + + irq_set_chip_and_handler(virq, &fake_irq_chip, handle_simple_irq); + + desc = irq_to_desc(virq); + KUNIT_ASSERT_PTR_NE(test, desc, NULL); + + data = irq_desc_get_irq_data(desc); + KUNIT_ASSERT_PTR_NE(test, data, NULL); + + ret = request_irq(virq, noop_handler, 0, "test_irq", NULL); + KUNIT_EXPECT_EQ(test, ret, 0); + + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_TRUE(test, irqd_affinity_is_managed(data)); + + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + disable_irq(virq); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + KUNIT_EXPECT_EQ(test, remove_cpu(1), 0); + KUNIT_EXPECT_FALSE(test, irqd_is_activated(data)); + KUNIT_EXPECT_FALSE(test, irqd_is_started(data)); + KUNIT_EXPECT_GE(test, desc->depth, 1); + KUNIT_EXPECT_EQ(test, add_cpu(1), 0); + + KUNIT_EXPECT_FALSE(test, irqd_is_activated(data)); + KUNIT_EXPECT_FALSE(test, irqd_is_started(data)); + KUNIT_EXPECT_EQ(test, desc->depth, 1); + + enable_irq(virq); + KUNIT_EXPECT_TRUE(test, irqd_is_activated(data)); + KUNIT_EXPECT_TRUE(test, irqd_is_started(data)); + KUNIT_EXPECT_EQ(test, desc->depth, 0); + + free_irq(virq, NULL); +} + +static struct kunit_case irq_test_cases[] = { + KUNIT_CASE(irq_disable_depth_test), + KUNIT_CASE(irq_free_disabled_test), + KUNIT_CASE(irq_shutdown_depth_test), + KUNIT_CASE(irq_cpuhotplug_test), + {} +}; + +static struct kunit_suite irq_test_suite = { + .name = "irq_test_cases", + .test_cases = irq_test_cases, +}; + +kunit_test_suite(irq_test_suite); +MODULE_DESCRIPTION("IRQ unit test suite"); +MODULE_LICENSE("GPL"); ^ permalink raw reply related [flat|nested] 21+ messages in thread
end of thread, other threads:[~2025-06-19 8:32 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-14 20:13 [PATCH v2 0/2] genirq: Retain depth for managed IRQs across CPU hotplug Brian Norris 2025-05-14 20:13 ` [PATCH v2 1/2] " Brian Norris 2025-05-15 14:51 ` [tip: irq/core] genirq: Retain disable depth for managed interrupts " tip-bot2 for Brian Norris 2025-06-06 12:21 ` [PATCH v2 1/2] genirq: Retain depth for managed IRQs " Aleksandrs Vinarskis 2025-06-09 17:13 ` Brian Norris 2025-06-09 18:19 ` Aleksandrs Vinarskis 2025-06-10 20:07 ` Brian Norris 2025-06-11 6:50 ` Thomas Gleixner 2025-06-11 8:50 ` Thomas Gleixner 2025-06-11 18:51 ` Brian Norris 2025-06-11 6:56 ` Aleksandrs Vinarskis 2025-06-11 19:08 ` Brian Norris 2025-06-12 18:40 ` Brian Norris 2025-06-18 10:17 ` Johan Hovold 2025-06-18 17:10 ` Brian Norris 2025-06-19 8:32 ` Johan Hovold 2025-05-14 20:13 ` [PATCH v2 2/2] genirq: Add kunit tests for depth counts Brian Norris 2025-05-15 14:01 ` kernel test robot 2025-05-15 17:21 ` Brian Norris 2025-05-15 22:24 ` Thomas Gleixner 2025-05-15 14:51 ` [tip: irq/core] genirq: Add kunit tests for disable " tip-bot2 for Brian Norris
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).