From: Thomas Gleixner <tglx@linutronix.de>
To: kernel test robot <yujie.liu@intel.com>,
Shanker Donthineni <sdonthineni@nvidia.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
linux-kernel@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Michael Walle <michael@walle.cc>,
Shanker Donthineni <sdonthineni@nvidia.com>,
Vikram Sethi <vsethi@nvidia.com>
Subject: Re: [PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors management
Date: Wed, 26 Apr 2023 14:08:54 +0200
Message-ID: <87a5yuzvzd.ffs@tglx>
In-Reply-To: <202304251035.19367560-yujie.liu@intel.com>
On Tue, Apr 25 2023 at 11:16, kernel test robot wrote:
> kernel test robot noticed "WARNING:at_arch/x86/kernel/apic/ipi.c:#default_send_IPI_mask_logical" on:
>
> commit: 13eb5c4e7d2fb860d3dc5f63d910e3acf78dfd28 ("[PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors management")
> url: https://github.com/intel-lab-lkp/linux/commits/Shanker-Donthineni/genirq-Use-hlist-for-managing-resend-handlers/20230410-235853
> base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 6f3ee0e22b4c62f44b8fa3c8de6e369a4d112a75
> patch link: https://lore.kernel.org/all/20230410155721.3720991-4-sdonthineni@nvidia.com/
> patch subject: [PATCH v3 3/3] genirq: Use the maple tree for IRQ
> descriptors management
This happens during CPU hot-unplug.
[ 206.930774][ T228] block/008 => sdb2 (do IO while hotplugging CPUs)
[ 206.935757][ T2086] run blktests block/008 at 2023-04-22 16:27:25
[ 207.199359][ T2086] smpboot: CPU 2 is now offline
[ 207.468574][ T30] WARNING: CPU: 3 PID: 30 at arch/x86/kernel/apic/ipi.c:299 default_send_IPI_mask_logical+0x40/0x44
[ 207.568426][ T30] CPU: 3 PID: 30 Comm: migration/3 Tainted: G S E 6.2.0-rc4-00051-g13eb5c4e7d2f #1
[ 207.588372][ T30] Stopper: multi_cpu_stop+0x0/0xf0 <- stop_machine_cpuslocked+0xf5/0x138
[ 207.596649][ T30] EIP: default_send_IPI_mask_logical+0x40/0x44
This warns because fixup_irqs() sends an IPI to an offline CPU, in this
case to CPU3, which just cleared its online bit and is about to vanish:
[ 207.622147][ T30] EAX: 00000008 EBX: 00000002 ECX: fffffffc EDX: 00000022
EAX contains the target and ECX the inverted online mask. That's
probably the ata2 interrupt as that later detects a timeout:
[ 238.826212][ T174] ata2.00: exception Emask 0x0 SAct 0x3c00000 SErr 0x0 action 0x6 frozen
[ 238.834522][ T174] ata2.00: failed command: READ FPDMA QUEUED
[ 238.840378][ T174] ata2.00: cmd 60/08:b0:90:3e:90/00:00:25:00:00/40 tag 22 ncq dma 4096 in
[ 238.840378][ T174] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Which means that migrating the interrupt away from the outgoing CPU3
failed for reasons which are not yet understood.
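For anyone who wants to double check the register decoding, here is a
minimal sketch in Python. It takes the EAX and ECX values from the splat
and reads them as described above (EAX as the target mask, ECX as the
inverted online mask). The 32-bit width is an assumption based on the
EIP/EAX register names, and the helper name cpus_in() is made up for
this illustration. It is not kernel code.

```python
# Register values copied from the WARNING splat above.
EAX = 0x00000008  # IPI target mask, read as a CPU bitmap
ECX = 0xFFFFFFFC  # inverted online mask, per the analysis above

MASK32 = 0xFFFFFFFF  # assumption: 32-bit kernel, so masks are 32 bits wide


def cpus_in(mask):
    """Return the CPU numbers whose bits are set in a 32-bit bitmap."""
    return [cpu for cpu in range(32) if mask & (1 << cpu)]


online = ~ECX & MASK32       # undo the inversion to get the online mask
offline_targets = EAX & ECX  # target bits which are not in the online mask

print("target CPUs:    ", cpus_in(EAX))
print("online CPUs:    ", cpus_in(online))
print("offline targets:", cpus_in(offline_targets))
```

With these values the target is CPU3 and the online mask holds only CPU0
and CPU1, which matches CPU2 having gone offline earlier in the log and
CPU3 being on its way out.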
The patch in question is changing the interrupt descriptor storage and
with that also the iterator function. But I can't spot anything wrong
right now.
But what I can spot is this:
[ 0.000000][ T0] Linux version 6.2.0-rc4-00051-g13eb5c4e7d2f
IOW, that test is based on some random upstream version, which lacks
about 30 commits to maple_tree, 12 of which have 'fix' in the
commit subject.
Can you please retest this on v6.3 and report back if the problem
persists?
Thanks,
tglx