From: Jason Gunthorpe <jgg@ziepe.ca>
To: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Cc: saeedm@nvidia.com, leon@kernel.org, tariqt@nvidia.com,
mbloch@nvidia.com, andrew+netdev@lunn.ch, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, rama.nichanamatlu@oracle.com,
manjunath.b.patil@oracle.com, anand.a.khoje@oracle.com
Subject: Re: [PATCH] net/mlx5: poll mlx5 eq during irq migration
Date: Wed, 4 Mar 2026 16:11:51 -0400
Message-ID: <20260304201151.GI964116@ziepe.ca>
In-Reply-To: <20260304161704.910564-1-praveen.kannoju@oracle.com>
On Wed, Mar 04, 2026 at 04:17:04PM +0000, Praveen Kumar Kannoju wrote:
> Interrupt lost scenario has been observed in multiple issues during IRQ
> migration due to cpu scaling activity. This further led to the presence of
> unhandled EQE's causing corresponding Mellanox transmission queues to
> become full and get timedout. This patch overcomes this situation by
> polling the EQ associated with the IRQ which undergoes migration, to
> recover any unhandled EQE's and keep the transmission uninterrupted from
> the corresponding queue.
What? This does not seem like something we should do like this.

IRQ migration is not supposed to lose interrupts, this seems like an
IRQ layer bug to me. If it is buggy and losing interrupts it should
probably inject a spurious interrupt around these events so all
devices can enjoy the bug fix.

Basically you need to explain in a lot more detail why the IRQ was
lost, not just some hand-wavy "migration something something"..

BTW there are known bugs in things like qemu that can lose interrupts
around changes to the MSI (and worse than that too), but I thought
they were all fixed now?

Jason