From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Wander Lairson Costa <wander@redhat.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>,
    Przemek Kitszel <przemyslaw.kitszel@intel.com>,
    Andrew Lunn <andrew+netdev@lunn.ch>,
    "David S. Miller" <davem@davemloft.net>,
    Eric Dumazet <edumazet@google.com>,
    Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
    Clark Williams <clrkwllms@kernel.org>,
    Steven Rostedt <rostedt@goodmis.org>,
    Jeff Garzik <jgarzik@redhat.com>,
    Auke Kok <auke-jan.h.kok@intel.com>,
    "moderated list:INTEL ETHERNET DRIVERS"
    <intel-wired-lan@lists.osuosl.org>,
    "open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
    open list <linux-kernel@vger.kernel.org>,
    "open list:Real-time Linux (PREEMPT_RT):Keyword:PREEMPT_RT"
    <linux-rt-devel@lists.linux.dev>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net 0/4] igb: fix igb_msix_other() handling for PREEMPT_RT
Date: Wed, 8 Jan 2025 11:25:32 +0100
Message-ID: <20250108102532.VWnKWvoo@linutronix.de>
In-Reply-To: <taea3z7nof4szjir2azxsjtbouymqxyy4draa3hz35zbacqeeq@t3uidpha64k7>

On 2025-01-07 15:52:47 [-0300], Wander Lairson Costa wrote:
> On Tue, Jan 07, 2025 at 02:51:06PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2024-12-04 08:42:23 [-0300], Wander Lairson Costa wrote:
> > > This is the second attempt at fixing the behavior of igb_msix_other()
> > > for PREEMPT_RT. The previous attempt [1] was reverted [2] following
> > > concerns raised by Sebastian [3].
> > >
> > > The initial approach proposed converting vfs_lock to a raw_spinlock,
> > > a minor change intended to make it safe. However, it became evident
> > > that igb_rcv_msg_from_vf() invokes kcalloc with GFP_ATOMIC,
> > > which is unsafe in interrupt context on PREEMPT_RT systems.
> > >
> > > To address this, the solution involves splitting igb_msg_task()
> > > into two parts:
> > >
> > > * One part invoked from the IRQ context.
> > > * Another part called from the threaded interrupt handler.
> > >
> > > To accommodate this, vfs_lock has been restructured into a double
> > > lock: a spinlock_t and a raw_spinlock_t. In the revised design:
> > >
> > > * igb_disable_sriov() locks both spinlocks.
> > > * Each part of igb_msg_task() locks the appropriate spinlock for
> > > its execution context.
> >
> > - Is this limited to PREEMPT_RT or does it also occur on PREEMPT systems
> > with threadirqs? And if this is PREEMPT_RT only, why?
>
> PREEMPT systems configured to use threadirqs should be affected as well,
> although I never tested with this configuration. Honestly, until now I wasn't
> aware of the possibility of a non PREEMPT_RT kernel with threaded IRQs by default.

If the issue is indeed the use of threaded interrupts, then the fix
should not be limited to PREEMPT_RT.

> > - What causes the failure? I see you reworked into two parts to behave
> > similar to what happens without threaded interrupts. There is still no
> > explanation for it. Is there a timing limit or was there another
> > register operation which removed the mailbox message?
> >
>
> I explained the root cause of the issue in the last commit. Maybe I should
> have added the explanation to the cover letter as well. Anyway, here is a
> partial verbatim copy of it:
>
> "During testing of SR-IOV, Red Hat QE encountered an issue where the
> ip link up command intermittently fails for the igbvf interfaces when
> using the PREEMPT_RT variant. Investigation revealed that
> e1000_write_posted_mbx returns an error due to the lack of an ACK
> from e1000_poll_for_ack.

Would that ACK have arrived if it had polled longer?

> The underlying issue arises from the fact that IRQs are threaded by
> default under PREEMPT_RT. While the exact hardware details are not
> available, it appears that the IRQ handled by igb_msix_other must
> be processed before e1000_poll_for_ack times out. However,
> e1000_write_posted_mbx is called with preemption disabled, leading
> to a scenario where the IRQ is serviced only after the failure of
> e1000_write_posted_mbx."

Where is this disabled preemption coming from? This should be one of the
ops.write_posted() calls, right? I've been looking around and don't see
anything obvious.

Couldn't you wait for an event instead of polling?

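To make the suspected failure mode concrete, here is a minimal userspace
sketch, not driver code. It assumes a hypothetical struct mbx_model in
which the handler that sets the ACK runs at a fixed tick; mbx_tick() and
poll_for_ack() are illustrative names and do not reflect the real e1000
mailbox functions or register layout. A bounded poll succeeds when the
handler runs inside the poll window and times out when the handler can
only run after the poller has given up, which is the situation described
for a threaded handler blocked by a non-preemptible poller.

```c
#include <stdbool.h>

/* Hypothetical mailbox model (assumption, not the igb/e1000 layout). */
struct mbx_model {
	unsigned int handler_runs_at; /* tick at which the handler sets ACK */
	bool ack;
};

/* Advance the model by one tick: the handler acts once its tick is reached. */
static void mbx_tick(struct mbx_model *m, unsigned int now)
{
	if (now >= m->handler_runs_at)
		m->ack = true;
}

/*
 * Bounded polling: returns 0 if the ACK shows up within max_ticks,
 * -1 on timeout. If the handler cannot run before the poller finishes
 * (handler_runs_at >= max_ticks), this always times out.
 */
static int poll_for_ack(struct mbx_model *m, unsigned int max_ticks)
{
	for (unsigned int now = 0; now < max_ticks; now++) {
		mbx_tick(m, now);
		if (m->ack)
			return 0;
	}
	return -1;
}
```

With handler_runs_at = 3 and a budget of 10 ticks the poll succeeds; with
handler_runs_at = 10 and the same budget it fails. Polling longer would
only help if the handler were able to run while the poller is spinning.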
> The call chain from igb_msg_task():
>
> igb_msg_task
>   igb_rcv_msg_from_vf
>     igb_set_vf_multicasts
>       igb_set_rx_mode
>         igb_write_mc_addr_list
>           kmalloc
>
> Cannot happen from interrupt context under PREEMPT_RT. So this part of
> the interrupt handler is deferred to a threaded IRQ handler.
>
> > > Cheers,
> > > Wander
Sebastian
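
For readers following the quoted cover letter, the two-part handler and
the double lock can be sketched in userspace as below. This is not the
code from the series: struct adapter_model, its fields, and all function
names are hypothetical, and pthread mutexes merely stand in for the
spinlock_t / raw_spinlock_t pair. The lock order in the teardown path
reflects the rule that on PREEMPT_RT a spinlock_t may sleep and therefore
has to be taken before a raw_spinlock_t, never while holding one.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the adapter; all names are illustrative. */
struct adapter_model {
	pthread_mutex_t thread_lock; /* plays the role of the spinlock_t */
	pthread_mutex_t irq_lock;    /* plays the role of the raw_spinlock_t */
	bool vfs_present;
	bool ack_sent;               /* set by the hard-IRQ part */
	int *mc_list;                /* allocated only by the threaded part */
};

static void adapter_init(struct adapter_model *a)
{
	pthread_mutex_init(&a->thread_lock, NULL);
	pthread_mutex_init(&a->irq_lock, NULL);
	a->vfs_present = true;
	a->ack_sent = false;
	a->mc_list = NULL;
}

/*
 * Hard-IRQ part: takes only the inner lock and never allocates.
 * Returns true when the threaded part should be woken.
 */
static bool msg_task_irq(struct adapter_model *a)
{
	bool wake = false;

	pthread_mutex_lock(&a->irq_lock);
	if (a->vfs_present) {
		a->ack_sent = true;
		wake = true;
	}
	pthread_mutex_unlock(&a->irq_lock);
	return wake;
}

/*
 * Threaded part: takes only the outer lock and may allocate.
 * Returns 0 on success, -1 if the VFs are gone or allocation failed.
 */
static int msg_task_thread(struct adapter_model *a)
{
	int ret = -1;

	pthread_mutex_lock(&a->thread_lock);
	if (a->vfs_present) {
		free(a->mc_list);
		a->mc_list = calloc(8, sizeof(*a->mc_list));
		ret = a->mc_list ? 0 : -1;
	}
	pthread_mutex_unlock(&a->thread_lock);
	return ret;
}

/* Teardown: holds both locks (outer first) while clearing vfs_present. */
static void disable_sriov(struct adapter_model *a)
{
	pthread_mutex_lock(&a->thread_lock);
	pthread_mutex_lock(&a->irq_lock);
	a->vfs_present = false;
	pthread_mutex_unlock(&a->irq_lock);
	free(a->mc_list); /* freed outside the inner lock */
	a->mc_list = NULL;
	pthread_mutex_unlock(&a->thread_lock);
}
```

After disable_sriov() returns, both msg_task_irq() and msg_task_thread()
see vfs_present == false and do nothing, which is the property the double
lock is meant to provide.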