From: Florian Fainelli <florian.fainelli@broadcom.com>
To: stable@vger.kernel.org
Cc: "Ilya Maximets" <i.maximets@ovn.org>,
"Friedrich Weber" <f.weber@proxmox.com>,
"Aaron Conole" <aconole@redhat.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Sasha Levin" <sashal@kernel.org>,
"Carlos Soto" <carlos.soto@broadcom.com>,
"Florian Fainelli" <florian.fainelli@broadcom.com>,
"David S. Miller" <davem@davemloft.net>,
"Pravin B Shelar" <pshelar@ovn.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Martin KaFai Lau" <kafai@fb.com>,
"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@kernel.org>,
"Eric Dumazet" <edumazet@google.com>,
"Willem de Bruijn" <willemb@google.com>,
"Paolo Abeni" <pabeni@redhat.com>,
"Breno Leitao" <leitao@debian.org>,
"Benoît Monin" <benoit.monin@gmx.fr>,
"Yan Zhai" <yan@cloudflare.com>,
"Felix Huettner" <felix.huettner@mail.schwarz>,
"Joe Stringer" <joestringer@nicira.com>,
"Andy Zhou" <azhou@nicira.com>,
"Justin Pettit" <jpettit@nicira.com>,
"Thomas Graf" <tgraf@suug.ch>,
"Luca Czesla" <luca.czesla@mail.schwarz>,
"Simon Horman" <simon.horman@corigine.com>,
netdev@vger.kernel.org (open list:NETWORKING [GENERAL]),
linux-kernel@vger.kernel.org (open list),
dev@openvswitch.org (open list:OPENVSWITCH),
bpf@vger.kernel.org (open list:BPF (Safe dynamic programs and
tools))
Subject: [PATCH stable 5.4 v2 2/2] openvswitch: fix lockup on tx to unregistering netdev with carrier
Date: Tue, 25 Mar 2025 12:22:20 -0700 [thread overview]
Message-ID: <20250325192220.1849902-3-florian.fainelli@broadcom.com> (raw)
In-Reply-To: <20250325192220.1849902-1-florian.fainelli@broadcom.com>
From: Ilya Maximets <i.maximets@ovn.org>
[ Upstream commit 82f433e8dd0629e16681edf6039d094b5518d8ed ]
Commit in a fixes tag attempted to fix the issue in the following
sequence of calls:
do_output
-> ovs_vport_send
-> dev_queue_xmit
-> __dev_queue_xmit
-> netdev_core_pick_tx
-> skb_tx_hash
When device is unregistering, the 'dev->real_num_tx_queues' goes to
zero and the 'while (unlikely(hash >= qcount))' loop inside the
'skb_tx_hash' becomes infinite, locking up the core forever.
But unfortunately, checking just the carrier status is not enough to
fix the issue, because some devices may still be in unregistering
state while reporting carrier status OK.
One example of such device is a net/dummy. It sets carrier ON
on start, but it doesn't implement .ndo_stop to set the carrier off.
And it makes sense, because dummy doesn't really have a carrier.
Therefore, while this device is unregistering, it's still easy to hit
the infinite loop in the skb_tx_hash() from the OVS datapath. There
might be other drivers that do the same, but dummy by itself is
important for the OVS ecosystem, because it is frequently used as a
packet sink for tcpdump while debugging OVS deployments. And when the
issue is hit, the only way to recover is to reboot.
Fix that by also checking if the device is running. The running
state is handled by the net core during unregistering, so it covers
unregistering case better, and we don't really need to send packets
to devices that are not running anyway.
While only checking the running state might be enough, the carrier
check is preserved. The running and the carrier states seem disjoined
throughout the code and different drivers. And other core functions
like __dev_direct_xmit() check both before attempting to transmit
a packet. So, it seems safer to check both flags in OVS as well.
Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Tested-by: Friedrich Weber <f.weber@proxmox.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Carlos Soto <carlos.soto@broadcom.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
---
net/openvswitch/actions.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index aec20faadfcc..815a55fa7356 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -920,7 +920,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
{
struct vport *vport = ovs_vport_rcu(dp, out_port);
- if (likely(vport && netif_carrier_ok(vport->dev))) {
+ if (likely(vport &&
+ netif_running(vport->dev) &&
+ netif_carrier_ok(vport->dev))) {
u16 mru = OVS_CB(skb)->mru;
u32 cutlen = OVS_CB(skb)->cutlen;
--
2.34.1
prev parent reply other threads:[~2025-03-25 19:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-25 19:22 [PATCH stable 5.4 v2 0/2] openvswitch port output fixes Florian Fainelli
2025-03-25 19:22 ` [PATCH stable 5.4 v2 1/2] net: openvswitch: fix race on port output Florian Fainelli
2025-03-25 19:22 ` Florian Fainelli [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250325192220.1849902-3-florian.fainelli@broadcom.com \
--to=florian.fainelli@broadcom.com \
--cc=aconole@redhat.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=azhou@nicira.com \
--cc=benoit.monin@gmx.fr \
--cc=bpf@vger.kernel.org \
--cc=carlos.soto@broadcom.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dev@openvswitch.org \
--cc=edumazet@google.com \
--cc=f.weber@proxmox.com \
--cc=felix.huettner@mail.schwarz \
--cc=i.maximets@ovn.org \
--cc=joestringer@nicira.com \
--cc=john.fastabend@gmail.com \
--cc=jpettit@nicira.com \
--cc=kafai@fb.com \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=leitao@debian.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luca.czesla@mail.schwarz \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pshelar@ovn.org \
--cc=sashal@kernel.org \
--cc=simon.horman@corigine.com \
--cc=songliubraving@fb.com \
--cc=stable@vger.kernel.org \
--cc=tgraf@suug.ch \
--cc=willemb@google.com \
--cc=yan@cloudflare.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox