Re: [ovs-dev] [PATCH net] openvswitch: fix lockup on tx to unregistering netdev with carrier

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Aaron Conole <aconole@redhat.com>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: netdev@vger.kernel.org,  dev@openvswitch.org,
	 Friedrich Weber <f.weber@proxmox.com>,
	 Luca Czesla <luca.czesla@mail.schwarz>,
	linux-kernel@vger.kernel.org,
	 Felix Huettner <felix.huettner@mail.schwarz>,
	 Eric Dumazet <edumazet@google.com>,
	 Simon Horman <horms@kernel.org>,
	 Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>,
	 "David S. Miller" <davem@davemloft.net>
Subject: Re: [ovs-dev] [PATCH net] openvswitch: fix lockup on tx to unregistering netdev with carrier
Date: Thu, 09 Jan 2025 12:05:51 -0500	[thread overview]
Message-ID: <f7to70fq5n4.fsf@redhat.com> (raw)
In-Reply-To: <20250109122225.4034688-1-i.maximets@ovn.org> (Ilya Maximets's message of "Thu, 9 Jan 2025 13:21:24 +0100")

Ilya Maximets <i.maximets@ovn.org> writes:

> Commit in a fixes tag attempted to fix the issue in the following
> sequence of calls:
>
>     do_output
>     -> ovs_vport_send
>        -> dev_queue_xmit
>           -> __dev_queue_xmit
>              -> netdev_core_pick_tx
>                 -> skb_tx_hash
>
> When device is unregistering, the 'dev->real_num_tx_queues' goes to
> zero and the 'while (unlikely(hash >= qcount))' loop inside the
> 'skb_tx_hash' becomes infinite, locking up the core forever.
>
> But unfortunately, checking just the carrier status is not enough to
> fix the issue, because some devices may still be in unregistering
> state while reporting carrier status OK.
>
> One example of such device is a net/dummy.  It sets carrier ON
> on start, but it doesn't implement .ndo_stop to set the carrier off.
> And it makes sense, because dummy doesn't really have a carrier.
> Therefore, while this device is unregistering, it's still easy to hit
> the infinite loop in the skb_tx_hash() from the OVS datapath.  There
> might be other drivers that do the same, but dummy by itself is
> important for the OVS ecosystem, because it is frequently used as a
> packet sink for tcpdump while debugging OVS deployments.  And when the
> issue is hit, the only way to recover is to reboot.
>
> Fix that by also checking if the device is running.  The running
> state is handled by the net core during unregistering, so it covers
> unregistering case better, and we don't really need to send packets
> to devices that are not running anyway.
>
> While only checking the running state might be enough, the carrier
> check is preserved.  The running and the carrier states seem disjoined
> throughout the code and different drivers.  And other core functions
> like __dev_direct_xmit() check both before attempting to transmit
> a packet.  So, it seems safer to check both flags in OVS as well.
>
> Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
> Reported-by: Friedrich Weber <f.weber@proxmox.com>
> Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
>  net/openvswitch/actions.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 16e260014684..704c858cf209 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -934,7 +934,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
>  {
>  	struct vport *vport = ovs_vport_rcu(dp, out_port);
>  
> -	if (likely(vport && netif_carrier_ok(vport->dev))) {
> +	if (likely(vport &&
> +		   netif_running(vport->dev) &&
> +		   netif_carrier_ok(vport->dev))) {
>  		u16 mru = OVS_CB(skb)->mru;
>  		u32 cutlen = OVS_CB(skb)->cutlen;

Reviewed-by: Aaron Conole <aconole@redhat.com>

I tried on with my VMs to reproduce the issue as described in the email
report, but I probably didn't give enough resources (or gave too many
resources) to get the race condition to bubble up.  I was using kernel
6.13-rc5 (0bc21e701a6f) also.

In any case, I think the change makes sense.

next prev parent reply	other threads:[~2025-01-09 17:06 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-09 12:21 [PATCH net] openvswitch: fix lockup on tx to unregistering netdev with carrier Ilya Maximets
2025-01-09 16:13 ` Friedrich Weber
2025-01-09 17:05 ` Aaron Conole [this message]
2025-01-10 11:50   ` [ovs-dev] " Friedrich Weber
2025-01-11  2:40 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f7to70fq5n4.fsf@redhat.com \
    --to=aconole@redhat.com \
    --cc=davem@davemloft.net \
    --cc=dev@openvswitch.org \
    --cc=edumazet@google.com \
    --cc=f.weber@proxmox.com \
    --cc=felix.huettner@mail.schwarz \
    --cc=horms@kernel.org \
    --cc=i.maximets@ovn.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luca.czesla@mail.schwarz \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).