From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03F791391; Fri, 15 May 2026 01:35:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778808911; cv=none; b=kn6sYwqahZ6HRAiAtl7qoCUC7tKgAK4+akyr+/nkwqSZN0sh7ctYJxAflenu47i6AJV9o03oDc2Ly56adqZ+xKSSpiBf6K+Z0QOu9nsqNC07a2j1Gl2ysqBTlaq5JwmOniJa9jRMn7bW+BKgg5/6CsiYJK6jFOLUchdQRM0SuVA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778808911; c=relaxed/simple; bh=MetcOpFWs7h0xBLOEAcEN86jIbdJx3yD33tX/Dul9QA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IoWFQMB8qdVIvGg2ohdU1zSARf7/7jA79esB8Jqe9JOHmD5DfzAbfG18sr0I1V3h2QBLSrCLxyqrv+gR00pcTdXZpaoZ5mryZNdBlOmJdQBGzHSwAWHMxysB8T6KBA/Bqz4ofpvhgbD8hcsJuUn7HQj3WtG9/KszrudF8FRQqJg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aoePZStM; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aoePZStM" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 99270C2BCB3; Fri, 15 May 2026 01:35:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1778808910; bh=MetcOpFWs7h0xBLOEAcEN86jIbdJx3yD33tX/Dul9QA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=aoePZStM19di7cOMDHRSgzmCY+aVOrEE0eWQnRIqxyw0oBc1XEIXblsXeNgeW2ail osxDIJXAuI+FJt2IqwlWVwy1ICdE5LOIDhr49KJin+5aQ/MtV3fNLK/HLkawW//JSy BddFgjaY1KaxKoGYeTazmo5XmwD4PCnKQkbLjvsEME6n/uMEhRdLvBENac8OakE2ti KXab4IpGJ1hJJ530jvQYgUIIWqHjIRuTOu4jTFKnKnPZDqvNdtbX/7NzvqWHQBD12V 2cuYeb6pstSc9LJaq6ayG/piIwbToIsSk9lijYkeDOQHnkPLnqgD1qTeneJ6LuTlPS cPJOs0y/j1jQw== From: Jakub Kicinski To: tariqt@nvidia.com Cc: Jakub Kicinski , edumazet@google.com, pabeni@redhat.com, andrew+netdev@lunn.ch, davem@davemloft.net, saeedm@nvidia.com, leon@kernel.org, mbloch@nvidia.com, shayd@nvidia.com, ohartoov@nvidia.com, edwards@nvidia.com, msanalla@nvidia.com, horms@kernel.org, parav@nvidia.com, moshe@nvidia.com, kees@kernel.org, phaddad@nvidia.com, gbayer@linux.ibm.com, netdev@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, gal@nvidia.com Subject: Re: [PATCH net] net/mlx5: Skip disabled vports when setting max TX speed Date: Thu, 14 May 2026 18:35:08 -0700 Message-ID: <20260515013508.1308541-1-kuba@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260513063640.334132-1-tariqt@nvidia.com> References: <20260513063640.334132-1-tariqt@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. --- net/mlx5: Skip disabled vports when setting max TX speed This commit modifies the LAG activation process to skip setting the maximum TX speed for disabled vports, preventing firmware commands from being sent to uninitialized vports. It stores the aggregated speed in the vport structure so it can be applied later if the vport is enabled. > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c > --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c [ ... ] > @@ -948,6 +966,9 @@ int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport, > > esw->enabled_vports++; > esw_debug(esw->dev, "Enabled VPORT(%d)\n", vport_num); > + > + if (vport->agg_max_tx_speed) Could dynamically allocated vports bypass the active LAG QoS speed limits? If a new vport is allocated dynamically after the LAG is already active, it appears agg_max_tx_speed would be zero-initialized. When this new vport is subsequently enabled here, the condition evaluates to false, which would skip the firmware speed update entirely. > + mlx5_esw_vport_set_max_tx_speed(esw, vport); Does this introduce a race condition with mlx5_lag_modify_device_vports_speed? If the LAG thread is concurrently updating the speed: LAG thread (lockless): vport->agg_max_tx_speed = speed; if (!vport->enabled) continue; ESW thread (with esw->state_lock): vport->enabled = true; ... if (vport->agg_max_tx_speed) mlx5_esw_vport_set_max_tx_speed(esw, vport); Since there are no shared locks or explicit memory barriers, could CPU reordering allow the LAG thread to see a false enabled state while the ESW thread sees a zero agg_max_tx_speed, resulting in neither thread issuing the firmware command? > done: > mutex_unlock(&esw->state_lock); [ ... ] > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c > --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c [ ... ] > @@ -1274,6 +1274,11 @@ static void mlx5_lag_modify_device_vports_speed(struct mlx5_core_dev *mdev, > if (vport->vport == MLX5_VPORT_UPLINK) > continue; > > + vport->agg_max_tx_speed = speed; > + > + if (!vport->enabled) Is there a regression here regarding the lifetime of the vport pointer? This loop locklessly iterates over the esw->vports xarray. Because the xarray iteration drops the RCU read lock before executing the loop body, and mlx5_esw_vport_free() calls kfree() synchronously without an RCU grace period, could a concurrent vport deletion cause these new field accesses to operate on freed memory? > + continue; > + > ret = mlx5_modify_vport_max_tx_speed(mdev, op_mod, > vport->vport, true, speed); Does this code contain a time-of-check to time-of-use regression that is affected by the lockless architecture here? Looking at mlx5_modify_vport_max_tx_speed(), it queries the firmware admin_state and writes it back without holding esw->state_lock. If the LAG thread reads an UP state, but mlx5_esw_vport_disable() concurrently disables the vport (setting the hardware state to DOWN), could the LAG thread accidentally overwrite the admin_state back to UP, effectively reverting the disable action? > if (ret) -- pw-bot: cr