From: sashiko-bot@kernel.org
To: "Long Li" <longli@microsoft.com>
Cc: linux-hyperv@vger.kernel.org
Subject: Re: [PATCH net-next v9 1/6] net: mana: Create separate EQs for each vPort
Date: Thu, 14 May 2026 22:10:32 +0000 [thread overview]
Message-ID: <20260514221033.3B5C9C2BCB3@smtp.kernel.org> (raw)
In-Reply-To: <20260513220956.402058-2-longli@microsoft.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Race condition in `mana_set_channels()` allows an RDMA application to steal vport ownership during an Ethernet reconfiguration, leading to out-of-bounds memory accesses and interface breakage.
- [Critical] Per-PD reference counting for per-port EQs causes a scope mismatch, leading to NULL pointer dereferences and resource leaks when a Protection Domain is shared across multiple ports.
--
commit 31a11c151b6c97fb0cd32cd8787a2fa1cd92856b
Author: Long Li <longli@microsoft.com>
Subject: net: mana: Create separate EQs for each vPort
To prepare for assigning vPorts to dedicated MSI-X vectors, remove EQ
sharing among the vPorts and create dedicated EQs for each vPort.
> diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
> index ac5e75dd3494d..6159bd03a0217 100644
> --- a/drivers/infiniband/hw/mana/main.c
> +++ b/drivers/infiniband/hw/mana/main.c
[ ... ]
> @@ -55,15 +57,23 @@ int mana_ib_cfg_vport(struct mana_ib_dev *dev, u32 port, struct mana_ib_pd *pd,
> return err;
> }
>
> - mutex_unlock(&pd->vport_mutex);
>
> - pd->tx_shortform_allowed = mpc->tx_shortform_allowed;
> - pd->tx_vp_offset = mpc->tx_vp_offset;
> + err = mana_create_eq(mpc);
> + if (err) {
> + mana_uncfg_vport(mpc);
> + pd->vport_use_count--;
> + } else {
> + pd->tx_shortform_allowed = mpc->tx_shortform_allowed;
> + pd->tx_vp_offset = mpc->tx_vp_offset;
> + }
>
> - ibdev_dbg(&dev->ib_dev, "vport handle %llx pdid %x doorbell_id %x\n",
> - mpc->port_handle, pd->pdn, doorbell_id);
> + mutex_unlock(&pd->vport_mutex);
Could this lead to a NULL pointer dereference or resource leak if a Protection
Domain is shared across multiple ports?
The vport_use_count is tracked per-PD (pd->vport_use_count), but Event Queues
are now managed per-port (mpc->eqs).
If a user creates a QP on Port 1, pd->vport_use_count becomes 1 and mpc1->eqs
are allocated. If the user then creates a QP on Port 2 using the same PD, the
code in mana_ib_cfg_vport() will see pd->vport_use_count > 1 and bypass
configuration for Port 2.
This leaves mpc2->eqs as NULL if the Ethernet interface is down, leading to a
NULL dereference when EQs are accessed. When QPs are later destroyed, it
will also destroy the context of whichever port the last QP belonged to,
permanently leaking the hardware configuration and EQs of the port that was
actually initialized.
[ ... ]
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> index 04350973e19e3..e121834d17f3a 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> @@ -454,18 +454,40 @@ static int mana_set_coalesce(struct net_device *ndev,
> return err;
> }
>
> +/* mana_set_channels - change the number of queues on a port
> + *
> + * Returns -EBUSY if the port is down and RDMA holds the vport with
> + * EQs sized to the current num_queues.
> + */
> static int mana_set_channels(struct net_device *ndev,
> struct ethtool_channels *channels)
> {
> struct mana_port_context *apc = netdev_priv(ndev);
> unsigned int new_count = channels->combined_count;
> unsigned int old_count = apc->num_queues;
> + bool locked = false;
> int err;
>
> + /* When the port is down, hold vport_mutex for the entire
> + * operation to serialize against RDMA's mana_cfg_vport().
> + * This is safe because mana_detach()/mana_attach() skip
> + * vport teardown/setup when port_st_save is false.
> + * When the port is up, Ethernet owns the vport exclusively
> + * so no locking against RDMA is needed.
> + */
> + if (!apc->port_is_up) {
> + mutex_lock(&apc->vport_mutex);
> + if (apc->vport_use_count) {
> + mutex_unlock(&apc->vport_mutex);
> + return -EBUSY;
> + }
> + locked = true;
> + }
>
> err = mana_pre_alloc_rxbufs(apc, ndev->mtu, new_count);
> if (err) {
> netdev_err(ndev, "Insufficient memory for new allocations");
> - return err;
> + goto unlock;
> }
>
> err = mana_detach(ndev, false);
Can this lead to a race condition with RDMA operations when the port is up?
If apc->port_is_up is true, the function does not acquire apc->vport_mutex.
It proceeds to call mana_detach(), which calls mana_uncfg_vport() and lowers
apc->vport_use_count to 0.
Because no lock is held across the detach/attach window, a concurrent RDMA
operation like mana_ib_create_qp_raw() could grab apc->vport_mutex, observe
a use count of 0, and steal ownership of the vport.
If the RDMA thread allocates apc->eqs based on old_count, and then
mana_set_channels() updates apc->num_queues = new_count, subsequent RDMA
queue indexing operations may read out of bounds. The Ethernet interface will
also fail to reattach because RDMA now owns the vport, leaving the interface
broken.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260513220956.402058-1-longli@microsoft.com?part=1
next prev parent reply other threads:[~2026-05-14 22:10 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-13 22:09 [PATCH net-next v9 0/6] net: mana: Per-vPort EQ and MSI-X interrupt management Long Li
2026-05-13 22:09 ` [PATCH net-next v9 1/6] net: mana: Create separate EQs for each vPort Long Li
2026-05-14 22:10 ` sashiko-bot [this message]
2026-05-13 22:09 ` [PATCH net-next v9 2/6] net: mana: Query device capabilities and configure MSI-X sharing for EQs Long Li
2026-05-14 22:10 ` sashiko-bot
2026-05-13 22:09 ` [PATCH net-next v9 3/6] net: mana: Introduce GIC context with refcounting for interrupt management Long Li
2026-05-13 22:09 ` [PATCH net-next v9 4/6] net: mana: Use GIC functions to allocate global EQs Long Li
2026-05-13 22:09 ` [PATCH net-next v9 5/6] net: mana: Allocate interrupt context for each EQ when creating vPort Long Li
2026-05-13 22:09 ` [PATCH net-next v9 6/6] RDMA/mana_ib: Allocate interrupt contexts on EQs Long Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260514221033.3B5C9C2BCB3@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=linux-hyperv@vger.kernel.org \
--cc=longli@microsoft.com \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.