public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: devnull+satishkh.cisco.com@kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	20260401-enic-sriov-v2-prep-v4-0-d5834b2ef1b9@cisco.com,
	satishkh@cisco.com
Subject: Re: [PATCH net-next v4 10/10] enic: add V2 VF probe with admin channel and PF registration
Date: Mon, 13 Apr 2026 17:22:34 -0700	[thread overview]
Message-ID: <20260414002234.3735463-1-kuba@kernel.org> (raw)
In-Reply-To: <20260411-enic-sriov-v2-admin-channel-v2-v4-10-f052326c2a57@cisco.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
enic: add V2 VF probe with admin channel and PF registration

When a V2 SR-IOV VF probes, open the admin channel, initialize the
MBOX protocol, perform the capability check with the PF, and register
with the PF. This establishes the PF-VF communication path that the PF
uses to send link state notifications.

> diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
> index 057716ccc2837..bf4417e67b16d 100644
> --- a/drivers/net/ethernet/cisco/enic/enic_main.c
> +++ b/drivers/net/ethernet/cisco/enic/enic_main.c
[ ... ]
> @@ -2992,6 +2997,32 @@ static int enic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  		goto err_out_dev_close;
>  	}
>  
> +	/* V2 VF: open admin channel and register with PF.
> +	 * Must happen before register_netdev so the VF is fully
> +	 * initialized before the interface is visible to userspace.
> +	 */
> +	if (enic_is_sriov_vf_v2(enic)) {
> +		err = enic_admin_channel_open(enic);
> +		if (err) {
> +			dev_err(dev,
> +				"Failed to open admin channel: %d\n", err);
> +			goto err_out_dev_deinit;
> +		}
> +		enic_mbox_init(enic);

Could this ordering cause a race condition with an uninitialized completion object?

enic_admin_channel_open() dynamically allocates the MSI-X interrupt, registers
the IRQ handler, and enables the admin hardware queues, making the interrupt
live immediately.

Only afterward is enic_mbox_init() called, which initializes the
enic->mbox_comp completion struct. If a spurious interrupt or a left-over
admin message causes the admin NAPI poll handler to fire between these two
calls, it will attempt to call complete(&enic->mbox_comp) before the
completion struct's internal lock has been initialized.

Also regarding the admin channel resources, are we properly handling the admin
channel across a soft reset?

When the driver recovers from a Tx hang or queue error via
enic_tx_hang_reset() or enic_reset(), it issues a CMD_SOFT_RESET to the
firmware, which resets and disables all hardware queues on the VNIC. The
recovery path then calls enic_init_vnic_resources(), which only re-initializes
and enables the datapath queues (wq, rq, cq).

It completely ignores the admin channel queues (admin_wq, admin_rq, admin_cq).
As a result, the admin queues remain disabled in hardware and desynchronized
from the software descriptors.

> +		err = enic_mbox_vf_capability_check(enic);
[ ... ]
> @@ -3156,6 +3199,17 @@ static void enic_remove(struct pci_dev *pdev)
>  		cancel_work_sync(&enic->reset);
>  		cancel_work_sync(&enic->change_mtu_work);
>  		unregister_netdev(netdev);
> +		if (enic_is_sriov_vf_v2(enic)) {
> +			if (enic->vf_registered) {
> +				int unreg_err = enic_mbox_vf_unregister(enic);
> +
> +				if (unreg_err)
> +					netdev_warn(netdev,
> +						    "Failed to unregister from PF: %d\n",
> +						    unreg_err);
> +			}
> +			enic_admin_channel_close(enic);
> +		}

Does this unregistration ordering introduce a use-after-free on the freed
netdev?

Because the admin interrupt is not masked by ndo_stop(), it remains fully
active after the network device is shut down. If the PF asynchronously sends a
link state notification after unregister_netdev(netdev) but before
enic_admin_channel_close(enic):

The admin workqueue will process the notification and call
netif_carrier_on(enic->netdev). This adds the device to the lweventlist and
calls dev_hold(dev), bumping dev->pcpu_refcnt.

However, unregister_netdev() has already completed its wait for pcpu_refcnt to
drop to zero. The driver then proceeds to call free_netdev(netdev), freeing
both the pcpu_refcnt allocation and the netdev memory. When the linkwatch
worker subsequently processes the event, it accesses this freed memory.

Should the admin channel be closed before unregister_netdev()?

> diff --git a/drivers/net/ethernet/cisco/enic/enic_res.c b/drivers/net/ethernet/cisco/enic/enic_res.c
> index 436326ace0499..74cd2ee3af5cb 100644
> --- a/drivers/net/ethernet/cisco/enic/enic_res.c
> +++ b/drivers/net/ethernet/cisco/enic/enic_res.c
> @@ -211,7 +211,8 @@ void enic_get_res_counts(struct enic *enic)
>  		vnic_dev_get_res_count(enic->vdev, RES_TYPE_ADMIN_RQ) >= 1 &&
>  		vnic_dev_get_res_count(enic->vdev, RES_TYPE_ADMIN_CQ) >=
>  			ARRAY_SIZE(enic->admin_cq) &&
> -		vnic_dev_get_res_count(enic->vdev, RES_TYPE_SRIOV_INTR) >= 1;
> +		(enic_is_sriov_vf_v2(enic) ||
> +		 vnic_dev_get_res_count(enic->vdev, RES_TYPE_SRIOV_INTR) >= 1);

Does this configuration lead to interrupt resource exhaustion and probe failures?

enic_adjust_resources() allocates all available interrupts (minus
ENIC_MSIX_RESERVED_INTR) to data queues, which sets enic->intr_count equal to
enic->intr_avail when queue counts are maximized.

However, enic_admin_setup_intr() requires an additional interrupt at index
intr_index = enic->intr_count. This triggers the
intr_index >= enic->intr_avail bounds check and returns -ENODEV, causing the
VF probe to abort.

Additionally, on the PF side, vnic_intr_alloc() hardcodes the resource type to
RES_TYPE_INTR_CTRL, effectively stealing a standard datapath interrupt instead
of utilizing the dedicated RES_TYPE_SRIOV_INTR hardware resource provisioned
by the firmware.

      reply	other threads:[~2026-04-14  0:22 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-12  5:06 [PATCH net-next v4 00/10] enic: SR-IOV V2 admin channel and MBOX protocol Satish Kharat via B4 Relay
2026-04-12  5:06 ` [PATCH net-next v4 01/10] enic: verify firmware supports V2 SR-IOV at probe time Satish Kharat via B4 Relay
2026-04-12  5:06 ` [PATCH net-next v4 02/10] enic: add admin channel open and close for SR-IOV Satish Kharat via B4 Relay
2026-04-14  0:21   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 03/10] enic: add admin RQ buffer management Satish Kharat via B4 Relay
2026-04-14  0:21   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 04/10] enic: add admin CQ service with MSI-X interrupt and NAPI polling Satish Kharat via B4 Relay
2026-04-14  0:21   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 05/10] enic: define MBOX message types and header structures Satish Kharat via B4 Relay
2026-04-14  0:21   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 06/10] enic: add MBOX core send and receive for admin channel Satish Kharat via B4 Relay
2026-04-14  0:21   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 07/10] enic: add MBOX PF handlers for VF register and capability Satish Kharat via B4 Relay
2026-04-14  0:21   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 08/10] enic: add MBOX VF handlers for capability, register and link state Satish Kharat via B4 Relay
2026-04-14  0:22   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 09/10] enic: wire V2 SR-IOV enable with admin channel and MBOX Satish Kharat via B4 Relay
2026-04-14  0:22   ` Jakub Kicinski
2026-04-12  5:06 ` [PATCH net-next v4 10/10] enic: add V2 VF probe with admin channel and PF registration Satish Kharat via B4 Relay
2026-04-14  0:22   ` Jakub Kicinski [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260414002234.3735463-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=20260401-enic-sriov-v2-prep-v4-0-d5834b2ef1b9@cisco.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=devnull+satishkh.cisco.com@kernel.org \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=satishkh@cisco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox