From: Jakub Kicinski <kuba@kernel.org>
To: anthony.l.nguyen@intel.com
Cc: Jakub Kicinski <kuba@kernel.org>,
davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
andrew+netdev@lunn.ch, netdev@vger.kernel.org,
vinschen@redhat.com, alexander.nowlin@intel.com
Subject: Re: [PATCH net 8/8] ixgbe: only access vfinfo and mv_list under RCU lock
Date: Fri, 22 May 2026 17:16:42 -0700 [thread overview]
Message-ID: <20260523001642.1758535-1-kuba@kernel.org> (raw)
In-Reply-To: <20260520183501.3360810-9-anthony.l.nguyen@intel.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
Subject: ixgbe: only access vfinfo and mv_list under RCU lock
This patch drops the spinlock protecting the adapter info in favor of RCU to
avoid lock contention and crashes when setting sriov_numvfs to zero. It
modifies the accesses to vfinfo and mv_list across the driver to use RCU
primitives.
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 2646ee6f295f0..d82c7dfc6580a 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
[ ... ]
> @@ -8309,9 +8340,15 @@ static void ixgbe_check_for_bad_vf(struct ixgbe_adapter *adapter)
>
> /* check status reg for all VFs owned by this PF */
> for (vf = 0; vf < adapter->num_vfs; ++vf) {
> - struct pci_dev *vfdev = adapter->vfinfo[vf].vfdev;
> + struct vf_data_storage *vfinfo;
> + struct pci_dev *vfdev = NULL;
> u16 status_reg;
>
> + rcu_read_lock();
> + vfinfo = rcu_dereference(adapter->vfinfo);
> + if (vfinfo)
> + vfdev = vfinfo[vf].vfdev;
> + rcu_read_unlock();
> if (!vfdev)
> continue;
> pci_read_config_word(vfdev, PCI_STATUS, &status_reg);
[Severity: High]
Is it safe to dereference vfdev outside the RCU read-side critical
section? By dropping the RCU read lock before calling pci_read_config_word(),
the grace period can expire and a concurrent execution of
ixgbe_disable_sriov() could free the pci_dev object, resulting in a
use-after-free.
[ ... ]
> @@ -10071,20 +10114,26 @@ static int handle_redirect_action(struct ixgbe_adapter *adapter, int ifindex,
> {
> struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
> unsigned int num_vfs = adapter->num_vfs, vf;
> + struct vf_data_storage *vfinfo;
> struct netdev_nested_priv priv;
> struct upper_walk_data data;
> struct net_device *upper;
>
> /* redirect to a SRIOV VF */
> - for (vf = 0; vf < num_vfs; ++vf) {
> - upper = pci_get_drvdata(adapter->vfinfo[vf].vfdev);
> - if (upper->ifindex == ifindex) {
> - *queue = vf * __ALIGN_MASK(1, ~vmdq->mask);
> - *action = vf + 1;
> - *action <<= ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF;
> - return 0;
> + rcu_read_lock();
> + vfinfo = rcu_dereference(adapter->vfinfo);
> + if (vfinfo)
> + for (vf = 0; vf < num_vfs; ++vf) {
[Severity: High]
Does reading adapter->num_vfs outside the RCU lock create a time-of-check to
time-of-use (TOCTOU) race? If SR-IOV is disabled and rapidly re-enabled, a
reader can cache an older, larger num_vfs value before acquiring the RCU
lock, but then fetch the newer, smaller vfinfo array. Iterating up to the old
num_vfs would then cause an out-of-bounds memory access.
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
> index 431d77da15a5f..80f22a8e7af43 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
[ ... ]
> @@ -123,6 +125,7 @@ static int __ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
> static void ixgbe_get_vfs(struct ixgbe_adapter *adapter)
> {
> struct pci_dev *pdev = adapter->pdev;
> + struct vf_data_storage *vfinfo;
> u16 vendor = pdev->vendor;
> struct pci_dev *vfdev;
> int vf = 0;
> @@ -134,18 +137,23 @@ static void ixgbe_get_vfs(struct ixgbe_adapter *adapter)
> return;
> pci_read_config_word(pdev, pos + PCI_SRIOV_VF_DID, &vf_id);
>
> - vfdev = pci_get_device(vendor, vf_id, NULL);
> - for (; vfdev; vfdev = pci_get_device(vendor, vf_id, vfdev)) {
> - if (!vfdev->is_virtfn)
> - continue;
> - if (vfdev->physfn != pdev)
> - continue;
> - if (vf >= adapter->num_vfs)
> - continue;
> - pci_dev_get(vfdev);
> - adapter->vfinfo[vf].vfdev = vfdev;
> - ++vf;
> + rcu_read_lock();
> + vfinfo = rcu_dereference(adapter->vfinfo);
> + if (vfinfo) {
> + vfdev = pci_get_device(vendor, vf_id, NULL);
> + for (; vfdev; vfdev = pci_get_device(vendor, vf_id, vfdev)) {
[Severity: High]
Does this code call a sleeping function inside an RCU read-side critical
section? pci_get_device() internally acquires the pci_bus_sem read-write
semaphore, which can sleep. Sleeping inside an RCU read section will
trigger a 'scheduling while atomic' BUG.
[ ... ]
> @@ -206,31 +214,28 @@ int ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
> {
> unsigned int num_vfs = adapter->num_vfs, vf;
> struct ixgbe_hw *hw = &adapter->hw;
> - unsigned long flags;
> + struct vf_data_storage *vfinfo;
> + struct vf_macvlans *mv_list;
> int rss;
>
> - spin_lock_irqsave(&adapter->vfs_lock, flags);
> - /* set num VFs to 0 to prevent access to vfinfo */
> + /* set num VFs to 0 so readers bail out early */
> adapter->num_vfs = 0;
> - spin_unlock_irqrestore(&adapter->vfs_lock, flags);
> +
> + vfinfo = rcu_replace_pointer(adapter->vfinfo, NULL, 1);
> + mv_list = rcu_replace_pointer(adapter->mv_list, NULL, 1);
>
> /* put the reference to all of the vf devices */
> for (vf = 0; vf < num_vfs; ++vf) {
> - struct pci_dev *vfdev = adapter->vfinfo[vf].vfdev;
> + struct pci_dev *vfdev = vfinfo[vf].vfdev;
>
> if (!vfdev)
> continue;
> - adapter->vfinfo[vf].vfdev = NULL;
> + vfinfo[vf].vfdev = NULL;
> pci_dev_put(vfdev);
> }
[Severity: High]
Can RCU readers observe these synchronous mutations before the grace
period ends? The vfinfo array is modified in-place and pci_dev_put() is
called on the devices synchronously. If the final reference is dropped,
the pci_dev could be freed while an RCU reader (like
handle_redirect_action()) is concurrently accessing it, leading to a
use-after-free.
[ ... ]
> @@ -643,10 +668,16 @@ static void ixgbe_clear_vf_vlans(struct ixgbe_adapter *adapter, u32 vf)
> static int ixgbe_set_vf_macvlan(struct ixgbe_adapter *adapter,
> int vf, int index, unsigned char *mac_addr)
> {
> - struct vf_macvlans *entry;
> + struct vf_macvlans *mv_list, *entry;
> bool found = false;
> int retval = 0;
>
> + lockdep_assert_in_rcu_read_lock();
> + /* vf_mvs entries point into the mv_list array */
> + mv_list = rcu_dereference(adapter->mv_list);
> + if (!mv_list)
> + return 0;
> +
> if (index <= 1) {
> list_for_each_entry(entry, &adapter->vf_mvs.l, l) {
[Severity: High]
Are standard list operations safe here under RCU? The adapter->vf_mvs.l
list is traversed using list_for_each_entry() under rcu_read_lock().
Concurrently, ixgbe_alloc_vf_macvlans() can re-initialize the list head
and add entries using list_add(). Without the appropriate RCU list
primitives, concurrent list modifications can cause data races and
corrupted pointers.
[ ... ]
> @@ -952,14 +1013,20 @@ static int ixgbe_set_vf_mac_addr(struct ixgbe_adapter *adapter,
> u32 *msgbuf, u32 vf)
> {
> u8 *new_mac = ((u8 *)(&msgbuf[1]));
> + struct vf_data_storage *vfinfo;
>
> if (!is_valid_ether_addr(new_mac)) {
> e_warn(drv, "VF %d attempted to set invalid mac\n", vf);
> return -1;
> }
>
> - if (adapter->vfinfo[vf].pf_set_mac && !adapter->vfinfo[vf].trusted &&
> - !ether_addr_equal(adapter->vfinfo[vf].vf_mac_addresses, new_mac)) {
> + lockdep_assert_in_rcu_read_lock();
> + vfinfo = rcu_dereference(adapter->vfinfo);
> + if (!vfinfo)
> + return 0;
[Severity: Medium]
Will this send false ACKs to the VF when vfinfo is NULL? The caller,
ixgbe_rcv_msg_from_vf(), interprets a return value of 0 as success and
sends an ACK (IXGBE_VT_MSGTYPE_ACK) to the VF. Returning 0 here means
the operation is silently ignored but acknowledged as successful. Could
this return an error like -EINVAL instead?
prev parent reply other threads:[~2026-05-23 0:16 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-20 18:34 [PATCH net 0/8][pull request] Intel Wired LAN Driver Updates 2026-05-20 (ice, iavf, i40e, ixgbe) Tony Nguyen
2026-05-20 18:34 ` [PATCH net 1/8] ice: fix UAF/NULL deref when VSI rebuild and XDP attach race Tony Nguyen
2026-05-21 15:37 ` Jakub Kicinski
2026-05-23 0:16 ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 2/8] ice: fix stats array overflow when VF requests more queues Tony Nguyen
2026-05-23 0:16 ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 3/8] iavf: return EBUSY if reset in progress or not ready during MAC change Tony Nguyen
2026-05-20 18:34 ` [PATCH net 4/8] i40e: skip unnecessary VF reset when setting trust Tony Nguyen
2026-05-23 0:16 ` Jakub Kicinski
2026-05-23 0:16 ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 5/8] iavf: send MAC change request synchronously Tony Nguyen
2026-05-23 0:16 ` Jakub Kicinski
2026-05-23 0:16 ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 6/8] ice: skip unnecessary VF reset when setting trust Tony Nguyen
2026-05-23 0:16 ` Jakub Kicinski
2026-05-23 0:16 ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 7/8] i40e: set supported_extts_flags for rising edge Tony Nguyen
2026-05-23 0:16 ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 8/8] ixgbe: only access vfinfo and mv_list under RCU lock Tony Nguyen
2026-05-23 0:16 ` Jakub Kicinski
2026-05-23 0:16 ` Jakub Kicinski [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260523001642.1758535-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=alexander.nowlin@intel.com \
--cc=andrew+netdev@lunn.ch \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=vinschen@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox