From: Mingming Cao <mmc@linux.ibm.com>
To: netdev@vger.kernel.org
Cc: horms@kernel.org, bjking1@linux.ibm.com, haren@linux.ibm.com,
ricklind@linux.ibm.com, mmc@linux.ibm.com, kuba@kernel.org,
edumazet@google.com, pabeni@redhat.com,
linuxppc-dev@lists.ozlabs.org, maddy@linux.ibm.com,
mpe@ellerman.id.au, Dave Marquardt <davemarq@linux.ibm.com>
Subject: [PATCH v1 16/18] ibmveth: Implement incremental MQ RX queue resize
Date: Tue, 30 Jun 2026 07:53:23 -0700 [thread overview]
Message-ID: <f2aa676604b63b7e2c4797da911e9bc8dc2e83ab.1782758799.git.mmc@linux.ibm.com> (raw)
In-Reply-To: <cover.1782758799.git.mmc@linux.ibm.com>
Add ibmveth_resize_rx_queues_incremental() to grow or shrink
adapter->num_rx_queues while the netdev stays up.
Scale-up, per new queue index:
alloc RX resources and per-queue pools
register subordinate queue with PHYP
request_irq(), then ibmveth_enable_irq(), then napi_enable
update num_rx_queues, replenish new queues
netif_set_real_num_rx_queues()
Scale-down disables NAPI on excess queues, drains pending buffers,
disables PHYP IRQ delivery and waits for in-flight handlers with
synchronize_irq() before lowering num_rx_queues, then tears down
IRQ/PHYP/memory.
Reject out-of-range new_count. On scale-down netif failure, re-enable
NAPI on queues not yet torn down. Refresh VIO CMO entitlement after a
successful resize when FW_FEATURE_CMO is enabled.
Scale-up rollback mirrors scale-down: drain posted buffers and wait for
in-flight handlers before deregistering with PHYP.
In replenish_task(), skip queues with queue_index >= num_rx_queues and
require pool->free_map before replenishing so in-flight handlers avoid
queues being torn down without clearing probe-time pool->active on free.
Queue 0 is never removed here. Scale-up failure unwinds only queues
added in this call. ethtool -L wiring is next.
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmveth.c | 183 ++++++++++++++++++++++++++++-
1 file changed, 178 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index cd0acd1715da..ac4d89a66a8d 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -945,18 +945,22 @@ static void ibmveth_replenish_task(struct ibmveth_adapter *adapter,
unsigned long flags;
int i;
- if (queue_index >= adapter->num_rx_queues)
- return;
-
adapter->replenish_task_cycles++;
+ if (queue_index >= adapter->num_rx_queues) {
+ netdev_dbg(adapter->netdev,
+ "Skipping replenish for freed queue %d (num_queues=%d)\n",
+ queue_index, adapter->num_rx_queues);
+ return;
+ }
+
spin_lock_irqsave(&rxq->replenish_lock, flags);
for (i = (IBMVETH_NUM_BUFF_POOLS - 1); i >= 0; i--) {
struct ibmveth_buff_pool *pool =
&adapter->rx_buff_pool[queue_index][i];
- if (pool->active &&
+ if (pool->active && pool->free_map &&
(atomic_read(&pool->available) < pool->threshold))
ibmveth_replenish_buffer_pool(adapter, pool,
queue_index);
@@ -1682,7 +1686,7 @@ ibmveth_register_single_rx_queue(struct ibmveth_adapter *adapter,
* the IRQ mapping for subordinate queues. Queue 0 is freed only through
* ibmveth_free_all_queues() (H_FREE_LOGICAL_LAN).
*/
-static void __maybe_unused
+static void
ibmveth_deregister_single_rx_queue(struct ibmveth_adapter *adapter,
int queue_idx)
{
@@ -1714,6 +1718,175 @@ ibmveth_deregister_single_rx_queue(struct ibmveth_adapter *adapter,
netdev_dbg(adapter->netdev, "Deregistered queue %d\n", queue_idx);
}
+/**
+ * ibmveth_resize_rx_queues_incremental - Resize RX queue count incrementally
+ * @adapter: ibmveth adapter structure
+ * @new_count: Target number of RX queues
+ * @rxq_entries: Number of entries per RX queue
+ *
+ * Adds or removes RX queues without tearing down the entire adapter.
+ * Active queues continue receiving during scale-up; scale-down drains
+ * excess queues before deregistering them with the hypervisor.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+static int
+ibmveth_resize_rx_queues_incremental(struct ibmveth_adapter *adapter,
+ int new_count, int rxq_entries)
+{
+ struct net_device *netdev = adapter->netdev;
+ u64 mac_address = ether_addr_to_u64(netdev->dev_addr);
+ int old_count = adapter->num_rx_queues;
+ int failed_queue;
+ int rc, i;
+
+ if (old_count == new_count) {
+ netdev_dbg(netdev, "RX queue count unchanged (%d), nothing to do\n",
+ old_count);
+ return 0;
+ }
+
+ if (new_count < 1 || new_count > IBMVETH_MAX_RX_QUEUES) {
+ netdev_err(netdev, "Invalid RX queue count %d (must be 1-%d)\n",
+ new_count, IBMVETH_MAX_RX_QUEUES);
+ return -EINVAL;
+ }
+
+ netdev_info(netdev, "Incrementally resizing RX queues: %d to %d\n",
+ old_count, new_count);
+
+ if (new_count > old_count) {
+ netdev_dbg(netdev, "Scale-up: adding queues %d-%d\n",
+ old_count, new_count - 1);
+
+ for (i = old_count; i < new_count; i++) {
+ rc = ibmveth_alloc_single_rx_queue(adapter, i, rxq_entries);
+ if (rc) {
+ netdev_err(netdev, "Failed to allocate queue %d: %d\n",
+ i, rc);
+ goto cleanup_new_queues;
+ }
+
+ rc = ibmveth_register_single_rx_queue(adapter, i,
+ mac_address);
+ if (rc) {
+ netdev_err(netdev, "Failed to register queue %d: %d\n",
+ i, rc);
+ ibmveth_free_single_rx_queue(adapter, i);
+ goto cleanup_new_queues;
+ }
+
+ rc = ibmveth_setup_single_rx_interrupt(adapter, i);
+ if (rc) {
+ netdev_err(netdev,
+ "Failed to setup IRQ for queue %d: %d\n",
+ i, rc);
+ ibmveth_deregister_single_rx_queue(adapter, i);
+ ibmveth_free_single_rx_queue(adapter, i);
+ goto cleanup_new_queues;
+ }
+
+ rc = ibmveth_enable_irq(adapter, i);
+ if (rc) {
+ netdev_err(netdev,
+ "Failed to enable IRQ for queue %d: %d\n",
+ i, rc);
+ ibmveth_cleanup_single_rx_interrupt(adapter, i);
+ ibmveth_deregister_single_rx_queue(adapter, i);
+ ibmveth_free_single_rx_queue(adapter, i);
+ goto cleanup_new_queues;
+ }
+
+ napi_enable(&adapter->napi[i]);
+ }
+
+ adapter->num_rx_queues = new_count;
+
+ for (i = old_count; i < new_count; i++)
+ ibmveth_replenish_task(adapter, i);
+
+ rc = netif_set_real_num_rx_queues(netdev, new_count);
+ if (rc) {
+ netdev_err(netdev, "Failed to set real RX queues to %d: %d\n",
+ new_count, rc);
+ goto cleanup_new_queues;
+ }
+ } else {
+ netdev_dbg(netdev, "Scale-down: removing queues %d-%d\n",
+ new_count, old_count - 1);
+
+ for (i = new_count; i < old_count; i++)
+ napi_disable(&adapter->napi[i]);
+
+ for (i = new_count; i < old_count; i++)
+ ibmveth_drain_rx_queue(adapter, i);
+
+ synchronize_net();
+
+ rc = netif_set_real_num_rx_queues(netdev, new_count);
+ if (rc) {
+ netdev_err(netdev, "Failed to set real RX queues to %d: %d\n",
+ new_count, rc);
+ for (i = new_count; i < old_count; i++)
+ napi_enable(&adapter->napi[i]);
+ return rc;
+ }
+
+ /* Disable hypervisor interrupts and wait for handlers to complete
+ * before updating num_rx_queues.
+ */
+ for (i = new_count; i < old_count; i++) {
+ ibmveth_disable_irq(adapter, i);
+ synchronize_irq(adapter->queue_irq[i]);
+ }
+
+ adapter->num_rx_queues = new_count;
+
+ for (i = new_count; i < old_count; i++) {
+ ibmveth_cleanup_single_rx_interrupt(adapter, i);
+ ibmveth_deregister_single_rx_queue(adapter, i);
+ ibmveth_free_single_rx_queue(adapter, i);
+ }
+ }
+
+ netdev_info(netdev, "Successfully resized to %d RX queues (incremental)\n",
+ adapter->num_rx_queues);
+
+ if (firmware_has_feature(FW_FEATURE_CMO))
+ vio_cmo_set_dev_desired(adapter->vdev,
+ ibmveth_get_desired_dma(adapter->vdev));
+
+ return 0;
+
+cleanup_new_queues:
+ failed_queue = i;
+ netdev_err(netdev,
+ "Scale-up failed at queue %d, cleaning up queues %d-%d\n",
+ failed_queue, old_count, failed_queue - 1);
+ for (i = old_count; i < failed_queue; i++)
+ napi_disable(&adapter->napi[i]);
+
+ for (i = old_count; i < failed_queue; i++)
+ ibmveth_drain_rx_queue(adapter, i);
+
+ synchronize_net();
+
+ for (i = old_count; i < failed_queue; i++) {
+ ibmveth_disable_irq(adapter, i);
+ synchronize_irq(adapter->queue_irq[i]);
+ }
+
+ for (i = old_count; i < failed_queue; i++) {
+ ibmveth_cleanup_single_rx_interrupt(adapter, i);
+ ibmveth_deregister_single_rx_queue(adapter, i);
+ ibmveth_free_single_rx_queue(adapter, i);
+ }
+ adapter->num_rx_queues = old_count;
+ netdev_warn(netdev, "Keeping %d queues after scale-up failure\n",
+ old_count);
+ return rc;
+}
+
/**
* ibmveth_free_all_queues - Free all RX queues at once
* @adapter: ibmveth adapter structure
--
2.39.3 (Apple Git-146)
next prev parent reply other threads:[~2026-06-30 14:54 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-30 14:53 [PATCH v1 00/18] ibmveth: Add multi-queue RX Support Mingming Cao
2026-06-30 14:53 ` [PATCH v1 01/18] ibmveth: Add MQ RX hypercall wrappers and call definitions Mingming Cao
2026-06-30 14:53 ` [PATCH v1 02/18] ibmveth: Prepare adapter data structures for MQ RX Mingming Cao
2026-06-30 14:53 ` [PATCH v1 03/18] ibmveth: Add MQ-ready RX statistics structures Mingming Cao
2026-06-30 14:53 ` [PATCH v1 04/18] ibmveth: Refactor RX resource allocation for MQ RX bring-up Mingming Cao
2026-06-30 14:53 ` [PATCH v1 05/18] ibmveth: Refactor buffer pool management for per-queue MQ RX Mingming Cao
2026-06-30 14:53 ` [PATCH v1 06/18] ibmveth: Refactor RX interrupt control for MQ RX queues Mingming Cao
2026-06-30 14:53 ` [PATCH v1 07/18] ibmveth: Refactor TX resource allocation in open/close paths Mingming Cao
2026-06-30 14:53 ` [PATCH v1 08/18] ibmveth: Add RX queue register/deregister helpers for MQ Mingming Cao
2026-06-30 14:53 ` [PATCH v1 09/18] ibmveth: Refactor open/close into MQ-ready resource pipeline Mingming Cao
2026-06-30 14:53 ` [PATCH v1 10/18] ibmveth: Add queue-aware RX buffer submit helper for MQ Mingming Cao
2026-06-30 14:53 ` [PATCH v1 11/18] ibmveth: Enable multi-queue RX receive path Mingming Cao
2026-06-30 14:53 ` [PATCH v1 12/18] ibmveth: Add per-queue RX statistics collection and reporting Mingming Cao
2026-06-30 14:53 ` [PATCH v1 13/18] ibmveth: Add per-queue TX statistics reporting Mingming Cao
2026-06-30 14:53 ` [PATCH v1 14/18] ibmveth: Expose per-queue buffer pool details via sysfs Mingming Cao
2026-06-30 14:53 ` [PATCH v1 15/18] ibmveth: Add helpers for incremental MQ RX queue resize Mingming Cao
2026-06-30 14:53 ` Mingming Cao [this message]
2026-06-30 14:53 ` [PATCH v1 17/18] ibmveth: Wire ethtool set_channels to " Mingming Cao
2026-06-30 14:53 ` [PATCH v1 18/18] ibmveth: Fix MQ RX poll and shutdown hangs after " Mingming Cao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f2aa676604b63b7e2c4797da911e9bc8dc2e83ab.1782758799.git.mmc@linux.ibm.com \
--to=mmc@linux.ibm.com \
--cc=bjking1@linux.ibm.com \
--cc=davemarq@linux.ibm.com \
--cc=edumazet@google.com \
--cc=haren@linux.ibm.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=ricklind@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox