From: Mingming Cao <mmc@linux.ibm.com>
To: netdev@vger.kernel.org
Cc: horms@kernel.org, bjking1@linux.ibm.com, haren@linux.ibm.com,
ricklind@linux.ibm.com, mmc@linux.ibm.com, kuba@kernel.org,
edumazet@google.com, pabeni@redhat.com,
linuxppc-dev@lists.ozlabs.org, maddy@linux.ibm.com,
mpe@ellerman.id.au, Dave Marquardt <davemarq@linux.ibm.com>
Subject: [PATCH net-next v2 12/15] ibmveth: Add helpers for incremental MQ RX queue resize
Date: Wed, 1 Jul 2026 15:23:24 -0700
Message-ID: <20260701222327.61325-13-mmc@linux.ibm.com>
In-Reply-To: <20260701222327.61325-1-mmc@linux.ibm.com>
Patches 12-14 add runtime RX queue resize via ethtool -L: this patch adds
the single-queue helpers, the next adds
ibmveth_resize_rx_queues_incremental(), and the one after wires up ethtool
set_channels.
Design: the RX queue count must be changeable without a full close/open.
Close tears down the whole logical LAN (H_FREE_LOGICAL_LAN), dropping
every queue and disrupting traffic on queues that should stay up.
Incremental resize is viable for two reasons. First, PHYP with MQ support
registers and frees subordinate queues independently
(H_REG_LOGICAL_LAN_QUEUE and a per-queue free) while queue 0 keeps the
adapter handle. Second, the earlier per-queue bring-up helpers already
split pools, IRQs, and PHYP registration by queue index. A resize can
therefore grow or shrink by touching only the indices that change, leaving
surviving queues registered with their buffers and IRQs intact.
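As an illustration of "touching only the indices that change", the sketch below computes the index range a resize would visit. It is a user-space model, not driver code: struct resize_plan and plan_resize() are hypothetical names introduced here, and the only rule taken from the description above is that queue 0 always survives.

```c
#include <assert.h>

/* Hypothetical description of the queue indices a resize touches. */
struct resize_plan {
	int first;	/* first index that changes */
	int count;	/* number of indices that change */
	int grow;	/* 1 = queues are added, 0 = removed (or no-op) */
};

/*
 * Queue 0 keeps the adapter handle, so both counts must be >= 1.
 * Indices below min(old_n, new_n) are never part of the plan.
 */
static struct resize_plan plan_resize(int old_n, int new_n)
{
	struct resize_plan p = { 0, 0, 0 };

	assert(old_n >= 1 && new_n >= 1);

	if (new_n > old_n) {
		p.first = old_n;
		p.count = new_n - old_n;
		p.grow = 1;
	} else if (new_n < old_n) {
		p.first = new_n;
		p.count = old_n - new_n;
	}
	return p;
}
```

Going from 4 to 8 queues yields first = 4, count = 4; going from 8 to 2 yields first = 2, count = 6. Queues below `first` are left alone in both cases.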
This patch adds the single-queue Linux-side lifecycle helpers the resize
path calls for each new or removed index:
  ibmveth_drain_rx_queue()
  ibmveth_alloc_single_rx_queue()
  ibmveth_free_single_rx_queue()
  ibmveth_setup_single_rx_interrupt()
  ibmveth_cleanup_single_rx_interrupt()
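How the resize path sequences these helpers is left to the next patch and is not shown here. One plausible shape for the scale-up side, sketched in user space under that assumption, is to allocate each new index in turn and, on failure, free only the indices added during this call. alloc_queue(), free_queue(), grow_queues() and the fail_alloc_at hook are hypothetical stand-ins, not driver functions.

```c
#include <assert.h>
#include <errno.h>

#define MAX_Q 16

/* Stand-ins for the real helpers; they only track state in user space. */
static int q_allocated[MAX_Q];
static int fail_alloc_at = -1;	/* test hook: index whose allocation fails */

static int alloc_queue(int idx)
{
	if (idx == fail_alloc_at)
		return -ENOMEM;
	q_allocated[idx] = 1;
	return 0;
}

static void free_queue(int idx)
{
	q_allocated[idx] = 0;
}

/* Add queues [old_n, new_n); on failure, undo only what this call added. */
static int grow_queues(int old_n, int new_n)
{
	int i, rc;

	assert(old_n >= 1 && old_n <= new_n && new_n <= MAX_Q);

	for (i = old_n; i < new_n; i++) {
		rc = alloc_queue(i);
		if (rc)
			goto unwind;
	}
	return 0;

unwind:
	/* i is the index that failed; free the ones before it, newest first. */
	while (--i >= old_n)
		free_queue(i);
	return rc;
}
```

The point of the sketch is that pre-existing queues (indices below old_n) are never touched on the error path, which matches the stated goal of leaving surviving queues intact.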
Scale-up copies pool geometry from queue 0 and uses
ibmveth_alloc_queue_buffer_pools() so only active pools are allocated
for the new queue index.
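The geometry copy can be modelled outside the kernel as follows. This is a simplified sketch: struct pool, NUM_POOLS and clone_pool_geometry() are hypothetical, and calloc() stands in for whatever ibmveth_alloc_queue_buffer_pools() actually allocates. It only shows the idea that every pool inherits size, buff_size, threshold and active from queue 0, while memory is allocated for active pools only.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NUM_POOLS 5	/* hypothetical stand-in for IBMVETH_NUM_BUFF_POOLS */

struct pool {
	unsigned int size;	/* number of buffers in the pool */
	unsigned int buff_size;	/* bytes per buffer */
	unsigned int threshold;	/* replenish threshold */
	int active;
	void **bufs;		/* allocated only for active pools */
};

/* Copy geometry from src (queue 0) and allocate storage for active pools. */
static int clone_pool_geometry(struct pool *dst, const struct pool *src)
{
	int i;

	for (i = 0; i < NUM_POOLS; i++) {
		dst[i].size = src[i].size;
		dst[i].buff_size = src[i].buff_size;
		dst[i].threshold = src[i].threshold;
		dst[i].active = src[i].active;
		dst[i].bufs = NULL;

		if (!dst[i].active)
			continue;

		/* An active pool with no buffers would be a caller bug. */
		assert(dst[i].size > 0);
		dst[i].bufs = calloc(dst[i].size, sizeof(*dst[i].bufs));
		if (!dst[i].bufs)
			goto err;
	}
	return 0;

err:
	/* Release what was allocated before the failure. */
	while (--i >= 0) {
		free(dst[i].bufs);
		dst[i].bufs = NULL;
	}
	return -ENOMEM;
}
```

Inactive pools still carry the copied geometry, so they could be activated later with the same sizes as queue 0 without another copy.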
No user-visible behavior yet: helpers are added but not called until
the next patch implements ibmveth_resize_rx_queues_incremental().
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmveth.c | 223 +++++++++++++++++++++++++++++
1 file changed, 223 insertions(+)
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index ecc472ee8f71..cd0acd1715da 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -589,6 +589,54 @@ ibmveth_cleanup_rx_interrupts(struct ibmveth_adapter *adapter)
adapter->queue_irq[0] = 0;
}
+/**
+ * ibmveth_setup_single_rx_interrupt - Set up interrupt for a single RX queue
+ * @adapter: ibmveth adapter structure
+ * @queue_idx: Queue index to set up
+ *
+ * Registers the IRQ handler for one queue. Used during incremental
+ * scale-up when adding new RX queues; the caller enables NAPI via
+ * napi_enable() after ibmveth_enable_irq().
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+static int
+ibmveth_setup_single_rx_interrupt(struct ibmveth_adapter *adapter,
+				  int queue_idx)
+{
+	struct net_device *netdev = adapter->netdev;
+	int rc;
+
+	rc = request_irq(adapter->queue_irq[queue_idx], ibmveth_interrupt,
+			 0, netdev->name, &adapter->napi[queue_idx]);
+	if (rc) {
+		netdev_err(netdev, "request_irq() failed for queue %d: %d\n",
+			   queue_idx, rc);
+		return rc;
+	}
+
+	netdev_dbg(netdev, "Setup IRQ %d for queue %d\n",
+		   adapter->queue_irq[queue_idx], queue_idx);
+	return 0;
+}
+
+/**
+ * ibmveth_cleanup_single_rx_interrupt - Clean up interrupt for a single RX queue
+ * @adapter: ibmveth adapter structure
+ * @queue_idx: Queue index to clean up
+ *
+ * Frees the IRQ handler for one queue. Used during incremental scale-down.
+ */
+static void
+ibmveth_cleanup_single_rx_interrupt(struct ibmveth_adapter *adapter,
+				    int queue_idx)
+{
+	if (adapter->queue_irq[queue_idx]) {
+		free_irq(adapter->queue_irq[queue_idx], &adapter->napi[queue_idx]);
+		netdev_dbg(adapter->netdev, "Freed IRQ for queue %d\n", queue_idx);
+	}
+}
+
/* setup the initial settings for a buffer pool */
static void ibmveth_init_buffer_pool(struct ibmveth_buff_pool *pool,
u32 pool_index, u32 pool_size,
@@ -1080,6 +1128,138 @@ static void ibmveth_free_buffer_pools(struct ibmveth_adapter *adapter)
adapter->num_rx_queues);
}
+/**
+ * ibmveth_alloc_single_rx_queue - Allocate resources for a single RX queue
+ * @adapter: ibmveth adapter structure
+ * @queue_idx: Queue index to allocate
+ * @rxq_entries: Number of RX queue entries
+ *
+ * Allocates buffer list, RX queue, and per-queue buffer pools for one queue.
+ * Used during incremental scale-up without affecting existing queues.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+static int
+ibmveth_alloc_single_rx_queue(struct ibmveth_adapter *adapter, int queue_idx,
+			      int rxq_entries)
+{
+	struct device *dev = &adapter->vdev->dev;
+	struct net_device *netdev = adapter->netdev;
+	int i, rc = -ENOMEM;
+
+	adapter->buffer_list_addr[queue_idx] = (void *)get_zeroed_page(GFP_KERNEL);
+	if (!adapter->buffer_list_addr[queue_idx]) {
+		netdev_err(netdev, "unable to allocate buffer list for queue %d\n",
+			   queue_idx);
+		return -ENOMEM;
+	}
+
+	adapter->rx_queue[queue_idx].queue_len =
+		sizeof(struct ibmveth_rx_q_entry) * rxq_entries;
+	adapter->rx_queue[queue_idx].queue_addr =
+		dma_alloc_coherent(dev, adapter->rx_queue[queue_idx].queue_len,
+				   &adapter->rx_queue[queue_idx].queue_dma,
+				   GFP_KERNEL);
+	if (!adapter->rx_queue[queue_idx].queue_addr) {
+		netdev_err(netdev, "unable to allocate RX queue for queue %d\n",
+			   queue_idx);
+		goto out_free_buflist;
+	}
+
+	adapter->buffer_list_dma[queue_idx] =
+		dma_map_single(dev, adapter->buffer_list_addr[queue_idx],
+			       4096, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, adapter->buffer_list_dma[queue_idx])) {
+		netdev_err(netdev, "unable to map buffer list for queue %d\n",
+			   queue_idx);
+		goto out_free_rxq;
+	}
+
+	for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++) {
+		adapter->rx_buff_pool[queue_idx][i].size =
+			adapter->rx_buff_pool[0][i].size;
+		adapter->rx_buff_pool[queue_idx][i].buff_size =
+			adapter->rx_buff_pool[0][i].buff_size;
+		adapter->rx_buff_pool[queue_idx][i].threshold =
+			adapter->rx_buff_pool[0][i].threshold;
+		adapter->rx_buff_pool[queue_idx][i].active =
+			adapter->rx_buff_pool[0][i].active;
+	}
+
+	rc = ibmveth_alloc_queue_buffer_pools(adapter, queue_idx);
+	if (rc) {
+		netdev_err(netdev,
+			   "Failed to allocate buffer pools for queue %d\n",
+			   queue_idx);
+		goto out_unmap_buflist;
+	}
+
+	adapter->rx_queue[queue_idx].index = 0;
+	adapter->rx_queue[queue_idx].num_slots = rxq_entries;
+	adapter->rx_queue[queue_idx].toggle = 1;
+	spin_lock_init(&adapter->rx_queue[queue_idx].replenish_lock);
+
+	netdev_dbg(netdev,
+		   "Allocated queue %d: buffer_list @ %p (DMA: 0x%llx), rx_queue @ %p (DMA: 0x%llx), %d entries\n",
+		   queue_idx, adapter->buffer_list_addr[queue_idx],
+		   (unsigned long long)adapter->buffer_list_dma[queue_idx],
+		   adapter->rx_queue[queue_idx].queue_addr,
+		   (unsigned long long)adapter->rx_queue[queue_idx].queue_dma,
+		   rxq_entries);
+
+	return 0;
+
+out_unmap_buflist:
+	dma_unmap_single(dev, adapter->buffer_list_dma[queue_idx],
+			 4096, DMA_BIDIRECTIONAL);
+	adapter->buffer_list_dma[queue_idx] = 0;
+out_free_rxq:
+	dma_free_coherent(dev, adapter->rx_queue[queue_idx].queue_len,
+			  adapter->rx_queue[queue_idx].queue_addr,
+			  adapter->rx_queue[queue_idx].queue_dma);
+	adapter->rx_queue[queue_idx].queue_addr = NULL;
+out_free_buflist:
+	free_page((unsigned long)adapter->buffer_list_addr[queue_idx]);
+	adapter->buffer_list_addr[queue_idx] = NULL;
+	return rc;
+}
+
+/**
+ * ibmveth_free_single_rx_queue - Free resources for a single RX queue
+ * @adapter: ibmveth adapter structure
+ * @queue_idx: Queue index to free
+ *
+ * Frees buffer list, RX queue, and per-queue buffer pools for one queue.
+ * Used during incremental scale-down without affecting remaining queues.
+ */
+static void
+ibmveth_free_single_rx_queue(struct ibmveth_adapter *adapter, int queue_idx)
+{
+	struct device *dev = &adapter->vdev->dev;
+
+	ibmveth_free_queue_buffer_pools(adapter, queue_idx);
+
+	if (adapter->buffer_list_dma[queue_idx]) {
+		dma_unmap_single(dev, adapter->buffer_list_dma[queue_idx],
+				 4096, DMA_BIDIRECTIONAL);
+		adapter->buffer_list_dma[queue_idx] = 0;
+	}
+
+	if (adapter->rx_queue[queue_idx].queue_addr) {
+		dma_free_coherent(dev, adapter->rx_queue[queue_idx].queue_len,
+				  adapter->rx_queue[queue_idx].queue_addr,
+				  adapter->rx_queue[queue_idx].queue_dma);
+		adapter->rx_queue[queue_idx].queue_addr = NULL;
+	}
+
+	if (adapter->buffer_list_addr[queue_idx]) {
+		free_page((unsigned long)adapter->buffer_list_addr[queue_idx]);
+		adapter->buffer_list_addr[queue_idx] = NULL;
+	}
+
+	netdev_dbg(adapter->netdev, "Freed queue %d resources\n", queue_idx);
+}
+
/**
* ibmveth_remove_buffer_from_pool - remove a buffer from a pool
* @adapter: adapter instance
@@ -1192,6 +1372,49 @@ static int ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter,
return 0;
}
+/**
+ * ibmveth_drain_rx_queue - Drain pending buffers from an RX queue
+ * @adapter: ibmveth adapter structure
+ * @queue_index: Queue index to drain
+ *
+ * Recycles all pending buffers back to the per-queue buffer pools.
+ * Must be called with NAPI disabled for this queue.
+ *
+ * Return: Number of buffers drained
+ */
+static int
+ibmveth_drain_rx_queue(struct ibmveth_adapter *adapter, int queue_index)
+{
+	struct net_device *netdev = adapter->netdev;
+	int drained = 0;
+	int limit = adapter->rx_queue[queue_index].num_slots;
+	int rc;
+
+	netdev_dbg(netdev, "Draining RX queue %d (limit: %d slots)\n",
+		   queue_index, limit);
+
+	while (drained < limit &&
+	       ibmveth_rxq_pending_buffer(adapter, queue_index)) {
+		rc = ibmveth_rxq_harvest_buffer(adapter, queue_index, true);
+		if (rc) {
+			netdev_err(netdev,
+				   "Failed to harvest buffer from queue %d during drain: %d\n",
+				   queue_index, rc);
+			break;
+		}
+		drained++;
+	}
+
+	if (drained > 0)
+		netdev_dbg(netdev, "Drained %d buffer(s) from RX queue %d\n",
+			   drained, queue_index);
+	else
+		netdev_dbg(netdev, "No buffers to drain from RX queue %d\n",
+			   queue_index);
+
+	return drained;
+}
+
static void ibmveth_free_tx_ltb(struct ibmveth_adapter *adapter, int idx)
{
dma_unmap_single(&adapter->vdev->dev, adapter->tx_ltb_dma[idx],
--
2.39.3 (Apple Git-146)
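For readers following ibmveth_drain_rx_queue() in the patch above, the bounded drain loop can be modelled in user space as below. This is a sketch under assumptions: it supposes that a slot counts as pending when its toggle flag matches the queue's current toggle value and that the toggle flips when the index wraps, which is consistent with the index = 0, toggle = 1 initialisation in the patch but is not spelled out by it. struct rxq, rxq_pending(), rxq_advance() and rxq_drain() are hypothetical names.

```c
#include <assert.h>

#define SLOTS 4

struct rxq {
	unsigned char toggle_bits[SLOTS];	/* flag written by the producer side */
	int index;				/* next slot to examine */
	int num_slots;
	int toggle;				/* flag value that marks a slot pending */
};

/* A slot is pending when its flag matches the consumer's current toggle. */
static int rxq_pending(const struct rxq *q)
{
	return q->toggle_bits[q->index] == q->toggle;
}

/* Consume one slot; flip the expected toggle when the ring wraps. */
static void rxq_advance(struct rxq *q)
{
	if (++q->index == q->num_slots) {
		q->index = 0;
		q->toggle = !q->toggle;
	}
}

/*
 * Drain at most num_slots entries. The bound keeps the loop finite even
 * if the producer were to keep refilling slots while the queue is drained.
 */
static int rxq_drain(struct rxq *q)
{
	int drained = 0;

	assert(q->num_slots > 0 && q->num_slots <= SLOTS);

	while (drained < q->num_slots && rxq_pending(q)) {
		rxq_advance(q);
		drained++;
	}
	return drained;
}
```

In the driver the per-slot work is done by ibmveth_rxq_harvest_buffer(), which can also fail and end the loop early; the model omits that path and only shows the pending test and the slot bound.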