* [PATCH net-next v2] ibmveth: Add multi buffers rx replenishment hcall support
@ 2025-07-19 9:13 Mingming Cao
2025-07-20 10:45 ` Simon Horman
2025-07-22 13:20 ` patchwork-bot+netdevbpf
From: Mingming Cao @ 2025-07-19 9:13 UTC (permalink / raw)
To: netdev
Cc: horms, bjking1, haren, ricklind, davemarq, mmc, maddy, mpe,
npiggin, christophe.leroy, andrew+netdev, davem, kuba, edumazet,
pabeni, linuxppc-dev
This patch enables batched RX buffer replenishment in ibmveth by
using the new firmware-supported h_add_logical_lan_buffers() hcall
to submit up to 8 RX buffers in a single call, instead of repeatedly
calling the single-buffer h_add_logical_lan_buffer() hcall.
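For reference, the two entry points compare as follows (a condensed
sketch of the ibmveth.h hunk below; the single-buffer variant is the
existing macro, the batched variant is the new plpar_hcall9() wrapper):

    /* Existing: one descriptor per hypervisor call */
    lpar_rc = h_add_logical_lan_buffer(vdev->unit_address, desc.desc);

    /* New: up to IBMVETH_MAX_RX_PER_HCALL (8) descriptors per call */
    lpar_rc = h_add_logical_lan_buffers(vdev->unit_address,
                                        descs[0].desc, descs[1].desc,
                                        descs[2].desc, descs[3].desc,
                                        descs[4].desc, descs[5].desc,
                                        descs[6].desc, descs[7].desc);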
During probe, the driver queries the ILLAN attributes to detect the
IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT bit. If the attribute is present,
rx_buffers_per_hcall is set to 8, enabling batched replenishment.
Otherwise, it defaults to 1, preserving the original upstream behavior
with no change in code flow on unsupported systems.
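In code form, the probe-time selection amounts to the following
(condensed from the ibmveth_probe() hunk below; ret and ret_attr come
from the driver's existing h_illan_attributes() query):

    if (ret == H_SUCCESS &&
        (ret_attr & IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT))
            adapter->rx_buffers_per_hcall = IBMVETH_MAX_RX_PER_HCALL;
    else
            adapter->rx_buffers_per_hcall = 1;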
The core RX replenish logic remains the same, but when batching is
enabled the driver aggregates up to 8 fully prepared descriptors into
a single h_add_logical_lan_buffers() hypercall. If any allocation or
DMA mapping fails while preparing a batch, only the successfully
prepared buffers are submitted and the rest are deferred to the next
replenish cycle, as sketched below.
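Sketched as pseudo-C, with hypothetical prepare_batch() and
submit_batch() helpers standing in for the inline code in the patch:

    while (remaining > 0) {
            /* stops early on alloc or DMA-mapping failure */
            filled = prepare_batch(pool, descs, min(remaining, batch));
            if (!filled)
                    break;
            /* submit only the successfully prepared descriptors */
            if (submit_batch(descs, filled) != H_SUCCESS)
                    break;
            remaining -= filled;
    }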
If at runtime the firmware stops accepting the batched hcall, e.g.
after a Live Partition Migration (LPM) to a host that does not
support h_add_logical_lan_buffers(), the hypercall returns H_FUNCTION.
In that case, the driver transparently disables batching, resets
rx_buffers_per_hcall to 1, and falls back to the single-buffer hcall
in subsequent replenish cycles, which also resubmits the buffers from
the failed batch.
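The runtime switch is one-way and cheap (condensed from the failure
path in the replenish code below):

    if (batch > 1 && lpar_rc == H_FUNCTION)
            adapter->rx_buffers_per_hcall = 1;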
Tests were done on systems with firmware that supports the new
h_add_logical_lan_buffers hcall and on firmware that does not.
On supported firmware, this significantly reduces hypercall overhead
by amortizing one hypercall across multiple buffers. SAR measurements
showed about a 15% improvement in packet processing rate under
moderate RX load, with heavier traffic seeing gains of more than 30%.
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Brian King <bjking1@linux.ibm.com>
Reviewed-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
---
arch/powerpc/include/asm/hvcall.h | 1 +
drivers/net/ethernet/ibm/ibmveth.c | 220 ++++++++++++++++++++---------
drivers/net/ethernet/ibm/ibmveth.h | 21 +++
3 files changed, 174 insertions(+), 68 deletions(-)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 6df6dbbe1e7c..ea6c8dc400d2 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -270,6 +270,7 @@
#define H_QUERY_INT_STATE 0x1E4
#define H_POLL_PENDING 0x1D8
#define H_ILLAN_ATTRIBUTES 0x244
+#define H_ADD_LOGICAL_LAN_BUFFERS 0x248
#define H_MODIFY_HEA_QP 0x250
#define H_QUERY_HEA_QP 0x254
#define H_QUERY_HEA 0x258
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 24046fe16634..6f0821f1e798 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -211,98 +211,169 @@ static inline void ibmveth_flush_buffer(void *addr, unsigned long length)
static void ibmveth_replenish_buffer_pool(struct ibmveth_adapter *adapter,
struct ibmveth_buff_pool *pool)
{
- u32 i;
- u32 count = pool->size - atomic_read(&pool->available);
- u32 buffers_added = 0;
- struct sk_buff *skb;
- unsigned int free_index, index;
- u64 correlator;
+ union ibmveth_buf_desc descs[IBMVETH_MAX_RX_PER_HCALL] = {0};
+ u32 remaining = pool->size - atomic_read(&pool->available);
+ u64 correlators[IBMVETH_MAX_RX_PER_HCALL] = {0};
unsigned long lpar_rc;
+ u32 buffers_added = 0;
+ u32 i, filled, batch;
+ struct vio_dev *vdev;
dma_addr_t dma_addr;
+ struct device *dev;
+ u32 index;
+
+ vdev = adapter->vdev;
+ dev = &vdev->dev;
mb();
- for (i = 0; i < count; ++i) {
- union ibmveth_buf_desc desc;
+ batch = adapter->rx_buffers_per_hcall;
- free_index = pool->consumer_index;
- index = pool->free_map[free_index];
- skb = NULL;
+ while (remaining > 0) {
+ unsigned int free_index = pool->consumer_index;
- if (WARN_ON(index == IBM_VETH_INVALID_MAP)) {
- schedule_work(&adapter->work);
- goto bad_index_failure;
- }
+ /* Fill a batch of descriptors */
+ for (filled = 0; filled < min(remaining, batch); filled++) {
+ index = pool->free_map[free_index];
+ if (WARN_ON(index == IBM_VETH_INVALID_MAP)) {
+ adapter->replenish_add_buff_failure++;
+ netdev_info(adapter->netdev,
+ "Invalid map index %u, reset\n",
+ index);
+ schedule_work(&adapter->work);
+ break;
+ }
+
+ if (!pool->skbuff[index]) {
+ struct sk_buff *skb = NULL;
- /* are we allocating a new buffer or recycling an old one */
- if (pool->skbuff[index])
- goto reuse;
+ skb = netdev_alloc_skb(adapter->netdev,
+ pool->buff_size);
+ if (!skb) {
+ adapter->replenish_no_mem++;
+ adapter->replenish_add_buff_failure++;
+ break;
+ }
+
+ dma_addr = dma_map_single(dev, skb->data,
+ pool->buff_size,
+ DMA_FROM_DEVICE);
+ if (dma_mapping_error(dev, dma_addr)) {
+ dev_kfree_skb_any(skb);
+ adapter->replenish_add_buff_failure++;
+ break;
+ }
- skb = netdev_alloc_skb(adapter->netdev, pool->buff_size);
+ pool->dma_addr[index] = dma_addr;
+ pool->skbuff[index] = skb;
+ } else {
+ /* re-use case */
+ dma_addr = pool->dma_addr[index];
+ }
- if (!skb) {
- netdev_dbg(adapter->netdev,
- "replenish: unable to allocate skb\n");
- adapter->replenish_no_mem++;
- break;
- }
+ if (rx_flush) {
+ unsigned int len;
- dma_addr = dma_map_single(&adapter->vdev->dev, skb->data,
- pool->buff_size, DMA_FROM_DEVICE);
+ len = adapter->netdev->mtu + IBMVETH_BUFF_OH;
+ len = min(pool->buff_size, len);
+ ibmveth_flush_buffer(pool->skbuff[index]->data,
+ len);
+ }
- if (dma_mapping_error(&adapter->vdev->dev, dma_addr))
- goto failure;
+ descs[filled].fields.flags_len = IBMVETH_BUF_VALID |
+ pool->buff_size;
+ descs[filled].fields.address = dma_addr;
- pool->dma_addr[index] = dma_addr;
- pool->skbuff[index] = skb;
+ correlators[filled] = ((u64)pool->index << 32) | index;
+ *(u64 *)pool->skbuff[index]->data = correlators[filled];
- if (rx_flush) {
- unsigned int len = min(pool->buff_size,
- adapter->netdev->mtu +
- IBMVETH_BUFF_OH);
- ibmveth_flush_buffer(skb->data, len);
+ free_index++;
+ if (free_index >= pool->size)
+ free_index = 0;
}
-reuse:
- dma_addr = pool->dma_addr[index];
- desc.fields.flags_len = IBMVETH_BUF_VALID | pool->buff_size;
- desc.fields.address = dma_addr;
-
- correlator = ((u64)pool->index << 32) | index;
- *(u64 *)pool->skbuff[index]->data = correlator;
- lpar_rc = h_add_logical_lan_buffer(adapter->vdev->unit_address,
- desc.desc);
+ if (!filled)
+ break;
+ /* Single-buffer case */
+ if (filled == 1)
+ lpar_rc = h_add_logical_lan_buffer(vdev->unit_address,
+ descs[0].desc);
+ else
+ /* Multi-buffer hcall */
+ lpar_rc = h_add_logical_lan_buffers(vdev->unit_address,
+ descs[0].desc,
+ descs[1].desc,
+ descs[2].desc,
+ descs[3].desc,
+ descs[4].desc,
+ descs[5].desc,
+ descs[6].desc,
+ descs[7].desc);
if (lpar_rc != H_SUCCESS) {
- netdev_warn(adapter->netdev,
- "%sadd_logical_lan failed %lu\n",
- skb ? "" : "When recycling: ", lpar_rc);
- goto failure;
+ dev_warn_ratelimited(dev,
+ "RX h_add_logical_lan failed: filled=%u, rc=%lu, batch=%u\n",
+ filled, lpar_rc, batch);
+ goto hcall_failure;
}
- pool->free_map[free_index] = IBM_VETH_INVALID_MAP;
- pool->consumer_index++;
- if (pool->consumer_index >= pool->size)
- pool->consumer_index = 0;
+ /* Only update pool state after hcall succeeds */
+ for (i = 0; i < filled; i++) {
+ free_index = pool->consumer_index;
+ pool->free_map[free_index] = IBM_VETH_INVALID_MAP;
- buffers_added++;
- adapter->replenish_add_buff_success++;
- }
+ pool->consumer_index++;
+ if (pool->consumer_index >= pool->size)
+ pool->consumer_index = 0;
+ }
- mb();
- atomic_add(buffers_added, &(pool->available));
- return;
+ buffers_added += filled;
+ adapter->replenish_add_buff_success += filled;
+ remaining -= filled;
-failure:
+ memset(&descs, 0, sizeof(descs));
+ memset(&correlators, 0, sizeof(correlators));
+ continue;
- if (dma_addr && !dma_mapping_error(&adapter->vdev->dev, dma_addr))
- dma_unmap_single(&adapter->vdev->dev,
- pool->dma_addr[index], pool->buff_size,
- DMA_FROM_DEVICE);
- dev_kfree_skb_any(pool->skbuff[index]);
- pool->skbuff[index] = NULL;
-bad_index_failure:
- adapter->replenish_add_buff_failure++;
+hcall_failure:
+ for (i = 0; i < filled; i++) {
+ index = correlators[i] & 0xffffffffUL;
+ dma_addr = pool->dma_addr[index];
+
+ if (pool->skbuff[index]) {
+ if (dma_addr &&
+ !dma_mapping_error(dev, dma_addr))
+ dma_unmap_single(dev, dma_addr,
+ pool->buff_size,
+ DMA_FROM_DEVICE);
+
+ dev_kfree_skb_any(pool->skbuff[index]);
+ pool->skbuff[index] = NULL;
+ }
+ }
+ adapter->replenish_add_buff_failure += filled;
+
+ /*
+ * If the multi rx buffers hcall is no longer supported by the
+ * firmware, e.g. after a Live Partition Migration (LPM)
+ */
+ if (batch > 1 && lpar_rc == H_FUNCTION) {
+ /*
+ * Instead of retrying these buffers individually, just
+ * set the max rx buffers per hcall to 1; the buffers will
+ * be replenished the next time
+ * ibmveth_replenish_buffer_pool() is called, using the
+ * single-buffer hcall.
+ */
+ netdev_info(adapter->netdev,
+ "RX Multi buffers not supported by FW, rc=%lu\n",
+ lpar_rc);
+ adapter->rx_buffers_per_hcall = 1;
+ netdev_info(adapter->netdev,
+ "Next rx replesh will fall back to single-buffer hcall\n");
+ }
+ break;
+ }
mb();
atomic_add(buffers_added, &(pool->available));
@@ -1783,6 +1854,19 @@ static int ibmveth_probe(struct vio_dev *dev, const struct vio_device_id *id)
netdev->features |= NETIF_F_FRAGLIST;
}
+ if (ret == H_SUCCESS &&
+ (ret_attr & IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT)) {
+ adapter->rx_buffers_per_hcall = IBMVETH_MAX_RX_PER_HCALL;
+ netdev_dbg(netdev,
+ "RX Multi-buffer hcall supported by FW, batch set to %u\n",
+ adapter->rx_buffers_per_hcall);
+ } else {
+ adapter->rx_buffers_per_hcall = 1;
+ netdev_dbg(netdev,
+ "RX Single-buffer hcall mode, batch set to %u\n",
+ adapter->rx_buffers_per_hcall);
+ }
+
netdev->min_mtu = IBMVETH_MIN_MTU;
netdev->max_mtu = ETH_MAX_MTU - IBMVETH_BUFF_OH;
diff --git a/drivers/net/ethernet/ibm/ibmveth.h b/drivers/net/ethernet/ibm/ibmveth.h
index b0a2460ec9f9..068f99df133e 100644
--- a/drivers/net/ethernet/ibm/ibmveth.h
+++ b/drivers/net/ethernet/ibm/ibmveth.h
@@ -28,6 +28,7 @@
#define IbmVethMcastRemoveFilter 0x2UL
#define IbmVethMcastClearFilterTable 0x3UL
+#define IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT 0x0000000000040000UL
#define IBMVETH_ILLAN_LRG_SR_ENABLED 0x0000000000010000UL
#define IBMVETH_ILLAN_LRG_SND_SUPPORT 0x0000000000008000UL
#define IBMVETH_ILLAN_PADDED_PKT_CSUM 0x0000000000002000UL
@@ -46,6 +47,24 @@
#define h_add_logical_lan_buffer(ua, buf) \
plpar_hcall_norets(H_ADD_LOGICAL_LAN_BUFFER, ua, buf)
+static inline long h_add_logical_lan_buffers(unsigned long unit_address,
+ unsigned long desc1,
+ unsigned long desc2,
+ unsigned long desc3,
+ unsigned long desc4,
+ unsigned long desc5,
+ unsigned long desc6,
+ unsigned long desc7,
+ unsigned long desc8)
+{
+ unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
+
+ return plpar_hcall9(H_ADD_LOGICAL_LAN_BUFFERS,
+ retbuf, unit_address,
+ desc1, desc2, desc3, desc4,
+ desc5, desc6, desc7, desc8);
+}
+
/* FW allows us to send 6 descriptors but we only use one so mark
* the other 5 as unused (0)
*/
@@ -101,6 +120,7 @@ static inline long h_illan_attributes(unsigned long unit_address,
#define IBMVETH_MAX_TX_BUF_SIZE (1024 * 64)
#define IBMVETH_MAX_QUEUES 16U
#define IBMVETH_DEFAULT_QUEUES 8U
+#define IBMVETH_MAX_RX_PER_HCALL 8U
static int pool_size[] = { 512, 1024 * 2, 1024 * 16, 1024 * 32, 1024 * 64 };
static int pool_count[] = { 256, 512, 256, 256, 256 };
@@ -151,6 +171,7 @@ struct ibmveth_adapter {
int rx_csum;
int large_send;
bool is_active_trunk;
+ unsigned int rx_buffers_per_hcall;
u64 fw_ipv6_csum_support;
u64 fw_ipv4_csum_support;
--
2.39.3 (Apple Git-146)
* Re: [PATCH net-next v2] ibmveth: Add multi buffers rx replenishment hcall support
2025-07-19 9:13 [PATCH net-next v2] ibmveth: Add multi buffers rx replenishment hcall support Mingming Cao
@ 2025-07-20 10:45 ` Simon Horman
2025-07-22 13:20 ` patchwork-bot+netdevbpf
From: Simon Horman @ 2025-07-20 10:45 UTC (permalink / raw)
To: Mingming Cao
Cc: netdev, bjking1, haren, ricklind, davemarq, maddy, mpe, npiggin,
christophe.leroy, andrew+netdev, davem, kuba, edumazet, pabeni,
linuxppc-dev
On Sat, Jul 19, 2025 at 05:13:56AM -0400, Mingming Cao wrote:
> This patch enables batched RX buffer replenishment in ibmveth by
> using the new firmware-supported h_add_logical_lan_buffers() hcall
> to submit up to 8 RX buffers in a single call, instead of repeatedly
> calling the single-buffer h_add_logical_lan_buffer() hcall.
>
> During probe, the driver queries the ILLAN attributes to detect the
> IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT bit. If the attribute is present,
> rx_buffers_per_hcall is set to 8, enabling batched replenishment.
> Otherwise, it defaults to 1, preserving the original upstream behavior
> with no change in code flow on unsupported systems.
>
> The core RX replenish logic remains the same, but when batching is
> enabled the driver aggregates up to 8 fully prepared descriptors into
> a single h_add_logical_lan_buffers() hypercall. If any allocation or
> DMA mapping fails while preparing a batch, only the successfully
> prepared buffers are submitted and the rest are deferred to the next
> replenish cycle.
>
> If at runtime the firmware stops accepting the batched hcall, e.g.
> after a Live Partition Migration (LPM) to a host that does not
> support h_add_logical_lan_buffers(), the hypercall returns H_FUNCTION.
> In that case, the driver transparently disables batching, resets
> rx_buffers_per_hcall to 1, and falls back to the single-buffer hcall
> in subsequent replenish cycles, which also resubmits the buffers from
> the failed batch.
>
> Tests were done on systems with firmware that supports the new
> h_add_logical_lan_buffers hcall and on firmware that does not.
>
> On supported firmware, this significantly reduces hypercall overhead
> by amortizing one hypercall across multiple buffers. SAR measurements
> showed about a 15% improvement in packet processing rate under
> moderate RX load, with heavier traffic seeing gains of more than 30%.
>
> Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
> Reviewed-by: Brian King <bjking1@linux.ibm.com>
> Reviewed-by: Haren Myneni <haren@linux.ibm.com>
> Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
Thanks for the update.
Reviewed-by: Simon Horman <horms@kernel.org>
...
* Re: [PATCH net-next v2] ibmveth: Add multi buffers rx replenishment hcall support
2025-07-19 9:13 [PATCH net-next v2] ibmveth: Add multi buffers rx replenishment hcall support Mingming Cao
2025-07-20 10:45 ` Simon Horman
@ 2025-07-22 13:20 ` patchwork-bot+netdevbpf
From: patchwork-bot+netdevbpf @ 2025-07-22 13:20 UTC (permalink / raw)
To: Mingming Cao
Cc: netdev, horms, bjking1, haren, ricklind, davemarq, maddy, mpe,
npiggin, christophe.leroy, andrew+netdev, davem, kuba, edumazet,
pabeni, linuxppc-dev
Hello:
This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Sat, 19 Jul 2025 05:13:56 -0400 you wrote:
> This patch enables batched RX buffer replenishment in ibmveth by
> using the new firmware-supported h_add_logical_lan_buffers() hcall
> to submit up to 8 RX buffers in a single call, instead of repeatedly
> calling the single-buffer h_add_logical_lan_buffer() hcall.
>
> During probe, the driver queries the ILLAN attributes to detect the
> IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT bit. If the attribute is present,
> rx_buffers_per_hcall is set to 8, enabling batched replenishment.
> Otherwise, it defaults to 1, preserving the original upstream behavior
> with no change in code flow on unsupported systems.
>
> [...]
Here is the summary with links:
- [net-next,v2] ibmveth: Add multi buffers rx replenishment hcall support
https://git.kernel.org/netdev/net-next/c/2094200b5f77
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html