* [PATCH net-next 0/2] ibmveth RR performance
@ 2024-08-01 21:12 Nick Child
From: Nick Child @ 2024-08-01 21:12 UTC
To: netdev; +Cc: bjking1, haren, ricklind, Nick Child
Hello!
This patchset aims to increase the ibmveth driver's small-packet
request-response rate.
These 2 patches address:
1. NAPI rescheduling technique
2. Driver-side processing of small packets
Testing over several netperf TCP_RR connections, we saw a
30% increase in transactions per second. No regressions
were observed in other workloads.
Please let me know if there are any questions/concerns.
Thanks,
Nick Child
Nick Child (2):
ibmveth: Optimize poll rescheduling process
ibmveth: Recycle buffers during replenish phase
drivers/net/ethernet/ibm/ibmveth.c | 172 +++++++++++++----------------
1 file changed, 75 insertions(+), 97 deletions(-)
--
2.43.0
* [PATCH net-next 1/2] ibmveth: Optimize poll rescheduling process
From: Nick Child @ 2024-08-01 21:12 UTC
To: netdev; +Cc: bjking1, haren, ricklind, Nick Child
When the ibmveth driver processes fewer frames than its budget, it
must call napi_complete_done() to release the NAPI instance. This
function returns false if the driver should avoid rearming interrupts.
Previously, the driver ignored the return code of
napi_complete_done(). As a result, there were unnecessary calls to
enable the veth IRQ.
Therefore, use the return code of napi_complete_done() to determine
whether an IRQ rearm is necessary.
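(For reference, the generic NAPI completion idiom being adopted here
looks roughly like the sketch below; xx_poll(), xx_clean_rx() and
xx_irq_enable() are illustrative placeholders, not the driver's real
helpers.)

	static int xx_poll(struct napi_struct *napi, int budget)
	{
		int done = xx_clean_rx(napi, budget);

		/* Rearm the device interrupt only if NAPI agrees we are
		 * done; napi_complete_done() returns false when polling
		 * should continue (e.g. under busy polling).
		 */
		if (done < budget && napi_complete_done(napi, done))
			xx_irq_enable(napi);

		return done;
	}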
Additionally, in the event that new data is received immediately after
rearming interrupts, rather than just rescheduling NAPI, also jump
back to the poll processing loop, since we are already in the poll
function (and know that we did not expend the entire budget).
This slight tweak results in a 15% increase in TCP_RR transaction rate
(320k to 370k txns). The ftrace data supports this:
PREV: ibmveth_poll = 8818014.0 us / 182802.0 hits = AVG 48.24 us
NEW: ibmveth_poll = 8082398.0 us / 191413.0 hits = AVG 42.22 us
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmveth.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 4c9d9badd698..e6eb594f0751 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1337,6 +1337,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
unsigned long lpar_rc;
u16 mss = 0;
+restart_poll:
while (frames_processed < budget) {
if (!ibmveth_rxq_pending_buffer(adapter))
break;
@@ -1420,24 +1421,25 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
ibmveth_replenish_task(adapter);
- if (frames_processed < budget) {
- napi_complete_done(napi, frames_processed);
+ if (frames_processed == budget)
+ goto out;
- /* We think we are done - reenable interrupts,
- * then check once more to make sure we are done.
- */
- lpar_rc = h_vio_signal(adapter->vdev->unit_address,
- VIO_IRQ_ENABLE);
+ if (!napi_complete_done(napi, frames_processed))
+ goto out;
- BUG_ON(lpar_rc != H_SUCCESS);
+ /* We think we are done - reenable interrupts,
+ * then check once more to make sure we are done.
+ */
+ lpar_rc = h_vio_signal(adapter->vdev->unit_address, VIO_IRQ_ENABLE);
+ BUG_ON(lpar_rc != H_SUCCESS);
- if (ibmveth_rxq_pending_buffer(adapter) &&
- napi_schedule(napi)) {
- lpar_rc = h_vio_signal(adapter->vdev->unit_address,
- VIO_IRQ_DISABLE);
- }
+ if (ibmveth_rxq_pending_buffer(adapter) && napi_schedule(napi)) {
+ lpar_rc = h_vio_signal(adapter->vdev->unit_address,
+ VIO_IRQ_DISABLE);
+ goto restart_poll;
}
+out:
return frames_processed;
}
--
2.43.0
* [PATCH net-next 2/2] ibmveth: Recycle buffers during replenish phase
From: Nick Child @ 2024-08-01 21:12 UTC
To: netdev; +Cc: bjking1, haren, ricklind, Nick Child
When the length of a packet is under the rx_copybreak threshold, the
buffer is copied into a new skb and sent up the stack. This allows the
DMA-mapped memory to be recycled back to FW.
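As a generic illustration of the rx_copybreak pattern (a sketch, not
the driver's exact code), small frames are copied into a fresh skb so
the original DMA buffer can be handed back right away:

	if (length <= rx_copybreak) {
		new_skb = netdev_alloc_skb(netdev, length);
		if (new_skb) {
			skb_copy_to_linear_data(new_skb, skb->data, length);
			/* original mapped buffer is now free to recycle */
			skb = new_skb;
		}
	}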
Previously, the reuse of the DMA space was handled immediately,
meaning that further packet processing had to wait until
h_add_logical_lan finished for that packet.
Therefore, when reusing a buffer, offload the hcall to the replenish
function. As a result, much of the logic shared between the recycle
and replenish functions can be removed.
This change increases TCP_RR packet rate by another 15% (370k to 430k
txns). The ftrace data supports this:
PREV: ibmveth_poll = 8078553.0 us / 190999.0 hits = AVG 42.3 us
NEW: ibmveth_poll = 7632787.0 us / 224060.0 hits = AVG 34.07 us
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
drivers/net/ethernet/ibm/ibmveth.c | 144 ++++++++++++-----------------
1 file changed, 60 insertions(+), 84 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index e6eb594f0751..b619a3ec245b 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -39,7 +39,8 @@
#include "ibmveth.h"
static irqreturn_t ibmveth_interrupt(int irq, void *dev_instance);
-static void ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter);
+static void ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter,
+ bool reuse);
static unsigned long ibmveth_get_desired_dma(struct vio_dev *vdev);
static struct kobj_type ktype_veth_pool;
@@ -226,6 +227,16 @@ static void ibmveth_replenish_buffer_pool(struct ibmveth_adapter *adapter,
for (i = 0; i < count; ++i) {
union ibmveth_buf_desc desc;
+ free_index = pool->consumer_index;
+ index = pool->free_map[free_index];
+ skb = NULL;
+
+ BUG_ON(index == IBM_VETH_INVALID_MAP);
+
+ /* are we allocating a new buffer or recycling an old one */
+ if (pool->skbuff[index])
+ goto reuse;
+
skb = netdev_alloc_skb(adapter->netdev, pool->buff_size);
if (!skb) {
@@ -235,46 +246,46 @@ static void ibmveth_replenish_buffer_pool(struct ibmveth_adapter *adapter,
break;
}
- free_index = pool->consumer_index;
- pool->consumer_index++;
- if (pool->consumer_index >= pool->size)
- pool->consumer_index = 0;
- index = pool->free_map[free_index];
-
- BUG_ON(index == IBM_VETH_INVALID_MAP);
- BUG_ON(pool->skbuff[index] != NULL);
-
dma_addr = dma_map_single(&adapter->vdev->dev, skb->data,
pool->buff_size, DMA_FROM_DEVICE);
if (dma_mapping_error(&adapter->vdev->dev, dma_addr))
goto failure;
- pool->free_map[free_index] = IBM_VETH_INVALID_MAP;
pool->dma_addr[index] = dma_addr;
pool->skbuff[index] = skb;
- correlator = ((u64)pool->index << 32) | index;
- *(u64 *)skb->data = correlator;
-
- desc.fields.flags_len = IBMVETH_BUF_VALID | pool->buff_size;
- desc.fields.address = dma_addr;
-
if (rx_flush) {
unsigned int len = min(pool->buff_size,
- adapter->netdev->mtu +
- IBMVETH_BUFF_OH);
+ adapter->netdev->mtu +
+ IBMVETH_BUFF_OH);
ibmveth_flush_buffer(skb->data, len);
}
+reuse:
+ dma_addr = pool->dma_addr[index];
+ desc.fields.flags_len = IBMVETH_BUF_VALID | pool->buff_size;
+ desc.fields.address = dma_addr;
+
+ correlator = ((u64)pool->index << 32) | index;
+ *(u64 *)pool->skbuff[index]->data = correlator;
+
lpar_rc = h_add_logical_lan_buffer(adapter->vdev->unit_address,
desc.desc);
if (lpar_rc != H_SUCCESS) {
+ netdev_warn(adapter->netdev,
+ "%sadd_logical_lan failed %lu\n",
+ skb ? "" : "When recycling: ", lpar_rc);
goto failure;
- } else {
- buffers_added++;
- adapter->replenish_add_buff_success++;
}
+
+ pool->free_map[free_index] = IBM_VETH_INVALID_MAP;
+ pool->consumer_index++;
+ if (pool->consumer_index >= pool->size)
+ pool->consumer_index = 0;
+
+ buffers_added++;
+ adapter->replenish_add_buff_success++;
}
mb();
@@ -282,17 +293,13 @@ static void ibmveth_replenish_buffer_pool(struct ibmveth_adapter *adapter,
return;
failure:
- pool->free_map[free_index] = index;
- pool->skbuff[index] = NULL;
- if (pool->consumer_index == 0)
- pool->consumer_index = pool->size - 1;
- else
- pool->consumer_index--;
- if (!dma_mapping_error(&adapter->vdev->dev, dma_addr))
+
+ if (dma_addr && !dma_mapping_error(&adapter->vdev->dev, dma_addr))
dma_unmap_single(&adapter->vdev->dev,
pool->dma_addr[index], pool->buff_size,
DMA_FROM_DEVICE);
- dev_kfree_skb_any(skb);
+ dev_kfree_skb_any(pool->skbuff[index]);
+ pool->skbuff[index] = NULL;
adapter->replenish_add_buff_failure++;
mb();
@@ -365,7 +372,7 @@ static void ibmveth_free_buffer_pool(struct ibmveth_adapter *adapter,
/* remove a buffer from a pool */
static void ibmveth_remove_buffer_from_pool(struct ibmveth_adapter *adapter,
- u64 correlator)
+ u64 correlator, bool reuse)
{
unsigned int pool = correlator >> 32;
unsigned int index = correlator & 0xffffffffUL;
@@ -376,15 +383,23 @@ static void ibmveth_remove_buffer_from_pool(struct ibmveth_adapter *adapter,
BUG_ON(index >= adapter->rx_buff_pool[pool].size);
skb = adapter->rx_buff_pool[pool].skbuff[index];
-
BUG_ON(skb == NULL);
- adapter->rx_buff_pool[pool].skbuff[index] = NULL;
+ /* if we are going to reuse the buffer then keep the pointers around
+ * but mark index as available. replenish will see the skb pointer and
+ * assume it is to be recycled.
+ */
+ if (!reuse) {
+ /* remove the skb pointer to mark free. actual freeing is done
+ * by upper level networking after gro_receive
+ */
+ adapter->rx_buff_pool[pool].skbuff[index] = NULL;
- dma_unmap_single(&adapter->vdev->dev,
- adapter->rx_buff_pool[pool].dma_addr[index],
- adapter->rx_buff_pool[pool].buff_size,
- DMA_FROM_DEVICE);
+ dma_unmap_single(&adapter->vdev->dev,
+ adapter->rx_buff_pool[pool].dma_addr[index],
+ adapter->rx_buff_pool[pool].buff_size,
+ DMA_FROM_DEVICE);
+ }
free_index = adapter->rx_buff_pool[pool].producer_index;
adapter->rx_buff_pool[pool].producer_index++;
@@ -411,51 +426,13 @@ static inline struct sk_buff *ibmveth_rxq_get_buffer(struct ibmveth_adapter *ada
return adapter->rx_buff_pool[pool].skbuff[index];
}
-/* recycle the current buffer on the rx queue */
-static int ibmveth_rxq_recycle_buffer(struct ibmveth_adapter *adapter)
+static void ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter,
+ bool reuse)
{
- u32 q_index = adapter->rx_queue.index;
- u64 correlator = adapter->rx_queue.queue_addr[q_index].correlator;
- unsigned int pool = correlator >> 32;
- unsigned int index = correlator & 0xffffffffUL;
- union ibmveth_buf_desc desc;
- unsigned long lpar_rc;
- int ret = 1;
-
- BUG_ON(pool >= IBMVETH_NUM_BUFF_POOLS);
- BUG_ON(index >= adapter->rx_buff_pool[pool].size);
+ u64 cor;
- if (!adapter->rx_buff_pool[pool].active) {
- ibmveth_rxq_harvest_buffer(adapter);
- ibmveth_free_buffer_pool(adapter, &adapter->rx_buff_pool[pool]);
- goto out;
- }
-
- desc.fields.flags_len = IBMVETH_BUF_VALID |
- adapter->rx_buff_pool[pool].buff_size;
- desc.fields.address = adapter->rx_buff_pool[pool].dma_addr[index];
-
- lpar_rc = h_add_logical_lan_buffer(adapter->vdev->unit_address, desc.desc);
-
- if (lpar_rc != H_SUCCESS) {
- netdev_dbg(adapter->netdev, "h_add_logical_lan_buffer failed "
- "during recycle rc=%ld", lpar_rc);
- ibmveth_remove_buffer_from_pool(adapter, adapter->rx_queue.queue_addr[adapter->rx_queue.index].correlator);
- ret = 0;
- }
-
- if (++adapter->rx_queue.index == adapter->rx_queue.num_slots) {
- adapter->rx_queue.index = 0;
- adapter->rx_queue.toggle = !adapter->rx_queue.toggle;
- }
-
-out:
- return ret;
-}
-
-static void ibmveth_rxq_harvest_buffer(struct ibmveth_adapter *adapter)
-{
- ibmveth_remove_buffer_from_pool(adapter, adapter->rx_queue.queue_addr[adapter->rx_queue.index].correlator);
+ cor = adapter->rx_queue.queue_addr[adapter->rx_queue.index].correlator;
+ ibmveth_remove_buffer_from_pool(adapter, cor, reuse);
if (++adapter->rx_queue.index == adapter->rx_queue.num_slots) {
adapter->rx_queue.index = 0;
@@ -1347,7 +1324,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
wmb(); /* suggested by larson1 */
adapter->rx_invalid_buffer++;
netdev_dbg(netdev, "recycling invalid buffer\n");
- ibmveth_rxq_recycle_buffer(adapter);
+ ibmveth_rxq_harvest_buffer(adapter, true);
} else {
struct sk_buff *skb, *new_skb;
int length = ibmveth_rxq_frame_length(adapter);
@@ -1380,11 +1357,10 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
if (rx_flush)
ibmveth_flush_buffer(skb->data,
length + offset);
- if (!ibmveth_rxq_recycle_buffer(adapter))
- kfree_skb(skb);
+ ibmveth_rxq_harvest_buffer(adapter, true);
skb = new_skb;
} else {
- ibmveth_rxq_harvest_buffer(adapter);
+ ibmveth_rxq_harvest_buffer(adapter, false);
skb_reserve(skb, offset);
}
--
2.43.0
* Re: [PATCH net-next 1/2] ibmveth: Optimize poll rescheduling process
From: Nelson, Shannon @ 2024-08-01 23:07 UTC
To: Nick Child, netdev; +Cc: bjking1, haren, ricklind
On 8/1/2024 2:12 PM, Nick Child wrote:
> [...]
> - BUG_ON(lpar_rc != H_SUCCESS);
> + /* We think we are done - reenable interrupts,
> + * then check once more to make sure we are done.
> + */
> + lpar_rc = h_vio_signal(adapter->vdev->unit_address, VIO_IRQ_ENABLE);
> + BUG_ON(lpar_rc != H_SUCCESS);
I know you're just moving this around, but maybe this is a good time to
get rid of the deprecated BUG_ON?
Otherwise this looks reasonable.
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
* Re: [PATCH net-next 2/2] ibmveth: Recycle buffers during replenish phase
From: Nelson, Shannon @ 2024-08-01 23:07 UTC
To: Nick Child, netdev; +Cc: bjking1, haren, ricklind
On 8/1/2024 2:12 PM, Nick Child wrote:
> [...]
> + free_index = pool->consumer_index;
> + index = pool->free_map[free_index];
> + skb = NULL;
> +
> + BUG_ON(index == IBM_VETH_INVALID_MAP);
Maybe can replace with a WARN_ON with a break out of the loop?
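Something like this, perhaps (untested sketch):

	if (WARN_ON(index == IBM_VETH_INVALID_MAP))
		break;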
Otherwise this looks reasonable.
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
* Re: [PATCH net-next 2/2] ibmveth: Recycle buffers during replenish phase
From: Nick Child @ 2024-08-02 17:36 UTC
To: Nelson, Shannon, netdev; +Cc: bjking1, haren, ricklind
On 8/1/24 18:07, Nelson, Shannon wrote:
> On 8/1/2024 2:12 PM, Nick Child wrote:
>> [...]
>> + free_index = pool->consumer_index;
>> + index = pool->free_map[free_index];
>> + skb = NULL;
>> +
>> + BUG_ON(index == IBM_VETH_INVALID_MAP);
>
> Maybe can replace with a WARN_ON with a break out of the loop?
>
> Otherwise this looks reasonable.
>
> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
>
Hi Shannon,
Thanks for reviewing. Addressing your comment on both
patches here as they are related. I agree we should
replace the BUG_ONs, but there are 6 other BUG_ONs in
the driver that I would like to address as well. I am
thinking I will send a separate patch which removes
all BUG_ONs in the driver (outside of this patchset).
Since this patchset only rearranges existing BUG_ONs, I
will hold off on sending a v2 unless other feedback comes
in. Thanks again.
* Re: [PATCH net-next 2/2] ibmveth: Recycle buffers during replenish phase
From: Nelson, Shannon @ 2024-08-02 17:38 UTC
To: Nick Child, netdev; +Cc: bjking1, haren, ricklind
On 8/2/2024 10:36 AM, Nick Child wrote:
> [...]
> Hi Shannon,
> Thanks for reviewing. Addressing your comment on both
> patches here as they are related. I agree we should
> replace the BUG_ONs, but there are 6 other BUG_ONs in
> the driver that I would like to address as well. I am
> thinking I will send a separate patch which removes
> all BUG_ONs in the driver (outside of this patchset).
>
> Since this patchset only rearranges existing BUG_ONs, I
> will hold off on sending a v2 unless other feedback comes
> in. Thanks again.
>
No problem - this is what I expected.
sln
* Re: [PATCH net-next 0/2] ibmveth RR performance
From: patchwork-bot+netdevbpf @ 2024-08-02 23:50 UTC
To: Nick Child; +Cc: netdev, bjking1, haren, ricklind
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 1 Aug 2024 16:12:13 -0500 you wrote:
> Hello!
>
> This patchset aims to increase the ibmveth drivers small packet
> request response rate.
>
> These 2 patches address:
> 1. NAPI rescheduling technique
> 2. Driver-side processing of small packets
>
> [...]
Here is the summary with links:
- [net-next,1/2] ibmveth: Optimize poll rescheduling process
https://git.kernel.org/netdev/net-next/c/f128c7cf0530
- [net-next,2/2] ibmveth: Recycle buffers during replenish phase
https://git.kernel.org/netdev/net-next/c/b5381a5540cb
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html