From: Simon Horman <horms@kernel.org>
To: Mingming Cao <mmc@linux.ibm.com>
Cc: netdev@vger.kernel.org, bjking1@linux.ibm.com,
haren@linux.ibm.com, ricklind@linux.ibm.com, kuba@kernel.org,
edumazet@google.com, pabeni@redhat.com,
linuxppc-dev@lists.ozlabs.org, maddy@linux.ibm.com,
mpe@ellerman.id.au
Subject: Re: [PATCH v3 net-next] ibmvnic: Increase max subcrq indirect entries with fallback
Date: Tue, 5 Aug 2025 20:14:36 +0100 [thread overview]
Message-ID: <20250805191436.GY8494@horms.kernel.org> (raw)
In-Reply-To: <20250804231704.12309-1-mmc@linux.ibm.com>
On Mon, Aug 04, 2025 at 04:17:04PM -0700, Mingming Cao wrote:
> POWER8 support a maximum of 16 subcrq indirect descriptor entries per
> H_SEND_SUB_CRQ_INDIRECT call, while POWER9 and newer hypervisors
> support up to 128 entries. Increasing the max number of indirect
> descriptor entries improves batching efficiency and reduces
> hcall overhead, which enhances throughput under large workload on POWER9+.
>
> Currently, ibmvnic driver always uses a fixed number of max indirect
> descriptor entries (16). send_subcrq_indirect() treats all hypervisor
> errors the same:
> - Cleanup and Drop the entire batch of descriptors.
> - Return an error to the caller.
> - Rely on TCP/IP retransmissions to recover.
> - If the hypervisor returns H_PARAMETER (e.g., because 128
> entries are not supported on POWER8), the driver will continue
> to drop batches, resulting in unnecessary packet loss.
>
> In this patch:
> Raise the default maximum indirect entries to 128 to improve ibmvnic
> batching on morden platform. But also gracefully fall back to
> 16 entries for Power 8 systems.
>
> Since there is no VIO interface to query the hypervisor’s supported
> limit, vnic handles send_subcrq_indirect() H_PARAMETER errors:
> - On first H_PARAMETER failure, log the failure context
> - Reduce max_indirect_entries to 16 and allow the single batch to drop.
> - Subsequent calls automatically use the correct lower limit,
> avoiding repeated drops.
>
> The goal is to optimizes performance on modern systems while handles
> falling back for older POWER8 hypervisors.
>
> Performance shows 40% improvements with MTU (1500) on largework load.
>
> --------------------------------------
> Changes since v2:
> link to v2: https://www.spinics.net/lists/netdev/msg1104669.html
>
> -- was Patch 4 from a patch series v2. v2 introduced a module parameter
> for backward compatibility. Based on review feedback, This patch handles
> older systems fall back case without adding a module parameter.
>
> Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
> Reviewed-by: Brian King <bjking1@linux.ibm.com>
> Reviewed-by: Haren Myneni <haren@linux.ibm.com>
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 56 ++++++++++++++++++++++++++----
> drivers/net/ethernet/ibm/ibmvnic.h | 6 ++--
> 2 files changed, 53 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
...
> @@ -862,6 +862,19 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
> failure:
> if (lpar_rc != H_PARAMETER && lpar_rc != H_CLOSED)
> dev_err_ratelimited(dev, "rx: replenish packet buffer failed\n");
> +
> + /* Detect platform limit H_PARAMETER */
> + if (lpar_rc == H_PARAMETER &&
> + adapter->cur_max_ind_descs > IBMVNIC_MAX_IND_DESC_MIN) {
> + netdev_info(adapter->netdev,
> + "H_PARAMETER, set ind desc to safe limit %u\n",
> + IBMVNIC_MAX_IND_DESC_MIN);
> + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MIN;
> + }
Hi Mingming, all,
The logic above seems to appear twice in this patch.
I think it would be good to consolidate it somehow.
E.g. in a helper function.
> +
> + /* for all error case, temporarily drop only this batch
> + * Rely on TCP/IP retransmissions to retry and recover
> + */
Thanks for adding this comment.
Although perhaps 'for' -> 'For'.
Likewise below.
> for (i = ind_bufp->index - 1; i >= 0; --i) {
> struct ibmvnic_rx_buff *rx_buff;
>
> @@ -2381,16 +2394,33 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
> rc = send_subcrq_direct(adapter, handle,
> (u64 *)ind_bufp->indir_arr);
>
> - if (rc)
> + if (rc) {
> + dev_err_ratelimited(&adapter->vdev->dev,
> + "tx_flush failed, rc=%u (%llu entries dma=%pad handle=%llx)\n",
> + rc, entries, &dma_addr, handle);
> + /* Detect platform limit H_PARAMETER */
> + if (rc == H_PARAMETER &&
> + adapter->cur_max_ind_descs > IBMVNIC_MAX_IND_DESC_MIN) {
> + netdev_info(adapter->netdev,
> + "H_PARAMETER, set ind descs to safe limit %u\n",
> + IBMVNIC_MAX_IND_DESC_MIN);
> + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MIN;
> + }
> +
> + /* for all error case, temporarily drop only this batch
> + * Rely on TCP/IP retransmissions to retry and recover
> + */
> ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
> - else
> + } else {
> ind_bufp->index = 0;
> + }
> return rc;
> }
...
> @@ -6369,6 +6399,17 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
> rc = reset_sub_crq_queues(adapter);
> }
> } else {
> + if (adapter->reset_reason == VNIC_RESET_MOBILITY) {
> + /* post migrtione reset the max
> + * indirect descriptors per hcall to be default max
> + * (e.g p8 ->p10)
> + * if the destination is on the platform supports
> + * do not support max (e.g. p10->p8) the threshold
> + * will be reduced to safe min limit for p8 later
> + */
nits: Post migration, reset.
The line breaking seems uneven.
And if p8 and p10 are POWER8 and POWER10 then I think it would
be worth spelling that out.
> + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MAX;
> + }
> +
> rc = init_sub_crqs(adapter);
> }
>
...
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
> index 246ddce753f9..829a16116812 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.h
> +++ b/drivers/net/ethernet/ibm/ibmvnic.h
> @@ -29,8 +29,9 @@
> #define IBMVNIC_BUFFS_PER_POOL 100
> #define IBMVNIC_MAX_QUEUES 16
> #define IBMVNIC_MAX_QUEUE_SZ 4096
> -#define IBMVNIC_MAX_IND_DESCS 16
> -#define IBMVNIC_IND_ARR_SZ (IBMVNIC_MAX_IND_DESCS * 32)
> +#define IBMVNIC_MAX_IND_DESC_MAX 128
> +#define IBMVNIC_MAX_IND_DESC_MIN 16
...MAX...{MAX,MIN} seems like an unfortunate name.
But I don't feel particularly strongly about this one.
> +#define IBMVNIC_IND_MAX_ARR_SZ (IBMVNIC_MAX_IND_DESC_MAX * 32)
>
> #define IBMVNIC_TSO_BUF_SZ 65536
> #define IBMVNIC_TSO_BUFS 64
...
prev parent reply other threads:[~2025-08-05 19:14 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-04 23:17 [PATCH v3 net-next] ibmvnic: Increase max subcrq indirect entries with fallback Mingming Cao
2025-08-05 19:14 ` Simon Horman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250805191436.GY8494@horms.kernel.org \
--to=horms@kernel.org \
--cc=bjking1@linux.ibm.com \
--cc=edumazet@google.com \
--cc=haren@linux.ibm.com \
--cc=kuba@kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.ibm.com \
--cc=mmc@linux.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=ricklind@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.