From: Simon Horman <horms@kernel.org>
To: Mingming Cao <mmc@linux.ibm.com>
Cc: netdev@vger.kernel.org, bjking1@linux.ibm.com,
haren@linux.ibm.com, ricklind@linux.ibm.com, kuba@kernel.org,
edumazet@google.com, pabeni@redhat.com,
linuxppc-dev@lists.ozlabs.org, maddy@linux.ibm.com,
mpe@ellerman.id.au
Subject: Re: [PATCH v3 net-next] ibmvnic: Increase max subcrq indirect entries with fallback
Date: Tue, 5 Aug 2025 20:14:36 +0100 [thread overview]
Message-ID: <20250805191436.GY8494@horms.kernel.org> (raw)
In-Reply-To: <20250804231704.12309-1-mmc@linux.ibm.com>
On Mon, Aug 04, 2025 at 04:17:04PM -0700, Mingming Cao wrote:
> POWER8 support a maximum of 16 subcrq indirect descriptor entries per
> H_SEND_SUB_CRQ_INDIRECT call, while POWER9 and newer hypervisors
> support up to 128 entries. Increasing the max number of indirect
> descriptor entries improves batching efficiency and reduces
> hcall overhead, which enhances throughput under large workload on POWER9+.
>
> Currently, ibmvnic driver always uses a fixed number of max indirect
> descriptor entries (16). send_subcrq_indirect() treats all hypervisor
> errors the same:
> - Cleanup and Drop the entire batch of descriptors.
> - Return an error to the caller.
> - Rely on TCP/IP retransmissions to recover.
> - If the hypervisor returns H_PARAMETER (e.g., because 128
> entries are not supported on POWER8), the driver will continue
> to drop batches, resulting in unnecessary packet loss.
>
> In this patch:
> Raise the default maximum indirect entries to 128 to improve ibmvnic
> batching on morden platform. But also gracefully fall back to
> 16 entries for Power 8 systems.
>
> Since there is no VIO interface to query the hypervisor’s supported
> limit, vnic handles send_subcrq_indirect() H_PARAMETER errors:
> - On first H_PARAMETER failure, log the failure context
> - Reduce max_indirect_entries to 16 and allow the single batch to drop.
> - Subsequent calls automatically use the correct lower limit,
> avoiding repeated drops.
>
> The goal is to optimizes performance on modern systems while handles
> falling back for older POWER8 hypervisors.
>
> Performance shows 40% improvements with MTU (1500) on largework load.
>
> --------------------------------------
> Changes since v2:
> link to v2: https://www.spinics.net/lists/netdev/msg1104669.html
>
> -- was Patch 4 from a patch series v2. v2 introduced a module parameter
> for backward compatibility. Based on review feedback, This patch handles
> older systems fall back case without adding a module parameter.
>
> Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
> Reviewed-by: Brian King <bjking1@linux.ibm.com>
> Reviewed-by: Haren Myneni <haren@linux.ibm.com>
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 56 ++++++++++++++++++++++++++----
> drivers/net/ethernet/ibm/ibmvnic.h | 6 ++--
> 2 files changed, 53 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
...
> @@ -862,6 +862,19 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
> failure:
> if (lpar_rc != H_PARAMETER && lpar_rc != H_CLOSED)
> dev_err_ratelimited(dev, "rx: replenish packet buffer failed\n");
> +
> + /* Detect platform limit H_PARAMETER */
> + if (lpar_rc == H_PARAMETER &&
> + adapter->cur_max_ind_descs > IBMVNIC_MAX_IND_DESC_MIN) {
> + netdev_info(adapter->netdev,
> + "H_PARAMETER, set ind desc to safe limit %u\n",
> + IBMVNIC_MAX_IND_DESC_MIN);
> + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MIN;
> + }
Hi Mingming, all,
The logic above seems to appear twice in this patch.
I think it would be good to consolidate it somehow.
E.g. in a helper function.
> +
> + /* for all error case, temporarily drop only this batch
> + * Rely on TCP/IP retransmissions to retry and recover
> + */
Thanks for adding this comment.
Although perhaps 'for' -> 'For'.
Likewise below.
> for (i = ind_bufp->index - 1; i >= 0; --i) {
> struct ibmvnic_rx_buff *rx_buff;
>
> @@ -2381,16 +2394,33 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
> rc = send_subcrq_direct(adapter, handle,
> (u64 *)ind_bufp->indir_arr);
>
> - if (rc)
> + if (rc) {
> + dev_err_ratelimited(&adapter->vdev->dev,
> + "tx_flush failed, rc=%u (%llu entries dma=%pad handle=%llx)\n",
> + rc, entries, &dma_addr, handle);
> + /* Detect platform limit H_PARAMETER */
> + if (rc == H_PARAMETER &&
> + adapter->cur_max_ind_descs > IBMVNIC_MAX_IND_DESC_MIN) {
> + netdev_info(adapter->netdev,
> + "H_PARAMETER, set ind descs to safe limit %u\n",
> + IBMVNIC_MAX_IND_DESC_MIN);
> + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MIN;
> + }
> +
> + /* for all error case, temporarily drop only this batch
> + * Rely on TCP/IP retransmissions to retry and recover
> + */
> ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
> - else
> + } else {
> ind_bufp->index = 0;
> + }
> return rc;
> }
...
> @@ -6369,6 +6399,17 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
> rc = reset_sub_crq_queues(adapter);
> }
> } else {
> + if (adapter->reset_reason == VNIC_RESET_MOBILITY) {
> + /* post migrtione reset the max
> + * indirect descriptors per hcall to be default max
> + * (e.g p8 ->p10)
> + * if the destination is on the platform supports
> + * do not support max (e.g. p10->p8) the threshold
> + * will be reduced to safe min limit for p8 later
> + */
nits: Post migration, reset.
The line breaking seems uneven.
And if p8 and p10 are POWER8 and POWER10 then I think it would
be worth spelling that out.
> + adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MAX;
> + }
> +
> rc = init_sub_crqs(adapter);
> }
>
...
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
> index 246ddce753f9..829a16116812 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.h
> +++ b/drivers/net/ethernet/ibm/ibmvnic.h
> @@ -29,8 +29,9 @@
> #define IBMVNIC_BUFFS_PER_POOL 100
> #define IBMVNIC_MAX_QUEUES 16
> #define IBMVNIC_MAX_QUEUE_SZ 4096
> -#define IBMVNIC_MAX_IND_DESCS 16
> -#define IBMVNIC_IND_ARR_SZ (IBMVNIC_MAX_IND_DESCS * 32)
> +#define IBMVNIC_MAX_IND_DESC_MAX 128
> +#define IBMVNIC_MAX_IND_DESC_MIN 16
...MAX...{MAX,MIN} seems like an unfortunate name.
But I don't feel particularly strongly about this one.
> +#define IBMVNIC_IND_MAX_ARR_SZ (IBMVNIC_MAX_IND_DESC_MAX * 32)
>
> #define IBMVNIC_TSO_BUF_SZ 65536
> #define IBMVNIC_TSO_BUFS 64
...
prev parent reply other threads:[~2025-08-05 19:14 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-04 23:17 [PATCH v3 net-next] ibmvnic: Increase max subcrq indirect entries with fallback Mingming Cao
2025-08-05 19:14 ` Simon Horman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250805191436.GY8494@horms.kernel.org \
--to=horms@kernel.org \
--cc=bjking1@linux.ibm.com \
--cc=edumazet@google.com \
--cc=haren@linux.ibm.com \
--cc=kuba@kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.ibm.com \
--cc=mmc@linux.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=ricklind@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).