public inbox for linux-kernel@vger.kernel.org
From: Aditya Garg <gargaditya@linux.microsoft.com>
To: Eric Dumazet <edumazet@google.com>
Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, andrew+netdev@lunn.ch, davem@davemloft.net,
	kuba@kernel.org, pabeni@redhat.com, longli@microsoft.com,
	kotaranov@microsoft.com, horms@kernel.org,
	shradhagupta@linux.microsoft.com, ernis@linux.microsoft.com,
	dipayanroy@linux.microsoft.com, shirazsaleem@microsoft.com,
	linux-hyperv@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	gargaditya@microsoft.com, ssengar@linux.microsoft.com
Subject: Re: [PATCH net-next] net: mana: Linearize SKB if TX SGEs exceeds hardware limit
Date: Fri, 17 Oct 2025 23:11:11 +0530
Message-ID: <1d3ac973-7bc7-4abe-9fe2-6b17dbba223b@linux.microsoft.com>
In-Reply-To: <7bc327ba-0050-4d9e-86b6-1b7427a96f53@linux.microsoft.com>

On 08-10-2025 20:58, Aditya Garg wrote:
> On 08-10-2025 20:51, Eric Dumazet wrote:
>> On Wed, Oct 8, 2025 at 8:16 AM Aditya Garg
>> <gargaditya@linux.microsoft.com> wrote:
>>>
>>> On 03-10-2025 21:45, Eric Dumazet wrote:
>>>> On Fri, Oct 3, 2025 at 8:47 AM Aditya Garg
>>>> <gargaditya@linux.microsoft.com> wrote:
>>>>>
>>>>> The MANA hardware supports a maximum of 30 scatter-gather entries (SGEs)
>>>>> per TX WQE. In rare configurations where MAX_SKB_FRAGS + 2 exceeds this
>>>>> limit, the driver drops the skb. Add a check in mana_start_xmit() to
>>>>> detect such cases and linearize the SKB before transmission.
>>>>>
>>>>> Return NETDEV_TX_BUSY only for -ENOSPC from mana_gd_post_work_request(),
>>>>> send other errors to free_sgl_ptr to free resources and record the tx
>>>>> drop.
>>>>>
>>>>> Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
>>>>> Reviewed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
>>>>> ---
>>>>>    drivers/net/ethernet/microsoft/mana/mana_en.c | 26 +++++++++++++++----
>>>>>    include/net/mana/gdma.h                       |  8 +++++-
>>>>>    include/net/mana/mana.h                       |  1 +
>>>>>    3 files changed, 29 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
>>>>> index f4fc86f20213..22605753ca84 100644
>>>>> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
>>>>> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
>>>>> @@ -20,6 +20,7 @@
>>>>>
>>>>>    #include <net/mana/mana.h>
>>>>>    #include <net/mana/mana_auxiliary.h>
>>>>> +#include <linux/skbuff.h>
>>>>>
>>>>>    static DEFINE_IDA(mana_adev_ida);
>>>>>
>>>>> @@ -289,6 +290,19 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>>>>>           cq = &apc->tx_qp[txq_idx].tx_cq;
>>>>>           tx_stats = &txq->stats;
>>>>>
>>>>> +       BUILD_BUG_ON(MAX_TX_WQE_SGL_ENTRIES != MANA_MAX_TX_WQE_SGL_ENTRIES);
>>>>> +       #if (MAX_SKB_FRAGS + 2 > MANA_MAX_TX_WQE_SGL_ENTRIES)
>>>>> +               if (skb_shinfo(skb)->nr_frags + 2 > MANA_MAX_TX_WQE_SGL_ENTRIES) {
>>>>> +                       netdev_info_once(ndev,
>>>>> +                                        "nr_frags %d exceeds max supported sge limit. Attempting skb_linearize\n",
>>>>> +                                        skb_shinfo(skb)->nr_frags);
>>>>> +                       if (skb_linearize(skb)) {
>>>>
>>>> This will fail in many cases.
>>>>
>>>> This sort of check is better done in ndo_features_check()
>>>>
>>>> Most probably this would occur for GSO packets, so we can ask for
>>>> software segmentation to avoid this big and risky kmalloc() by all means.
>>>>
>>>> Look at idpf_features_check()  which has something similar.
>>>
>>> Hi Eric,
>>> Thank you for your review. I understand your concerns regarding the use
>>> of skb_linearize() in the xmit path, as it can fail under memory
>>> pressure and introduces additional overhead in the transmit path. Based
>>> on your input, I will work on a v2 that moves the SGE limit check to
>>> the ndo_features_check() path and, for GSO skbs exceeding the hw limit,
>>> clears NETIF_F_GSO_MASK to enforce software segmentation in the
>>> kernel before the call to xmit.
>>> Also, for non-GSO skbs exceeding the SGE hw limit, should we use
>>> skb_linearize() only then, or would you suggest some other approach?
>>
>> I think that for non GSO, the linearization attempt is fine.
>>
>> Note that this is extremely unlikely for non malicious users,
>> and MTU being usually small (9K or less),
>> the allocation will be much smaller than a GSO packet.
> 
> Okay. Will send a v2

Hi Eric,
I tested the code by disabling GSO in ndo_features_check() when the number
of SGEs exceeds the hardware limit, using iperf for a single TCP
connection with zerocopy enabled. I noticed a significant difference in
throughput compared to when we linearize the skbs.
For reference, the throughput is 35.6 Gbits/sec when using
skb_linearize(), but drops to 6.75 Gbits/sec when disabling GSO per skb.

Hence, we propose to keep linearizing skbs until the first failure occurs.
After that, we switch to a fail-safe mode: GSO is disabled in
ndo_features_check() for skbs whose SGE count exceeds the hardware limit,
while non-GSO skbs that exceed the limit continue to go through
skb_linearize(). This keeps us on the faster path initially and moves to
the fail-safe path only after a linearization failure.
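
To make the proposal concrete, here is a rough userspace model of the
decision logic. It is a sketch only, not driver code: struct tx_policy,
struct pkt, choose_action() and note_linearize_result() are hypothetical
names used for illustration. The 30-SGE limit and the nr_frags + 2 count
are taken from the patch. In the driver, the TX_SW_SEGMENT case would
presumably correspond to clearing NETIF_F_GSO_MASK in ndo_features_check().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hardware limit from the patch description: 30 SGEs per TX WQE. */
#define HW_MAX_SGES 30
/* The patch counts nr_frags + 2 SGEs per skb. */
#define EXTRA_SGES 2

/* Hypothetical outcome of the policy for one packet. */
enum tx_action {
	TX_SEND_AS_IS,	/* fits within the hardware limit */
	TX_LINEARIZE,	/* try to linearize before posting */
	TX_SW_SEGMENT,	/* drop GSO offload, segment in software */
};

/* Hypothetical per-device state: latched after the first failure. */
struct tx_policy {
	bool linearize_failed;
};

/* Minimal stand-in for the skb fields the decision depends on. */
struct pkt {
	unsigned int nr_frags;
	bool is_gso;
};

static bool exceeds_hw_limit(const struct pkt *p)
{
	return p->nr_frags + EXTRA_SGES > HW_MAX_SGES;
}

/*
 * Fast path: linearize every oversized packet.
 * Fail-safe path (after the first linearize failure): oversized GSO
 * packets are segmented in software; oversized non-GSO packets are
 * still linearized.
 */
static enum tx_action choose_action(const struct tx_policy *pol,
				    const struct pkt *p)
{
	assert(pol != NULL && p != NULL);

	if (!exceeds_hw_limit(p))
		return TX_SEND_AS_IS;
	if (p->is_gso && pol->linearize_failed)
		return TX_SW_SEGMENT;
	return TX_LINEARIZE;
}

/* Record the result of a linearize attempt; a failure is sticky. */
static void note_linearize_result(struct tx_policy *pol, bool ok)
{
	assert(pol != NULL);

	if (!ok)
		pol->linearize_failed = true;
}
```

In this model the switch to the fail-safe path is one-way: once
linearize_failed is set it is never cleared. Whether the driver should
ever return to the fast path is left open here.
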
Regards,
Aditya

