Linux kernel -stable discussions
 help / color / mirror / Atom feed
* [PATCH net] gve: Fix an edge case for TSO skb validity check
@ 2024-07-18 19:02 Praveen Kaligineedi
  2024-07-18 19:06 ` kernel test robot
  2024-07-18 23:07 ` Willem de Bruijn
  0 siblings, 2 replies; 11+ messages in thread
From: Praveen Kaligineedi @ 2024-07-18 19:02 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, willemb, shailend, hramamurthy,
	csully, jfraker, stable, Bailey Forrest, Praveen Kaligineedi,
	Jeroen de Borst

From: Bailey Forrest <bcf@google.com>

The NIC requires each TSO segment to not span more than 10
descriptors. gve_can_send_tso() performs this check. However,
the current code misses an edge case when a TSO skb has a large
frag that needs to be split into multiple descriptors, causing
the 10 descriptor limit per TSO-segment to be exceeded. This
change fixes the edge case.

Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
---
 drivers/net/ethernet/google/gve/gve_tx_dqo.c | 22 +++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/google/gve/gve_tx_dqo.c b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
index 0b3cca3fc792..dc39dc481f21 100644
--- a/drivers/net/ethernet/google/gve/gve_tx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
@@ -866,22 +866,42 @@ static bool gve_can_send_tso(const struct sk_buff *skb)
 	const int header_len = skb_tcp_all_headers(skb);
 	const int gso_size = shinfo->gso_size;
 	int cur_seg_num_bufs;
+	int last_frag_size;
 	int cur_seg_size;
 	int i;
 
 	cur_seg_size = skb_headlen(skb) - header_len;
+	last_frag_size = skb_headlen(skb);
 	cur_seg_num_bufs = cur_seg_size > 0;
 
 	for (i = 0; i < shinfo->nr_frags; i++) {
 		if (cur_seg_size >= gso_size) {
 			cur_seg_size %= gso_size;
 			cur_seg_num_bufs = cur_seg_size > 0;
+
+			/* If the last buffer is split in the middle of a TSO
+			 * segment, then it will count as two descriptors.
+			 */
+			if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
+				int last_frag_remain = last_frag_size %
+					GVE_TX_MAX_BUF_SIZE_DQO;
+
+				/* If the last frag was evenly divisible by
+				 * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
+				 * split in the current segment.
+				 */
+				if (last_frag_remain &&
+				    cur_seg_size > last_frag_remain) {
+					cur_seg_num_bufs++;
+				}
+			}
 		}
 
 		if (unlikely(++cur_seg_num_bufs > max_bufs_per_seg))
 			return false;
 
-		cur_seg_size += skb_frag_size(&shinfo->frags[i]);
+		last_frag_size = skb_frag_size(&shinfo->frags[i]);
+		cur_seg_size += last_frag_size;
 	}
 
 	return true;
-- 
2.45.2.993.g49e7a77208-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-18 19:02 [PATCH net] gve: Fix an edge case for TSO skb validity check Praveen Kaligineedi
@ 2024-07-18 19:06 ` kernel test robot
  2024-07-18 23:07 ` Willem de Bruijn
  1 sibling, 0 replies; 11+ messages in thread
From: kernel test robot @ 2024-07-18 19:06 UTC (permalink / raw)
  To: Praveen Kaligineedi; +Cc: stable, oe-kbuild-all

Hi,

Thanks for your patch.

FYI: kernel test robot notices the stable kernel rule is not satisfied.

The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1

Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree.
Subject: [PATCH net] gve: Fix an edge case for TSO skb validity check
Link: https://lore.kernel.org/stable/20240718190221.2219835-1-pkaligineedi%40google.com

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-18 19:02 [PATCH net] gve: Fix an edge case for TSO skb validity check Praveen Kaligineedi
  2024-07-18 19:06 ` kernel test robot
@ 2024-07-18 23:07 ` Willem de Bruijn
  2024-07-19  0:27   ` Bailey Forrest
  2024-07-19  1:51   ` Praveen Kaligineedi
  1 sibling, 2 replies; 11+ messages in thread
From: Willem de Bruijn @ 2024-07-18 23:07 UTC (permalink / raw)
  To: Praveen Kaligineedi, netdev
  Cc: davem, edumazet, kuba, pabeni, willemb, shailend, hramamurthy,
	csully, jfraker, stable, Bailey Forrest, Praveen Kaligineedi,
	Jeroen de Borst

Praveen Kaligineedi wrote:
> From: Bailey Forrest <bcf@google.com>
> 
> The NIC requires each TSO segment to not span more than 10
> descriptors. gve_can_send_tso() performs this check. However,
> the current code misses an edge case when a TSO skb has a large
> frag that needs to be split into multiple descriptors

because each descriptor may not exceed 16KB (GVE_TX_MAX_BUF_SIZE_DQO)

>, causing
> the 10 descriptor limit per TSO-segment to be exceeded. This
> change fixes the edge case.
> 
> Fixes: a57e5de476be ("gve: DQO: Add TX path")
> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
> Signed-off-by: Bailey Forrest <bcf@google.com>
> Reviewed-by: Jeroen de Borst <jeroendb@google.com>
> ---
>  drivers/net/ethernet/google/gve/gve_tx_dqo.c | 22 +++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/google/gve/gve_tx_dqo.c b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
> index 0b3cca3fc792..dc39dc481f21 100644
> --- a/drivers/net/ethernet/google/gve/gve_tx_dqo.c
> +++ b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
> @@ -866,22 +866,42 @@ static bool gve_can_send_tso(const struct sk_buff *skb)
>  	const int header_len = skb_tcp_all_headers(skb);
>  	const int gso_size = shinfo->gso_size;
>  	int cur_seg_num_bufs;
> +	int last_frag_size;

nit: last_frag can be interpreted as frags[nr_frags - 1], perhaps prev_frag.

>  	int cur_seg_size;
>  	int i;
>  
>  	cur_seg_size = skb_headlen(skb) - header_len;
> +	last_frag_size = skb_headlen(skb);
>  	cur_seg_num_bufs = cur_seg_size > 0;
>  
>  	for (i = 0; i < shinfo->nr_frags; i++) {
>  		if (cur_seg_size >= gso_size) {
>  			cur_seg_size %= gso_size;
>  			cur_seg_num_bufs = cur_seg_size > 0;
> +
> +			/* If the last buffer is split in the middle of a TSO

s/buffer/frag?

> +			 * segment, then it will count as two descriptors.
> +			 */
> +			if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
> +				int last_frag_remain = last_frag_size %
> +					GVE_TX_MAX_BUF_SIZE_DQO;
> +
> +				/* If the last frag was evenly divisible by
> +				 * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> +				 * split in the current segment.

Is this true even if the segment did not start at the start of the frag?

Overall, it's not trivial to follow. Probably because the goal is to
count max descriptors per segment, but that is not what is being
looped over.

Intuitive (perhaps buggy, a quick sketch), this is what is intended,
right?

static bool gve_can_send_tso(const struct sk_buff *skb)
{
        int frag_size = skb_headlen(skb) - header_len;
        int gso_size_left;
        int frag_idx = 0;
        int num_desc;
        int desc_len;
        int off = 0;

        while (off < skb->len) {
                gso_size_left = shinfo->gso_size;
                num_desc = 0;

                while (gso_size_left) {
                        desc_len = min(gso_size_left, frag_size);
                        gso_size_left -= desc_len;
                        frag_size -= desc_len;
                        num_desc++;

                        if (num_desc > max_descs_per_seg)
                                return false;

                        if (!frag_size)
                                frag_size = skb_frag_size(&shinfo->frags[frag_idx++]);
                }
        }

        return true;
}

This however loops skb->len / gso_size. While the above modulo
operation skips many segments that span a frag. Not sure if the more
intuitive approach could be as performant.

Else, I'll stare some more at the suggested patch to convince myself
that it is correct and complete..

> +                              */


> +				 */
> +				if (last_frag_remain &&
> +				    cur_seg_size > last_frag_remain) {
> +					cur_seg_num_bufs++;
> +				}
> +			}
>  		}
>  
>  		if (unlikely(++cur_seg_num_bufs > max_bufs_per_seg))
>  			return false;
>  
> -		cur_seg_size += skb_frag_size(&shinfo->frags[i]);
> +		last_frag_size = skb_frag_size(&shinfo->frags[i]);
> +		cur_seg_size += last_frag_size;
>  	}
>  
>  	return true;
> -- 
> 2.45.2.993.g49e7a77208-goog
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-18 23:07 ` Willem de Bruijn
@ 2024-07-19  0:27   ` Bailey Forrest
  2024-07-19  3:24     ` Willem de Bruijn
  2024-07-19  1:51   ` Praveen Kaligineedi
  1 sibling, 1 reply; 11+ messages in thread
From: Bailey Forrest @ 2024-07-19  0:27 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Praveen Kaligineedi, netdev, davem, edumazet, kuba, pabeni,
	willemb, shailend, hramamurthy, csully, jfraker, stable,
	Jeroen de Borst

On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> This however loops skb->len / gso_size. While the above modulo
> operation skips many segments that span a frag. Not sure if the more
> intuitive approach could be as performant.

Yes, the original intention of the code was to loop over nr_frags,
instead of (skb->len / gso_size).

But perhaps that's premature optimization if it makes the code
significantly harder to follow.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-18 23:07 ` Willem de Bruijn
  2024-07-19  0:27   ` Bailey Forrest
@ 2024-07-19  1:51   ` Praveen Kaligineedi
  2024-07-19  3:46     ` Willem de Bruijn
  1 sibling, 1 reply; 11+ messages in thread
From: Praveen Kaligineedi @ 2024-07-19  1:51 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: netdev, davem, edumazet, kuba, pabeni, willemb, shailend,
	hramamurthy, csully, jfraker, stable, Bailey Forrest,
	Jeroen de Borst

On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:

> > +                      * segment, then it will count as two descriptors.
> > +                      */
> > +                     if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
> > +                             int last_frag_remain = last_frag_size %
> > +                                     GVE_TX_MAX_BUF_SIZE_DQO;
> > +
> > +                             /* If the last frag was evenly divisible by
> > +                              * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > +                              * split in the current segment.
>
> Is this true even if the segment did not start at the start of the frag?
The comment probably is a bit confusing here. The current segment
we are tracking could have a portion in the previous frag. The code
assumed that the portion on the previous frag (if present) mapped to only
one descriptor. However, that portion could have been split across two
descriptors due to the restriction that each descriptor cannot exceed 16KB.
That's the case this fix is trying to address.
I will work on simplifying the logic based on your suggestion below so
that the fix is easier to follow
>
> Overall, it's not trivial to follow. Probably because the goal is to
> count max descriptors per segment, but that is not what is being
> looped over.
>
The comment
> Intuitive (perhaps buggy, a quick sketch), this is what is intended,
> right?
>
> static bool gve_can_send_tso(const struct sk_buff *skb)
> {
>         int frag_size = skb_headlen(skb) - header_len;
>         int gso_size_left;
>         int frag_idx = 0;
>         int num_desc;
>         int desc_len;
>         int off = 0;
>
>         while (off < skb->len) {
>                 gso_size_left = shinfo->gso_size;
>                 num_desc = 0;
>
>                 while (gso_size_left) {
>                         desc_len = min(gso_size_left, frag_size);
>                         gso_size_left -= desc_len;
>                         frag_size -= desc_len;
>                         num_desc++;
>
>                         if (num_desc > max_descs_per_seg)
>                                 return false;
>
>                         if (!frag_size)
>                                 frag_size = skb_frag_size(&shinfo->frags[frag_idx++]);
>                 }
>         }
>
>         return true;
> }
>
> This however loops skb->len / gso_size. While the above modulo
> operation skips many segments that span a frag. Not sure if the more
> intuitive approach could be as performant.
>
> Else, I'll stare some more at the suggested patch to convince myself
> that it is correct and complete..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-19  0:27   ` Bailey Forrest
@ 2024-07-19  3:24     ` Willem de Bruijn
  0 siblings, 0 replies; 11+ messages in thread
From: Willem de Bruijn @ 2024-07-19  3:24 UTC (permalink / raw)
  To: Bailey Forrest
  Cc: Praveen Kaligineedi, netdev, davem, edumazet, kuba, pabeni,
	willemb, shailend, hramamurthy, csully, jfraker, stable,
	Jeroen de Borst

On Thu, Jul 18, 2024 at 8:28 PM Bailey Forrest <bcf@google.com> wrote:
>
> On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > This however loops skb->len / gso_size. While the above modulo
> > operation skips many segments that span a frag. Not sure if the more
> > intuitive approach could be as performant.
>
> Yes, the original intention of the code was to loop over nr_frags,
> instead of (skb->len / gso_size).
>
> But perhaps that's premature optimization if it makes the code
> significantly harder to follow.

Thanks. I don't mean to ask for a wholesale rewrite if not needed.

But perhaps the logic can be explained in the commit in a way
that it is more immediately obvious.

Praveen suggested that. I'll respond to his reply in more detail.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-19  1:51   ` Praveen Kaligineedi
@ 2024-07-19  3:46     ` Willem de Bruijn
  2024-07-19 14:31       ` Praveen Kaligineedi
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2024-07-19  3:46 UTC (permalink / raw)
  To: Praveen Kaligineedi
  Cc: netdev, davem, edumazet, kuba, pabeni, willemb, shailend,
	hramamurthy, csully, jfraker, stable, Bailey Forrest,
	Jeroen de Borst

On Thu, Jul 18, 2024 at 9:52 PM Praveen Kaligineedi
<pkaligineedi@google.com> wrote:
>
> On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>
> > > +                      * segment, then it will count as two descriptors.
> > > +                      */
> > > +                     if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
> > > +                             int last_frag_remain = last_frag_size %
> > > +                                     GVE_TX_MAX_BUF_SIZE_DQO;
> > > +
> > > +                             /* If the last frag was evenly divisible by
> > > +                              * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > > +                              * split in the current segment.
> >
> > Is this true even if the segment did not start at the start of the frag?
> The comment probably is a bit confusing here. The current segment
> we are tracking could have a portion in the previous frag. The code
> assumed that the portion on the previous frag (if present) mapped to only
> one descriptor. However, that portion could have been split across two
> descriptors due to the restriction that each descriptor cannot exceed 16KB.

>>> /* If the last frag was evenly divisible by
>>> +                                * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
>>>  +                              * split in the current segment.

This is true because the smallest multiple of 16KB is 32KB, and the
largest gso_size at least for Ethernet will be 9K. But I don't think
that that is what is used here as the basis for this statement?

> That's the case this fix is trying to address.
> I will work on simplifying the logic based on your suggestion below so
> that the fix is easier to follow

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-19  3:46     ` Willem de Bruijn
@ 2024-07-19 14:31       ` Praveen Kaligineedi
  2024-07-19 16:55         ` Bailey Forrest
  0 siblings, 1 reply; 11+ messages in thread
From: Praveen Kaligineedi @ 2024-07-19 14:31 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: netdev, davem, edumazet, kuba, pabeni, willemb, shailend,
	hramamurthy, csully, jfraker, stable, Bailey Forrest,
	Jeroen de Borst

On Thu, Jul 18, 2024 at 8:47 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Thu, Jul 18, 2024 at 9:52 PM Praveen Kaligineedi
> <pkaligineedi@google.com> wrote:
> >
> > On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> >
> > > > +                      * segment, then it will count as two descriptors.
> > > > +                      */
> > > > +                     if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
> > > > +                             int last_frag_remain = last_frag_size %
> > > > +                                     GVE_TX_MAX_BUF_SIZE_DQO;
> > > > +
> > > > +                             /* If the last frag was evenly divisible by
> > > > +                              * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > > > +                              * split in the current segment.
> > >
> > > Is this true even if the segment did not start at the start of the frag?
> > The comment probably is a bit confusing here. The current segment
> > we are tracking could have a portion in the previous frag. The code
> > assumed that the portion on the previous frag (if present) mapped to only
> > one descriptor. However, that portion could have been split across two
> > descriptors due to the restriction that each descriptor cannot exceed 16KB.
>
> >>> /* If the last frag was evenly divisible by
> >>> +                                * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> >>>  +                              * split in the current segment.
>
> This is true because the smallest multiple of 16KB is 32KB, and the
> largest gso_size at least for Ethernet will be 9K. But I don't think
> that that is what is used here as the basis for this statement?
>
The largest Ethernet gso_size (9K) is less than GVE_TX_MAX_BUF_SIZE_DQO
is an implicit assumption made in this patch and in that comment. Bailey,
please correct me if I am wrong..




> > That's the case this fix is trying to address.
> > I will work on simplifying the logic based on your suggestion below so
> > that the fix is easier to follow

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-19 14:31       ` Praveen Kaligineedi
@ 2024-07-19 16:55         ` Bailey Forrest
  2024-07-21 19:10           ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: Bailey Forrest @ 2024-07-19 16:55 UTC (permalink / raw)
  To: Praveen Kaligineedi
  Cc: Willem de Bruijn, netdev, davem, edumazet, kuba, pabeni, willemb,
	shailend, hramamurthy, csully, jfraker, stable, Jeroen de Borst

On Fri, Jul 19, 2024 at 7:31 AM Praveen Kaligineedi
<pkaligineedi@google.com> wrote:
>
> On Thu, Jul 18, 2024 at 8:47 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Thu, Jul 18, 2024 at 9:52 PM Praveen Kaligineedi
> > <pkaligineedi@google.com> wrote:
> > >
> > > On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > > > +                      * segment, then it will count as two descriptors.
> > > > > +                      */
> > > > > +                     if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
> > > > > +                             int last_frag_remain = last_frag_size %
> > > > > +                                     GVE_TX_MAX_BUF_SIZE_DQO;
> > > > > +
> > > > > +                             /* If the last frag was evenly divisible by
> > > > > +                              * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > > > > +                              * split in the current segment.
> > > >
> > > > Is this true even if the segment did not start at the start of the frag?
> > > The comment probably is a bit confusing here. The current segment
> > > we are tracking could have a portion in the previous frag. The code
> > > assumed that the portion on the previous frag (if present) mapped to only
> > > one descriptor. However, that portion could have been split across two
> > > descriptors due to the restriction that each descriptor cannot exceed 16KB.
> >
> > >>> /* If the last frag was evenly divisible by
> > >>> +                                * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > >>>  +                              * split in the current segment.
> >
> > This is true because the smallest multiple of 16KB is 32KB, and the
> > largest gso_size at least for Ethernet will be 9K. But I don't think
> > that that is what is used here as the basis for this statement?
> >
> The largest Ethernet gso_size (9K) is less than GVE_TX_MAX_BUF_SIZE_DQO
> is an implicit assumption made in this patch and in that comment. Bailey,
> please correct me if I am wrong..

If last_frag_size is evenly divisible by GVE_TX_MAX_BUF_SIZE_DQO, it
doesn't hit the edge case we're looking for.

- If it's evenly divisible, then we know it will use exactly
(last_frag_size / GVE_TX_MAX_BUF_SIZE_DQO) descriptors
- GVE_TX_MAX_BUF_SIZE_DQO > 9k, so we know each descriptor won't
create a segment which exceeds the limit

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-19 16:55         ` Bailey Forrest
@ 2024-07-21 19:10           ` Willem de Bruijn
  2024-07-23 10:20             ` Paolo Abeni
  0 siblings, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2024-07-21 19:10 UTC (permalink / raw)
  To: Bailey Forrest
  Cc: Praveen Kaligineedi, netdev, davem, edumazet, kuba, pabeni,
	willemb, shailend, hramamurthy, csully, jfraker, stable,
	Jeroen de Borst

On Fri, Jul 19, 2024 at 9:56 AM Bailey Forrest <bcf@google.com> wrote:
>
> On Fri, Jul 19, 2024 at 7:31 AM Praveen Kaligineedi
> <pkaligineedi@google.com> wrote:
> >
> > On Thu, Jul 18, 2024 at 8:47 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > On Thu, Jul 18, 2024 at 9:52 PM Praveen Kaligineedi
> > > <pkaligineedi@google.com> wrote:
> > > >
> > > > On Thu, Jul 18, 2024 at 4:07 PM Willem de Bruijn
> > > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > > > +                      * segment, then it will count as two descriptors.
> > > > > > +                      */
> > > > > > +                     if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
> > > > > > +                             int last_frag_remain = last_frag_size %
> > > > > > +                                     GVE_TX_MAX_BUF_SIZE_DQO;
> > > > > > +
> > > > > > +                             /* If the last frag was evenly divisible by
> > > > > > +                              * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > > > > > +                              * split in the current segment.
> > > > >
> > > > > Is this true even if the segment did not start at the start of the frag?
> > > > The comment probably is a bit confusing here. The current segment
> > > > we are tracking could have a portion in the previous frag. The code
> > > > assumed that the portion on the previous frag (if present) mapped to only
> > > > one descriptor. However, that portion could have been split across two
> > > > descriptors due to the restriction that each descriptor cannot exceed 16KB.
> > >
> > > >>> /* If the last frag was evenly divisible by
> > > >>> +                                * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
> > > >>>  +                              * split in the current segment.
> > >
> > > This is true because the smallest multiple of 16KB is 32KB, and the
> > > largest gso_size at least for Ethernet will be 9K. But I don't think
> > > that that is what is used here as the basis for this statement?
> > >
> > The largest Ethernet gso_size (9K) is less than GVE_TX_MAX_BUF_SIZE_DQO
> > is an implicit assumption made in this patch and in that comment. Bailey,
> > please correct me if I am wrong..
>
> If last_frag_size is evenly divisible by GVE_TX_MAX_BUF_SIZE_DQO, it
> doesn't hit the edge case we're looking for.
>
> - If it's evenly divisible, then we know it will use exactly
> (last_frag_size / GVE_TX_MAX_BUF_SIZE_DQO) descriptors

This assumes that gso_segment start is aligned with skb_frag
start. That is not necessarily true, right?

If headlen minus protocol headers is 1B, then the first segment
will have two descriptors { 1B, 9KB - 1 }. And the next segment
can have skb_frag_size - ( 9KB - 1).

I think the statement is correct, but because every multiple
of 16KB is so much larger than the max gso_size of ~9KB,
that a single segment will never include more than two
skb_frags.

Quite possibly the code overestimates the number of
descriptors per segment now, but that is safe and only a
performance regression.

> - GVE_TX_MAX_BUF_SIZE_DQO > 9k, so we know each descriptor won't
> create a segment which exceeds the limit

For a net patch, it is generally better to make a small fix rather than rewrite.

That said, my sketch without looping over every segment:

        while (off < skb->len) {
                gso_size_left = shinfo->gso_size;
                num_desc = 0;

                while (gso_size_left) {
                        desc_len = min(gso_size_left, frag_size_left);
                        gso_size_left -= desc_len;
                        frag_size_left -= desc_len;
                        num_desc++;

                        if (num_desc > max_descs_per_seg)
                                return false;

                        if (!frag_size_left)
                                frag_size_left =
skb_frag_size(&shinfo->frags[frag_idx++]);
+                      else
+                              frag_size_left %= gso_size;        /*
skip segments that fit in one desc */
                }
        }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net] gve: Fix an edge case for TSO skb validity check
  2024-07-21 19:10           ` Willem de Bruijn
@ 2024-07-23 10:20             ` Paolo Abeni
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Abeni @ 2024-07-23 10:20 UTC (permalink / raw)
  To: Willem de Bruijn, Bailey Forrest
  Cc: Praveen Kaligineedi, netdev, davem, edumazet, kuba, willemb,
	shailend, hramamurthy, csully, jfraker, stable, Jeroen de Borst

On 7/21/24 21:10, Willem de Bruijn wrote:
> On Fri, Jul 19, 2024 at 9:56 AM Bailey Forrest <bcf@google.com> wrote:
>> If last_frag_size is evenly divisible by GVE_TX_MAX_BUF_SIZE_DQO, it
>> doesn't hit the edge case we're looking for.
>>
>> - If it's evenly divisible, then we know it will use exactly
>> (last_frag_size / GVE_TX_MAX_BUF_SIZE_DQO) descriptors
> 
> This assumes that gso_segment start is aligned with skb_frag
> start. That is not necessarily true, right?
> 
> If headlen minus protocol headers is 1B, then the first segment
> will have two descriptors { 1B, 9KB - 1 }. And the next segment
> can have skb_frag_size - ( 9KB - 1).
> 
> I think the statement is correct, but because every multiple
> of 16KB is so much larger than the max gso_size of ~9KB,
> that a single segment will never include more than two
> skb_frags.

I'm a bit lost, but what abut big TCP? that should allow (considerably) 
more than 2 frags 16K each per segment.

In any case it looks like this needs at least some comments clarification.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-07-23 10:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-18 19:02 [PATCH net] gve: Fix an edge case for TSO skb validity check Praveen Kaligineedi
2024-07-18 19:06 ` kernel test robot
2024-07-18 23:07 ` Willem de Bruijn
2024-07-19  0:27   ` Bailey Forrest
2024-07-19  3:24     ` Willem de Bruijn
2024-07-19  1:51   ` Praveen Kaligineedi
2024-07-19  3:46     ` Willem de Bruijn
2024-07-19 14:31       ` Praveen Kaligineedi
2024-07-19 16:55         ` Bailey Forrest
2024-07-21 19:10           ` Willem de Bruijn
2024-07-23 10:20             ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox