netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ij@kernel.org>
To: Eric Dumazet <edumazet@google.com>
Cc: chia-yu.chang@nokia-bell-labs.com, netdev@vger.kernel.org,
	 dsahern@gmail.com, davem@davemloft.net, dsahern@kernel.org,
	 pabeni@redhat.com, joel.granados@kernel.org, kuba@kernel.org,
	 andrew+netdev@lunn.ch, horms@kernel.org, pablo@netfilter.org,
	 kadlec@netfilter.org, netfilter-devel@vger.kernel.org,
	 coreteam@netfilter.org, ncardwell@google.com,
	 koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com,
	 ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com,
	 cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com,
	 vidhi_goel@apple.com
Subject: Re: [PATCH v5 net-next 09/13] gro: prevent ACE field corruption & better AccECN handling
Date: Thu, 7 Nov 2024 21:28:08 +0200 (EET)	[thread overview]
Message-ID: <37429ace-59c0-21d2-bcc8-54033794e789@kernel.org> (raw)
In-Reply-To: <CANn89i+9USaOthY3yaJPT-cbfAcP0re2bbGbWU7SqOSYEW2CMw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4454 bytes --]

On Thu, 7 Nov 2024, Eric Dumazet wrote:

> On Tue, Nov 5, 2024 at 11:07 AM <chia-yu.chang@nokia-bell-labs.com> wrote:
> >
> > From: Ilpo Järvinen <ij@kernel.org>
> >
> > There are important differences in how the CWR field behaves
> > in RFC3168 and AccECN. With AccECN, CWR flag is part of the
> > ACE counter and its changes are important so adjust the flags
> > changed mask accordingly.
> >
> > Also, if CWR is there, set the Accurate ECN GSO flag to avoid
> > corrupting CWR flag somewhere.
> >
> > Signed-off-by: Ilpo Järvinen <ij@kernel.org>
> > Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
> > ---
> >  net/ipv4/tcp_offload.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
> > index 0b05f30e9e5f..f59762d88c38 100644
> > --- a/net/ipv4/tcp_offload.c
> > +++ b/net/ipv4/tcp_offload.c
> > @@ -329,7 +329,7 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb,
> >         th2 = tcp_hdr(p);
> >         flush = (__force int)(flags & TCP_FLAG_CWR);
> >         flush |= (__force int)((flags ^ tcp_flag_word(th2)) &
> > -                 ~(TCP_FLAG_CWR | TCP_FLAG_FIN | TCP_FLAG_PSH));
> > +                 ~(TCP_FLAG_FIN | TCP_FLAG_PSH));
> >         flush |= (__force int)(th->ack_seq ^ th2->ack_seq);
> >         for (i = sizeof(*th); i < thlen; i += 4)
> >                 flush |= *(u32 *)((u8 *)th + i) ^
> > @@ -405,7 +405,7 @@ void tcp_gro_complete(struct sk_buff *skb)
> >         shinfo->gso_segs = NAPI_GRO_CB(skb)->count;
> >
> >         if (th->cwr)
> > -               shinfo->gso_type |= SKB_GSO_TCP_ECN;
> > +               shinfo->gso_type |= SKB_GSO_TCP_ACCECN;
> >  }
> >  EXPORT_SYMBOL(tcp_gro_complete);
> >
> 
> I do not really understand this patch. How a GRO engine can know which
> ECN variant the peers are using ?

Hi Eric,

Thanks for taking a look.

GRO doesn't know. Setting SKB_GSO_TCP_ECN in case of not knowing can 
result in header change that corrupts ACE field. Thus, GRO has to assume 
the RFC3168 CWR offloading trick cannot be used anymore (unless it tracks 
the connection and knows the bits are used for RFC3168 which is something 
nobody is going to do).

The main point of SKB_GSO_TCP_ACCECN is to prevent SKB_GSO_TCP_ECN or 
NETIF_F_TSO_ECN offloading to be used for the skb as it would corrupt ACE 
field value.

SKB_GSO_TCP_ACCECN doesn't allow CWR bits change within a super-skb but 
the same CWR flag should be repeated for all segments. In a sense, it's 
simpler than RFC3168 offloading.

> SKB_GSO_TCP_ECN is also used from other points, what is your plan ?
> 
> git grep -n SKB_GSO_TCP_ECN
> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:3888:
> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1291:
> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1312:
> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;

These looked like they should be just changed to set SKB_GSO_TCP_ACCECN 
instead.

I don't anymore recall why I didn't change those when I made this patch 
long time ago, perhaps it was just an oversight or things have changed 
somehow since then.

> include/linux/netdevice.h:5061: BUILD_BUG_ON(SKB_GSO_TCP_ECN !=
> (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
> include/linux/skbuff.h:664:     SKB_GSO_TCP_ECN = 1 << 2,

Not relevant.

> include/linux/virtio_net.h:88:                  gso_type |= SKB_GSO_TCP_ECN;
> include/linux/virtio_net.h:161:         switch (gso_type & ~SKB_GSO_TCP_ECN) {
> include/linux/virtio_net.h:226:         if (sinfo->gso_type & SKB_GSO_TCP_ECN)

These need to be looked further what's going on as UAPI is also involved 
here.

> net/ipv4/tcp_offload.c:404:             shinfo->gso_type |= SKB_GSO_TCP_ECN;

This was changed above. :-)

> net/ipv4/tcp_output.c:389: skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;

I'm pretty sure this relates to sending side which will be taken care by 
a later patch in the full series (not among these preparatory patches).


FYI, these TSO/GSO/GRO changes are what I was most unsure myself if I 
got everything right when I was working with this series a few years back 
so please keep that in mind while reviewing so my lack of knowledge 
doesn't end up breaking something. :-)

-- 
 i.

  reply	other threads:[~2024-11-07 19:28 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-05 10:06 [PATCH v5 net-next 00/13] AccECN protocol preparation patch series chia-yu.chang
2024-11-05 10:06 ` [PATCH v5 net-next 01/13] tcp: reorganize tcp_in_ack_event() and tcp_count_delivered() chia-yu.chang
2024-11-07  9:31   ` Eric Dumazet
2024-11-07 19:32     ` Ilpo Järvinen
2024-11-05 10:06 ` [PATCH v5 net-next 02/13] tcp: create FLAG_TS_PROGRESS chia-yu.chang
2024-11-07  9:34   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 03/13] tcp: use BIT() macro in include/net/tcp.h chia-yu.chang
2024-11-07 10:20   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 04/13] tcp: extend TCP flags to allow AE bit/ACE field chia-yu.chang
2024-11-07 10:23   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 05/13] tcp: reorganize SYN ECN code chia-yu.chang
2024-11-07 10:25   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 06/13] tcp: rework {__,}tcp_ecn_check_ce() -> tcp_data_ecn_check() chia-yu.chang
2024-11-07 10:30   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 07/13] tcp: helpers for ECN mode handling chia-yu.chang
2024-11-07 12:26   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 08/13] gso: AccECN support chia-yu.chang
2024-11-07 12:41   ` Eric Dumazet
2024-11-09 10:09   ` Ilpo Järvinen
2024-11-09 19:42     ` Neal Cardwell
2024-11-09 21:38       ` Ilpo Järvinen
2024-11-05 10:06 ` [PATCH v5 net-next 09/13] gro: prevent ACE field corruption & better AccECN handling chia-yu.chang
2024-11-07 12:40   ` Eric Dumazet
2024-11-07 19:28     ` Ilpo Järvinen [this message]
2024-11-12  2:09       ` Chia-Yu Chang (Nokia)
2024-11-12 21:19         ` Ilpo Järvinen
2024-11-13  1:42           ` Jason Wang
2024-11-14  1:04             ` Chia-Yu Chang (Nokia)
2024-11-05 10:06 ` [PATCH v5 net-next 10/13] tcp: AccECN support to tcp_add_backlog chia-yu.chang
2024-11-07 12:42   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 11/13] tcp: allow ECN bits in TOS/traffic class chia-yu.chang
2024-11-07 12:55   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 12/13] tcp: Pass flags to __tcp_send_ack chia-yu.chang
2024-11-07 12:58   ` Eric Dumazet
2024-11-05 10:06 ` [PATCH v5 net-next 13/13] tcp: fast path functions later chia-yu.chang
2024-11-07 13:00   ` Eric Dumazet
2024-11-07 20:38     ` Chia-Yu Chang (Nokia)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37429ace-59c0-21d2-bcc8-54033794e789@kernel.org \
    --to=ij@kernel.org \
    --cc=Jason_Livingood@comcast.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=cheshire@apple.com \
    --cc=chia-yu.chang@nokia-bell-labs.com \
    --cc=coreteam@netfilter.org \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=g.white@cablelabs.com \
    --cc=horms@kernel.org \
    --cc=ingemar.s.johansson@ericsson.com \
    --cc=joel.granados@kernel.org \
    --cc=kadlec@netfilter.org \
    --cc=koen.de_schepper@nokia-bell-labs.com \
    --cc=kuba@kernel.org \
    --cc=mirja.kuehlewind@ericsson.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    --cc=rs.ietf@gmx.at \
    --cc=vidhi_goel@apple.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).