Netdev List
From: Justin Lai <justinlai0215@realtek.com>
To: David Laight <david.laight.linux@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>, "kuba@kernel.org" <kuba@kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"edumazet@google.com" <edumazet@google.com>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"horms@kernel.org" <horms@kernel.org>,
	Ping-Ke Shih <pkshih@realtek.com>,
	Larry Chiu <larry.chiu@realtek.com>
Subject: RE: [PATCH] rtase: Workaround for IP fragmented UDP packet hardware bug
Date: Tue, 9 Jun 2026 08:39:27 +0000
Message-ID: <94fd8b1a43e6401bb67e0a1310e212e0@realtek.com>
In-Reply-To: <20260608230053.5c1a7bc5@pumpkin>

David Laight <david.laight.linux@gmail.com> wrote:
> 
> On Mon, 8 Jun 2026 12:28:28 +0000
> Justin Lai <justinlai0215@realtek.com> wrote:
> 
> > David Laight <david.laight.linux@gmail.com> wrote:
> > >
> > > On Thu, 4 Jun 2026 13:43:27 +0000
> > > Justin Lai <justinlai0215@realtek.com> wrote:
> > >
> > > > David Laight <david.laight.linux@gmail.com> wrote:
> > > > >
> > > > > On Thu, 4 Jun 2026 08:33:51 +0000 Justin Lai
> > > > > <justinlai0215@realtek.com> wrote:
> > > > >
> > > > > > Andrew Lunn <andrew@lunn.ch> wrote:
> > > > > > >
> > > > > > > On Mon, Jun 01, 2026 at 02:23:41PM +0800, Justin Lai wrote:
> > > > > > > > > The hardware parser incorrectly interprets 319/320 in a
> > > > > > > > > short IP fragmented UDP packet payload as standard PTP
> > > > > > > > > destination ports and treats the fragment as a PTP packet
> > > > > > > > > for further parsing.
> > > > >
> > > > > Is that a packet that has been segmented by IP, or one where the
> > > > > skb is fragmented enough that the data in the header is too short?
> > > > > I thought that IPv4 required an mtu of 128 bytes (ish) and IPv6
> > > > > somewhat larger - so I don't see how that is a problem.
> > > > >
> > > > > > If the skb is fragmented then you need to move data into the
> > > > > > header not pad the frame.
> > > > >
> > > > > > If the hardware really is broken then I suspect you need to
> > > > > > disable the feature and suffer the consequences.
> > > > >
> > > > > > > >
> > > > > > > > > If the transport data is smaller than RTASE_MIN_PAD_LEN,
> > > > > > > > > the remaining data is insufficient for further parsing and
> > > > > > > > > causes hardware TX hang.
> > > > > > >
> > > > > > > Where does RTASE_SHORT_PKT_THRESH come into this?
> > > > > > >
> > > > > > > > RTASE_MIN_PAD_LEN is 47, so matches all packets which need
> > > > > > > > padding up to 60 bytes, plus FCS. There are not many such
> > > > > > > > packets, so why bother with all this complexity rather than
> > > > > > > > just padding all small packets? Do you have any performance
> > > > > > > > numbers which show the complexity is worth it?
> > > > > > >
> > > > > > > > > Pad these packets so the transport data reaches
> > > > > > > > > RTASE_MIN_PAD_LEN before transmitting to avoid triggering
> > > > > > > > > the hardware issue.
> > > > > > > >
> > > > > > > > Signed-off-by: Justin Lai <justinlai0215@realtek.com>
> > > > > > >
> > > > > > > Is this a Fix? Please add a Fixes: tag. And base it on net.
> > > > > > >
> > > > > > >
> > > > > > > > https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
> > > > > > >
> > > > > > >     Andrew
> > > > > > >
> > > > > > > ---
> > > > > > > pw-bot: cr
> > > > > >
> > > > > > Hi Andrew,
> > > > > >
> > > > > > > RTASE_MIN_PAD_LEN is not the Ethernet minimum-frame padding
> > > > > > > threshold. It is the minimum transport-data length required by
> > > > > > > the hardware parser after the packet is incorrectly detected as
> > > > > > > a PTP packet.
> > > > > >
> > > > > > Therefore, this workaround needs to pad the packets which can
> > > > > > trigger the hardware issue, rather than just padding packets
> > > > > > to the Ethernet minimum frame size.
> > > > >
> > > > > Is that a longer length?
> > > > > > Excessive frame padding (beyond 60+FCS) can be treated as a
> > > > > > protocol error.
> > > > >
> > > > > -- David
> > > > >
> > > > > >
> > > > > > I agree that RTASE_SHORT_PKT_THRESH is not necessary here. I
> > > > > > will remove it in the next revision.
> > > > > >
> > > > > > Yes, this is a fix. I will add a Fixes tag and repost it
> > > > > > against the net tree.
> > > > > >
> > > > > > Thanks,
> > > > > > Justin
> > > > > >
> > > >
> > > > Hi David,
> > > >
> > > > This is an IP fragmented packet, not a fragmented skb.
> > >
> > > Ok, your tests are broken for fragmented skb.
> > > skb_tail_pointer() is the end of the initial fragment, not the end
> > > of the actual data.
> > > So you could be adding padding to a full length packet making it
> > > overlong.
> > > Somewhere you need to be looking at skb->len.
> > > Probably with a fast-path check to ignore long packets.
> > >
> >
> > You're right. skb_tail_pointer() only covers the linear area and can
> > underestimate the transport-data length for non-linear skb.
> >
> > I will change the transport-data length calculation to use:
> > trans_data_len = skb->len - skb_transport_offset(skb);
> >
> > > >
> > > > The issue occurs on a non-initial IP fragment whose payload
> > > > contains values matching the PTP event/general destination ports
> > > > (319/320). The hardware parser incorrectly identifies the fragment
> > > > as a PTP packet and attempts further parsing.
> > >
> > > Wait a minute, what stops someone using either of those port numbers
> > > for something else?
> > > There are no hard restrictions on the use of UDP port numbers.
> > > So what does the hardware do with UDP packets to port 319/320 that
> > > are being used for something else entirely?
> > >
> >
> > If the UDP destination port is 319 or 320, the hardware will identify
> > the packet as a PTP packet and perform the corresponding PTP
> > processing.
> 
> But aren't I allowed to run my 'frobnicate' protocol on those ports?
>
You are correct that UDP ports are not strictly bound to a specific
protocol at the networking layer.

However, 319/320 are the standardized and widely used ports for PTP
(IEEE 1588). Our hardware relies on this common usage for PTP
identification in the parser.
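
To illustrate why a port-only match misfires on non-initial IP
fragments, here is a minimal userspace sketch. It is not the rtase
driver or hardware logic; the function names and the assumption that
the parser ignores the IPv4 fragment offset are hypothetical, chosen
only to model the behaviour described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTP_EVENT_PORT     319u	/* IEEE 1588 event messages */
#define PTP_GENERAL_PORT   320u	/* IEEE 1588 general messages */
#define IP_PROTO_UDP       17u
#define IPV4_FRAG_OFF_MASK 0x1fffu	/* low 13 bits of bytes 6-7 */

/* Read a big-endian 16-bit field. */
static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}

/*
 * Port-only match (assumed model of the faulty parser): reads the two
 * bytes where a UDP destination port would sit and ignores the IPv4
 * fragment offset.
 */
static bool port_only_is_ptp(const uint8_t *ip, size_t len)
{
	size_t ihl;
	uint16_t dport;

	assert(ip != NULL);
	if (len < 20 || (ip[0] >> 4) != 4)
		return false;
	ihl = (size_t)(ip[0] & 0x0f) * 4;
	if (ihl < 20 || ip[9] != IP_PROTO_UDP || len < ihl + 4)
		return false;
	dport = be16(ip + ihl + 2);
	return dport == PTP_EVENT_PORT || dport == PTP_GENERAL_PORT;
}

/*
 * Fragment-aware match: only a packet with fragment offset 0 carries a
 * UDP header, so later fragments are never classified as PTP.
 */
static bool frag_aware_is_ptp(const uint8_t *ip, size_t len)
{
	if (!port_only_is_ptp(ip, len))
		return false;
	return (be16(ip + 6) & IPV4_FRAG_OFF_MASK) == 0;
}
```

With a non-initial fragment whose payload bytes 2-3 happen to be
0x013f (319), port_only_is_ptp() returns true while
frag_aware_is_ptp() returns false.
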
> >
> > > > The workaround does not modify the IP or UDP length fields. The
> > > > original protocol headers still describe the actual packet size.
> > > >
> > > > Therefore, the protocol-defined payload size remains unchanged.
> > > >
> > > > We have tested this workaround and have not observed issues caused
> > > > by the additional padding.
> > >
> > > Have you checked all the systems that might receive such packets?
> > >
> >
> > Could you clarify what kind of protocol error you are referring to?
> >
> > From my understanding, this workaround adds padding at the end of the
> > frame without modifying the IP or UDP length fields. The
> > protocol-defined packet length therefore remains unchanged.
> >
> > This appears similar to Ethernet frame padding, where additional bytes
> > may exist beyond the protocol-defined payload length. Could you
> > elaborate on how this case differs from the padding up to the Ethernet
> > minimum frame size (60 bytes + FCS)?
> 
> The padding of short packets to 64 bytes (inc FCS) is a special case.
> I don't believe any other packets are expected to be padded.
> There have been issues when packets get extended when hardware switches
> add VLAN headers to padded packets.
> I also think you'll find bug reports caused by one of the VM 'virtual network
> interfaces' adding extra padding to frames before they reach the physical
> network (possibly just making them even length).
> I can't remember the full details, it caused the company I used to work for
> some issues because we had some hardware that rejected frames with
> unexpected padding, a google search showed it was a known problem with that
> VM.
> A quick search showed https://www.virtualbox.org/ticket/18202
> I can't remember if that was the source of the padded packets.
> 
> David
> 
Thanks for the detailed feedback.

As you mentioned, some hardware may enforce strict Ethernet frame
length validation or be sensitive to tail padding beyond the
original frame size.

In our case, IP and UDP headers are not modified, and the
protocol-defined length remains unchanged. Therefore, on hardware
without strict L2 length validation, this padding is not expected
to cause issues.

However, this workaround is required on our hardware, because the
known issue can otherwise cause a TX hang.
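
As a rough sketch of the length arithmetic discussed earlier in the
thread (transport-data length taken from the total packet length, not
from the linear area), the padding amount could be computed as below.
This is plain userspace C, not the driver code: pad_bytes_needed() is
a hypothetical helper, and the value 47 for RTASE_MIN_PAD_LEN is the
one quoted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define RTASE_MIN_PAD_LEN 47	/* value quoted earlier in this thread */

/*
 * pkt_len models skb->len (total data length, linear plus paged) and
 * transport_off models skb_transport_offset(skb). Returns the number
 * of tail padding bytes needed so that the transport data reaches
 * RTASE_MIN_PAD_LEN, or 0 if the packet is already long enough.
 */
static size_t pad_bytes_needed(size_t pkt_len, size_t transport_off)
{
	size_t trans_data_len;

	assert(transport_off <= pkt_len);
	trans_data_len = pkt_len - transport_off;

	/* Fast path: long packets need no padding. */
	if (trans_data_len >= RTASE_MIN_PAD_LEN)
		return 0;

	return RTASE_MIN_PAD_LEN - trans_data_len;
}
```

For example, a 44-byte frame with the transport data starting at
offset 34 has 10 bytes of transport data and would need 37 bytes of
padding; the IP and UDP length fields are left untouched.
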

Thanks,
Justin
> >
> > Thanks,
> > Justin
> >
> > > -- David
> > >
> > > >
> > > > Thanks,
> > > > Justin
> >
> >



Thread overview: 10+ messages
2026-06-01  6:23 [PATCH] rtase: Workaround for IP fragmented UDP packet hardware bug Justin Lai
2026-06-01 13:20 ` Andrew Lunn
2026-06-04  8:33   ` Justin Lai
2026-06-04 11:46     ` David Laight
2026-06-04 13:43       ` Justin Lai
2026-06-04 14:53         ` David Laight
2026-06-08 12:28           ` Justin Lai
2026-06-08 22:00             ` David Laight
2026-06-09  8:39               ` Justin Lai [this message]
2026-06-01 13:22 ` Alexander Lobakin
