From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org, "(JC),
Jayachandran" <j-rameshbabu@ti.com>,
"David S. Miller" <davem@davemloft.net>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Chintan Vankar <c-vankar@ti.com>,
Danish Anwar <danishanwar@ti.com>, Daolin Qiu <d-qiu@ti.com>,
Eric Dumazet <edumazet@google.com>,
Felix Maurer <fmaurer@redhat.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Richard Cochran <richardcochran@gmail.com>,
Simon Horman <horms@kernel.org>
Subject: Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR
Date: Tue, 10 Mar 2026 17:35:33 -0400
Message-ID: <willemdebruijn.kernel.3693c9aa8271@gmail.com>
In-Reply-To: <20260310105544.EVXIekwG@linutronix.de>
Sebastian Andrzej Siewior wrote:
> On 2026-03-09 21:38:33 [-0400], Willem de Bruijn wrote:
> > The same point about adding per protocol state to sk_buff applies
> > to a slightly lesser extent to PF_PACKET.
> >
> > Adding this much HSR + PTP specific code there is a non-starter.
>
> It looked like a little and is hidden behind a static branch so it is
> just a nop as long as there is no one using the socket bind. And without
> CONFIG_HSR there is not even that.
>
> > I should have said this in v1. This likely makes my skb_extensions
> > suggestion a non-starter sorry.
>
> I need something to share between two layers I think so that
> skb_extensions wasn't that bad.
>
> > We need to find a different way to
> >
> > Rx: get the port info from the slave device to userspace.
> > Tx: send out the intended slave device.
> >
> > Let's separate the two challenges (and patches).
> >
> > On Rx, could your process just attach the PF_PACKET socket to the
> > slave devices and filter on HSR PTP packets? Then separately drop
> > these packets in hsr_handle_frame (as already done?) or TC ingress, so
> > that they only arrive in userspace?
>
> I could listen directly on eth0/ eth1 as a PF_PACKET. That would give me
> all I need including a timestamp, yes. I wouldn't just be able to use it
> for TX but let's go on.
Great
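
To make the Rx side concrete, here is a rough userspace sketch (Python for
brevity) of one PF_PACKET socket per slave device plus a check for
HSR-tagged PTP frames. The names open_port_socket and parse_hsr_ptp are
made up for illustration. It assumes untagged (no VLAN) frames carrying
the 6-byte HSR tag directly after the Ethernet header, and it does the
classification in userspace where linuxptp would more likely use a cBPF
filter.

```python
import socket
import struct

ETH_P_HSR = 0x892F    # EtherType of the HSR tag
ETH_P_1588 = 0x88F7   # EtherType of PTP over Ethernet
ETH_HLEN = 14         # dst MAC + src MAC + EtherType
HSR_TAG_LEN = 6       # path/LSDU size + sequence number + encapsulated proto


def open_port_socket(ifname):
    """Open a raw packet socket bound to one slave device (needs CAP_NET_RAW)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_HSR))
    sock.bind((ifname, ETH_P_HSR))
    return sock


def parse_hsr_ptp(frame):
    """Return (path, sequence_nr) for an HSR-tagged PTP frame, else None.

    Assumption: no VLAN tag between the MAC addresses and the HSR tag.
    """
    if len(frame) < ETH_HLEN + HSR_TAG_LEN:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_HSR:
        return None
    path_and_size, seq, encap = struct.unpack_from("!HHH", frame, ETH_HLEN)
    if encap != ETH_P_1588:
        return None
    # The upper 4 bits hold the path, the lower 12 bits the LSDU size.
    return path_and_size >> 12, seq
```

The receiving port is then implied by which socket the frame arrived on,
e.g. one socket from open_port_socket("eth0") and one from
open_port_socket("eth1").
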
> > On Tx, can you share a bit more why there are two cases, one where the
> > master has to add the header, but also one where it does not (so
> > userspace has presumably inserted it).
>
> PTP + HSR. Let's assume the following setup:
>
>        ╭────────╮   ╭──────╮              ╭──────╮  ╭────────╮
>    ╔═══│ Node X │═══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┅┅┤Port A│══│ Node Y │════╗
>    ║   ╰────────╯   ╰──────╯              ╰──────╯  ╰────────╯    ║
>    ║                                                              ║
>    ║                                                              ║
> ╭──────╮            ╭──────╮  ╭────────╮  ╭──────╮            ╭──────╮
> │Port B├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│══│ Node Z │══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│
> ╰──────╯            ╰──────╯  ╰────────╯  ╰──────╯            ╰──────╯
>
> Node X has direct connection to Y and Z, each node has two ports. You
> could add more nodes but it always remains a ring.
> Let's say node X sends a packet (say TCP/IP) with the destination MAC of
> node Z assuming a "normal port 443" request. This packet gets a HSR
> header prepended and is sent on X-A and X-B. This happens transparently
> as hsr0 is the device with an IP address assigned and port A and B are
> just two devices which are up with no IP address assigned. These are the
> physical devices forwarding the traffic.
>
> Y-A receives it, is not the target, forwards it over Y-B.
> Z-B receives it, it is the target, sends to its master port which
> removes the HSR header and the packet arrives in the IP stack. After the
> master port, it forwards it also on Z-A.
> Z-A receives it (the copy from Y-B) identifies it as a duplicate based
> on the HSR-sequence number (does not inject into the master port) and
> forwards it on Z-B.
> At the end Node X receives two copies of the packet it sent and removes
> them from the ring (node X was the sender identified by the SRC MAC and
> does not forward it).
>
> This is how HSR works in general. Now let's add PTP to this as specified.
> The target MAC is always a multicast MAC and the ether type is PTP
> 0x88f7.
>
> Use case 1: A PDELAY_REQ packet. This packet travels only between two
> neighbours. That means X-A sends it to Y-A and Y must not forward it
> over Y-B but needs to answer (send a PDELAY_RESP). These packets are
> sent as PTP frames and the HSR stack needs to prepend a HSR header with
> a valid sequence number. X-B gets its own request. Userland needs to
> track time/ state information on per port basis.
>
> Use case 2: A SYNC packet. This packet is sent from X-A to Y-A. Again a
> HSR header needs to be prepended by the stack on X. Y-A receives that
> packet. It injects it into the master port where user land can consume
> it. This is the same as the previous case.
> Here comes the different part: This packet needs to be forwarded by Y
> over Y-B. As in the previous case the HSR stack does not forward it on
> its own but this part is done by userland. So userland sends a packet,
> only on Y-B and this packet already contains the HSR header from X and
> it needs to be preserved.
> The forwarded SYNC packet got its timing information updated based on
> the delay within the stack (so it is not identical as received).
Thanks for the detailed explanation of the challenge!
> That is why the HSR stack must not forward the packets on the other
> port as it would normally do (breaks PTP time information), why user
> land needs to know on which port the packet was received and why it
> needs to send a packet only on one port with or without the HSR header.
>
> > The second case is simpler: can just write directly the whole packet
> > to the intended slave device.
>
> Yes. This has been suggested and was indeed used in my v1 of linuxptp
Great
> but the problem was sending with system's HSR header.
>
> > For the first case, could skb->mark be used as port selector when
> > writing from a packet socket to the master device? That already works
> > with sock_cmsg_send.
>
> We would have to specify that SO_MARK 1 and 2 denotes the port on which
> a packet is sent. This kind of burns the usage for everything else on
> HSR so it feels misused.
It is more or less what mark is for. An alternative similar field
supported by sock_cmsg_send is skb->priority.
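
For illustration, a rough userspace sketch (Python for brevity) of passing
a per-packet mark as a control message on a packet socket bound to the
master device. The mapping of mark 1 to port A and mark 2 to port B is
only the convention floated in this thread; no kernel implements it today,
and hsr_dev_xmit would have to be taught to honor it. The helper names
mark_cmsg and send_on_port are made up here.

```python
import socket
import struct

# SO_MARK is 36 on most Linux architectures; fall back to that value if this
# Python build does not export the constant.
SO_MARK = getattr(socket, "SO_MARK", 36)

# Hypothetical convention, not implemented by any kernel:
# mark 1 selects slave port A, mark 2 selects slave port B.
PORT_MARK = {"A": 1, "B": 2}


def mark_cmsg(port):
    """Build the ancillary data list carrying the mark for one packet."""
    mark = PORT_MARK[port]
    return [(socket.SOL_SOCKET, SO_MARK, struct.pack("=I", mark))]


def send_on_port(sock, frame, port):
    """Send one raw frame on an already bound packet socket with a mark.

    The mark ends up in skb->mark via sock_cmsg_send(). Setting a mark is a
    privileged operation, so the caller needs the matching capability.
    """
    return sock.sendmsg([frame], mark_cmsg(port))
```

The same shape would work for skb->priority by swapping SO_MARK for
SO_PRIORITY, on kernels where sock_cmsg_send accepts that control message.
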
An alternative may be to share the information in-band: insert the HSR
header already when writing to the master device, provided the master
device can detect such a packet with a pre-existing header.
This is not the first case where ndo_start_xmit may already expect a
header prefixed that it normally inserts. I forgot the exact case (can
look it up), maybe a weird edge case in GRE?
It does not even have to be a valid HSR header: just an agreement
between the process writing the raw packet and hsr_dev_xmit.
There probably are still more ways we can approach this challenge.
But these are three that do not require kernel changes outside the
HSR protocol code.
> And then we would need an additional bit to
> specify whether the HSR header is there or not. Unless I open additional
> socket on the ethernet device just for sending and dropping everything
> incoming.
Right, packets that already have a header prefixed are written
directly to the intended slave.
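
Put together, the Tx split could look roughly like the sketch below
(Python for brevity). The device names hsr0, eth0 and eth1 and the helper
names are placeholders, and the check assumes untagged (no VLAN) frames.

```python
import struct

ETH_P_HSR = 0x892F  # EtherType of the HSR tag


def has_hsr_tag(frame):
    """True if the frame already carries an HSR tag after the MAC addresses."""
    return len(frame) >= 14 and struct.unpack_from("!H", frame, 12)[0] == ETH_P_HSR


def pick_tx_ifname(frame, port, master="hsr0", slaves=("eth0", "eth1")):
    """Choose the device to write a frame to.

    port is 0 for port A and 1 for port B. Frames that already have an HSR
    header (the forwarded SYNC case) bypass the master and go straight to
    the slave. Everything else is written to the master, which adds the
    header; the port then has to be conveyed by some other means.
    """
    if has_hsr_tag(frame):
        return slaves[port]
    return master
```
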
> And we would have to filter/ distinguish the RX port based on it.
> Userland has a cBPF filter to filter everything out and receive only PTP
> frames. If the PTP packet is forwarded to both sockets (A and B) then
> userland would have to throw one copy away and go to sleep again. This
> sort of breaks current linuxptp logic. It would probably require
> either eBPF to filter also so_mark or deal with "no packet despite the
> wakeup" but so far I tried minimal impact on both sides (kernel and
> user).
I don't fully follow this part. It discusses Rx again?