From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org,
"(JC), Jayachandran" <j-rameshbabu@ti.com>,
"David S. Miller" <davem@davemloft.net>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Chintan Vankar <c-vankar@ti.com>,
Danish Anwar <danishanwar@ti.com>, Daolin Qiu <d-qiu@ti.com>,
Eric Dumazet <edumazet@google.com>,
Felix Maurer <fmaurer@redhat.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Richard Cochran <richardcochran@gmail.com>,
Simon Horman <horms@kernel.org>
Subject: Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR
Date: Tue, 10 Mar 2026 11:55:44 +0100
Message-ID: <20260310105544.EVXIekwG@linutronix.de>
In-Reply-To: <willemdebruijn.kernel.3b7b5294c6434@gmail.com>
On 2026-03-09 21:38:33 [-0400], Willem de Bruijn wrote:
> The same point about adding per protocol state to sk_buff applies
> to a slightly lesser extent to PF_PACKET.
>
> Adding this much HSR + PTP specific code there is a non-starter.
It looked small to me, and it is hidden behind a static branch, so it is
just a nop as long as no one is using the socket bind. And without
CONFIG_HSR there is not even that.
> I should have said this in v1. This likely makes my skb_extensions
> suggestion a non-starter sorry.
I think I need something to share state between the two layers, so
skb_extensions wasn't that bad.
> We need to find a different way to
>
> Rx: get the port info from the slave device to userspace.
> Tx: send out the intended slave device.
>
> Let's separate the two challenges (and patches).
>
> On Rx, could your process just attach the PF_PACKET socket to the
> slave devices and filter on HSR PTP packets? Then separately drop
> these packets in hsr_handle_frame (as already done?) or TC ingress, so
> that they only arrive in userspace?
I could listen directly on eth0/eth1 with a PF_PACKET socket. That would
give me all I need including a timestamp, yes. I couldn't simply use it
for TX, but let's go on.
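A minimal sketch of the match such a per-port filter would have to do,
written as plain C rather than cBPF so it is easy to read. It assumes
the usual layout of a 14-byte Ethernet header followed by a 6-byte HSR
tag whose last field is the encapsulated EtherType. The function name
is_ptp_frame() is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_PTP 0x88f7u /* PTP over Ethernet, as named in this mail */
#define ETHERTYPE_HSR 0x892fu /* ETH_P_HSR in linux/if_ether.h */
#define ETH_HDR_LEN   14u     /* dst MAC + src MAC + EtherType */
#define HSR_TAG_LEN   6u      /* path/LSDU size, sequence nr, encap EtherType */

/* Read a big-endian 16-bit value from the frame. */
static uint16_t be16_at(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 1 for a PTP frame, with or without an HSR tag, else 0. */
int is_ptp_frame(const uint8_t *frame, size_t len)
{
	uint16_t proto;

	if (len < ETH_HDR_LEN)
		return 0;

	proto = be16_at(frame + 12);
	if (proto == ETHERTYPE_PTP)
		return 1;
	if (proto != ETHERTYPE_HSR || len < ETH_HDR_LEN + HSR_TAG_LEN)
		return 0;

	/* The encapsulated EtherType is the last field of the HSR tag. */
	return be16_at(frame + ETH_HDR_LEN + 4) == ETHERTYPE_PTP;
}
```

On a socket bound to eth0 or eth1 the receiving port is implied by the
socket itself, which is what makes this variant attractive for RX.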
> On Tx, can you share a bit more why there are two cases, one where the
> master has to add the header, but also one where it does not (so
> userspace has presumably inserted it).
PTP + HSR. Let's assume the following setup:
       ╭────────╮   ╭──────╮              ╭──────╮  ╭────────╮
   ╔═══│ Node X │═══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┅┅┤Port A│══│ Node Y │════╗
   ║   ╰────────╯   ╰──────╯              ╰──────╯  ╰────────╯    ║
   ║                                                              ║
   ║                                                              ║
╭──────╮            ╭──────╮  ╭────────╮  ╭──────╮            ╭──────╮
│Port B├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│══│ Node Z │══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│
╰──────╯            ╰──────╯  ╰────────╯  ╰──────╯            ╰──────╯
Node X has a direct connection to Y and Z, each node has two ports. You
could add more nodes but it always remains a ring.
Let's say node X sends a packet (say TCP/IP) with the destination MAC of
node Z assuming a "normal port 443" request. This packet gets an HSR
header prepended and is sent on X-A and X-B. This happens transparently
as hsr0 is the device with an IP address assigned and port A and B are
just two devices which are up with no IP address assigned. These are the
physical devices forwarding the traffic.
Y-A receives it, is not the target, forwards it over Y-B.
Z-B receives it, it is the target, sends to its master port which
removes the HSR header and the packet arrives in the IP stack. After the
master port, it forwards it also on Z-A.
Z-A receives it (the copy from Y-B) identifies it as a duplicate based
on the HSR-sequence number (does not inject into the master port) and
forwards it on Z-B.
At the end Node X receives two copies of the packet it sent and removes
them from the ring (node X was the sender identified by the SRC MAC and
does not forward it).
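The duplicate check in the walk-through above can be sketched as
follows. This is a deliberately simplified illustration, not the
kernel's implementation: it remembers only the last accepted sequence
number per source MAC. The names hsr_accept(), struct seen_table and
MAX_NODES are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 8 /* arbitrary table size for this sketch */

struct seen_entry {
	uint8_t  src[6];   /* source MAC of the sending node */
	uint16_t last_seq; /* last HSR sequence number accepted */
	int      used;
};

struct seen_table {
	struct seen_entry e[MAX_NODES];
};

/*
 * Returns 1 if the frame is new and may be passed to the master port,
 * 0 if it is a duplicate or the table has no room for a new sender.
 */
int hsr_accept(struct seen_table *t, const uint8_t src[6], uint16_t seq)
{
	struct seen_entry *free_slot = NULL;

	for (int i = 0; i < MAX_NODES; i++) {
		struct seen_entry *e = &t->e[i];

		if (!e->used) {
			if (!free_slot)
				free_slot = e;
			continue;
		}
		if (memcmp(e->src, src, 6) != 0)
			continue;
		/* Signed 16-bit difference copes with sequence wrap-around. */
		if ((int16_t)(seq - e->last_seq) <= 0)
			return 0;
		e->last_seq = seq;
		return 1;
	}
	if (!free_slot)
		return 0;
	memcpy(free_slot->src, src, 6);
	free_slot->last_seq = seq;
	free_slot->used = 1;
	return 1;
}
```

In the example above, Z would accept the copy arriving on Z-B and then
reject the copy arriving on Z-A because it carries the same source MAC
and sequence number.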
This is how HSR works in general. Now let's add PTP to this as specified.
The target MAC is always a multicast MAC and the ether type is PTP
0x88f7.
Use case 1: A PDELAY_REQ packet. This packet travels only between two
neighbours. That means X-A sends it to Y-A and Y must not forward it
over Y-B but needs to answer (send a PDELAY_RESP). These packets are
sent as PTP frames and the HSR stack needs to prepend an HSR header with
a valid sequence number. X-B gets its own request. Userland needs to
track time/state information on a per-port basis.
Use case 2: A SYNC packet. This packet is sent from X-A to Y-A. Again an
HSR header needs to be prepended by the stack on X. Y-A receives that
packet. It injects it into the master port where userland can consume
it. This is the same as the previous case.
Here comes the different part: This packet needs to be forwarded by Y
over Y-B. As in the previous case the HSR stack does not forward it on
its own but this part is done by userland. So userland sends a packet,
only on Y-B and this packet already contains the HSR header from X and
it needs to be preserved.
The forwarded SYNC packet got its timing information updated based on
the delay within the stack (so it is not identical as received).
That is why the HSR stack must not forward the packets on the other
port as it would normally do (it breaks the PTP time information), why
userland needs to know on which port the packet was received and why it
needs to send a packet only on one port with or without the HSR header.
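To make the two TX cases concrete, here is a small sketch of the
decision userland has to make per frame. It only covers the message
types mentioned above. The messageType values are the IEEE 1588 ones;
plan_ptp_tx(), struct tx_plan and enum hsr_port are hypothetical names
and do not correspond to an existing kernel or linuxptp interface.

```c
#include <stdint.h>

enum ptp_msg_type { /* messageType values from IEEE 1588 */
	PTP_SYNC        = 0x0,
	PTP_PDELAY_REQ  = 0x2,
	PTP_PDELAY_RESP = 0x3,
};

enum hsr_port { HSR_PORT_A, HSR_PORT_B };

struct tx_plan {
	enum hsr_port port;    /* the single port to transmit on */
	int stack_adds_header; /* 1: HSR stack prepends header + sequence nr */
};

static enum hsr_port other_port(enum hsr_port p)
{
	return p == HSR_PORT_A ? HSR_PORT_B : HSR_PORT_A;
}

/*
 * forwarded == 0: the frame originates on this node and goes out on 'port'.
 * forwarded == 1: the frame was received on 'port' and is being relayed.
 * Returns 0 and fills *out, or -1 if the frame must not be sent at all.
 */
int plan_ptp_tx(uint8_t msg_type, enum hsr_port port, int forwarded,
		struct tx_plan *out)
{
	if (!forwarded) {
		/* Use case 1, and a locally generated SYNC. */
		out->port = port;
		out->stack_adds_header = 1;
		return 0;
	}
	/* Only SYNC is relayed by userland in the cases described here. */
	if (msg_type != PTP_SYNC)
		return -1;
	/* Use case 2: keep the sender's HSR header, use the other port. */
	out->port = other_port(port);
	out->stack_adds_header = 0;
	return 0;
}
```

The point of the sketch is that both pieces of information, the port
and whether the header is already present, have to reach the kernel
somehow for every frame sent.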
> The second case is simpler: can just write directly the whole packet
> to the intended slave device.
Yes. This has been suggested and was indeed used in my v1 of linuxptp
but the problem was sending with the system's HSR header.
> For the first case, could skb->mark be used as port selector when
> writing from a packet socket to the master device? That already works
> with sock_cmsg_send.
We would have to specify that SO_MARK 1 and 2 denote the port on which
a packet is sent. This kind of burns the usage for everything else on
HSR so it feels misused. And then we would need an additional bit to
specify whether the HSR header is there or not. Unless I open an
additional socket on the ethernet device just for sending and dropping
everything incoming.
And we would have to filter/ distinguish the RX port based on it.
Userland has a cBPF filter to filter everything out and receive only PTP
frames. If the PTP packet is forwarded to both sockets (A and B) then
userland would have to throw one copy away and go to sleep again. This
sort of breaks the current linuxptp logic. It would probably require
either eBPF to also filter on so_mark or dealing with "no packet despite
the wakeup", but so far I have tried to keep the impact minimal on both
sides (kernel and user).
Sebastian