* [PATCH RFC net-next v2 0/2] hsr: Add additional info to send/ receive skbs
@ 2026-03-09 15:52 Sebastian Andrzej Siewior
2026-03-09 15:52 ` [PATCH RFC net-next v2 1/2] hsr: Allow to send a specific port and with HSR header Sebastian Andrzej Siewior
2026-03-09 15:52 ` [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR Sebastian Andrzej Siewior
0 siblings, 2 replies; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-03-09 15:52 UTC (permalink / raw)
To: netdev
Cc: Sebastian Andrzej Siewior, (JC), Jayachandran, David S. Miller,
Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu,
Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni,
Richard Cochran, Simon Horman, Willem de Bruijn
I am trying to extend linuxptp to support PTP over a HSR network.
This is the kernel side of the changes. In short PTP over HSR sends its
packets to a multicast address and every node needs to forward the PTP
packet (SYNC and FOLLOW-UP for instance) within the HSR ring.
In order to achieve this, the HSR stack must not duplicate and forward
the PTP packets as it would other packets. The delay caused by the
duplication and forwarding adds overhead which in turn makes the timing
information within the PTP packet inaccurate.
My current approach is to open the hsr0 device from userland and send
and receive the PTP packets individually on both slave interfaces. I
added additional hints to the af_packet interface, which is used by
linuxptp, to be able send a packet on a specific interface (A or B) and
also to have the information recorded for received packets.
Additionally, the PTP-timestamps which arrive on the slave interface are
forwarded to the hsr interface.
This solves the following:
- Forwarding a SYNC and a FOLLOW_UP packet.
Userland receives a SYNC, FOLLOW_UP and timestamp packet from port A.
The received port is known and recorded. The SYNC packet and the
updated FOLLOW_UP packet can then be forwarded on port B. The update
is based on the time spent during the forward so the forwarding delay
is accounted for.
In this scenario the HSR header is used from the original sender of
the SYNC and FOLLOW_UP header and the stack must not prepand the
system's HSR header nor replace the provided header.
- Sending PDELAY_REQ and PDELAY_RESP*
These packets are only exchanged between the two nodes directly
connected and they must not be forwarded within the HSR ring.
These packets are sent as PTP packets and the HSR stack must prepand
system's HSR header including a sequence number.
This logic is also used for SYNC packet if the node acts as a GM.
This has been tested in a pure software environment and in an HW-assited
environment where the HW is able to duplicate and deduplicate packets
but does not do it for PTP packets.
It has not been tested within an enviroment where the HW is able to
forward the PTP packet and correctly update the timming information.
---
v1…v2: https://patch.msgid.link/20260204-hsr_ptp-v1-0-b421c69a77da@linutronix.de
- Added PRP support
- skb extention is used instead of extending struct skb_shared_info
- in af_packet
- packet_sendmsg_spkt() is no longer extended
- jump labels are used to avoid the overhead if there no socket that
is using this HSR extension.
---
Sebastian Andrzej Siewior (2):
hsr: Allow to send a specific port and with HSR header
af_packet: Add port specific handling for HSR
include/linux/if_hsr.h | 81 ++++++++++++++++++++
include/linux/skbuff.h | 3 +
include/uapi/linux/if_packet.h | 9 +++
net/core/skbuff.c | 4 +
net/hsr/Kconfig | 1 +
net/hsr/hsr_device.c | 7 ++
net/hsr/hsr_forward.c | 118 ++++++++++++++++++++++++++---
net/hsr/hsr_framereg.h | 1 +
net/hsr/hsr_slave.c | 35 +++++++--
net/packet/af_packet.c | 167 +++++++++++++++++++++++++++++++++++++++++
net/packet/internal.h | 1 +
11 files changed, 412 insertions(+), 15 deletions(-)
---
base-commit: 3daa4f5dc6cc1ac1ab2f95b5b4c16bc5fb87f48f
change-id: 20260204-hsr_ptp-1f6380f1d35f
Best regards,
--
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
^ permalink raw reply [flat|nested] 16+ messages in thread* [PATCH RFC net-next v2 1/2] hsr: Allow to send a specific port and with HSR header 2026-03-09 15:52 [PATCH RFC net-next v2 0/2] hsr: Add additional info to send/ receive skbs Sebastian Andrzej Siewior @ 2026-03-09 15:52 ` Sebastian Andrzej Siewior 2026-03-09 15:52 ` [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR Sebastian Andrzej Siewior 1 sibling, 0 replies; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-09 15:52 UTC (permalink / raw) To: netdev Cc: Sebastian Andrzej Siewior, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman, Willem de Bruijn HSR forwards all packets it received on slave port 1 to slave port 2 and one of the two copies to the user (master) interface. In terms of PTP this is not good because the latency introduced by forwarding makes the timestamp in the PTP packet inaccurate. The PTP packets should not forwarded like regular packets. This is addressed by adding a skb extension to received PTP packet (over a slave interface). Those packets won't be forwarded on the other port slave port. Instead both copies with the HSR header will be injected into the master port so userspace can consume it. For the TX side (received over master port) the same facility is used to send the packet over the specified slave port and either use the provided HSR header or generate one by the stack. The added skb extension is struct hsr_ptp_ext. Receive (slave ports): - The 'port' member is set to HSR_PT_SLAVE_A/ HSR_PT_SLAVE_B to denote the port which received the packet. The 'header' member is always false. This information is only added to PTP packets. - The received packet is injected into the master port with its HSR header. Both copies are injected, duplicates are not removed. Send (master port): - The 'port' member is set to HSR_PT_SLAVE_A/ HSR_PT_SLAVE_B to denote the port on which the packet has to be sent. - The 'header' can be set to true to denote that the packet already contains a HSR header and the stack must not add the system's header to it. Otherwise (if false) the stack will add a HSR header. If a header is already provided then some checks/ actions are skipped such MAC-assignment or node status update. - Cloning a TX skb requires to assign socket information and tx_flags/ tskey to associate the timestamp with the skb. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- include/linux/if_hsr.h | 81 +++++++++++++++++++++++++++++++++ include/linux/skbuff.h | 3 ++ net/core/skbuff.c | 4 ++ net/hsr/Kconfig | 1 + net/hsr/hsr_device.c | 7 +++ net/hsr/hsr_forward.c | 118 ++++++++++++++++++++++++++++++++++++++++++++----- net/hsr/hsr_framereg.h | 1 + net/hsr/hsr_slave.c | 35 ++++++++++++--- 8 files changed, 235 insertions(+), 15 deletions(-) diff --git a/include/linux/if_hsr.h b/include/linux/if_hsr.h index f4cf2dd36d193..33fc11f895d3f 100644 --- a/include/linux/if_hsr.h +++ b/include/linux/if_hsr.h @@ -22,6 +22,11 @@ enum hsr_port_type { HSR_PT_PORTS, /* This must be the last item in the enum */ }; +struct hsr_ptp_ext { + u8 port; + u8 header; +}; + /* HSR Tag. * As defined in IEC-62439-3:2010, the HSR tag is really { ethertype = 0x88FB, * path, LSDU_size, sequence Nr }. But we let eth_header() create { h_dest, @@ -45,6 +50,60 @@ struct net_device *hsr_get_port_ndev(struct net_device *ndev, enum hsr_port_type pt); int hsr_get_port_type(struct net_device *hsr_dev, struct net_device *dev, enum hsr_port_type *type); + +static inline bool hsr_skb_has_header(struct sk_buff *skb) +{ + struct hsr_ptp_ext *ptp_ext; + + ptp_ext = skb_ext_find(skb, SKB_EXT_HSR); + if (!ptp_ext) + return false; + return ptp_ext->header; +} + +static inline unsigned int hsr_skb_has_port(struct sk_buff *skb) +{ + struct hsr_ptp_ext *ptp_ext; + + if (!skb) + return 0; + + ptp_ext = skb_ext_find(skb, SKB_EXT_HSR); + if (!ptp_ext) + return 0; + return ptp_ext->port; +} + +static inline bool hsr_skb_get_header_port(struct sk_buff *skb, bool *header, + enum hsr_port_type *port_type) +{ + struct hsr_ptp_ext *ptp_ext; + + *port_type = HSR_PT_NONE; + *header = false; + + ptp_ext = skb_ext_find(skb, SKB_EXT_HSR); + if (!ptp_ext) + return false; + + *port_type = ptp_ext->port; + *header = ptp_ext->header; + return true; +} + +static inline bool hsr_skb_add_header_port(struct sk_buff *skb, bool header, + enum hsr_port_type port) +{ + struct hsr_ptp_ext *ptp_ext; + + ptp_ext = skb_ext_add(skb, SKB_EXT_HSR); + if (!ptp_ext) + return false; + ptp_ext->port = port; + ptp_ext->header = header; + return true; +} + #else static inline bool is_hsr_master(struct net_device *dev) { @@ -68,6 +127,28 @@ static inline int hsr_get_port_type(struct net_device *hsr_dev, { return -EINVAL; } + +static inline bool hsr_skb_has_header(struct sk_buff *skb) +{ + return false; +} + +static inline unsigned int hsr_skb_has_port(struct sk_buff *skb) +{ + return 0; +} + +static inline bool hsr_skb_get_header_port(struct sk_buff *skb, bool *header, + enum hsr_port_type *port_type) +{ + return false; +} + +static inline bool hsr_skb_add_header_port(struct sk_buff *skb, bool header, + enum hsr_port_type port) +{ + return true; +} #endif /* CONFIG_HSR */ #endif /*_LINUX_IF_HSR_H_*/ diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index daa4e4944ce3f..75ddf7f41b8ee 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -5004,6 +5004,9 @@ enum skb_ext_id { #endif #if IS_ENABLED(CONFIG_CAN) SKB_EXT_CAN, +#endif +#if IS_ENABLED(CONFIG_HSR) + SKB_EXT_HSR, #endif SKB_EXT_NUM, /* must be last */ }; diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 0e217041958a8..d1de7c9d02639 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -60,6 +60,7 @@ #include <linux/errqueue.h> #include <linux/prefetch.h> #include <linux/bitfield.h> +#include <linux/if_hsr.h> #include <linux/if_vlan.h> #include <linux/mpls.h> #include <linux/kcov.h> @@ -5143,6 +5144,9 @@ static const u8 skb_ext_type_len[] = { #if IS_ENABLED(CONFIG_CAN) [SKB_EXT_CAN] = SKB_EXT_CHUNKSIZEOF(struct can_skb_ext), #endif +#if IS_ENABLED(CONFIG_HSR) + [SKB_EXT_HSR] = SKB_EXT_CHUNKSIZEOF(struct hsr_ptp_ext), +#endif }; static __always_inline unsigned int skb_ext_total_length(void) diff --git a/net/hsr/Kconfig b/net/hsr/Kconfig index fcacdf4f0ffc3..973f8b03aeb11 100644 --- a/net/hsr/Kconfig +++ b/net/hsr/Kconfig @@ -5,6 +5,7 @@ config HSR tristate "High-availability Seamless Redundancy (HSR & PRP)" + select SKB_EXTENSIONS help This enables IEC 62439 defined High-availability Seamless Redundancy (HSR) and Parallel Redundancy Protocol (PRP). diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index d1bfc49b5f017..f755f3c85d89d 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -230,6 +230,13 @@ static netdev_tx_t hsr_dev_xmit(struct sk_buff *skb, struct net_device *dev) master = hsr_port_get_hsr(hsr, HSR_PT_MASTER); if (master) { skb->dev = master->dev; + + /* Only HSR (not PRP) skb may include a header. The network + * header is only set to the ethernet header. + */ + if (hsr_skb_has_header(skb)) + skb_set_network_header(skb, ETH_HLEN + HSR_HLEN); + skb_reset_mac_header(skb); skb_reset_mac_len(skb); spin_lock_bh(&hsr->seqnr_lock); diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index aefc9b6936ba0..6fb83a36a44d8 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -12,11 +12,25 @@ #include <linux/skbuff.h> #include <linux/etherdevice.h> #include <linux/if_vlan.h> +#include <net/sock.h> #include "hsr_main.h" #include "hsr_framereg.h" struct hsr_node; +static void hsr_parse_req_master(struct hsr_frame_info *frame, + unsigned int *port, + bool *header) +{ + *port = HSR_PT_NONE; + *header = false; + + if (!frame->skb_std) + return; + + hsr_skb_get_header_port(frame->skb_std, header, port); +} + /* The uses I can see for these HSR supervision frames are: * 1) Use the frames that are sent after node initialization ("HSR_TLV.Type = * 22") to reset any sequence_nr counters belonging to that node. Useful if @@ -343,7 +357,10 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame, hsr_set_path_id(frame, hsr_ethhdr, port); return skb_clone(frame->skb_hsr, GFP_ATOMIC); } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { - return skb_clone(frame->skb_std, GFP_ATOMIC); + skb = skb_clone(frame->skb_std, GFP_ATOMIC); + if (hsr_skb_has_port(skb)) + skb_set_owner_w(skb, frame->skb_std->sk); + return skb; } /* Create the new skb with enough headroom to fit the HSR tag */ @@ -365,6 +382,15 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame, memmove(dst, src, movelen); skb_reset_mac_header(skb); + if (hsr_skb_has_port(skb)) { + /* Packets are bound to a port and the sender may expect time + * information. + */ + skb_shinfo(skb)->tx_flags = skb_shinfo(frame->skb_std)->tx_flags; + skb_shinfo(skb)->tskey = skb_shinfo(frame->skb_std)->tskey; + skb_set_owner_w(skb, frame->skb_std->sk); + } + /* skb_put_padto free skb on error and hsr_fill_tag returns NULL in * that case */ @@ -387,12 +413,23 @@ struct sk_buff *prp_create_tagged_frame(struct hsr_frame_info *frame, } return skb_clone(frame->skb_prp, GFP_ATOMIC); } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { - return skb_clone(frame->skb_std, GFP_ATOMIC); + skb = skb_clone(frame->skb_std, GFP_ATOMIC); + if (hsr_skb_has_port(skb)) + skb_set_owner_w(skb, frame->skb_std->sk); + return skb; } skb = skb_copy_expand(frame->skb_std, skb_headroom(frame->skb_std), skb_tailroom(frame->skb_std) + HSR_HLEN, GFP_ATOMIC); + if (hsr_skb_has_port(skb)) { + /* Packets are bound to a port and the sender may expect time + * information. + */ + skb_shinfo(skb)->tx_flags = skb_shinfo(frame->skb_std)->tx_flags; + skb_shinfo(skb)->tskey = skb_shinfo(frame->skb_std)->tskey; + skb_set_owner_w(skb, frame->skb_std->sk); + } return prp_fill_rct(skb, frame, port); } @@ -420,7 +457,7 @@ static void hsr_deliver_master(struct sk_buff *skb, struct net_device *dev, static int hsr_xmit(struct sk_buff *skb, struct hsr_port *port, struct hsr_frame_info *frame) { - if (frame->port_rcv->type == HSR_PT_MASTER) { + if (frame->port_rcv->type == HSR_PT_MASTER && !frame->has_foreign_header) { hsr_addr_subst_dest(frame->node_src, skb, port); /* Address substitution (IEC62439-3 pp 26, 50): replace mac @@ -518,12 +555,18 @@ bool hsr_drop_frame(struct hsr_frame_info *frame, struct hsr_port *port) */ static void hsr_forward_do(struct hsr_frame_info *frame) { + unsigned int req_tx_port; + bool req_tx_keep_header; struct hsr_port *port; - struct sk_buff *skb; bool sent = false; + hsr_parse_req_master(frame, &req_tx_port, &req_tx_keep_header); + hsr_for_each_port(frame->port_rcv->hsr, port) { struct hsr_priv *hsr = port->hsr; + unsigned int skb_rx_port = 0; + struct sk_buff *skb = NULL; + /* Don't send frame back the way it came */ if (port == frame->port_rcv) continue; @@ -542,6 +585,38 @@ static void hsr_forward_do(struct hsr_frame_info *frame) if ((port->dev->features & NETIF_F_HW_HSR_DUP) && sent) continue; + /* RX PTP packets have the received port recorded */ + if (frame->skb_hsr) + skb = frame->skb_hsr; + else if (frame->skb_prp) + skb = frame->skb_prp; + + skb_rx_port = hsr_skb_has_port(skb); + if (skb_rx_port) { + /* No PTP forwarding */ + if (port->type != HSR_PT_MASTER) + continue; + + skb = skb_clone(skb, GFP_ATOMIC); + /* Inject the PTP packet into the master interface + * with HSR headers. + */ + goto inject_into_stack; + } + + /* PTP TX packets have an outgoing port specified */ + if (req_tx_port != HSR_PT_NONE && req_tx_port != port->type) + continue; + /* PTP TX packets may already have a HSR header which needs to + * be preserved + */ + if (req_tx_keep_header) { + skb = skb_clone(frame->skb_std, GFP_ATOMIC); + if (skb) + skb_set_owner_w(skb, frame->skb_std->sk); + goto inject_into_stack; + } + /* Don't send frame over port where it has been sent before. * Also for SAN, this shouldn't be done. */ @@ -569,6 +644,7 @@ static void hsr_forward_do(struct hsr_frame_info *frame) else skb = hsr->proto_ops->get_untagged_frame(frame, port); +inject_into_stack: if (!skb) { frame->port_rcv->dev->stats.rx_dropped++; continue; @@ -633,6 +709,13 @@ int hsr_fill_frame_info(__be16 proto, struct sk_buff *skb, struct hsr_port *port = frame->port_rcv; struct hsr_priv *hsr = port->hsr; + if (frame->has_foreign_header) { + frame->skb_std = skb; + + WARN_ON_ONCE(port->type != HSR_PT_MASTER); + WARN_ON_ONCE(skb->mac_len < sizeof(struct hsr_ethhdr)); + return 0; + } /* HSRv0 supervisory frames double as a tag so treat them as tagged. */ if ((!hsr->prot_version && proto == htons(ETH_P_PRP)) || proto == htons(ETH_P_HSR)) { @@ -658,7 +741,16 @@ int prp_fill_frame_info(__be16 proto, struct sk_buff *skb, struct hsr_frame_info *frame) { /* Supervision frame */ - struct prp_rct *rct = skb_get_PRP_rct(skb); + struct prp_rct *rct; + + if (frame->has_foreign_header) { + struct hsr_port *port = frame->port_rcv; + + frame->skb_std = skb; + WARN_ON_ONCE(port->type != HSR_PT_MASTER); + return 0; + } + rct = skb_get_PRP_rct(skb); if (rct && prp_check_lsdu_size(skb, rct, frame->is_supervision)) { @@ -697,10 +789,15 @@ static int fill_frame_info(struct hsr_frame_info *frame, if (port->type == HSR_PT_INTERLINK) n_db = &hsr->proxy_node_db; - frame->node_src = hsr_get_node(port, n_db, skb, - frame->is_supervision, port->type); - if (!frame->node_src) - return -1; /* Unknown node and !is_supervision, or no mem */ + if (hsr_skb_has_header(skb)) + frame->has_foreign_header = true; + + if (!frame->has_foreign_header) { + frame->node_src = hsr_get_node(port, n_db, skb, + frame->is_supervision, port->type); + if (!frame->node_src) + return -1; /* Unknown node and !is_supervision, or no mem */ + } ethhdr = (struct ethhdr *)skb_mac_header(skb); frame->is_vlan = false; @@ -739,7 +836,8 @@ void hsr_forward_skb(struct sk_buff *skb, struct hsr_port *port) if (fill_frame_info(&frame, skb, port) < 0) goto out_drop; - hsr_register_frame_in(frame.node_src, port, frame.sequence_nr); + if (!frame.has_foreign_header) + hsr_register_frame_in(frame.node_src, port, frame.sequence_nr); hsr_forward_do(&frame); rcu_read_unlock(); /* Gets called for ingress frames as well as egress from master port. diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h index c65ecb9257348..fc0341b158f67 100644 --- a/net/hsr/hsr_framereg.h +++ b/net/hsr/hsr_framereg.h @@ -27,6 +27,7 @@ struct hsr_frame_info { bool is_local_dest; bool is_local_exclusive; bool is_from_san; + bool has_foreign_header; }; void hsr_del_self_node(struct hsr_priv *hsr); diff --git a/net/hsr/hsr_slave.c b/net/hsr/hsr_slave.c index 44f83c8c56a79..cd2092527c780 100644 --- a/net/hsr/hsr_slave.c +++ b/net/hsr/hsr_slave.c @@ -44,8 +44,7 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) if (hsr_addr_is_self(port->hsr, eth_hdr(skb)->h_source)) { /* Directly kill frames sent by ourselves */ - kfree_skb(skb); - goto finish_consume; + goto finish_free_consume; } /* For HSR, only tagged frames are expected (unless the device offloads @@ -65,8 +64,7 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) if ((!hsr->prot_version && protocol == htons(ETH_P_PRP)) || protocol == htons(ETH_P_HSR)) { if (!pskb_may_pull(skb, ETH_HLEN + HSR_HLEN)) { - kfree_skb(skb); - goto finish_consume; + goto finish_free_consume; } skb_set_network_header(skb, ETH_HLEN + HSR_HLEN); @@ -81,10 +79,37 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) hsr_forward_skb(skb, port); spin_unlock_bh(&hsr->seqnr_lock); } else { + struct hsr_ethhdr *hsr_ethhdr; + + /* PTP packets are not supposed to be forwarded via HSR/ PRP + * as-is. The latency introduced by forwarding renders + * the time information useless. + * Instead attach the port information on which it was + * received, forward both copies to userland and let it deal + * with it. + */ + if ((!hsr->prot_version && protocol == htons(ETH_P_PRP)) || + protocol == htons(ETH_P_HSR)) { + /* HSR */ + hsr_ethhdr = (struct hsr_ethhdr *)skb_mac_header(skb); + if (hsr_ethhdr->hsr_tag.encap_proto == htons(ETH_P_1588)) { + if (!hsr_skb_add_header_port(skb, false, port->type)) + goto finish_free_consume; + } + } else { + if (protocol == htons(ETH_P_1588)) { + if (!hsr_skb_add_header_port(skb, false, port->type)) + goto finish_free_consume; + } + } + hsr_forward_skb(skb, port); } -finish_consume: + return RX_HANDLER_CONSUMED; + +finish_free_consume: + kfree_skb(skb); return RX_HANDLER_CONSUMED; finish_pass: -- 2.53.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-09 15:52 [PATCH RFC net-next v2 0/2] hsr: Add additional info to send/ receive skbs Sebastian Andrzej Siewior 2026-03-09 15:52 ` [PATCH RFC net-next v2 1/2] hsr: Allow to send a specific port and with HSR header Sebastian Andrzej Siewior @ 2026-03-09 15:52 ` Sebastian Andrzej Siewior 2026-03-10 1:38 ` Willem de Bruijn 1 sibling, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-09 15:52 UTC (permalink / raw) To: netdev Cc: Sebastian Andrzej Siewior, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman, Willem de Bruijn linuxptp/ ptp4l uses a AF_PACKET with a RAW socket to send and receive PTP packets. Extend the interface with the ability to bind the socket to one of the two HSR ports and a flag for sendmsg() to indicate that the packet already contains a HSR header. Once PACKET_HSR_BIND_PORT is set, the socket will be bound to requested slave port. All incoming packets without a set port will be discarded. This limits receiving packet to PTP only packets since only those get the tagging by the HSR stack. For the control message used by sendmsg(), type PACKET_HSR_INFO is added with PACKET_HSR_INFO_HAS_HDR as the only option. If this is used, the skb will be known as containing the HSR header. This option can only be used if the socket is bound to a specific HSR port. To lower the impact for non-PTP-HSR/PRP user, a static branch is used to to avoid a compare & branch if the socket was not bound to a hsr-slave device. If CONFIG_HSR is not set, the compiler will be smart enough to remove the code entirely (keeping only PACKET_HSR_BIND_PORT for getsockopt()). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- include/uapi/linux/if_packet.h | 9 +++ net/packet/af_packet.c | 167 +++++++++++++++++++++++++++++++++++++++++ net/packet/internal.h | 1 + 3 files changed, 177 insertions(+) diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h index 6cd1d7a41dfb7..3443eeac8470e 100644 --- a/include/uapi/linux/if_packet.h +++ b/include/uapi/linux/if_packet.h @@ -60,6 +60,7 @@ struct sockaddr_ll { #define PACKET_FANOUT_DATA 22 #define PACKET_IGNORE_OUTGOING 23 #define PACKET_VNET_HDR_SZ 24 +#define PACKET_HSR_BIND_PORT 25 #define PACKET_FANOUT_HASH 0 #define PACKET_FANOUT_LB 1 @@ -74,6 +75,14 @@ struct sockaddr_ll { #define PACKET_FANOUT_FLAG_IGNORE_OUTGOING 0x4000 #define PACKET_FANOUT_FLAG_DEFRAG 0x8000 +/* For HSR, bind port */ +#define PACKET_HSR_BIND_PORT_AB 0 +#define PACKET_HSR_BIND_PORT_A 1 +#define PACKET_HSR_BIND_PORT_B 2 +/* HSR, CMSG */ +#define PACKET_HSR_INFO 1 +#define PACKET_HSR_INFO_HAS_HDR 1 + struct tpacket_stats { unsigned int tp_packets; unsigned int tp_drops; diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 72d0935139f0f..98322ad57a234 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -82,6 +82,7 @@ #include <linux/module.h> #include <linux/init.h> #include <linux/mutex.h> +#include <linux/if_hsr.h> #include <linux/if_vlan.h> #include <linux/virtio_net.h> #include <linux/errqueue.h> @@ -284,6 +285,94 @@ static int packet_xmit(const struct packet_sock *po, struct sk_buff *skb) return dev_direct_xmit(skb, packet_pick_tx_queue(skb)); } +#if IS_ENABLED(CONFIG_HSR) + +static DEFINE_STATIC_KEY_FALSE(hsr_ptp_support_enabled); + +static inline bool packet_add_ptp_settings(struct sk_buff *skb, + bool header, unsigned int port) +{ + if (!static_branch_unlikely(&hsr_ptp_support_enabled)) + return true; + + if (!port) + return true; + + return hsr_skb_add_header_port(skb, header, port); +} + +static inline bool packet_filter_hsr_port(struct packet_sock *po, struct sk_buff *skb) +{ + unsigned int req_port; + bool req_hdr; + + if (!static_branch_unlikely(&hsr_ptp_support_enabled)) + return false; + + if (!po->hsr_bound_port) + return false; + + if (!hsr_skb_get_header_port(skb, &req_hdr, &req_port)) + return true; + + if (req_port == po->hsr_bound_port) + return false; + return true; +} + +static int packet_cmsg_send(struct msghdr *msg, struct packet_sock *po, + bool *hsr_has_hdr) +{ + struct cmsghdr *cmsg; + int ret = -EINVAL; + u32 val; + + if (!static_branch_unlikely(&hsr_ptp_support_enabled)) + return 0; + + for_each_cmsghdr(cmsg, msg) { + if (!CMSG_OK(msg, cmsg)) + goto out; + if (cmsg->cmsg_level != SOL_PACKET) + continue; + if (cmsg->cmsg_type != PACKET_HSR_INFO) + continue; + if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32))) + goto out; + + val = *(u32 *)CMSG_DATA(cmsg); + if (val != PACKET_HSR_INFO_HAS_HDR) + goto out; + if (!po->hsr_bound_port) + goto out; + + *hsr_has_hdr = true; + } + ret = 0; +out: + return ret; +} + +#else + +static bool packet_add_ptp_settings(struct sk_buff *skb, + bool header, unsigned int port) +{ + return true; +} + +static bool packet_filter_hsr_port(struct packet_sock *po, struct sk_buff *skb) +{ + return false; +} + +static int packet_cmsg_send(struct msghdr *msg, struct packet_sock *po, + bool *hsr_has_hdr) +{ + return 0; +} +#endif + static struct net_device *packet_cached_dev_get(struct packet_sock *po) { struct net_device *dev; @@ -2131,6 +2220,9 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev, if (!net_eq(dev_net(dev), sock_net(sk))) goto drop; + if (packet_filter_hsr_port(po, skb)) + goto drop; + skb->dev = dev; if (dev_has_header(dev)) { @@ -2260,6 +2352,9 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, if (!net_eq(dev_net(dev), sock_net(sk))) goto drop; + if (packet_filter_hsr_port(po, skb)) + goto drop; + if (dev_has_header(dev)) { if (sk->sk_type != SOCK_DGRAM) skb_push(skb, skb->data - skb_mac_header(skb)); @@ -2731,6 +2826,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) int len_sum = 0; int status = TP_STATUS_AVAILABLE; int hlen, tlen, copylen = 0; + bool hsr_has_hdr = false; long timeo; mutex_lock(&po->pg_vec_lock); @@ -2775,6 +2871,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) err = sock_cmsg_send(&po->sk, msg, &sockc); if (unlikely(err)) goto out_put; + + err = packet_cmsg_send(msg, po, &hsr_has_hdr); + if (unlikely(err)) + goto out_put; } if (po->sk.sk_socket->type == SOCK_RAW) @@ -2863,6 +2963,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) goto out_status; } } + if (!packet_add_ptp_settings(skb, hsr_has_hdr, po->hsr_bound_port)) { + err = -ENOMEM; + goto out_status; + } if (vnet_hdr_sz) { if (virtio_net_hdr_to_skb(skb, vnet_hdr, vio_le())) { @@ -2950,6 +3054,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) int offset = 0; struct packet_sock *po = pkt_sk(sk); int vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz); + bool hsr_has_hdr = false; int hlen, tlen, linear; int extra_len = 0; @@ -2988,6 +3093,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) err = sock_cmsg_send(sk, msg, &sockc); if (unlikely(err)) goto out_unlock; + + err = packet_cmsg_send(msg, po, &hsr_has_hdr); + if (unlikely(err)) + goto out_unlock; } if (sock->type == SOCK_RAW) @@ -3047,6 +3156,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) } skb_setup_tx_timestamp(skb, &sockc); + if (!packet_add_ptp_settings(skb, hsr_has_hdr, po->hsr_bound_port)) { + err = -ENOMEM; + goto out_free; + } if (!vnet_hdr.gso_type && (len > dev->mtu + reserve + extra_len) && !packet_extra_vlan_len_allowed(dev, skb)) { @@ -3166,6 +3279,11 @@ static int packet_release(struct socket *sock) fanout_release_data(f); kvfree(f); } + +#if IS_ENABLED(CONFIG_HSR) + if (po->hsr_bound_port) + static_branch_dec(&hsr_ptp_support_enabled); +#endif /* * Now the socket is dead. No more input will appear. */ @@ -4044,6 +4162,40 @@ packet_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, packet_sock_flag_set(po, PACKET_SOCK_QDISC_BYPASS, val); return 0; } + case PACKET_HSR_BIND_PORT: + { +#if IS_ENABLED(CONFIG_HSR) + int old_val; + int val; + + if (optlen != sizeof(val)) + return -EINVAL; + if (copy_from_sockptr(&val, optval, sizeof(val))) + return -EFAULT; + + old_val = !!po->hsr_bound_port; + switch (val) { + case PACKET_HSR_BIND_PORT_AB: + po->hsr_bound_port = 0; + break; + case PACKET_HSR_BIND_PORT_A: + po->hsr_bound_port = HSR_PT_SLAVE_A; + break; + case PACKET_HSR_BIND_PORT_B: + po->hsr_bound_port = HSR_PT_SLAVE_B; + break; + default: + return -EINVAL; + } + if (old_val != !!po->hsr_bound_port) { + if (po->hsr_bound_port) + static_branch_inc(&hsr_ptp_support_enabled); + else + static_branch_dec(&hsr_ptp_support_enabled); + } + return 0; +#endif + } default: return -ENOPROTOOPT; } @@ -4164,6 +4316,21 @@ static int packet_getsockopt(struct socket *sock, int level, int optname, case PACKET_QDISC_BYPASS: val = packet_sock_flag(po, PACKET_SOCK_QDISC_BYPASS); break; + case PACKET_HSR_BIND_PORT: + switch (po->hsr_bound_port) { + case 0: + val = PACKET_HSR_BIND_PORT_AB; + break; + case HSR_PT_SLAVE_A: + val = PACKET_HSR_BIND_PORT_A; + break; + case HSR_PT_SLAVE_B: + val = PACKET_HSR_BIND_PORT_B; + break; + default: + return -EINVAL; + } + break; default: return -ENOPROTOOPT; } diff --git a/net/packet/internal.h b/net/packet/internal.h index b76e645cd78d1..24d63275d432f 100644 --- a/net/packet/internal.h +++ b/net/packet/internal.h @@ -114,6 +114,7 @@ struct packet_sock { unsigned long flags; int ifindex; /* bound device */ u8 vnet_hdr_sz; + u8 hsr_bound_port; __be16 num; struct packet_rollover *rollover; struct packet_mclist *mclist; -- 2.53.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-09 15:52 ` [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR Sebastian Andrzej Siewior @ 2026-03-10 1:38 ` Willem de Bruijn 2026-03-10 10:55 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Willem de Bruijn @ 2026-03-10 1:38 UTC (permalink / raw) To: Sebastian Andrzej Siewior, netdev Cc: Sebastian Andrzej Siewior, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman, Willem de Bruijn Sebastian Andrzej Siewior wrote: > linuxptp/ ptp4l uses a AF_PACKET with a RAW socket to send and receive > PTP packets. Extend the interface with the ability to bind the socket to > one of the two HSR ports and a flag for sendmsg() to indicate that the > packet already contains a HSR header. The same point about adding per protocol state to sk_buff applies to a slightly lesser extent to PF_PACKET. Adding this much HSR + PTP specific code there is a non-starter. I should have said this in v1. This likely makes my skb_extensions suggestion a non-starter sorry. We need to find a different way to Rx: get the port info from the slave device to userspace. Tx: send out the intended slave device. Let's separate the two challenges (and patches). On Rx, could your process just attach the PF_PACKET socket to the slave devices and filter on HSR PTP packets? Then separately drop these packets in hsr_handle_frame (as already done?) or TC ingress, so that they only arrive in userspace? On Tx, can you share a bit more why there are two cases, one where the master has to add the header, but also one where it does not (so userspace has presumably inserted it). The second case is simpler: can just write directly the whole packet to the intended slave device. For the first case, could skb->mark be used as port selector when writing from a packet socket to the master device? That already works with sock_cmsg_send. > Once PACKET_HSR_BIND_PORT is set, the socket will be bound to requested > slave port. All incoming packets without a set port will be discarded. > This limits receiving packet to PTP only packets since only those get > the tagging by the HSR stack. > > For the control message used by sendmsg(), type PACKET_HSR_INFO is added > with PACKET_HSR_INFO_HAS_HDR as the only option. If this is used, the > skb will be known as containing the HSR header. This option can only be > used if the socket is bound to a specific HSR port. > > To lower the impact for non-PTP-HSR/PRP user, a static branch is used to > to avoid a compare & branch if the socket was not bound to a hsr-slave > device. If CONFIG_HSR is not set, the compiler will be smart enough to > remove the code entirely (keeping only PACKET_HSR_BIND_PORT for > getsockopt()). > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > include/uapi/linux/if_packet.h | 9 +++ > net/packet/af_packet.c | 167 +++++++++++++++++++++++++++++++++++++++++ > net/packet/internal.h | 1 + > 3 files changed, 177 insertions(+) > > diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h > index 6cd1d7a41dfb7..3443eeac8470e 100644 > --- a/include/uapi/linux/if_packet.h > +++ b/include/uapi/linux/if_packet.h > @@ -60,6 +60,7 @@ struct sockaddr_ll { > #define PACKET_FANOUT_DATA 22 > #define PACKET_IGNORE_OUTGOING 23 > #define PACKET_VNET_HDR_SZ 24 > +#define PACKET_HSR_BIND_PORT 25 > > #define PACKET_FANOUT_HASH 0 > #define PACKET_FANOUT_LB 1 > @@ -74,6 +75,14 @@ struct sockaddr_ll { > #define PACKET_FANOUT_FLAG_IGNORE_OUTGOING 0x4000 > #define PACKET_FANOUT_FLAG_DEFRAG 0x8000 > > +/* For HSR, bind port */ > +#define PACKET_HSR_BIND_PORT_AB 0 > +#define PACKET_HSR_BIND_PORT_A 1 > +#define PACKET_HSR_BIND_PORT_B 2 > +/* HSR, CMSG */ > +#define PACKET_HSR_INFO 1 > +#define PACKET_HSR_INFO_HAS_HDR 1 > + > struct tpacket_stats { > unsigned int tp_packets; > unsigned int tp_drops; > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c > index 72d0935139f0f..98322ad57a234 100644 > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -82,6 +82,7 @@ > #include <linux/module.h> > #include <linux/init.h> > #include <linux/mutex.h> > +#include <linux/if_hsr.h> > #include <linux/if_vlan.h> > #include <linux/virtio_net.h> > #include <linux/errqueue.h> > @@ -284,6 +285,94 @@ static int packet_xmit(const struct packet_sock *po, struct sk_buff *skb) > return dev_direct_xmit(skb, packet_pick_tx_queue(skb)); > } > > +#if IS_ENABLED(CONFIG_HSR) > + > +static DEFINE_STATIC_KEY_FALSE(hsr_ptp_support_enabled); > + > +static inline bool packet_add_ptp_settings(struct sk_buff *skb, > + bool header, unsigned int port) > +{ > + if (!static_branch_unlikely(&hsr_ptp_support_enabled)) > + return true; > + > + if (!port) > + return true; > + > + return hsr_skb_add_header_port(skb, header, port); > +} > + > +static inline bool packet_filter_hsr_port(struct packet_sock *po, struct sk_buff *skb) > +{ > + unsigned int req_port; > + bool req_hdr; > + > + if (!static_branch_unlikely(&hsr_ptp_support_enabled)) > + return false; > + > + if (!po->hsr_bound_port) > + return false; > + > + if (!hsr_skb_get_header_port(skb, &req_hdr, &req_port)) > + return true; > + > + if (req_port == po->hsr_bound_port) > + return false; > + return true; > +} > + > +static int packet_cmsg_send(struct msghdr *msg, struct packet_sock *po, > + bool *hsr_has_hdr) > +{ > + struct cmsghdr *cmsg; > + int ret = -EINVAL; > + u32 val; > + > + if (!static_branch_unlikely(&hsr_ptp_support_enabled)) > + return 0; > + > + for_each_cmsghdr(cmsg, msg) { > + if (!CMSG_OK(msg, cmsg)) > + goto out; > + if (cmsg->cmsg_level != SOL_PACKET) > + continue; > + if (cmsg->cmsg_type != PACKET_HSR_INFO) > + continue; > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32))) > + goto out; > + > + val = *(u32 *)CMSG_DATA(cmsg); > + if (val != PACKET_HSR_INFO_HAS_HDR) > + goto out; > + if (!po->hsr_bound_port) > + goto out; > + > + *hsr_has_hdr = true; > + } > + ret = 0; > +out: > + return ret; > +} > + > +#else > + > +static bool packet_add_ptp_settings(struct sk_buff *skb, > + bool header, unsigned int port) > +{ > + return true; > +} > + > +static bool packet_filter_hsr_port(struct packet_sock *po, struct sk_buff *skb) > +{ > + return false; > +} > + > +static int packet_cmsg_send(struct msghdr *msg, struct packet_sock *po, > + bool *hsr_has_hdr) > +{ > + return 0; > +} > +#endif > + > static struct net_device *packet_cached_dev_get(struct packet_sock *po) > { > struct net_device *dev; > @@ -2131,6 +2220,9 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev, > if (!net_eq(dev_net(dev), sock_net(sk))) > goto drop; > > + if (packet_filter_hsr_port(po, skb)) > + goto drop; > + > skb->dev = dev; > > if (dev_has_header(dev)) { > @@ -2260,6 +2352,9 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, > if (!net_eq(dev_net(dev), sock_net(sk))) > goto drop; > > + if (packet_filter_hsr_port(po, skb)) > + goto drop; > + > if (dev_has_header(dev)) { > if (sk->sk_type != SOCK_DGRAM) > skb_push(skb, skb->data - skb_mac_header(skb)); > @@ -2731,6 +2826,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) > int len_sum = 0; > int status = TP_STATUS_AVAILABLE; > int hlen, tlen, copylen = 0; > + bool hsr_has_hdr = false; > long timeo; > > mutex_lock(&po->pg_vec_lock); > @@ -2775,6 +2871,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) > err = sock_cmsg_send(&po->sk, msg, &sockc); > if (unlikely(err)) > goto out_put; > + > + err = packet_cmsg_send(msg, po, &hsr_has_hdr); > + if (unlikely(err)) > + goto out_put; > } > > if (po->sk.sk_socket->type == SOCK_RAW) > @@ -2863,6 +2963,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) > goto out_status; > } > } > + if (!packet_add_ptp_settings(skb, hsr_has_hdr, po->hsr_bound_port)) { > + err = -ENOMEM; > + goto out_status; > + } > > if (vnet_hdr_sz) { > if (virtio_net_hdr_to_skb(skb, vnet_hdr, vio_le())) { > @@ -2950,6 +3054,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) > int offset = 0; > struct packet_sock *po = pkt_sk(sk); > int vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz); > + bool hsr_has_hdr = false; > int hlen, tlen, linear; > int extra_len = 0; > > @@ -2988,6 +3093,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) > err = sock_cmsg_send(sk, msg, &sockc); > if (unlikely(err)) > goto out_unlock; > + > + err = packet_cmsg_send(msg, po, &hsr_has_hdr); > + if (unlikely(err)) > + goto out_unlock; > } > > if (sock->type == SOCK_RAW) > @@ -3047,6 +3156,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) > } > > skb_setup_tx_timestamp(skb, &sockc); > + if (!packet_add_ptp_settings(skb, hsr_has_hdr, po->hsr_bound_port)) { > + err = -ENOMEM; > + goto out_free; > + } > > if (!vnet_hdr.gso_type && (len > dev->mtu + reserve + extra_len) && > !packet_extra_vlan_len_allowed(dev, skb)) { > @@ -3166,6 +3279,11 @@ static int packet_release(struct socket *sock) > fanout_release_data(f); > kvfree(f); > } > + > +#if IS_ENABLED(CONFIG_HSR) > + if (po->hsr_bound_port) > + static_branch_dec(&hsr_ptp_support_enabled); > +#endif > /* > * Now the socket is dead. No more input will appear. > */ > @@ -4044,6 +4162,40 @@ packet_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, > packet_sock_flag_set(po, PACKET_SOCK_QDISC_BYPASS, val); > return 0; > } > + case PACKET_HSR_BIND_PORT: > + { > +#if IS_ENABLED(CONFIG_HSR) > + int old_val; > + int val; > + > + if (optlen != sizeof(val)) > + return -EINVAL; > + if (copy_from_sockptr(&val, optval, sizeof(val))) > + return -EFAULT; > + > + old_val = !!po->hsr_bound_port; > + switch (val) { > + case PACKET_HSR_BIND_PORT_AB: > + po->hsr_bound_port = 0; > + break; > + case PACKET_HSR_BIND_PORT_A: > + po->hsr_bound_port = HSR_PT_SLAVE_A; > + break; > + case PACKET_HSR_BIND_PORT_B: > + po->hsr_bound_port = HSR_PT_SLAVE_B; > + break; > + default: > + return -EINVAL; > + } > + if (old_val != !!po->hsr_bound_port) { > + if (po->hsr_bound_port) > + static_branch_inc(&hsr_ptp_support_enabled); > + else > + static_branch_dec(&hsr_ptp_support_enabled); > + } > + return 0; > +#endif > + } > default: > return -ENOPROTOOPT; > } > @@ -4164,6 +4316,21 @@ static int packet_getsockopt(struct socket *sock, int level, int optname, > case PACKET_QDISC_BYPASS: > val = packet_sock_flag(po, PACKET_SOCK_QDISC_BYPASS); > break; > + case PACKET_HSR_BIND_PORT: > + switch (po->hsr_bound_port) { > + case 0: > + val = PACKET_HSR_BIND_PORT_AB; > + break; > + case HSR_PT_SLAVE_A: > + val = PACKET_HSR_BIND_PORT_A; > + break; > + case HSR_PT_SLAVE_B: > + val = PACKET_HSR_BIND_PORT_B; > + break; > + default: > + return -EINVAL; > + } > + break; > default: > return -ENOPROTOOPT; > } > diff --git a/net/packet/internal.h b/net/packet/internal.h > index b76e645cd78d1..24d63275d432f 100644 > --- a/net/packet/internal.h > +++ b/net/packet/internal.h > @@ -114,6 +114,7 @@ struct packet_sock { > unsigned long flags; > int ifindex; /* bound device */ > u8 vnet_hdr_sz; > + u8 hsr_bound_port; > __be16 num; > struct packet_rollover *rollover; > struct packet_mclist *mclist; > > -- > 2.53.0 > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-10 1:38 ` Willem de Bruijn @ 2026-03-10 10:55 ` Sebastian Andrzej Siewior 2026-03-10 21:35 ` Willem de Bruijn 0 siblings, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-10 10:55 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-09 21:38:33 [-0400], Willem de Bruijn wrote: > The same point about adding per protocol state to sk_buff applies > to a slightly lesser extent to PF_PACKET. > > Adding this much HSR + PTP specific code there is a non-starter. It looked like a little and is hidden behind a static branch so it is just a nop as long as there is no one using the socket bind. And without CONFIG_HSR there is not even that. > I should have said this in v1. This likely makes my skb_extensions > suggestion a non-starter sorry. I need something to share between two layers I think so that skb_extensions wasn't that bad. > We need to find a different way to > > Rx: get the port info from the slave device to userspace. > Tx: send out the intended slave device. > > Let's separate the two challenges (and patches). > > On Rx, could your process just attach the PF_PACKET socket to the > slave devices and filter on HSR PTP packets? Then separately drop > these packets in hsr_handle_frame (as already done?) or TC ingress, so > that they only arrive in userspace? I could listen directly on eth0/ eth1 as a PF_PACKET. That would give me all I need including a timestamp, yes. I wouldn't just be able to use it for TX but lets go on. > On Tx, can you share a bit more why there are two cases, one where the > master has to add the header, but also one where it does not (so > userspace has presumably inserted it). PTP + HSR. Lets assume the following setup: ╭────────╮ ╭──────╮ ╭──────╮ ╭────────╮ ╔═══│ Node X │═══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┅┅┤Port A│══│ Node Y │════╗ ║ ╰────────╯ ╰──────╯ ╰──────╯ ╰────────╯ ║ ║ ║ ║ ║ ╭──────╮ ╭──────╮ ╭────────╮ ╭──────╮ ╭──────╮ │Port B├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│══│ Node Z │══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│ ╰──────╯ ╰──────╯ ╰────────╯ ╰──────╯ ╰──────╯ Node X has direct connection to Y and Z, each node has two ports. You could add more nodes but it always remains a ring. Lets say node X sends a packet (say TCP/IP) with the destination MAC of node Z assuming a "normal port 443" request. This packet gets a HSR header prepended and is sent on X-A and X-B. This happens transparently as hsr0 is the device with an IP address assigned and port A and B are just two device which are up with no IP address assigned. These are the physical devices forwarding the traffic. Y-A receives it, is not the target, forwards it over Y-B. Z-B receives it, it is the target, sends to its master port which removes the HSR header and the packet arrives in the IP stack. After the master port, it forwards it also on Z-A. Z-A receives it (the copy from Y-B) identifies it as a duplicate based on the HSR-sequence number (does not inject into the master port) and forwards it on Z-B. At the end Node X receives two copies of the packet it sent and removes them from the ring (node X was the sender identified by the SRC MAC and does not forward it). This is how HSR works in general. Now lets add PTP to this as specified. The target MAC is always a multicast MAC and the ether type is PTP 0x88f7. Use case 1: A PDELAY_REQ packet. This packet travels only between two neighbours. That means X-A sends it to Y-A and Y must not forward it over Y-B but needs to answer (send a PDELAY_RESP). These packets are sent as PTP frames and the HSR stack needs to prepend a HSR header with a valid sequence number. X-B gets its own request. Userland needs to track time/ state information on per port basis. Use case 2: A SYNC packet. This packet is sent from X-A to Y-A. Again a HSR header needs to be prepended by the stack on X. Y-A receives that packet. It injects it into the master port where user land can consume it. This is the same as the previous case. Here comes the different part: This packet needs to be forwarded by Y over Y-B. As in the previous case the HSR stack does not forward it on its own but this part is done by userland. So userland sends a packet, only on Y-B and this packet already contains the HSR header from X and it needs to be preserved. The forwarded SYNC packet got its timing information updated based on the delay within the stack (so it is not identical as received). That is why the HSR stack must not forwarded the packets on the other port as it would normally do (breaks PTP time information), why user land needs to know on which port the packet was received and why it needs to send a packet only on one port with or without the HSR header. > The second case is simpler: can just write directly the whole packet > to the intended slave device. Yes. This has been suggested and was indeed used in my v1 of linuxptp but the problem was sending with system's HSR header. > For the first case, could skb->mark be used as port selector when > writing from a packet socket to the master device? That already works > with sock_cmsg_send. We would have to specify that SO_MARK 1 and 2 denotes the port on which a packet is sent. This kind of burns the usage for everything else on HSR so it feels misused. And then we would need an additional bit to specify whether the HSR header is there or not. Unless I open additional socket on the ethernet device just for sending and dropping everything incoming. And we would have to filter/ distinguish the RX port based on it. Userland has a cBPF filter to filter everything out and receive only PTP frames. If the PTP packet is forwarded to both sockets (A and B) then userland would have to throw one copy away and go to sleep again. This sort of breaks currently linuxptp logic. It would probably require either eBPF to filter also so_mark or deal with "no packet despite the wakeup" but so far I tried minimal impact on both sides (kernel and user). Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-10 10:55 ` Sebastian Andrzej Siewior @ 2026-03-10 21:35 ` Willem de Bruijn 2026-03-12 15:42 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Willem de Bruijn @ 2026-03-10 21:35 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman Sebastian Andrzej Siewior wrote: > On 2026-03-09 21:38:33 [-0400], Willem de Bruijn wrote: > > The same point about adding per protocol state to sk_buff applies > > to a slightly lesser extent to PF_PACKET. > > > > Adding this much HSR + PTP specific code there is a non-starter. > > It looked like a little and is hidden behind a static branch so it is > just a nop as long as there is no one using the socket bind. And without > CONFIG_HSR there is not even that. > > > I should have said this in v1. This likely makes my skb_extensions > > suggestion a non-starter sorry. > > I need something to share between two layers I think so that > skb_extensions wasn't that bad. > > > We need to find a different way to > > > > Rx: get the port info from the slave device to userspace. > > Tx: send out the intended slave device. > > > > Let's separate the two challenges (and patches). > > > > On Rx, could your process just attach the PF_PACKET socket to the > > slave devices and filter on HSR PTP packets? Then separately drop > > these packets in hsr_handle_frame (as already done?) or TC ingress, so > > that they only arrive in userspace? > > I could listen directly on eth0/ eth1 as a PF_PACKET. That would give me > all I need including a timestamp, yes. I wouldn't just be able to use it > for TX but lets go on. Great > > On Tx, can you share a bit more why there are two cases, one where the > > master has to add the header, but also one where it does not (so > > userspace has presumably inserted it). > > PTP + HSR. Lets assume the following setup: > > ╭────────╮ ╭──────╮ ╭──────╮ ╭────────╮ > ╔═══│ Node X │═══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┅┅┤Port A│══│ Node Y │════╗ > ║ ╰────────╯ ╰──────╯ ╰──────╯ ╰────────╯ ║ > ║ ║ > ║ ║ > ╭──────╮ ╭──────╮ ╭────────╮ ╭──────╮ ╭──────╮ > │Port B├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│══│ Node Z │══│Port A├┅┅┅┅┅┅┅┅┅┅┅┅┤Port B│ > ╰──────╯ ╰──────╯ ╰────────╯ ╰──────╯ ╰──────╯ > > Node X has direct connection to Y and Z, each node has two ports. You > could add more nodes but it always remains a ring. > Lets say node X sends a packet (say TCP/IP) with the destination MAC of > node Z assuming a "normal port 443" request. This packet gets a HSR > header prepended and is sent on X-A and X-B. This happens transparently > as hsr0 is the device with an IP address assigned and port A and B are > just two device which are up with no IP address assigned. These are the > physical devices forwarding the traffic. > > Y-A receives it, is not the target, forwards it over Y-B. > Z-B receives it, it is the target, sends to its master port which > removes the HSR header and the packet arrives in the IP stack. After the > master port, it forwards it also on Z-A. > Z-A receives it (the copy from Y-B) identifies it as a duplicate based > on the HSR-sequence number (does not inject into the master port) and > forwards it on Z-B. > At the end Node X receives two copies of the packet it sent and removes > them from the ring (node X was the sender identified by the SRC MAC and > does not forward it). > > This is how HSR works in general. Now lets add PTP to this as specified. > The target MAC is always a multicast MAC and the ether type is PTP > 0x88f7. > > Use case 1: A PDELAY_REQ packet. This packet travels only between two > neighbours. That means X-A sends it to Y-A and Y must not forward it > over Y-B but needs to answer (send a PDELAY_RESP). These packets are > sent as PTP frames and the HSR stack needs to prepend a HSR header with > a valid sequence number. X-B gets its own request. Userland needs to > track time/ state information on per port basis. > > Use case 2: A SYNC packet. This packet is sent from X-A to Y-A. Again a > HSR header needs to be prepended by the stack on X. Y-A receives that > packet. It injects it into the master port where user land can consume > it. This is the same as the previous case. > Here comes the different part: This packet needs to be forwarded by Y > over Y-B. As in the previous case the HSR stack does not forward it on > its own but this part is done by userland. So userland sends a packet, > only on Y-B and this packet already contains the HSR header from X and > it needs to be preserved. > The forwarded SYNC packet got its timing information updated based on > the delay within the stack (so it is not identical as received). Thanks for the detailed explanation of the challenge! > That is why the HSR stack must not forwarded the packets on the other > port as it would normally do (breaks PTP time information), why user > land needs to know on which port the packet was received and why it > needs to send a packet only on one port with or without the HSR header. > > > The second case is simpler: can just write directly the whole packet > > to the intended slave device. > > Yes. This has been suggested and was indeed used in my v1 of linuxptp Great > but the problem was sending with system's HSR header. > > > For the first case, could skb->mark be used as port selector when > > writing from a packet socket to the master device? That already works > > with sock_cmsg_send. > > We would have to specify that SO_MARK 1 and 2 denotes the port on which > a packet is sent. This kind of burns the usage for everything else on > HSR so it feels misused. It is more or less what mark is for. An alternative similar field supported by sock_cmsg_send is skb->priority. An alternative may be to share the information in-band. Already insert the HSR header also wen writing to the master device. If the master device can detect this packet-with-pre-existing header. This is not the first case where ndo_start_xmit may already expect a header prefixed that it normally inserts. I forgot the exact case (can look it up), maybe a weird edge case in GRE? It does not even have to be a valid HSR header: just an agreement between the process writing the raw packet and hsr_dev_xmit. There probably are still more ways we can approach this challenge. But these are three that do not require kernel changes outside the HSR protocol code. > And then we would need an additional bit to > specify whether the HSR header is there or not. Unless I open additional > socket on the ethernet device just for sending and dropping everything > incoming. Right, packets that already have a header prefixed are written directly to the intended slave. > And we would have to filter/ distinguish the RX port based on it. > Userland has a cBPF filter to filter everything out and receive only PTP > frames. If the PTP packet is forwarded to both sockets (A and B) then > userland would have to throw one copy away and go to sleep again. This > sort of breaks currently linuxptp logic. It would probably require > either eBPF to filter also so_mark or deal with "no packet despite the > wakeup" but so far I tried minimal impact on both sides (kernel and > user). I don't fully follow this part. It discusses Rx again? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-10 21:35 ` Willem de Bruijn @ 2026-03-12 15:42 ` Sebastian Andrzej Siewior 2026-03-12 21:43 ` Willem de Bruijn 0 siblings, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-12 15:42 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-10 17:35:33 [-0400], Willem de Bruijn wrote: > > but the problem was sending with system's HSR header. > > > > > For the first case, could skb->mark be used as port selector when > > > writing from a packet socket to the master device? That already works > > > with sock_cmsg_send. > > > > We would have to specify that SO_MARK 1 and 2 denotes the port on which > > a packet is sent. This kind of burns the usage for everything else on > > HSR so it feels misused. > > It is more or less what mark is for. An alternative similar field > supported by sock_cmsg_send is skb->priority. The ->priority thing looks wrong. It is used by ssh and other application so they would suddenly behave differently. But okay lets take SO_MARK. Might be smallest abuse here. And we would have to hardcode the values in respect to HSR and I have no idea where to document this. The difference with SO_MARK in general (at least my understanding) is that you can choose the values based on your setup. Here we would say 7-0 are already reserved. > An alternative may be to share the information in-band. Already > insert the HSR header also wen writing to the master device. If the > master device can detect this packet-with-pre-existing header. Given all this, I guess it is easier to stick to SO_MARK as the header might collide with other data. > This is not the first case where ndo_start_xmit may already expect a > header prefixed that it normally inserts. I forgot the exact case (can > look it up), maybe a weird edge case in GRE? > > It does not even have to be a valid HSR header: just an agreement > between the process writing the raw packet and hsr_dev_xmit. > > There probably are still more ways we can approach this challenge. > But these are three that do not require kernel changes outside the > HSR protocol code. Okay. > > And then we would need an additional bit to > > specify whether the HSR header is there or not. Unless I open additional > > socket on the ethernet device just for sending and dropping everything > > incoming. > > Right, packets that already have a header prefixed are written > directly to the intended slave. > > > And we would have to filter/ distinguish the RX port based on it. > > Userland has a cBPF filter to filter everything out and receive only PTP > > frames. If the PTP packet is forwarded to both sockets (A and B) then > > userland would have to throw one copy away and go to sleep again. This > > sort of breaks currently linuxptp logic. It would probably require > > either eBPF to filter also so_mark or deal with "no packet despite the > > wakeup" but so far I tried minimal impact on both sides (kernel and > > user). > > I don't fully follow this part. It discusses Rx again? Yes, this is RX. If I open packet socket, bind to hsr0, how do I filter for packets from one of the slaves? After some thinking and browsing through the packet code: The hsr stack creates hsrX. If it would create additionally hsrX_A and hsrX_B in order to be able to send and receive only on the relevant slave device then I wouldn't need the filters in packet code. That could work… Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-12 15:42 ` Sebastian Andrzej Siewior @ 2026-03-12 21:43 ` Willem de Bruijn 2026-03-13 9:22 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Willem de Bruijn @ 2026-03-12 21:43 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman Sebastian Andrzej Siewior wrote: > On 2026-03-10 17:35:33 [-0400], Willem de Bruijn wrote: > > > but the problem was sending with system's HSR header. > > > > > > > For the first case, could skb->mark be used as port selector when > > > > writing from a packet socket to the master device? That already works > > > > with sock_cmsg_send. > > > > > > We would have to specify that SO_MARK 1 and 2 denotes the port on which > > > a packet is sent. This kind of burns the usage for everything else on > > > HSR so it feels misused. > > > > It is more or less what mark is for. An alternative similar field > > supported by sock_cmsg_send is skb->priority. > > The ->priority thing looks wrong. It is used by ssh and other > application so they would suddenly behave differently. > But okay lets take SO_MARK. Might be smallest abuse here. > And we would have to hardcode the values in respect to HSR and I have no > idea where to document this. > The difference with SO_MARK in general (at least my understanding) is > that you can choose the values based on your setup. Here we would say > 7-0 are already reserved. > > > An alternative may be to share the information in-band. Already > > insert the HSR header also wen writing to the master device. If the > > master device can detect this packet-with-pre-existing header. > > Given all this, I guess it is easier to stick to SO_MARK as the header > might collide with other data. Inserting a header by the packet socket sender and popping it in hsr_dev_xmit is a bit like virtio_net. It is a private channel. The tricky part is how hsr_dev_xmit can differentiate such packets inserted by a packet socket, from regular packets arriving on the device. Anyway, admittedly it is a bit of a hack too. > > This is not the first case where ndo_start_xmit may already expect a > > header prefixed that it normally inserts. I forgot the exact case (can > > look it up), maybe a weird edge case in GRE? > > > > It does not even have to be a valid HSR header: just an agreement > > between the process writing the raw packet and hsr_dev_xmit. > > > > There probably are still more ways we can approach this challenge. > > But these are three that do not require kernel changes outside the > > HSR protocol code. > > Okay. > > > > And then we would need an additional bit to > > > specify whether the HSR header is there or not. Unless I open additional > > > socket on the ethernet device just for sending and dropping everything > > > incoming. > > > > Right, packets that already have a header prefixed are written > > directly to the intended slave. > > > > > And we would have to filter/ distinguish the RX port based on it. > > > Userland has a cBPF filter to filter everything out and receive only PTP > > > frames. If the PTP packet is forwarded to both sockets (A and B) then > > > userland would have to throw one copy away and go to sleep again. This > > > sort of breaks currently linuxptp logic. It would probably require > > > either eBPF to filter also so_mark or deal with "no packet despite the > > > wakeup" but so far I tried minimal impact on both sides (kernel and > > > user). > > > > I don't fully follow this part. It discusses Rx again? > > Yes, this is RX. If I open packet socket, bind to hsr0, how do I filter > for packets from one of the slaves? I was thinking of attaching directly to the slave devices. > After some thinking and browsing through the packet code: > The hsr stack creates hsrX. If it would create additionally hsrX_A and > hsrX_B in order to be able to send and receive only on the relevant > slave device then I wouldn't need the filters in packet code. That could > work… That's another option. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-12 21:43 ` Willem de Bruijn @ 2026-03-13 9:22 ` Sebastian Andrzej Siewior 2026-03-13 16:04 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-13 9:22 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-12 17:43:36 [-0400], Willem de Bruijn wrote: > > Yes, this is RX. If I open packet socket, bind to hsr0, how do I filter > > for packets from one of the slaves? > > I was thinking of attaching directly to the slave devices. That could work. Let me see. > > After some thinking and browsing through the packet code: > > The hsr stack creates hsrX. If it would create additionally hsrX_A and > > hsrX_B in order to be able to send and receive only on the relevant > > slave device then I wouldn't need the filters in packet code. That could > > work… > > That's another option. Okay. Let me add this as plan B then. Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-13 9:22 ` Sebastian Andrzej Siewior @ 2026-03-13 16:04 ` Sebastian Andrzej Siewior 2026-03-16 20:12 ` Willem de Bruijn 0 siblings, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-13 16:04 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-13 10:22:28 [+0100], To Willem de Bruijn wrote: > On 2026-03-12 17:43:36 [-0400], Willem de Bruijn wrote: > > > Yes, this is RX. If I open packet socket, bind to hsr0, how do I filter > > > for packets from one of the slaves? > > > > I was thinking of attaching directly to the slave devices. > > That could work. Let me see. Halfway done. What we went down to is: Send with header -> eth0/ eth1 + SO_MARK for header + port 1/ 2 Send without header -> hsr0 + SO_MARK 1/2 port 1/ 2 RX -> listen on eth0/ eth1 Now this compiles and I look into userland and am currently stuck with the problem that it does not expect "two devices". It is not impossible but… > > > After some thinking and browsing through the packet code: > > > The hsr stack creates hsrX. If it would create additionally hsrX_A and > > > hsrX_B in order to be able to send and receive only on the relevant > > > slave device then I wouldn't need the filters in packet code. That could > > > work… > > > > That's another option. > > Okay. Let me add this as plan B then. At this point I am considering this which would avoid having to use hsr0+eth1 for one direction depending on whether I have to send with or without the header. But this might lead to other problems such as ip address and so on. The suggested changes to af_packet, that remains a hard no? Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-13 16:04 ` Sebastian Andrzej Siewior @ 2026-03-16 20:12 ` Willem de Bruijn 2026-03-17 17:29 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Willem de Bruijn @ 2026-03-16 20:12 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman Sebastian Andrzej Siewior wrote: > On 2026-03-13 10:22:28 [+0100], To Willem de Bruijn wrote: > > On 2026-03-12 17:43:36 [-0400], Willem de Bruijn wrote: > > > > Yes, this is RX. If I open packet socket, bind to hsr0, how do I filter > > > > for packets from one of the slaves? > > > > > > I was thinking of attaching directly to the slave devices. > > > > That could work. Let me see. > > Halfway done. What we went down to is: > > Send with header -> eth0/ eth1 + SO_MARK for header + port 1/ 2 > Send without header -> hsr0 + SO_MARK 1/2 port 1/ 2 > RX -> listen on eth0/ eth1 > > Now this compiles and I look into userland and am currently stuck with > the problem that it does not expect "two devices". It is not impossible > but… > > > > > After some thinking and browsing through the packet code: > > > > The hsr stack creates hsrX. If it would create additionally hsrX_A and > > > > hsrX_B in order to be able to send and receive only on the relevant > > > > slave device then I wouldn't need the filters in packet code. That could > > > > work… > > > > > > That's another option. > > > > Okay. Let me add this as plan B then. > > At this point I am considering this which would avoid having to use > hsr0+eth1 for one direction depending on whether I have to send with or > without the header. But this might lead to other problems such as ip > address and so on. > > The suggested changes to af_packet, that remains a hard no? Yes. Packet sockets are largely superseded by XDP and AF_XDP when it comes to new functionality. And were never open for such protocol specific logic. Protocol independent extensions, such as reading skb->mark as part of extended auxdata, could be up for debate. But again, we already have XDP and AF_XDP which allow passing arbitrary metadata to/from userspace. That is preferable over adding new structs to the ABI. If you only want to attach to a single (hsr0) device, I still think passing the data inline might work. Either by overriding existing fields such as a MAC addr (a hack for sure) or something in the HSR header (it as the direction? or by inserting a custom header (or trailer) akin to virtio_net_hdr for tun/tap. But custom to your workload. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-16 20:12 ` Willem de Bruijn @ 2026-03-17 17:29 ` Sebastian Andrzej Siewior 2026-03-19 13:29 ` Willem de Bruijn 0 siblings, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-17 17:29 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-16 16:12:41 [-0400], Willem de Bruijn wrote: > > The suggested changes to af_packet, that remains a hard no? > > Yes. Packet sockets are largely superseded by XDP and AF_XDP when it > comes to new functionality. And were never open for such protocol > specific logic. ptp4l uses AF_PACKET and while I am not against AF_XDP I am not sure if it helps. I don't have any protocol stack so extending SOL_PACKET was the only option left. > Protocol independent extensions, such as reading skb->mark as part of > extended auxdata, could be up for debate. But again, we already have > XDP and AF_XDP which allow passing arbitrary metadata to/from > userspace. That is preferable over adding new structs to the ABI. Is this struct xsk_tx_metadata and something similar for RX? If so, I would require HSR fields for both. My understanding is that you bind XDP to device+queue. I would require four sockets (generic+event for both slaves) so this does not work, right? Also I would need to extend HSR stack so support XDP in a similar fashion as network driver do? > If you only want to attach to a single (hsr0) device, I still think > passing the data inline might work. Either by overriding existing > fields such as a MAC addr (a hack for sure) or something in the HSR > header (it as the direction? or by inserting a custom header (or > trailer) akin to virtio_net_hdr for tun/tap. But custom to your > workload. But here you have PACKET_VNET_HDR_SZ to configure virtio_net_hdr. This is kind of what I ask for with PACKET_HSR_INFO + PACKET_HSR_BIND_PORT (and I added static branches so nobody has to suffer with CONFIG_HSR enabled but no HSR enabled socket).. The diff below would be what means to bypass AF_PACKET entirely and use eth0/ eth1 to send via SLAVE_A/ B and hsr0 with SO_MARK to send with system's HSR header but only on one of the two ports. With HSR-offloading enabled we need to attach the SO_MARK hints also on eth0/ eth1 devices, too. Otherwise the offloading part will send the packet on both ports and attach a header (as designed). Now I have now 8 sockets in userland instead of 4. I added icssg to show the offloading case: diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c index 79df641b4fb97..a2e50cae01686 100644 --- a/drivers/net/ethernet/ti/icssg/icssg_common.c +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c @@ -909,13 +909,17 @@ enum netdev_tx icssg_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev */ dst_tag_id = emac->port_id | (q_idx << 8); - if (prueth->is_hsr_offload_mode && - (ndev->features & NETIF_F_HW_HSR_DUP)) - dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; + if (prueth->is_hsr_offload_mode) { + bool has_header; + bool has_port; - if (prueth->is_hsr_offload_mode && - (ndev->features & NETIF_F_HW_HSR_TAG_INS)) - epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; + hsr_skb_get_header_port_mark(skb, &has_header, &has_port); + if (ndev->features & NETIF_F_HW_HSR_DUP && !has_port) + dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; + + if (ndev->features & NETIF_F_HW_HSR_TAG_INS && !has_header) + epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; + } cppi5_desc_set_tags_ids(&first_desc->hdr, 0, dst_tag_id); k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma); diff --git a/include/linux/if_hsr.h b/include/linux/if_hsr.h index d7941fd880329..814b30f65f102 100644 --- a/include/linux/if_hsr.h +++ b/include/linux/if_hsr.h @@ -22,6 +22,11 @@ enum hsr_port_type { HSR_PT_PORTS, /* This must be the last item in the enum */ }; +struct hsr_ptp_cb { + u8 ptp; + u8 port; +}; + /* HSR Tag. * As defined in IEC-62439-3:2010, the HSR tag is really { ethertype = 0x88FB, * path, LSDU_size, sequence Nr }. But we let eth_header() create { h_dest, @@ -43,6 +48,57 @@ extern bool is_hsr_master(struct net_device *dev); extern int hsr_get_version(struct net_device *dev, enum hsr_version *ver); struct net_device *hsr_get_port_ndev(struct net_device *ndev, enum hsr_port_type pt); + +static inline unsigned int hsr_skb_get_TX_port(struct sk_buff *skb) +{ + struct hsr_ptp_cb *cb; + + if (!skb) + return 0; + + cb = (struct hsr_ptp_cb*)skb->cb; + if (!cb->ptp) + return 0; + return cb->port; +} + +static inline void hsr_skb_set_TX_port(struct sk_buff *skb, + enum hsr_port_type port) +{ + struct hsr_ptp_cb *cb; + + cb = (struct hsr_ptp_cb*)skb->cb; + cb->ptp = 1; + cb->port = port; +} + +static inline void hsr_skb_non_ptp(struct sk_buff *skb) +{ + struct hsr_ptp_cb *cb; + + cb = (struct hsr_ptp_cb*)skb->cb; + cb->ptp = 0; +} + +static inline void hsr_skb_get_header_port_mark(struct sk_buff *skb, bool *has_header, + bool *has_port) +{ + u32 mark = skb->mark; + u32 val; + + val = mark & 0b11; + if (val == 1 || val == 2) + *has_port = true; + else + *has_port = false; + + val = mark & 0b100; + if (val) + *has_header = true; + else + *has_header = false; +} + #else static inline bool is_hsr_master(struct net_device *dev) { @@ -59,6 +115,13 @@ static inline struct net_device *hsr_get_port_ndev(struct net_device *ndev, { return ERR_PTR(-EINVAL); } + +static inline void hsr_skb_get_header_port_mark(struct sk_buff *skb, bool *has_header, + bool *has_port) +{ + *has_port = false; + *has_header = false; +} #endif /* CONFIG_HSR */ #endif /*_LINUX_IF_HSR_H_*/ diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index 2c43776b7c4fb..e01d3a33ac941 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -229,6 +229,14 @@ static netdev_tx_t hsr_dev_xmit(struct sk_buff *skb, struct net_device *dev) master = hsr_port_get_hsr(hsr, HSR_PT_MASTER); if (master) { skb->dev = master->dev; + + if (skb->mark == 1) + hsr_skb_set_TX_port(skb, HSR_PT_SLAVE_A); + else if (skb->mark == 2) + hsr_skb_set_TX_port(skb, HSR_PT_SLAVE_B); + else + hsr_skb_non_ptp(skb); + skb_reset_mac_header(skb); skb_reset_mac_len(skb); spin_lock_bh(&hsr->seqnr_lock); @@ -355,6 +363,7 @@ static void send_hsr_supervision_frame(struct hsr_port *port, return; } + hsr_skb_non_ptp(skb); hsr_forward_skb(skb, port); spin_unlock_bh(&hsr->seqnr_lock); return; @@ -396,6 +405,7 @@ static void send_prp_supervision_frame(struct hsr_port *master, return; } + hsr_skb_non_ptp(skb); hsr_forward_skb(skb, master); spin_unlock_bh(&hsr->seqnr_lock); } diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index c67c0d35921de..d0d6a8342bb33 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -12,6 +12,7 @@ #include <linux/skbuff.h> #include <linux/etherdevice.h> #include <linux/if_vlan.h> +#include <net/sock.h> #include "hsr_main.h" #include "hsr_framereg.h" @@ -333,7 +334,10 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame, hsr_set_path_id(hsr_ethhdr, port); return skb_clone(frame->skb_hsr, GFP_ATOMIC); } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { - return skb_clone(frame->skb_std, GFP_ATOMIC); + skb = skb_clone(frame->skb_std, GFP_ATOMIC); + if (hsr_skb_get_TX_port(skb)) + skb_set_owner_w(skb, frame->skb_std->sk); + return skb; } /* Create the new skb with enough headroom to fit the HSR tag */ @@ -355,6 +359,15 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame, memmove(dst, src, movelen); skb_reset_mac_header(skb); + if (hsr_skb_get_TX_port(skb)) { + /* Packets are bound to a port and the sender may expect time + * information. + */ + skb_shinfo(skb)->tx_flags = skb_shinfo(frame->skb_std)->tx_flags; + skb_shinfo(skb)->tskey = skb_shinfo(frame->skb_std)->tskey; + skb_set_owner_w(skb, frame->skb_std->sk); + } + /* skb_put_padto free skb on error and hsr_fill_tag returns NULL in * that case */ @@ -377,12 +390,23 @@ struct sk_buff *prp_create_tagged_frame(struct hsr_frame_info *frame, } return skb_clone(frame->skb_prp, GFP_ATOMIC); } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { - return skb_clone(frame->skb_std, GFP_ATOMIC); + skb = skb_clone(frame->skb_std, GFP_ATOMIC); + if (hsr_skb_get_TX_port(skb)) + skb_set_owner_w(skb, frame->skb_std->sk); + return skb; } skb = skb_copy_expand(frame->skb_std, skb_headroom(frame->skb_std), skb_tailroom(frame->skb_std) + HSR_HLEN, GFP_ATOMIC); + if (hsr_skb_get_TX_port(skb)) { + /* Packets are bound to a port and the sender may expect time + * information. + */ + skb_shinfo(skb)->tx_flags = skb_shinfo(frame->skb_std)->tx_flags; + skb_shinfo(skb)->tskey = skb_shinfo(frame->skb_std)->tskey; + skb_set_owner_w(skb, frame->skb_std->sk); + } return prp_fill_rct(skb, frame, port); } @@ -508,10 +532,13 @@ bool hsr_drop_frame(struct hsr_frame_info *frame, struct hsr_port *port) */ static void hsr_forward_do(struct hsr_frame_info *frame) { + unsigned int req_tx_port; struct hsr_port *port; struct sk_buff *skb; bool sent = false; + req_tx_port = hsr_skb_get_TX_port(frame->skb_std); + hsr_for_each_port(frame->port_rcv->hsr, port) { struct hsr_priv *hsr = port->hsr; /* Don't send frame back the way it came */ @@ -532,6 +559,16 @@ static void hsr_forward_do(struct hsr_frame_info *frame) if ((port->dev->features & NETIF_F_HW_HSR_DUP) && sent) continue; + /* RX PTP packets have the received port recorded */ + if (frame->skb_hsr) + skb = frame->skb_hsr; + else if (frame->skb_prp) + skb = frame->skb_prp; + + /* PTP TX packets have an outgoing port specified */ + if (req_tx_port != HSR_PT_NONE && req_tx_port != port->type) + continue; + /* Don't send frame over port where it has been sent before. * Also for SAN, this shouldn't be done. */ diff --git a/net/hsr/hsr_slave.c b/net/hsr/hsr_slave.c index 464f683e016db..f4879d1ee8ab6 100644 --- a/net/hsr/hsr_slave.c +++ b/net/hsr/hsr_slave.c @@ -44,8 +44,7 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) if (hsr_addr_is_self(port->hsr, eth_hdr(skb)->h_source)) { /* Directly kill frames sent by ourselves */ - kfree_skb(skb); - goto finish_consume; + goto finish_free_consume; } /* For HSR, only tagged frames are expected (unless the device offloads @@ -70,15 +69,36 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) /* Only the frames received over the interlink port will assign a * sequence number and require synchronisation vs other sender. */ + hsr_skb_non_ptp(skb); if (port->type == HSR_PT_INTERLINK) { spin_lock_bh(&hsr->seqnr_lock); hsr_forward_skb(skb, port); spin_unlock_bh(&hsr->seqnr_lock); } else { + struct hsr_ethhdr *hsr_ethhdr; + + /* PTP packets are not supposed to be forwarded via HSR/ PRP + * as-is. The latency introduced by forwarding renders + * the time information useless. + */ + if ((!hsr->prot_version && protocol == htons(ETH_P_PRP)) || + protocol == htons(ETH_P_HSR)) { + /* HSR */ + hsr_ethhdr = (struct hsr_ethhdr *)skb_mac_header(skb); + if (hsr_ethhdr->hsr_tag.encap_proto == htons(ETH_P_1588)) + goto finish_free_consume; + } else { + if (protocol == htons(ETH_P_1588)) + goto finish_free_consume; + } + hsr_forward_skb(skb, port); } -finish_consume: + return RX_HANDLER_CONSUMED; + +finish_free_consume: + kfree_skb(skb); return RX_HANDLER_CONSUMED; finish_pass: Sebastian ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-17 17:29 ` Sebastian Andrzej Siewior @ 2026-03-19 13:29 ` Willem de Bruijn 2026-03-19 14:26 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Willem de Bruijn @ 2026-03-19 13:29 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman Sebastian Andrzej Siewior wrote: > On 2026-03-16 16:12:41 [-0400], Willem de Bruijn wrote: > > > The suggested changes to af_packet, that remains a hard no? > > > > Yes. Packet sockets are largely superseded by XDP and AF_XDP when it > > comes to new functionality. And were never open for such protocol > > specific logic. > > ptp4l uses AF_PACKET and while I am not against AF_XDP I am not sure if > it helps. I don't have any protocol stack so extending SOL_PACKET was > the only option left. > > > Protocol independent extensions, such as reading skb->mark as part of > > extended auxdata, could be up for debate. But again, we already have > > XDP and AF_XDP which allow passing arbitrary metadata to/from > > userspace. That is preferable over adding new structs to the ABI. > > Is this struct xsk_tx_metadata and something similar for RX? If so, I > would require HSR fields for both. > My understanding is that you bind XDP to device+queue. I would require > four sockets (generic+event for both slaves) so this does not work, > right? Also I would need to extend HSR stack so support XDP in a similar > fashion as network driver do? > > > If you only want to attach to a single (hsr0) device, I still think > > passing the data inline might work. Either by overriding existing > > fields such as a MAC addr (a hack for sure) or something in the HSR > > header (it as the direction? or by inserting a custom header (or > > trailer) akin to virtio_net_hdr for tun/tap. But custom to your > > workload. > > But here you have PACKET_VNET_HDR_SZ to configure virtio_net_hdr. I mean an implicit header in the packet. virtio_net_hdr is something that the packet socket needs to inspect. The metadata between ptp4l and hsr_dev_xmit is not, so the packet socket can be oblivious to it. > This > is kind of what I ask for with PACKET_HSR_INFO + PACKET_HSR_BIND_PORT > (and I added static branches so nobody has to suffer with CONFIG_HSR > enabled but no HSR enabled socket).. > > The diff below would be what means to bypass AF_PACKET entirely and use > eth0/ eth1 to send via SLAVE_A/ B and hsr0 with SO_MARK to send with > system's HSR header but only on one of the two ports. > With HSR-offloading enabled we need to attach the SO_MARK hints also on > eth0/ eth1 devices, too. Otherwise the offloading part will send the > packet on both ports and attach a header (as designed). eth0/eth1 are not HSR devices, so how would they interpret skb->mark and how would they loop packets to the other port? > > Now I have now 8 sockets in userland instead of 4. > I added icssg to show the offloading case: > > diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c > index 79df641b4fb97..a2e50cae01686 100644 > --- a/drivers/net/ethernet/ti/icssg/icssg_common.c > +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c > @@ -909,13 +909,17 @@ enum netdev_tx icssg_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev > */ > dst_tag_id = emac->port_id | (q_idx << 8); > > - if (prueth->is_hsr_offload_mode && > - (ndev->features & NETIF_F_HW_HSR_DUP)) > - dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; > + if (prueth->is_hsr_offload_mode) { > + bool has_header; > + bool has_port; > > - if (prueth->is_hsr_offload_mode && > - (ndev->features & NETIF_F_HW_HSR_TAG_INS)) > - epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; > + hsr_skb_get_header_port_mark(skb, &has_header, &has_port); > + if (ndev->features & NETIF_F_HW_HSR_DUP && !has_port) > + dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; > + > + if (ndev->features & NETIF_F_HW_HSR_TAG_INS && !has_header) > + epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; > + } > > cppi5_desc_set_tags_ids(&first_desc->hdr, 0, dst_tag_id); > k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma); > diff --git a/include/linux/if_hsr.h b/include/linux/if_hsr.h > index d7941fd880329..814b30f65f102 100644 > --- a/include/linux/if_hsr.h > +++ b/include/linux/if_hsr.h > @@ -22,6 +22,11 @@ enum hsr_port_type { > HSR_PT_PORTS, /* This must be the last item in the enum */ > }; > > +struct hsr_ptp_cb { > + u8 ptp; > + u8 port; > +}; > + > /* HSR Tag. > * As defined in IEC-62439-3:2010, the HSR tag is really { ethertype = 0x88FB, > * path, LSDU_size, sequence Nr }. But we let eth_header() create { h_dest, > @@ -43,6 +48,57 @@ extern bool is_hsr_master(struct net_device *dev); > extern int hsr_get_version(struct net_device *dev, enum hsr_version *ver); > struct net_device *hsr_get_port_ndev(struct net_device *ndev, > enum hsr_port_type pt); > + > +static inline unsigned int hsr_skb_get_TX_port(struct sk_buff *skb) > +{ > + struct hsr_ptp_cb *cb; > + > + if (!skb) > + return 0; > + > + cb = (struct hsr_ptp_cb*)skb->cb; > + if (!cb->ptp) > + return 0; > + return cb->port; > +} > + > +static inline void hsr_skb_set_TX_port(struct sk_buff *skb, > + enum hsr_port_type port) > +{ > + struct hsr_ptp_cb *cb; > + > + cb = (struct hsr_ptp_cb*)skb->cb; > + cb->ptp = 1; > + cb->port = port; > +} > + > +static inline void hsr_skb_non_ptp(struct sk_buff *skb) > +{ > + struct hsr_ptp_cb *cb; > + > + cb = (struct hsr_ptp_cb*)skb->cb; > + cb->ptp = 0; > +} > + > +static inline void hsr_skb_get_header_port_mark(struct sk_buff *skb, bool *has_header, > + bool *has_port) > +{ > + u32 mark = skb->mark; > + u32 val; > + > + val = mark & 0b11; > + if (val == 1 || val == 2) > + *has_port = true; > + else > + *has_port = false; > + > + val = mark & 0b100; > + if (val) > + *has_header = true; > + else > + *has_header = false; > +} > + > #else > static inline bool is_hsr_master(struct net_device *dev) > { > @@ -59,6 +115,13 @@ static inline struct net_device *hsr_get_port_ndev(struct net_device *ndev, > { > return ERR_PTR(-EINVAL); > } > + > +static inline void hsr_skb_get_header_port_mark(struct sk_buff *skb, bool *has_header, > + bool *has_port) > +{ > + *has_port = false; > + *has_header = false; > +} > #endif /* CONFIG_HSR */ > > #endif /*_LINUX_IF_HSR_H_*/ > diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c > index 2c43776b7c4fb..e01d3a33ac941 100644 > --- a/net/hsr/hsr_device.c > +++ b/net/hsr/hsr_device.c > @@ -229,6 +229,14 @@ static netdev_tx_t hsr_dev_xmit(struct sk_buff *skb, struct net_device *dev) > master = hsr_port_get_hsr(hsr, HSR_PT_MASTER); > if (master) { > skb->dev = master->dev; > + > + if (skb->mark == 1) > + hsr_skb_set_TX_port(skb, HSR_PT_SLAVE_A); > + else if (skb->mark == 2) > + hsr_skb_set_TX_port(skb, HSR_PT_SLAVE_B); > + else > + hsr_skb_non_ptp(skb); > + > skb_reset_mac_header(skb); > skb_reset_mac_len(skb); > spin_lock_bh(&hsr->seqnr_lock); > @@ -355,6 +363,7 @@ static void send_hsr_supervision_frame(struct hsr_port *port, > return; > } > > + hsr_skb_non_ptp(skb); > hsr_forward_skb(skb, port); > spin_unlock_bh(&hsr->seqnr_lock); > return; > @@ -396,6 +405,7 @@ static void send_prp_supervision_frame(struct hsr_port *master, > return; > } > > + hsr_skb_non_ptp(skb); > hsr_forward_skb(skb, master); > spin_unlock_bh(&hsr->seqnr_lock); > } > diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c > index c67c0d35921de..d0d6a8342bb33 100644 > --- a/net/hsr/hsr_forward.c > +++ b/net/hsr/hsr_forward.c > @@ -12,6 +12,7 @@ > #include <linux/skbuff.h> > #include <linux/etherdevice.h> > #include <linux/if_vlan.h> > +#include <net/sock.h> > #include "hsr_main.h" > #include "hsr_framereg.h" > > @@ -333,7 +334,10 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame, > hsr_set_path_id(hsr_ethhdr, port); > return skb_clone(frame->skb_hsr, GFP_ATOMIC); > } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { > - return skb_clone(frame->skb_std, GFP_ATOMIC); > + skb = skb_clone(frame->skb_std, GFP_ATOMIC); > + if (hsr_skb_get_TX_port(skb)) > + skb_set_owner_w(skb, frame->skb_std->sk); > + return skb; > } > > /* Create the new skb with enough headroom to fit the HSR tag */ > @@ -355,6 +359,15 @@ struct sk_buff *hsr_create_tagged_frame(struct hsr_frame_info *frame, > memmove(dst, src, movelen); > skb_reset_mac_header(skb); > > + if (hsr_skb_get_TX_port(skb)) { > + /* Packets are bound to a port and the sender may expect time > + * information. > + */ > + skb_shinfo(skb)->tx_flags = skb_shinfo(frame->skb_std)->tx_flags; > + skb_shinfo(skb)->tskey = skb_shinfo(frame->skb_std)->tskey; > + skb_set_owner_w(skb, frame->skb_std->sk); > + } > + > /* skb_put_padto free skb on error and hsr_fill_tag returns NULL in > * that case > */ > @@ -377,12 +390,23 @@ struct sk_buff *prp_create_tagged_frame(struct hsr_frame_info *frame, > } > return skb_clone(frame->skb_prp, GFP_ATOMIC); > } else if (port->dev->features & NETIF_F_HW_HSR_TAG_INS) { > - return skb_clone(frame->skb_std, GFP_ATOMIC); > + skb = skb_clone(frame->skb_std, GFP_ATOMIC); > + if (hsr_skb_get_TX_port(skb)) > + skb_set_owner_w(skb, frame->skb_std->sk); > + return skb; > } > > skb = skb_copy_expand(frame->skb_std, skb_headroom(frame->skb_std), > skb_tailroom(frame->skb_std) + HSR_HLEN, > GFP_ATOMIC); > + if (hsr_skb_get_TX_port(skb)) { > + /* Packets are bound to a port and the sender may expect time > + * information. > + */ > + skb_shinfo(skb)->tx_flags = skb_shinfo(frame->skb_std)->tx_flags; > + skb_shinfo(skb)->tskey = skb_shinfo(frame->skb_std)->tskey; > + skb_set_owner_w(skb, frame->skb_std->sk); > + } > return prp_fill_rct(skb, frame, port); > } > > @@ -508,10 +532,13 @@ bool hsr_drop_frame(struct hsr_frame_info *frame, struct hsr_port *port) > */ > static void hsr_forward_do(struct hsr_frame_info *frame) > { > + unsigned int req_tx_port; > struct hsr_port *port; > struct sk_buff *skb; > bool sent = false; > > + req_tx_port = hsr_skb_get_TX_port(frame->skb_std); > + > hsr_for_each_port(frame->port_rcv->hsr, port) { > struct hsr_priv *hsr = port->hsr; > /* Don't send frame back the way it came */ > @@ -532,6 +559,16 @@ static void hsr_forward_do(struct hsr_frame_info *frame) > if ((port->dev->features & NETIF_F_HW_HSR_DUP) && sent) > continue; > > + /* RX PTP packets have the received port recorded */ > + if (frame->skb_hsr) > + skb = frame->skb_hsr; > + else if (frame->skb_prp) > + skb = frame->skb_prp; > + > + /* PTP TX packets have an outgoing port specified */ > + if (req_tx_port != HSR_PT_NONE && req_tx_port != port->type) > + continue; > + > /* Don't send frame over port where it has been sent before. > * Also for SAN, this shouldn't be done. > */ > diff --git a/net/hsr/hsr_slave.c b/net/hsr/hsr_slave.c > index 464f683e016db..f4879d1ee8ab6 100644 > --- a/net/hsr/hsr_slave.c > +++ b/net/hsr/hsr_slave.c > @@ -44,8 +44,7 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) > > if (hsr_addr_is_self(port->hsr, eth_hdr(skb)->h_source)) { > /* Directly kill frames sent by ourselves */ > - kfree_skb(skb); > - goto finish_consume; > + goto finish_free_consume; > } > > /* For HSR, only tagged frames are expected (unless the device offloads > @@ -70,15 +69,36 @@ static rx_handler_result_t hsr_handle_frame(struct sk_buff **pskb) > /* Only the frames received over the interlink port will assign a > * sequence number and require synchronisation vs other sender. > */ > + hsr_skb_non_ptp(skb); > if (port->type == HSR_PT_INTERLINK) { > spin_lock_bh(&hsr->seqnr_lock); > hsr_forward_skb(skb, port); > spin_unlock_bh(&hsr->seqnr_lock); > } else { > + struct hsr_ethhdr *hsr_ethhdr; > + > + /* PTP packets are not supposed to be forwarded via HSR/ PRP > + * as-is. The latency introduced by forwarding renders > + * the time information useless. > + */ > + if ((!hsr->prot_version && protocol == htons(ETH_P_PRP)) || > + protocol == htons(ETH_P_HSR)) { > + /* HSR */ > + hsr_ethhdr = (struct hsr_ethhdr *)skb_mac_header(skb); > + if (hsr_ethhdr->hsr_tag.encap_proto == htons(ETH_P_1588)) > + goto finish_free_consume; > + } else { > + if (protocol == htons(ETH_P_1588)) > + goto finish_free_consume; > + } > + > hsr_forward_skb(skb, port); > } > > -finish_consume: > + return RX_HANDLER_CONSUMED; > + > +finish_free_consume: > + kfree_skb(skb); > return RX_HANDLER_CONSUMED; > > finish_pass: > > Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-19 13:29 ` Willem de Bruijn @ 2026-03-19 14:26 ` Sebastian Andrzej Siewior 2026-03-19 16:27 ` Willem de Bruijn 0 siblings, 1 reply; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-19 14:26 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-19 09:29:44 [-0400], Willem de Bruijn wrote: > > But here you have PACKET_VNET_HDR_SZ to configure virtio_net_hdr. > > I mean an implicit header in the packet. virtio_net_hdr is something > that the packet socket needs to inspect. The metadata between ptp4l > and hsr_dev_xmit is not, so the packet socket can be oblivious to it. Ah okay. If I insert a header, I would need to scan for something in every packet and it must not collide with anything that would include the same sequence. This means the PTP packets from ptp4l and everything else that sends data. That is something I am not very comfortable with. Even the destination MAC address had to be changed a few times during testing so scanning for it would not be good. > > This > > is kind of what I ask for with PACKET_HSR_INFO + PACKET_HSR_BIND_PORT > > (and I added static branches so nobody has to suffer with CONFIG_HSR > > enabled but no HSR enabled socket).. > > > > The diff below would be what means to bypass AF_PACKET entirely and use > > eth0/ eth1 to send via SLAVE_A/ B and hsr0 with SO_MARK to send with > > system's HSR header but only on one of the two ports. > > With HSR-offloading enabled we need to attach the SO_MARK hints also on > > eth0/ eth1 devices, too. Otherwise the offloading part will send the > > packet on both ports and attach a header (as designed). > > eth0/eth1 are not HSR devices, so how would they interpret skb->mark > and how would they loop packets to the other port? If you look at the diff below, there is the hsr_skb_get_header_port_mark() helper. So eth0/ eth1 are not HSR devices but they are used as slaves by hsr0. The intention is use the eth0/ eth1 devices to pass packets which include the HSR-header. So if I send on eth0 the intention is to send the packet 1:1 on this device. The skb->mark is not required there unless with hardware offload enabled. In this case if "PORT" marker is used to let the driver know to not duplicate the packet on the other port and if "HEADER" marker is set then it won't prepend a HSR header. The port marking is the same as for hsr0 and I used the next bit for the header to keep things a bit simple. The skb->mark with values 1…7 is then not available to anything else in this scenario. > > diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c > > index 79df641b4fb97..a2e50cae01686 100644 > > --- a/drivers/net/ethernet/ti/icssg/icssg_common.c > > +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c > > @@ -909,13 +909,17 @@ enum netdev_tx icssg_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev > > */ > > dst_tag_id = emac->port_id | (q_idx << 8); > > > > - if (prueth->is_hsr_offload_mode && > > - (ndev->features & NETIF_F_HW_HSR_DUP)) > > - dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; > > + if (prueth->is_hsr_offload_mode) { > > + bool has_header; > > + bool has_port; > > > > - if (prueth->is_hsr_offload_mode && > > - (ndev->features & NETIF_F_HW_HSR_TAG_INS)) > > - epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; > > + hsr_skb_get_header_port_mark(skb, &has_header, &has_port); > > + if (ndev->features & NETIF_F_HW_HSR_DUP && !has_port) > > + dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; > > + > > + if (ndev->features & NETIF_F_HW_HSR_TAG_INS && !has_header) > > + epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; > > + } > > > > cppi5_desc_set_tags_ids(&first_desc->hdr, 0, dst_tag_id); > > k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma); Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-19 14:26 ` Sebastian Andrzej Siewior @ 2026-03-19 16:27 ` Willem de Bruijn 2026-03-24 16:38 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 16+ messages in thread From: Willem de Bruijn @ 2026-03-19 16:27 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman Sebastian Andrzej Siewior wrote: > On 2026-03-19 09:29:44 [-0400], Willem de Bruijn wrote: > > > But here you have PACKET_VNET_HDR_SZ to configure virtio_net_hdr. > > > > I mean an implicit header in the packet. virtio_net_hdr is something > > that the packet socket needs to inspect. The metadata between ptp4l > > and hsr_dev_xmit is not, so the packet socket can be oblivious to it. > > Ah okay. If I insert a header, I would need to scan for something in > every packet and it must not collide with anything that would include > the same sequence. True. > This means the PTP packets from ptp4l and everything > else that sends data. That is something I am not very comfortable with. > Even the destination MAC address had to be changed a few times during > testing so scanning for it would not be good. > > > > This > > > is kind of what I ask for with PACKET_HSR_INFO + PACKET_HSR_BIND_PORT > > > (and I added static branches so nobody has to suffer with CONFIG_HSR > > > enabled but no HSR enabled socket).. > > > > > > The diff below would be what means to bypass AF_PACKET entirely and use > > > eth0/ eth1 to send via SLAVE_A/ B and hsr0 with SO_MARK to send with > > > system's HSR header but only on one of the two ports. > > > With HSR-offloading enabled we need to attach the SO_MARK hints also on > > > eth0/ eth1 devices, too. Otherwise the offloading part will send the > > > packet on both ports and attach a header (as designed). > > > > eth0/eth1 are not HSR devices, so how would they interpret skb->mark > > and how would they loop packets to the other port? > > If you look at the diff below, there is the > hsr_skb_get_header_port_mark() helper. > > So eth0/ eth1 are not HSR devices but they are used as slaves by hsr0. > The intention is use the eth0/ eth1 devices to pass packets which > include the HSR-header. So if I send on eth0 the intention is to send > the packet 1:1 on this device. The skb->mark is not required there > unless with hardware offload enabled. Right, so this is simple without hardware offload. Does this HW offload exist, or is this aspirational. At least for infrequent PTP it does not sound important. > In this case if "PORT" marker is > used to let the driver know to not duplicate the packet on the other > port and if "HEADER" marker is set then it won't prepend a HSR header. > The port marking is the same as for hsr0 and I used the next bit for the > header to keep things a bit simple. > The skb->mark with values 1…7 is then not available to anything else in > this scenario. > > > > diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c > > > index 79df641b4fb97..a2e50cae01686 100644 > > > --- a/drivers/net/ethernet/ti/icssg/icssg_common.c > > > +++ b/drivers/net/ethernet/ti/icssg/icssg_common.c > > > @@ -909,13 +909,17 @@ enum netdev_tx icssg_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev > > > */ > > > dst_tag_id = emac->port_id | (q_idx << 8); > > > > > > - if (prueth->is_hsr_offload_mode && > > > - (ndev->features & NETIF_F_HW_HSR_DUP)) > > > - dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; > > > + if (prueth->is_hsr_offload_mode) { > > > + bool has_header; > > > + bool has_port; > > > > > > - if (prueth->is_hsr_offload_mode && > > > - (ndev->features & NETIF_F_HW_HSR_TAG_INS)) > > > - epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; > > > + hsr_skb_get_header_port_mark(skb, &has_header, &has_port); > > > + if (ndev->features & NETIF_F_HW_HSR_DUP && !has_port) > > > + dst_tag_id = PRUETH_UNDIRECTED_PKT_DST_TAG; > > > + > > > + if (ndev->features & NETIF_F_HW_HSR_TAG_INS && !has_header) > > > + epib[1] |= PRUETH_UNDIRECTED_PKT_TAG_INS; > > > + } > > > > > > cppi5_desc_set_tags_ids(&first_desc->hdr, 0, dst_tag_id); > > > k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma); > > Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR 2026-03-19 16:27 ` Willem de Bruijn @ 2026-03-24 16:38 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 16+ messages in thread From: Sebastian Andrzej Siewior @ 2026-03-24 16:38 UTC (permalink / raw) To: Willem de Bruijn Cc: netdev, (JC), Jayachandran, David S. Miller, Andrew Lunn, Chintan Vankar, Danish Anwar, Daolin Qiu, Eric Dumazet, Felix Maurer, Jakub Kicinski, Paolo Abeni, Richard Cochran, Simon Horman On 2026-03-19 12:27:57 [-0400], Willem de Bruijn wrote: > > Right, so this is simple without hardware offload. Does this HW > offload exist, or is this aspirational. At least for infrequent PTP > it does not sound important. The snippet below was for the icssg driver as-is and it works with updated firmware to ignore PTP packets while HW offloading is enabled. The hardware offload in general forwards the packets without linux so if a packet arrives on port A, the HW forwards it on port B and the linux networking stack receives one copy and does not need to forward it anymore. This "cut-through" forwarding is beneficial because the packet is forwarded instantly without getting delayed by the software stack. For PTP packets I don't want any of this because the forwarding breaks the timestamp information. So I need to tell the driver not do any of the offloading while sending and the updated firmware does the same for incoming packets. There is no HW-offloading of PTP related packets. With the skb_ext I had, the mechanism was mostly the same but it relied only on the skb_ext data. Now it has to look at skb->mark. Sebastian ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2026-03-24 16:38 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-09 15:52 [PATCH RFC net-next v2 0/2] hsr: Add additional info to send/ receive skbs Sebastian Andrzej Siewior 2026-03-09 15:52 ` [PATCH RFC net-next v2 1/2] hsr: Allow to send a specific port and with HSR header Sebastian Andrzej Siewior 2026-03-09 15:52 ` [PATCH RFC net-next v2 2/2] af_packet: Add port specific handling for HSR Sebastian Andrzej Siewior 2026-03-10 1:38 ` Willem de Bruijn 2026-03-10 10:55 ` Sebastian Andrzej Siewior 2026-03-10 21:35 ` Willem de Bruijn 2026-03-12 15:42 ` Sebastian Andrzej Siewior 2026-03-12 21:43 ` Willem de Bruijn 2026-03-13 9:22 ` Sebastian Andrzej Siewior 2026-03-13 16:04 ` Sebastian Andrzej Siewior 2026-03-16 20:12 ` Willem de Bruijn 2026-03-17 17:29 ` Sebastian Andrzej Siewior 2026-03-19 13:29 ` Willem de Bruijn 2026-03-19 14:26 ` Sebastian Andrzej Siewior 2026-03-19 16:27 ` Willem de Bruijn 2026-03-24 16:38 ` Sebastian Andrzej Siewior
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox