From: Paolo Abeni <pabeni@redhat.com>
To: Felix Maurer <fmaurer@redhat.com>, netdev@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
horms@kernel.org, jkarrenpalo@gmail.com, tglx@kernel.org,
mingo@kernel.org, bigeasy@linutronix.de, matttbe@kernel.org,
allison.henderson@oracle.com, petrm@nvidia.com,
antonio@openvpn.net, Steffen Lindner <steffen.lindner@de.abb.com>
Subject: Re: [PATCH net-next v3 4/8] hsr: Implement more robust duplicate discard for PRP
Date: Thu, 5 Feb 2026 13:25:01 +0100
Message-ID: <ccad3f52-e46a-46b5-9113-153a31b3ddd3@redhat.com>
In-Reply-To: <d767a756f426b8906ea12e411a97cf2ba04182c5.1770041682.git.fmaurer@redhat.com>
On 2/2/26 3:19 PM, Felix Maurer wrote:
> The PRP duplicate discard algorithm does not work reliably with certain
> link faults. Especially with packet loss on one link, the duplicate discard
> algorithm drops valid packets which leads to packet loss on the PRP
> interface where the link fault should in theory be perfectly recoverable by
> PRP. This happens because the algorithm opens the drop window on the lossy
> link, covering received and lost sequence numbers. If the other, non-lossy
> link receives the duplicate for a lost frame, it is within the drop window
> of the lossy link and therefore dropped.
>
> Since IEC 62439-3:2012, a node has one sequence number counter for frames
> it sends, instead of one sequence number counter for each destination.
> Therefore, a node can not expect to receive contiguous sequence numbers
> from a sender. A missing sequence number can be totally normal (if the
> sender intermittently communicates with another node) or mean a frame was
> lost.
>
> The algorithm, as previously implemented in commit 05fd00e5e7b1 ("net: hsr:
> Fix PRP duplicate detection"), was part of IEC 62439-3:2010 (HSRv0/PRPv0)
> but was removed with IEC 62439-3:2012 (HSRv1/PRPv1). Since then, no
> algorithm is specified; the choice is left to implementers. It should be "designed such
> that it never rejects a legitimate frame, while occasional acceptance of a
> duplicate can be tolerated" (IEC 62439-3:2021).
>
> For the duplicate discard algorithm, this means that 1) we need to track
> the sequence numbers individually to account for non-contiguous sequence
> numbers, and 2) we should always err on the side of accepting a duplicate
> rather than dropping a valid frame.
>
> The idea of the new algorithm is to store the seen sequence numbers in a
> bitmap. To keep the size of the bitmap in control, we store it as a "sparse
> bitmap" where the bitmap is split into blocks and not all blocks exist at
> the same time. The sparse bitmap is implemented using an xarray that keeps
> the references to the individual blocks and a backing ring buffer that
> stores the actual blocks. New blocks are initialized in the buffer and
> added to the xarray as needed when new frames arrive. Existing blocks are
> removed in two conditions:
> 1. The block found for an arriving sequence number is old and therefore not
> relevant to the duplicate discard algorithm anymore, i.e., it has been
> added more than the entry forget time ago. In this case, the block is
> removed from the xarray and marked as forgotten (by setting its
> timestamp to 0).
> 2. Space is needed in the ring buffer for a new block. In this case, the
> block is removed from the xarray, if it hasn't already been forgotten
> (by 1.). Afterwards, the new block is initialized in its place.
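For illustration only, a minimal userspace sketch of the sparse bitmap described above. It is not the code from the patch: every name and constant (seq_tracker, BLOCK_BITS, RING_SIZE, FORGET_TIME) is hypothetical, a plain pointer array stands in for the xarray, time is an abstract nonzero tick counter instead of jiffies, and locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes are illustrative assumptions, not taken from the patch. */
#define BLOCK_BITS  128u                  /* sequence numbers covered per block */
#define NUM_BLOCKS  (65536u / BLOCK_BITS) /* possible block indices for u16 seqs */
#define RING_SIZE   4u                    /* blocks that can exist at the same time */
#define FORGET_TIME 400u                  /* entry forget time, in ticks */

struct seq_block {
	uint64_t time;                  /* creation tick; 0 means forgotten/unused */
	uint16_t block_idx;             /* which slice of the sequence space */
	uint64_t bits[BLOCK_BITS / 64]; /* one bit per seen sequence number */
};

struct seq_tracker {                         /* must start zero-initialized */
	struct seq_block ring[RING_SIZE];    /* backing ring buffer */
	unsigned int next;                   /* ring slot to reuse next */
	struct seq_block *index[NUM_BLOCKS]; /* stand-in for the xarray */
};

/* Drop a block from the index and mark it as forgotten. */
static void forget_block(struct seq_tracker *t, struct seq_block *b)
{
	if (b->time == 0)
		return; /* already forgotten, nothing in the index */
	t->index[b->block_idx] = NULL;
	b->time = 0;
}

/* Find or create the block for block_idx; now must be nonzero. */
static struct seq_block *get_block(struct seq_tracker *t, uint16_t block_idx,
				   uint64_t now)
{
	struct seq_block *b = t->index[block_idx];

	/* Condition 1: the block is older than the forget time. */
	if (b && now - b->time > FORGET_TIME) {
		forget_block(t, b);
		b = NULL;
	}
	if (b)
		return b;

	/* Condition 2: take the next ring slot, evicting whatever lives there. */
	b = &t->ring[t->next];
	t->next = (t->next + 1) % RING_SIZE;
	forget_block(t, b);
	memset(b, 0, sizeof(*b));
	b->time = now;
	b->block_idx = block_idx;
	t->index[block_idx] = b;
	return b;
}

/* Record seq; return true if it was already seen (i.e. a duplicate). */
static bool seq_seen_before(struct seq_tracker *t, uint16_t seq, uint64_t now)
{
	struct seq_block *b = get_block(t, seq / BLOCK_BITS, now);
	unsigned int bit = seq % BLOCK_BITS;
	uint64_t mask = (uint64_t)1 << (bit % 64);
	bool dup = (b->bits[bit / 64] & mask) != 0;

	b->bits[bit / 64] |= mask;
	return dup;
}
```

In this sketch a gap in the sequence numbers never causes a drop: only a sequence number whose bit is already set in a live block is reported as a duplicate, and a forgotten or evicted block makes its sequence numbers acceptable again.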
>
> This has the nice property that we can reliably track sequence numbers in
> low-traffic situations (where they expire based on their timestamp) and
> more quickly forget sequence numbers in high-traffic situations, before
> they can wrap around and repeat without having expired.
>
> When nodes are merged, the blocks are merged as well. The timestamp of a
> merged block is set to the minimum of the two timestamps to never keep
> around a seen sequence number for too long. The bitmaps are or'd to mark
> all seen sequence numbers as seen.
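For illustration only, the merge rule described above could be sketched as follows. The struct layout and the merge_block() helper are hypothetical. Treating a timestamp of 0 as "forgotten, contributes nothing" is an assumption of this sketch, and plain integer comparison stands in for the wrap-safe jiffies helpers a kernel implementation would need.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 2 /* illustrative block size: 2 * 64 sequence numbers */

struct seq_block {
	uint64_t time;              /* creation tick; 0 means forgotten */
	uint64_t bits[BLOCK_WORDS]; /* one bit per seen sequence number */
};

/* Merge src into dst: keep the older timestamp and the union of seen bits. */
static void merge_block(struct seq_block *dst, const struct seq_block *src)
{
	unsigned int i;

	if (src->time == 0)
		return; /* assumption: a forgotten block has nothing to merge */

	if (dst->time == 0) {
		/* assumption: a forgotten dst holds stale bits, so take src as is */
		memcpy(dst, src, sizeof(*dst));
		return;
	}

	/* The minimum timestamp makes the merged block expire no later
	 * than either of its inputs would have.
	 */
	if (src->time < dst->time)
		dst->time = src->time;

	/* OR the bitmaps so every sequence number seen by either stays seen. */
	for (i = 0; i < BLOCK_WORDS; i++)
		dst->bits[i] |= src->bits[i];
}
```

Taking the minimum timestamp errs on the side of forgetting early, which matches the stated goal of accepting an occasional duplicate rather than rejecting a legitimate frame.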
>
> All of this still happens under seq_out_lock, to prevent concurrent
> access to the blocks.
>
> The KUnit test for the algorithm is updated as well. The updates are done
> in a way that matches the original intent pretty closely. Currently, there is
> much knowledge about the actual algorithm baked into the tests (especially
> the expectations), which may need some redesign in the future.
>
> Reported-by: Steffen Lindner <steffen.lindner@de.abb.com>
> Fixes: 05fd00e5e7b1 ("net: hsr: Fix PRP duplicate detection")

I'm sorry for nitpicking, but it looks like the current guidance is to
avoid the Fixes tag for this kind of resilience-improving refactor:

https://lore.kernel.org/netdev/20260121171051.039110c3@kernel.org/

> @@ -526,18 +613,21 @@ int hsr_register_frame_out(struct hsr_port *port, struct hsr_frame_info *frame)
>   */
>  int prp_register_frame_out(struct hsr_port *port, struct hsr_frame_info *frame)
>  {
> -	enum hsr_port_type other_port;
> -	enum hsr_port_type rcv_port;
> +	u16 sequence_nr, seq_bit, block_idx;
> +	struct hsr_seq_block *block;
>  	struct hsr_node *node;
> -	u16 sequence_diff;
> -	u16 sequence_exp;
> -	u16 sequence_nr;
>
> -	/* out-going frames are always in order
> -	 * and can be checked the same way as for HSR
> -	 */
> -	if (frame->port_rcv->type == HSR_PT_MASTER)
> -		return hsr_register_frame_out(port, frame);
> +	node = frame->node_src;
> +	sequence_nr = frame->sequence_nr;
> +
> +	// out-going frames are always in order
Please use /* */ for comments.
/P