From: Felix Maurer <fmaurer@redhat.com>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, jkarrenpalo@gmail.com,
tglx@linutronix.de, mingo@kernel.org,
allison.henderson@oracle.com, matttbe@kernel.org,
petrm@nvidia.com, bigeasy@linutronix.de
Subject: [RFC net 0/6] hsr: Implement more robust duplicate discard algorithm
Date: Mon, 22 Dec 2025 21:57:30 +0100
Message-ID: <cover.1766433800.git.fmaurer@redhat.com>
The PRP duplicate discard algorithm does not work reliably with certain
link faults. In particular, with packet loss on one link, it drops valid
packets. For a more thorough description, see patch 5.
My suggestion is to replace the current drop-window-based algorithm
with a new one that tracks the received sequence numbers individually
(also described in patch 5). I am sending this as an RFC to gather
feedback mainly on two points:
1. Is the design generally acceptable? Of course, this change leads to
   higher memory usage and more work for each packet. But I argue
   that this is an acceptable trade-off for more robust PRP
   behavior with faulty links. After all, PRP is meant to be used in
   environments where redundancy is needed and people are ready to
   maintain two duplicate networks to achieve it.
2. As the tests added in patch 6 show, HSR is subject to similar
   problems. I do not see a reason not to use a very similar algorithm
   for HSR as well (with a bitmap for each port). Any objections to
   doing that (in a later patch series)? This will make the memory
   trade-off more pronounced, as the hsr_seq_block will grow by
   three more bitmaps, at least for each HSR node (of which we do not
   expect too many, as an HSR ring cannot be infinitely large).
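To make the idea of tracking sequence numbers individually more concrete,
here is a minimal userspace sketch. It is not the code from patch 5: the
names (seq_tracker, seq_block, seq_is_duplicate), the block size, and the
number of blocks are assumptions chosen for illustration, and aging of old
blocks is reduced to simply reusing a slot when a new block maps to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative values only, not taken from the patch. */
#define SEQ_BLOCK_BITS 64u	/* sequence numbers covered by one block */
#define SEQ_BLOCKS     4u	/* blocks kept per node */

struct seq_block {
	bool     used;
	uint16_t index;	/* seq / SEQ_BLOCK_BITS of the covered range */
	uint64_t seen;	/* bit i set: sequence number (index * 64 + i) received */
};

struct seq_tracker {
	struct seq_block blocks[SEQ_BLOCKS];
};

static void seq_tracker_init(struct seq_tracker *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
}

/* Returns true if a frame with this sequence number was already seen. */
static bool seq_is_duplicate(struct seq_tracker *t, uint16_t seq)
{
	uint16_t index = seq / SEQ_BLOCK_BITS;
	uint64_t bit = UINT64_C(1) << (seq % SEQ_BLOCK_BITS);
	struct seq_block *b = &t->blocks[index % SEQ_BLOCKS];

	/* Slot unused or holding another range: start a fresh bitmap. */
	if (!b->used || b->index != index) {
		b->used = true;
		b->index = index;
		b->seen = 0;
	}

	if (b->seen & bit)
		return true;

	b->seen |= bit;
	return false;
}
```

With such a tracker, a frame that was lost on one LAN and arrives late on
the other (for example, sequence number 11 arriving after 12 and 13) is
still accepted, because only the exact sequence numbers already seen are
treated as duplicates.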
Most of the patches in this series are for the selftests. They mainly
serve to demonstrate the problems with the current duplicate discard
algorithms rather than to gather feedback. Patches 1 and 2 in particular
are preparatory cleanups that have little to do with the actual problems
the new algorithm tries to solve.
A few points that I know are not yet addressed:
- HSR duplicate discard (see above).
- The KUnit test is not updated for the new algorithm. I will work on
  that before actual patch submission.
- Merging the sequence number blocks when two entries in the node table
  are merged because they belong to the same node.
Thank you in advance for your feedback!
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
---
Felix Maurer (6):
  selftests: hsr: Add ping test for PRP
  selftests: hsr: Check duplicates on HSR with VLAN
  selftests: hsr: Add tests for faulty links
  selftests: hsr: Add tests for more link faults with PRP
  hsr: Implement more robust duplicate discard for PRP
  selftests: hsr: Add more link fault tests for HSR

 net/hsr/hsr_framereg.c                        | 181 ++++++---
 net/hsr/hsr_framereg.h                        |  24 +-
 tools/testing/selftests/net/hsr/Makefile      |   2 +
 tools/testing/selftests/net/hsr/hsr_ping.sh   | 198 +++------
 .../testing/selftests/net/hsr/link_faults.sh  | 376 ++++++++++++++++++
 tools/testing/selftests/net/hsr/prp_ping.sh   | 141 +++++++
 tools/testing/selftests/net/hsr/settings      |   2 +-
 7 files changed, 714 insertions(+), 210 deletions(-)
 create mode 100755 tools/testing/selftests/net/hsr/link_faults.sh
 create mode 100755 tools/testing/selftests/net/hsr/prp_ping.sh
--
2.52.0