From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B241126F2BD for ; Thu, 29 Jan 2026 15:30:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769700656; cv=none; b=sMCevNj2O9Yf4elGh6n72I6yDAuUxoSQ6ut6voQ+VPqf52xUsBWxiW5ziz00+MxhsXG71n71q3QYQozLSfsZi6VwPvLMoLgHbHx4oI1yODwCVPA1tTUGoHbe2FLsQCPpl29dZZ205mSNUMj/jLAj4PUQsj1gSnBHwi5Hd2icVu4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769700656; c=relaxed/simple; bh=apc65uLcuXfsP+/cbJQmxinXOEjJaefSjUQOV7/0VQ4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=WsRujeCkR9DRaJ2PbbBDEgr5v73ZATPv90OL2fyBkR393SsckukJ004aEC19TCUjT2fhwgnI6FeBp9NGKm5mHhVzmb+T/fOAiMUfhXZU59Wc+JSNaEV0nSTxTwQqviiZYhIMRZ/pn5UrqepG6xC9sWL4toRIp8w3mmkTDVzjs9c= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=HlYgVoHa; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="HlYgVoHa" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1769700653; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4ttgpfPTVqpXctCpgd8C7YpFbEsl5uP0oBMz3JbpSPM=; b=HlYgVoHaQBJOMKlU6iTSJ4n8k/kU9AtT+Hiiak07pUcKrR9ZluOs9HBLLn/cbtHJlqkg+q rkEl4H8jcEDP9gvg7sVrUv4/8i1MghRdXIUmgWzQGsKDeRkC9XbWiUviaKk32RIHzXSCyU IYB2S2BEWTWVK6cYXFwNhhLo4FVdCaM= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-658-RCnn6hUONY6YnVyg-HuIyQ-1; Thu, 29 Jan 2026 10:30:46 -0500 X-MC-Unique: RCnn6hUONY6YnVyg-HuIyQ-1 X-Mimecast-MFC-AGG-ID: RCnn6hUONY6YnVyg-HuIyQ_1769700640 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 19F48195606D; Thu, 29 Jan 2026 15:30:40 +0000 (UTC) Received: from thinkpad (unknown [10.44.34.121]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 12A1E3000218; Thu, 29 Jan 2026 15:30:35 +0000 (UTC) Date: Thu, 29 Jan 2026 16:30:32 +0100 From: Felix Maurer To: Sebastian Andrzej Siewior Cc: netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, horms@kernel.org, jkarrenpalo@gmail.com, tglx@linutronix.de, mingo@kernel.org, allison.henderson@oracle.com, petrm@nvidia.com, antonio@openvpn.net, Steffen Lindner Subject: Re: [PATCH net-next v2 4/9] hsr: Implement more robust duplicate discard for PRP Message-ID: References: <20260129132940.XvIWrwoG@linutronix.de> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20260129132940.XvIWrwoG@linutronix.de> X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 On Thu, Jan 29, 2026 at 02:29:40PM +0100, Sebastian Andrzej Siewior wrote: > On 2026-01-22 15:56:59 [+0100], Felix Maurer wrote: > > The PRP duplicate discard algorithm does not work reliably with certain > … > > The idea of the new algorithm is to store the seen sequence numbers in a > > bitmap. To keep the size of the bitmap in control, we store it as a "sparse > > bitmap" where the bitmap is split into blocks and not all blocks exist at > > the same time. The sparse bitmap is implemented using an xarray that keeps > > the references to the individual blocks and a backing ring buffer that > … > > My idea was to keep track of say the last 64 sequence numbers but I had > no idea how to efficiently implement it with all the "forget" parts and > so on. What you did here is quite nice. Thank you! > > @@ -280,6 +301,59 @@ struct hsr_node *hsr_get_node(struct hsr_port *port, struct list_head *node_db, > … > > +/* Get the currently active sequence number block. If there is no block yet, or > > + * the existing one is expired, a new block is created. The idea is to maintain > > + * a "sparse bitmap" where a bitmap for the whole sequence number space is > > + * split into blocks and not all blocks exist all the time. The blocks can > > + * expire after time (in low traffic situations) or when they are replaced in > > + * the backing fixed size buffer (in high traffic situations). > HSR_MAX_SEQ_BLOCKS, > > + */ > > +static struct hsr_seq_block *hsr_get_seq_block(struct hsr_node *node, > > + u16 block_idx) > > +{ > > + struct hsr_seq_block *block, *res; > > + > > + block = xa_load(&node->seq_blocks, block_idx); > > + > > + if (block && hsr_seq_block_is_old(block)) { > > + hsr_forget_seq_block(node, block); > > + block = NULL; > > + } > > + > > + if (!block) { > > + block = &node->block_buf[node->next_block]; > > + hsr_forget_seq_block(node, block); > > + > > + memset(block, 0, sizeof(*block)); > > + block->time = jiffies; > > So we assign ->time here while the block is created and it expires after > HSR_ENTRY_FORGET_TIME (400ms). *Maybe* it should be updated in > hsr_check_duplicate() if we set a bit. The difference would be that the > last sequence in the block would get it whole life time while now the > lifetime starts with the first entry in the block. The downside > of course would be that the first entry gets more than the initialy > expected lifetime. > Not sure if this matters in reality, just wanted to mention. I considered that as well. But given that the standard states that "since the 16-bit SeqNr wraps around after 65 536 frames [...], entries older than a specified time EntryForgetTime are purged" (IEC 62439-3:2021, 4.1.10.3) and later more explicitly that EntryForgetTime is the "maximum time an entry may reside in the duplicate table" (4.4), I decided that we rather go the safe way here and use the lifetime of the first entry. One can definitely construct a scenario where using the lifetime of the last entry is problematic: let's say we have a block with a single entry. If the sequence number wraps around and we reach the same block again after, let's say, 350ms but hit a different bit in the block, the new lifetime would apply to both entries, which would almost double the lifetime of the initial entry. The large span of 128 entries per timestamp could be mitigated by changing the size (and maybe number) of blocks. The 64 blocks with 128 entries each is a kind of arbitrary pick, but make a good tradeoff in my opinion. I'm making that tradeoff between memory usage, timestamp inaccuracy, and link delay we can deal with (with 64 * 128 = 2^13 entries out of the 2^16 sequence number space, i.e, 1/8th of the space, that's 5-6ms on a 1Gbit/s network where sequence number could repeat every 44ms according to the standard). > > + block->block_idx = block_idx; > > + > > + res = xa_store(&node->seq_blocks, block_idx, block, GFP_ATOMIC); > > + if (xa_is_err(res)) > > + return NULL; > > + > > + node->next_block = > > + (node->next_block + 1) & (HSR_MAX_SEQ_BLOCKS - 1); > > + } > > + > > + return block; > > +} > > + > > /* Use the Supervision frame's info about an eventual macaddress_B for merging > > * nodes that has previously had their macaddress_B registered as a separate > > * node. > > @@ -545,47 +631,26 @@ int prp_register_frame_out(struct hsr_port *port, struct hsr_frame_info *frame) > … > > + > > + seq_bit = hsr_seq_block_bit(sequence_nr); > > + if (test_and_set_bit(seq_bit, block->seq_nrs)) > > the access happens under ::seq_out_lock so you don't need to be atomic > here. You could safe a cycle and use __test_and_set_bit() instead. Good point, I tried to get rid of the lock at some point, but then decided that it would be material for another patchset. I'll change it to __test_and_set_bit(). Thanks, Felix > > + goto out_seen; > > > > node->time_out[port->type] = jiffies; > > node->seq_out[port->type] = sequence_nr; > > +out_new: > > spin_unlock_bh(&node->seq_out_lock); > > return 0; > > + > > +out_seen: > > + spin_unlock_bh(&node->seq_out_lock); > > + return 1; > > } > > > > #if IS_MODULE(CONFIG_PRP_DUP_DISCARD_KUNIT_TEST) > > Sebastian