public inbox for netfilter@vger.kernel.org
* netfilter: nf_conncount: cpu soft lockup using limiting with Open vSwitch.
From: Rukomoinikova Aleksandra @ 2025-12-08 12:31 UTC
  To: netfilter@vger.kernel.org
  Cc: Odintsov Vladislav, Kovalev Evgeniy, Vazhnetsov Anton,
	ovs-dev@openvswitch.org

Hi!
I was testing conntrack limiting with Open vSwitch and noticed the
following issue: under certain limits, a CPU soft lockup occurred.

[  491.682936] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [ovs-dpctl:19437]

This occurs at a high packet rate while reading the configured limits
with ovs-dpctl ct-get-limits.

In the trace, I can see that the lockup occurred while trying to acquire
a spinlock:

[  491.683056]  <IRQ>
[  491.683059]  _raw_spin_lock_bh+0x29/0x30
[  491.683064]  count_tree+0x19b/0x1f0 [nf_conncount]
[  491.683069]  ovs_ct_commit+0x196/0x490 [openvswitch]

Before that, the trace shows that the CPU was executing a userspace task
(ovs-dpctl):

[  491.683236]  </IRQ>
[  491.683237]  <TASK>
[  491.683238]  asm_common_interrupt+0x22/0x40
[  491.683240] RIP: 0010:nf_conncount_gc_list+0x18a/0x200 [nf_conncount]

Inside the nf_conncount_gc_list function, the lock is taken at
nf_conncount.c:335 with spin_trylock_bh(&list->list_lock). After that,
the not-so-fast __nf_conncount_gc_list function runs. If a packet
interrupt arrives on the same CPU core at this moment (and
spin_trylock_bh doesn't disable interrupts on that core), then the
scenario I encountered occurs: the first lock remains held, while the
packet interrupt also attempts to acquire it at nf_conncount.c:502 with
spin_lock_bh(&rbconn->list.list_lock) while committing to conntrack.
This attempt never succeeds, leading to a soft lockup.
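
To illustrate the scenario described above, here is a minimal userspace
sketch, not kernel code. All names (toy_lock, toy_trylock,
query_path_interrupted, and so on) are hypothetical, and the "interrupt"
is modelled as a plain nested function call on the same thread. The spin
is bounded so the example terminates; a real spin_lock_bh() would spin
forever in the same situation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for list->list_lock; NOT the kernel spinlock_t. */
struct toy_lock {
	atomic_bool held;
};

static void toy_lock_init(struct toy_lock *l)
{
	atomic_init(&l->held, false);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(struct toy_lock *l)
{
	return !atomic_exchange_explicit(&l->held, true,
					 memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	assert(atomic_load(&l->held)); /* unlocking a free lock is a bug */
	atomic_store_explicit(&l->held, false, memory_order_release);
}

/* Spin up to max_spins times; the real lock has no such bound. */
static bool toy_lock_bounded(struct toy_lock *l, unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++)
		if (toy_trylock(l))
			return true;
	return false;
}

/* Models the packet path that runs on top of the interrupted holder. */
static bool nested_packet_path(struct toy_lock *l, unsigned long max_spins)
{
	bool ok = toy_lock_bounded(l, max_spins);

	if (ok)
		toy_unlock(l);
	return ok;
}

/*
 * Models the query path: take the lock with trylock, then get
 * "interrupted" before releasing it. Returns whether the nested
 * acquisition succeeded.
 */
static bool query_path_interrupted(struct toy_lock *l,
				   unsigned long max_spins)
{
	bool nested_ok = false;

	if (toy_trylock(l)) {
		/* The holder cannot make progress until this call returns. */
		nested_ok = nested_packet_path(l, max_spins);
		toy_unlock(l);
	}
	return nested_ok;
}
```

In this model the nested attempt fails however large max_spins is,
because the holder cannot release the lock until the nested call
returns.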

Hence my question: shouldn't we avoid calling nf_conncount_gc_list when
querying limits without an skb (as OVS does in
openvswitch/conntrack.c:1773)? The limit retrieval operation should be
read-only with respect to the conntrack state and should not involve
potential modification.

Like this:
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -495,7 +495,6 @@ count_tree(struct net *net,
              int ret;

              if (!skb) {
-                nf_conncount_gc_list(net, &rbconn->list);
                  return rbconn->list.count;
              }
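
As a rough illustration of what the change above trades off, here is a
small userspace sketch, again not kernel code. The types and helpers
(toy_list, toy_count_with_gc, toy_count_readonly) are hypothetical and
only model the idea: a query that garbage-collects first writes to the
list, while a read-only query leaves it untouched but may report
entries that are already dead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CONNS 8

/* Hypothetical model of one tracked connection. */
struct toy_conn {
	bool alive;
};

/* Hypothetical model of a per-key connection list with a cached count. */
struct toy_list {
	struct toy_conn conns[TOY_MAX_CONNS];
	size_t count;
};

static void toy_add(struct toy_list *l, bool alive)
{
	assert(l->count < TOY_MAX_CONNS);
	l->conns[l->count++].alive = alive;
}

/* Query that drops dead entries first, so it modifies the list. */
static size_t toy_count_with_gc(struct toy_list *l)
{
	size_t kept = 0;

	for (size_t i = 0; i < l->count; i++)
		if (l->conns[i].alive)
			l->conns[kept++] = l->conns[i];
	l->count = kept;
	return l->count;
}

/* Read-only query: returns the cached count, possibly stale. */
static size_t toy_count_readonly(const struct toy_list *l)
{
	return l->count;
}
```

In this model, a list holding two live entries and one dead entry
reports 3 from the read-only query until something else runs the
garbage collection, after which both queries report 2.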

Thanks!

-- 
regards,
Alexandra.


