From: "Cássio Gabriel Monteiro Pires" <cassiogabrielcontato@gmail.com>
To: Tung Quang Nguyen <tung.quang.nguyen@est.tech>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"tipc-discussion@lists.sourceforge.net"
	<tipc-discussion@lists.sourceforge.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"syzbot+aa7d098bd6fa788fae8e@syzkaller.appspotmail.com"
	<syzbot+aa7d098bd6fa788fae8e@syzkaller.appspotmail.com>,
	Jon Maloy <jmaloy@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>
Subject: Re: [PATCH net] tipc: avoid sending zero-length stream messages
Date: Wed, 6 May 2026 22:52:18 -0300
Message-ID: <dfd8fb00-29de-4151-86a7-307a7c721f7d@gmail.com>
In-Reply-To: <GV1P189MB1988FEBC4D7BA3F00E210774C63F2@GV1P189MB1988.EURP189.PROD.OUTLOOK.COM>


Hi!

On 5/6/26 03:41, Tung Quang Nguyen wrote:
>> Subject: [PATCH net] tipc: avoid sending zero-length stream messages
>>
>> TIPC stream send currently enters the transmit loop even when the user
>> payload length is zero. This can build and transmit a header-only connection
>> message.
>>
>> For local TIPC sockets, such messages are delivered synchronously through the
>> loopback receive path. When this happens while socket backlog processing is
>> being flushed, reply transmission can re-enter TIPC receive processing
>> repeatedly and trigger an RCU stall.
>>
> Can you demonstrate this scenario using code? It would be better to point out which part of the current code is faulty.

The minimized user-visible trigger is essentially:

      int fd[2], i;
      /* Assumed value: any fd that is not a valid BPF program. */
      int bad_fd = -1;
      struct msghdr msg = {};

      /* Connected pair of local TIPC stream sockets. */
      socketpair(AF_TIPC, SOCK_STREAM, 0, fd);

      /* In parallel, this makes release_sock() flush backlog. */
      setsockopt(fd[0], SOL_SOCKET, SO_ATTACH_BPF, &bad_fd,
                 sizeof(bad_fd));

      /* Repeated zero-length MSG_PROBE send on the connected peer. */
      for (i = 0; i < 64; i++)
              sendmsg(fd[1], &msg, MSG_PROBE | MSG_MORE);

The faulty current-code path is that TIPC stream send does not handle
MSG_PROBE before entering __tipc_sendstream(). MSG_PROBE is supposed to
probe without transmitting data, but the call reaches __tipc_sendstream()
with dlen == 0.

__tipc_sendstream() uses a do/while loop, so even when dlen is 0 the body
runs once:

      send = min_t(size_t, dlen - sent, TIPC_MAX_USER_MSG_SIZE);

At that point send is 0, but the code can still call tipc_msg_append() or
tipc_msg_build(), creating a TIPC connection message with only the header.
It then calls:

      tipc_node_xmit(net, txq, dnode, tsk->portid);
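
The "body runs once" behavior is a property of do/while itself and can be
checked outside the kernel. The following is a minimal userspace sketch of
the loop shape only, not the kernel code: sketch_loop_iterations() and
SKETCH_MAX_CHUNK are hypothetical names, and the chunk limit is an arbitrary
stand-in for TIPC_MAX_USER_MSG_SIZE.

```c
#include <stddef.h>

/* Hypothetical stand-in for TIPC_MAX_USER_MSG_SIZE; the value is arbitrary. */
#define SKETCH_MAX_CHUNK 66000u

/*
 * Userspace model of the loop shape only: count how many times the body
 * of a do/while chunking loop runs for a payload of dlen bytes.
 */
unsigned int sketch_loop_iterations(size_t dlen)
{
	size_t sent = 0;
	unsigned int iterations = 0;

	do {
		size_t send = dlen - sent;

		if (send > SKETCH_MAX_CHUNK)
			send = SKETCH_MAX_CHUNK;

		/* The kernel code would build and transmit a message here. */
		iterations++;
		sent += send;
	} while (sent < dlen);

	return iterations;
}
```

sketch_loop_iterations(0) returns 1, because the condition is only evaluated
after the first pass. That is the pass in which a header-only message can be
built.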

For a local TIPC socketpair, tipc_node_xmit() takes the in_own_node() path
and synchronously calls tipc_sk_rcv(). When this happens while
release_sock() is processing backlog, the receive path can generate
response traffic through tipc_node_distr_xmit(), which re-enters the same
local receive path.

I should have made that explicit in the changelog and pointed at the
missing MSG_PROBE handling as the faulty part.

>>
>> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
>> index 9329919fb07f..3c7838713d74 100644
>> --- a/net/tipc/socket.c
>> +++ b/net/tipc/socket.c
>> @@ -1585,6 +1585,8 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
>> 					 tipc_sk_connected(sk)));
>> 		if (unlikely(rc))
>> 			break;
>> +		if (unlikely(!dlen && sk->sk_type == SOCK_STREAM))
>> +			break;
> This change is wrong. It immediately breaks normal connection setup because the zero-length ACK has no chance to be sent back from the server to the client.
> Please try to test your patch before submission.

I did test the patch with the syzkaller C repro under QEMU for 10 minutes, and
it did not trigger the reported RCU stall (log attached as
patch_v1_test_log.txt):

      /tmp/repro & pid=$!; sleep 600; kill $pid
      dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'

The dmesg check did not show any repro-triggered RCU stall, soft lockup,
panic, BUG, or WARNING. But that test only covered the syzkaller trigger;
it did not cover normal active/passive TIPC stream connection setup, which
your review points out is broken by this version.

I re-checked the TIPC connection setup path as well.

tipc_accept() intentionally sends the server-side ACK as a zero-length
stream message:

      iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0);
      __tipc_sendstream(new_sock, &m, 0);

So blocking all zero-length sends inside __tipc_sendstream() prevents
that ACK from being transmitted and can break normal SOCK_STREAM
connection setup.

After re-checking the syzkaller repro, the real trigger seems to be narrower
than zero-length stream sends in general. The repro uses a user sendmsg() with
MSG_PROBE | MSG_MORE and no payload on an already connected TIPC stream
socket. MSG_PROBE is supposed to probe without sending, but TIPC stream
send currently lets that path reach __tipc_sendstream(), where the
do/while body can still run once with dlen == 0 and build/transmit a
header-only message.

I think we should avoid suppressing the internal __tipc_sendstream() ACK path
and instead handle the user-originated zero-length MSG_PROBE case before it
reaches the internal stream send helper.

The v2 fix would look like this:

-- 8< --

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 9329919fb07f..4783df337971 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1542,6 +1542,10 @@ static int tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz)
        struct sock *sk = sock->sk;
        int ret;
 
+       /* MSG_PROBE asks only to probe the path, not to transmit data. */
+       if (unlikely((m->msg_flags & MSG_PROBE) && !dsz))
+               return 0;
+
        lock_sock(sk);
        ret = __tipc_sendstream(sock, m, dsz);
        release_sock(sk);
-- >8 --
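
To make the difference between the v1 and v2 guards concrete, here is a small
userspace model of the two checks. It is a sketch, not kernel code: every
sketch_* name is hypothetical, the socket is assumed to be SOCK_STREAM, and
SKETCH_MSG_PROBE mirrors the kernel's MSG_PROBE value (0x10) under a local
name because userspace headers may not export it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as MSG_PROBE in the kernel's include/linux/socket.h. */
#define SKETCH_MSG_PROBE 0x10

enum sketch_caller {
	SKETCH_USER_SENDMSG,	/* enters through tipc_sendstream() */
	SKETCH_ACCEPT_ACK,	/* tipc_accept() calls __tipc_sendstream() directly */
};

/*
 * v1 guard: placed inside __tipc_sendstream(), so every caller hits it.
 * Any zero-length send on a stream socket is suppressed.
 */
bool sketch_v1_suppresses(enum sketch_caller caller, int flags, size_t dlen)
{
	(void)caller;
	(void)flags;
	return dlen == 0;
}

/*
 * v2 guard: placed in tipc_sendstream(), so only user sendmsg() calls
 * reach it, and only a zero-length MSG_PROBE send is suppressed.
 */
bool sketch_v2_suppresses(enum sketch_caller caller, int flags, size_t dlen)
{
	if (caller != SKETCH_USER_SENDMSG)
		return false;
	return (flags & SKETCH_MSG_PROBE) && dlen == 0;
}
```

In this model the tipc_accept() ACK (SKETCH_ACCEPT_ACK, flags 0, dlen 0) is
suppressed by v1 but not by v2, while a user send with SKETCH_MSG_PROBE and
dlen 0 is suppressed by both.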

I tested the reworked patch with the syzkaller C reproducer under QEMU,
running the reproducer for 10 minutes (log attached as
patch_v2_test_log.txt):

      /tmp/repro & pid=$!; sleep 600; kill $pid
      dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'

The grep only matched boot-time command-line/debug messages; no
repro-triggered RCU stall, soft lockup, panic, BUG, or WARNING appeared.
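
One way to separate boot-time matches from later ones mechanically is to
compare each line's dmesg timestamp against the time the reproducer was
started. The helper below is a rough sketch under that assumption: the
sketch_* names are hypothetical, and the start time has to be supplied by
the caller because it is not recorded in the attached logs.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse the leading "[ seconds.micros]" stamp of a dmesg line.
 * Returns true and stores the time when the line starts with '['
 * followed by a number, false otherwise.
 */
bool sketch_dmesg_stamp(const char *line, double *seconds)
{
	return sscanf(line, " [%lf]", seconds) == 1;
}

/* True if the line carries a stamp at or after start_seconds. */
bool sketch_logged_after(const char *line, double start_seconds)
{
	double t;

	return sketch_dmesg_stamp(line, &t) && t >= start_seconds;
}
```

For example, a line stamped 201.556741 compared against a hypothetical start
time of 250.0 would be treated as boot-time output and skipped.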

What do you think?

[-- Attachment #1.1.2: patch_v1_test_log.txt --]
[-- Type: text/plain, Size: 2365 bytes --]

# dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'
[    0.000000][    T0]   net.ifnames=0 panic_on_warn=1
[    0.000000][    T0] Kernel command line: earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 \
[    0.000000][    T0] Kernel command line: comedi.comedi_num_legacy_minors=4 panic_on_warn=1 console=ttyS0 root=/dev/vda1 rootfstype=ext4 rw earlyprintk=serial
[    0.000000][    T0]   net.ifnames=0 panic_on_warn=1
[    0.000000][    T0] ** If you see this message and you are not debugging    **
[    0.000000][    T0] rcu:     RCU callback double-/use-after-free debug is enabled.
[    0.000000][    T0] rcu:     RCU debug extended QS entry/exit.
[   10.704615][    T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[   21.826838][    T1] orangefs_debugfs_init: called with debug mask: :none: :0:
[   22.032237][    T1] SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
[   77.296497][    T1] usbcore: registered new interface driver usb_debug
[   77.309604][    T1] usbserial: USB Serial support registered for debug
[  114.238149][    T1] pvrusb2: Debug mask is 31 (0x1f)
[  181.100641][    T1] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[  201.556741][    T1] Failed to set sysctl parameter 'max_rcu_stall_to_panic=1': parameter not found
[1]+  Terminated                 /tmp/repro

[-- Attachment #1.1.3: patch_v2_test_log.txt --]
[-- Type: text/plain, Size: 2419 bytes --]

# dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'
[    0.000000][    T0] Command line: console=ttyS0 root=/dev/vda1 rootfstype=ext4 rw earlyprintk=serial net.ifnames=0 panic_on_warn=1
[    1.462430][    T0] Kernel command line: earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 \
[    1.470761][    T0] Kernel command line: comedi.comedi_num_legacy_minors=4 panic_on_warn=1 console=ttyS0 root=/dev/vda1 rootfstype=ext4 rw earlyprintk=serial net.ifnames=0 panic_on_warn=1
[    3.155914][    T0] ** If you see this message and you are not debugging    **
[    3.813298][    T0] rcu:     RCU callback double-/use-after-free debug is enabled.
[    3.814645][    T0] rcu:     RCU debug extended QS entry/exit.
[   17.096163][    T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[   28.566521][    T1] orangefs_debugfs_init: called with debug mask: :none: :0:
[   28.796190][    T1] SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
[   84.486523][    T1] usbcore: registered new interface driver usb_debug
[   84.504286][    T1] usbserial: USB Serial support registered for debug
[  114.419251][    T1] pvrusb2: Debug mask is 31 (0x1f)
[  179.396180][    T1] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[  185.907359][    T1] Failed to set sysctl parameter 'max_rcu_stall_to_panic=1': parameter not found
[1]+  Terminated                 /tmp/repro

Thread overview: 4+ messages
2026-05-06  5:13 [PATCH net] tipc: avoid sending zero-length stream messages Cássio Gabriel
2026-05-06  6:41 ` Tung Quang Nguyen
2026-05-07  1:52   ` Cássio Gabriel Monteiro Pires [this message]
2026-05-08 10:38     ` Tung Quang Nguyen
