From: mawupeng <mawupeng1@huawei.com>
To: <syzbot+listbf7b8eeeb8dda31d6de1@syzkaller.appspotmail.com>,
<linux-kernel@vger.kernel.org>, <syzkaller-bugs@googlegroups.com>,
<virtualization@lists.linux.dev>, <mst@redhat.com>,
<jasowang@redhat.com>, <xuanzhuo@linux.alibaba.com>,
<eperezma@redhat.com>, <stefanha@redhat.com>,
<sgarzare@redhat.com>, <davem@davemloft.net>,
<edumazet@google.com>, <kuba@kernel.org>, <pabeni@redhat.com>,
<horms@kernel.org>
Cc: <mawupeng1@huawei.com>, <virtualization@lists.linux.dev>,
<kvm@vger.kernel.org>, <netdev@vger.kernel.org>
Subject: Re: [syzbot] Monthly virt report (Jun 2026)
Date: Thu, 2 Jul 2026 10:55:21 +0800
Message-ID: <d1e9d389-affa-4d12-aaf7-3fcf3218966c@huawei.com>
In-Reply-To: <6a3c3ed6.80e5668d.5d0ef.0001.GAE@google.com>
On Thu 2026-6-25 04:32, syzbot wrote:
> Hello virt maintainers/developers,
>
> This is a 31-day syzbot report for the virt subsystem.
> All related reports/information can be found at:
> https://syzkaller.appspot.com/upstream/s/virt
>
> During the period, 0 new issues were detected and 0 were fixed.
> In total, 5 issues are still open and 61 have already been fixed.
> There are also 2 low-priority issues.
>
> Some of the still happening issues:
>
> Ref Crashes Repro Title
> <1> 24 No WARNING: refcount bug in call_timer_fn (4)
> https://syzkaller.appspot.com/bug?extid=07dcf509f4c013e25dc5
> <2> 3 Yes memory leak in __vsock_create (2)
> https://syzkaller.appspot.com/bug?extid=1b2c9c4a0f8708082678

Hi,

This is regarding the still-open "memory leak in __vsock_create (2)"
bug (#2 in the monthly virt report, extid 1b2c9c4a0f8708082678):

https://syzkaller.appspot.com/bug?extid=1b2c9c4a0f8708082678

I spent some time analyzing the root cause and the previous fix
attempt; below is a summary and a direction that held up in testing.

== Root cause ==
The leaked object is the child socket created by
virtio_transport_recv_listen() via __vsock_create() — exactly the
allocation site kmemleak points at. The reason it never gets freed is
in the accept() error path, not in the allocation itself.
When vsock_accept() dequeues a child but the listener carries an error
(listener->sk_err, e.g. set by a failed connect() issued on the socket
before listen()), it sets vconnected->rejected = true, skips
sock_graft(), drops the dequeue reference and *relies on
vsock_pending_work()* to clean the child up.
The catch: vsock_pending_work() is never scheduled on the transports
involved here. It is only ever scheduled by vmci_transport
(vmci_transport.c:1130); virtio_transport and vsock_loopback never
schedule it. So the rejected child sits with an unreleased initial
reference (the one from sk_alloc()) plus the connected-table
reference, vsock_sk_destruct() is never reached, and the cascade —
child socket, struct cred, virtio transport, SELinux blob — all leak.
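
To make the reference accounting concrete, here is a minimal userspace model of the reject path described above. It is only a sketch: `struct model_sock` and every `model_*` function are hypothetical names invented for illustration, and plain integers stand in for the kernel's reference counting. It assumes the child holds three references when it is dequeued (sk_alloc(), connected table, accept queue) and that pending work never runs.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the child socket; not a kernel type. */
struct model_sock {
	int refcnt;      /* models the socket reference count */
	bool destructed; /* models vsock_sk_destruct() having run */
	bool rejected;   /* models vconnected->rejected */
};

static void model_hold(struct model_sock *sk)
{
	sk->refcnt++;
}

static void model_put(struct model_sock *sk)
{
	if (--sk->refcnt == 0)
		sk->destructed = true;
}

/* Child as queued for accept(): one initial reference plus two holds. */
static void model_child_queued(struct model_sock *sk)
{
	sk->refcnt = 1;          /* sk_alloc() */
	sk->destructed = false;
	sk->rejected = false;
	model_hold(sk);          /* connected-table insertion */
	model_hold(sk);          /* vsock_enqueue_accept() */
}

/* Reject path as described: flag the child, drop one reference, defer the rest. */
static void model_reject_current(struct model_sock *sk)
{
	sk->rejected = true;     /* cleanup is left to pending work */
	model_put(sk);           /* the dequeue reference */
	/* Pending work is never scheduled in this model, so nothing else runs. */
}

/* Returns the references left on a rejected child; nonzero means it leaked. */
int model_leaked_refs(void)
{
	struct model_sock child;

	model_child_queued(&child);
	model_reject_current(&child);
	return child.destructed ? 0 : child.refcnt;
}
```

Calling `model_leaked_refs()` from a small `main()` shows two references still outstanding, which in this model corresponds to the destructor never running.
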
The earlier commit 3a5cc90a4d17 ("vsock/virtio: remove socket from
connected/bound list on shutdown") adds an unconditional
vsock_remove_sock() in virtio_transport_recv_connected() when a
SHUTDOWN arrives. That drops the connected-table reference for a
child that later receives a SHUTDOWN, but it does not release the
sk_alloc() reference. So the leak is not really a regression
introduced there: rejected children have never been cleaned up on
transports that don't schedule pending_work.

What 3a5cc90a4d17 mainly changes is whether kmemleak can see the
leak. On v6.6 it can (the cascade shows up). On mainline, the smaller
struct sock layout leaves a residual pointer inside the child that
kmemleak counts as a reachable reference, so kmemleak stays silent
even though create/destruct accounting confirms the child never
reaches vsock_sk_destruct().
== Why the previous attempt didn't land ==
Divya's patch [1] tried to fix it by re-locking the parent listener
inside virtio_transport_recv_listen() and re-checking the shutdown
state under that lock before vsock_enqueue_accept(). That re-locks an
already-held lock — virtio_transport_recv_pkt() holds lock_sock(sk)
across the call into recv_listen() — and syzbot ci immediately flagged
"possible recursive locking" [2]. So it was backed out and the bug
stayed open.
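
As a loose userspace analogy for why taking a lock that the caller already holds is a problem, the sketch below uses a POSIX error-checking mutex. This is an assumption-laden illustration only: lock_sock() is not a pthread mutex, and in the kernel the condition was reported by lockdep rather than by a return code. `relock_held_mutex()` is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Locks an error-checking mutex twice from the same thread and returns
 * the result of the second attempt.
 */
int relock_held_mutex(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	assert(ret == 0);
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(ret == 0);
	ret = pthread_mutex_init(&lock, &attr);
	assert(ret == 0);
	pthread_mutexattr_destroy(&attr);

	/* Outer holder: plays the role of the receive path taking the lock. */
	ret = pthread_mutex_lock(&lock);
	assert(ret == 0);

	/* Inner re-lock: plays the role of the callee locking the same object. */
	ret = pthread_mutex_lock(&lock);

	pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return ret;
}
```

With an error-checking mutex the second attempt fails with EDEADLK instead of hanging; a default mutex would typically deadlock at that point.
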
== A direction that tests out ==
Instead of re-locking in the receive path, handle the cleanup directly
in vsock_accept(): on reject, instead of setting vconnected->rejected
and relying on pending_work, explicitly release the child's references
there:

    if (err) {
        vsock_remove_connected(vconnected);  /* connected-table ref */
        connected->sk_state = TCP_CLOSE;
        sock_put(connected);                 /* enqueue_accept ref */
    } else {
        sock_graft(connected, newsock);
    }
    ...
    sock_put(connected);  /* the existing, common put: sk_alloc ref */

This drops exactly the three references the child holds at dequeue
time (sk_alloc + __vsock_insert_connected + vsock_enqueue_accept),
lets refcount reach zero and vsock_sk_destruct() run. The `rejected`
flag and its pending_work handling can then be removed. The receive
path is not touched, so there is no re-locking and no deadlock.
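
The accounting for the proposed branch can be checked with a small userspace model. This is a sketch under stated assumptions, not kernel code: `struct model_sock` and the `model_*` functions are hypothetical, and the model assumes exactly three references on the child at dequeue time, as listed above.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the child socket; not a kernel type. */
struct model_sock {
	int refcnt;              /* models the socket reference count */
	bool destructed;         /* models vsock_sk_destruct() having run */
	bool in_connected_table; /* models membership in the connected table */
};

static void model_put(struct model_sock *sk)
{
	if (--sk->refcnt == 0)
		sk->destructed = true;
}

/* Child at dequeue time: sk_alloc + connected table + accept queue. */
void model_child_queued(struct model_sock *sk)
{
	sk->refcnt = 3;
	sk->destructed = false;
	sk->in_connected_table = true;
}

/*
 * Models the proposed branch in vsock_accept(); err != 0 means the
 * listener carried an error and the child is rejected.
 * Returns the references left afterwards.
 */
int model_accept(struct model_sock *sk, int err)
{
	if (err) {
		sk->in_connected_table = false;
		model_put(sk);   /* models vsock_remove_connected() */
		model_put(sk);   /* models the extra sock_put() on reject */
	}
	/* On success, sock_graft() hands the child over; no put in this branch. */

	model_put(sk);           /* models the existing, common sock_put() */
	return sk->refcnt;
}
```

In this model a rejected child ends at zero references with the destructor flag set, while an accepted child keeps two references and stays alive, which matches the intent that only the reject path changes.
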

I verified this on ARM64 QEMU. On Linux v6.6.y (where kmemleak can
see the leak) with the syzbot reproducer:

  - before: 6 creates / 4 destructs (2 leaked); kmemleak reports the
    cascade;
  - after: 6 creates / 6 destructs (0 leaked); kmemleak clean;
  - 50-iteration normal-server and 50-iteration same-port-reconnect
    tests both pass 50/50 with zero leaks and no double-put warnings.

On mainline, kmemleak stays silent (see above) but create/destruct
accounting confirms the same leak before the fix; the fix is
code-identical across v6.6.y and mainline (same recv_listen/accept
paths).

I don't follow the list at full volume; happy to send a formal patch
(with the af_vsock.h / pending_work changes folded in) if the
direction looks right to the maintainers.

== Trigger, for completeness ==
The reproducer's atypical-but-legal sequence is what sets
listener->sk_err: a socket is connect()ed (leaving sk_err set, since
vsock_connect() only clears it at the start of a new connect) and then
turned into a listener:

    fd = socket(AF_VSOCK, SOCK_STREAM, 0);
    bind(fd, ...);
    connect(fd, &(VMADDR_CID_LOCAL, ...));  /* leaves sk_err set */
    listen(fd, 5);
    /* a peer connects to fd; the child created is later rejected */
    accept4(fd, ...);

Standard servers (listen before any connect on the same fd) don't hit
it, which is why this went ~2.5 years between the offending commit and
the syzbot report.
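
For readers who want to try the listener-side half of that sequence, here is a compilable userspace sketch. It is not the syzbot reproducer: the function name `listener_after_failed_connect()`, the choice of VMADDR_CID_ANY for bind(), and both port parameters are assumptions made for illustration. The connect() is expected to fail (for example because nothing listens on the target port); the peer connection and the accept4() call are left out.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

#ifndef VMADDR_CID_LOCAL
#define VMADDR_CID_LOCAL 1U /* fallback for older uapi headers */
#endif

/*
 * bind(), then a connect() that is expected to fail, then listen() on
 * the same fd. Returns the listening fd, or -1 with errno set.
 * bind_port and dead_port are hypothetical, caller-chosen values.
 */
int listener_after_failed_connect(unsigned int bind_port,
				  unsigned int dead_port)
{
	struct sockaddr_vm addr;
	int saved_errno;
	int fd;

	fd = socket(AF_VSOCK, SOCK_STREAM, 0);
	if (fd < 0)
		return -1; /* e.g. AF_VSOCK not available on this system */

	memset(&addr, 0, sizeof(addr));
	addr.svm_family = AF_VSOCK;
	addr.svm_cid = VMADDR_CID_ANY; /* assumption: bind on any local CID */
	addr.svm_port = bind_port;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto fail;

	/* Expected to fail; the result is deliberately ignored. */
	addr.svm_cid = VMADDR_CID_LOCAL;
	addr.svm_port = dead_port;
	(void)connect(fd, (struct sockaddr *)&addr, sizeof(addr));

	if (listen(fd, 5) < 0)
		goto fail;

	return fd;

fail:
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

A caller would then have a peer connect to `bind_port` and call accept4() on the returned fd; on systems without vsock support the function simply returns -1.
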
[1] https://lore.kernel.org/all/20260605191922.12720-1-divyakm@unc.edu/
[2] https://ci.syzbot.org/series/76f40e62-5a21-46d4-a636-10f0ec9c5040
Thanks.
> <3> 3913 Yes INFO: rcu detected stall in do_idle
> https://syzkaller.appspot.com/bug?extid=385468161961cee80c31
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> To disable reminders for individual bugs, reply with the following command:
> #syz set <Ref> no-reminders
>
> To change bug's subsystems, reply with:
> #syz set <Ref> subsystems: new-subsystem
>
> You may send multiple commands in a single email message.