public inbox for virtualization@lists.linux-foundation.org
From: Matthieu Baerts <matttbe@kernel.org>
To: Thomas Gleixner <tglx@kernel.org>,
	Jiri Slaby <jirislaby@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Cc: "Stefan Hajnoczi" <stefanha@redhat.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	kvm@vger.kernel.org, virtualization@lists.linux.dev,
	Netdev <netdev@vger.kernel.org>,
	rcu@vger.kernel.org, "MPTCP Linux" <mptcp@lists.linux.dev>,
	"Linux Kernel" <linux-kernel@vger.kernel.org>,
	"Shinichiro Kawasaki" <shinichiro.kawasaki@wdc.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"Michal Koutný" <MKoutny@suse.com>,
	"Waiman Long" <longman@redhat.com>
Subject: Re: Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout
Date: Fri, 6 Mar 2026 12:06:51 +0100	[thread overview]
Message-ID: <9798cb27-0f52-42fa-b0da-a7834039da1f@kernel.org> (raw)
In-Reply-To: <87qzpx2sck.ffs@tglx>

Hi Thomas,

Thank you for looking into this!

On 06/03/2026 10:57, Thomas Gleixner wrote:
> On Fri, Mar 06 2026 at 06:48, Jiri Slaby wrote:
>> On 05. 03. 26, 20:25, Thomas Gleixner wrote:
>>> Is there simple way to reproduce?
>>
>> Unfortunately not at all. To date, I even cannot reproduce locally, it 
>> reproduces exclusively in opensuse build service (and github CI as per 
>> Matthieu's report). I have a project in there with packages which fail 
>> more often than others:
>>    https://build.opensuse.org/project/monitor/home:jirislaby:softlockup
>> But it's all green ATM.
>>
>> Builds of Go 1.24 and tests of rust 1.90 fail the most. The former even 
>> takes only ~ 8 minutes, so it's not that intensive build at all. So the 
>> reasons are unknown to me. At least, Go apparently uses threads for 
>> building (unlike gcc/clang with forks/processes). Dunno about rust.
> 
> I tried with tons of test cases which stress test mmcid with threads and
> failed.

On my side, I didn't manage to reproduce it locally either.


> Can you provide me your .config, source version, VM setup (Number of
> CPUs, memory etc.)?

My CI ran into this issue two days ago, both with and without a debug
kernel config. The kernel under test was based on 'net-next', which was
itself on top of this commit from Linus' tree: fbdfa8da05b6 ("selftests:
tc-testing: fix list_categories() crash on list type").

- Config without debug:


https://github.com/user-attachments/files/25791728/config-run-22657946888-normal-join.gz

- Config with debug:


https://github.com/user-attachments/files/25791960/config-run-22657946888-debug-nojoin.gz

- Just in case, the stack traces are available there:

  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/22657946888


My tests are executed in VMs I don't control, running a v6.14 kernel
on Azure with 4 vCPUs, 16 GB of RAM, and nested KVM support. For more
details about what's in them:


https://github.com/actions/runner-images/blob/ubuntu24/20260302.42/images/ubuntu/Ubuntu2404-Readme.md


From there, a Docker container is started, from which QEMU 10.1.0
(Debian 1:10.1.0+ds-5ubuntu2.2) is launched with 4 vCPUs and 5 GB of RAM
using this command:


/usr/bin/qemu-system-x86_64 \
  -name mptcpdev \
  -m 5120M \
  -smp 4 \
  -chardev socket,id=charvirtfs5,path=/tmp/virtmevrwrzu5k \
  -device vhost-user-fs-device,chardev=charvirtfs5,tag=ROOTFS \
  -object memory-backend-memfd,id=mem,size=5120M,share=on \
  -numa node,memdev=mem \
  -machine accel=kvm:tcg \
  -M microvm,accel=kvm,pcie=on,rtc=on \
  -cpu host,topoext=on \
  -parallel none \
  -net none \
  -echr 1 \
  -chardev file,path=/proc/self/fd/2,id=dmesg \
  -device virtio-serial-device \
  -device virtconsole,chardev=dmesg \
  -chardev stdio,id=console,signal=off,mux=on \
  -serial chardev:console \
  -mon chardev=console \
  -vga none \
  -display none \
  -device vhost-vsock-device,guest-cid=3 \
  -kernel /home/runner/work/mptcp_net-next/mptcp_net-next/.virtme/build/arch/x86/boot/bzImage \
  -append 'virtme_hostname=mptcpdev nr_open=1048576 virtme_link_mods=/home/runner/work/mptcp_net-next/mptcp_net-next/.virtme/build/.virtme_mods/lib/modules/0.0.0 virtme_rw_overlay0=/tmp console=hvc0 earlyprintk=serial,ttyS0,115200 virtme_console=ttyS0 psmouse.proto=exps virtme.vsockexec=`/tmp/virtme-console/3.sh` virtme_chdir=home/runner/work/mptcp_net-next/mptcp_net-next virtme_root_user=1 rootfstype=virtiofs root=ROOTFS raid=noautodetect rw debug nokaslr mitigations=off softlockup_panic=1 nmi_watchdog=1 hung_task_panic=1 panic=-1 oops=panic init=/usr/local/lib/python3.13/dist-packages/virtme/guest/bin/virtme-ng-init' \
  -gdb tcp::1234 \
  -qmp tcp::3636,server,nowait \
  -no-reboot


It is possible to launch the same command locally, using the same QEMU
version (but not the same host kernel), with the help of Docker:

  $ cd <kernel source code>
  # docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --rm \
    -it --privileged mptcp/mptcp-upstream-virtme-docker:latest \
    manual normal

This will build a new kernel in O=.virtme/build, launch it and give you
access to a prompt.


After that, you can also use the "auto" mode with the last built image:
it boots the VM, only prints "OK", stops, and retries as long as there
are no errors:

  $ cd <kernel source code>
  $ echo 'echo OK' > .virtme-exec-run
  # i=1; \
    while docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --rm \
    -it --privileged mptcp/mptcp-upstream-virtme-docker:latest \
    vm auto normal; do \
      echo "== Attempt: $i: OK =="; \
      i=$((i+1)); \
    done; \
    echo "== Failure after $i attempts =="


> I tried to find it on that github page Matthiue mentioned but I'm
> probably too stupid to navigate this clicky interface.

I'm sorry about that; I understand the interface is not very clear. Do
not hesitate to tell me if you need anything else from me.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


  parent reply	other threads:[~2026-03-06 11:06 UTC|newest]

Thread overview: 45+ messages
2026-02-06 11:54 Stalls when starting a VSOCK listening socket: soft lockups, RCU stalls, timeout Matthieu Baerts
2026-02-06 16:38 ` Stefano Garzarella
2026-02-06 17:13   ` Matthieu Baerts
2026-02-26 10:37 ` Jiri Slaby
2026-03-02  5:28   ` Jiri Slaby
2026-03-02 11:46     ` Peter Zijlstra
2026-03-02 14:30       ` Waiman Long
2026-03-05  7:00       ` Jiri Slaby
2026-03-05 11:53         ` Jiri Slaby
2026-03-05 12:20           ` Jiri Slaby
2026-03-05 16:16             ` Thomas Gleixner
2026-03-05 17:33               ` Jiri Slaby
2026-03-05 19:25                 ` Thomas Gleixner
2026-03-06  5:48                   ` Jiri Slaby
2026-03-06  9:57                     ` Thomas Gleixner
2026-03-06 10:16                       ` Jiri Slaby
2026-03-06 16:28                         ` Thomas Gleixner
2026-03-06 11:06                       ` Matthieu Baerts [this message]
2026-03-06 16:57                         ` Matthieu Baerts
2026-03-06 18:31                           ` Jiri Slaby
2026-03-06 18:44                             ` Matthieu Baerts
2026-03-06 21:40                           ` Matthieu Baerts
2026-03-06 15:24                       ` Peter Zijlstra
2026-03-07  9:01                         ` Thomas Gleixner
2026-03-07 22:29                           ` Thomas Gleixner
2026-03-08  9:15                             ` Thomas Gleixner
2026-03-08 16:55                               ` Jiri Slaby
2026-03-08 16:58                               ` Thomas Gleixner
2026-03-08 17:23                                 ` Matthieu Baerts
2026-03-09  8:43                                   ` Thomas Gleixner
2026-03-09 12:23                                     ` Matthieu Baerts
2026-03-10  8:09                                       ` Thomas Gleixner
2026-03-10  8:20                                         ` Thomas Gleixner
2026-03-10  8:56                                         ` Jiri Slaby
2026-03-10  9:00                                           ` Jiri Slaby
2026-03-10 10:03                                             ` Thomas Gleixner
2026-03-10 10:06                                               ` Thomas Gleixner
2026-03-10 11:24                                                 ` Matthieu Baerts
2026-03-10 11:54                                                   ` Peter Zijlstra
2026-03-10 12:28                                                     ` Thomas Gleixner
2026-03-10 13:40                                                       ` Matthieu Baerts
2026-03-10 13:47                                                         ` Thomas Gleixner
2026-03-10 15:51                                                           ` Matthieu Baerts
2026-03-03 13:23   ` Matthieu Baerts
2026-03-05  6:46     ` Jiri Slaby
