From: Sonam Sanju <sonam.sanju@intel.com>
To: kunwu.chan@linux.dev
Cc: dmaluka@chromium.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, paulmck@kernel.org,
pbonzini@redhat.com, rcu@vger.kernel.org, seanjc@google.com,
sonam.sanju@intel.com, stable@vger.kernel.org,
vineeth@bitbyteword.org
Subject: Re: [PATCH v2] KVM: irqfd: fix deadlock by moving synchronize_srcu out of resampler_lock
Date: Tue, 21 Apr 2026 10:42:19 +0530
Message-ID: <20260421051219.3409921-1-sonam.sanju@intel.com>
In-Reply-To: <87add1dc9bb95dc50bc20ce5c8fbfe2999185dd3@linux.dev>
> Could you provide a time-aligned dump that includes:
> - pwq state (active/pending/in-flight)
> - pending and in-flight work items with their queue/start times
> - worker task states
Below are time-aligned extracts from both instances. Full logs are
linked at the end of this email.
=== Instance 1: kernel 6.18.8, pool 14 (cpus=3) ===
--- t=62s: First workqueue lockup dump (pool stuck 49s, since ~t=13s) ---
kvm-irqfd-cleanup: pwq 14: active=4 refcnt=5
in-flight: 157:irqfd_shutdown, 4044:irqfd_shutdown,
102:irqfd_shutdown, 39:irqfd_shutdown
rcu_gp: pwq 14: active=2 refcnt=3
pending: 2*process_srcu
events: pwq 14: active=43 refcnt=44
pending: binder_deferred_func, kernfs_notify_workfn,
delayed_vfree_work, 5*destroy_super_work,
3*bpf_prog_free_deferred, 10*destroy_super_work, ...
mm_percpu_wq: pwq 14: active=2 refcnt=4
pending: vmstat_update, lru_add_drain_per_cpu
pm: pwq 14: active=1 refcnt=2
pending: pm_runtime_work
pool 14: cpus=3 flags=0x0 hung=49s workers=11
idle: 4046 4038 4045 4039 4043 156 77 (7 idle)
Active busy worker backtrace (pid 102):
__schedule → schedule → schedule_preempt_disabled →
__mutex_lock → irqfd_resampler_shutdown+0x23 →
irqfd_shutdown → process_scheduled_works → worker_thread
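As a rough cross-check of the t=62s dump, the worker counts can be reconciled with a few lines of Python. This is only a sketch: the variable names are illustrative, the strings are taken from the extract above, and the regexes assume the condensed format used in this email rather than the full kernel output.

```python
import re

# Values taken from the t=62s dump above (Instance 1, pool 14).
in_flight_line = ("157:irqfd_shutdown, 4044:irqfd_shutdown, "
                  "102:irqfd_shutdown, 39:irqfd_shutdown")
pool_line = "pool 14: cpus=3 flags=0x0 hung=49s workers=11"
idle_line = "idle: 4046 4038 4045 4039 4043 156 77"

# (pid, function) pairs from the in-flight list.
in_flight = re.findall(r"(\d+):(\w+)", in_flight_line)
idle_pids = re.findall(r"\d+", idle_line)
workers = int(re.search(r"workers=(\d+)", pool_line).group(1))

# Workers that are not idle must be busy with some work item.
busy = workers - len(idle_pids)
all_in_irqfd_shutdown = (
    busy == len(in_flight)
    and all(fn == "irqfd_shutdown" for _, fn in in_flight)
)

print(f"workers={workers} idle={len(idle_pids)} busy={busy}")
print("every busy worker is in irqfd_shutdown:", all_in_irqfd_shutdown)
```

With these numbers, 11 workers minus 7 idle leaves 4 busy, which matches the 4 in-flight irqfd_shutdown items, so no busy worker in pool 14 is running anything else.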
--- t=312s: Last workqueue lockup dump (pool stuck 298s) ---
kvm-irqfd-cleanup: pwq 14: active=4 (same 4 in-flight)
rcu_gp: pwq 14: pending: 2*process_srcu (still pending, 250s later)
events: pwq 14: active=43 (same, no progress)
pool 14: hung=298s workers=11 idle: 4046 4038 4045 4039 4043 156 77
--- t=314s: Hung task dump ---
Worker 4044 (MUTEX HOLDER):
task:kworker/3:8 state:D pid:4044
Workqueue: kvm-irqfd-cleanup irqfd_shutdown
__synchronize_srcu+0x100/0x130
irqfd_resampler_shutdown+0xf0/0x150 ← synchronize_srcu call
Worker 157 (MUTEX WAITER):
task:kworker/3:4 state:D pid:157
__mutex_lock+0x409/0xd90
irqfd_resampler_shutdown+0x23/0x150 ← mutex_lock call
(Workers 39 and 102 show identical mutex_lock stacks)
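The holder/waiter split in the hung task dump can be expressed as a small classification, shown here as a hedged Python sketch. The `classify` helper and the dictionary are illustrative only; the top-of-stack function names and PIDs are the ones reported above, and the "holder" label follows the annotation in this email rather than anything the code can prove.

```python
# Top-of-stack function per worker, as reported in the t=314s hung task dump.
# Workers 39 and 102 are listed with the same stack as worker 157.
top_of_stack = {
    4044: "__synchronize_srcu",
    157: "__mutex_lock",
    39: "__mutex_lock",
    102: "__mutex_lock",
}


def classify(frame: str) -> str:
    """Map a top stack frame to its role in the suspected deadlock."""
    if frame == "__synchronize_srcu":
        return "holder"   # waiting in synchronize_srcu() with the mutex held
    if frame == "__mutex_lock":
        return "waiter"   # blocked trying to take the same mutex
    return "unknown"


roles = {pid: classify(frame) for pid, frame in top_of_stack.items()}
holders = sorted(pid for pid, role in roles.items() if role == "holder")
waiters = sorted(pid for pid, role in roles.items() if role == "waiter")

print("holders:", holders)
print("waiters:", waiters)
```

For Instance 1 this yields one holder (4044) and three waiters (39, 102, 157), which accounts for all four in-flight irqfd_shutdown items.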
=== Instance 2: kernel 6.18.2, pool 22 (cpus=5) ===
--- t=93s: First workqueue lockup dump (pool stuck 79s, since ~t=14s) ---
kvm-irqfd-cleanup: pwq 22: active=4 refcnt=5
in-flight: 151:irqfd_shutdown, 4246:irqfd_shutdown,
4241:irqfd_shutdown, 4243:irqfd_shutdown
rcu_gp: pwq 22: active=1 refcnt=2
pending: process_srcu
events: pwq 22: active=56 refcnt=57
pending: kernfs_notify_workfn, delayed_vfree_work,
binder_deferred_func, 47*destroy_super_work, ...
pool 22: cpus=5 flags=0x0 hung=79s workers=12
idle: 4242 51 4248 4247 4245 435 4244 4239 (8 idle)
--- t=341s: Last workqueue lockup dump (pool stuck 327s) ---
kvm-irqfd-cleanup: pwq 22: active=4 (same)
rcu_gp: pwq 22: pending: process_srcu (still pending, 248s later)
events: pwq 22: active=56 (56 pending items, zero progress)
pool 22: hung=327s workers=12 idle: same 8 workers
--- t=343s: Hung task dump ---
Worker 4241 (MUTEX HOLDER):
task:kworker/5:4 state:D pid:4241
Workqueue: kvm-irqfd-cleanup irqfd_shutdown
__synchronize_srcu+0x100/0x130
irqfd_resampler_shutdown+0xf0/0x150
Worker 4243 (MUTEX WAITER):
task:kworker/5:6 state:D pid:4243
__mutex_lock+0x37d/0xbb0
irqfd_resampler_shutdown+0x23/0x150
(Workers 151 and 4246 show identical mutex_lock stacks)
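To check that the dumps are time-aligned, the stall onset can be estimated as the dump time minus the reported hung duration. The Python sketch below does that for both instances. The labels and variable names are illustrative, and the numbers are the t= and hung= values quoted in the extracts above.

```python
# (label, dump time in seconds, reported hung duration in seconds)
dumps = [
    ("instance 1 first", 62, 49),
    ("instance 1 last", 312, 298),
    ("instance 2 first", 93, 79),
    ("instance 2 last", 341, 327),
]

# Estimated time at which the pool stopped making progress.
onsets = {label: t - hung for label, t, hung in dumps}

# Time between the first and last lockup dump of each instance, i.e. how
# long process_srcu was observed to stay pending.
gap_instance_1 = 312 - 62
gap_instance_2 = 341 - 93

for label, onset in onsets.items():
    print(f"{label}: stall began at ~t={onset}s")
print(f"process_srcu pending for at least {gap_instance_1}s / {gap_instance_2}s")
```

Each instance gives the same onset from its first and last dump to within one second (13s and 14s for Instance 1, 14s for both dumps of Instance 2), so the later dumps appear to describe the same stall rather than a new one.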
> Please post sanitized ramoops/dmesg logs on-list so others can
> validate.
Full logs: https://gist.github.com/sonam-sanju/773855aa2cbe156ca19f3a87bbebc15e
Thanks,
Sonam