public inbox for stable@vger.kernel.org
From: Zhangjiaji <zhangjiaji1@huawei.com>
To: "stable@vger.kernel.org" <stable@vger.kernel.org>
Cc: "huyu (D)" <huyu70@h-partners.com>,
	"Wangqinxiao (Tom)" <wangqinxiao@huawei.com>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	Liumengqiu <liumengqiu1@huawei.com>
Subject: lock contention: x86/kvm: Potential deadlock between shrinker_rwsem and kvm_lock under high VM load
Date: Mon, 2 Feb 2026 01:19:22 +0000
Message-ID: <a5ebab14f0444f8da03a6fa4d1978793@huawei.com>
In-Reply-To: <eecb1d2d1f7a44ef8c757138cb1b3755@huawei.com>

Hi all,

I'm hitting a lock contention / long stall issue on an x86 KVM host under heavy VM load, and I'd like to ask for advice on the proper fix direction.

Problem summary
When the host is under heavy VM pressure and a cache drop is triggered, the reclaim path can hold shrinker_rwsem for a long time because the KVM/MMU shrinker contends on kvm_lock. Meanwhile, systemd blocks on shrinker_rwsem while still holding cgroup_mutex, which causes cascading issues (e.g. journald log gaps).

Observed lock chain / flow
From what I see:

1. drop_caches leads to slab reclaim and enters shrink_slab()
2. shrink_slab() takes shrinker_rwsem
3. It then enters do_shrink_slab()
4. During slab shrinking, the KVM/MMU shrinker callback (e.g. mmu_shrink_scan()) is invoked to reclaim KVM-related caches
5. mmu_shrink_scan() attempts to take kvm_lock
6. Under heavy VM load, kvm_lock is highly contended, so the shrinker callback stalls and shrinker_rwsem remains held for an extended time

In parallel:

7. systemd holds cgroup_mutex (e.g. during cgroup operations) and then tries to acquire shrinker_rwsem
8. Because shrinker_rwsem is still held by the drop_caches reclaim path, systemd blocks while still holding cgroup_mutex
9. Other components (e.g. systemd-journald) needing cgroup_mutex become blocked, leading to issues such as logging stalls/gaps
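
To make the chain above easier to reason about, here is a rough userspace model of it as a wait-for table. This is not kernel code: the `model_*` names, the task table, and the "vm-workload" task standing in for whichever KVM path currently owns kvm_lock are all assumptions made for illustration. The helper simply follows "waits for" edges until it reaches a task that is not blocked.

```c
#include <assert.h>
#include <stddef.h>

/* Locks named in the report; LOCK_NONE means "not holding / not waiting". */
enum model_lock { LOCK_NONE, LOCK_KVM, LOCK_SHRINKER_RWSEM, LOCK_CGROUP_MUTEX };

struct model_task {
	const char *name;
	enum model_lock holds;     /* the one lock this task holds in the model */
	enum model_lock waits_for; /* the lock it is blocked on, or LOCK_NONE */
};

/* Hypothetical snapshot of the stall described in steps 1-9. */
static const struct model_task model_tasks[] = {
	{ "vm-workload",      LOCK_KVM,            LOCK_NONE },
	{ "drop_caches",      LOCK_SHRINKER_RWSEM, LOCK_KVM },
	{ "systemd",          LOCK_CGROUP_MUTEX,   LOCK_SHRINKER_RWSEM },
	{ "systemd-journald", LOCK_NONE,           LOCK_CGROUP_MUTEX },
};

#define MODEL_NR_TASKS (sizeof(model_tasks) / sizeof(model_tasks[0]))

/* Return the task holding 'lock' in the snapshot, or NULL if nobody does. */
static const struct model_task *model_holder_of(enum model_lock lock)
{
	if (lock == LOCK_NONE)
		return NULL;
	for (size_t i = 0; i < MODEL_NR_TASKS; i++)
		if (model_tasks[i].holds == lock)
			return &model_tasks[i];
	return NULL;
}

/*
 * Follow the wait chain from 't' to the task at its end. '*hops' receives
 * the number of lock hand-offs in between. The loop is bounded by the
 * table size so a cyclic (truly deadlocked) table cannot spin forever.
 */
static const struct model_task *
model_root_blocker(const struct model_task *t, size_t *hops)
{
	size_t n = 0;

	assert(t != NULL);
	while (t->waits_for != LOCK_NONE && n < MODEL_NR_TASKS) {
		const struct model_task *holder = model_holder_of(t->waits_for);

		if (!holder)
			break;
		t = holder;
		n++;
	}
	if (hops)
		*hops = n;
	return t;
}
```

Calling `model_root_blocker()` on the systemd-journald entry walks cgroup_mutex, then shrinker_rwsem, then kvm_lock. In this model the chain has an end (no cycle), which matches the observation of a long stall rather than a strict deadlock.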

Impact
- Long stalls in systemd-controlled cgroup operations
- systemd-journald (and possibly others) blocked on cgroup_mutex, causing log dropouts / discontinuities
- Overall system responsiveness degradation during the cache-drop operation

Questions
1. Is it expected/acceptable for a shrinker callback (KVM/MMU shrinker) to contend on a highly contended lock like kvm_lock while shrinker_rwsem is held?
2. Are there known recommendations to avoid holding shrinker_rwsem across potentially blocking/contended shrinker callbacks?
3. Would the preferred fix be on the KVM shrinker side (e.g. using mutex_trylock()/spin_trylock() and returning SHRINK_STOP, or an -EAGAIN-style result, when the lock is contended), or on the shrink_slab/shrinker infrastructure side?
4. Alternatively, is there any known guidance for systemd/cgroup codepaths to avoid waiting on shrinker_rwsem while holding cgroup_mutex (to avoid lock chaining)?
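
To illustrate the trylock idea from question 3, here is a minimal userspace sketch using pthreads. It is only a model and not a proposed patch: `model_kvm_lock`, `model_cached_pages` and `model_mmu_shrink_scan()` are hypothetical stand-ins, and `MODEL_SHRINK_STOP` mirrors the kernel's SHRINK_STOP value (~0UL). In the kernel the equivalent call would be mutex_trylock(), which returns nonzero on success, whereas pthread_mutex_trylock() returns 0 on success.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-in for the kernel's SHRINK_STOP (~0UL). */
#define MODEL_SHRINK_STOP (~0UL)

/* Hypothetical stand-ins for kvm_lock and the reclaimable cache size. */
static pthread_mutex_t model_kvm_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long model_cached_pages = 128;

/*
 * Model of a scan callback that never sleeps on the contended lock.
 * If the lock is busy it gives up immediately, so the caller (which in
 * the real flow holds shrinker_rwsem) is not stalled behind it.
 */
static unsigned long model_mmu_shrink_scan(unsigned long nr_to_scan)
{
	unsigned long freed;

	/* pthread_mutex_trylock() returns 0 on success, EBUSY if held. */
	if (pthread_mutex_trylock(&model_kvm_lock) != 0)
		return MODEL_SHRINK_STOP;

	/* Free at most what was asked for and at most what is cached. */
	freed = nr_to_scan < model_cached_pages ? nr_to_scan
						: model_cached_pages;
	assert(freed <= nr_to_scan);
	model_cached_pages -= freed;

	pthread_mutex_unlock(&model_kvm_lock);
	return freed;
}
```

The trade-off in this sketch is that reclaim from this cache is skipped whenever the lock is busy, so under sustained contention the cache may not shrink at all; whether that is acceptable for the KVM MMU shrinker is exactly what I am unsure about.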

Please let me know what additional information would be most useful, and which direction you would recommend for a fix.

Thanks,
Huyu

