* [PATCH RFC 00/12] Document synchronization used in managing guest faults
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Sean Christopherson,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Fuad Tabba, vannapurve, x86, H. Peter Anvin
Cc: kvm, linux-doc, linux-kernel, loongarch, Ackerley Tng
In [1], Sean suggested consolidating comments for some functions.
While trying to consolidate comments, I read up more about synchronization
used in managing guest faults and put together some updates for
Documentation/virt/kvm/locking.rst, including some fixes to the current
content.
I'm generalizing the kinds of functions Sean was referring to as
"documentation for functions that depend on derived information from GFNs",
and kvm_gmem_get_memory_attributes() from the conversion series [1] will
also point to the documentation that is updated in this patch series.
[1] https://lore.kernel.org/all/ag8JIlHjohAOC3-g@google.com/
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Ackerley Tng (12):
Documentation: KVM: Elaborate comment on kvm_usage_lock
Documentation: KVM: Consolidate notes about cpu_read_lock() and kvm_lock
Documentation: KVM: Consolidate notes about kvm->slots_lock and irq_lock
Documentation: KVM: Turn - into bullet point
Documentation: KVM: Explain what rule the exception section is meant for
Documentation: KVM: Have actual headings for exceptions
Documentation: KVM: Drop mention of kvm->lock in SRCU documentation
Documentation: KVM: Add example for kvm->srcu in relation to mutex/lock
Documentation: KVM: Document synchronization for managing guest faults
KVM: guest_memfd: Clarify comment about gmem.file vs kvm->srcu
KVM: mmu: Point users of host_pfn_mapping_level() to docs
Documentation: KVM: Focus acquisition order section on preventing deadlocks
Documentation/virt/kvm/locking.rst | 173 ++++++++++++++++++++++++++++++++-----
arch/loongarch/kvm/mmu.c | 24 +----
arch/x86/kvm/mmu/mmu.c | 24 +----
virt/kvm/guest_memfd.c | 9 +-
4 files changed, 165 insertions(+), 65 deletions(-)
---
base-commit: b7fbe9a1bf9ee6c967ef77d366ca58c35fcf1887
change-id: 20260527-kvm-locking-docs-3c6dee0fabce
Best regards,
--
Ackerley Tng <ackerleytng@google.com>
* [PATCH RFC 01/12] Documentation: KVM: Elaborate comment on kvm_usage_lock
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
The original comment talks about cpus_read_lock() and kvm_usage_count, but
doesn't explain why they are related.
Elaborate on the comment for kvm_usage_lock to provide more context.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
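For reviewers, a minimal userspace sketch of the deduplication that the new
comment describes, assuming a plain pthread mutex stands in for
kvm_usage_lock. Every name below (usage_lock, hw_enable(),
vm_enable_virtualization(), ...) is hypothetical and only models the idea;
it is not KVM's actual code, and it leaves out cpus_read_lock() entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are KVM's real API. */
static pthread_mutex_t usage_lock = PTHREAD_MUTEX_INITIALIZER;
static int usage_count;     /* models kvm_usage_count */
static int hw_enable_calls; /* how many times "hardware" was really enabled */
static bool hw_enabled;

static void hw_enable(void)
{
	hw_enabled = true;
	hw_enable_calls++;
}

static void hw_disable(void)
{
	hw_enabled = false;
}

/* Called once per VM creation: only the first user enables hardware. */
static void vm_enable_virtualization(void)
{
	pthread_mutex_lock(&usage_lock);
	if (usage_count++ == 0)
		hw_enable();
	pthread_mutex_unlock(&usage_lock);
}

/* Called once per VM destruction: only the last user disables hardware. */
static void vm_disable_virtualization(void)
{
	pthread_mutex_lock(&usage_lock);
	assert(usage_count > 0);
	if (--usage_count == 0)
		hw_disable();
	pthread_mutex_unlock(&usage_lock);
}
```

The point of the sketch is only that the counter and the enable/disable step
share one lock, and that this lock is separate from whatever protects the
list of VMs.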
Documentation/virt/kvm/locking.rst | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 662231e958a07..5564c8b38b9cc 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -248,8 +248,23 @@ time it will be set using the Dirty tracking mechanism described above.
:Arch: any
:Protects: - kvm_usage_count
- hardware virtualization enable/disable
-:Comment: Exists to allow taking cpus_read_lock() while kvm_usage_count is
- protected, which simplifies the virtualization enabling logic.
+:Comment: ``kvm_usage_count`` serves to deduplicate hardware
+ virtualization enabling and disabling requests from different VMs
+ being created.
+
+ Hardware virtualization enabling/disabling requires taking
+ ``cpus_read_lock()``.
+
+ ``kvm_lock`` used to also protect ``kvm_usage_count``, but other
+ parts of the Linux kernel holding ``cpus_read_lock()`` need to
+ call into KVM to ensure that VM state remains consistent with the
+ host's state. For example, when the CPU frequency changes, KVM is
+ notified. ``kvmclock_cpufreq_notifier()`` takes ``kvm_lock`` to
+ iterate ``vm_list``.
+
+ To decouple these, different locks are used: ``kvm_lock`` for
+ ``vm_list`` and ``kvm_usage_lock`` for enabling/disabling hardware
+ virtualization.
``kvm->mn_invalidate_lock``
^^^^^^^^^^^^^^^^^^^^^^^^^^^
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 02/12] Documentation: KVM: Consolidate notes about cpu_read_lock() and kvm_lock
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
Move the detail about cpus_read_lock() and kvm_lock to where the acquisition
order is mentioned.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 5564c8b38b9cc..1e8cbbe3ba706 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -10,6 +10,11 @@ KVM Lock Overview
The acquisition orders for mutexes are as follows:
- cpus_read_lock() is taken outside kvm_lock
+ - Taking cpus_read_lock() outside of kvm_lock is problematic,
+ despite it being the official ordering, as it is quite easy to
+ unknowingly trigger cpus_read_lock() while holding kvm_lock.
+ Use caution when walking vm_list, e.g. avoid complex operations
+ when possible.
- kvm_usage_lock is taken outside cpus_read_lock()
@@ -28,13 +33,6 @@ The acquisition orders for mutexes are as follows:
are taken on the waiting side when modifying memslots, so MMU notifiers
must not take either kvm->slots_lock or kvm->slots_arch_lock.
-cpus_read_lock() vs kvm_lock:
-
-- Taking cpus_read_lock() outside of kvm_lock is problematic, despite that
- being the official ordering, as it is quite easy to unknowingly trigger
- cpus_read_lock() while holding kvm_lock. Use caution when walking vm_list,
- e.g. avoid complex operations when possible.
-
For SRCU:
- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 03/12] Documentation: KVM: Consolidate notes about kvm->slots_lock and irq_lock
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
Move the detail about ordering between kvm->slots_lock and kvm->irq_lock to
where the two locks are first mentioned.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 1e8cbbe3ba706..67dd2066f6d98 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -21,12 +21,11 @@ The acquisition orders for mutexes are as follows:
- kvm->lock is taken outside vcpu->mutex
- kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock
+ - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
+ them together is quite rare.
- vcpu->mutex is taken outside kvm->slots_lock and kvm->slots_arch_lock
-- kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
- them together is quite rare.
-
- kvm->mn_active_invalidate_count ensures that pairs of
invalidate_range_start() and invalidate_range_end() callbacks
use the same memslots array. kvm->slots_lock and kvm->slots_arch_lock
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 04/12] Documentation: KVM: Turn - into bullet point
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
In the :Protects: section of kvm->mmu_lock, a missing space causes the - to
render as a literal - instead of a bullet point. Add the space so that it
renders as a bullet point.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 67dd2066f6d98..e349c2cb94943 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -283,7 +283,7 @@ time it will be set using the Dirty tracking mechanism described above.
^^^^^^^^^^^^^^^^^
:Type: spinlock_t or rwlock_t
:Arch: any
-:Protects: -shadow page/shadow tlb entry
+:Protects: - shadow page/shadow tlb entry
:Comment: it is a spinlock since it is used in mmu notifier.
``kvm->srcu``
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 05/12] Documentation: KVM: Explain what rule the exception section is meant for
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
The Exception section describes some exceptions but not the rule they are
exceptions to. Add a paragraph stating that rule.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index e349c2cb94943..5161636cec481 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -61,6 +61,10 @@ sections.
2. Exception
------------
+The general rule in KVM is that any modification to shadow page tables
+(and their entries (SPTEs)) must be protected by ``kvm->mmu_lock``,
+with the exceptions described below.
+
Fast page fault:
Fast page fault is the fast path which fixes the guest page fault out of
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 06/12] Documentation: KVM: Have actual headings for exceptions
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
The documented exceptions are described without headings, making it hard to
identify where the description of each exception ends.
Use actual headings, one level below the heading used for Exception, to
improve readability.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 5161636cec481..fc4537a7659a9 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -65,7 +65,8 @@ The general rule in KVM is that any modification to shadow page tables
(and their entries (SPTEs)) must be protected by ``kvm->mmu_lock``,
with the exceptions described below.
-Fast page fault:
+2.1 Fast page fault
+^^^^^^^^^^^^^^^^^^^
Fast page fault is the fast path which fixes the guest page fault out of
the mmu-lock on x86. Currently, the page fault can be fast in one of the
@@ -217,7 +218,8 @@ Since the spte is "volatile" if it can be updated out of mmu-lock, we always
atomically update the spte and the race caused by fast page fault can be avoided.
See the comments in spte_needs_atomic_update() and mmu_spte_update().
-Lockless Access Tracking:
+2.2 Lockless Access Tracking
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is used for Intel CPUs that are using EPT but do not support the EPT A/D
bits. In this case, PTEs are tagged as A/D disabled (using ignored bits), and
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 07/12] Documentation: KVM: Drop mention of kvm->lock in SRCU documentation
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
The original comment says that synchronize_srcu(&kvm->srcu) is called
inside critical sections for kvm->lock, vcpu->mutex and
kvm->slots_lock. Drop mention of kvm->lock since this is no longer true.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index fc4537a7659a9..437dbfa0030b9 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -35,8 +35,8 @@ The acquisition orders for mutexes are as follows:
For SRCU:
- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
- for kvm->lock, vcpu->mutex and kvm->slots_lock. These locks _cannot_
- be taken inside a kvm->srcu read-side critical section; that is, the
+ for vcpu->mutex and kvm->slots_lock. These locks _cannot_ be taken
+ inside a kvm->srcu read-side critical section; that is, the
following is broken::
srcu_read_lock(&kvm->srcu);
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 08/12] Documentation: KVM: Add example for kvm->srcu in relation to mutex/lock
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
Add an example of where vcpu->mutex and kvm->slots_lock are held while
calling synchronize_srcu(&kvm->srcu), to show concretely where the
synchronization primitives overlap.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 437dbfa0030b9..f12664443e913 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -35,9 +35,12 @@ The acquisition orders for mutexes are as follows:
For SRCU:
- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
- for vcpu->mutex and kvm->slots_lock. These locks _cannot_ be taken
- inside a kvm->srcu read-side critical section; that is, the
- following is broken::
+ for vcpu->mutex and kvm->slots_lock. (For example, when there is a
+ ``KVM_REQ_APICV_UPDATE`` request, ``vcpu->mutex`` is held in
+ ``kvm_vcpu_ioctl()``, and then when the memslots get updated,
+ ``kvm->slots_lock`` is taken.) These locks _cannot_ be taken inside
+ a kvm->srcu read-side critical section; that is, the following is
+ broken::
srcu_read_lock(&kvm->srcu);
mutex_lock(&kvm->slots_lock);
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 09/12] Documentation: KVM: Document synchronization for managing guest faults
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
Document, in one central place, how synchronization is used while managing
guest faults, so that code comments can point readers there.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
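For reviewers, a minimal userspace sketch of the "snapshot, look up, then
check under the lock and retry" scheme that sections 4.1 and 4.2 of this
patch describe. It is a model under stated assumptions, not KVM's
implementation: struct vm_model and all function names are hypothetical,
a pthread mutex stands in for kvm->mmu_lock, only a single invalidation
range is tracked, and the sequence snapshot is read under the lock purely
to keep the model simple.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the invalidation-tracking state; not KVM's structs. */
struct vm_model {
	pthread_mutex_t mmu_lock;        /* models kvm->mmu_lock */
	unsigned long invalidate_seq;    /* models kvm->mmu_invalidate_seq */
	int invalidate_in_progress;      /* models kvm->mmu_invalidate_in_progress */
	uint64_t range_start, range_end; /* GFN range [start, end) being invalidated */
};

/* Record the start of an invalidation covering GFNs [start, end). */
static void invalidate_begin(struct vm_model *vm, uint64_t start, uint64_t end)
{
	pthread_mutex_lock(&vm->mmu_lock);
	vm->invalidate_in_progress++;
	vm->range_start = start;
	vm->range_end = end;
	pthread_mutex_unlock(&vm->mmu_lock);
}

/* Record the end of an invalidation; bumping the sequence marks old snapshots stale. */
static void invalidate_end(struct vm_model *vm)
{
	pthread_mutex_lock(&vm->mmu_lock);
	assert(vm->invalidate_in_progress > 0);
	vm->invalidate_seq++;
	vm->invalidate_in_progress--;
	pthread_mutex_unlock(&vm->mmu_lock);
}

/* Snapshot the sequence count before doing any GFN-derived lookup. */
static unsigned long fault_snapshot(struct vm_model *vm)
{
	unsigned long seq;

	pthread_mutex_lock(&vm->mmu_lock);
	seq = vm->invalidate_seq;
	pthread_mutex_unlock(&vm->mmu_lock);
	return seq;
}

/* Caller must hold mmu_lock. True if a result derived for gfn may be stale. */
static bool fault_is_stale(const struct vm_model *vm, unsigned long seq,
			   uint64_t gfn)
{
	if (vm->invalidate_in_progress &&
	    gfn >= vm->range_start && gfn < vm->range_end)
		return true;
	return vm->invalidate_seq != seq;
}

/* Commit a mapping for gfn; returns false if the fault must be retried. */
static bool fault_commit(struct vm_model *vm, unsigned long seq, uint64_t gfn,
			 int *mapped)
{
	bool ok;

	pthread_mutex_lock(&vm->mmu_lock);
	ok = !fault_is_stale(vm, seq, gfn);
	if (ok)
		(*mapped)++; /* stands in for installing the mapping */
	pthread_mutex_unlock(&vm->mmu_lock);
	return ok;
}
```

A caller would loop: take a snapshot, perform the lookup without the lock,
then call fault_commit() and start over if it returns false.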
Documentation/virt/kvm/locking.rst | 108 +++++++++++++++++++++++++++++++++++++
1 file changed, 108 insertions(+)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index f12664443e913..0663ccfe0633d 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -339,3 +339,111 @@ time it will be set using the Dirty tracking mechanism described above.
cpu_hotplug_lock is held, e.g. from cpufreq_boost_trigger_state(), and many
operations need to take cpu_hotplug_lock when loading a vendor module, e.g.
updating static calls.
+
+4. Synchronization while managing guest faults
+----------------------------------------------
+
+This section explains the intersection of these synchronization mechanisms:
+
+- ``kvm->srcu`` (for memslots)
+- ``kvm->mmu_invalidate_*`` (pending invalidations)
+- ``kvm->mn_*`` (synchronization for ``kvm->mmu_invalidate_*``)
+
+4.1 Overview
+^^^^^^^^^^^^
+
+KVM resolves guest page faults by translating the Guest Frame Number (GFN) into
+a Page Frame Number (PFN) via memslots and then populating its shadow page
+tables with the resulting mapping.
+
+While handling the guest page fault, KVM must ensure a consistent view of the
+active memslots container, so KVM takes ``srcu_read_lock(&kvm->srcu)``.
+
+Guest page fault handling can race with requests from host userspace to
+invalidate shadow page tables. These requests originate from a few places,
+such as:
+
+1. MMU Notifiers: KVM registers callbacks with the kernel's memory management
+ subsystem to know when there are changes to mappings in the host userspace
+ page tables.
+2. Memslot Updates: The host userspace VMM, such as QEMU, may use the
+ ``KVM_SET_USER_MEMORY_REGION`` ioctl to add, delete, or move a memslot. KVM
+ must zap the affected shadow page tables to ensure the guest doesn't access
+ stale mappings.
+3. Memory Attribute Changes: The ``KVM_SET_MEMORY_ATTRIBUTES`` ioctl allows
+ userspace to change attributes for a range of guest memory (e.g., setting a
+ range as "private" for Confidential Computing). This also requires
+ invalidating existing shadow mappings.
+
+When such a race occurs, KVM optimistically allows the faulting logic to
+proceed, but just before committing the fault, KVM will check for a pending
+invalidation, and retry the fault process if there is a pending invalidation
+affecting the GFN where the fault occurred.
+
+4.2 Tracking pending invalidations with ``kvm->mmu_invalidate*`` fields
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A "pending invalidation" is determined using a combination of
+
+- ``kvm->mmu_invalidate_in_progress``
+- ``kvm->mmu_invalidate_range_start`` and ``kvm->mmu_invalidate_range_end``
+- ``kvm->mmu_invalidate_seq``
+
+``is_page_fault_stale()`` shows how the above fields are used to determine if
+the page fault is stale and requires a retry.
+
+The above combination of fields is protected by a lock, namely
+``kvm->mmu_lock``.
+
+4.2.1 Derived information vs pending invalidations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Generally, any information derived from a GFN, i.e. the result of a page
+attribute or page metadata lookup, may race with invalidations. Here is an
+example of such a lookup:
+
+- ``host_pfn_mapping_level()`` uses memslot information to find the mapping
+ level of pages in host userspace page tables. If there's an invalidation, the
+ pages that were mapped would no longer be mapped and hence the mapping level
+ result would be stale.
+
+There are several ways to ensure valid results:
+
+- Check ``mmu_invalidate_retry_gfn()`` after grabbing the result, before
+ consuming it. In this case, ``mmu_lock`` doesn't need to be held during the
+ lookup, but it does need to be held while checking the MMU notifier. KVM's
+ guest page fault handling uses this option.
+- Hold ``mmu_lock`` AND ensure there is no in-progress MMU notifier invalidation
+ event for the hva. This can be done by explicitly checking the MMU notifier
+ or by ensuring that KVM already has a valid mapping that covers the hva.
+ ``kvm_mmu_recover_huge_pages()`` uses this option.
+
+4.3 Further optimization: ignoring invalidations if there is no matching memslot
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Invalidation is only really required when the invalidated memory range overlaps
+with some memslot. Without a matching memslot, the invalidation request could
+actually just be ignored. Hence, KVM only updates the ``kvm->mmu_invalidate_*``
+fields and takes ``kvm->mmu_lock`` if it finds a matching memslot.
+
+This creates another problem: if memslots are updated while there is an ongoing
+invalidation, then the updates to the fields and the lock would be imbalanced.
+
+4.4 Synchronization for invalidation lock/fields: ``kvm->mn_*``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To make sure the updates to the invalidation lock/fields are balanced, KVM has a
+further layer of synchronization. ``kvm_swap_active_memslots()`` enforces that
+changes to memslots are only committed once all pending invalidations are
+complete.
+
+In other words, ``kvm->mn_*`` ensures the following does not happen:
+
+1. Some memslot existed, causing a pending invalidation request to be recorded
+ in the ``kvm->mmu_invalidate_*`` fields
+2. Memslot got removed, so the invalidation request was never removed from the
+ ``kvm->mmu_invalidate_*`` fields.
+
+In addition, ``kvm_swap_active_memslots()`` enforces that changes to memslots
+are complete before calling ``synchronize_srcu(&kvm->srcu)``, so that
+running readers of the old memslots container are done before it is freed.
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 10/12] KVM: guest_memfd: Clarify comment about gmem.file vs kvm->srcu
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
Clarify the existing comment about synchronize_srcu() and
kvm_gmem_get_pfn() to provide further context. Explain which
synchronize_srcu() call prevents races with kvm_gmem_get_pfn().
Also point the reader to the documentation for more background.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
virt/kvm/guest_memfd.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 69c9d6d546b28..f2218db0af980 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -711,8 +711,13 @@ static void __kvm_gmem_unbind(struct kvm_memory_slot *slot, struct gmem_file *f)
xa_store_range(&f->bindings, start, end - 1, NULL, GFP_KERNEL);
/*
- * synchronize_srcu(&kvm->srcu) ensured that kvm_gmem_get_pfn()
- * cannot see this memslot.
+ * This is called when memslots are updated, after the old
+ * memslot container is no longer in
+ * use. synchronize_srcu(&kvm->srcu) was called there, so
+ * kvm_gmem_get_pfn() from KVM's guest fault handling cannot
+ * see this memslot. See Documentation/virt/kvm/locking.rst
+ * for more information about kvm->srcu and the memslots
+ * container.
*/
WRITE_ONCE(slot->gmem.file, NULL);
}
--
2.54.0.823.g6e5bcc1fc9-goog
* [PATCH RFC 11/12] KVM: mmu: Point users of host_pfn_mapping_level() to docs
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC
From: Ackerley Tng <ackerleytng@google.com>
After consolidating the documentation for host_pfn_mapping_level() in
Documentation/virt/kvm/locking.rst, point users of the function to the docs.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
arch/loongarch/kvm/mmu.c | 24 ++++--------------------
arch/x86/kvm/mmu/mmu.c | 24 ++++--------------------
2 files changed, 8 insertions(+), 40 deletions(-)
diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c
index a7fa458e33605..0313901171a2e 100644
--- a/arch/loongarch/kvm/mmu.c
+++ b/arch/loongarch/kvm/mmu.c
@@ -641,27 +641,11 @@ static bool fault_supports_huge_mapping(struct kvm_memory_slot *memslot,
/*
* Lookup the mapping level for @gfn in the current mm.
*
- * WARNING! Use of host_pfn_mapping_level() requires the caller and the end
- * consumer to be tied into KVM's handlers for MMU notifier events!
+ * WARNING! This derives information from the current state of memslots and
+ * page mappings and may race with invalidations.
*
- * There are several ways to safely use this helper:
- *
- * - Check mmu_invalidate_retry_gfn() after grabbing the mapping level, before
- * consuming it. In this case, mmu_lock doesn't need to be held during the
- * lookup, but it does need to be held while checking the MMU notifier.
- *
- * - Hold mmu_lock AND ensure there is no in-progress MMU notifier invalidation
- * event for the hva. This can be done by explicit checking the MMU notifier
- * or by ensuring that KVM already has a valid mapping that covers the hva.
- *
- * - Do not use the result to install new mappings, e.g. use the host mapping
- * level only to decide whether or not to zap an entry. In this case, it's
- * not required to hold mmu_lock (though it's highly likely the caller will
- * want to hold mmu_lock anyways, e.g. to modify SPTEs).
- *
- * Note! The lookup can still race with modifications to host page tables, but
- * the above "rules" ensure KVM will not _consume_ the result of the walk if a
- * race with the primary MMU occurs.
+ * See Documentation/virt/kvm/locking.rst to understand how to consume the
+ * result of this lookup safely.
*/
static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,
const struct kvm_memory_slot *slot)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f8aa7eda661ee..20cdcdd20e78d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3214,27 +3214,11 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
/*
* Lookup the mapping level for @gfn in the current mm.
*
- * WARNING! Use of host_pfn_mapping_level() requires the caller and the end
- * consumer to be tied into KVM's handlers for MMU notifier events!
+ * WARNING! This derives information from the current state of memslots and
+ * page mappings and may race with invalidations.
*
- * There are several ways to safely use this helper:
- *
- * - Check mmu_invalidate_retry_gfn() after grabbing the mapping level, before
- * consuming it. In this case, mmu_lock doesn't need to be held during the
- * lookup, but it does need to be held while checking the MMU notifier.
- *
- * - Hold mmu_lock AND ensure there is no in-progress MMU notifier invalidation
- * event for the hva. This can be done by explicit checking the MMU notifier
- * or by ensuring that KVM already has a valid mapping that covers the hva.
- *
- * - Do not use the result to install new mappings, e.g. use the host mapping
- * level only to decide whether or not to zap an entry. In this case, it's
- * not required to hold mmu_lock (though it's highly likely the caller will
- * want to hold mmu_lock anyways, e.g. to modify SPTEs).
- *
- * Note! The lookup can still race with modifications to host page tables, but
- * the above "rules" ensure KVM will not _consume_ the result of the walk if a
- * race with the primary MMU occurs.
+ * See Documentation/virt/kvm/locking.rst to understand how to consume the
+ * result of this lookup safely.
*/
static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,
const struct kvm_memory_slot *slot)
--
2.54.0.823.g6e5bcc1fc9-goog
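For readers following the thread, the first usage rule in the comment being removed (sample the mapping level, then check for a racing invalidation before consuming it) can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: every toy_* name and the struct layout are hypothetical, the race is simulated in a single thread, and none of this is KVM's actual implementation. In the kernel the check is mmu_invalidate_retry_gfn(), made with mmu_lock held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the invalidation state a VM would track. */
struct toy_vm {
	unsigned long invalidate_seq;	/* bumped each time an invalidation ends */
	int invalidate_in_progress;	/* nonzero between start and end */
	int host_level;			/* pretend host mapping level */
};

static void toy_invalidate_start(struct toy_vm *vm)
{
	vm->invalidate_in_progress++;
}

static void toy_invalidate_end(struct toy_vm *vm, int new_level)
{
	assert(vm->invalidate_in_progress > 0);
	vm->host_level = new_level;
	vm->invalidate_seq++;
	vm->invalidate_in_progress--;
}

/* True if a result sampled at @seq may be stale and must be discarded. */
static bool toy_invalidate_retry(const struct toy_vm *vm, unsigned long seq)
{
	return vm->invalidate_in_progress || vm->invalidate_seq != seq;
}

/*
 * Sample, look up, then check before consuming.  @race simulates an
 * invalidation landing between the lookup and the check.  Returns true
 * and writes *installed only when the sampled level is safe to consume.
 */
static bool toy_try_install(struct toy_vm *vm, bool race, int *installed)
{
	unsigned long seq = vm->invalidate_seq;	/* 1. snapshot the sequence */
	int level = vm->host_level;		/* 2. lookup without the lock */

	if (race) {
		toy_invalidate_start(vm);
		toy_invalidate_end(vm, 0);
	}

	/* 3. In KVM this check would be made with mmu_lock held. */
	if (toy_invalidate_retry(vm, seq))
		return false;

	*installed = level;			/* 4. consume the result */
	return true;
}
```

A caller that gets false back from toy_try_install() would simply redo the lookup, which mirrors how a guest fault is retried rather than installing a mapping derived from stale information.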
* [PATCH RFC 12/12] Documentation: KVM: Focus acquisition order section on preventing deadlocks
2026-05-27 15:33 [PATCH RFC 00/12] Document synchronization used in managing guest faults Ackerley Tng via B4 Relay
` (10 preceding siblings ...)
2026-05-27 15:33 ` [PATCH RFC 11/12] KVM: mmu: Point users of host_pfn_mapping_level() to docs Ackerley Tng via B4 Relay
@ 2026-05-27 15:33 ` Ackerley Tng via B4 Relay
2026-06-25 18:25 ` Sean Christopherson
11 siblings, 1 reply; 19+ messages in thread
From: Ackerley Tng via B4 Relay @ 2026-05-27 15:33 UTC (permalink / raw)
To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Sean Christopherson,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Fuad Tabba, vannapurve, x86, H. Peter Anvin
Cc: kvm, linux-doc, linux-kernel, loongarch, Ackerley Tng
From: Ackerley Tng <ackerleytng@google.com>
Now that the first sentence is already described in more detail in the new
section on synchronization while managing guest faults, drop the first
sentence.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
Documentation/virt/kvm/locking.rst | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 0663ccfe0633d..f26ea3acd0b70 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -26,11 +26,9 @@ The acquisition orders for mutexes are as follows:
- vcpu->mutex is taken outside kvm->slots_lock and kvm->slots_arch_lock
-- kvm->mn_active_invalidate_count ensures that pairs of
- invalidate_range_start() and invalidate_range_end() callbacks
- use the same memslots array. kvm->slots_lock and kvm->slots_arch_lock
- are taken on the waiting side when modifying memslots, so MMU notifiers
- must not take either kvm->slots_lock or kvm->slots_arch_lock.
+- kvm->slots_lock and kvm->slots_arch_lock are taken on the waiting side when
+ modifying memslots, so MMU notifiers must not take either kvm->slots_lock or
+ kvm->slots_arch_lock.
For SRCU:
--
2.54.0.823.g6e5bcc1fc9-goog
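The role of the sentence being dropped here can be illustrated with a small single-threaded model. It is a sketch only: the toy_* names and struct layout are hypothetical, and where the real memslot updater would sleep until the count drops to zero, this model just reports whether the swap could proceed. The point it shows is that while an invalidate_range_start() is outstanding the memslots array cannot be swapped, so the matching invalidate_range_end() sees the same array. Because the updater waits while holding kvm->slots_lock and kvm->slots_arch_lock, a notifier that took either lock could never reach its end() callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_memslots {
	int generation;
};

struct toy_kvm {
	struct toy_memslots *active;	/* memslots array currently published */
	int mn_active_invalidate_count;	/* start() calls without a matching end() */
};

/* Returns the memslots array that this start() callback observed. */
static const struct toy_memslots *toy_invalidate_range_start(struct toy_kvm *kvm)
{
	kvm->mn_active_invalidate_count++;
	return kvm->active;
}

/* Returns the memslots array that the matching end() callback observed. */
static const struct toy_memslots *toy_invalidate_range_end(struct toy_kvm *kvm)
{
	const struct toy_memslots *seen = kvm->active;

	assert(kvm->mn_active_invalidate_count > 0);
	kvm->mn_active_invalidate_count--;
	return seen;
}

/*
 * Memslot updater.  The real code would wait here; this model returns
 * false instead of sleeping when an invalidation is still in flight.
 */
static bool toy_try_swap_memslots(struct toy_kvm *kvm, struct toy_memslots *next)
{
	if (kvm->mn_active_invalidate_count)
		return false;

	kvm->active = next;
	return true;
}
```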
* Re: [PATCH RFC 01/12] Documentation: KVM: Elaborate comment on kvm_usage_lock
2026-05-27 15:33 ` [PATCH RFC 01/12] Documentation: KVM: Elaborate comment on kvm_usage_lock Ackerley Tng via B4 Relay
@ 2026-06-25 18:12 ` Sean Christopherson
0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2026-06-25 18:12 UTC (permalink / raw)
To: Ackerley Tng
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Fuad Tabba, vannapurve, x86,
H. Peter Anvin, kvm, linux-doc, linux-kernel, loongarch
On Wed, May 27, 2026, Ackerley Tng wrote:
> The original comment talks about cpus_read_lock() and kvm_usage_count, but
> doesn't explain why they are related.
>
> Elaborate comment on kvm_usage_lock to provide more context.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---
> Documentation/virt/kvm/locking.rst | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> index 662231e958a07..5564c8b38b9cc 100644
> --- a/Documentation/virt/kvm/locking.rst
> +++ b/Documentation/virt/kvm/locking.rst
> @@ -248,8 +248,23 @@ time it will be set using the Dirty tracking mechanism described above.
> :Arch: any
> :Protects: - kvm_usage_count
> - hardware virtualization enable/disable
> -:Comment: Exists to allow taking cpus_read_lock() while kvm_usage_count is
> - protected, which simplifies the virtualization enabling logic.
> +:Comment: ``kvm_usage_count`` serves to deduplicate hardware
> + virtualization enabling and disabling requests from different VMs
> + being created.
kvm_usage_count does that and more, i.e. this is "wrong" by being incomplete.
> +
> + Hardware virtualization enabling/disabling requires taking
> + ``cpus_read_lock()``.
> +
> + ``kvm_lock`` used to also protect ``kvm_usage_count``, but other
> + parts of the Linux kernel holding ``cpus_read_lock()`` need to
> + call into KVM to ensure that VM state remains consistent with the
> + host's state. For example, when the CPU frequency changes, KVM is
> + notified. ``kvmclock_cpufreq_notifier()`` takes ``kvm_lock`` to
> + iterate ``vm_list``.
> +
> + To decouple these, use different locks, ``kvm_lock`` for
> + ``vm_list`` and ``kvm_usage_lock`` for enabling/disabling hardware
> + virtualization.
I appreciate the effort, but honestly I think this does more harm than good. I
already know what this code does, and the above confused me more than anything.
>
> ``kvm->mn_invalidate_lock``
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> --
> 2.54.0.823.g6e5bcc1fc9-goog
>
* Re: [PATCH RFC 02/12] Documentation: KVM: Consolidate notes about cpu_read_lock() and kvm_lock
2026-05-27 15:33 ` [PATCH RFC 02/12] Documentation: KVM: Consolidate notes about cpu_read_lock() and kvm_lock Ackerley Tng via B4 Relay
@ 2026-06-25 18:12 ` Sean Christopherson
0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2026-06-25 18:12 UTC (permalink / raw)
To: Ackerley Tng
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Fuad Tabba, vannapurve, x86,
H. Peter Anvin, kvm, linux-doc, linux-kernel, loongarch
On Wed, May 27, 2026, Ackerley Tng wrote:
> Move the detail about cpu_read_lock() and kvm_lock to where the acquisition
> order is mentioned.
Why?
* Re: [PATCH RFC 03/12] Documentation: KVM: Consolidate notes about kvm->slots_lock and irq_lock
2026-05-27 15:33 ` [PATCH RFC 03/12] Documentation: KVM: Consolidate notes about kvm->slots_lock and irq_lock Ackerley Tng via B4 Relay
@ 2026-06-25 18:12 ` Sean Christopherson
0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2026-06-25 18:12 UTC (permalink / raw)
To: Ackerley Tng
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Fuad Tabba, vannapurve, x86,
H. Peter Anvin, kvm, linux-doc, linux-kernel, loongarch
On Wed, May 27, 2026, Ackerley Tng wrote:
> Move the detail about ordering between kvm->slots_lock and kvm->irq_lock to
> where the two locks are first mentioned.
Why?
* Re: [PATCH RFC 08/12] Documentation: KVM: Add example for kvm->srcu in relation to mutex/lock
2026-05-27 15:33 ` [PATCH RFC 08/12] Documentation: KVM: Add example for kvm->srcu in relation to mutex/lock Ackerley Tng via B4 Relay
@ 2026-06-25 18:17 ` Sean Christopherson
0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2026-06-25 18:17 UTC (permalink / raw)
To: Ackerley Tng
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Fuad Tabba, vannapurve, x86,
H. Peter Anvin, kvm, linux-doc, linux-kernel, loongarch
On Wed, May 27, 2026, Ackerley Tng wrote:
> Add example of where vcpu->mutex and kvm->slots_lock are held while calling
> synchronize_srcu(&kvm->srcu) to concretely show where the synchronization
> primitives overlap.
Sorry, but NAK. This is too x86-centric, and IMO the risk of the documentation
becoming stale and confusing outweighs any benefits from providing an incomplete
example. As with the kvm_usage_count stuff, I know the code in question,
and the example confused me and made it harder to understand the rule(s).
* Re: [PATCH RFC 10/12] KVM: guest_memfd: Clarify comment about gmem.file vs kvm->srcu
2026-05-27 15:33 ` [PATCH RFC 10/12] KVM: guest_memfd: Clarify comment about gmem.file vs kvm->srcu Ackerley Tng via B4 Relay
@ 2026-06-25 18:19 ` Sean Christopherson
0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2026-06-25 18:19 UTC (permalink / raw)
To: Ackerley Tng
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Fuad Tabba, vannapurve, x86,
H. Peter Anvin, kvm, linux-doc, linux-kernel, loongarch
On Wed, May 27, 2026, Ackerley Tng wrote:
> Clarify the existing comment about synchronize_srcu() and
> kvm_gmem_get_pfn() to provide further context. Explain which
> synchronize_srcu() prevents races with how kvm_gmem_get_pfn() is used.
>
> Also point reader to documentation for better understanding.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---
> virt/kvm/guest_memfd.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 69c9d6d546b28..f2218db0af980 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -711,8 +711,13 @@ static void __kvm_gmem_unbind(struct kvm_memory_slot *slot, struct gmem_file *f)
> xa_store_range(&f->bindings, start, end - 1, NULL, GFP_KERNEL);
>
> /*
> - * synchronize_srcu(&kvm->srcu) ensured that kvm_gmem_get_pfn()
> - * cannot see this memslot.
> + * This is called when memslots are updated, after the old
> + * memslot container is no longer in
> + * use. synchronize_srcu(&kvm->srcu) was called there, so
> + * kvm_gmem_get_pfn() from KVM's guest fault handling cannot
> + * see this memslot. See Documentation/virt/kvm/locking.rst
> + * for more information about kvm->srcu and the memslots
> + * container.
If we want to add to this comment, I would much rather do so as part of an update
to kvm_gmem_release()'s comment as well.
https://lore.kernel.org/all/20251113232229.1698886-1-seanjc@google.com
* Re: [PATCH RFC 12/12] Documentation: KVM: Focus acquisition order section on preventing deadlocks
2026-05-27 15:33 ` [PATCH RFC 12/12] Documentation: KVM: Focus acquisition order section on preventing deadlocks Ackerley Tng via B4 Relay
@ 2026-06-25 18:25 ` Sean Christopherson
0 siblings, 0 replies; 19+ messages in thread
From: Sean Christopherson @ 2026-06-25 18:25 UTC (permalink / raw)
To: Ackerley Tng
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Tianrui Zhao,
Bibo Mao, Huacai Chen, WANG Xuerui, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, Fuad Tabba, vannapurve, x86,
H. Peter Anvin, kvm, linux-doc, linux-kernel, loongarch
On Wed, May 27, 2026, Ackerley Tng wrote:
> Now that the first sentence is already described in more detail in the new
> section on synchronization while managing guest faults, drop the first
> sentence.
Nope, nothing in that section says anything about the role of
mn_active_invalidate_count.