* [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup
2025-06-05 15:24 [PATCH 0/3] SEV-SNP fix for cpu soft lockup on 1TB+ guests Liam Merwick
@ 2025-06-05 15:25 ` Liam Merwick
2025-06-05 15:57 ` Sean Christopherson
2025-06-05 15:25 ` [PATCH 2/3] KVM: Add trace_kvm_vm_set_mem_attributes() Liam Merwick
2025-06-05 15:25 ` [PATCH 3/3] KVM: fix typo in kvm_vm_set_mem_attributes() comment Liam Merwick
2 siblings, 1 reply; 8+ messages in thread
From: Liam Merwick @ 2025-06-05 15:25 UTC (permalink / raw)
To: kvm
Cc: liam.merwick, pbonzini, seanjc, thomas.lendacky, michael.roth,
tabba, ackerleytng
When booting an SEV-SNP guest with a sufficiently large amount of memory (1TB+),
the host can experience CPU soft lockups when running an operation in
kvm_vm_set_mem_attributes() to set memory attributes on the whole
range of guest memory.
watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [qemu-kvm:6372]
CPU: 8 UID: 0 PID: 6372 Comm: qemu-kvm Kdump: loaded Not tainted 6.15.0-rc7.20250520.el9uek.rc1.x86_64 #1 PREEMPT(voluntary)
Hardware name: Oracle Corporation ORACLE SERVER E4-2c/Asm,MB Tray,2U,E4-2c, BIOS 78016600 11/13/2024
RIP: 0010:xas_create+0x78/0x1f0
Code: 00 00 00 41 80 fc 01 0f 84 82 00 00 00 ba 06 00 00 00 bd 06 00 00 00 49 8b 45 08 4d 8d 65 08 41 39 d6 73 20 83 ed 06 48 85 c0 <74> 67 48 89 c2 83 e2 03 48 83 fa 02 75 0c 48 3d 00 10 00 00 0f 87
RSP: 0018:ffffad890a34b940 EFLAGS: 00000286
RAX: ffff96f30b261daa RBX: ffffad890a34b9c8 RCX: 0000000000000000
RDX: 000000000000001e RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffad890a356868
R13: ffffad890a356860 R14: 0000000000000000 R15: ffffad890a356868
FS: 00007f5578a2a400(0000) GS:ffff97ed317e1000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f015c70fb18 CR3: 00000001109fd006 CR4: 0000000000f70ef0
PKRU: 55555554
Call Trace:
<TASK>
xas_store+0x58/0x630
? srso_alias_return_thunk+0x5/0xfbef5
? asm_sysvec_apic_timer_interrupt+0x1a/0x20
__xa_store+0xa5/0x130
xa_store+0x2c/0x50
kvm_vm_set_mem_attributes+0x343/0x710 [kvm]
kvm_vm_ioctl+0x796/0xab0 [kvm]
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? rseq_ip_fixup+0x8c/0x1e0
__x64_sys_ioctl+0xa3/0xd0
do_syscall_64+0x8c/0x7a0
? srso_alias_return_thunk+0x5/0xfbef5
? __alloc_frozen_pages_noprof+0x18d/0x340
? srso_alias_return_thunk+0x5/0xfbef5
? try_charge_memcg+0x76/0x640
? srso_alias_return_thunk+0x5/0xfbef5
? __count_memcg_events+0xbb/0x150
? srso_alias_return_thunk+0x5/0xfbef5
? __mod_memcg_lruvec_state+0xb6/0x1b0
? srso_alias_return_thunk+0x5/0xfbef5
? __lruvec_stat_mod_folio+0x83/0xd0
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? set_ptes.isra.0+0x36/0x90
? srso_alias_return_thunk+0x5/0xfbef5
? do_anonymous_page+0x103/0x4d0
? srso_alias_return_thunk+0x5/0xfbef5
? __handle_mm_fault+0x397/0x6f0
? srso_alias_return_thunk+0x5/0xfbef5
? __count_memcg_events+0xbb/0x150
? srso_alias_return_thunk+0x5/0xfbef5
? count_memcg_events.constprop.0+0x26/0x50
? srso_alias_return_thunk+0x5/0xfbef5
? handle_mm_fault+0x245/0x350
? srso_alias_return_thunk+0x5/0xfbef5
? do_user_addr_fault+0x221/0x686
? srso_alias_return_thunk+0x5/0xfbef5
? arch_exit_to_user_mode_prepare.isra.0+0x1e/0xd0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f5578d031bb
Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d 4c 0f 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe0a742b88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000004020aed2 RCX: 00007f5578d031bb
RDX: 00007ffe0a742c80 RSI: 000000004020aed2 RDI: 000000000000000b
RBP: 0000010000000000 R08: 0000010000000000 R09: 0000017680000000
R10: 0000000000000080 R11: 0000000000000246 R12: 00005575e5f95120
R13: 00007ffe0a742c80 R14: 0000000000000008 R15: 00005575e5f961e0
Limit the range of memory per operation when setting the attributes to
avoid holding kvm->slots_lock for too long and causing a cpu soft lockup.
Fixes: 5a475554db1e ("KVM: Introduce per-page memory attributes")
Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
virt/kvm/kvm_main.c | 37 ++++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 69782df3617f..6e6d404a7d7a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2533,7 +2533,9 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
struct kvm_memory_attributes *attrs)
{
- gfn_t start, end;
+ gfn_t start, end, section_start, section_end;
+ u64 size, size_remaining;
+ int ret = 0;
/* flags is currently not used. */
if (attrs->flags)
@@ -2545,9 +2547,6 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
if (!PAGE_ALIGNED(attrs->address) || !PAGE_ALIGNED(attrs->size))
return -EINVAL;
- start = attrs->address >> PAGE_SHIFT;
- end = (attrs->address + attrs->size) >> PAGE_SHIFT;
-
/*
* xarray tracks data using "unsigned long", and as a result so does
* KVM. For simplicity, supports generic attributes only on 64-bit
@@ -2555,7 +2554,35 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
*/
BUILD_BUG_ON(sizeof(attrs->attributes) != sizeof(unsigned long));
- return kvm_vm_set_mem_attributes(kvm, start, end, attrs->attributes);
+ size_remaining = attrs->size;
+ section_start = start = attrs->address >> PAGE_SHIFT;
+ section_end = end = (attrs->address + attrs->size) >> PAGE_SHIFT;
+ while (size_remaining > 0) {
+ /*
+ * If the range of memory is greater than 512GB, clamp it for
+ * this iteration to 512GB. This avoids a potential CPU soft
+ * lockup when run on a larger range for an SEV-SNP guest.
+ * (measured at 940GB so there is some headroom, just in case).
+ */
+ if (size_remaining > SZ_512G) {
+ size = SZ_512G;
+ size_remaining -= size;
+ section_end = section_start + (size >> PAGE_SHIFT);
+ } else {
+ size = size_remaining;
+ size_remaining = 0;
+ section_end = end;
+ WARN_ON_ONCE(section_end != (section_start + (size >> PAGE_SHIFT)));
+ }
+
+ ret = kvm_vm_set_mem_attributes(kvm, section_start, section_end, attrs->attributes);
+ if (ret != 0)
+ break;
+
+ section_start = section_end;
+ }
+
+ return ret;
}
#endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */
--
2.47.1
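The chunking arithmetic in the patch above can be sketched in plain userspace C. This is only an illustration of how a gfn range is split into 512GiB pieces as the patch was posted; the names `split_gfn_range`, `struct gfn_range`, `PAGE_SHIFT_4K` and `CHUNK_BYTES` are hypothetical, and a 4KiB page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12              /* assumption: 4KiB pages */
#define CHUNK_BYTES   (512ULL << 30)  /* 512GiB, the batch size used in the patch */

/* Hypothetical helper type: a half-open range [start, end) of guest frame numbers. */
struct gfn_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Split [start, end) into pieces of at most chunk_pages pages.
 * At most max pieces are written to out; the return value is the
 * total number of pieces the range requires.
 */
static size_t split_gfn_range(uint64_t start, uint64_t end,
			      uint64_t chunk_pages,
			      struct gfn_range *out, size_t max)
{
	size_t n = 0;

	if (chunk_pages == 0)
		return 0;

	while (start < end) {
		uint64_t len = end - start;

		if (len > chunk_pages)
			len = chunk_pages;

		if (n < max) {
			out[n].start = start;
			out[n].end = start + len;
		}
		n++;
		start += len;
	}

	return n;
}
```

Calling `split_gfn_range(0x10000000, 0x2da80000, CHUNK_BYTES >> PAGE_SHIFT_4K, ...)` yields four pieces, which lines up with the four consecutive ranges shown in the sample trace output of patch 2/3.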
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup
2025-06-05 15:25 ` [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup Liam Merwick
@ 2025-06-05 15:57 ` Sean Christopherson
2025-06-05 19:03 ` Liam Merwick
0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2025-06-05 15:57 UTC (permalink / raw)
To: Liam Merwick
Cc: kvm, pbonzini, thomas.lendacky, michael.roth, tabba, ackerleytng
On Thu, Jun 05, 2025, Liam Merwick wrote:
> When booting an SEV-SNP guest with a sufficiently large amount of memory (1TB+),
> the host can experience CPU soft lockups when running an operation in
> kvm_vm_set_mem_attributes() to set memory attributes on the whole
> range of guest memory.
>
> watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [qemu-kvm:6372]
> CPU: 8 UID: 0 PID: 6372 Comm: qemu-kvm Kdump: loaded Not tainted 6.15.0-rc7.20250520.el9uek.rc1.x86_64 #1 PREEMPT(voluntary)
> Hardware name: Oracle Corporation ORACLE SERVER E4-2c/Asm,MB Tray,2U,E4-2c, BIOS 78016600 11/13/2024
> RIP: 0010:xas_create+0x78/0x1f0
> Code: 00 00 00 41 80 fc 01 0f 84 82 00 00 00 ba 06 00 00 00 bd 06 00 00 00 49 8b 45 08 4d 8d 65 08 41 39 d6 73 20 83 ed 06 48 85 c0 <74> 67 48 89 c2 83 e2 03 48 83 fa 02 75 0c 48 3d 00 10 00 00 0f 87
> RSP: 0018:ffffad890a34b940 EFLAGS: 00000286
> RAX: ffff96f30b261daa RBX: ffffad890a34b9c8 RCX: 0000000000000000
> RDX: 000000000000001e RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffffad890a356868
> R13: ffffad890a356860 R14: 0000000000000000 R15: ffffad890a356868
> FS: 00007f5578a2a400(0000) GS:ffff97ed317e1000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f015c70fb18 CR3: 00000001109fd006 CR4: 0000000000f70ef0
> PKRU: 55555554
> Call Trace:
> <TASK>
> xas_store+0x58/0x630
Trim the '?' lines when including a backtrace in a changelog, they're pure noise.
> __xa_store+0xa5/0x130
> xa_store+0x2c/0x50
> kvm_vm_set_mem_attributes+0x343/0x710 [kvm]
> kvm_vm_ioctl+0x796/0xab0 [kvm]
> __x64_sys_ioctl+0xa3/0xd0
> do_syscall_64+0x8c/0x7a0
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x7f5578d031bb
> Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d 4c 0f 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffe0a742b88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000004020aed2 RCX: 00007f5578d031bb
> RDX: 00007ffe0a742c80 RSI: 000000004020aed2 RDI: 000000000000000b
> RBP: 0000010000000000 R08: 0000010000000000 R09: 0000017680000000
> R10: 0000000000000080 R11: 0000000000000246 R12: 00005575e5f95120
> R13: 00007ffe0a742c80 R14: 0000000000000008 R15: 00005575e5f961e0
>
> Limit the range of memory per operation when setting the attributes to
> avoid holding kvm->slots_lock for too long and causing a cpu soft lockup.
Holding slots_lock is totally fine. Presumably the issue is that the CPU never
reschedules.
E.g. I would expect this to make the problem go away, though it's probably not a
complete fix (I'm guessing kvm_range_has_memory_attributes() can be made to yell
too).
I'd strongly prefer to avoid arbitrary batching, because that raises a bunch of
questions that are difficult to answer, e.g. what guarantees 512GiB is a "good"
batch size on _all_ systems.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b24db92e98f3..28230bad43f4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2513,6 +2513,8 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
r = xa_reserve(&kvm->mem_attr_array, i, GFP_KERNEL_ACCOUNT);
if (r)
goto out_unlock;
+
+ cond_resched();
}
kvm_handle_gfn_range(kvm, &pre_set_range);
@@ -2521,6 +2523,7 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
r = xa_err(xa_store(&kvm->mem_attr_array, i, entry,
GFP_KERNEL_ACCOUNT));
KVM_BUG_ON(r, kvm);
+ cond_resched();
}
kvm_handle_gfn_range(kvm, &post_set_range);
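The diff above keeps a single pass over the whole range and adds a voluntary reschedule point to each loop iteration. A rough userspace analogue of that loop shape is sketched below. It assumes a hypothetical per-index callback (`index_op`) in place of xa_reserve()/xa_store(), and uses POSIX sched_yield() as a stand-in for cond_resched(). The two are not equivalent: cond_resched() only reschedules when a reschedule is pending, whereas sched_yield() always enters the scheduler, so this shows the structure only, not the cost.

```c
#include <sched.h>
#include <stdint.h>

/* Hypothetical per-index operation; stands in for xa_reserve()/xa_store(). */
typedef int (*index_op)(uint64_t index, void *ctx);

/* Example callback: counts how many indices were visited. */
static int count_index(uint64_t index, void *ctx)
{
	(void)index;
	*(uint64_t *)ctx += 1;
	return 0;
}

/*
 * Apply op to every index in [start, end), stopping at the first error.
 * After each successful step the thread offers the CPU to other
 * runnable tasks instead of running the whole range uninterrupted.
 */
static int for_each_index(uint64_t start, uint64_t end, index_op op, void *ctx)
{
	for (uint64_t i = start; i < end; i++) {
		int r = op(i, ctx);

		if (r)
			return r;

		sched_yield(); /* userspace stand-in for cond_resched() */
	}

	return 0;
}
```

No batch size has to be chosen with this shape, which is the concern raised about the 512GiB constant.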
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup
2025-06-05 15:57 ` Sean Christopherson
@ 2025-06-05 19:03 ` Liam Merwick
2025-06-05 19:08 ` Sean Christopherson
0 siblings, 1 reply; 8+ messages in thread
From: Liam Merwick @ 2025-06-05 19:03 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, pbonzini, thomas.lendacky, michael.roth, tabba, ackerleytng,
liam.merwick
On 05/06/2025 16:57, Sean Christopherson wrote:
> On Thu, Jun 05, 2025, Liam Merwick wrote:
>> When booting an SEV-SNP guest with a sufficiently large amount of memory (1TB+),
>> the host can experience CPU soft lockups when running an operation in
>> kvm_vm_set_mem_attributes() to set memory attributes on the whole
>> range of guest memory.
>>
>> watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [qemu-kvm:6372]
>> CPU: 8 UID: 0 PID: 6372 Comm: qemu-kvm Kdump: loaded Not tainted 6.15.0-rc7.20250520.el9uek.rc1.x86_64 #1 PREEMPT(voluntary)
>> Hardware name: Oracle Corporation ORACLE SERVER E4-2c/Asm,MB Tray,2U,E4-2c, BIOS 78016600 11/13/2024
>> RIP: 0010:xas_create+0x78/0x1f0
>> Code: 00 00 00 41 80 fc 01 0f 84 82 00 00 00 ba 06 00 00 00 bd 06 00 00 00 49 8b 45 08 4d 8d 65 08 41 39 d6 73 20 83 ed 06 48 85 c0 <74> 67 48 89 c2 83 e2 03 48 83 fa 02 75 0c 48 3d 00 10 00 00 0f 87
>> RSP: 0018:ffffad890a34b940 EFLAGS: 00000286
>> RAX: ffff96f30b261daa RBX: ffffad890a34b9c8 RCX: 0000000000000000
>> RDX: 000000000000001e RSI: 0000000000000000 RDI: 0000000000000000
>> RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: ffffad890a356868
>> R13: ffffad890a356860 R14: 0000000000000000 R15: ffffad890a356868
>> FS: 00007f5578a2a400(0000) GS:ffff97ed317e1000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f015c70fb18 CR3: 00000001109fd006 CR4: 0000000000f70ef0
>> PKRU: 55555554
>> Call Trace:
>> <TASK>
>> xas_store+0x58/0x630
>
> Trim the '?' lines when including a backtrace in a changelog, they're pure noise.
>
Ack
>> __xa_store+0xa5/0x130
>> xa_store+0x2c/0x50
>> kvm_vm_set_mem_attributes+0x343/0x710 [kvm]
>> kvm_vm_ioctl+0x796/0xab0 [kvm]
>> __x64_sys_ioctl+0xa3/0xd0
>> do_syscall_64+0x8c/0x7a0
>> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> RIP: 0033:0x7f5578d031bb
>> Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d 4c 0f 00 f7 d8 64 89 01 48
>> RSP: 002b:00007ffe0a742b88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 000000004020aed2 RCX: 00007f5578d031bb
>> RDX: 00007ffe0a742c80 RSI: 000000004020aed2 RDI: 000000000000000b
>> RBP: 0000010000000000 R08: 0000010000000000 R09: 0000017680000000
>> R10: 0000000000000080 R11: 0000000000000246 R12: 00005575e5f95120
>> R13: 00007ffe0a742c80 R14: 0000000000000008 R15: 00005575e5f961e0
>>
>> Limit the range of memory per operation when setting the attributes to
>> avoid holding kvm->slots_lock for too long and causing a cpu soft lockup.
>
> Holding slots_lock is totally fine. Presumably the issue is that the CPU never
> reschedules.
>
> E.g. I would expect this to make the problem go away, though it's probably not a
> complete fix (I'm guessing kvm_range_has_memory_attributes() can be made to yell
> too).
>
That indeed works. I couldn't trigger anything in
kvm_range_has_memory_attributes() but am limited to about 2TiB.
I'll do some more tracing before I send a v2 to see if there are any more
places that might be close to hitting the limit.
Thanks,
Liam
> I'd strongly prefer to avoid arbitrary batching, because that raises a bunch of
> questions that are difficult to answer, e.g. what guarantees 512GiB is a "good"
> batch size on _all_ systems.
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b24db92e98f3..28230bad43f4 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2513,6 +2513,8 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> r = xa_reserve(&kvm->mem_attr_array, i, GFP_KERNEL_ACCOUNT);
> if (r)
> goto out_unlock;
> +
> + cond_resched();
> }
>
> kvm_handle_gfn_range(kvm, &pre_set_range);
> @@ -2521,6 +2523,7 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> r = xa_err(xa_store(&kvm->mem_attr_array, i, entry,
> GFP_KERNEL_ACCOUNT));
> KVM_BUG_ON(r, kvm);
> + cond_resched();
> }
>
> kvm_handle_gfn_range(kvm, &post_set_range);
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup
2025-06-05 19:03 ` Liam Merwick
@ 2025-06-05 19:08 ` Sean Christopherson
2025-06-07 21:14 ` Liam Merwick
0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2025-06-05 19:08 UTC (permalink / raw)
To: Liam Merwick
Cc: kvm, pbonzini, thomas.lendacky, michael.roth, tabba, ackerleytng
On Thu, Jun 05, 2025, Liam Merwick wrote:
> On 05/06/2025 16:57, Sean Christopherson wrote:
> > On Thu, Jun 05, 2025, Liam Merwick wrote:
> > > Limit the range of memory per operation when setting the attributes to
> > > avoid holding kvm->slots_lock for too long and causing a cpu soft lockup.
> >
> > Holding slots_lock is totally fine. Presumably the issue is that the CPU never
> > reschedules.
> >
> > E.g. I would expect this to make the problem go away, though it's probably not a
> > complete fix (I'm guessing kvm_range_has_memory_attributes() can be made to yell
> > too).
>
> That indeed works. I couldn't trigger anything in
> kvm_range_has_memory_attributes() but am limited to about 2TiB. I'll do some
> more tracing before I send a v2 to see if there are any more places that might be
> close to hitting the limit.
To get kvm_range_has_memory_attributes() to fail, I _think_ you would need to do
a large query when the attributes match a non-zero value, so that it needs to
perform its slower search.
Ah, actually, I wouldn't be at all surprised if the issue is limited to insertion,
or even just to the xa_reserve() path that allocates memory.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup
2025-06-05 19:08 ` Sean Christopherson
@ 2025-06-07 21:14 ` Liam Merwick
0 siblings, 0 replies; 8+ messages in thread
From: Liam Merwick @ 2025-06-07 21:14 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, pbonzini, thomas.lendacky, michael.roth, tabba, ackerleytng
On 05/06/2025 20:08, Sean Christopherson wrote:
> On Thu, Jun 05, 2025, Liam Merwick wrote:
>> On 05/06/2025 16:57, Sean Christopherson wrote:
>>> On Thu, Jun 05, 2025, Liam Merwick wrote:
>>>> Limit the range of memory per operation when setting the attributes to
>>>> avoid holding kvm->slots_lock for too long and causing a cpu soft lockup.
>>>
>>> Holding slots_lock is totally fine. Presumably the issue is that the CPU never
>>> reschedules.
>>>
>>> E.g. I would expect this to make the problem go away, though it's probably not a
>>> complete fix (I'm guessing kvm_range_has_memory_attributes() can be made to yell
>>> too).
>>
>> That indeed works. I couldn't trigger anything in
>> kvm_range_has_memory_attributes() but am limited to about 2TiB. I'll do some
>> more tracing before I send a v2 to see if there are any more places that might be
>> close to hitting the limit.
>
> To get kvm_range_has_memory_attributes() to fail, I _think_ you would need to do
> a large query when the attributes match a non-zero value, so that it needs to
> perform its slower search.
>
> Ah, actually, I wouldn't be at all surprised if the issue is limited to insertion,
> or even just to the xa_reserve() path that allocates memory.
Yes indeed, the kvm_range_has_memory_attributes() operation has a much
lower overhead. kvm_vm_set_mem_attributes() has that outlier of 99 sec
for 1.9 TiB:
kvm_range_has_memory_attributes
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  966532
            1024 |                                         9781
            2048 |                                         1355
            4096 |                                         2449
            8192 |                                         843
           16384 |                                         240
           32768 |                                         3
           65536 |                                         239
          131072 |                                         2
          262144 |                                         1
          524288 |                                         3
         1048576 |                                         1
         2097152 |                                         0
kvm_vm_set_mem_attributes
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |                                         2
            2048 |@@@@@@@@@@@@@                            1496
            4096 |@@@@@@@                                  813
            8192 |@@@@@@@@@@@@@@                           1621
           16384 |@@@@                                     432
           32768 |                                         12
           65536 |@@                                       239
          131072 |                                         6
          262144 |                                         4
          524288 |                                         3
         1048576 |                                         1
         2097152 |                                         0
         4194304 |                                         0
         8388608 |                                         8
        16777216 |                                         0
        33554432 |                                         0
        67108864 |                                         1
       134217728 |                                         0
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |                                         0
      8589934592 |                                         0
     17179869184 |                                         0
     34359738368 |                                         0
     68719476736 |                                         1
    137438953472 |                                         0
(As a test, I also inserted an additional call to
kvm_range_has_memory_attributes() over the whole range of memory with a
different attribute value and didn't hit any pathological behaviour).
I'll send a v2 with the suggested fix.
Regards,
Liam
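The histograms in the message above use power-of-two buckets. A minimal sketch of that bucketing follows, assuming each row counts samples in the range [value, 2 * value) and that the values are nanoseconds; neither is stated in the thread. The helper name `pow2_bucket` is hypothetical.

```c
#include <stdint.h>

/*
 * Return the largest power of two that is <= value, i.e. the lower
 * bound of the power-of-two bucket the sample falls into.
 * A sample of 0 is reported as bucket 0.
 */
static uint64_t pow2_bucket(uint64_t value)
{
	uint64_t bucket = 1;

	if (value == 0)
		return 0;

	/* Comparing against value / 2 avoids overflow when doubling. */
	while (bucket <= value / 2)
		bucket *= 2;

	return bucket;
}
```

Under the nanosecond assumption, the 99 sec outlier is 99000000000 ns, and `pow2_bucket(99000000000)` returns 68719476736, which is the row holding the single long-running kvm_vm_set_mem_attributes() sample.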
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/3] KVM: Add trace_kvm_vm_set_mem_attributes()
2025-06-05 15:24 [PATCH 0/3] SEV-SNP fix for cpu soft lockup on 1TB+ guests Liam Merwick
2025-06-05 15:25 ` [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup Liam Merwick
@ 2025-06-05 15:25 ` Liam Merwick
2025-06-05 15:25 ` [PATCH 3/3] KVM: fix typo in kvm_vm_set_mem_attributes() comment Liam Merwick
2 siblings, 0 replies; 8+ messages in thread
From: Liam Merwick @ 2025-06-05 15:25 UTC (permalink / raw)
To: kvm
Cc: liam.merwick, pbonzini, seanjc, thomas.lendacky, michael.roth,
tabba, ackerleytng
Add a tracing function to display the attributes being set for
a range of guest memory.
Sample output:
<...>-12693 [059] ..... 1342.536361: kvm_vm_set_mem_attributes: 0x00000000000000 -- 0x00000000080000 [0x8]
qemu-kvm-12693 [187] ..... 1342.747651: kvm_vm_set_mem_attributes: . 0x00000010000000 -- 0x00000018000000 [0x8]
qemu-kvm-12693 [040] .N... 1366.473790: kvm_vm_set_mem_attributes: . 0x00000018000000 -- 0x00000020000000 [0x8]
qemu-kvm-12693 [009] .N... 1390.350362: kvm_vm_set_mem_attributes: . 0x00000020000000 -- 0x00000028000000 [0x8]
qemu-kvm-12693 [008] .N... 1414.154231: kvm_vm_set_mem_attributes: 0x00000028000000 -- 0x0000002da80000 [0x8]
qemu-kvm-12693 [136] ..... 1430.988101: kvm_vm_set_mem_attributes: 0x000000000ffc00 -- 0x00000000100000 [0x8]
qemu-kvm-12693 [024] ..... 1431.029798: kvm_vm_set_mem_attributes: 0x00000000000000 -- 0x000000000000c0 [0x8]
The '.' before the addresses above signifies that the initial request
was split into multiple operations. The original request was to set
the attributes on the range 0x00000010000000 to 0x0000002da80000.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
include/trace/events/kvm.h | 33 +++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 4 ++++
2 files changed, 37 insertions(+)
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index fc7d0f8ff078..701bf1f88850 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -473,6 +473,39 @@ TRACE_EVENT(kvm_dirty_ring_exit,
TP_printk("vcpu %d", __entry->vcpu_id)
);
+#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES
+/*
+ * @start: Starting address of guest memory range
+ * @end: End address of guest memory range
+ * @attr: The value of the attribute being set.
+ * @indent: If true, indent output displayed (printing '.' is used to
+ * indicate that the transaction was split into multiple
+ * operations and more are to follow).
+ */
+TRACE_EVENT(kvm_vm_set_mem_attributes,
+ TP_PROTO(gfn_t start, gfn_t end, unsigned long attr, bool indent),
+ TP_ARGS(start, end, attr, indent),
+
+ TP_STRUCT__entry(
+ __field(gfn_t, start)
+ __field(gfn_t, end)
+ __field(unsigned long, attr)
+ __field(bool, indent)
+ ),
+
+ TP_fast_assign(
+ __entry->start = start;
+ __entry->end = end;
+ __entry->attr = attr;
+ __entry->indent = indent;
+ ),
+
+ TP_printk("%s %#016llx -- %#016llx [0x%lx]",
+ __entry->indent ? " ." : "",
+ __entry->start, __entry->end, __entry->attr)
+);
+#endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */
+
TRACE_EVENT(kvm_unmap_hva_range,
TP_PROTO(unsigned long start, unsigned long end),
TP_ARGS(start, end),
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6e6d404a7d7a..464357ea638c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2568,11 +2568,15 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
size = SZ_512G;
size_remaining -= size;
section_end = section_start + (size >> PAGE_SHIFT);
+ trace_kvm_vm_set_mem_attributes(section_start, section_end,
+ attrs->attributes, true);
} else {
size = size_remaining;
size_remaining = 0;
section_end = end;
WARN_ON_ONCE(section_end != (section_start + (size >> PAGE_SHIFT)));
+ trace_kvm_vm_set_mem_attributes(section_start, section_end,
+ attrs->attributes, false);
}
ret = kvm_vm_set_mem_attributes(kvm, section_start, section_end, attrs->attributes);
--
2.47.1
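The TP_printk() format string in the patch above can be tried out in userspace with snprintf(). The sketch below reuses the same format string; the wrapper name `format_event` is hypothetical, and the output of the C library's printf is only assumed to match the kernel's formatter for the non-zero values used here.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format one event the way the tracepoint's TP_printk() does:
 * an optional " ." marker for a split request, then the start and
 * end of the range and the attribute value.
 * Returns the snprintf() result (length of the formatted string).
 */
static int format_event(char *buf, size_t len,
			unsigned long long start, unsigned long long end,
			unsigned long attr, bool indent)
{
	return snprintf(buf, len, "%s %#016llx -- %#016llx [0x%lx]",
			indent ? " ." : "", start, end, attr);
}
```

For example, `format_event(buf, sizeof(buf), 0x10000000, 0x18000000, 0x8, true)` produces ` . 0x00000010000000 -- 0x00000018000000 [0x8]`, the shape of the split entries in the sample output.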
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/3] KVM: fix typo in kvm_vm_set_mem_attributes() comment
2025-06-05 15:24 [PATCH 0/3] SEV-SNP fix for cpu soft lockup on 1TB+ guests Liam Merwick
2025-06-05 15:25 ` [PATCH 1/3] KVM: Batch setting of per-page memory attributes to avoid soft lockup Liam Merwick
2025-06-05 15:25 ` [PATCH 2/3] KVM: Add trace_kvm_vm_set_mem_attributes() Liam Merwick
@ 2025-06-05 15:25 ` Liam Merwick
2 siblings, 0 replies; 8+ messages in thread
From: Liam Merwick @ 2025-06-05 15:25 UTC (permalink / raw)
To: kvm
Cc: liam.merwick, pbonzini, seanjc, thomas.lendacky, michael.roth,
tabba, ackerleytng
It should be 'has' in the sentence and not 'as'.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
virt/kvm/kvm_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 464357ea638c..be8cf9d5864d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2501,7 +2501,7 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
mutex_lock(&kvm->slots_lock);
- /* Nothing to do if the entire range as the desired attributes. */
+ /* Nothing to do if the entire range has the desired attributes. */
if (kvm_range_has_memory_attributes(kvm, start, end, ~0, attributes))
goto out_unlock;
--
2.47.1
^ permalink raw reply related [flat|nested] 8+ messages in thread