From: Sean Christopherson <seanjc@google.com>
To: Nikita Kalyazin <kalyazin@amazon.com>
Cc: Keir Fraser <keirf@google.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Eric Auger <eric.auger@redhat.com>,
Oliver Upton <oliver.upton@linux.dev>,
Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Li RongQing <lirongqing@baidu.com>
Subject: Re: [PATCH v4 4/4] KVM: Avoid synchronize_srcu() in kvm_io_bus_register_dev()
Date: Tue, 17 Feb 2026 11:07:09 -0800
Message-ID: <aZS8XXOW7vhMkNWQ@google.com>
In-Reply-To: <dcbd7a58-c961-4510-ae48-ef7fd4f4d75c@amazon.com>
On Mon, Feb 16, 2026, Nikita Kalyazin wrote:
> On 13/02/2026 23:20, Sean Christopherson wrote:
> > On Fri, Feb 13, 2026, Nikita Kalyazin wrote:
> > > I am not aware of a way to make it fast for both use cases and would be more
> > > than happy to hear about possible solutions.
> >
> > What if we key off of vCPUs being created?  The motivation for Keir's change was
> > to avoid stalling during VM boot, i.e. *after* initial VM creation.
>
> It doesn't work as is on x86 because the delay we're seeing occurs after
> created_vcpus gets incremented

I don't follow, the suggestion was to key off created_vcpus in
kvm_io_bus_register_dev(), not in kvm_swap_active_memslots().  I can totally
imagine the patch not working, but the ordering in kvm_vm_ioctl_create_vcpu()
should be largely irrelevant.

Probably a moot point though.

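To make the earlier suggestion concrete, here is a minimal userspace sketch of
the decision only.  Everything prefixed with toy_ is a hypothetical stand-in
and not kernel code, and the direction of the check (wait synchronously until
the first vCPU exists, defer afterwards) is an assumption based on the stated
motivation for Keir's change, not a tested patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the single struct kvm field consulted here. */
struct toy_kvm {
	int created_vcpus;
};

/* How the old bus array would be reclaimed after the pointer is swapped. */
enum toy_free_mode {
	TOY_FREE_SYNC_EXPEDITED,	/* models synchronize_srcu_expedited() + free */
	TOY_FREE_DEFERRED,		/* models call_srcu() with a freeing callback */
};

/*
 * Assumed policy: no vCPUs yet means initial VM setup, so wait synchronously;
 * once vCPUs exist the guest may be booting, so defer the free instead of
 * stalling the registering task.
 */
static enum toy_free_mode toy_pick_free_mode(const struct toy_kvm *kvm)
{
	assert(kvm != NULL);

	return kvm->created_vcpus ? TOY_FREE_DEFERRED : TOY_FREE_SYNC_EXPEDITED;
}
```

The check depends only on whether created_vcpus is non-zero at registration
time, which is why the ordering inside kvm_vm_ioctl_create_vcpu() should not
matter much to it.
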
> so it doesn't allow us to differentiate the two
> cases (below is kvm_vm_ioctl_create_vcpu()):
>
> 	kvm->created_vcpus++; // <===== incremented here
> 	mutex_unlock(&kvm->lock);
>
> 	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL_ACCOUNT);
> 	if (!vcpu) {
> 		r = -ENOMEM;
> 		goto vcpu_decrement;
> 	}
>
> 	BUILD_BUG_ON(sizeof(struct kvm_run) > PAGE_SIZE);
> 	page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> 	if (!page) {
> 		r = -ENOMEM;
> 		goto vcpu_free;
> 	}
> 	vcpu->run = page_address(page);
>
> 	kvm_vcpu_init(vcpu, kvm, id);
>
> 	r = kvm_arch_vcpu_create(vcpu); // <===== the delay is here
>
>
> firecracker 583 [001] 151.297145: probe:synchronize_srcu_expedited:
> (ffffffff813e5cf0)
> ffffffff813e5cf1 synchronize_srcu_expedited+0x1 ([kernel.kallsyms])
> ffffffff81234986 kvm_swap_active_memslots+0x136 ([kernel.kallsyms])
> ffffffff81236cdd kvm_set_memslot+0x1cd ([kernel.kallsyms])
> ffffffff81237518 kvm_set_memory_region.part.0+0x478 ([kernel.kallsyms])
> ffffffff81264dbc __x86_set_memory_region+0xec ([kernel.kallsyms])
> ffffffff8127e2dc kvm_alloc_apic_access_page+0x5c ([kernel.kallsyms])
> ffffffff812b9ed3 vmx_vcpu_create+0x193 ([kernel.kallsyms])
> ffffffff8126788a kvm_arch_vcpu_create+0x1da ([kernel.kallsyms])
> ffffffff8123c54c kvm_vm_ioctl+0x5fc ([kernel.kallsyms])
> ffffffff8167b331 __x64_sys_ioctl+0x91 ([kernel.kallsyms])
> ffffffff8251a89c do_syscall_64+0x4c ([kernel.kallsyms])
> ffffffff8100012b entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
> 6512de ioctl+0x32 (/mnt/host/firecracker)
> d99a7 std::rt::lang_start+0x37 (/mnt/host/firecracker)
>
> Also, given that it stumbles after the KVM_CREATE_VCPU on ARM (in
> KVM_SET_USER_MEMORY_REGION), it doesn't look like a universal solution.

Hmm.  Under the hood, __synchronize_srcu() itself uses __call_srcu(), so I _think_
the only practical difference (aside from waiting, obviously) between call_srcu()
and synchronize_srcu_expedited() with respect to "transferring" grace period
latency is that using call_srcu() could start a normal, non-expedited grace period.

IIUC, SRCU has best-effort logic to shift in-flight non-expedited grace periods
to expedited mode, but if the normal grace period has already started the timer
for the delayed invocation of process_srcu(), then SRCU will still wait for one
jiffy, i.e. won't immediately queue the work.

I have no idea if this is sane and/or acceptable, but before looping in Paul and
others, can you try this to see if it helps?

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 344ad51c8f6c..30437dc8d818 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -89,6 +89,8 @@ void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
 
 void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
 		void (*func)(struct rcu_head *head));
+void call_srcu_expedited(struct srcu_struct *ssp, struct rcu_head *rhp,
+			 rcu_callback_t func);
 void cleanup_srcu_struct(struct srcu_struct *ssp);
 void synchronize_srcu(struct srcu_struct *ssp);
 
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index ea3f128de06f..03333b079092 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1493,6 +1493,13 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
 }
 EXPORT_SYMBOL_GPL(call_srcu);
 
+void call_srcu_expedited(struct srcu_struct *ssp, struct rcu_head *rhp,
+			 rcu_callback_t func)
+{
+	__call_srcu(ssp, rhp, func, rcu_gp_is_normal());
+}
+EXPORT_SYMBOL_GPL(call_srcu_expedited);
+
 /*
  * Helper function for synchronize_srcu() and synchronize_srcu_expedited().
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 737b74b15bb5..26215f98c98f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -6036,7 +6036,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
 	memcpy(new_bus->range + i + 1, bus->range + i,
 		(bus->dev_count - i) * sizeof(struct kvm_io_range));
 	rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
-	call_srcu(&kvm->srcu, &bus->rcu, __free_bus);
+	call_srcu_expedited(&kvm->srcu, &bus->rcu, __free_bus);
 
 	return 0;
 }
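
For reference, the latency concern that motivates the patch can be modelled
with a toy userspace state machine.  This is only a sketch of my reading of the
behavior, not the actual srcutree.c logic: struct toy_srcu, toy_request_gp()
and TOY_NORMAL_DELAY are hypothetical names, and the one-jiffy delay is the
assumption stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed delay, in jiffies, before a normal grace period's work is queued. */
#define TOY_NORMAL_DELAY 1UL

/* Hypothetical, heavily simplified stand-in for SRCU grace-period state. */
struct toy_srcu {
	bool gp_in_flight;	/* a grace period has already been requested */
	bool timer_armed;	/* the delayed work timer is already running */
	bool expedited;		/* the grace period is marked expedited */
};

/*
 * Request a grace period; do_norm mirrors the last argument of __call_srcu().
 * Returns the modelled number of jiffies before the work gets queued.
 */
static unsigned long toy_request_gp(struct toy_srcu *s, bool do_norm)
{
	assert(s != NULL);

	if (!s->gp_in_flight) {
		/* The first requester decides how the grace period starts. */
		s->gp_in_flight = true;
		s->expedited = !do_norm;
		s->timer_armed = do_norm;
		return do_norm ? TOY_NORMAL_DELAY : 0;
	}

	/* Best-effort upgrade of an in-flight normal grace period. */
	if (!do_norm)
		s->expedited = true;

	/* An already-armed timer still has to fire before work is queued. */
	return s->timer_armed ? TOY_NORMAL_DELAY : 0;
}
```

In this model, a normal request followed by an expedited one (call_srcu() and
then synchronize_srcu_expedited()) still pays TOY_NORMAL_DELAY, whereas starting
out expedited, as call_srcu_expedited() would, pays nothing.
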