Date: Wed, 18 Feb 2026 16:02:05 +0000
From: Keir Fraser
To: Nikita Kalyazin
Cc: Sean Christopherson, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Eric Auger,
    Oliver Upton, Marc Zyngier, Will Deacon, Paolo Bonzini, Li RongQing
Subject: Re: [PATCH v4 4/4] KVM: Avoid synchronize_srcu() in kvm_io_bus_register_dev()
References: <20250909100007.3136249-1-keirf@google.com>
    <20250909100007.3136249-5-keirf@google.com>
    <162cedc3-cd6c-494c-b39e-daadfbd6d8db@amazon.com>
In-Reply-To: <162cedc3-cd6c-494c-b39e-daadfbd6d8db@amazon.com>

On Wed, Feb 18, 2026 at 12:55:11PM +0000, Nikita Kalyazin wrote:
>
>
> On 17/02/2026 19:07, Sean Christopherson wrote:
> > On Mon, Feb 16, 2026, Nikita Kalyazin wrote:
> > > On 13/02/2026 23:20, Sean Christopherson wrote:
> > > > On Fri, Feb 13, 2026, Nikita Kalyazin wrote:
> > > > > I am not aware of way to make it fast for both use cases and would be more
> > > > > than happy to hear about possible solutions.
> > > >
> > > > What if we key off of vCPUS being created? The motivation for Keir's change was
> > > > to avoid stalling during VM boot, i.e. *after* initial VM creation.
> > >
> > > It doesn't work as is on x86 because the delay we're seeing occurs after the
> > > created_cpus gets incremented
> >
> > I don't follow, the suggestion was to key off created_vcpus in
> > kvm_io_bus_register_dev(), not in kvm_swap_active_memslots(). I can totally
> > imagine the patch not working, but the ordering in kvm_vm_ioctl_create_vcpu()
> > should be largely irrelevant.
>
> Yes, you're right, it's irrelevant. I had made the change in
> kvm_io_bus_register_dev() like proposed, but have no idea how I couldn't see
> the effect. I retested it now and it's obvious that it works on x86. Sorry
> for the confusion.
> >
> > Probably a moot point though.
>
> Yes, this will not solve the problem on ARM.

Sorry for being late to this thread. I'm a bit confused now. Did Sean's
original patch (reintroducing the old logic, based on whether any vcpus
have been created) work for both/either/neither arch? I would have
expected it to work for both ARM and X86, despite the offending
synchronize_srcu() not being in the vcpu-creation ioctl on ARM, and I
think that is finally what your testing seems to show? If so then that
seems the pragmatic if somewhat ugly way forward.
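
To make the question concrete, here is a minimal userspace sketch of the
created_vcpus approach as it reads from the thread. It is a model only,
not kernel code and not the actual patch: struct model_vm, enum free_mode
and both functions are hypothetical names, and the direction of the check
(block only while no vCPU exists yet, defer the free afterwards) is an
assumption based on "avoid stalling during VM boot, i.e. *after* initial
VM creation".

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these names are kernel APIs. */
enum free_mode {
	FREE_AFTER_SYNC_EXPEDITED,	/* stands in for synchronize_srcu_expedited() + free */
	FREE_VIA_CALLBACK,		/* stands in for call_srcu(..., __free_bus) */
};

struct model_vm {
	int created_vcpus;	/* mirrors kvm->created_vcpus */
	int sync_waits;		/* registrations that blocked on a grace period */
	int queued_callbacks;	/* registrations that queued a deferred free */
};

/* Assumed policy: old synchronous logic until the first vCPU exists. */
static enum free_mode model_pick_free_mode(const struct model_vm *vm)
{
	assert(vm != NULL);
	return vm->created_vcpus == 0 ? FREE_AFTER_SYNC_EXPEDITED
				      : FREE_VIA_CALLBACK;
}

/* Models the tail of a bus registration: record how the old bus is freed. */
static enum free_mode model_register_dev(struct model_vm *vm)
{
	enum free_mode mode = model_pick_free_mode(vm);

	if (mode == FREE_AFTER_SYNC_EXPEDITED)
		vm->sync_waits++;
	else
		vm->queued_callbacks++;
	return mode;
}
```

In this model a registration made before any vCPU exists blocks, so it
leaves no normal grace period in flight for a later memslot update to
wait behind, while registrations made after vCPU creation never block.
Whether that matches the ARM ordering is exactly the open question above.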

Cheers,
 Keir

> >
> > > so it doesn't allow to differentiate the two
> > > cases (below is kvm_vm_ioctl_create_vcpu):
> > >
> > > 	kvm->created_vcpus++; // <===== incremented here
> > > 	mutex_unlock(&kvm->lock);
> > >
> > > 	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL_ACCOUNT);
> > > 	if (!vcpu) {
> > > 		r = -ENOMEM;
> > > 		goto vcpu_decrement;
> > > 	}
> > >
> > > 	BUILD_BUG_ON(sizeof(struct kvm_run) > PAGE_SIZE);
> > > 	page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > > 	if (!page) {
> > > 		r = -ENOMEM;
> > > 		goto vcpu_free;
> > > 	}
> > > 	vcpu->run = page_address(page);
> > >
> > > 	kvm_vcpu_init(vcpu, kvm, id);
> > >
> > > 	r = kvm_arch_vcpu_create(vcpu); // <===== the delay is here
> > >
> > >
> > > firecracker 583 [001] 151.297145: probe:synchronize_srcu_expedited:
> > > (ffffffff813e5cf0)
> > > 	ffffffff813e5cf1 synchronize_srcu_expedited+0x1 ([kernel.kallsyms])
> > > 	ffffffff81234986 kvm_swap_active_memslots+0x136 ([kernel.kallsyms])
> > > 	ffffffff81236cdd kvm_set_memslot+0x1cd ([kernel.kallsyms])
> > > 	ffffffff81237518 kvm_set_memory_region.part.0+0x478 ([kernel.kallsyms])
> > > 	ffffffff81264dbc __x86_set_memory_region+0xec ([kernel.kallsyms])
> > > 	ffffffff8127e2dc kvm_alloc_apic_access_page+0x5c ([kernel.kallsyms])
> > > 	ffffffff812b9ed3 vmx_vcpu_create+0x193 ([kernel.kallsyms])
> > > 	ffffffff8126788a kvm_arch_vcpu_create+0x1da ([kernel.kallsyms])
> > > 	ffffffff8123c54c kvm_vm_ioctl+0x5fc ([kernel.kallsyms])
> > > 	ffffffff8167b331 __x64_sys_ioctl+0x91 ([kernel.kallsyms])
> > > 	ffffffff8251a89c do_syscall_64+0x4c ([kernel.kallsyms])
> > > 	ffffffff8100012b entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
> > > 	6512de ioctl+0x32 (/mnt/host/firecracker)
> > > 	d99a7 std::rt::lang_start+0x37 (/mnt/host/firecracker)
> > >
> > > Also, given that it stumbles after the KVM_CREATE_VCPU on ARM (in
> > > KVM_SET_USER_MEMORY_REGION), it doesn't look like a universal solution.
> >
> > Hmm.
> > Under the hood, __synchronize_srcu() itself uses __call_srcu, so I _think_
> > the only practical difference (aside from waiting, obviously) between call_srcu()
> > and synchronize_srcu_expedited() with respect to "transferring" grace period
> > latency is that using call_srcu() could start a normal, non-expedited grace period.
> >
> > IIUC, SRCU has best-effort logic to shift in-flight non-expedited grace periods
> > to expedited mode, but if the normal grace period has already started the timer
> > for the delayed invocation of process_srcu(), then SRCU will still wait for one
> > jiffie, i.e. won't immediately queue the work.
> >
> > I have no idea if this is sane and/or acceptable, but before looping in Paul and
> > others, can you try this to see if it helps?
>
> That's exactly what I tried myself before and it didn't help, probably for
> the reason you mentioned above (a normal GP being already started).
> >
> > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > index 344ad51c8f6c..30437dc8d818 100644
> > --- a/include/linux/srcu.h
> > +++ b/include/linux/srcu.h
> > @@ -89,6 +89,8 @@ void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
> >
> >  void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
> >  		void (*func)(struct rcu_head *head));
> > +void call_srcu_expedited(struct srcu_struct *ssp, struct rcu_head *rhp,
> > +			 rcu_callback_t func);
> >  void cleanup_srcu_struct(struct srcu_struct *ssp);
> >  void synchronize_srcu(struct srcu_struct *ssp);
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index ea3f128de06f..03333b079092 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1493,6 +1493,13 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
> >  }
> >  EXPORT_SYMBOL_GPL(call_srcu);
> >
> > +void call_srcu_expedited(struct srcu_struct *ssp, struct rcu_head *rhp,
> > +			 rcu_callback_t func)
> > +{
> > +	__call_srcu(ssp, rhp, func, rcu_gp_is_normal());
> > +}
> > +EXPORT_SYMBOL_GPL(call_srcu_expedited);
> > +
> >  /*
> >   * Helper function for synchronize_srcu() and synchronize_srcu_expedited().
> >   */
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 737b74b15bb5..26215f98c98f 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -6036,7 +6036,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
> >  	memcpy(new_bus->range + i + 1, bus->range + i,
> >  		(bus->dev_count - i) * sizeof(struct kvm_io_range));
> >  	rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
> > -	call_srcu(&kvm->srcu, &bus->rcu, __free_bus);
> > +	call_srcu_expedited(&kvm->srcu, &bus->rcu, __free_bus);
> >
> >  	return 0;
> >  }
>