Date: Tue, 28 Jun 2022 10:31:54 +0100
Message-ID: <874k052kxh.wl-maz@kernel.org>
From: Marc Zyngier
To: Neeraj Upadhyay
Subject: Re: [PATCH] srcu: Reduce blocking agressiveness of expedited grace periods further
References: <20220627123706.20187-1-quic_neeraju@quicinc.com> <875ykl2mb2.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 28 Jun 2022 10:17:24 +0100,
Neeraj Upadhyay wrote:
>
> On 6/28/2022 2:32 PM, Marc Zyngier wrote:
> > On Mon, 27 Jun 2022 13:37:06 +0100,
> > Neeraj Upadhyay wrote:
> >>
> >> Commit 640a7d37c3f4 ("srcu: Block less aggressively for expedited
> >> grace periods") highlights a problem where aggressively blocking
> >> SRCU expedited grace periods, as was introduced in commit
> >> 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers
> >> from consuming CPU"), introduces ~2 minutes delay to the overall
> >> ~3.5 minutes boot time, when
> >> starting VMs with "-bios QEMU_EFI.fd"
> >> cmdline on qemu, which results in very high rate of memslots
> >> add/remove, which causes > ~6000 synchronize_srcu() calls for
> >> kvm->srcu SRCU instance.
> >>
> >> Below table captures the experiments done by Zhangfei Gao, Shameer,
> >> to measure the boottime impact with various values of non-sleeping
> >> per phase counts, with HZ_250 and preemption enabled:
> >>
> >> +──────────────────────────+────────────────+
> >> | SRCU_MAX_NODELAY_PHASE   | Boot time (s)  |
> >> +──────────────────────────+────────────────+
> >> | 100                      | 30.053         |
> >> | 150                      | 25.151         |
> >> | 200                      | 20.704         |
> >> | 250                      | 15.748         |
> >> | 500                      | 11.401         |
> >> | 1000                     | 11.443         |
> >> | 10000                    | 11.258         |
> >> | 1000000                  | 11.154         |
> >> +──────────────────────────+────────────────+
> >>
> >> Analysis on the experiment results showed improved boot time
> >> with non blocking delays close to one jiffy duration.
> >> This was also seen when number of per-phase iterations were scaled
> >> to one jiffy.
> >>
> >> So, this change scales per-grace-period phase number of non-sleeping
> >> polls, such that, non-sleeping polls are done for one jiffy. In addition
> >> to this, srcu_get_delay() call in srcu_gp_end(), which is used to calculate
> >> the delay used for scheduling callbacks, is replaced with the check for
> >> expedited grace period. This is done, to schedule cbs for completed expedited
> >> grace periods immediately, which results in improved boot time seen in
> >> experiments.
> >>
> >> In addition to the changes to default per phase delays, this change
> >> adds 3 new kernel parameters - srcutree.srcu_max_nodelay,
> >> srcutree.srcu_max_nodelay_phase, srcutree.srcu_retry_check_delay.
> >> This allows users to configure the srcu grace period scanning delays,
> >> depending on their system configuration requirements.
> >>
> >> Signed-off-by: Neeraj Upadhyay
> >
> > I've given this a go on one of my test platforms (the one I noticed
> > the issue in the first place), and found that the initial part of the
> > EFI boot under KVM (pointlessly wiping the emulated flash) went down
> > to 1m7s from 3m50s (HZ=250).
> >
> > Clearly a massive improvement, but still a far cry from the original
> > ~40s (yes, this box is utter crap -- which is why I use it).
>
> Do you see any improvement by using "srcutree.srcu_max_nodelay=1000"
> bootarg, on top of this patch?

Yup, this brings it back to 43s on a quick test run, which is close
enough to what I had before.

How does a random user come up with such a value though?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
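
For reference, the "scaled to one jiffy" idea from the quoted patch description can be illustrated with a bit of arithmetic. This is only a sketch and not the code from the patch: the function names are made up for illustration, and the 5 us delay between polls is an assumed value that does not come from this thread.

```python
# Sketch only: estimates how many back-to-back (non-sleeping) polls fit into
# one jiffy. The default 5 us delay between polls is an ASSUMED value chosen
# for illustration; it is not taken from the thread or from the patch.

USEC_PER_SEC = 1_000_000


def jiffy_usecs(hz: int) -> int:
    """Duration of one jiffy in microseconds for a given HZ (rounded down)."""
    if hz <= 0:
        raise ValueError("hz must be positive")
    return USEC_PER_SEC // hz


def polls_per_jiffy(hz: int, retry_delay_us: int = 5) -> int:
    """Number of polls, spaced retry_delay_us apart, that span one jiffy."""
    if retry_delay_us <= 0:
        raise ValueError("retry_delay_us must be positive")
    return jiffy_usecs(hz) // retry_delay_us


if __name__ == "__main__":
    for hz in (100, 250, 1000):
        print(f"HZ={hz}: jiffy={jiffy_usecs(hz)} us, "
              f"polls per jiffy={polls_per_jiffy(hz)}")
```

With HZ=250 a jiffy is 4000 us, so under the assumed 5 us spacing about 800 polls fit into it. That falls in the region of the quoted table where the boot time has already flattened out (500 and above), but the right value on a given machine still depends on the real per-poll cost.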