From: Uladzislau Rezki <urezki@gmail.com>
To: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org, Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
Frederic Weisbecker <frederic@kernel.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
linux-doc@vger.kernel.org,
"Paul E. McKenney" <paulmck@kernel.org>,
rcu@vger.kernel.org, urezki@gmail.com
Subject: Re: [PATCH v3] rcu: Add a minimum time for marking boot as completed
Date: Sun, 5 Mar 2023 12:39:01 +0100 [thread overview]
Message-ID: <ZAR/VdMNBwdrWA/5@pc636> (raw)
In-Reply-To: <20230303213851.2090365-1-joel@joelfernandes.org>
On Fri, Mar 03, 2023 at 09:38:51PM +0000, Joel Fernandes (Google) wrote:
> On many systems, a great deal of boot (in userspace) happens after the
> kernel thinks the boot has completed. It is difficult to determine if
> the system has really booted from the kernel side. Some features like
> lazy-RCU can risk slowing down boot time if, say, a callback has been
> added that the boot synchronously depends on. Further expedited callbacks
> can get unexpedited way earlier than it should be, thus slowing down
> boot (as shown in the data below).
>
> For these reasons, this commit adds a config option
> 'CONFIG_RCU_BOOT_END_DELAY' and a boot parameter rcupdate.boot_end_delay.
> Userspace can also make RCU's view of the system as booted, by writing the
> time in milliseconds to: /sys/module/rcupdate/parameters/rcu_boot_end_delay
> Or even just writing a value of 0 to this sysfs node.
> However, under no circumstance will the boot be allowed to end earlier
> than just before init is launched.
>
> The default value of CONFIG_RCU_BOOT_END_DELAY is chosen as 15s. This
> suites ChromeOS and also a PREEMPT_RT system below very well, which need
> no config or parameter changes, and just a simple application of this patch. A
> system designer can also choose a specific value here to keep RCU from marking
> boot completion. As noted earlier, RCU's perspective of the system as booted
> will not be marker until at least rcu_boot_end_delay milliseconds have passed
> or an update is made via writing a small value (or 0) in milliseconds to:
> /sys/module/rcupdate/parameters/rcu_boot_end_delay.
>
> One side-effect of this patch is, there is a risk that a real-time workload
> launched just after the kernel boots will suffer interruptions due to expedited
> RCU, which previous ended just before init was launched. However, to mitigate
> such an issue (however unlikely), the user should either tune
> CONFIG_RCU_BOOT_END_DELAY to a smaller value than 15 seconds or write a value
> of 0 to /sys/module/rcupdate/parameters/rcu_boot_end_delay, once userspace
> boots, and before launching the real-time workload.
>
> Qiuxu also noted impressive boot-time improvements with earlier version
> of patch. An excerpt from the data he shared:
>
> 1) Testing environment:
> OS : CentOS Stream 8 (non-RT OS)
> Kernel : v6.2
> Machine : Intel Cascade Lake server (2 sockets, each with 44 logical threads)
> Qemu args : -cpu host -enable-kvm, -smp 88,threads=2,sockets=2, …
>
> 2) OS boot time definition:
> The time from the start of the kernel boot to the shell command line
> prompt is shown from the console. [ Different people may have
> different OS boot time definitions. ]
>
> 3) Measurement method (very rough method):
> A timer in the kernel periodically prints the boot time every 100ms.
> As soon as the shell command line prompt is shown from the console,
> we record the boot time printed by the timer, then the printed boot
> time is the OS boot time.
>
> 4) Measured OS boot time (in seconds)
> a) Measured 10 times w/o this patch:
> 8.7s, 8.4s, 8.6s, 8.2s, 9.0s, 8.7s, 8.8s, 9.3s, 8.8s, 8.3s
> The average OS boot time was: ~8.7s
>
> b) Measure 10 times w/ this patch:
> 8.5s, 8.2s, 7.6s, 8.2s, 8.7s, 8.2s, 7.8s, 8.2s, 9.3s, 8.4s
> The average OS boot time was: ~8.3s.
>
> Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
> v1->v2:
> Update some comments and description.
> v2->v3:
> Add sysfs param, and update with Test data.
>
> .../admin-guide/kernel-parameters.txt | 12 ++++
> cc_list | 8 +++
> kernel/rcu/Kconfig | 19 ++++++
> kernel/rcu/update.c | 68 ++++++++++++++++++-
> 4 files changed, 106 insertions(+), 1 deletion(-)
> create mode 100644 cc_list
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 2429b5e3184b..611de90d9c13 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5085,6 +5085,18 @@
> rcutorture.verbose= [KNL]
> Enable additional printk() statements.
>
> + rcupdate.rcu_boot_end_delay= [KNL]
> + Minimum time in milliseconds that must elapse
> + before the boot sequence can be marked complete
> + from RCU's perspective, after which RCU's behavior
> + becomes more relaxed. The default value is also
> + configurable via CONFIG_RCU_BOOT_END_DELAY.
> + Userspace can also mark the boot as completed
> + sooner by writing the time in milliseconds, say once
> + userspace considers the system as booted, to:
> + /sys/module/rcupdate/parameters/rcu_boot_end_delay
> + Or even just writing a value of 0 to this sysfs node.
> +
> rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
> Dump ftrace buffer after reporting RCU CPU
> stall warning.
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 9071182b1284..4b5ffa36cbaf 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -217,6 +217,25 @@ config RCU_BOOST_DELAY
>
> Accept the default if unsure.
>
> +config RCU_BOOT_END_DELAY
> + int "Minimum time before RCU may consider in-kernel boot as completed"
> + range 0 120000
> + default 15000
> + help
> + Default value of the minimum time in milliseconds that must elapse
> + before the boot sequence can be marked complete from RCU's perspective,
> + after which RCU's behavior becomes more relaxed.
> + Userspace can also mark the boot as completed sooner than this default
> + by writing the time in milliseconds, say once userspace considers
> + the system as booted, to: /sys/module/rcupdate/parameters/rcu_boot_end_delay.
> + Or even just writing a value of 0 to this sysfs node.
> +
> + The actual delay for RCU's view of the system to be marked as booted can be
> + higher than this value if the kernel takes a long time to initialize but it
> + will never be smaller than this value.
> +
> + Accept the default if unsure.
> +
> config RCU_EXP_KTHREAD
> bool "Perform RCU expedited work in a real-time kthread"
> depends on RCU_BOOST && RCU_EXPERT
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 19bf6fa3ee6a..93138c92136e 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -224,18 +224,84 @@ void rcu_unexpedite_gp(void)
> }
> EXPORT_SYMBOL_GPL(rcu_unexpedite_gp);
>
> +/*
> + * Minimum time in milliseconds until RCU can consider in-kernel boot as
> + * completed. This can also be tuned at runtime to end the boot earlier, by
> + * userspace init code writing the time in milliseconds (even 0) to:
> + * /sys/module/rcupdate/parameters/rcu_boot_end_delay
> + */
> +static int rcu_boot_end_delay = CONFIG_RCU_BOOT_END_DELAY;
> +
> static bool rcu_boot_ended __read_mostly;
> +static bool rcu_boot_end_called __read_mostly;
> +static DEFINE_MUTEX(rcu_boot_end_lock);
> +
> +static int param_set_rcu_boot_end(const char *val, const struct kernel_param *kp)
> +{
> + uint end_ms;
> + int ret = kstrtouint(val, 0, &end_ms);
> +
> + if (ret)
> + return ret;
> + WRITE_ONCE(*(uint *)kp->arg, end_ms);
> +
> + /*
> + * rcu_end_inkernel_boot() should be called at least once during init
> + * before we can allow param changes to end the boot.
> + */
> + mutex_lock(&rcu_boot_end_lock);
> + rcu_boot_end_delay = end_ms;
> + if (!rcu_boot_ended && rcu_boot_end_called) {
> + mutex_unlock(&rcu_boot_end_lock);
> + rcu_end_inkernel_boot();
> + }
> + mutex_unlock(&rcu_boot_end_lock);
> + return ret;
> +}
> +
> +static const struct kernel_param_ops rcu_boot_end_ops = {
> + .set = param_set_rcu_boot_end,
> + .get = param_get_uint,
> +};
> +module_param_cb(rcu_boot_end_delay, &rcu_boot_end_ops, &rcu_boot_end_delay, 0644);
>
> /*
> - * Inform RCU of the end of the in-kernel boot sequence.
> + * Inform RCU of the end of the in-kernel boot sequence. The boot sequence will
> + * not be marked ended until at least rcu_boot_end_delay milliseconds have passed.
> */
> +void rcu_end_inkernel_boot(void);
> +static void rcu_boot_end_work_fn(struct work_struct *work)
> +{
> + rcu_end_inkernel_boot();
> +}
> +static DECLARE_DELAYED_WORK(rcu_boot_end_work, rcu_boot_end_work_fn);
> +
> void rcu_end_inkernel_boot(void)
> {
> + mutex_lock(&rcu_boot_end_lock);
> + rcu_boot_end_called = true;
> +
> + if (rcu_boot_ended)
> + return;
> +
> + if (rcu_boot_end_delay) {
> + u64 boot_ms = div_u64(ktime_get_boot_fast_ns(), 1000000UL);
> +
> + if (boot_ms < rcu_boot_end_delay) {
> + schedule_delayed_work(&rcu_boot_end_work,
> + rcu_boot_end_delay - boot_ms);
<snip>
urezki@pc638:~/data/raid0/coding/linux-rcu.git$ git diff
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 93138c92136e..93f426f0f4ec 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -289,7 +289,7 @@ void rcu_end_inkernel_boot(void)
if (boot_ms < rcu_boot_end_delay) {
schedule_delayed_work(&rcu_boot_end_work,
- rcu_boot_end_delay - boot_ms);
+ msecs_to_jiffies(rcu_boot_end_delay - boot_ms));
mutex_unlock(&rcu_boot_end_lock);
return;
}
urezki@pc638:~/data/raid0/coding/linux-rcu.git$
<snip>
I think you need to apply above patch. I am not sure maybe Paul
has already mentioned about it. But just in case.
--
Uladzislau Rezki
next prev parent reply other threads:[~2023-03-05 11:39 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-03 21:38 [PATCH v3] rcu: Add a minimum time for marking boot as completed Joel Fernandes (Google)
2023-03-04 1:02 ` Paul E. McKenney
2023-03-04 4:51 ` Joel Fernandes
2023-03-11 20:44 ` Paul E. McKenney
2023-03-11 22:23 ` Joel Fernandes
2023-03-11 22:57 ` Paul E. McKenney
2023-03-06 8:24 ` Zhuo, Qiuxu
2023-03-06 14:49 ` Paul E. McKenney
2023-03-07 7:49 ` Zhuo, Qiuxu
2023-03-07 15:22 ` Paul E. McKenney
2023-03-09 15:17 ` Zhuo, Qiuxu
2023-03-09 21:53 ` Paul E. McKenney
2023-03-10 0:11 ` Akira Yokosawa
2023-03-10 1:47 ` Paul E. McKenney
2023-03-10 2:35 ` Zhuo, Qiuxu
2023-03-05 11:39 ` Uladzislau Rezki [this message]
2023-03-05 15:03 ` Joel Fernandes
2023-03-06 8:37 ` Zhuo, Qiuxu
2023-03-05 20:34 ` Paul E. McKenney
2023-03-07 13:01 ` Frederic Weisbecker
2023-03-07 13:40 ` Uladzislau Rezki
2023-03-07 13:48 ` Joel Fernandes
2023-03-07 17:33 ` Paul E. McKenney
2023-03-07 18:54 ` Joel Fernandes
2023-03-07 19:27 ` Paul E. McKenney
2023-03-08 9:41 ` Uladzislau Rezki
2023-03-08 14:45 ` Paul E. McKenney
2023-03-09 12:57 ` Uladzislau Rezki
2023-03-09 22:10 ` Joel Fernandes
2023-03-10 8:55 ` Uladzislau Rezki
2023-03-11 6:24 ` Paul E. McKenney
2023-03-11 17:19 ` Joel Fernandes
2023-03-13 9:51 ` Uladzislau Rezki
2023-03-13 12:27 ` Uladzislau Rezki
2023-03-13 13:48 ` Zhuo, Qiuxu
2023-03-13 15:28 ` Uladzislau Rezki
2023-03-13 13:58 ` Joel Fernandes
2023-03-13 15:32 ` Uladzislau Rezki
2023-03-13 15:49 ` Joel Fernandes
2023-03-13 18:12 ` Uladzislau Rezki
2023-03-13 18:56 ` Paul E. McKenney
2023-03-14 11:16 ` Uladzislau Rezki
2023-03-14 13:49 ` Paul E. McKenney
2023-03-14 22:44 ` Joel Fernandes
2023-03-15 17:12 ` Uladzislau Rezki
2023-03-15 12:21 ` Joel Fernandes
2023-03-15 16:12 ` Paul E. McKenney
2023-03-08 10:14 ` Uladzislau Rezki
2023-03-15 12:18 ` Joel Fernandes
2023-03-07 13:41 ` Joel Fernandes
2023-03-07 17:19 ` Frederic Weisbecker
2023-03-07 18:19 ` Joel Fernandes
2023-03-08 13:52 ` Joel Fernandes
2023-03-08 15:01 ` Frederic Weisbecker
2023-03-08 15:09 ` Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZAR/VdMNBwdrWA/5@pc636 \
--to=urezki@gmail.com \
--cc=frederic@kernel.org \
--cc=jiangshanlai@gmail.com \
--cc=joel@joelfernandes.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=paulmck@kernel.org \
--cc=qiuxu.zhuo@intel.com \
--cc=rcu@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.