public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value
@ 2024-02-22 17:05 Petr Vorel
  2024-02-22 17:05 ` [PATCH 4.19 v2 2/3] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset Petr Vorel
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Petr Vorel @ 2024-02-22 17:05 UTC (permalink / raw)
  To: stable
  Cc: Cyril Hrubis, Greg Kroah-Hartman, Sasha Levin, Ingo Molnar,
	Peter Zijlstra, Petr Vorel, Mel Gorman

From: Cyril Hrubis <chrubis@suse.cz>

[ Upstream commit c7fcb99877f9f542c918509b2801065adcaf46fa ]

There is a 10% rounding error in the intial value of the
sysctl_sched_rr_timeslice with CONFIG_HZ_300=y.

This was found with LTP test sched_rr_get_interval01:

sched_rr_get_interval01.c:57: TPASS: sched_rr_get_interval() passed
sched_rr_get_interval01.c:64: TPASS: Time quantum 0s 99999990ns
sched_rr_get_interval01.c:72: TFAIL: /proc/sys/kernel/sched_rr_timeslice_ms != 100 got 90
sched_rr_get_interval01.c:57: TPASS: sched_rr_get_interval() passed
sched_rr_get_interval01.c:64: TPASS: Time quantum 0s 99999990ns
sched_rr_get_interval01.c:72: TFAIL: /proc/sys/kernel/sched_rr_timeslice_ms != 100 got 90

What this test does is to compare the return value from the
sched_rr_get_interval() and the sched_rr_timeslice_ms sysctl file and
fails if they do not match.

The problem it found is the intial sysctl file value which was computed as:

static int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;

which works fine as long as MSEC_PER_SEC is multiple of HZ, however it
introduces 10% rounding error for CONFIG_HZ_300:

(MSEC_PER_SEC / HZ) * (100 * HZ / 1000)

(1000 / 300) * (100 * 300 / 1000)

3 * 30 = 90

This can be easily fixed by reversing the order of the multiplication
and division. After this fix we get:

(MSEC_PER_SEC * (100 * HZ / 1000)) / HZ

(1000 * (100 * 300 / 1000)) / 300

(1000 * 30) / 300 = 100

Fixes: 975e155ed873 ("sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds")
Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Petr Vorel <pvorel@suse.cz>
Link: https://lore.kernel.org/r/20230802151906.25258-2-chrubis@suse.cz
[ pvorel: rebased for 4.19 ]
Signed-off-by: Petr Vorel <pvorel@suse.cz>
---
 kernel/sched/rt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 394c66442cff..ce4594215728 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -8,7 +8,7 @@
 #include "pelt.h"
 
 int sched_rr_timeslice = RR_TIMESLICE;
-int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
+int sysctl_sched_rr_timeslice = (MSEC_PER_SEC * RR_TIMESLICE) / HZ;
 
 static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 4.19 v2 2/3] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
  2024-02-22 17:05 [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Petr Vorel
@ 2024-02-22 17:05 ` Petr Vorel
  2024-02-22 17:05 ` [PATCH 4.19 v2 3/3] sched/rt: Disallow writing invalid values to sched_rt_period_us Petr Vorel
  2024-02-23 16:25 ` [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Greg Kroah-Hartman
  2 siblings, 0 replies; 4+ messages in thread
From: Petr Vorel @ 2024-02-22 17:05 UTC (permalink / raw)
  To: stable
  Cc: Cyril Hrubis, Greg Kroah-Hartman, Sasha Levin, Ingo Molnar,
	Peter Zijlstra, Petr Vorel, Mel Gorman

From: Cyril Hrubis <chrubis@suse.cz>

[ Upstream commit c1fc6484e1fb7cc2481d169bfef129a1b0676abe ]

The sched_rr_timeslice can be reset to default by writing value that is
<= 0. However after reading from this file we always got the last value
written, which is not useful at all.

$ echo -1 > /proc/sys/kernel/sched_rr_timeslice_ms
$ cat /proc/sys/kernel/sched_rr_timeslice_ms
-1

Fix this by setting the variable that holds the sysctl file value to the
jiffies_to_msecs(RR_TIMESLICE) in case that <= 0 value was written.

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Petr Vorel <pvorel@suse.cz>
Link: https://lore.kernel.org/r/20230802151906.25258-3-chrubis@suse.cz
Signed-off-by: Petr Vorel <pvorel@suse.cz>
---
 kernel/sched/rt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index ce4594215728..2ea4da8c5f3a 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2735,6 +2735,9 @@ int sched_rr_handler(struct ctl_table *table, int write,
 		sched_rr_timeslice =
 			sysctl_sched_rr_timeslice <= 0 ? RR_TIMESLICE :
 			msecs_to_jiffies(sysctl_sched_rr_timeslice);
+
+		if (sysctl_sched_rr_timeslice <= 0)
+			sysctl_sched_rr_timeslice = jiffies_to_msecs(RR_TIMESLICE);
 	}
 	mutex_unlock(&mutex);
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 4.19 v2 3/3] sched/rt: Disallow writing invalid values to sched_rt_period_us
  2024-02-22 17:05 [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Petr Vorel
  2024-02-22 17:05 ` [PATCH 4.19 v2 2/3] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset Petr Vorel
@ 2024-02-22 17:05 ` Petr Vorel
  2024-02-23 16:25 ` [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Greg Kroah-Hartman
  2 siblings, 0 replies; 4+ messages in thread
From: Petr Vorel @ 2024-02-22 17:05 UTC (permalink / raw)
  To: stable
  Cc: Cyril Hrubis, Greg Kroah-Hartman, Sasha Levin, Ingo Molnar,
	Peter Zijlstra, Petr Vorel

From: Cyril Hrubis <chrubis@suse.cz>

[ Upstream commit 079be8fc630943d9fc70a97807feb73d169ee3fc ]

The validation of the value written to sched_rt_period_us was broken
because:

  - the sysclt_sched_rt_period is declared as unsigned int
  - parsed by proc_do_intvec()
  - the range is asserted after the value parsed by proc_do_intvec()

Because of this negative values written to the file were written into a
unsigned integer that were later on interpreted as large positive
integers which did passed the check:

  if (sysclt_sched_rt_period <= 0)
	return EINVAL;

This commit fixes the parsing by setting explicit range for both
perid_us and runtime_us into the sched_rt_sysctls table and processes
the values with proc_dointvec_minmax() instead.

Alternatively if we wanted to use full range of unsigned int for the
period value we would have to split the proc_handler and use
proc_douintvec() for it however even the
Documentation/scheduller/sched-rt-group.rst describes the range as 1 to
INT_MAX.

As far as I can tell the only problem this causes is that the sysctl
file allows writing negative values which when read back may confuse
userspace.

There is also a LTP test being submitted for these sysctl files at:

  http://patchwork.ozlabs.org/project/ltp/patch/20230901144433.2526-1-chrubis@suse.cz/

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231002115553.3007-2-chrubis@suse.cz
[ pvorel: rebased for 4.19 ]
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Petr Vorel <pvorel@suse.cz>
---
 kernel/sched/rt.c | 5 +----
 kernel/sysctl.c   | 5 +++++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2ea4da8c5f3a..deb9366e4f30 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2658,9 +2658,6 @@ static int sched_rt_global_constraints(void)
 
 static int sched_rt_global_validate(void)
 {
-	if (sysctl_sched_rt_period <= 0)
-		return -EINVAL;
-
 	if ((sysctl_sched_rt_runtime != RUNTIME_INF) &&
 		(sysctl_sched_rt_runtime > sysctl_sched_rt_period))
 		return -EINVAL;
@@ -2690,7 +2687,7 @@ int sched_rt_handler(struct ctl_table *table, int write,
 	old_period = sysctl_sched_rt_period;
 	old_runtime = sysctl_sched_rt_runtime;
 
-	ret = proc_dointvec(table, write, buffer, lenp, ppos);
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 
 	if (!ret && write) {
 		ret = sched_rt_global_validate();
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4bb194f096ec..6ce9f10b9c7d 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -127,6 +127,7 @@ static int zero;
 static int __maybe_unused one = 1;
 static int __maybe_unused two = 2;
 static int __maybe_unused four = 4;
+static int int_max = INT_MAX;
 static unsigned long zero_ul;
 static unsigned long one_ul = 1;
 static unsigned long long_max = LONG_MAX;
@@ -464,6 +465,8 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(unsigned int),
 		.mode		= 0644,
 		.proc_handler	= sched_rt_handler,
+		.extra1		= &one,
+		.extra2		= &int_max,
 	},
 	{
 		.procname	= "sched_rt_runtime_us",
@@ -471,6 +474,8 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= sched_rt_handler,
+		.extra1		= &neg_one,
+		.extra2		= &int_max,
 	},
 	{
 		.procname	= "sched_rr_timeslice_ms",
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value
  2024-02-22 17:05 [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Petr Vorel
  2024-02-22 17:05 ` [PATCH 4.19 v2 2/3] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset Petr Vorel
  2024-02-22 17:05 ` [PATCH 4.19 v2 3/3] sched/rt: Disallow writing invalid values to sched_rt_period_us Petr Vorel
@ 2024-02-23 16:25 ` Greg Kroah-Hartman
  2 siblings, 0 replies; 4+ messages in thread
From: Greg Kroah-Hartman @ 2024-02-23 16:25 UTC (permalink / raw)
  To: Petr Vorel
  Cc: stable, Cyril Hrubis, Sasha Levin, Ingo Molnar, Peter Zijlstra,
	Mel Gorman

On Thu, Feb 22, 2024 at 06:05:38PM +0100, Petr Vorel wrote:
> From: Cyril Hrubis <chrubis@suse.cz>
> 
> [ Upstream commit c7fcb99877f9f542c918509b2801065adcaf46fa ]
> 
> There is a 10% rounding error in the intial value of the
> sysctl_sched_rr_timeslice with CONFIG_HZ_300=y.
> 
> This was found with LTP test sched_rr_get_interval01:
> 
> sched_rr_get_interval01.c:57: TPASS: sched_rr_get_interval() passed
> sched_rr_get_interval01.c:64: TPASS: Time quantum 0s 99999990ns
> sched_rr_get_interval01.c:72: TFAIL: /proc/sys/kernel/sched_rr_timeslice_ms != 100 got 90
> sched_rr_get_interval01.c:57: TPASS: sched_rr_get_interval() passed
> sched_rr_get_interval01.c:64: TPASS: Time quantum 0s 99999990ns
> sched_rr_get_interval01.c:72: TFAIL: /proc/sys/kernel/sched_rr_timeslice_ms != 100 got 90
> 
> What this test does is to compare the return value from the
> sched_rr_get_interval() and the sched_rr_timeslice_ms sysctl file and
> fails if they do not match.
> 
> The problem it found is the intial sysctl file value which was computed as:
> 
> static int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
> 
> which works fine as long as MSEC_PER_SEC is multiple of HZ, however it
> introduces 10% rounding error for CONFIG_HZ_300:
> 
> (MSEC_PER_SEC / HZ) * (100 * HZ / 1000)
> 
> (1000 / 300) * (100 * 300 / 1000)
> 
> 3 * 30 = 90
> 
> This can be easily fixed by reversing the order of the multiplication
> and division. After this fix we get:
> 
> (MSEC_PER_SEC * (100 * HZ / 1000)) / HZ
> 
> (1000 * (100 * 300 / 1000)) / 300
> 
> (1000 * 30) / 300 = 100
> 
> Fixes: 975e155ed873 ("sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds")
> Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Petr Vorel <pvorel@suse.cz>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Tested-by: Petr Vorel <pvorel@suse.cz>
> Link: https://lore.kernel.org/r/20230802151906.25258-2-chrubis@suse.cz
> [ pvorel: rebased for 4.19 ]
> Signed-off-by: Petr Vorel <pvorel@suse.cz>
> ---
>  kernel/sched/rt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 394c66442cff..ce4594215728 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -8,7 +8,7 @@
>  #include "pelt.h"
>  
>  int sched_rr_timeslice = RR_TIMESLICE;
> -int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
> +int sysctl_sched_rr_timeslice = (MSEC_PER_SEC * RR_TIMESLICE) / HZ;
>  
>  static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);
>  
> -- 
> 2.35.3
> 
> 

All now queued up, thanks!

greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-02-23 16:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-22 17:05 [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Petr Vorel
2024-02-22 17:05 ` [PATCH 4.19 v2 2/3] sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset Petr Vorel
2024-02-22 17:05 ` [PATCH 4.19 v2 3/3] sched/rt: Disallow writing invalid values to sched_rt_period_us Petr Vorel
2024-02-23 16:25 ` [PATCH 4.19 v2 1/3] sched/rt: Fix sysctl_sched_rr_timeslice intial value Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox