The Linux Kernel Mailing List
From: "Michal Koutný" <mkoutny@suse.com>
To: Joe Simmons-Talbott <joest@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>,
	 Shuah Khan <shuah@kernel.org>,
	cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 linux-kernel@vger.kernel.org,
	Sebastian Chlad <sebastian.chlad@suse.com>
Subject: Re: [PATCH v2] selftests/cgroup: Adjust cpu.max quota based on HZ
Date: Tue, 23 Jun 2026 15:52:30 +0200
Message-ID: <ajqBjmJ-VT3UDPMr@localhost.localdomain>
In-Reply-To: <20260622194305.601392-1-joest@redhat.com>

On Mon, Jun 22, 2026 at 03:43:04PM -0400, Joe Simmons-Talbott <joest@redhat.com> wrote:
> +static long
> +_get_config_hz(void)
> +{
> +	long hz = -1;
> +	FILE *f;
> +	char cmd[256] = "zcat /proc/config.gz 2>/dev/null | grep '^CONFIG_HZ='";
> +
> +	f = popen(cmd, "r");
> +
> +	if (!f)
> +		goto out;
> +
> +	fscanf(f, "CONFIG_HZ=%ld", &hz);
> +
> +out:
> +	pclose(f);
> +	return hz;
> +}

I like that you voiced this dependency on CONFIG_HZ and also that
_SC_CLK_TCK is useless in this regard.
(I see that BPF selftests have similar infra for this.)
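
A minimal sketch of such a helper, for illustration only. It assumes the
same /proc/config.gz pipeline as the quoted patch; the names
parse_config_hz() and get_config_hz() are hypothetical. It separates the
parsing from the popen() call and returns early when popen() fails, so
pclose() is never called on a NULL stream.

```c
#define _POSIX_C_SOURCE 200809L	/* for popen()/pclose() */
#include <stdio.h>

/* Parse one line of config text; -1 unless it reads "CONFIG_HZ=<n>". */
static long parse_config_hz(const char *line)
{
	long hz = -1;

	if (sscanf(line, "CONFIG_HZ=%ld", &hz) != 1)
		return -1;
	return hz;
}

/* Hypothetical variant of the quoted helper; -1 if the config is unavailable. */
static long get_config_hz(void)
{
	char line[64];
	long hz = -1;
	FILE *f;

	f = popen("zcat /proc/config.gz 2>/dev/null | grep '^CONFIG_HZ='", "r");
	if (!f)
		return -1;	/* nothing was opened, so nothing to pclose() */

	if (fgets(line, sizeof(line), f))
		hz = parse_config_hz(line);

	pclose(f);
	return hz;
}
```

A caller would treat a return value of -1 as "HZ unknown" and fall back
to whatever default the test considers safe.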


> +
>  /*
>   * This test creates a cgroup with some maximum value within a period, and
>   * verifies that a process in the cgroup is not overscheduled.
> @@ -646,7 +669,8 @@ test_cpucg_nested_weight_underprovisioned(const char *root)
>  static int test_cpucg_max(const char *root)
>  {
>  	int ret = KSFT_FAIL;
> -	long quota_usec = 1000;
> +	long hz = _get_config_hz();
> +	long quota_usec;
>  	long default_period_usec = 100000; /* cpu.max's default period */
>  	long duration_seconds = 1;

I would not bend the tested value but its expectation (so that
approximately the same quantity is tested across configs).

I reckon the problem might be tasks that overrun the quota due to a long
tick; fortunately, we can assume this is compensated over multiple
periods, so _on average_ the quota should be honored (more) precisely.
But the test duration may not be well aligned with all the compensation
periods, so that must be accounted for in the expectation.

When I write it all down, I get this:

--- a/tools/testing/selftests/cgroup/test_cpu.c
+++ b/tools/testing/selftests/cgroup/test_cpu.c
@@ -651,7 +651,9 @@ static int test_cpucg_max(const char *root)
        long duration_seconds = 1;

        long duration_usec = duration_seconds * USEC_PER_SEC;
-       long usage_usec, n_periods, remainder_usec, expected_usage_usec;
+       long usage_usec, expected_usage_usec;
+       long n_periods, spread_periods, unaligned;
+       long tick_usec, low_usage, high_usage;
        char *cpucg;
        char quota_buf[32];

@@ -687,9 +689,16 @@ static int test_cpucg_max(const char *root)
         * the cpu hog is set to run as per wall-clock time
         */
        n_periods = duration_usec / default_period_usec;
-       remainder_usec = duration_usec - n_periods * default_period_usec;
-       expected_usage_usec
-               = n_periods * quota_usec + MIN(remainder_usec, quota_usec);
+       tick_usec = USEC_PER_SEC / hz;
+       /* Up to tick_usec (over)run is compensated over multiple periods */
+       spread_periods = MAX(1, tick_usec / quota_usec);
+       low_usage = n_periods / spread_periods;
+       high_usage = (n_periods + spread_periods - 1) / spread_periods;
+       unaligned = n_periods % spread_periods;
+
+       expected_usage_usec = quota_usec * (
+               unaligned * high_usage +
+               (spread_periods - unaligned) * low_usage);

        if (!values_close_report(usage_usec, expected_usage_usec, 10))
                goto cleanup;


(I neglected (and dropped) remainder_usec because it is zero with
default values)
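
To make the arithmetic in the diff easier to try out, here is a
standalone sketch of the same computation. The function name
expected_usage() and the local USEC_PER_SEC and MAX definitions are
assumptions for illustration; the expressions themselves are copied from
the diff above.

```c
#define USEC_PER_SEC 1000000L
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Expected usage in usec, computed as in the diff above. */
static long expected_usage(long hz, long quota_usec, long n_periods)
{
	long tick_usec = USEC_PER_SEC / hz;
	/* Up to tick_usec of (over)run is compensated over multiple periods */
	long spread_periods = MAX(1, tick_usec / quota_usec);
	long low_usage = n_periods / spread_periods;
	long high_usage = (n_periods + spread_periods - 1) / spread_periods;
	long unaligned = n_periods % spread_periods;

	return quota_usec * (unaligned * high_usage +
			     (spread_periods - unaligned) * low_usage);
}
```

With quota_usec = 1000 and n_periods = 10 (the defaults of the test), hz
values of 1000, 250 and 100 give spread_periods of 1, 4 and 10. The
function returns 10000 in all three cases, because the weighted sum
unaligned * high_usage + (spread_periods - unaligned) * low_usage adds
back up to n_periods.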

However, not all preemptions are tick-based, so there'd be noise
and one has to tune the values_close_report(,,err) anyway.

Then, to reduce noise, the simpler solution is to let the test run
longer:

duration_usec = duration_seconds * USEC_PER_SEC * 1000 / hz;

(where 1000 is the CONFIG_HZ=1000 where the test runs sufficiently [1] well.)
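
As a quick illustration of what this scaling does, the sketch below
wraps the expression in a function (scaled_duration_usec() is a
hypothetical name, and USEC_PER_SEC is defined locally):

```c
#define USEC_PER_SEC 1000000L

/* Stretch the run time so that a lower HZ gets proportionally longer. */
static long scaled_duration_usec(long duration_seconds, long hz)
{
	return duration_seconds * USEC_PER_SEC * 1000 / hz;
}
```

For duration_seconds = 1 this yields 1 s at HZ=1000, 4 s at HZ=250 and
10 s at HZ=100, i.e. the test always spans the same number of ticks.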

Joe, how do the two variants above (unalignment accounting and prolonged
duration) affect test_cpu behavior on your setup?

(I'm personally wondering which is the bigger quantity: systematic error
due to HZ quantization or random (SMP) error.)

Thanks,
Michal

[1] Even there one runs into noise depending on nr_cpus, thus even that
    fixed err=10 is not ideal.


Thread overview: 4+ messages
2026-06-22 19:43 [PATCH v2] selftests/cgroup: Adjust cpu.max quota based on HZ Joe Simmons-Talbott
2026-06-23  5:32 ` Tao Cui
2026-06-23 11:51   ` Joe Simmons-Talbott
2026-06-23 13:52 ` Michal Koutný [this message]
