From: luca abeni <luca.abeni@santannapisa.it>
To: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk>
Cc: Juri Lelli <juri.lelli@redhat.com>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Vineeth Pillai <vineeth@bitbyteword.org>
Subject: Re: SCHED_DEADLINE tasks missing their deadline with SCHED_FLAG_RECLAIM jobs in the mix (using GRUB)
Date: Fri, 30 May 2025 11:21:08 +0200
Message-ID: <20250530112108.63a24cde@luca64>
In-Reply-To: <c91a117401225290fbf0390f2ce78c3e0fb3b2d5.camel@codethink.co.uk>
Hi Marcel,
On Sun, 25 May 2025 21:29:05 +0200
Marcel Ziswiler <marcel.ziswiler@codethink.co.uk> wrote:
[...]
> > How do you configure systemd? I am having trouble reproducing
> > your AllowedCPUs configuration... This is an example of what I am
> > trying:
> > sudo systemctl set-property --runtime custom-workload.slice AllowedCPUs=1
> > sudo systemctl set-property --runtime init.scope AllowedCPUs=0,2,3
> > sudo systemctl set-property --runtime system.slice AllowedCPUs=0,2,3
> > sudo systemctl set-property --runtime user.slice AllowedCPUs=0,2,3
> > and then I try to run a SCHED_DEADLINE application with
> > sudo systemd-run --scope -p Slice=custom-workload.slice <application>
>
> We just use a bunch of systemd configuration files as follows:
>
> [root@localhost ~]# cat /lib/systemd/system/monitor.slice
> # Copyright (C) 2024 Codethink Limited
> # SPDX-License-Identifier: GPL-2.0-only
[...]
So, I copied your *.slice files into /lib/systemd/system (and added
them to the "Wants=" entry of /lib/systemd/system/slices.target,
since otherwise the slices are not created), but I am still unable to
run SCHED_DEADLINE applications in these slices.
This happens because the kernel does not create a new root domain for
these cpusets (probably because the cpusets' CPUs are not exclusive and
the cpuset is not "isolated": for example,
/sys/fs/cgroup/safety1.slice/cpuset.cpus.partition is set to "member",
not to "isolated"). So, the "cpumask_subset(span, p->cpus_ptr)" check in
__sched_setscheduler() is still false and the syscall returns -EPERM.
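To make the failing condition concrete, here is a minimal userspace
sketch of the same test on plain bitmasks. It is not kernel code:
cpu_mask_t, mask_subset() and dl_affinity_check() are hypothetical names
I made up for illustration, and the bandwidth argument only models the
"dl_bw.bw == 0" part of the kernel condition.

```c
#include <errno.h>
#include <stdint.h>

/* One bit per CPU; a hypothetical stand-in for the kernel's struct cpumask. */
typedef uint64_t cpu_mask_t;

/* True when every CPU set in 'a' is also set in 'b' (models cpumask_subset()). */
static int mask_subset(cpu_mask_t a, cpu_mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Hypothetical model of the admission test:
 *   span     - CPUs of the root domain the task belongs to
 *   affinity - CPUs the task is allowed to run on
 *   dl_bw    - deadline bandwidth available in the root domain
 * Returns 0 if the task may become SCHED_DEADLINE, -EPERM otherwise.
 */
static int dl_affinity_check(cpu_mask_t span, cpu_mask_t affinity, uint64_t dl_bw)
{
	if (!mask_subset(span, affinity) || dl_bw == 0)
		return -EPERM;
	return 0;
}
```

With a root domain spanning CPUs 0-3 (mask 0xF) and a task restricted to
CPU 1 (mask 0x2), dl_affinity_check(0xF, 0x2, 1) returns -EPERM, which
is the situation described above. If the cpuset had its own root domain
containing only CPU 1, the call would be dl_affinity_check(0x2, 0x2, 1)
and the check would pass.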
Since I do not know how to obtain an isolated cpuset with cgroup v2 and
systemd, I tried using the old cgroup v1, as described in the
SCHED_DEADLINE documentation.
This worked fine, and enabling SCHED_FLAG_RECLAIM actually reduced the
number of missed deadlines (I tried with a set of periodic tasks having
the same parameters as the ones you described). So, it looks like
reclaiming is working correctly (at least, as far as I can see) when
using cgroup v1 to configure the CPU partitions... Maybe there is some
bug triggered by cgroup v2, or maybe I am misunderstanding your setup.
I think the experiment suggested by Juri can help in understanding
where the issue is.
Thanks,
Luca
> [Unit]
> Description=Prioritized slice for the safety monitor.
> Before=slices.target
>
> [Slice]
> CPUWeight=1000
> AllowedCPUs=0
> MemoryAccounting=true
> MemoryMin=10%
> ManagedOOMPreference=omit
>
> [Install]
> WantedBy=slices.target
>
> [root@localhost ~]# cat /lib/systemd/system/safety1.slice
> # Copyright (C) 2024 Codethink Limited
> # SPDX-License-Identifier: GPL-2.0-only
> [Unit]
> Description=Slice for Safety case processes.
> Before=slices.target
>
> [Slice]
> CPUWeight=1000
> AllowedCPUs=1
> MemoryAccounting=true
> MemoryMin=10%
> ManagedOOMPreference=omit
>
> [Install]
> WantedBy=slices.target
>
> [root@localhost ~]# cat /lib/systemd/system/safety2.slice
> # Copyright (C) 2024 Codethink Limited
> # SPDX-License-Identifier: GPL-2.0-only
> [Unit]
> Description=Slice for Safety case processes.
> Before=slices.target
>
> [Slice]
> CPUWeight=1000
> AllowedCPUs=2
> MemoryAccounting=true
> MemoryMin=10%
> ManagedOOMPreference=omit
>
> [Install]
> WantedBy=slices.target
>
> [root@localhost ~]# cat /lib/systemd/system/safety3.slice
> # Copyright (C) 2024 Codethink Limited
> # SPDX-License-Identifier: GPL-2.0-only
> [Unit]
> Description=Slice for Safety case processes.
> Before=slices.target
>
> [Slice]
> CPUWeight=1000
> AllowedCPUs=3
> MemoryAccounting=true
> MemoryMin=10%
> ManagedOOMPreference=omit
>
> [Install]
> WantedBy=slices.target
>
> [root@localhost ~]# cat /lib/systemd/system/system.slice
> # Copyright (C) 2024 Codethink Limited
> # SPDX-License-Identifier: GPL-2.0-only
>
> #
> # This slice will control all processes started by systemd by
> # default.
> #
>
> [Unit]
> Description=System Slice
> Documentation=man:systemd.special(7)
> Before=slices.target
>
> [Slice]
> CPUQuota=150%
> AllowedCPUs=0
> MemoryAccounting=true
> MemoryMax=80%
> ManagedOOMSwap=kill
> ManagedOOMMemoryPressure=kill
>
> [root@localhost ~]# cat /lib/systemd/system/user.slice
> # Copyright (C) 2024 Codethink Limited
> # SPDX-License-Identifier: GPL-2.0-only
>
> #
> # This slice will control all processes started by systemd-logind
> #
>
> [Unit]
> Description=User and Session Slice
> Documentation=man:systemd.special(7)
> Before=slices.target
>
> [Slice]
> CPUQuota=25%
> AllowedCPUs=0
> MemoryAccounting=true
> MemoryMax=80%
> ManagedOOMSwap=kill
> ManagedOOMMemoryPressure=kill
>
> > However, this does not work because systemd is not creating an
> > isolated cpuset... So, the root domain still contains CPUs 0-3, and
> > the "custom-workload.slice" cpuset only has CPU 1. Hence, the check
> >         /*
> >          * Don't allow tasks with an affinity mask smaller than
> >          * the entire root_domain to become SCHED_DEADLINE. We
> >          * will also fail if there's no bandwidth available.
> >          */
> >         if (!cpumask_subset(span, p->cpus_ptr) ||
> >             rq->rd->dl_bw.bw == 0) {
> >                 retval = -EPERM;
> >                 goto unlock;
> >         }
> > in __sched_setscheduler() fails.
> >
> >
> > How are you configuring the cpusets?
>
> See above.
>
> > Also, which kernel version are you using?
> > (sorry if you already posted this information in previous emails
> > and I am missing something obvious)
>
> Not even sure whether I explicitly mentioned that, other than saying
> that we are always running the latest stable.
>
> Two months ago, when we last ran some extensive tests on this, it was
> actually v6.13.6.
>
> > Thanks,
>
> Thank you!
>
> > Luca
>
> Cheers
>
> Marcel