From: Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Qais Yousef <qyousef-wp2msK0BRk8tq7phqP6ubQ@public.gmane.org>
Cc: Hao Luo <haoluo-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Dietmar Eggemann <dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>,
Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
luca.abeni-5rdYK369eBLQB0XuIGIEkQ@public.gmane.org,
claudio-YOzL5CV4y4YG1A2ADO40+w@public.gmane.org,
tommaso.cucinotta-5rdYK369eBLQB0XuIGIEkQ@public.gmane.org,
bristot-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
mathieu.poirier-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Vincent Guittot <vincent.guittot-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
Wei Wang <wvw-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Rick Yiu <rickyiu-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Quentin Perret <qperret-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Heiko Carstens <hca-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
Vasily Gorbik <gor-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
Alexander Gordeev <agordeev-tEXmvtCZX7AybS5Ee8rs3A@public.gmane.org>,
Sudeep Holla <sudeep.holla-5wv7dgnIgG8@public.gmane.org>,
Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
linux-s3
Subject: Re: [PATCH v3] sched: cpuset: Don't rebuild root domains on suspend-resume
Date: Mon, 13 Mar 2023 16:37:59 +0000
Message-ID: <ZA9RZ3VvGXKp+1L6@localhost.localdomain>
In-Reply-To: <20230311185150.stvtcbdkoofgn3wd@airbuntu>
On 11/03/23 18:51, Qais Yousef wrote:
> On 03/09/23 14:23, Hao Luo wrote:
> > On Wed, Mar 8, 2023 at 10:55 PM Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > >
> > > On 08/03/23 10:01, Hao Luo wrote:
> > > > On Wed, Mar 8, 2023 at 2:20 AM Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > > > >
> > > > > On 01/03/23 17:03, Qais Yousef wrote:
> > > > > > On 03/01/23 15:26, Juri Lelli wrote:
> > > > <...>
> > > > > > > BTW, do you have a repro script of some sort handy I might play with?
> > > > > >
> > > > > > Sorry no. You'll just need to suspend to ram. I had a simple patch to measure
> > > > > > the time around the call and trace_printk'ed the result.
> > > > > >
> > > > > > > I was working on an Android phone which just suspends to ram if you turn the
> > > > > > > screen off and disconnect the usb.
> > > > >
> > > > > Looks like I could come up with the following
> > > > >
> > > > > https://github.com/jlelli/linux.git deadline/rework-cpusets
> > > > > https://github.com/jlelli/linux/tree/deadline/rework-cpusets
> > > > >
> > > > > which I don't think it's at a point that I feel comfortable to propose
> > > > > as an RFC (not even sure if it actually makes sense), but it survived my
> > > > > very light testing.
> > > > >
> > > > > Could you please take a look and, if it makes some sense in theory, give
> > > > > it a try on your end?
> > > > >
> > > >
> > > > Hi Juri,
> > > >
> > > > Thanks for coming up with the RFC. I can test your changes in the
> > > > server environment. I observed the same issue on my side and I can
> > > > reproduce.
> > > >
> > > > I sync'ed up with Qais offline earlier yesterday, and was preparing a
> > > > couple of patches that optimize the cpuset.cpus writes. Tracking dl
> > > > tasks in cpusets is one of them. But I am happy to take your patches
> > > > and do the testing. Note that I won't be able to test the dl part of
> > > > the patch, only the latency impact on rebuild_root_domains(), as we
> > > > don't have dl tasks in our system.
> > > >
> > > > The other patch is fixing cpuset_rwsem. I see you switched it back to
> > > > mutex. I did observe performance issues with cpuset_rwsem. Basically,
> > > > using percpu_rwsem generates very very long latency tails for writers,
> > > > but mutex doesn't. After some debugging, I found it was because
> > > > percpu_rwsem requires every writer to call a synchronize_rcu() for
> > > > acquiring the lock. So in my patches, I disabled the fastpath of
> > > > readers for cpuset_rwsem. This has been done before[1]. But mutex also
> > > > worked.
> > > >
> > > > Anyway, I'm happy to test your patches and ack once they are sent out.
> > >
> > > Do you strictly need a proper RFC or could you please test the above for
> > > now? If you could please do the latter, and if tests look ok, I could
> > > then put together proper changelogs etc. and propose an RFC (it would
> > > save me some time not to do that if the above doesn't work, apologies
> > > for not going the proper route from the start). Guess this question
> > > applies to Qais as well. Hummm, or maybe you are actually saying that
> > > you are indeed going to test them already, just wanted to make sure
> > > then. :)
> >
> > Juri, I ported your patches to a 5.10 kernel, because my workload can
> > only run on 5.10. But unfortunately the kernel crashed at
> > cpuset_can_attach(). I'll put a few comments in your github branch.
>
> Yeah I am working on 5.10 too (this will need to be backported to 5.10 and 5.15
> ultimately) and had the same crash because task is NULL.
>
> Fixed it this way, which I think is what you intended to do, Juri? It moves the
> check for dl_task(task) inside the cgroup_taskset_for_each() loop.
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 83a8943467fb..06d6bb68d86b 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2495,11 +2495,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>  		ret = security_task_setscheduler(task);
>  		if (ret)
>  			goto out_unlock;
> -	}
>
> -	if (dl_task(task)) {
> -		cs->deadline_tasks++;
> -		cpuset_attach_old_cs->deadline_tasks--;
> +		if (dl_task(task)) {
> +			cs->deadline_tasks++;
> +			cpuset_attach_old_cs->deadline_tasks--;
> +		}
>  	}
>
>  	/*
Duh, indeed.
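
For anyone following along, here is a minimal user-space sketch of why the
original placement crashed. It assumes the taskset iterator ends exactly when
the task pointer becomes NULL, so checking the task after the loop dereferences
NULL. Everything below (struct task, struct cpuset_model, struct taskset,
taskset_for_each(), account_attach()) is a hypothetical stand-in for
illustration, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; illustration only. */
struct task { bool is_deadline; };
struct cpuset_model { int deadline_tasks; };
struct taskset { struct task **tasks; size_t n; size_t pos; };

static struct task *taskset_first(struct taskset *ts)
{
	ts->pos = 0;
	return ts->pos < ts->n ? ts->tasks[ts->pos] : NULL;
}

static struct task *taskset_next(struct taskset *ts)
{
	ts->pos++;
	return ts->pos < ts->n ? ts->tasks[ts->pos] : NULL;
}

/* Assumed shape of the iterator: the loop stops when task is NULL. */
#define taskset_for_each(task, ts) \
	for ((task) = taskset_first(ts); (task); (task) = taskset_next(ts))

/*
 * Move the deadline accounting from old_cs to new_cs, one task at a time.
 * The check sits inside the loop, where task is known to be valid.
 * Returns the task pointer as the loop left it, which is always NULL;
 * that is what a check placed after the loop would have dereferenced.
 */
static struct task *account_attach(struct taskset *ts,
				   struct cpuset_model *new_cs,
				   struct cpuset_model *old_cs)
{
	struct task *task;

	taskset_for_each(task, ts) {
		if (task->is_deadline) {
			new_cs->deadline_tasks++;
			old_cs->deadline_tasks--;
		}
	}

	return task;
}
```

Besides avoiding the NULL dereference, keeping the check inside the loop means
every task in the set is accounted for, not only one of them.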
> Like Hao I don't have any deadline tasks in the system. With the fix above
> I don't notice the delay on suspend resume using your patches.
OK, cool.
> If you want any debug; please feel free to add them into your branch so I can
> run with that and give you the log.
I will need to find time to run some tests with DEADLINE tasks, yeah.
Dietmar, since you also reported the issue above in your testing, maybe
you could help with testing DEADLINE?
Thanks,
Juri