From: Oleksandr Natalenko <oleksandr@natalenko.name>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg KH <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>
Subject: Re: Backport d583d360a6 into 5.12 stable
Date: Wed, 23 Jun 2021 16:00:14 +0200
Message-ID: <7592880.XtGZTbpMlS@spock>
In-Reply-To: <YNIrp7A0LV0aLc5q@cmpxchg.org>
Hello.
On Tuesday, 22 June 2021 20:27:51 CEST Johannes Weiner wrote:
> On Tue, Jun 22, 2021 at 07:24:56PM +0200, Oleksandr Natalenko wrote:
> > On Tuesday, 22 June 2021 18:47:59 CEST Greg KH wrote:
> > > On Tue, Jun 22, 2021 at 06:30:46PM +0200, Oleksandr Natalenko wrote:
> > > > I'd like to nominate d583d360a6 ("psi: Fix psi state corruption when
> > > > schedule() races with cgroup move") for 5.12 stable tree.
> > > >
> > > > Recently, I've hit this:
> > > >
> > > > ```
> > > > kernel: psi: inconsistent task state! task=2667:clementine cpu=21 psi_flags=0 clear=1 set=0
> > > > ```
> > > >
> > > > and after that PSI IO went crazy high. That seems to match the symptoms
> > > > described in the commit message.
> > >
> > > But this says it fixes 4117cebf1a9f ("psi: Optimize task switch inside
> > > shared cgroups") which did not show up until 5.13-rc1, so how are you
> > > hitting this issue?
> >
> > I'm not positive 4117cebf1a9f was a root cause of the race. To me it looks
> > like 4117cebf1a9f just made an older issue more likely to be hit.
> >
> > Peter, Johannes, am I correct saying that it is still possible to hit a
> > corruption described in d583d360a6 on 5.12?
>
> I'm not aware of a previous issue, but it's possible you discovered
> one that was incidentally fixed by this change.
>
> That said, there haven't been many changes in this area prior to 5.12,
> and I stared at the old code quite a bit to see if there are other
> possible scenarios, so this gives me pause.
Ack.
> > > Did you try this patch on 5.12.y and see that it solved your problem?
> >
> > Yes, I've built the kernel with this patch, and so far it runs fine. It can
> > take a while until the condition is hit though since it seems to be very
> > unlikely on 5.12.
>
> Is your task moving / being moved between cgroups while it's doing
> work?
Likely, yes. IIUC, KDE spawns apps in separate cgroups, so in this particular
case Clementine should get its own one (?):
```
$ systemd-cgls
…
│ │ │ ├─app-clementine-df516e4181f446ab869e723ea2ed6094.scope
│ │ │ │ ├─2926 /bin/clementine -session 10de706f63000162437544200000015700012_1624379013_575845
│ │ │ │ ├─3059 /usr/bin/clementine-tagreader /tmp/clementine_735427711
│ │ │ │ ├─3060 /usr/bin/clementine-tagreader /tmp/clementine_557274898
│ │ │ │ ├─3062 /usr/bin/clementine-tagreader /tmp/clementine_1730944950
│ │ │ │ ├─3063 /usr/bin/clementine-tagreader /tmp/clementine_1509249421
│ │ │ │ ├─3065 /usr/bin/clementine-tagreader /tmp/clementine_1345386497
│ │ │ │ ├─3068 /usr/bin/clementine-tagreader /tmp/clementine_865255891
│ │ │ │ ├─3070 /usr/bin/clementine-tagreader /tmp/clementine_1782561441
│ │ │ │ ├─3072 /usr/bin/clementine-tagreader /tmp/clementine_421851305
│ │ │ │ ├─3073 /usr/bin/clementine-tagreader /tmp/clementine_175368243
│ │ │ │ ├─3075 /usr/bin/clementine-tagreader /tmp/clementine_1962830479
│ │ │ │ ├─3076 /usr/bin/clementine-tagreader /tmp/clementine_547573203
│ │ │ │ ├─3078 /usr/bin/clementine-tagreader /tmp/clementine_1819270047
│ │ │ │ ├─3079 /usr/bin/clementine-tagreader /tmp/clementine_1632862299
│ │ │ │ ├─3085 /usr/bin/clementine-tagreader /tmp/clementine_1279975869
│ │ │ │ ├─3095 /usr/bin/clementine-tagreader /tmp/clementine_1612119641
│ │ │ │ ├─3102 /usr/bin/clementine-tagreader /tmp/clementine_1789578483
│ │ │ │ ├─3103 /usr/bin/clementine-tagreader /tmp/clementine_1541442265
│ │ │ │ ├─3105 /usr/bin/clementine-tagreader /tmp/clementine_1418456770
│ │ │ │ ├─3106 /usr/bin/clementine-tagreader /tmp/clementine_1998684543
│ │ │ │ ├─3107 /usr/bin/clementine-tagreader /tmp/clementine_1349315391
│ │ │ │ ├─3108 /usr/bin/clementine-tagreader /tmp/clementine_231895572
│ │ │ │ ├─3110 /usr/bin/clementine-tagreader /tmp/clementine_492688785
│ │ │ │ ├─3111 /usr/bin/clementine-tagreader /tmp/clementine_1492630900
│ │ │ │ └─3112 /usr/bin/clementine-tagreader /tmp/clementine_2017490599
…
```
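To check whether a task really gets moved between cgroups while it runs, one could snapshot `/proc/<pid>/cgroup` twice and compare. Below is a minimal sketch of that idea, not something I have run against this setup. The helper names (`parse_proc_cgroup`, `read_membership`, `moved`) are made up for illustration. It assumes the documented `hierarchy-ID:controller-list:cgroup-path` line format, where cgroup v2 shows up as `0::<path>`.

```python
from pathlib import Path


def parse_proc_cgroup(text):
    """Map each hierarchy's controller list ('' for cgroup v2) to its cgroup path."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Line format: hierarchy-ID:controller-list:cgroup-path.
        # Split at most twice because the path itself may contain ':'.
        _hier_id, controllers, path = line.split(":", 2)
        membership[controllers] = path
    return membership


def read_membership(pid):
    """Read and parse the current cgroup membership of a live task."""
    return parse_proc_cgroup(Path(f"/proc/{pid}/cgroup").read_text())


def moved(before, after):
    """Return the controller lists whose cgroup path differs between two snapshots."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

Calling `read_membership(pid)` right after the application starts and again a bit later, then passing both results to `moved()`, would show whether the task changed cgroups in between. Polling like this can obviously miss a short-lived move, so an empty result proves nothing.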
> How long does it usually take to trigger it?
I don't know :(. I don't usually peer into dmesg, and noticed this by pure
chance. Grepping the journal shows only this one occurrence, but the journal
is rotated, so some info might already be lost.
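To avoid relying on chance next time, the kernel log could be scanned for this warning automatically. Here is a minimal sketch, assuming the message keeps the shape of the single occurrence quoted above. The function name `find_psi_warnings` is hypothetical, and reading `psi_flags`, `clear` and `set` as hexadecimal is my assumption.

```python
import re

# Pattern modelled on the one warning seen in this thread; the task name is
# matched lazily up to " cpu=" because a comm may contain spaces.
PSI_WARNING = re.compile(
    r"psi: inconsistent task state! "
    r"task=(?P<pid>\d+):(?P<comm>.+?) cpu=(?P<cpu>\d+) "
    r"psi_flags=(?P<psi_flags>[0-9a-fA-F]+) "
    r"clear=(?P<clear>[0-9a-fA-F]+) set=(?P<set>[0-9a-fA-F]+)"
)


def find_psi_warnings(log_text):
    """Return one dict per PSI inconsistency warning found in log_text."""
    hits = []
    for m in PSI_WARNING.finditer(log_text):
        hits.append({
            "pid": int(m["pid"]),
            "comm": m["comm"],
            "cpu": int(m["cpu"]),
            # Assumption: the three flag fields are printed in hex.
            "psi_flags": int(m["psi_flags"], 16),
            "clear": int(m["clear"], 16),
            "set": int(m["set"], 16),
        })
    return hits
```

This could be fed with the text of `journalctl -k`, for example via `find_psi_warnings(sys.stdin.read())`, and run periodically so that a hit is recorded before the journal rotates.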
> Would it be possible to share a simpler reproducer, or is this part of
> a more complex application?
This was triggered by KDE's autostart of the Clementine player, and I don't have
any specific reproducer. If I find one, I'll share it of course.
Thanks.
--
Oleksandr Natalenko (post-factum)