public inbox for linux-kernel@vger.kernel.org
From: Alexey Dobriyan <adobriyan@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu Liao <liaoyu15@huawei.com>,
	fweisbec@gmail.com, mingo@kernel.org, liwei391@huawei.com,
	mirsad.todorovac@alu.unizg.hr, linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH RFC] tick/nohz: fix data races in get_cpu_idle_time_us()
Date: Tue, 31 Jan 2023 21:35:39 +0300
Message-ID: <Y9lfe54aoCWlmyqy@p183>
In-Reply-To: <87357q228f.ffs@tglx>

On Tue, Jan 31, 2023 at 03:44:00PM +0100, Thomas Gleixner wrote:
> On Sat, Jan 28 2023 at 10:00, Yu Liao wrote:
> > selftest/proc/proc-uptime-001 complains:
> >   Euler:/mnt # while true; do ./proc-uptime-001; done
> >   proc-uptime-001: proc-uptime-001.c:41: main: Assertion `i1 >= i0' failed.
> >   proc-uptime-001: proc-uptime-001.c:41: main: Assertion `i1 >= i0' failed.
> >
> > /proc/uptime should be monotonically increasing. This occurs because
> > of data races between get_cpu_idle_time_us() and
> > tick_nohz_stop_idle()/tick_nohz_start_idle(), for example:
> >
> > CPU0                        CPU1
> > get_cpu_idle_time_us
> >
> >                             tick_nohz_idle_exit
> >                               now = ktime_get();
> >                               tick_nohz_stop_idle
> >                                 update_ts_time_stats
> >                                   delta = ktime_sub(now, ts->idle_entrytime);
> >                                   ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta)
> >                                   ts->idle_entrytime = now
> >
> > now = ktime_get();
> > if (ts->idle_active && !nr_iowait_cpu(cpu)) {
> >     ktime_t delta = ktime_sub(now, ts->idle_entrytime);
> >     idle = ktime_add(ts->idle_sleeptime, delta);
> >     //idle is slightly greater than the actual value
> > } else {
> >     idle = ts->idle_sleeptime;
> > }
> >                             ts->idle_active = 0
> >
> > After this, idle = idle_sleeptime(actual idle value) + now(CPU0) - now(CPU1).
> > If get_cpu_idle_time_us() is called immediately after ts->idle_active = 0,
> > only ts->idle_sleeptime is returned, which is smaller than the previously
> > read one, resulting in a non-monotonically increasing idle time. In
> > addition, there are other data race scenarios not listed here.
> 
> Seriously, this procfs accuracy is the least of the problems, and if this
> were the only issue then we could trivially fix it by declaring that
> the procfs output might go backwards.

Declarations on l-k are meaningless.

> If there were a real reason to ensure monotonicity there then we could
> easily do that in the readout code.

People expect it to be monotonic. I wrote this test fully expecting
/proc/uptime to be monotonic. It never occurred to me that
idle time could go backwards (nor uptime, but uptime is not buggy).
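
The expectation can be sketched as a small userspace check. This is a
minimal sketch, not the source of proc-uptime-001: the helper names
parse_uptime(), read_uptime() and check_idle_monotonic() are hypothetical,
and it assumes /proc/uptime prints two "seconds.centiseconds" fields with
exactly two fractional digits.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse "UP.CC IDLE.CC" into centiseconds.
 * Assumes exactly two fractional digits per field. Returns 0 on success. */
int parse_uptime(const char *s, unsigned long long *up,
		 unsigned long long *idle)
{
	unsigned long long us, uc, is, ic;

	if (sscanf(s, "%llu.%2llu %llu.%2llu", &us, &uc, &is, &ic) != 4)
		return -1;
	*up = us * 100 + uc;
	*idle = is * 100 + ic;
	return 0;
}

/* Hypothetical helper: take one sample from /proc/uptime. */
int read_uptime(unsigned long long *up, unsigned long long *idle)
{
	char buf[64];
	FILE *f = fopen("/proc/uptime", "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_uptime(buf, up, idle);
}

/* Take one baseline sample, then n more, comparing each idle value
 * with the previous one.
 * Returns 0 if idle never decreased, 1 if it went backwards,
 * -1 if /proc/uptime could not be read or parsed. */
int check_idle_monotonic(int n)
{
	unsigned long long up, prev, cur;

	assert(n > 0);
	if (read_uptime(&up, &prev) != 0)
		return -1;
	for (int i = 0; i < n; i++) {
		if (read_uptime(&up, &cur) != 0)
			return -1;
		if (cur < prev)
			return 1;
		prev = cur;
	}
	return 0;
}
```

Calling check_idle_monotonic() in a loop plays the same role as the
"while true" loop in the report: a return value of 1 corresponds to the
failed `i1 >= i0' assertion.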

> But the real issue is that both get_cpu_idle_time_us() and
> get_cpu_iowait_time_us() can invoke update_ts_time_stats() which is way
> worse than the above procfs idle time going backwards.
> 
> If update_ts_time_stats() is invoked concurrently for the same CPU then
> ts->idle_sleeptime and ts->iowait_sleeptime are turning into random
> numbers.
> 
> This was broken 12 years ago in commit 595aac488b54 ("sched:
> Introduce a function to update the idle statistics").
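
The interleaving from the original report can be replayed deterministically
in userspace. This is only a single-threaded model under stated
assumptions: struct ts_model and the model_*() functions are hypothetical
stand-ins for struct tick_sched and the kernel functions, times are plain
integers rather than ktime_t, and the nr_iowait_cpu() check is left out.

```c
#include <stdint.h>

/* Hypothetical model of the three fields involved in the race. */
struct ts_model {
	int64_t idle_entrytime;
	int64_t idle_sleeptime;
	int idle_active;
};

/* Reader side, modelled on the quoted get_cpu_idle_time_us() fragment. */
int64_t model_read_idle(const struct ts_model *ts, int64_t now)
{
	if (ts->idle_active)
		return ts->idle_sleeptime + (now - ts->idle_entrytime);
	return ts->idle_sleeptime;
}

/* Writer side, modelled on the quoted update_ts_time_stats() steps. */
void model_update_stats(struct ts_model *ts, int64_t now)
{
	int64_t delta = now - ts->idle_entrytime;

	ts->idle_sleeptime += delta;
	ts->idle_entrytime = now;
}

/* Replay the reported ordering; returns second readout minus first. */
int64_t replay_race(void)
{
	struct ts_model ts = {
		.idle_entrytime = 1000,
		.idle_sleeptime = 500,
		.idle_active = 1,
	};
	int64_t first, second;

	/* CPU1: now = 2000, folds the idle period into idle_sleeptime. */
	model_update_stats(&ts, 2000);
	/* CPU0: now = 2010, idle_active is still set, so delta is added. */
	first = model_read_idle(&ts, 2010);
	/* CPU1: finally clears idle_active. */
	ts.idle_active = 0;
	/* CPU0: next readout sees only idle_sleeptime. */
	second = model_read_idle(&ts, 2020);

	return second - first;
}
```

With these made-up timestamps the first readout is 1510 and the second is
1500, so replay_race() returns -10: the readout went backwards by
now(CPU0) - now(CPU1), as the report describes.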
