From: Marcelo Tosatti <mtosatti@redhat.com>
To: Christoph Lameter <cl@gentwo.de>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Frederic Weisbecker <frederic@kernel.org>,
Juri Lelli <juri.lelli@redhat.com>, Nitesh Lal <nilal@redhat.com>
Subject: Re: [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace
Date: Mon, 5 Jul 2021 11:45:42 -0300 [thread overview]
Message-ID: <20210705144542.GA27275@fuller.cnet> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2107051622580.278143@gentwo.de>
On Mon, Jul 05, 2021 at 04:26:48PM +0200, Christoph Lameter wrote:
> On Fri, 2 Jul 2021, Marcelo Tosatti wrote:
>
> > > > The logic to disable vmstat worker thread, when entering
> > > > nohz full, does not cover all scenarios. For example, it is possible
> > > > for the following to happen:
> > > >
> > > > 1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
> > > > 2) app runs mlock, which increases counters for mlock'ed pages.
> > > > 3) start -RT loop
> > > >
> > > > Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
> > > > the mlock, vmstat shepherd can restart vmstat worker thread on
> > > > the CPU in question.
> > >
> > > Can we enter nohz_full after the app runs mlock?
> >
> > Hmm, I don't think it's a good idea to use that route, because
> > entering or exiting nohz_full depends on a number of variables
> > outside of one's control (and additional variables might be
> > added in the future).
>
> Then I do not see any need for this patch. Because after a certain time
> of inactivity (after the mlock) the system will enter nohz_full again.
> If userspace has no direct control over nohz_full and can only wait then
> it just has to do so.
Sorry, I fail to see what you mean.

The problem (well, it's not a bug per se) is that the current
disablement of the vmstat worker thread is not aggressive enough.
From the initial message:
1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop
Note that any activity that triggers stat counter changes will cause
this (it just happens to have been mlock in the test application I was
using; replace it with any other system call that writes to per-CPU
vmstat counters).
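To make the mechanism concrete, here is a small userspace sketch (Linux-only,
not part of this patch set; it calls mlock(2) through ctypes). It locks one
page and reads the global nr_mlock value from /proc/vmstat before and after.
Because updates are batched in per-CPU deltas, the global value may lag behind
until the vmstat worker (or an explicit sync) folds the deltas in, which is
exactly the counter dirtying this thread is about:

```python
import ctypes
import mmap
import os

def read_nr_mlock():
    """Return the global nr_mlock counter from /proc/vmstat, or None."""
    try:
        with open("/proc/vmstat") as f:
            for line in f:
                name, _, val = line.partition(" ")
                if name == "nr_mlock":
                    return int(val)
    except (OSError, ValueError):
        pass
    return None

def demo():
    # Call mlock(2) through the C library; on Linux, CDLL(None) exposes
    # the libc symbols of the running process.
    libc = ctypes.CDLL(None, use_errno=True)
    buf = mmap.mmap(-1, mmap.PAGESIZE)  # one anonymous page
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    before = read_nr_mlock()
    if libc.mlock(ctypes.c_void_p(addr), mmap.PAGESIZE) != 0:
        # May fail under a tight RLIMIT_MEMLOCK.
        print("mlock failed:", os.strerror(ctypes.get_errno()))
    after = read_nr_mlock()
    # "after" may equal "before": the update sits in the per-CPU delta
    # until it is folded into the global counter.
    print("nr_mlock before:", before, "after:", after)
    libc.munlock(ctypes.c_void_p(addr), mmap.PAGESIZE)

if __name__ == "__main__":
    try:
        demo()
    except OSError as exc:  # e.g. libc not reachable this way
        print("demo skipped:", exc)
```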
You said:
"Because after a certain time of inactivity (after the mlock) the
system will enter nohz_full again."
Yes, but we cannot tolerate any activity from the vmstat worker thread
on this particular CPU.
Do you want the app to wait for an event saying: "vmstat_worker is now
disabled, as long as you don't dirty vmstat counters, vmstat_shepherd
won't wake it up".
Rather than that, what this patch does is to sync the vmstat counters on
return to userspace, so that:
"We synced per-CPU vmstat counters to global counters, and disable
local-CPU vmstat worker (on return to userspace). As long as you
don't dirty vmstat counters, vmstat_shepherd won't wake it up".
Does that make sense?
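For illustration, the per-CPU delta/fold scheme this relies on can be modeled
in a few lines (a toy userspace model with made-up names, not the kernel
implementation): each CPU batches counter updates locally, a fold pushes the
deltas into the global counter and clears the local state, and a CPU with no
pending delta is one vmstat_shepherd has no reason to wake a worker for.

```python
class VmstatModel:
    """Toy model of per-CPU vmstat deltas and the fold operation."""

    def __init__(self, ncpus):
        self.global_count = 0
        self.pcpu_delta = [0] * ncpus

    def mod_state(self, cpu, delta):
        # Counter updates only dirty the local per-CPU diff.
        self.pcpu_delta[cpu] += delta

    def fold_cpu(self, cpu):
        # Sync on return to userspace: fold the local delta into the
        # global counter and clear it.
        self.global_count += self.pcpu_delta[cpu]
        self.pcpu_delta[cpu] = 0

    def cpu_is_clean(self, cpu):
        # The shepherd only needs to re-arm the worker for dirty CPUs.
        return self.pcpu_delta[cpu] == 0

v = VmstatModel(4)
v.mod_state(2, 5)          # e.g. mlock() bumps a counter on CPU 2
assert not v.cpu_is_clean(2)
v.fold_cpu(2)              # sync on return to userspace
assert v.cpu_is_clean(2) and v.global_count == 5
```

As long as the task then stays in userspace without dirtying counters,
the CPU remains clean in this model, which is the invariant the patch
establishes at syscall exit.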
> > So preparing the system to function
> > while entering nohz_full at any location seems the sane thing to do.
> >
> > And that would be at return to userspace (since, if mlocked, after
> > that point there will be no more changes to propagate to vmstat
> > counters).
> >
> > Or am I missing something else you can think of?
>
> I assumed that the "enter nohz full" was an action by the user
> space app because I saw some earlier patches to introduce such
> functionality in the past.
No, it meant "enter nohz full" (in the current Linux codebase, for
existing applications).
Thread overview: 23+ messages
2021-07-01 21:03 [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace Marcelo Tosatti
2021-07-01 21:03 ` [patch 1/5] sched: isolation: introduce vmstat_sync isolcpu flags Marcelo Tosatti
2021-07-01 21:03 ` [patch 2/5] common entry: add hook for isolation to __syscall_exit_to_user_mode_work Marcelo Tosatti
2021-07-01 21:03 ` [patch 3/5] mm: vmstat: optionally flush per-CPU vmstat counters on return to userspace Marcelo Tosatti
2021-07-01 23:11 ` kernel test robot
2021-07-02 6:50 ` kernel test robot
2021-07-01 21:03 ` [patch 4/5] mm: vmstat: move need_update Marcelo Tosatti
2021-07-01 21:03 ` [patch 5/5] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean Marcelo Tosatti
2021-07-02 4:10 ` kernel test robot
2021-07-02 4:43 ` kernel test robot
2021-07-02 8:00 ` [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace Christoph Lameter
2021-07-02 11:52 ` Marcelo Tosatti
2021-07-02 11:59 ` Marcelo Tosatti
2021-07-05 14:26 ` Christoph Lameter
2021-07-05 14:45 ` Marcelo Tosatti [this message]
2021-07-02 12:30 ` Frederic Weisbecker
2021-07-02 15:28 ` Marcelo Tosatti
2021-07-06 13:09 ` Frederic Weisbecker
2021-07-06 14:05 ` Marcelo Tosatti
2021-07-06 14:09 ` Marcelo Tosatti
2021-07-06 14:17 ` Marcelo Tosatti
2021-07-06 16:15 ` Peter Zijlstra
2021-07-06 16:53 ` Marcelo Tosatti