From: Marcelo Tosatti <mtosatti@redhat.com>
To: Christoph Lameter <cl@gentwo.de>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Frederic Weisbecker <frederic@kernel.org>,
Juri Lelli <juri.lelli@redhat.com>, Nitesh Lal <nilal@redhat.com>
Subject: Re: [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace
Date: Mon, 5 Jul 2021 11:45:42 -0300 [thread overview]
Message-ID: <20210705144542.GA27275@fuller.cnet> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2107051622580.278143@gentwo.de>
On Mon, Jul 05, 2021 at 04:26:48PM +0200, Christoph Lameter wrote:
> On Fri, 2 Jul 2021, Marcelo Tosatti wrote:
>
> > > > The logic to disable vmstat worker thread, when entering
> > > > nohz full, does not cover all scenarios. For example, it is possible
> > > > for the following to happen:
> > > >
> > > > 1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
> > > > 2) app runs mlock, which increases counters for mlock'ed pages.
> > > > 3) start -RT loop
> > > >
> > > > Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
> > > > the mlock, vmstat shepherd can restart vmstat worker thread on
> > > > the CPU in question.
> > >
> > > Can we enter nohz_full after the app runs mlock?
> >
> > Hum, i don't think its a good idea to use that route, because
> > entering or exiting nohz_full depends on a number of variable
> > outside of one's control (and additional variables might be
> > added in the future).
>
> Then I do not see any need for this patch. Because after a certain time
> of inactivity (after the mlock) the system will enter nohz_full again.
> If userspace has no direct control over nohz_full and can only wait then
> it just has to do so.
Sorry, fail to see what you mean.
The problem (well its not a bug per se, but basically the current
disablement of vmstat_worker thread is not aggressive enough).
From the initial message:
1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop
Note that any activity that triggers stat counter changes (other than
mlock, it just happens that it was mlock in the test application i was
using, just replace with any other system call that triggers writes
to per-CPU vmstat counters), will cause this.
You said:
"Because after a certain time of inactivity (after the mlock) the
system will enter nohz_full again."
Yes, but we can't tolerate any activity from vmstat worker thread
on this particular CPU.
Do you want the app to wait for an event saying: "vmstat_worker is now
disabled, as long as you don't dirty vmstat counters, vmstat_shepherd
won't wake it up".
Rather than that, what this patch does is to sync the vmstat counters on
return to userspace, so that:
"We synced per-CPU vmstat counters to global counters, and disable
local-CPU vmstat worker (on return to userspace). As long as you
don't dirty vmstat counters, vmstat_shepherd won't wake it up".
Makes sense?
> > So preparing the system to function
> > while entering nohz_full at any location seems the sane thing to do.
> >
> > And that would be at return to userspace (since, if mlocked, after
> > that point there will be no more changes to propagate to vmstat
> > counters).
> >
> > Or am i missing something else you can think of ?
>
> I assumed that the "enter nohz full" was an action by the user
> space app because I saw some earlier patches to introduce such
> functionality in the past.
No, it meant "enter nohz full" (in the current Linux codebase, for
existing applications).
next prev parent reply other threads:[~2021-07-05 14:46 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-01 21:03 [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace Marcelo Tosatti
2021-07-01 21:03 ` [patch 1/5] sched: isolation: introduce vmstat_sync isolcpu flags Marcelo Tosatti
2021-07-01 21:03 ` [patch 2/5] common entry: add hook for isolation to __syscall_exit_to_user_mode_work Marcelo Tosatti
2021-07-01 21:03 ` [patch 3/5] mm: vmstat: optionally flush per-CPU vmstat counters on return to userspace Marcelo Tosatti
2021-07-01 23:11 ` kernel test robot
2021-07-01 23:11 ` kernel test robot
2021-07-02 6:50 ` kernel test robot
2021-07-02 6:50 ` kernel test robot
2021-07-01 21:03 ` [patch 4/5] mm: vmstat: move need_update Marcelo Tosatti
2021-07-01 21:03 ` [patch 5/5] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean Marcelo Tosatti
2021-07-02 4:10 ` kernel test robot
2021-07-02 4:10 ` kernel test robot
2021-07-02 4:43 ` kernel test robot
2021-07-02 4:43 ` kernel test robot
2021-07-02 8:00 ` [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace Christoph Lameter
2021-07-02 11:52 ` Marcelo Tosatti
2021-07-02 11:59 ` Marcelo Tosatti
2021-07-05 14:26 ` Christoph Lameter
2021-07-05 14:45 ` Marcelo Tosatti [this message]
2021-07-02 12:30 ` Frederic Weisbecker
2021-07-02 15:28 ` Marcelo Tosatti
2021-07-06 13:09 ` Frederic Weisbecker
2021-07-06 14:05 ` Marcelo Tosatti
2021-07-06 14:09 ` Marcelo Tosatti
2021-07-06 14:17 ` Marcelo Tosatti
2021-07-06 16:15 ` Peter Zijlstra
2021-07-06 16:53 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210705144542.GA27275@fuller.cnet \
--to=mtosatti@redhat.com \
--cc=cl@gentwo.de \
--cc=frederic@kernel.org \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nilal@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.