From: Marcelo Tosatti <mtosatti@redhat.com>
To: Aaron Tomlin <atomlin@atomlin.com>
Cc: Frederic Weisbecker <frederic@kernel.org>,
Christoph Lameter <cl@linux.com>,
tglx@linutronix.de, mingo@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Phil Auld <pauld@redhat.com>
Subject: Re: [RFC PATCH] tick/sched: Ensure quiet_vmstat() is called when the idle tick was stopped too
Date: Thu, 24 Feb 2022 09:37:44 -0300 [thread overview]
Message-ID: <Yhd8GONfqYpIdl51@fuller.cnet> (raw)
In-Reply-To: <Yhd5olg9CjXSAf2s@fuller.cnet>
On Thu, Feb 24, 2022 at 09:27:14AM -0300, Marcelo Tosatti wrote:
> On Sat, Feb 19, 2022 at 03:46:16PM +0000, Aaron Tomlin wrote:
> > On Fri 2022-02-18 12:54 +0000, Aaron Tomlin wrote:
> > > On Thu 2022-02-17 17:32 +0100, Frederic Weisbecker wrote:
> > > > > If I understand correctly, in the context of nohz_full, since such work is
> > > > > deferred, it will only be handled in a scenario when the periodic/or
> > > > > scheduling-clock tick is enabled i.e. the timer was reprogrammed on exit
> > > > > from idle.
> > > >
> > > > Oh I see, it's a deferrable delayed work...
> > > > Then I can see two other issues:
> > > >
> > > > 1) Can an interrupt in idle modify the vmstat and thus trigger the need to
> > > > flush it?
>
> Yes. Page allocation and page freeing for example.
>
> 6 3730 ../mm/page_alloc.c <<rmqueue>>
> __mod_zone_freepage_state(zone, -(1 << order),
> 4 1096 ../mm/page_alloc.c <<__free_one_page>
> __mod_zone_freepage_state(zone, -(1 << order),
>
> > > > I believe it's the case and then the problem goes beyond nohz_full
> > > > because if the idle interrupt fired while the tick is stopped and didn't set
> > > > TIF_RESCHED, we go back to sleep without calling quiet_vmstat().
> > >
> > > Yes: e.g. a nohz_full CPU, in idle code, could indeed receive a reschedule
> > > IPI; re-enable local IRQs and generic idle code sees the TIF_NEED_RESCHED
> > > flag against the idle task. Additionally, the selected task could
> > > indirectly released a few pages [to satisfy a low-memory condition] and
> > > modify CPU-specific vmstat data i.e. vm_stat_diff[NR_FREE_PAGES].
> > >
> > >
> > > > 2) What if we are running task A in kernel mode while the tick is stopped
> > > > (nohz_full). Task A modifies the vmstat and goes to userspace for a long
> > > > while.
> > > > Your patch fixes case 1) but not case 2). The problem is that TIMER_DEFERRABLE
> > > > should really be about dynticks-idle only and not dynticks-full. I've always
> > > > been afraid about enforcing that rule though because that would break old
> > > > noise-free setups. But perhaps I should...
> > >
> > > If I understand correctly, I agree. For the latter case, nothing can be
> > > done unfortunately since the scheduling-clock tick is stopped.
> >
> > Hi Frederic,
> >
> > As far vmstat_updateas I understand, in the context of nohz_full, options are indeed
> > limited; albeit, if we can ensure CPU-specific vmstat data is folded on
> > return to idle [when required] then this should be good enough.
>
> I suppose the desired behaviour, with the deferred timer for vmstat_sync, is:
>
> "Allow the per-CPU vmstats to be out of sync, but for a maximum of
> sysctl_stat_interval".
>
> But Aaron, vmstat_shepherd should be ensuring that per-CPU vmstat_update
> work are queued, if the per-CPU vmstat are out of sync.
>
> And:
>
> static void
> trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
> {
> if (!is_timers_nohz_active())
> return;
>
> /*
> * TODO: This wants some optimizing similar to the code below, but we
> * will do that when we switch from push to pull for deferrable timers.
> */
> if (timer->flags & TIMER_DEFERRABLE) {
> if (tick_nohz_full_cpu(base->cpu))
> wake_up_nohz_cpu(base->cpu);
> return;
> }
>
> * @TIMER_DEFERRABLE: A deferrable timer will work normally when the
> * system is busy, but will not cause a CPU to come out of idle just
> * to service it; instead, the timer will be serviced when the CPU
> * eventually wakes up with a subsequent non-deferrable timer.
>
> You'd want that vmstat_update to execute regardless of whether there are
> armed non-deferrable timers.
>
> Should fix both 1 and 2 AFAICS.
Maybe just switching to a non-deferrable timer does not increase the
frequency of vmstat_update calls so much ? It should happen once per
second anyway.
Then the "vmstats out of sync but for a maximum of sysctl_stat_interval"
would be respected, rather than existance of non-deferrable timers.
next prev parent reply other threads:[~2022-02-24 12:38 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-03 21:43 [RFC PATCH] tick/sched: Ensure quiet_vmstat() is called when the idle tick was stopped too Aaron Tomlin
2022-02-03 22:22 ` Phil Auld
2022-02-16 14:34 ` Aaron Tomlin
2022-02-16 21:20 ` Phil Auld
2022-02-17 12:57 ` Frederic Weisbecker
2022-02-17 14:45 ` Aaron Tomlin
2022-02-17 12:47 ` Frederic Weisbecker
2022-02-17 14:26 ` Aaron Tomlin
2022-02-17 16:32 ` Frederic Weisbecker
2022-02-18 12:54 ` Aaron Tomlin
2022-02-19 15:46 ` Aaron Tomlin
2022-02-24 12:27 ` Marcelo Tosatti
2022-02-24 12:30 ` Marcelo Tosatti
2022-02-24 13:01 ` Aaron Tomlin
2022-02-24 12:37 ` Marcelo Tosatti [this message]
2022-02-24 13:00 ` Aaron Tomlin
2022-02-24 13:14 ` Marcelo Tosatti
2022-02-24 13:28 ` Aaron Tomlin
2022-02-24 13:40 ` Marcelo Tosatti
2022-02-24 13:44 ` Aaron Tomlin
2022-03-31 14:33 ` Aaron Tomlin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yhd8GONfqYpIdl51@fuller.cnet \
--to=mtosatti@redhat.com \
--cc=atomlin@atomlin.com \
--cc=cl@linux.com \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@kernel.org \
--cc=pauld@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.