All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Aaron Tomlin <atomlin@atomlin.com>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	tglx@linutronix.de, mingo@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Phil Auld <pauld@redhat.com>
Subject: Re: [RFC PATCH] tick/sched: Ensure quiet_vmstat() is called when the idle tick was stopped too
Date: Thu, 24 Feb 2022 09:27:14 -0300	[thread overview]
Message-ID: <Yhd5olg9CjXSAf2s@fuller.cnet> (raw)
In-Reply-To: <20220219154616.pwsvh445x3vn7ltf@ava.usersys.com>

On Sat, Feb 19, 2022 at 03:46:16PM +0000, Aaron Tomlin wrote:
> On Fri 2022-02-18 12:54 +0000, Aaron Tomlin wrote:
> > On Thu 2022-02-17 17:32 +0100, Frederic Weisbecker wrote:
> > > > If I understand correctly, in the context of nohz_full, since such work is
> > > > deferred, it will only be handled in a scenario when the periodic/or
> > > > scheduling-clock tick is enabled i.e. the timer was reprogrammed on exit
> > > > from idle.
> > > 
> > > Oh I see, it's a deferrable delayed work...
> > > Then I can see two other issues:
> > > 
> > > 1) Can an interrupt in idle modify the vmstat and thus trigger the need to
> > >    flush it? 

Yes. Page allocation and page freeing for example.

   6   3730  ../mm/page_alloc.c <<rmqueue>>
             __mod_zone_freepage_state(zone, -(1 << order),
   4   1096  ../mm/page_alloc.c <<__free_one_page>>
             __mod_zone_freepage_state(zone, -(1 << order),

> > >    I believe it's the case and then the problem goes beyond nohz_full
> > >    because if the idle interrupt fired while the tick is stopped and didn't set
> > >    TIF_RESCHED, we go back to sleep without calling quiet_vmstat().
> > 
> > Yes: e.g. a nohz_full CPU, in idle code, could indeed receive a reschedule
> > IPI; re-enable local IRQs and generic idle code sees the TIF_NEED_RESCHED
> > flag against the idle task. Additionally, the selected task could
> > indirectly released a few pages [to satisfy a low-memory condition] and
> > modify CPU-specific vmstat data i.e. vm_stat_diff[NR_FREE_PAGES].
> > 
> > 
> > > 2) What if we are running task A in kernel mode while the tick is stopped
> > >    (nohz_full). Task A modifies the vmstat and goes to userspace for a long
> > >    while.
> > > Your patch fixes case 1) but not case 2). The problem is that TIMER_DEFERRABLE
> > > should really be about dynticks-idle only and not dynticks-full. I've always
> > > been afraid about enforcing that rule though because that would break old
> > > noise-free setups. But perhaps I should...
> > 
> > If I understand correctly, I agree. For the latter case, nothing can be
> > done unfortunately since the scheduling-clock tick is stopped.
> 
> Hi Frederic,
> 
> As far vmstat_updateas I understand, in the context of nohz_full, options are indeed
> limited; albeit, if we can ensure CPU-specific vmstat data is folded on
> return to idle [when required] then this should be good enough.

I suppose the desired behaviour, with the deferred timer for vmstat_sync, is:

"Allow the per-CPU vmstats to be out of sync, but for a maximum of 
sysctl_stat_interval".

But Aaron, vmstat_shepherd should be ensuring that per-CPU vmstat_update
work are queued, if the per-CPU vmstat are out of sync.

And:

static void
trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
{
        if (!is_timers_nohz_active())
                return;

        /*
         * TODO: This wants some optimizing similar to the code below, but we
         * will do that when we switch from push to pull for deferrable timers.
         */
        if (timer->flags & TIMER_DEFERRABLE) {
                if (tick_nohz_full_cpu(base->cpu))
                        wake_up_nohz_cpu(base->cpu);
                return;
        }

 * @TIMER_DEFERRABLE: A deferrable timer will work normally when the
 * system is busy, but will not cause a CPU to come out of idle just
 * to service it; instead, the timer will be serviced when the CPU
 * eventually wakes up with a subsequent non-deferrable timer.

You'd want that vmstat_update to execute regardless of whether there are 
armed non-deferrable timers.

Should fix both 1 and 2 AFAICS.



  reply	other threads:[~2022-02-24 12:27 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-03 21:43 [RFC PATCH] tick/sched: Ensure quiet_vmstat() is called when the idle tick was stopped too Aaron Tomlin
2022-02-03 22:22 ` Phil Auld
2022-02-16 14:34   ` Aaron Tomlin
2022-02-16 21:20     ` Phil Auld
2022-02-17 12:57     ` Frederic Weisbecker
2022-02-17 14:45       ` Aaron Tomlin
2022-02-17 12:47 ` Frederic Weisbecker
2022-02-17 14:26   ` Aaron Tomlin
2022-02-17 16:32     ` Frederic Weisbecker
2022-02-18 12:54       ` Aaron Tomlin
2022-02-19 15:46         ` Aaron Tomlin
2022-02-24 12:27           ` Marcelo Tosatti [this message]
2022-02-24 12:30             ` Marcelo Tosatti
2022-02-24 13:01               ` Aaron Tomlin
2022-02-24 12:37             ` Marcelo Tosatti
2022-02-24 13:00             ` Aaron Tomlin
2022-02-24 13:14               ` Marcelo Tosatti
2022-02-24 13:28                 ` Aaron Tomlin
2022-02-24 13:40                   ` Marcelo Tosatti
2022-02-24 13:44                     ` Aaron Tomlin
2022-03-31 14:33       ` Aaron Tomlin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yhd5olg9CjXSAf2s@fuller.cnet \
    --to=mtosatti@redhat.com \
    --cc=atomlin@atomlin.com \
    --cc=cl@linux.com \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@kernel.org \
    --cc=pauld@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.