From: Frederic Weisbecker <frederic@kernel.org>
To: Aaron Tomlin <atomlin@atomlin.com>
Cc: Aaron Tomlin <atomlin@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Christoph Lameter <cl@linux.com>,
	tglx@linutronix.de, mingo@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Phil Auld <pauld@redhat.com>
Subject: Re: [RFC PATCH] tick/sched: Ensure quiet_vmstat() is called when the idle tick was stopped too
Date: Thu, 17 Feb 2022 17:32:05 +0100
Message-ID: <20220217163205.GA748087@lothringen>
In-Reply-To: <20220217142615.xqtiydixvnumyvei@ava.usersys.com>

On Thu, Feb 17, 2022 at 02:26:15PM +0000, Aaron Tomlin wrote:
> On Thu 2022-02-17 13:47 +0100, Frederic Weisbecker wrote:
> > So, to make sure I understand, the issue is that with nohz_full, we may
> > well enter into the idle loop with the tick already stopped. We may also
> > exit from idle without restarting the tick (again only with nohz_full). And
> > so this can cause the vmstat to not be flushed upon idle entry. Right?
> 
> Hi Frederic,
> 
> Yes - this is exactly it.
> 
> > > A customer provided some evidence which indicates that the idle tick was
> > > stopped; albeit, CPU-specific vmstat counters still remained populated.
> > > Thus one can only assume quiet_vmstat() was not invoked on return to the
> > > idle loop.
> > > 
> > > Unfortunately, I suspect this divergence might erroneously prevent a
> > > reclaim attempt by kswapd. If the number of zone specific free pages are
> > > below their per-cpu drift value then zone_page_state_snapshot() is used to
> > > compute a more accurate view of the aforementioned statistic.
> > > Thus any task blocked on the NUMA node specific pfmemalloc_wait queue will
> > > be unable to make significant progress via direct reclaim unless it is
> > > killed after being woken up by kswapd (see throttle_direct_reclaim()).
> > > That being said, eventually reclaim should give up if the conditions are
> > > correct, no?
> 
> > Now if quiet_vmstat() isn't called, the vmstat_work should fix this later,
> > right? Or does that happen too late perhaps?
> 
> If I understand correctly, in the context of nohz_full, since such work is
> deferred, it will only be handled in a scenario when the periodic/or
> scheduling-clock tick is enabled i.e. the timer was reprogrammed on exit
> from idle.

Oh I see, it's a deferrable delayed work...
Then I can see two other issues:

1) Can an interrupt in idle modify the vmstat and thus trigger the need to
   flush it? I believe that's the case, and then the problem goes beyond
   nohz_full: if the idle interrupt fires while the tick is stopped and doesn't
   set TIF_NEED_RESCHED, we go back to sleep without calling quiet_vmstat().

2) What if we are running task A in kernel mode while the tick is stopped
   (nohz_full)? Task A modifies the vmstat and then goes to userspace for a
   long while.

Your patch fixes case 1) but not case 2). The problem is that TIMER_DEFERRABLE
should really be about dynticks-idle only and not dynticks-full. I've always
been afraid of enforcing that rule though, because that would break old
noise-free setups. But perhaps I should...
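
To make case 1) and case 2) concrete, here is a rough userspace sketch of one
CPU. It is a toy model, not kernel code: every name in it (cpu_model,
model_idle_enter(), the "patched" flag, ...) is made up for illustration. It
assumes that the flush on idle entry only happens on the running-to-stopped
tick transition, that the patch adds a flush when the tick was already
stopped, and that a deferrable work item simply never runs while the tick is
stopped. In the real kernel the work is queued differently, so treat the
queueing here as a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU. All names are hypothetical, not kernel APIs. */
struct cpu_model {
	bool tick_stopped;   /* the tick is currently stopped */
	int  vmstat_diff;    /* pending per-CPU delta, not folded yet */
	bool deferred_work;  /* a deferrable vmstat work item is pending */
};

/* Stand-in for quiet_vmstat(): fold the delta, drop the pending work. */
static void model_quiet_vmstat(struct cpu_model *c)
{
	c->vmstat_diff = 0;
	c->deferred_work = false;
}

/* Kernel-mode code dirties a counter (simplified: also queues the work). */
static void model_touch_vmstat(struct cpu_model *c)
{
	c->vmstat_diff++;
	c->deferred_work = true;
}

/* Assumption: deferrable work only gets to run while the tick is running. */
static void model_run_deferrable(struct cpu_model *c)
{
	if (c->deferred_work && !c->tick_stopped)
		model_quiet_vmstat(c);
}

/* Idle entry: flush on the running->stopped transition; with the patch,
 * also flush when the tick was already stopped. */
static void model_idle_enter(struct cpu_model *c, bool patched)
{
	if (!c->tick_stopped) {
		model_quiet_vmstat(c);
		c->tick_stopped = true;
	} else if (patched) {
		model_quiet_vmstat(c);
	}
}

/* Case 1): an interrupt in idle dirties vmstat, no reschedule is needed. */
static int case1_idle_interrupt(bool patched)
{
	struct cpu_model c = { .tick_stopped = false };

	model_idle_enter(&c, patched);  /* first entry: flush, stop the tick */
	model_touch_vmstat(&c);         /* idle interrupt modifies a counter */
	model_run_deferrable(&c);       /* tick stopped: work does not run */
	model_idle_enter(&c, patched);  /* back to sleep */
	return c.vmstat_diff;
}

/* Case 2): nohz_full task dirties vmstat, then stays in userspace. */
static int case2_nohz_full_task(bool patched)
{
	struct cpu_model c = { .tick_stopped = true };

	(void)patched;                  /* idle is never entered here */
	model_touch_vmstat(&c);         /* task A in kernel mode */
	model_run_deferrable(&c);       /* tick stopped: work does not run */
	return c.vmstat_diff;           /* task A is now in userspace */
}

/* Checks the outcomes described in the text, under the model above. */
void model_check(void)
{
	assert(case1_idle_interrupt(false) == 1); /* stale without the patch */
	assert(case1_idle_interrupt(true) == 0);  /* the patch covers case 1) */
	assert(case2_nohz_full_task(false) == 1); /* stale either way */
	assert(case2_nohz_full_task(true) == 1);
}
```

In this model the idle-entry change only helps when the CPU actually passes
through idle entry again, which is why case 2) stays dirty regardless of the
"patched" flag.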

