Re: [For Stable] mm: memcontrol: fix excessive complexity in memory.stat reporting


From: Greg KH <gregkh@linuxfoundation.org>
To: Vaibhav Rustagi <vaibhavrustagi@google.com>
Cc: stable@vger.kernel.org, hannes@cmpxchg.org, tj@kernel.org,
	mhocko@suse.com, vdavydov.dev@gmail.com, guro@fb.com,
	riel@surriel.com, sfr@canb.auug.org.au,
	akpm@linux-foundation.org, torvalds@linux-foundation.org,
	Aditya Kali <adityakali@google.com>
Subject: Re: [For Stable] mm: memcontrol: fix excessive complexity in memory.stat reporting
Date: Wed, 1 May 2019 09:08:27 +0200
Message-ID: <20190501070827.GB30616@kroah.com>
In-Reply-To: <CAMVonLiXfX8r=1-fwQCk275wrkBmxjXuyWJSAmW=7hjvy7YPyg@mail.gmail.com>

On Tue, Apr 30, 2019 at 01:41:16PM -0700, Vaibhav Rustagi wrote:
> On Wed, Apr 24, 2019 at 11:53 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> >
> > A: Because it messes up the order in which people normally read text.
> > Q: Why is top-posting such a bad thing?
> > A: Top-posting.
> > Q: What is the most annoying thing in e-mail?
> >
> > A: No.
> > Q: Should I include quotations after my reply?
> >
> > http://daringfireball.net/2007/07/on_top
> >
> > On Wed, Apr 24, 2019 at 10:35:51AM -0700, Vaibhav Rustagi wrote:
> > > Apologies for sending a non-plain text e-mail previously.
> > >
> > > This issue is encountered in the actual production environment by our
> > > customers where they are constantly creating containers
> > > and tearing them down (using kubernetes for the workload).  Kubernetes
> > > constantly reads the memory.stat file for accounting memory
> > > information and over time (around a week) the memcgs accumulate,
> > > the response time for reading memory.stat increases, and
> > > customer applications are affected.
> >
> > Please define "affected".  Their apps still run properly, so all should
> > be fine, it would be kubernetes that sees the slowdowns, not the
> > application.  How exactly does this show up to an end-user?
> >
> 
> Over time, as the zombie cgroups accumulate, kubelet (the process
> that frequently reads memory.stat) becomes more CPU intensive and
> all other user containers running on the same machine starve for
> CPU. It affects the user containers in at least 2 ways that we know
> of: (1) Users experience liveness probe failures when their
> applications do not complete in the expected amount of time.

"expected amount of time" is interesting to claim in a shared
environment :)

> (2) new user jobs cannot be scheduled,

Really?  This slows down starting new processes?  Or is this just
slowing down your system overall?

> There certainly is a possibility of reducing the adverse effect at
> the Kubernetes level, and we are investigating that as well. But
> the requested kernel patches help in not exacerbating the problem.

I understand this is a kernel issue, but if you see this happen, just
updating to a modern kernel should be fine.

> > > The repro steps mentioned previously were just used for testing the
> > > patches locally.
> > >
> > > Yes, we are moving to 4.19 but are also supporting 4.14 till Jan 2020
> > > (so production environment will still contain 4.14 kernel)
> >
> > If you are already moving to 4.19, this seems like as good a reason as
> > any (hint, I can give you more) to move off of 4.14 at this point in
> > time.  There's no real need to keep 4.14 around, given that you don't
> > have any out-of-tree code in your kernels, so all should be simple to
> > just update on the next reboot, right?
> >
> 
> Based on past experience, a major kernel upgrade sometimes
> introduces new regressions as well. So while we are working to roll
> out kernel 4.19, it may not be a practical solution for all the users.

If you are not doing the exact same testing scenario for a new 4.14.y
kernel release as you are doing for a move to 4.19.y, then your "roll
out" process is broken.

Given that 4.19.y is now 6 months old, I would have expected any "new
regressions" to have already been reported.  Please just use a new
kernel, and if you have regressions, we will work to address them.

thanks,

greg k-h

Thread overview: 8+ messages
2019-04-02  5:35 [For Stable] mm: memcontrol: fix excessive complexity in memory.stat reporting Vaibhav Rustagi
2019-04-24 16:50 ` Greg KH
2019-04-24 17:35   ` Vaibhav Rustagi
2019-04-24 18:34     ` Greg KH
2019-04-30 20:41       ` Vaibhav Rustagi
2019-05-01  7:08         ` Greg KH [this message]
  -- strict thread matches above, loose matches on Subject: below --
2019-04-01 20:34 Vaibhav Rustagi
2019-04-02  5:24 ` Greg KH
