public inbox for linux-nfs@vger.kernel.org
From: Carlos Carvalho <carlos@fisica.ufpr.br>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Subject: Re: massive memory leak in 3.1[3-5] with nfs4+kerberos
Date: Tue, 28 Oct 2014 12:14:28 -0200	[thread overview]
Message-ID: <20141028141428.GA17735@fisica.ufpr.br> (raw)
In-Reply-To: <20141014204245.GB15960@fieldses.org>

J. Bruce Fields (bfields@fieldses.org) wrote on Tue, Oct 14, 2014 at 05:42:45PM BRT:
> On Mon, Oct 13, 2014 at 08:50:27PM -0300, Carlos Carvalho wrote:
> > J. Bruce Fields (bfields@fieldses.org) wrote on Mon, Oct 13, 2014 at 10:58:40AM BRT:
> > Note the big xprt_alloc. slabinfo is found in the kernel tree at tools/vm.
> > Another way to see it:
> > 
> > urquell# sort -n /sys/kernel/slab/kmalloc-2048/alloc_calls | tail -n 2
> >    1519 nfsd4_create_session+0x24a/0x810 age=189221/25894524/71426273 pid=5372-5436 cpus=0-11,13-16,19-20 nodes=0-1
> > 3380755 xprt_alloc+0x1e/0x190 age=5/27767270/71441075 pid=6-32599 cpus=0-31 nodes=0-1
> 
> Agreed that the xprt_alloc is suspicious, though I don't really
> understand these statistics.
> 
> Since you have 4.1 clients, maybe this would be explained by a leak in
> the backchannel code.

We've set clients to use 4.0 and it only made the problem worse; the growth in
unreclaimable memory was faster.

> It could certainly still be worth testing 3.17 if possible.

We tested it and it SEEMS the problem doesn't appear in 3.17.1; the SUnreclaim
value oscillates up and down as usual, instead of increasing monotonically.
However it didn't last long enough for us to get conclusive numbers because
after about 5-6h the machine fills the screen with "NMI watchdog CPU #... is
locked for more than 22s". It spits these messages for many cores at once, and
becomes unresponsive; we have to reboot it from the console with Alt+SysRq.
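
For anyone who wants to reproduce the SUnreclaim comparison, here is a minimal sketch of how the value could be sampled and checked for monotonic growth. It assumes the usual "SUnreclaim: <n> kB" line in /proc/meminfo; the function names are made up for illustration, and this is not the exact procedure we used on the server.

```python
import re
import time
from typing import List


def parse_sunreclaim_kb(meminfo_text: str) -> int:
    """Extract the SUnreclaim value (in kB) from the text of /proc/meminfo."""
    match = re.search(r"^SUnreclaim:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no SUnreclaim line found")
    return int(match.group(1))


def grows_monotonically(samples: List[int]) -> bool:
    """True if no sample is lower than the previous one and the last exceeds the first."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]


def collect_samples(count: int, interval_s: float, path: str = "/proc/meminfo") -> List[int]:
    """Read SUnreclaim `count` times, sleeping `interval_s` seconds between reads."""
    samples = []
    for i in range(count):
        with open(path) as f:
            samples.append(parse_sunreclaim_kb(f.read()))
        if i + 1 < count:
            time.sleep(interval_s)
    return samples
```

Calling grows_monotonically(collect_samples(60, 60.0)) would cover about an hour. A True result matches the leak pattern we saw on 3.1[3-5]; a False result matches the up-and-down oscillation on 3.17.1, though a window that short is only a hint, not proof.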

Do these 2 new pieces of info give a clue?
