From: "Anders K. Pedersen | Cohaesio" <akp@cohaesio.com>
To: "alexander.duyck@gmail.com" <alexander.duyck@gmail.com>
Cc: "pstaszewski@itcare.pl" <pstaszewski@itcare.pl>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"pavlos.parissis@gmail.com" <pavlos.parissis@gmail.com>,
"intel-wired-lan@lists.osuosl.org"
<intel-wired-lan@lists.osuosl.org>,
"alexander.h.duyck@intel.com" <alexander.h.duyck@intel.com>
Subject: Re: Linux 4.12+ memory leak on router with i40e NICs
Date: Sun, 22 Oct 2017 13:56:40 +0000
Message-ID: <1508680600.4970.26.camel@cohaesio.com>
In-Reply-To: <CAKgT0Ud1ftJ187qz_XBVnx+X_1BQTgWWe4sA307_3VmpMSSVaw@mail.gmail.com>
On Thu, 2017-10-19 at 08:40 -0700, Alexander Duyck wrote:
> On Thu, Oct 19, 2017 at 5:19 AM, Anders K. Pedersen | Cohaesio
> <akp@cohaesio.com> wrote:
> > Hi Alex,
> >
> > On Wed, 2017-10-18 at 16:37 -0700, Alexander Duyck wrote:
> > > When we last talked I had asked if you could do a git bisect to find
> > > the memory leak and you said you would look into it. The most useful
> > > way to solve this would be to do a git bisect between your current
> > > kernel and the 4.11 kernel to find the point at which this started. If
> > > we can do that then fixing this becomes much simpler as we just have
> > > to fix the patch that introduced the issue.
> >
> > We're also seeing a smaller memory leak (about 1 GB per day) than the
> > original one even with the "Fix memory leak related filter programming
> > status" fix applied. So far I've determined that the leak is present on
> > 4.13.7 and was introduced between 4.11 and 4.12, so I'll do another
> > round of bisection to identify the patch that introduced this.
> >
> > Since the router must run for a couple of hours before I can be sure
> > whether a kernel is good or bad, and I can't reboot it during working
> > hours, it'll probably be about a week before I have a result.
> >
> > --
> > Venlig hilsen / Best Regards
> >
> > Anders K. Pedersen
> > Senior Technical Manager
>
> Anders,
>
> I'll do some digging on my side to see if I can find any other memory
> leaks that might be floating around in the driver that could have been
> introduced during that time-frame.
>
> One thing you might try that would help with your testing would be to
> just disable the ATR functionality in i40e. You can do that with the
> ethtool command "ethtool --set-priv-flags <iface> flow-director-atr
> off". That should allow you to bisect this without needing to deal
> with the "programming status" patches since you won't be programming
> ATR filters which is what caused that leak.
>
> Thanks for looking into this.
>
> - Alex
Hi Alex,

I began bisecting, applying the known fix patches at each step where
they were applicable (i.e. without changing the flow-director-atr
flag). Some of the steps dropped a large number of packets, which
caused problems for our network, so I couldn't leave them running for
the several hours needed to determine whether the leak is present. The
part of the bisection I got through had the same outcome as the
previous bisection, which led to "i40e: Fix support for flow director
programming status".
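
For bisection steps that cannot be soak-tested (such as the ones with
heavy packet drops above), git bisect accepts "skip" in addition to
"good" and "bad". The following is only a rough sketch of how an
observation could be mapped to a verdict; the helper name
bisect_verdict and the observation labels are hypothetical and not
something used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: map what was observed on a test kernel to the
# verdict that would be passed to "git bisect".
bisect_verdict() {
    case "$1" in
        leak)   echo "bad"  ;;  # memory kept shrinking during the soak
        stable) echo "good" ;;  # no growth after several hours
        drops)  echo "skip" ;;  # could not soak-test this step at all
        *)      echo "unknown observation: $1" >&2; return 1 ;;
    esac
}

# Typical manual session in a kernel tree (not executed here):
#   git bisect start v4.12 v4.11        # bad revision first, then good
#   ... build, boot, let the router run for a few hours ...
#   git bisect "$(bisect_verdict drops)"
```

Skipped steps can leave the result as a range of candidate commits
rather than a single one.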

After that I experimented a bit with the flow-director-atr flag, and it
turns out that the memory leak is gone when I disable this flag on all
the NICs, so I suspected that the smaller memory leak was also caused
by "i40e: Fix support for flow director programming status".
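
As an illustration of "disable this flag on all the NICs", here is a
minimal sketch that applies the ethtool command quoted above to every
interface bound to the i40e driver. The function names, the SYSFS_NET
override, and the DRY_RUN switch are assumptions made for this example;
only the ethtool invocation itself comes from the thread.

```shell
#!/bin/sh
# List interfaces whose sysfs driver symlink points at i40e.
# SYSFS_NET can be overridden (assumption, for testing without hardware).
i40e_ifaces() {
    root="${SYSFS_NET:-/sys/class/net}"
    for dev in "$root"/*; do
        [ -L "$dev/device/driver" ] || continue
        drv=$(basename "$(readlink "$dev/device/driver")")
        if [ "$drv" = "i40e" ]; then
            basename "$dev"
        fi
    done
}

# Turn ATR off on each i40e interface; with DRY_RUN=1 only print the
# commands instead of running them.
disable_atr() {
    for iface in $(i40e_ifaces); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ethtool --set-priv-flags $iface flow-director-atr off"
        else
            ethtool --set-priv-flags "$iface" flow-director-atr off
        fi
    done
}
```

Running disable_atr with DRY_RUN=1 first shows which interfaces would
be touched before anything is changed.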

I tried reverting this patch on 4.13 (with a manual fixup for the
tracepoint that had been added later), but that brought back the packet
drops, so I couldn't let it run.

This morning I saw your "i40e: Add programming descriptors to
cleaned_count" patch, so I tried 4.13.9 with that patch and the earlier
"i40e: Fix memory leak related filter programming status" fix, without
turning off the flow-director-atr flag. So far this combination is
running stably without any memory leaks.
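
One possible way to put a number on a slow leak like the roughly
1 GB per day mentioned earlier in the thread is to sample available
memory twice and extrapolate. This is only a sketch: using MemAvailable
from /proc/meminfo as the metric is an assumption, not the measurement
method used here, and both function names are hypothetical.

```shell
#!/bin/sh
# Current MemAvailable in kB, read from /proc/meminfo.
mem_available_kb() {
    awk '/^MemAvailable:/ {print $2}' /proc/meminfo
}

# Memory lost per day in MB, given two samples in kB and the number of
# seconds between them. Integer arithmetic; a negative result means
# memory was gained rather than lost.
loss_mb_per_day() {
    start_kb=$1
    end_kb=$2
    seconds=$3
    echo $(( (start_kb - end_kb) * 86400 / seconds / 1024 ))
}

# Example soak, commented out because it takes two hours:
#   a=$(mem_available_kb); sleep 7200; b=$(mem_available_kb)
#   loss_mb_per_day "$a" "$b" 7200
```

Normal traffic and cache activity also move this counter, so a short
sampling window gives only a rough indication.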

Thanks for fixing this.

Regards,
Anders