From: Tony Lu <tonylu@linux.alibaba.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	shemminger@osdl.org,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] net: remove static inline from dev_put/dev_hold
Date: Tue, 12 Nov 2019 16:47:14 +0800
Message-ID: <20191112084714.GC67139@TonyMac-Alibaba>
In-Reply-To: <CAM_iQpUaPsFHrDmd7fLjWZLbbo8j1uD6opuT+zKqPTVuQPKniA@mail.gmail.com>

On Mon, Nov 11, 2019 at 01:26:13PM -0800, Cong Wang wrote:
> On Mon, Nov 11, 2019 at 6:12 AM Tony Lu <tonylu@linux.alibaba.com> wrote:
> >
> > This patch removes static inline from dev_put/dev_hold in order to help
> > trace the pcpu_refcnt leak of net_device.
> >
> > We have hit this kind of issue several times while moving NICs
> > between different net namespaces. It prints this log in dmesg:
> >
> >   unregister_netdevice: waiting for eth0 to become free. Usage count = 1
> 
> I debugged a nasty dst refcnt leak in TCP a long time ago, so I can
> feel your pain.
> 
> 
> >
> > However, it is hard to find out in time who took the reference and
> > leaked it. The leak leaves a crime scene behind but little evidence.
> > Once leaked, it is not safe to fix up on the running host. We can't
> > trace dev_put/dev_hold directly, because the functions are inlined and
> > used widely among modules. This issue is also common: there are tens of
> > patches fixing net_device refcnt leaks with various causes.
> >
> > To trace refcnt manipulation, this patch removes static inline from
> > dev_put/dev_hold. We can then use handy tools, such as eBPF with kprobe,
> > to find out who holds a refcnt but forgets to put it. These functions
> > are not called frequently, so the overhead is limited.
> 
> I think tracepoint serves the purpose of tracking function call history,
> you can add tracepoint for each of dev_put()/dev_hold(), which could
> also inherit the trace filter and trigger too.

Thanks for your advice. I have already prepared a patch set that adds a
pair of tracepoints for dev_hold()/dev_put() as an alternative solution.
My original intent was to offer a flexible approach so that people could
choose. I will send it out later.
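
To make the idea concrete, here is a rough userspace sketch of what tracing
hold/put pairs buys us. It is not kernel code and not the patch set itself:
toy_dev, toy_dev_hold(), toy_dev_put(), toy_dev_report() and toy_demo() are
all hypothetical names, and the "site" string stands in for the caller
information a kprobe or tracepoint would record.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SITES 16

/* Per-call-site balance: +1 for every hold, -1 for every put. */
struct site_count {
	const char *site;
	long balance;
};

/* Toy stand-in for a net_device; only the refcnt bookkeeping is modeled. */
struct toy_dev {
	const char *name;
	long refcnt;
	struct site_count sites[MAX_SITES];
	int nsites;
};

/* Find the record for a call site, creating it on first use. */
static struct site_count *site_lookup(struct toy_dev *dev, const char *site)
{
	for (int i = 0; i < dev->nsites; i++)
		if (strcmp(dev->sites[i].site, site) == 0)
			return &dev->sites[i];
	if (dev->nsites == MAX_SITES)
		return NULL;	/* table full: this site goes untracked */
	dev->sites[dev->nsites].site = site;
	dev->sites[dev->nsites].balance = 0;
	return &dev->sites[dev->nsites++];
}

/* Take a reference and remember who took it. */
static void toy_dev_hold(struct toy_dev *dev, const char *site)
{
	struct site_count *sc = site_lookup(dev, site);

	dev->refcnt++;
	if (sc)
		sc->balance++;
}

/* Drop a reference and credit it to the same call site. */
static void toy_dev_put(struct toy_dev *dev, const char *site)
{
	struct site_count *sc = site_lookup(dev, site);

	assert(dev->refcnt > 0);
	dev->refcnt--;
	if (sc)
		sc->balance--;
}

/* Print every site whose holds and puts do not balance; return how many. */
static int toy_dev_report(const struct toy_dev *dev)
{
	int leaked = 0;

	for (int i = 0; i < dev->nsites; i++) {
		if (dev->sites[i].balance != 0) {
			printf("%s: site %s balance=%ld\n", dev->name,
			       dev->sites[i].site, dev->sites[i].balance);
			leaked++;
		}
	}
	return leaked;
}

/* Demo: one site forgets a put, another one is balanced. */
static int toy_demo(void)
{
	struct toy_dev dev = { .name = "eth0" };

	toy_dev_hold(&dev, "ipv4_dst");
	toy_dev_hold(&dev, "ipv4_dst");
	toy_dev_put(&dev, "ipv4_dst");	/* second put is missing */
	toy_dev_hold(&dev, "neigh");
	toy_dev_put(&dev, "neigh");

	printf("%s: usage count = %ld\n", dev.name, dev.refcnt);
	return toy_dev_report(&dev);
}
```

Calling toy_demo() from main() should single out the one unbalanced site.
With real tracing the same pairing would be done on the recorded events,
keyed by caller address or stack rather than by a string.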

> 
> The netdev refcnt itself is not changed very frequently, but it is
> also held by other objects such as dst, whose own refcnt changes
> frequently. This is why, when you see the netdev refcnt leak warning,
> the problem is usually somewhere else, like a dst refcnt leak.

We have also suffered from dst refcnt leaks before. They are really hard
to investigate. I will look into this area carefully.

> 
> Hope this helps.
> 
> Thanks.


Thanks.
Tony Lu


Thread overview: 7+ messages
2019-11-11 14:05 [PATCH] net: remove static inline from dev_put/dev_hold Tony Lu
2019-11-11 16:56 ` Stephen Hemminger
2019-11-12  7:18   ` Tony Lu
2019-11-11 17:21 ` Eric Dumazet
2019-11-12  9:48   ` Tony Lu
2019-11-11 21:26 ` Cong Wang
2019-11-12  8:47   ` Tony Lu [this message]
