netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Wei Wang <weiwan@google.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stranche@codeaurora.org, David Ahern <dsahern@gmail.com>,
	Martin KaFai Lau <kafai@fb.com>,
	Mahesh Bandewar <maheshb@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Subject: Re: Refcount mismatch when unregistering netdevice from kernel
Date: Tue, 8 Dec 2020 10:09:39 -0800	[thread overview]
Message-ID: <CAEA6p_D+diS7jnpoGk6cncWL8qiAGod2EAp=Vcnc-zWNPg04Jg@mail.gmail.com> (raw)
In-Reply-To: <56e72b72-685f-925d-db2d-d245c1557987@gmail.com>

Hi Sean,

Thanks for reporting the issue. Like Eric said, if you could provide
the kernel version you are using, that would be great. And if you have
a reproducer, that would be even better.
I have a few comments inline below.

On Tue, Dec 8, 2020 at 7:08 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 12/8/20 4:55 AM, stranche@codeaurora.org wrote:
> > Hi everyone,
> >
> > We've recently been investigating a refcount problem when unregistering a netdevice from the kernel. It seems that the IPv6 module is still holding various references to the inet6_dev associated with the main netdevice struct that are not being released, preventing the unregistration from completing.
> >
> > After tracing the various locations that take/release refcounts to these two structs, we see that there are mismatches in the refcount for the inet6_dev in the DST path when there are routes flushed with the rt6_uncached_list_flush_dev() function during rt6_disable_ip() when the device is unregistering. It seems that usually these references are freed via ip6_dst_ifdown() in the dst_ops->ifdown callback from dst_dev_put(), but this callback is not executed in the rt6_uncached_list_flush_dev() function. Instead, a reference to the inet6_dev is simply dropped to account for the inet6_dev_get() in ip6_rt_copy_init().
> >

rt6_uncached_list_flush_dev() actually tries to replace the inet6_dev
with loopback_dev, and release the reference to the previous inet6_dev
by calling in6_dev_put(), which is actually doing the same thing as
ip6_dst_ifdown(). I don't understand why you say " a reference to the
inet6_dev is simply dropped".

> > Unfortunately, this does not seem to be sufficient, as these uncached routes have an additional refcount on the inet6_device attached to them from the DST allocation. In the normal case, this reference from the DST allocation will happen in the dst_ops->destroy() callback in the dst_destroy() function when the DST is being freed. However, since rt6_uncached_list_flush_dev() changes the inet6_device stored in the DST to the loopback device, the dst_ops->destroy() callback doesn't decrement the refcount on the correct inet6_dev struct.
> >

The additional refcount to the DST is also released by doing the following:
                        if (rt_dev == dev) {
                                rt->dst.dev = blackhole_netdev;
                                dev_hold(rt->dst.dev);
                                dev_put(rt_dev);
                        }
Am I missing something?


> > We're wondering if it would be appropriate to put() the refcount additionally for the uncached routes when flushing out the list for the unregistering device. Perhaps something like the following?
> >
>
> dev refcount imbalances are quite common, particularly on old kernel versions, lacking few fixes.
>
> First thing is to let us know on which kernel version you see this, and how you reproduce it.
>

  reply	other threads:[~2020-12-08 18:10 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-08  3:55 Refcount mismatch when unregistering netdevice from kernel stranche
2020-12-08 15:08 ` Eric Dumazet
2020-12-08 18:09   ` Wei Wang [this message]
2020-12-08 19:12     ` stranche
2020-12-08 21:51       ` Wei Wang
2020-12-09  0:03         ` David Ahern
2020-12-11  1:12           ` stranche
2020-12-11 16:10             ` David Ahern
2021-01-05  3:05               ` stranche
2021-01-05  4:58                 ` David Ahern
2021-01-05 19:09                   ` Wei Wang
2021-02-11 19:21                     ` Alexei Starovoitov
2021-02-12  1:28                       ` Jakub Kicinski
2021-02-12  1:44                         ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEA6p_D+diS7jnpoGk6cncWL8qiAGod2EAp=Vcnc-zWNPg04Jg@mail.gmail.com' \
    --to=weiwan@google.com \
    --cc=dsahern@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kafai@fb.com \
    --cc=kuba@kernel.org \
    --cc=maheshb@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=stranche@codeaurora.org \
    --cc=subashab@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).