From: Wengang <wen.gang.wang@oracle.com>
To: Jay Vosburgh <jay.vosburgh@canonical.com>,
Cong Wang <cwang@twopensource.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>, netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH] bonding: clear header_ops when last slave detached (v2)
Date: Mon, 24 Nov 2014 11:05:39 +0800
Message-ID: <5472A083.9020801@oracle.com>
In-Reply-To: <29850.1416596051@famine>
On 2014-11-22 02:54, Jay Vosburgh wrote:
> Cong Wang <cwang@twopensource.com> wrote:
>
>> On Thu, Nov 20, 2014 at 2:53 PM, Jay Vosburgh
>> <jay.vosburgh@canonical.com> wrote:
>>> Cong Wang <cwang@twopensource.com> wrote:
>>>
>>>> Also, no one seems to care about my previous question:
>>>> why only bonding has the problem?
>>> Bonding has the problem because it stashes a pointer to a data
>>> structure (the header_ops) from another module, and when that module is
>>> unloaded the dangling pointer may be dereferenced if it's not either
>>> cleared or made to never go away.
>> I knew, please re-read my question, I was asking why ONLY bonding
>> has the problem, i.e. why not neigh or whatever else calling
>> header_ops->foo()? :)
>>
>> As I said, I may miss some try_module_get() somewhere of course.
>> Needs more digging.
> My explanation is why only bonding has the problem; it's keeping
> a pointer (in bond_dev->header_ops) that is copied from the slave
> device's ->header_ops, and clearing that stashed pointer is (a) not
> correctly synchronized with the removal of the slave device, and (b)
> trying to simply clear the pointer has a check then use race in
> dev_hard_header.
>
> 8021q, for example, uses a "passthru" header_ops to call the
> underlying device's header_ops, but 8021q is only for ethernet, and the
> eth_header_ops are static in vmlinux, so it won't see these problems.
>
> Actually, now that I think about it, when the last ipoib slave
> is released, the bonding master device is theoretically supposed to be
> removed to avoid the sort of problem we're discussing here.
>
> That apparently isn't happening, unless Wengang is running
> pktgen and simultaneously removing the ipoib module (racing the transmit
> against the removal), or maybe something else is going on (perhaps
> pktgen holds a reference to the bonding master, preventing its removal).
>
> Also, curiously, looking at pktgen, pktgen_setup_dev appears to
> only accept devices of type ARPHRD_ETHER, but bonding with an ipoib
> slave would be ARPHRD_INFINIBAND. I'm therefore not sure how Wengang
> configured pktgen over an ipoib bond.
>
> Wengang, what kernel are you using, and is your kernel modified
> to change pktgen_setup_dev?
>
> -J
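To make the check-then-use race described above concrete, here is a minimal user-space sketch. The struct and function names (header_ops_model, net_device_model, hard_header_racy, hard_header_snapshot, demo_create) are made up for illustration and are not the kernel definitions; only the shape of the check followed by the call is meant to mirror dev_hard_header.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical names). */
struct header_ops_model {
	int (*create)(int type);
};

struct net_device_model {
	const struct header_ops_model *header_ops;
};

/* Hypothetical ->create callback, used only to exercise the helpers. */
static int demo_create(int type)
{
	return type + 14;
}

/*
 * Racy shape: dev->header_ops is read for the check and read again for
 * the call. If another CPU clears the pointer between the two reads,
 * the second read dereferences NULL.
 */
static int hard_header_racy(struct net_device_model *dev, int type)
{
	if (!dev->header_ops || !dev->header_ops->create)
		return 0;
	/* window: dev->header_ops may be set to NULL here */
	return dev->header_ops->create(type);
}

/*
 * Single-read shape: the pointer is copied once, so the check and the
 * call see the same value. In kernel code the copy would also need
 * ACCESS_ONCE() so the compiler cannot reload it. This does not help
 * if the structure it points to is freed by a module unload.
 */
static int hard_header_snapshot(struct net_device_model *dev, int type)
{
	const struct header_ops_model *ops = dev->header_ops;

	if (!ops || !ops->create)
		return 0;
	return ops->create(type);
}
```

Note that the snapshot only closes the NULL window. The stashed pointer can still point into an unloaded module, which is the lifetime problem the patch is about.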
It's a 2.6.39 kernel.
The code looks like this:
static int pktgen_setup_dev(struct pktgen_dev *pkt_dev, const char *ifname)
{
	struct net_device *odev;
	int err;

	/* Clean old setups */
	if (pkt_dev->odev) {
		dev_put(pkt_dev->odev);
		pkt_dev->odev = NULL;
	}

	odev = pktgen_dev_get_by_name(pkt_dev, ifname);
	if (!odev) {
		pr_err("no such netdevice: \"%s\"\n", ifname);
		return -ENODEV;
	}

	if (odev->type != ARPHRD_ETHER) {
		pr_err("not an ethernet device: \"%s\"\n", ifname);
		err = -EINVAL;
	} else if (!netif_running(odev)) {
		pr_err("device is down: \"%s\"\n", ifname);
		err = -ENETDOWN;
	} else {
		pkt_dev->odev = odev;
		return 0;
	}

	dev_put(odev);
	return err;
}
I have not changed it.
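As a rough illustration of why this check should reject a bond over ipoib, here is a small user-space model of just the type and running tests. The names odev_model, pktgen_type_check_model and the MODEL_ constants are hypothetical. The numeric values are the ones <linux/if_arp.h> uses for ARPHRD_ETHER and ARPHRD_INFINIBAND, and I am assuming, as Jay says, that the bond reports the type of its ipoib slave.

```c
#include <assert.h>
#include <errno.h>

/* Same values as ARPHRD_ETHER / ARPHRD_INFINIBAND in <linux/if_arp.h>. */
#define MODEL_ARPHRD_ETHER      1
#define MODEL_ARPHRD_INFINIBAND 32

/* Hypothetical stand-in for the two net_device fields the check uses. */
struct odev_model {
	unsigned short type;	/* hardware type of the device */
	int running;		/* nonzero if the device is up */
};

/* Mirrors only the decision logic of pktgen_setup_dev, not its refcounting. */
static int pktgen_type_check_model(const struct odev_model *odev)
{
	if (odev->type != MODEL_ARPHRD_ETHER)
		return -EINVAL;		/* "not an ethernet device" */
	if (!odev->running)
		return -ENETDOWN;	/* "device is down" */
	return 0;
}
```

With an unmodified check like this, a running device of type 32 gets -EINVAL, so I cannot yet explain how pktgen ended up attached to the bond.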
I ran into this problem as a side effect while working in another area, so
I am not yet clear about the exact setup (and I have no environment to check it in right now).
thanks,
wengang
> ---
> -Jay Vosburgh, jay.vosburgh@canonical.com