From: Willy Tarreau <willy@w.ods.org>
To: netdev@oss.sgi.com
Cc: jgarzik@pobox.com, davem@redhat.com, Alexandre.Cassen@wanadoo.fr
Subject: Deadlock problem with dev->refcnt somewhere in netlink/multicast...
Date: Wed, 28 Jan 2004 12:34:41 +0100
Message-ID: <20040128113441.GA12118@alpha.home.local>
Hello,

as previously discussed in private with Jeff and David, I am encountering a
deadlock problem somewhere in netlink in 2.4 kernels, from at least 2.4.21 up
to 2.4.25-pre7. At first I thought it was related to the TG3 driver that I was
using, but after many steps in the wrong direction, I finally found an easy
way to reproduce it on other NICs (e1000 & 3c59x). I have added lots of
traces in the kernel to track the dev->refcnt changes, but I now need help to
understand what I gathered. To reproduce the problem, you need an application
which binds to a multicast address. Since I first noticed the problem with
keepalived, I'm sticking to it for this example, but I was also able to
reproduce the same problem with ntpd a few minutes ago.
The problem is the following :

1/ configure an interface up with an address
2/ start keepalived. Keepalived registers itself to receive netlink
   broadcasts (link and address groups), and sets a multicast address
   for VRRP on the interfaces.
3/ now flush all addresses on this interface
4/ then put the link down
5/ then stop keepalived
6/ now rmmod => it hangs in unregister_netdevice() with dev->refcnt=2

Now simply swap steps 3 and 4 (addr vs link) :

1/ configure an interface up with an address
2/ start keepalived.
3/ then put the link down
4/ now flush all addresses on this interface
5/ then stop keepalived
6/ now rmmod => no problem at all
Stopping keepalived after the ip link down or ip addr flush is OK too.
I have tried suggestions by Alexandre Cassen to disable either link
or address group registration in keepalived, but it did not change
anything at all. I even set the group to zero, but the problem persists,
which led me to try ntpd to confirm that this is in fact a multicast
problem. Anyway, "ip monitor" does not cause this trouble. So I'm now certain
that just listening to netlink broadcasts does not cause this problem.
BTW, if I delete the addresses by hand instead of flushing them, the hang
still occurs. Oh, and I noticed that when I flush all IP addresses
on an interface, the interface disappears from "ip maddr", so I suspect that
something in the address deletion code does something nasty with the mcast
addresses, but I cannot find what.
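
The "ip maddr" observation above can be checked mechanically before and after
the flush. The following is only a sketch: the function name
maddr_lists_iface is made up for illustration, and it assumes that
"ip maddr show" prints one header line per interface of the form
"<index>: <name>" with the addresses indented below it.

```shell
#!/bin/sh
# Hypothetical helper: reads "ip maddr show" output on stdin and returns 0
# if the named interface has a header line in it, 1 otherwise.
# Assumption: header lines look like "<index>:<whitespace><name>".
maddr_lists_iface() {
    iface=$1
    awk -v dev="$iface" '
        $1 ~ /^[0-9]+:$/ && $2 == dev { found = 1 }
        END { exit !found }
    '
}
```

Usage would be something like
"ip maddr show | maddr_lists_iface eth2 && echo listed", run once before and
once after "ip addr flush dev eth2".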
So I put lots of printk's in the kernel to track dev->refcnt at several
places, and I now have the following traces, which include all the
printk(refcnt) output (not displayed here), along with diffs between them.
The end of each trace name gives the order of removal : A=addr, L=link,
K=keepalived. So "trace.kal" means that keepalived was stopped, then
the address was flushed, then the link was set down. Each trace logs the
commands that were run; the listing below shows the A, L, K order (address
flushed, then link set down, then keepalived stopped) :
root: ##### TRACE ##### modprobe e1000
root: ##### TRACE ##### ip addr
root: ##### TRACE ##### ip addr add 1.2.3.0/24 dev eth2
root: ##### TRACE ##### ip addr
root: ##### TRACE ##### ip link set eth2 up
root: ##### TRACE ##### keepalived --vrrp -f /var/state/vrrp.conf
root: ##### TRACE ##### ip addr
root: ##### TRACE ##### ip addr flush dev eth2
root: ##### TRACE ##### ip link set eth2 down
root: ##### TRACE ##### killall keepalived
root: ##### TRACE ##### ip addr
root: ##### TRACE ##### rmmod e1000
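
The different removal orders can be written down as a small dry-run helper.
This is only a sketch: the function name teardown_cmds is made up for
illustration, the interface and module defaults (eth2, e1000) are the ones
from the listing above, and it only prints the commands instead of running
them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the teardown commands for a given
# removal order, without executing anything.
# Order letters follow the trace naming: a=addr flush, l=link down,
# k=stop keepalived.
teardown_cmds() {
    order=$1
    iface=${2:-eth2}      # default taken from the listing above
    module=${3:-e1000}    # default taken from the listing above
    # Accept only the six permutations of a, l and k.
    case $order in
        alk|akl|lak|lka|kal|kla) ;;
        *) echo "usage: teardown_cmds <alk|akl|lak|lka|kal|kla> [iface] [module]" >&2
           return 1 ;;
    esac
    rest=$order
    while [ -n "$rest" ]; do
        step=${rest%"${rest#?}"}   # first remaining letter
        rest=${rest#?}             # drop it from the queue
        case $step in
            a) echo "ip addr flush dev $iface" ;;
            l) echo "ip link set $iface down" ;;
            k) echo "killall keepalived" ;;
        esac
    done
    # The module removal always comes last.
    echo "rmmod $module"
}
```

For example, "teardown_cmds alk" prints the sequence that hangs in rmmod
here, and "teardown_cmds lak" the one that unloads without problem. Piping
the output into "sh -x" as root would actually run it.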
Since there are important differences between the logs, and I have already
spent several days on the problem, I think that there is something obvious in
front of me that I cannot see. I have uploaded the traces and the side-by-side
diffs to this site (not posted here because they're about 10kB each) :
http://w.ods.org/debug/pb-mcast/
I really hope that someone with better knowledge will be able either to
point to the problem, or to narrow the problem so that I have some clues
where to add traces or what to try, because I'm really out of ideas now.
Thanks in advance,
Willy