netdev.vger.kernel.org archive mirror
* multicast, interfaces, kernel 3.0+...
From: Michael Tokarev @ 2012-09-21 18:46 UTC
  To: netdev

Hello.

We found some, well, interesting behaviour in kernels
3.0 and later, while 2.6.32 (the previous long-term stable
series) worked fine.  I'm not sure when it "broke",
since this is a production machine and we have a difficult
time diagnosing it, and the app causing it is, well,
large.

The short story.  A big java app uses a multicast group
to register one component and find it later.

The machine in question has 3 active network interfaces:
usual lo, eth0, and virtual (tap, pointopoint) tinc.
Tinc interface is marked as "multicast off".

When the app starts on a 2.6.32 kernel, netstat -g shows
that multicast group on 2 interfaces: lo and eth0, but
not on tinc, which is sort of expected:

$ netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              4      228.5.6.7
lo              1      all-systems.mcast.net
eth0            4      228.5.6.7
eth0            1      all-systems.mcast.net
tinc            1      all-systems.mcast.net


But when the same app (actually the same userspace) is
booted on the same machine but on a 3.0+ kernel, the same
multicast group is again registered on 2 interfaces, but
this time these are lo (as before) and tinc, not eth0:

$ netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              4      228.5.6.7
lo              1      all-systems.mcast.net
eth0            1      all-systems.mcast.net
tinc            4      228.5.6.7
tinc            1      all-systems.mcast.net
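
For comparing the two listings without eyeballing them, here is a small sketch that maps each group to the interfaces it is joined on. It assumes the three-column "Interface RefCnt Group" layout shown above; the function name parse_netstat_g is made up for illustration.

```python
def parse_netstat_g(text):
    """Map each multicast group to the set of interfaces that joined it."""
    groups = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: <interface> <refcnt> <group>.
        # Header and separator rows have no numeric second column.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        iface, _refcnt, group = fields
        groups.setdefault(group, set()).add(iface)
    return groups


# The 3.0+ listing from this message, pasted in as test input.
LISTING_3_0 = """\
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              4      228.5.6.7
lo              1      all-systems.mcast.net
eth0            1      all-systems.mcast.net
tinc            4      228.5.6.7
tinc            1      all-systems.mcast.net
"""

members = parse_netstat_g(LISTING_3_0)
print(sorted(members["228.5.6.7"]))
```

Feeding it the 2.6.32 listing instead gives lo and eth0 for the same group, which is the whole difference between the two kernels as far as netstat can tell.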

Now, on a 3.0+ kernel, parts of this app can't find each
other.  The "client" tries to send a datagram packet
to this address, 228.5.6.7, but receives no reply.

On the 2.6.32 kernel, where eth0 is used instead of tinc,
it all works as expected.

Now, my knowledge of this multicast stuff is very limited
(reading about it now), so I don't really know what it
all means.  At least the fact that it somehow registers
tinc (which is multicast-off!) is already somewhat strange.
I tried removing this multicast setting from this iface,
but that didn't help.  I also tried enabling multicast on
lo (which was disabled!) and disabling it on others, but
that didn't help either.

According to strace, the app does not try to change iface
group membership: it binds a UDP socket to 0.0.0.0:port
and uses SOL_IP, IP_ADD_MEMBERSHIP to add this socket to a
multicast group.
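
For reference, a minimal sketch of that same sequence in Python rather than Java. The port number is hypothetical (the real one is not given here), and the interface address passed in the membership request is an assumption: the strace summary does not say what the app puts there. Per ip(7), when that field is INADDR_ANY (0.0.0.0) the kernel chooses the interface itself, and passing a local interface address pins the membership to that interface.

```python
import socket

GROUP = "228.5.6.7"  # group address seen in the netstat output
PORT = 5000          # hypothetical; the real port is not stated


def build_mreq(group, iface_addr="0.0.0.0"):
    """Pack a struct ip_mreq: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_addr)


def join_group(group, port, iface_addr="0.0.0.0"):
    """Bind a UDP socket to 0.0.0.0:port and join the multicast group.

    With iface_addr left at 0.0.0.0 (INADDR_ANY) the kernel picks the
    interface; a concrete local address selects that interface instead.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    # IPPROTO_IP has the same value as SOL_IP on Linux.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, iface_addr))
    return sock
```

Calling join_group(GROUP, PORT) mirrors the assumed INADDR_ANY case; calling it with the IPv4 address of eth0 as the third argument would request the membership on eth0 explicitly, which can then be checked with netstat -g.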

Note: there's just ONE machine involved, and two applications
running on it.

Why, with 3.0+, is the non-multicast "tinc" interface shown
as a member of the 228.5.6.7 group, but not eth0, which actually
*is* multicast?

For the record, this "big java app" is Oracle reports server.
I've no idea why they use multicast to find two components
of one thing running on the same machine, and do not provide
any usable unicast solution...

Thanks!

/mjt

