netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev <netdev@vger.kernel.org>, Thomas Graf <tgraf@suug.ch>
Subject: Re: Bug with IPv6-UDP address binding
Date: Thu, 09 Aug 2012 11:40:05 +0200
Message-ID: <1344505205.3069.55.camel@localhost>
In-Reply-To: <1344459591.28967.271.camel@edumazet-glaptop>

On Wed, 2012-08-08 at 22:59 +0200, Eric Dumazet wrote:
> On Wed, 2012-08-08 at 22:37 +0200, Jesper Dangaard Brouer wrote:
> > Hi NetDev
> > 
> > I think I have found a problem/bug with IPv6-UDP address binding.
> > 
> > I found this problem while playing with IPVS and IPv6-UDP, but its also
> > present in more basic/normal situations.
> > 
> > If you have two IPv6 addresses, within the same IPv6 subnet, then one
> > of the IPv6 addrs takes precedence over the other (for UDP only).
> > 
> > Meaning that, if connecting to the "secondary" IPv6 via UDP, will
> > result in userspace see/bind the connection as being created to the
> > "primary" IP, even-though tcpdump shows that the IPv6-UDP packets are
> > dest the "secondary".
> > 
> > The result is; that only the first IPv6-UDP packet is delivered to
> > userspace, and the next packets are denied by the kernel as the UDP
> > socket is "established" with the "primary" IPv6 addr.
> > 
> > I would appreciate some hints to where in the IPv6 code I should look
> > for this bug.  If any one else wants to fix it, I'm also fine with
> > that ;-)
> > 
> > 
> > Its quite easy to reproduce, using netcat (nc).
> > 
> > Add two addresses to the "server" e.g.:
> >  ip addr add fee0:cafe::102/64 dev eth0
> >  ip addr add fee0:cafe::bad/64 dev eth0
> > 
> > Run a netcat listener on "server":
> >  nc -6 -u -l 2000
> > (Notice restart the listener between runs, due to limitation in nc)
> > 
> > On the client add an IPv6 addr e.g.:
> >  ip addr add fee0:cafe::101/64 dev eth0
> > 
> > Run a netcat UDP-IPv6 producer on "client":
> >   nc -6 -u fee0:cafe::bad 2000
> > 
> > Notice that first packet, will get through, but second packets will
> > not (nc: Write error: Connection refused).  Running a tcpdump shows
> > that the kernel is sending back ICMP6, destination unreachable,
> > unreachable port.
> > 
> > Its also possible to see the problem, simply running "netstat -uan" on
> > "server", which will show that the "established" UDP connection, is
> > bound to the wrong "Local Address".
> > 
> > (Tested on both latest net-next kernel at commit 79cda75a1, and also
> > on RHEL6 approx 2.6.32)
> > 
> 
> Hi Jesper
> 
> Thats because the "nc -6 -u -l 2000" on server does :
> 
> bind(3, {sa_family=AF_INET6, sin6_port=htons(2000), inet_pton(AF_INET6,
> "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
> 
> recvfrom(3, "\n", 1024, MSG_PEEK, {sa_family=AF_INET6,
> sin6_port=htons(53696), inet_pton(AF_INET6, "fee0:cafe::101",
> &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 1
> 
> connect(3, {sa_family=AF_INET6, sin6_port=htons(53696),
> inet_pton(AF_INET6, "fee0:cafe::101", &sin6_addr), sin6_flowinfo=0,
> sin6_scope_id=0}, 28) = 0
> 
> And the kernel automatically chooses a SOURCE address (fee0:cafe::102)
> that is not what you expected (fee0:cafe::bad)

Okay I see.  And this is also the case for IPv4.

Guess I should have read Stevens[1] first, as this problem with
multihomed hosts is described there (on page 219).  He also states that
this is a problem/feature of Berkeley-derived implementations.  Solaris,
for example, handles this the way I expected: the source IP address of
the server's reply is the dest IP of the client's request.
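
For reference, the syscall sequence from the strace output quoted above
can be sketched in Python roughly like this.  The function name
nc_style_listener is made up for illustration, and this is only my
reading of the traced calls, not nc's actual source:

```python
import socket

def nc_style_listener(sock):
    """Mimic the traced sequence on an already-bound wildcard UDP socket."""
    # Peek at the first datagram only to learn who sent it.
    _, peer = sock.recvfrom(1024, socket.MSG_PEEK)
    # connect() fixes the peer.  The kernel also picks the local address
    # here, via a route lookup, not from the datagram's destination.
    sock.connect(peer)
    return peer, sock.getsockname()
```

On a host with two addresses in the same subnet, the local address
returned by getsockname() is the one to compare against what tcpdump
shows as the packet's destination.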


> So its a bug in the application.

Yes, I guess it's an application bug, because Berkeley-derived
implementations don't handle multihoming well for UDP.

Why are we keeping this counter-intuitive behavior?

What about changing the implementation to act like Solaris, which IMHO
makes much more sense?

(BTW, iperf also has this "bug")


> UDP connect() is tricky : In this case, nc should learn on what IP
> address the client sent the frame. (using recvmsg() and appropriate
> ancillary message)

I have been reading up on how to use recvmsg() and parse the ancillary
messages; see [1], "Advanced UDP sockets", pages 531-538.  Extracting
the destination IP address is quite an extensive task.  No wonder
netcat missed this part.

> Then nc should bind a new socket on this address, then do the connect()

Yes, after the difficult extraction of the dest IP of the UDP packet.
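
A minimal sketch of that recvmsg()/bind()/connect() sequence, assuming
Linux and Python's socket module.  The helper names
dest_addr_from_ancdata and serve_first_client are hypothetical, and the
SO_REUSEADDR handling is my assumption about what is needed to bind a
second socket on the same port:

```python
import socket
import struct

def dest_addr_from_ancdata(ancdata):
    """Return (dest_ip, ifindex) from an IPV6_PKTINFO message, else None."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IPV6 and ctype == socket.IPV6_PKTINFO:
            # struct in6_pktinfo: 16-byte address, then unsigned int ifindex
            raw_addr, ifindex = struct.unpack("16sI", data[:20])
            return socket.inet_ntop(socket.AF_INET6, raw_addr), ifindex
    return None

def serve_first_client(port=2000):
    """Wait for one datagram, then return a socket bound to its dest IP."""
    listener = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Ask the kernel to attach the destination address to each datagram.
    listener.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_RECVPKTINFO, 1)
    listener.bind(("::", port))
    data, ancdata, _flags, peer = listener.recvmsg(1024, socket.CMSG_SPACE(20))
    info = dest_addr_from_ancdata(ancdata)
    if info is None:
        raise RuntimeError("no IPV6_PKTINFO ancillary message received")
    dest_ip, _ifindex = info
    # New socket bound to the address the client actually sent to.
    conn = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # peer[3] is the scope id; it is non-zero only for link-local peers.
    conn.bind((dest_ip, port, 0, peer[3]))
    conn.connect(peer)
    return conn, data
```

With this, replies sent on the returned socket should carry the address
the client targeted as their source, instead of the one chosen by the
route lookup.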


Now I better understand why the DNS server named/bind is so annoying
that it requires a restart after adding IPs.  I guess they didn't
implement this recvmsg() approach, and instead chose to bind to all
available IPs at init/start.

Hints for readers:
For IPv4 it is easy to see which is the "secondary" IP via the command
"ip addr" (look for the word "secondary").
For IPv6 I cannot tell which one is the secondary/primary from the "ip
addr" output.  But you can instead do a route lookup, e.g. with the
command "ip route get fee0:cafe::102", and look for the "src" field.
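
The same question can also be asked from a program.  A small sketch,
assuming Python; the function name kernel_source_for is made up, and
the port number is arbitrary since a UDP connect() sends no packets:

```python
import socket

def kernel_source_for(dest, port=2000):
    """Return the source address the kernel selects for traffic to dest."""
    family, _, _, _, sockaddr = socket.getaddrinfo(
        dest, port, type=socket.SOCK_DGRAM)[0]
    with socket.socket(family, socket.SOCK_DGRAM) as s:
        # Only the route lookup and source selection happen here.
        s.connect(sockaddr)
        return s.getsockname()[0]
```

Calling it on the "server" with the client's address, e.g.
kernel_source_for("fee0:cafe::101"), should report the address that a
connected UDP socket will use as its source.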


[1] UNIX Network Programming, Vol. 1 (Networking APIs), by W. Richard
Stevens
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

