* socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
@ 2009-03-16 23:48 Felix von Leitner
2009-03-17 0:00 ` Stephen Hemminger
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Felix von Leitner @ 2009-03-16 23:48 UTC (permalink / raw)
To: netdev
Here's an strace:
socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET6, sin6_port=htons(6969), inet_pton(AF_INET6, "::ffff:0.0.0.0", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
This is supposed to work, and it works on other operating systems, even
on Mac OS X.
I think it used to work on Linux, too.
I'm using 2.6.29-rc7 right now, but others have reported this not
working on distro kernels, too.
Felix
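For reference, the failing sequence in the strace can be approximated in C as below. This is a minimal sketch, not the reporter's code: the function names `fill_v4mapped_any` and `try_bind_v4mapped_any` are hypothetical, and the `O_NONBLOCK` step is left out because it does not affect `bind()`. Whether the bind succeeds depends on the kernel it runs on.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill *sa with the IPv4-mapped wildcard ::ffff:0.0.0.0 and the given port.
 * Returns 0 on success, -1 if inet_pton() rejects the literal.
 * (Hypothetical helper, named for this sketch only.) */
int fill_v4mapped_any(struct sockaddr_in6 *sa, unsigned short port)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    sa->sin6_port = htons(port);
    return inet_pton(AF_INET6, "::ffff:0.0.0.0", &sa->sin6_addr) == 1 ? 0 : -1;
}

/* Repeat the socket/setsockopt/bind sequence from the strace.
 * Returns 0 if bind() succeeded, otherwise the errno of the failing call
 * (EADDRNOTAVAIL on the kernels discussed in this thread). */
int try_bind_v4mapped_any(unsigned short port)
{
    struct sockaddr_in6 sa;
    int one = 1;
    int err = 0;
    int fd = socket(AF_INET6, SOCK_STREAM, 0);

    if (fd < 0)
        return errno;
    if (fill_v4mapped_any(&sa, port) != 0) {
        close(fd);
        return EINVAL;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0 ||
        bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        err = errno;    /* saved before close() can overwrite it */
    close(fd);
    return err;
}
```

Calling `try_bind_v4mapped_any(6969)` and printing `strerror()` of a non-zero result is enough to compare kernels or operating systems.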
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-16 23:48 socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Felix von Leitner
@ 2009-03-17 0:00 ` Stephen Hemminger
2009-03-17 0:18 ` Felix von Leitner
2009-03-17 2:26 ` Brian Haley
2009-03-17 9:03 ` Bjørn Mork
2 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2009-03-17 0:00 UTC (permalink / raw)
To: Felix von Leitner; +Cc: netdev
On Tue, 17 Mar 2009 00:48:10 +0100 Felix von Leitner <felix-kernel@fefe.de> wrote:
> Here's an strace:
>
> socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
> fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> bind(3, {sa_family=AF_INET6, sin6_port=htons(6969), inet_pton(AF_INET6, "::ffff:0.0.0.0", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>
> This is supposed to work, and it works on other operating systems, even
> on Mac OS X.
>
> I think it used to work on Linux, too.
>
> I'm using 2.6.29-rc7 right now, but others have reported this not
> working on distro kernels, too.
>
> Felix

Most likely you already have the same port open on IPv4, and unless
you set the IPv6-only option (IPV6_V6ONLY), the bind will fail. The
standard way of doing servers is to bind only for IPv6 and handle IPv4
clients via IPv4-mapped IPv6 addresses.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 0:00 ` Stephen Hemminger
@ 2009-03-17 0:18 ` Felix von Leitner
0 siblings, 0 replies; 27+ messages in thread
From: Felix von Leitner @ 2009-03-17 0:18 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
> > Here's an strace:
> >
> > socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
> > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> > setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > bind(3, {sa_family=AF_INET6, sin6_port=htons(6969), inet_pton(AF_INET6, "::ffff:0.0.0.0", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
> >
> > This is supposed to work, and it works on other operating systems, even
> > on Mac OS X.
> >
> > I think it used to work on Linux, too.
> >
> > I'm using 2.6.29-rc7 right now, but others have reported this not
> > working on distro kernels, too.
> >
> > Felix

> Most likely you already have same port open on IPV4 and unless
> you set IPV6 only, the bind will fail. The standard way
> of doing servers is to bind only for IPV6 and handle IPV4
> clients via the 6-4 address mapping.

No, I don't have anything else on that port.

BTW, just for the record, binding to ::ffff:10.0.0.3 (my eth0 address
at the moment) still works, so the mechanism is not completely broken.

Felix
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-16 23:48 socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Felix von Leitner
2009-03-17 0:00 ` Stephen Hemminger
@ 2009-03-17 2:26 ` Brian Haley
2009-03-17 2:47 ` Eric Dumazet
2009-03-17 12:58 ` Felix von Leitner
2009-03-17 9:03 ` Bjørn Mork
2 siblings, 2 replies; 27+ messages in thread
From: Brian Haley @ 2009-03-17 2:26 UTC (permalink / raw)
To: Felix von Leitner; +Cc: netdev
Felix von Leitner wrote:
> Here's an strace:
>
> socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
> fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> bind(3, {sa_family=AF_INET6, sin6_port=htons(6969), inet_pton(AF_INET6, "::ffff:0.0.0.0", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>
> This is supposed to work, and it works on other operating systems, even
> on Mac OS X.
>
> I think it used to work on Linux, too.
>
> I'm using 2.6.29-rc7 right now, but others have reported this not
> working on distro kernels, too.

I don't think this ever worked on Linux, from the very beginning of inet6_bind():

    /* Check if the address belongs to the host. */
    if (addr_type == IPV6_ADDR_MAPPED) {
        v4addr = addr->sin6_addr.s6_addr32[3];
        if (inet_addr_type(net, v4addr) != RTN_LOCAL) {
            err = -EADDRNOTAVAIL;
            goto out;
        }
    } else {

So if it's a mapped address, the lower 32 bits must contain a local address.
RFC 3493 doesn't specifically mention what to do with ::ffff:0.0.0.0, so this
looks like a gray area to me.

So are you trying to get IPv4-only behavior out of this socket? Seems like the
wrong way to go about it.

-Brian
^ permalink raw reply [flat|nested] 27+ messages in thread
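The check quoted in the message above takes the low 32 bits of a mapped address and asks whether they name a local IPv4 address. A userspace sketch of the extraction step follows. It is only an illustration: `v4mapped_extract` and `v4mapped_is_wildcard` are hypothetical names, and the kernel's `inet_addr_type()` lookup is not reproduced. It shows that ::ffff:0.0.0.0 embeds `INADDR_ANY`, the value an AF_INET socket would treat as the wildcard, which is why the reporter expects the bind to work.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* If *a6 is IPv4-mapped (::ffff:a.b.c.d), copy its low 32 bits into *v4
 * and return 1; otherwise return 0 and leave *v4 untouched.
 * (Hypothetical helper, named for this sketch only.) */
int v4mapped_extract(const struct in6_addr *a6, struct in_addr *v4)
{
    if (!IN6_IS_ADDR_V4MAPPED(a6))
        return 0;
    /* Bytes 12..15 hold the IPv4 address in network byte order. */
    memcpy(&v4->s_addr, &a6->s6_addr[12], sizeof(v4->s_addr));
    return 1;
}

/* Returns 1 when the mapped address embeds the IPv4 wildcard 0.0.0.0. */
int v4mapped_is_wildcard(const struct in6_addr *a6)
{
    struct in_addr v4;

    return v4mapped_extract(a6, &v4) && v4.s_addr == htonl(INADDR_ANY);
}
```

An application that wants to stay portable could use a test like `v4mapped_is_wildcard()` to special-case this address before calling `bind()`, at the cost of the extra branch the reporter is trying to avoid.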
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 2:26 ` Brian Haley
@ 2009-03-17 2:47 ` Eric Dumazet
2009-03-17 8:51 ` Bjørn Mork
2009-03-17 16:00 ` Brian Haley
2009-03-17 12:58 ` Felix von Leitner
1 sibling, 2 replies; 27+ messages in thread
From: Eric Dumazet @ 2009-03-17 2:47 UTC (permalink / raw)
To: Brian Haley; +Cc: Felix von Leitner, netdev
Brian Haley wrote:
> Felix von Leitner wrote:
>> Here's an strace:
>>
>> socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
>> fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
>> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>> setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>> bind(3, {sa_family=AF_INET6, sin6_port=htons(6969), inet_pton(AF_INET6, "::ffff:0.0.0.0", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>>
>> This is supposed to work, and it works on other operating systems, even
>> on Mac OS X.
>>
>> I think it used to work on Linux, too.
>>
>> I'm using 2.6.29-rc7 right now, but others have reported this not
>> working on distro kernels, too.
>
> I don't think this ever worked on Linux, from the very beginning of inet6_bind():
>
>     /* Check if the address belongs to the host. */
>     if (addr_type == IPV6_ADDR_MAPPED) {
>         v4addr = addr->sin6_addr.s6_addr32[3];
>         if (inet_addr_type(net, v4addr) != RTN_LOCAL) {
>             err = -EADDRNOTAVAIL;
>             goto out;
>         }
>     } else {
>
> So if it's a mapped address, the lower 32-bits must contain a local address.
> RFC 3493 doesn't specifically mention what to do with ::ffff:0.0.0.0, so this
> looks like a gray area to me.
>
> So are you trying to get IPv4-only behavior out of this socket? Seems like the
> wrong way to go about it.

To me, section 3.7 of RFC 3493 is not gray. It only refers to interoperating
with IPv4 applications, i.e. *sending* UDP messages to IPv4 nodes, or
*connecting* to TCP IPv4 nodes.

So "::ffff:0.0.0.0" has no meaning for contacting an IPv4 node, since 0.0.0.0
is not a valid IPv4 address.

RFC 2373 is also clear.

Part of RFC 3493:

   Applications may use AF_INET6 sockets to open TCP connections to IPv4
   nodes, or send UDP packets to IPv4 nodes, by simply encoding the
   destination's IPv4 address as an IPv4-mapped IPv6 address, and
   passing that address, within a sockaddr_in6 structure, in the
   connect() or sendto() call. When applications use AF_INET6 sockets
   to accept TCP connections from IPv4 nodes, or receive UDP packets
   from IPv4 nodes, the system returns the peer's address to the
   application in the accept(), recvfrom(), or getpeername() call using
   a sockaddr_in6 structure encoded this way.

RFC 2373 states:

   The IPv6 transition mechanisms [TRAN] include a technique for hosts
   and routers to dynamically tunnel IPv6 packets over IPv4 routing
   infrastructure. IPv6 nodes that utilize this technique are assigned
   special IPv6 unicast addresses that carry an IPv4 address in the
   low-order 32-bits. This type of address is termed an
   "IPv4-compatible IPv6 address" and has the format:

   |                80 bits               | 16 |      32 bits        |
   +--------------------------------------+----+---------------------+
   |0000..............................0000|0000|    IPv4 address     |
   +--------------------------------------+----+---------------------+

   A second type of IPv6 address which holds an embedded IPv4 address
   is also defined. This address is used to represent the addresses of
   IPv4-only nodes (those that *do not* support IPv6) as IPv6
   addresses. This type of address is termed an "IPv4-mapped IPv6
   address" and has the format:

   |                80 bits               | 16 |      32 bits        |
   +--------------------------------------+----+---------------------+
   |0000..............................0000|FFFF|    IPv4 address     |
   +--------------------------------------+----+---------------------+

So using "::ffff:0.0.0.0" as a local address for an IPv6 socket is a paradox,
since IPv4-mapped IPv6 addresses are for IPv4-only nodes.

If you want to accept only IPv4 connections, why use AF_INET6 in the first
place?

Check how sctp_v6_cmp_addr() is implemented to see how expensive it is to
handle extensive IPv6 address comparisons...

/* Compare addresses exactly.
 * v4-mapped-v6 is also in consideration.
 */
static int sctp_v6_cmp_addr(const union sctp_addr *addr1,
                            const union sctp_addr *addr2)
{
    if (addr1->sa.sa_family != addr2->sa.sa_family) {
        if (addr1->sa.sa_family == AF_INET &&
            addr2->sa.sa_family == AF_INET6 &&
            ipv6_addr_v4mapped(&addr2->v6.sin6_addr)) {
            if (addr2->v6.sin6_port == addr1->v4.sin_port &&
                addr2->v6.sin6_addr.s6_addr32[3] ==
                addr1->v4.sin_addr.s_addr)
                return 1;
        }
        if (addr2->sa.sa_family == AF_INET &&
            addr1->sa.sa_family == AF_INET6 &&
            ipv6_addr_v4mapped(&addr1->v6.sin6_addr)) {
            if (addr1->v6.sin6_port == addr2->v4.sin_port &&
                addr1->v6.sin6_addr.s6_addr32[3] ==
                addr2->v4.sin_addr.s_addr)
                return 1;
        }
        return 0;
    }
    if (!ipv6_addr_equal(&addr1->v6.sin6_addr, &addr2->v6.sin6_addr))
        return 0;
    /* If this is a linklocal address, compare the scope_id. */
    if (ipv6_addr_type(&addr1->v6.sin6_addr) & IPV6_ADDR_LINKLOCAL) {
        if (addr1->v6.sin6_scope_id && addr2->v6.sin6_scope_id &&
            (addr1->v6.sin6_scope_id != addr2->v6.sin6_scope_id)) {
            return 0;
        }
    }
    return 1;
}
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 2:47 ` Eric Dumazet
@ 2009-03-17 8:51 ` Bjørn Mork
2009-03-17 16:00 ` Brian Haley
1 sibling, 0 replies; 27+ messages in thread
From: Bjørn Mork @ 2009-03-17 8:51 UTC (permalink / raw)
To: netdev
Eric Dumazet <dada1@cosmosbay.com> writes:
> RFC 2373 states :

I fully agree with your interpretation...

..., but just FYI, RFC 2373 was obsoleted by RFC 3513, which in turn was
obsoleted by RFC 4291. Among the important changes was the deprecation
of the first address class you quote (the ::a.b.c.d addresses).

This doesn't affect the question though.

Bjørn
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 2:47 ` Eric Dumazet
2009-03-17 8:51 ` Bjørn Mork
@ 2009-03-17 16:00 ` Brian Haley
1 sibling, 0 replies; 27+ messages in thread
From: Brian Haley @ 2009-03-17 16:00 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Felix von Leitner, netdev
Eric Dumazet wrote:
> To me, section 3.7 of RFC 3493 is not gray. It is only refering to interoperate
> with IPV4 applications.
> Ie *sending* UDP messages to IPV4 nodes, or *connect* to TCP IPV4 nodes.
>
> So "::ffff:0.0.0.0" has no meaning to contact an IPV4 node, since 0.0.0.0 is not
> a valid IPV4 address.

I agree with you, Eric :) I was simply referring to the fact that RFC 3493
doesn't distinguish between valid and invalid use of mapped addresses:

   IPv4-mapped addresses are written as follows:

      ::FFFF:<IPv4-address>

<IPv4-address> could be interpreted as 0.0.0.0 if you take that little section
out of context.

-Brian
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 2:26 ` Brian Haley
2009-03-17 2:47 ` Eric Dumazet
@ 2009-03-17 12:58 ` Felix von Leitner
2009-03-17 13:47 ` Vlad Yasevich
2009-03-17 15:59 ` Brian Haley
1 sibling, 2 replies; 27+ messages in thread
From: Felix von Leitner @ 2009-03-17 12:58 UTC (permalink / raw)
To: Brian Haley; +Cc: netdev
> I don't think this ever worked on Linux, from the very beginning of inet6_bind():

>     /* Check if the address belongs to the host. */
>     if (addr_type == IPV6_ADDR_MAPPED) {
>         v4addr = addr->sin6_addr.s6_addr32[3];
>         if (inet_addr_type(net, v4addr) != RTN_LOCAL) {
>             err = -EADDRNOTAVAIL;
>             goto out;
>         }
>     } else {

What is the harm in allowing this? That way an application ported to
IPv6 can still bind IPv4-only. Why would it be legal to bind to a
specific IPv4 address but not to all IPv4 addresses?

The specific case is a bittorrent tracker. The code was ported to IPv6,
but since there is so much overhead in storing IPv6 addresses you are
supposed to run two processes, one on the IPv6 address and one on the
IPv4 address (the IPv4 one then does not have overhead). The sane way
to do this is to bind the IPv6 socket to ::ffff:0.0.0.0 then. Otherwise
you would need some kind of giant abstraction layer in the application.
And we specifically added the ipv4 mapped addresses so applications
would not need to have a giant abstraction layer.

Did I mention *BSD and OSX allow this?

> So are you trying to get IPv4-only behavior out of this socket? Seems
> like the wrong way to go about it.

Why would you say that?

Felix
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 12:58 ` Felix von Leitner
@ 2009-03-17 13:47 ` Vlad Yasevich
2009-03-17 14:14 ` Felix von Leitner
2009-03-17 15:59 ` Brian Haley
1 sibling, 1 reply; 27+ messages in thread
From: Vlad Yasevich @ 2009-03-17 13:47 UTC (permalink / raw)
To: Felix von Leitner; +Cc: Brian Haley, netdev
Felix von Leitner wrote:
>> I don't think this ever worked on Linux, from the very beginning of inet6_bind():
>
>>     /* Check if the address belongs to the host. */
>>     if (addr_type == IPV6_ADDR_MAPPED) {
>>         v4addr = addr->sin6_addr.s6_addr32[3];
>>         if (inet_addr_type(net, v4addr) != RTN_LOCAL) {
>>             err = -EADDRNOTAVAIL;
>>             goto out;
>>         }
>>     } else {
>
> What is the harm in allowing this? That way an application ported to
> IPv6 can still bind IPv4-only. Why would it be legal to bind to a
> specific IPv4 address but not to all IPv4 addresses?
>
> The specific case is a bittorrent tracker. The code was ported to IPv6,
> but since there is so much overhead in storing IPv6 addresses you are
> supposed to run two processes, one on the IPv6 address and one on the
> IPv4 address (the IPv4 one then does not have overhead). The sane way
> to do this is to bind the IPv6 socket to ::ffff:0.0.0.0 then. Otherwise
> you would need some kind of giant abstraction layer in the application.
> And we specifically added the ipv4 mapped addresses so applications
> would not need to have a giant abstraction layer.

Sorry, I just don't buy this. You imply that you don't want the overhead
of storing IPv6 addresses, but you still get this with ::ffff:0.0.0.0.
In fact, now your overhead is even worse since every IPv4 address will be
stored and interpreted as a 128-bit IPv6 address.

If you really care about overhead, run 2 services. Your IPv6 service
will only track real IPv6 addresses and will reduce your total overhead.

If you don't care about overhead, just bind a single socket to :: and
you will get behavior identical to the ::ffff:0.0.0.0 case, but with
the added benefit of tracking real IPv6 addresses as well.

Having written support for ::ffff:0.0.0.0, I've always thought it was
a bastardized case that didn't provide any benefits. It was like saying:
"I've got IPv6 on my system, but I don't really support it, even though
I pretend that I do."

>
> Did I mention *BSD and OSX allow this?
>
>> So are you trying to get IPv4-only behavior out of this socket? Seems
>> like the wrong way to go about it.
>
> Why would you say that?

Because that case doesn't provide any benefits. It only has the drawback
that you have to deal with IPv4-mapped IPv6 addresses, which is the overhead
of the whole thing. If you are prepared to deal with it, you might as well
deal with real IPv6 addresses at the same time and mitigate your overhead
somewhat.

-vlad
^ permalink raw reply [flat|nested] 27+ messages in thread
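The two alternatives suggested in the message above (a single socket bound to :: that also serves IPv4 clients, or a separate IPv6-only service) differ only in the `IPV6_V6ONLY` socket option. The following is a minimal sketch under that assumption; `open_any_listener` is a hypothetical name, and error handling is reduced to returning -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP socket bound to the IPv6 wildcard (::) on the given port.
 * v6only = 0: IPv4 clients are accepted too and show up as
 *             ::ffff:a.b.c.d peers in accept()/getpeername().
 * v6only = 1: native IPv6 clients only.
 * Returns the listening descriptor, or -1 on any failure.
 * (Hypothetical helper, named for this sketch only.) */
int open_any_listener(unsigned short port, int v6only)
{
    struct sockaddr_in6 sa;
    int fd = socket(AF_INET6, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin6_family = AF_INET6;
    sa.sin6_port = htons(port);
    sa.sin6_addr = in6addr_any;

    /* Set the option explicitly instead of relying on the system default. */
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &v6only, sizeof(v6only)) < 0 ||
        bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Neither mode gives an IPv4-only AF_INET6 listener, which is the case the reporter is asking for; on the kernels discussed here that still requires an AF_INET socket or a bind to a specific mapped address.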
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 13:47 ` Vlad Yasevich
@ 2009-03-17 14:14 ` Felix von Leitner
2009-03-17 14:57 ` Vlad Yasevich
2009-03-17 15:21 ` Eric Dumazet
0 siblings, 2 replies; 27+ messages in thread
From: Felix von Leitner @ 2009-03-17 14:14 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: Brian Haley, netdev
> Sorry, I just don't buy this. You imply that you don't want the overhead
> of storing IPv6 addresses, but you still get this with ::ffff:0.0.0.0.
> In fact, now your overhead is even worse since every IPv4 address will be
> stored and interpreted as IPv6 128 bit address.

> If you really care about overhead, run 2 services. Your IPv6 service
> will only track real IPv6 addresses and will reduce your total overhead.

I am worried about the overhead of storing the IPv6 addresses.
I am not storing them in the IPv4 case.

But the socket code has been rewritten to use IPv6 addresses only,
precisely because IPv4-mapped addresses exist.

> If you don't care about overhead, just bind a single socket to :: and
> you will get behavior identical for the ::ffff:0.0.0.0 case, but with
> the added benefit of tracking real ipv6 addresses as well.

You probably mean well but please stick to the problem at hand and don't
speculate about my app.

> Having written support for ::ffff:0.0.0.0, I've always thought it was
> a bastardized case that didn't provide any benefits. It was like saying:
> "I've got IPv6 on my system, but I don't really support it, even though
> I pretend that I do."

The app has a command line option to specify which address to bind to.
The app understands IPv4 addresses and converts them to IPv4-mapped
addresses, so it only has to deal with sockaddr_in6 when talking to the
kernel and does not need to store info on what kind of socket family it
is dealing with.

If someone specifies 0.0.0.0, it does not work. It's that easy.

Now it may be a fascinating side discussion on whether you think IPv4
mapped 0.0.0.0 is useful or not, but rest assured: it is useful to at
least one high profile app that is so far running on Linux.

> > Why would you say that?
> Because that case doesn't provide any benefits.

You may not see it but it does.

> It only has the drawback that you have to deal with ipv4-mapped IPv6
> addresses which is the overhead of the whole thing.

That is not a drawback. On the contrary. It greatly simplifies how the
app deals with the socket API.

> If you are prepared to deal with it, you might as well deal with real ipv6 addresses
> at the same time and mitigate your overhead somewhat.

You are currently proving all the snide remarks by the BSD people about
the Linux IP stack true, and the "professionalism" snide remarks of the
Solaris people. Great work, man.

Felix
^ permalink raw reply [flat|nested] 27+ messages in thread
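The address handling described in the message above (accept an IPv4 or IPv6 literal on the command line and hand the kernel a `sockaddr_in6` either way) might look roughly like the sketch below. It is an assumption about the approach, not the application's actual code, and `parse_bind_addr` is a hypothetical name. The mapping follows the ::ffff:a.b.c.d layout: ten zero bytes, two 0xff bytes, then the four IPv4 bytes.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Parse a textual bind address into an in6_addr.
 * IPv6 literals are used as-is; IPv4 literals become ::ffff:a.b.c.d.
 * Returns 0 on success, -1 if the string is neither.
 * (Hypothetical helper, named for this sketch only.) */
int parse_bind_addr(const char *text, struct in6_addr *out)
{
    struct in_addr v4;

    if (inet_pton(AF_INET6, text, out) == 1)
        return 0;
    if (inet_pton(AF_INET, text, &v4) != 1)
        return -1;

    memset(out, 0, sizeof(*out));                /* bytes 0..9 stay zero   */
    out->s6_addr[10] = 0xff;                     /* bytes 10..11 = ffff    */
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4, sizeof(v4));  /* bytes 12..15 = IPv4    */
    return 0;
}
```

With this scheme an option such as `-i 0.0.0.0` yields ::ffff:0.0.0.0, which is exactly the address the kernel check in this thread rejects.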
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 14:14 ` Felix von Leitner
@ 2009-03-17 14:57 ` Vlad Yasevich
2009-03-17 17:51 ` Felix von Leitner
2009-03-17 15:21 ` Eric Dumazet
1 sibling, 1 reply; 27+ messages in thread
From: Vlad Yasevich @ 2009-03-17 14:57 UTC (permalink / raw)
To: Felix von Leitner; +Cc: Brian Haley, netdev
Felix von Leitner wrote:
>> Sorry, I just don't buy this. You imply that you don't want the overhead
>> of storing IPv6 addresses, but you still get this with ::ffff:0.0.0.0.
>> In fact, now your overhead is even worse since every IPv4 address will be
>> stored and interpreted as IPv6 128 bit address.
>
>> If you really care about overhead, run 2 services. Your IPv6 service
>> will only track real IPv6 addresses and will reduce your total overhead.
>
> I am worried about the overhead of storing the IPv6 addresses.
> I am not storing them in the IPv4 case.
>
> But the socket code has been rewritten to use IPv6 addresses only,
> precisely because IPv4-mapped addresses exist.

So, what you want to do is provide IPv4-only service on a fully
configured dual-stacked machine by running an IPv6-enabled application?

Why do you not want to provide the IPv6 side of the same service?

You mentioned overhead (and I am guessing that's the answer to the above
question), but is the number of IPv6 clients so high that your service would
not be able to handle it?

As I've already mentioned, your overhead of tracking IPv6 clients is actually
lower than tracking all the IPv4 clients using mapped addresses.

One way of preventing the tracking of IPv6 clients is by disallowing IPv6
traffic or even not configuring any IPv6 addresses. That could get you what
you want right now, without waiting for a kernel patch.

>
>> If you don't care about overhead, just bind a single socket to :: and
>> you will get behavior identical for the ::ffff:0.0.0.0 case, but with
>> the added benefit of tracking real ipv6 addresses as well.
>
> You probably mean well but please stick to the problem at hand and don't
> speculate about my app.
>
>> Having written support for ::ffff:0.0.0.0, I've always thought it was
>> a bastardized case that didn't provide any benefits. It was like saying:
>> "I've got IPv6 on my system, but I don't really support it, even though
>> I pretend that I do."
>
> The app has a command line option to specify which address to bind to.
> The app understands IPv4 addresses and converts them to ipv4 mapped
> addresses so it can only deal with sockaddr_in6 when talking to the
> kernel and does not need to store info on what kind of socket family it
> is dealing with.
>
> If someone specifies 0.0.0.0, it does not work. It's that easy.
>
> Now it may be a fascinating side discussion on whether you think IPv4
> mapped 0.0.0.0 is useful or not, but rest assured: it is useful to at
> least one high profile app that is so far running on Linux.
>

In this case, you are making a trade-off of application complexity against
kernel complexity. You are making your application much simpler, while
demanding more complexity from the kernel. It is your right as an application
developer, and it is our right as kernel developers to push back and provide
alternatives.

>>> Why would you say that?
>> Because that case doesn't provide any benefits.
>
> You may not see it but it does.
>
>> It only has the drawback that you have to deal with ipv4-mapped IPv6
>> addresses which is the overhead of the whole thing.
>
> That is not a drawback. On the contrary. It greatly simplifies how the
> app deals with the socket API.
>
>> If you are prepared to deal with it, you might as well deal with real ipv6 addresses
>> at the same time and mitigate your overhead somewhat.
>
> You are currently proving all the snide remarks by the BSD people about
> the Linux IP stack true, and the "professionalism" snide remarks of the
> Solaris people. Great work, man.
>

This is really a great way to convince someone to do the work... :/

-vlad
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 14:57 ` Vlad Yasevich
@ 2009-03-17 17:51 ` Felix von Leitner
0 siblings, 0 replies; 27+ messages in thread
From: Felix von Leitner @ 2009-03-17 17:51 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: Brian Haley, netdev
> > I am worried about the overhead of storing the IPv6 addresses.
> > I am not storing them in the IPv4 case.

> > But the socket code has been rewritten to use IPv6 addresses only,
> > precisely because IPv4-mapped addresses exist.

> So, what you want to do is provide IPv4 only service on a fully
> configured dual-stacked machine by running an IPv6 enabled application?

Yes.

Actually, I want to provide IPv6 and IPv4 service, but it turns out the
users in some cases want to run the service in IPv4-only mode.

> Why do you not want to provide IPv6 side of the same service?

As I said, in this particular case, you run two processes. One for IPv6
and one for IPv4. The reason is that

a) it's P2P, so you don't want to provide IPv6 addresses of peers to
   IPv4 users anyway, because if they supported IPv6, they'd be
   connecting via IPv6.
b) IPv4 users outnumber IPv6 users by a wide margin. For the IPv4 case
   it does not make sense to waste 12 bytes per IP address to even
   store the "::ffff:" part.

> You mentioned overhead (and I am guessing that's the answer the above question),
> but is the number of IPv6 clients so high that your service would
> not be able to handle it.

The overhead is the memory overhead needed to store the IP addresses of
the peers. For some popular files we are talking about a five digit
number of peers, and we don't want to store the full IPv6 address for
those.

We do want to use IPv6 sockets so we don't have to add code to
differentiate and make it work, because the kernel already has that code
in the form of the ipv4-mapped address handling code. And it works,
except for that one if clause that prevents me from binding to
::ffff:0.0.0.0.

As I said, this is not _me_ who wants to bind there. It's the user who
uses "-i 0.0.0.0" to get a process that runs only in IPv4 mode. It took
me a while to see the point in that, too. But again, it's not my place
to argue with the customers on how they want to use the software. It's
my place to provide software that does what they need.

And if you ask me, the same holds true for you.

> As I've already mentioned, your overhead of tracking IPv6 clients is actually
> lower than tracking all the IPv4 clients using mapped addresses.

You did not understand the problem then. I hope you understand it now.

> One way of preventing the tracking IPv6 clients is by disallowing IPv6 traffic
> or even not configuring any IPv6 addresses. That could get what you want
> right now, without waiting for a kernel patch.

We do have IPv6, and we have it enabled, and we run a copy of the
software on the IPv6 address, too.

Now we could bind to the specific address of the PC, but that happens to
interfere with the load balancing and failover installation we have. In
the case of one failing node, we configure that IP address on one of the
other hosts and expect that host to handle that traffic.

> In this case, you are making a trade-off of application complexity against
> kernel complexity. You are making your application much simpler, while demanding
> more complexity from the kernel.

In fact it's the other way around. I waited for the kernel to support
v4 mapped addresses. Then I wrote the socket layer on top of it. You
already committed to providing the complexity. Now I just want you to
follow through on the promise. :-)

> >> If you are prepared to deal with it, you might as well deal with real ipv6 addresses
> >> at the same time and mitigate your overhead somewhat.
> > You are currently proving all the snide remarks by the BSD people about
> > the Linux IP stack true, and the "professionalism" snide remarks of the
> > Solaris people. Great work, man.
> This is really a great way to convince someone to do the work... :/

Hey, I'm just saying. My middleware runs on Linux, BSD, OSX and Solaris.
I'm just writing the middleware. Previously, users of my middleware
switched from BSD to Linux because v4 mapped v6 addresses were turned
off by default in FreeBSD. My users made a stink about it and convinced
FreeBSD to change the default. But many of them switched to Linux.

What do you think happens if my middleware now does not work right on
Linux? People will switch to Solaris. Or FreeBSD.

I am willing to put up a fight before abandoning ship. You apparently
think this is a disservice to you because I'm taking your time with
this, but it's in fact the opposite. I'm giving Linux an opportunity
here to set things right.

Linux has stood tall as a beacon of "it may take us longer but we like
to do things right". We did not just do a big kernel lock, we wanted to
do it right. We did not just take an old Unix filesystem, we wanted to
do it right. We did not just reimplement mbufs, we wanted to do memory
management right.

And now I hope we do not just let some language lawyer weasel through
some RFC and provide an interpretation of it that would legally allow
the current broken behavior. I hope we fix it instead.

This may not seem like much to you, but we are talking about the biggest
noncommercial Internet messaging infrastructure here. If they run
Linux, that is an asset for Linux. Because it shows that we can scale.
We can provide a proper implementation of the IPv6 APIs.

Please don't be part of the problem. Be part of the solution.

Felix
^ permalink raw reply [flat|nested] 27+ messages in thread
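The "12 bytes per IP address" figure in the message above is the size difference between `struct in6_addr` (16 bytes) and `struct in_addr` (4 bytes). The arithmetic can be sketched as follows; the helper names are hypothetical, and real per-peer records in a tracker would carry more than the bare address.

```c
#include <netinet/in.h>
#include <stddef.h>

/* Bytes needed to hold n peer addresses when every peer is kept as a
 * full 128-bit IPv6 address (IPv4 peers in ::ffff:a.b.c.d form).
 * (Hypothetical helper, named for this sketch only.) */
size_t peers_as_v6_bytes(size_t n)
{
    return n * sizeof(struct in6_addr);   /* 16 bytes per peer */
}

/* Bytes needed when IPv4 peers are kept as bare 32-bit addresses. */
size_t peers_as_v4_bytes(size_t n)
{
    return n * sizeof(struct in_addr);    /* 4 bytes per peer */
}
```

As an illustrative figure only, 50,000 peers (a five digit count, not a number from the thread) come to 800,000 bytes in mapped form against 200,000 bytes as plain IPv4, per tracked file.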
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 2009-03-17 14:14 ` Felix von Leitner 2009-03-17 14:57 ` Vlad Yasevich @ 2009-03-17 15:21 ` Eric Dumazet 2009-03-17 18:01 ` Felix von Leitner 1 sibling, 1 reply; 27+ messages in thread From: Eric Dumazet @ 2009-03-17 15:21 UTC (permalink / raw) To: Felix von Leitner; +Cc: Vlad Yasevich, Brian Haley, netdev Felix von Leitner a écrit : >> Sorry, I just don't buy this. You imply that you don't want the overhead >> of storing IPv6 addresses, but you still get this with ::ffff:0.0.0.0. >> In fact, now your overhead is even worse since ever IPv4 address will be >> stored stored and interpreted as IPv6 128 bit address. > >> If you really care about overhead, run 2 services. Your IPv6 service >> will only track real IPv6 addresses and will reduce you total overhead. > > I am worried about the overhead of storing the IPv6 addresses. > I am not storing them in the IPv4 case. > > But the socket code has been rewritten to use IPv6 addresses only, > precisely because IPv4-mapped addresses exist. > >> If you don't care about overhead, just bind a single socket to :: and >> you will get behavior identical for the ::fff:0.0.0.0 case, but with >> the added benefit of tracking real ipv6 addresses as well. > > You probably mean well but please stick to the problem at hand and don't > speculate about my app. > >> Having written support for ::ffff:0.0.0.0, I've always thought it was >> a bastardized case that didn't provide any benefits. It was like saying: >> "I've got IPv6 on my system, but I don't really support it, even though >> I pretend that I do." > > The app has a command line option to specify which address to bind to. > The app understands IPv4 addresses and converts them to ipv4 mapped > addresses so it can only deal with sockaddr_in6 when talking to the > kernel and does not need to store info on what kind of socket family it > is dealing with. > > If someone specifies 0.0.0.0, it does not work. 
It's that easy. > > Now it may be a fascinating side discussion on whether you think IPv4 > mapped 0.0.0.0 is useful or not, but rest assured: it is useful to at > least one high profile app that is so far running on Linux. > >>> Why would you say that? >> Because that case doesn't provide any benefits. > > You may not see it but it does. > >> It only has the drawback that you have to deal with ipv4-mapped IPv6 >> addresses witch is the overhead of the whole thing. > > That is not a drawback. On the contrary. It greatly simplifies how the > app deals with the socket API. > >> If you are prepared to deal with it, you might as well deal with real ipv6 addresses >> at the same time and mitigate your overhead somewhat. > > You are currently proving all the snide remarks by the BSD people about > the Linux IP stack true, and the "professionalism" snide remarks of the > Solaris people. Great work, man. > Trying to understand why you seem furious, lets try to be pragmatic. Most users of your great program wont have a fix for this until next year. I am afraid you have no choice but change your program, or loose users. Still I dont get your point. Having TCP V6 sockets is much more expensive at kernel level (same for UDP), and bittorrent is known to stress network a bit, so having application use an IPV4 socket where it can is a win for your program getting more users, and computers spend less power. grep TCP /proc/slabinfo tw_sock_TCPv6 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0 TCPv6 140 140 1600 20 8 : tunables 0 0 0 : slabdata 7 7 0 tw_sock_TCP 256 256 128 32 1 : tunables 0 0 0 : slabdata 8 8 0 TCP 197 198 1472 22 8 : tunables 0 0 0 : slabdata 9 9 0 Gasp, OSX having this "::ffff:0.0.0.0" right is probably the reason why more computers run OSX than linux. Sometime dont implement RFC too literally :) ^ permalink raw reply [flat|nested] 27+ messages in thread
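The address handling Felix describes in the quoted text above (accept an IPv4 literal on the command line, convert it to an IPv4-mapped address, and hand the kernel nothing but sockaddr_in6) can be sketched roughly as follows. This is a minimal user-space illustration, not the tracker's actual code; the helper name parse_as_in6 is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): fill a sockaddr_in6 from a
 * textual address. IPv6 literals are used as-is; IPv4 literals are stored
 * as IPv4-mapped addresses (::ffff:a.b.c.d).
 * Returns 0 on success, -1 if the text is neither form. */
int parse_as_in6(const char *text, uint16_t port, struct sockaddr_in6 *out)
{
	struct in6_addr v6;
	struct in_addr v4;

	assert(text != NULL && out != NULL);

	memset(out, 0, sizeof(*out));
	out->sin6_family = AF_INET6;
	out->sin6_port = htons(port);

	/* Parse into a temporary so a failed parse cannot dirty *out. */
	if (inet_pton(AF_INET6, text, &v6) == 1) {
		out->sin6_addr = v6;
		return 0;
	}
	if (inet_pton(AF_INET, text, &v4) == 1) {
		/* Bytes 0-9 stay zero, bytes 10-11 are 0xff, and bytes 12-15
		 * carry the IPv4 address in network byte order. */
		out->sin6_addr.s6_addr[10] = 0xff;
		out->sin6_addr.s6_addr[11] = 0xff;
		memcpy(&out->sin6_addr.s6_addr[12], &v4, sizeof(v4));
		return 0;
	}
	return -1;
}
```

With a helper like this, the command-line value 0.0.0.0 turns into ::ffff:0.0.0.0, which is the address the failing bind() in the original report was called with.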
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 2009-03-17 15:21 ` Eric Dumazet @ 2009-03-17 18:01 ` Felix von Leitner 0 siblings, 0 replies; 27+ messages in thread From: Felix von Leitner @ 2009-03-17 18:01 UTC (permalink / raw) To: Eric Dumazet; +Cc: Vlad Yasevich, Brian Haley, netdev > Trying to understand why you seem furious, lets try to be pragmatic. I'm not furious. I just get angry when people I submit a bug report to tell me they don't want to fix the bug. Some people think that if I submit a bug to them, they are doing me a service if they fix the bug. In fact it's the opposite of that. If I submit a bug, I am doing them a service, because I am telling them in what way their software fails to meet the requirements of the users. > Most users of your great program wont have a fix for this until next year. You underestimate my users. The few ones that run into this kind of problem are not above patching their kernels to make it work. But I am not willing to provide a kernel patch and do the customer support for that. > I am afraid you have no choice but change your program, or loose users. No I will not. My program works. Just not on Linux. If my users see that "the Linux people" don't consider running high profile high throughput messing systems important enough to remove one if clause of dubious merit, then they go switch to Solaris or FreeBSD instead. And then Solaris and FreeBSD get the PR benefit. > Still I dont get your point. Having TCP V6 sockets is much more expensive > at kernel level (same for UDP), and bittorrent is known to stress network a bit, so > having application use an IPV4 socket where it can is a win for your > program getting more users, and computers spend less power. There are two things to say to that: 1. IPv6 is the future. If I implement IPv4 code because the IPv6 code is slower, there will never be an incentive for the kernel people to tune the IPv6 code, and it will continue to suck. 2. 
IPv4 users won't ever switch to IPv6 if they hear it's so slow that people like me had to provide a legacy code path for performance reasons. That is exactly the wrong message to send. 3. In my benchmarks the performance difference was negligible. It was in the area of 1-2%, i.e. within the margin of error. > Gasp, OSX having this "::ffff:0.0.0.0" right is probably the reason why more computers > run OSX than linux. Sometime dont implement RFC too literally :) Your target audience is not the RFCs, it's the people. And the people just told you that you implemented this part of the code wrong. Please listen to your users and don't berate them. Even if we assume that the RFCs can be read so that the current implementation is technically not illegal, note that the other operating systems interpreted it differently. So you miss the main goal of the RFCs, providing a fertile ground for interoperability. Just forget all I said. Just look at the facts. The RFCs are unclear. All the other major IPv6 stacks do it the other way. Maybe they are right? Felix ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 2009-03-17 12:58 ` Felix von Leitner 2009-03-17 13:47 ` Vlad Yasevich @ 2009-03-17 15:59 ` Brian Haley [not found] ` <20090317180840.GC13270@codeblau.de> 1 sibling, 1 reply; 27+ messages in thread From: Brian Haley @ 2009-03-17 15:59 UTC (permalink / raw) To: Felix von Leitner; +Cc: netdev Felix von Leitner wrote: >> I don't think this ever worked on Linux, from the very beginning of inet6_bind(): > >> /* Check if the address belongs to the host. */ >> if (addr_type == IPV6_ADDR_MAPPED) { >> v4addr = addr->sin6_addr.s6_addr32[3]; >> if (inet_addr_type(net, v4addr) != RTN_LOCAL) { >> err = -EADDRNOTAVAIL; >> goto out; >> } >> } else { > > What is the harm in allowing this? That way an application ported to > IPv6 can still bind IPv4-only. Why would it be legal to bind to a > specific IPv4 address but not to all IPv4 addresses? Please show me a porting guide that even mentions supporting IPv4-only mode through an IPv6 socket by using this method. There is none that I know of. > The specific case is a bittorrent tracker. The code was ported to IPv6, > but since there is so much overhead in storing IPv6 addresses you are > supposed to run two processes, one on the IPv6 address and one on the > IPv4 address (the IPv4 one then does not have overhead). The sane way > to do this is to bind the IPv6 socket to ::ffff:0.0.0.0 then. Otherwise > you would need some kind of giant abstraction layer in the application. > And we specifically added the ipv4 mapped addresses so applications > would not need to have a giant abstraction layer. > > Did I mention *BSD and OSX allow this? That was their decision, and it doesn't mean it's the right thing to do. It doesn't mean Linux shouldn't change either, but name-calling isn't going to get you anywhere on this list. Compare your bittorrent server to Apache, which is probably the most widely-used server application in the world. It doesn't do what you're trying to do. 
See http://httpd.apache.org/docs/2.2/bind.html and/or browse the source code. >> So are you trying to get IPv4-only behavior out of this socket? Seems >> like the wrong way to go about it. > > Why would you say that? Because if you want IPv4-only you open an AF_INET socket. There is no equivalent to IPv6-only, for example when you open an AF_INET6 socket and set IPV6_ONLY on it. -Brian ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <20090317180840.GC13270@codeblau.de>]
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 [not found] ` <20090317180840.GC13270@codeblau.de> @ 2009-03-17 19:21 ` Brian Haley 2009-03-17 19:31 ` David Miller 0 siblings, 1 reply; 27+ messages in thread From: Brian Haley @ 2009-03-17 19:21 UTC (permalink / raw) To: Felix von Leitner; +Cc: netdev@vger.kernel.org Top-posting so others can see your off-list rant in full. I see no reason to help you any further, even though I did have a patch that would change this behavior for you. Good luck with your "biggest noncommercial Internet messaging infrastructure" in the world. -Brian Felix von Leitner wrote: >> Please show me a porting guide that even mentions supporting IPv4-only mode >> through an IPv6 socket by using this method. There is none that I know of. > > Are you kidding me? > A _porting guide_?!? > > If you are trying to troll me, you just succeeded. > > Now please make room so the adults can talk about the issue at hand > while you are putting up straw men. > >>> Did I mention *BSD and OSX allow this? >> That was their decision, and it doesn't mean it's the right thing to do. > > Riiiight. > > There is an old joke. The Joneses are driving on the freeway, when the > radio sounds a warning. "Warning! There is a car driving the wrong way > on the freeway!" Says grandpa (who is driving the car) "what do you > mean, one guy? Hundreds!!" > > Sometimes, if there are two ways to read something, and your users tell > you which way they want it, and the competition does it the way the > users want, and you don't, sometimes, in that case, YOU ARE WRONG. > > It's that easy. > > Hey, you have an hp.com email address. Why don't you check out how > HP-UX handles this. > >> Compare your bittorrent server to Apache, which is probably the most widely-used >> server application in the world. It doesn't do what you're trying to do. See >> http://httpd.apache.org/docs/2.2/bind.html and/or browse the source code. > > What is this supposed to be? 
Name dropping? > > I'm not impressed. > > And Apache never won any speed or scalability records. Just because > many people use Apache does not mean it's a good piece of software. You > know, many more people use Windows than Linux. That does not make > Windows the standard to follow. Hey, many people use sendmail! And > BIND! > > Felix > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 19:21 ` Brian Haley
@ 2009-03-17 19:31 ` David Miller
2009-03-17 21:05 ` Vlad Yasevich
` (5 more replies)
0 siblings, 6 replies; 27+ messages in thread
From: David Miller @ 2009-03-17 19:31 UTC (permalink / raw)
To: brian.haley; +Cc: felix-kernel, netdev

From: Brian Haley <brian.haley@hp.com>
Date: Tue, 17 Mar 2009 15:21:52 -0400

> Top-posting so others can see your off-list rant in full. I see no
> reason to help you any further, even though I did have a patch that
> would change this behavior for you. Good luck with your "biggest
> noncommercial Internet messaging infrastructure" in the world.

What a jerk. Brian, don't help him any more, you were being
very reasonable in your email to him. His response was way
out of line.

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 19:31 ` David Miller
@ 2009-03-17 21:05 ` Vlad Yasevich
2009-03-17 21:05 ` [RFC PATCH 1/4] ipv6: Disallow binding to v4-mapped address on v6-only socket Vlad Yasevich
` (4 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Vlad Yasevich @ 2009-03-17 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev

Hi David

Regardless of how we may feel about this thread, it did make me run
the BSD bindtest utility and look at the results. What I found was
rather surprising. There were multiple tests that one would expect
to succeed, but they were failing.

Things that I consider broken:
1) We can bind to a v4-mapped IPv6 address on a v6-only socket.
2) We conflict IPv4 wildcards with explicit IPv6 addresses and vice versa.
3) We inconsistently treat v4 addresses and v4-mapped addresses. As an
   example, try binding to 0.1.2.3. (This also kind of goes to binding
   ::ffff:0.0.0.0.)

The following 4 RFC patches attempt to fix this. I've run the bindtest
tool and am currently analyzing the results. They look a heck of a lot
better.

Thanks
-vlad

^ permalink raw reply [flat|nested] 27+ messages in thread
* [RFC PATCH 1/4] ipv6: Disallow binding to v4-mapped address on v6-only socket.
2009-03-17 19:31 ` David Miller
2009-03-17 21:05 ` Vlad Yasevich
@ 2009-03-17 21:05 ` Vlad Yasevich
2009-03-17 21:06 ` [RFC PATCH 2/4] ipv6: Allow ipv4 wildcard binds after ipv6 address binds Vlad Yasevich
` (3 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Vlad Yasevich @ 2009-03-17 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, Vlad Yasevich

A socket marked v6-only, can not receive or send traffic to v4-mapped
addresses. Thus allowing binding to v4-mapped address on such a
socket makes no sense.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/ipv6/af_inet6.c | 7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 3e2ddfa..07b9f3c 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -276,6 +276,13 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)

 	/* Check if the address belongs to the host. */
 	if (addr_type == IPV6_ADDR_MAPPED) {
+		/* Binding to v4-mapped address on a v6-only socket
+		 * makes no sense
+		 */
+		if (np->ipv6only) {
+			err = -EINVAL;
+			goto out;
+		}
 		v4addr = addr->sin6_addr.s6_addr32[3];
 		if (inet_addr_type(net, v4addr) != RTN_LOCAL) {
 			err = -EADDRNOTAVAIL;
--
1.5.4.3

^ permalink raw reply related [flat|nested] 27+ messages in thread
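The case this patch covers can be observed from user space with a small probe that sets IPV6_V6ONLY and then attempts the bind. The following is a minimal sketch under stated assumptions, not a test from the thread: the function name probe_bind6 is hypothetical, and it binds to port 0 so the kernel picks a free port. Called with v6only set to 0 and the address ::ffff:0.0.0.0 it repeats the bind from the original report; called with v6only set to 1 it exercises the check added above. The result depends on the running kernel, so no expected value is given here.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical probe: returns 0 if bind() succeeds, the errno value if
 * bind() fails, and -1 if the address text is invalid or the socket
 * could not be set up at all. */
int probe_bind6(const char *addr_text, int v6only)
{
	struct sockaddr_in6 sa;
	int fd;
	int result = 0;

	assert(addr_text != NULL);

	memset(&sa, 0, sizeof(sa));
	sa.sin6_family = AF_INET6;
	sa.sin6_port = 0;	/* port 0: let the kernel choose */
	if (inet_pton(AF_INET6, addr_text, &sa.sin6_addr) != 1)
		return -1;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
		       &v6only, sizeof(v6only)) < 0) {
		close(fd);
		return -1;
	}

	/* Save errno before close() has a chance to overwrite it. */
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
		result = errno;

	close(fd);
	return result;
}
```

A caller could compare the return value against EINVAL and EADDRNOTAVAIL to tell which of the two refusal paths in inet6_bind() was taken.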
* [RFC PATCH 2/4] ipv6: Allow ipv4 wildcard binds after ipv6 address binds
2009-03-17 19:31 ` David Miller
2009-03-17 21:05 ` Vlad Yasevich
2009-03-17 21:05 ` [RFC PATCH 1/4] ipv6: Disallow binding to v4-mapped address on v6-only socket Vlad Yasevich
@ 2009-03-17 21:06 ` Vlad Yasevich
2009-03-17 21:06 ` [RFC PATCH 3/4] ipv6: Make v4-mapped bindings consitant with IPv4 Vlad Yasevich
` (2 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Vlad Yasevich @ 2009-03-17 21:06 UTC (permalink / raw)
To: davem; +Cc: netdev, Vlad Yasevich

The IPv4 wildcard (0.0.0.0) address does not intersect in any way with
explicit IPv6 addresses. These two should be permitted, but the IPv4
conflict code checks the ipv6only bit as part of the test. Since
binding to an explicit IPv6 address restricts the socket to only that
IPv6 address, the side-effect is that the socket behaves as v6-only.
Explicitly setting ipv6only in this case allows the two binds to
succeed.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/ipv6/af_inet6.c | 5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 07b9f3c..0adce8e 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -346,8 +346,11 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 		goto out;
 	}

-	if (addr_type != IPV6_ADDR_ANY)
+	if (addr_type != IPV6_ADDR_ANY) {
 		sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
+		if (addr_type != IPV6_ADDR_MAPPED)
+			np->ipv6only = 1;
+	}
 	if (snum)
 		sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
 	inet->sport = htons(inet->num);
--
1.5.4.3

^ permalink raw reply related [flat|nested] 27+ messages in thread
* [RFC PATCH 3/4] ipv6: Make v4-mapped bindings consitant with IPv4
2009-03-17 19:31 ` David Miller
` (2 preceding siblings ...)
2009-03-17 21:06 ` [RFC PATCH 2/4] ipv6: Allow ipv4 wildcard binds after ipv6 address binds Vlad Yasevich
@ 2009-03-17 21:06 ` Vlad Yasevich
2009-03-17 21:06 ` [RFC PATCH 4/4] ipv6: Fix conflict resolutions during ipv6 binding Vlad Yasevich
2009-03-18 9:13 ` socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Jarek Poplawski
5 siblings, 0 replies; 27+ messages in thread
From: Vlad Yasevich @ 2009-03-17 21:06 UTC (permalink / raw)
To: davem; +Cc: netdev, Vlad Yasevich

Binding to a v4-mapped address on an AF_INET6 socket should
produce the same result as binding to an IPv4 address on
an AF_INET socket. The two are interchangeable, as a v4-mapped
address is really a portability aid.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/ipv6/af_inet6.c | 14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 0adce8e..274cc89 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -276,6 +276,8 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)

 	/* Check if the address belongs to the host. */
 	if (addr_type == IPV6_ADDR_MAPPED) {
+		int chk_addr_ret;
+
 		/* Binding to v4-mapped address on a v6-only socket
 		 * makes no sense
 		 */
@@ -283,11 +285,17 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 			err = -EINVAL;
 			goto out;
 		}
+
+		/* Reproduce AF_INET checks to make the bindings consitant */
 		v4addr = addr->sin6_addr.s6_addr32[3];
-		if (inet_addr_type(net, v4addr) != RTN_LOCAL) {
-			err = -EADDRNOTAVAIL;
+		chk_addr_ret = inet_addr_type(net, v4addr);
+		if (!sysctl_ip_nonlocal_bind &&
+		    !(inet->freebind || inet->transparent) &&
+		    v4addr != htonl(INADDR_ANY) &&
+		    chk_addr_ret != RTN_LOCAL &&
+		    chk_addr_ret != RTN_MULTICAST &&
+		    chk_addr_ret != RTN_BROADCAST)
 			goto out;
-		}
 	} else {
 		if (addr_type != IPV6_ADDR_ANY) {
 			struct net_device *dev = NULL;
--
1.5.4.3

^ permalink raw reply related [flat|nested] 27+ messages in thread
* [RFC PATCH 4/4] ipv6: Fix conflict resolutions during ipv6 binding 2009-03-17 19:31 ` David Miller ` (3 preceding siblings ...) 2009-03-17 21:06 ` [RFC PATCH 3/4] ipv6: Make v4-mapped bindings consitant with IPv4 Vlad Yasevich @ 2009-03-17 21:06 ` Vlad Yasevich 2009-03-18 9:13 ` socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Jarek Poplawski 5 siblings, 0 replies; 27+ messages in thread From: Vlad Yasevich @ 2009-03-17 21:06 UTC (permalink / raw) To: davem; +Cc: netdev, Vlad Yasevich The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal() which at times wrongly identified intersections between addresses. It particularly broke down under a few instances and caused erroneouse bind conflicts. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> --- include/net/addrconf.h | 4 ++-- include/net/udp.h | 2 ++ net/ipv4/udp.c | 3 ++- net/ipv6/addrconf.c | 34 ---------------------------------- net/ipv6/udp.c | 30 ++++++++++++++++++++++++++++++ 5 files changed, 36 insertions(+), 37 deletions(-) diff --git a/include/net/addrconf.h b/include/net/addrconf.h index c216de5..7b55ab2 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -88,8 +88,8 @@ extern int ipv6_dev_get_saddr(struct net *net, extern int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr, unsigned char banned_flags); -extern int ipv6_rcv_saddr_equal(const struct sock *sk, - const struct sock *sk2); +extern int ipv6_rcv_saddr_equal(const struct sock *sk, + const struct sock *sk2); extern void addrconf_join_solict(struct net_device *dev, struct in6_addr *addr); extern void addrconf_leave_solict(struct inet6_dev *idev, diff --git a/include/net/udp.h b/include/net/udp.h index 90e6ce5..93dbe29 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -124,6 +124,8 @@ static inline void udp_lib_close(struct sock *sk, long timeout) sk_common_release(sk); } +extern int ipv4_rcv_saddr_equal(const struct sock *sk1, + const struct sock *sk2); extern int 
udp_lib_get_port(struct sock *sk, unsigned short snum, int (*)(const struct sock*,const struct sock*)); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 4bd178a..ce64e4d 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -222,7 +222,7 @@ fail: return error; } -static int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) +int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) { struct inet_sock *inet1 = inet_sk(sk1), *inet2 = inet_sk(sk2); @@ -1819,6 +1819,7 @@ EXPORT_SYMBOL(udp_lib_getsockopt); EXPORT_SYMBOL(udp_lib_setsockopt); EXPORT_SYMBOL(udp_poll); EXPORT_SYMBOL(udp_lib_get_port); +EXPORT_SYMBOL(ipv4_rcv_saddr_equal); #ifdef CONFIG_PROC_FS EXPORT_SYMBOL(udp_proc_register); diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index e83852a..00a37c1 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1367,40 +1367,6 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net, const struct in6_addr *add return ifp; } -int ipv6_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2) -{ - const struct in6_addr *sk_rcv_saddr6 = &inet6_sk(sk)->rcv_saddr; - const struct in6_addr *sk2_rcv_saddr6 = inet6_rcv_saddr(sk2); - __be32 sk_rcv_saddr = inet_sk(sk)->rcv_saddr; - __be32 sk2_rcv_saddr = inet_rcv_saddr(sk2); - int sk_ipv6only = ipv6_only_sock(sk); - int sk2_ipv6only = inet_v6_ipv6only(sk2); - int addr_type = ipv6_addr_type(sk_rcv_saddr6); - int addr_type2 = sk2_rcv_saddr6 ? 
ipv6_addr_type(sk2_rcv_saddr6) : IPV6_ADDR_MAPPED; - - if (!sk2_rcv_saddr && !sk_ipv6only) - return 1; - - if (addr_type2 == IPV6_ADDR_ANY && - !(sk2_ipv6only && addr_type == IPV6_ADDR_MAPPED)) - return 1; - - if (addr_type == IPV6_ADDR_ANY && - !(sk_ipv6only && addr_type2 == IPV6_ADDR_MAPPED)) - return 1; - - if (sk2_rcv_saddr6 && - ipv6_addr_equal(sk_rcv_saddr6, sk2_rcv_saddr6)) - return 1; - - if (addr_type == IPV6_ADDR_MAPPED && - !sk2_ipv6only && - (!sk2_rcv_saddr || !sk_rcv_saddr || sk_rcv_saddr == sk2_rcv_saddr)) - return 1; - - return 0; -} - /* Gets referenced address, destroys ifaddr */ static void addrconf_dad_stop(struct inet6_ifaddr *ifp) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 84b1a29..7e45761 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -49,6 +49,36 @@ #include <linux/seq_file.h> #include "udp_impl.h" +int ipv6_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2) +{ + const struct in6_addr *sk_rcv_saddr6 = &inet6_sk(sk)->rcv_saddr; + const struct in6_addr *sk2_rcv_saddr6 = inet6_rcv_saddr(sk2); + __be32 sk_rcv_saddr = inet_sk(sk)->rcv_saddr; + __be32 sk2_rcv_saddr = inet_rcv_saddr(sk2); + int sk_ipv6only = ipv6_only_sock(sk); + int sk2_ipv6only = inet_v6_ipv6only(sk2); + int addr_type = ipv6_addr_type(sk_rcv_saddr6); + int addr_type2 = sk2_rcv_saddr6 ? 
ipv6_addr_type(sk2_rcv_saddr6) : IPV6_ADDR_MAPPED; + + /* if both are mapped, treat as IPv4 */ + if (addr_type == IPV6_ADDR_MAPPED && addr_type2 == IPV6_ADDR_MAPPED) + return ipv4_rcv_saddr_equal(sk, sk2); + + if (addr_type2 == IPV6_ADDR_ANY && + !(sk2_ipv6only && addr_type == IPV6_ADDR_MAPPED)) + return 1; + + if (addr_type == IPV6_ADDR_ANY && + !(sk_ipv6only && addr_type2 == IPV6_ADDR_MAPPED)) + return 1; + + if (sk2_rcv_saddr6 && + ipv6_addr_equal(sk_rcv_saddr6, sk2_rcv_saddr6)) + return 1; + + return 0; +} + int udp_v6_get_port(struct sock *sk, unsigned short snum) { return udp_lib_get_port(sk, snum, ipv6_rcv_saddr_equal); -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
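The decision order of the rewritten ipv6_rcv_saddr_equal() can be tried out in user space with a small model. This is a sketch under stated assumptions, not kernel code: struct model_sock and all model_* names are hypothetical, a plain AF_INET socket is modelled by storing its address in v4-mapped form (mirroring how the patch treats a missing IPv6 receive address as IPV6_ADDR_MAPPED), and the body of the IPv4 comparison is an assumption because ipv4_rcv_saddr_equal() is not shown in the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

enum model_kind { MODEL_ANY, MODEL_MAPPED, MODEL_OTHER };

/* Hypothetical stand-in for struct sock: only what the comparison reads. */
struct model_sock {
	struct in6_addr rcv_saddr;	/* AF_INET sockets: v4-mapped form */
	int ipv6only;
};

static enum model_kind model_kind_of(const struct in6_addr *a)
{
	if (IN6_IS_ADDR_UNSPECIFIED(a))
		return MODEL_ANY;
	if (IN6_IS_ADDR_V4MAPPED(a))
		return MODEL_MAPPED;
	return MODEL_OTHER;
}

/* Embedded IPv4 address (network byte order) of a v4-mapped address. */
static uint32_t model_v4_of(const struct in6_addr *a)
{
	uint32_t v4;

	memcpy(&v4, &a->s6_addr[12], sizeof(v4));
	return v4;
}

/* ASSUMPTION: two IPv4 receive addresses overlap when the second socket
 * is not v6-only and either address is the wildcard or both are equal. */
static int model_v4_equal(const struct model_sock *sk,
			  const struct model_sock *sk2)
{
	uint32_t a = model_v4_of(&sk->rcv_saddr);
	uint32_t b = model_v4_of(&sk2->rcv_saddr);

	return !sk2->ipv6only && (a == 0 || b == 0 || a == b);
}

/* Same decision order as the new function in the patch. */
int model_rcv_saddr_equal(const struct model_sock *sk,
			  const struct model_sock *sk2)
{
	enum model_kind t, t2;

	assert(sk != NULL && sk2 != NULL);
	t = model_kind_of(&sk->rcv_saddr);
	t2 = model_kind_of(&sk2->rcv_saddr);

	/* if both are mapped, treat as IPv4 */
	if (t == MODEL_MAPPED && t2 == MODEL_MAPPED)
		return model_v4_equal(sk, sk2);
	if (t2 == MODEL_ANY && !(sk2->ipv6only && t == MODEL_MAPPED))
		return 1;
	if (t == MODEL_ANY && !(sk->ipv6only && t2 == MODEL_MAPPED))
		return 1;
	if (memcmp(&sk->rcv_saddr, &sk2->rcv_saddr,
		   sizeof(sk->rcv_saddr)) == 0)
		return 1;
	return 0;
}

/* Convenience constructor for experiments; text must be a valid IPv6
 * literal such as "::", "::ffff:127.0.0.1" or "2001:db8::1". */
struct model_sock model_sock_from(const char *text, int ipv6only)
{
	struct model_sock s;
	int ok;

	memset(&s, 0, sizeof(s));
	ok = inet_pton(AF_INET6, text, &s.rcv_saddr);
	assert(ok == 1);
	(void)ok;
	s.ipv6only = ipv6only;
	return s;
}
```

In this model a socket on ::ffff:0.0.0.0 no longer conflicts with one bound to an explicit IPv6 address such as 2001:db8::1, while two v4-mapped sockets are compared the way two IPv4 sockets would be.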
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-17 19:31 ` David Miller
` (4 preceding siblings ...)
2009-03-17 21:06 ` [RFC PATCH 4/4] ipv6: Fix conflict resolutions during ipv6 binding Vlad Yasevich
@ 2009-03-18 9:13 ` Jarek Poplawski
2009-03-18 21:36 ` David Miller
5 siblings, 1 reply; 27+ messages in thread
From: Jarek Poplawski @ 2009-03-18 9:13 UTC (permalink / raw)
To: David Miller; +Cc: brian.haley, felix-kernel, netdev

On 17-03-2009 20:31, David Miller wrote:
> From: Brian Haley <brian.haley@hp.com>
> Date: Tue, 17 Mar 2009 15:21:52 -0400
>
>> Top-posting so others can see your off-list rant in full. I see no
>> reason to help you any further, even though I did have a patch that
>> would change this behavior for you. Good luck with your "biggest
>> noncommercial Internet messaging infrastructure" in the world.
>
> What a jerk. Brian, don't help him any more, you were being
> very reasonable in your email to him. His response was way
> out of line.

Do you mean he got that joke wrong? Otherwise I think he is right. We
shouldn't advise him how to do things right, but, since what he
wants looks legal and is accepted elsewhere, try to do this in the
least invasive way.

Jarek P.

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-18 9:13 ` socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Jarek Poplawski
@ 2009-03-18 21:36 ` David Miller
2009-03-18 21:53 ` Jarek Poplawski
0 siblings, 1 reply; 27+ messages in thread
From: David Miller @ 2009-03-18 21:36 UTC (permalink / raw)
To: jarkao2; +Cc: brian.haley, felix-kernel, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Wed, 18 Mar 2009 09:13:07 +0000

> On 17-03-2009 20:31, David Miller wrote:
> > From: Brian Haley <brian.haley@hp.com>
> > Date: Tue, 17 Mar 2009 15:21:52 -0400
> >
> >> Top-posting so others can see your off-list rant in full. I see no
> >> reason to help you any further, even though I did have a patch that
> >> would change this behavior for you. Good luck with your "biggest
> >> noncommercial Internet messaging infrastructure" in the world.
> >
> > What a jerk. Brian, don't help him any more, you were being
> > very reasonable in your email to him. His response was way
> > out of line.
>
> Do you mean he got that joke wrong? Otherwise I think he is right. We
> shouldn't advise him how to do things right, but, since what he
> wants looks legal and is accepted elsewhere, try to do this in the
> least invasive way.

First of all, no matter if we allow that kind of bind() he wants
or not, he cannot use it in his application unless he wants his
application to be useless on most people's machines for at least
a year.

That's why the "make Linux be compatible with X other systems"
is always a joke argument. Application wise, one still has
to be compatible with all existing Linux systems, which is a
much larger issue.

And yes we should advise people what is an appropriate way to
accomplish some task. If we aren't the experts on such a topic,
then who the hell is?

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-18 21:36 ` David Miller
@ 2009-03-18 21:53 ` Jarek Poplawski
2009-03-19 0:32 ` David Miller
0 siblings, 1 reply; 27+ messages in thread
From: Jarek Poplawski @ 2009-03-18 21:53 UTC (permalink / raw)
To: David Miller; +Cc: brian.haley, felix-kernel, netdev

On Wed, Mar 18, 2009 at 02:36:35PM -0700, David Miller wrote:
...
> And yes we should advise people what is an appropriate way to
> accomplish some task. If we aren't the experts on such a topic,
> then who the hell is?

Only if somebody is looking for advice; otherwise it's not very nice,
especially if repeated many times.

Jarek P.

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-18 21:53 ` Jarek Poplawski
@ 2009-03-19 0:32 ` David Miller
0 siblings, 0 replies; 27+ messages in thread
From: David Miller @ 2009-03-19 0:32 UTC (permalink / raw)
To: jarkao2; +Cc: brian.haley, felix-kernel, netdev

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Wed, 18 Mar 2009 22:53:00 +0100

> On Wed, Mar 18, 2009 at 02:36:35PM -0700, David Miller wrote:
> ...
> > And yes we should advise people what is an appropriate way to
> > accomplish some task. If we aren't the experts on such a topic,
> > then who the hell is?
>
> Only if somebody is looking for advice; otherwise it's not very nice,
> especially if repeated many times.

If the purpose of the query was to suggest that Linux should
behave a certain way, it should be no surprise to anyone that,
if we disagree with that suggestion, we would suggest
what we consider more desirable alternatives for the application
developer.

I don't even think this is worth the time we are spending
to discuss it, it seems so straightforward.

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0
2009-03-16 23:48 socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Felix von Leitner
2009-03-17 0:00 ` Stephen Hemminger
2009-03-17 2:26 ` Brian Haley
@ 2009-03-17 9:03 ` Bjørn Mork
2 siblings, 0 replies; 27+ messages in thread
From: Bjørn Mork @ 2009-03-17 9:03 UTC (permalink / raw)
To: netdev

Felix von Leitner <felix-kernel@fefe.de> writes:

> bind(3, {sa_family=AF_INET6, sin6_port=htons(6969), inet_pton(AF_INET6, "::ffff:0.0.0.0", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>
> This is supposed to work, and it works on other operating systems, even
> on Mac OS X.
>
> I think it used to work on Linux, too.

You can find test results for a number of (older) OSes here:
http://www.kame.net/newsletter/20010504/

You'll probably have to refer to the bindtest man page to interpret
the results: http://www.jinmei.org/bindtest-man.txt

Bjørn

^ permalink raw reply [flat|nested] 27+ messages in thread
Thread overview: 27+ messages
2009-03-16 23:48 socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Felix von Leitner
2009-03-17 0:00 ` Stephen Hemminger
2009-03-17 0:18 ` Felix von Leitner
2009-03-17 2:26 ` Brian Haley
2009-03-17 2:47 ` Eric Dumazet
2009-03-17 8:51 ` Bjørn Mork
2009-03-17 16:00 ` Brian Haley
2009-03-17 12:58 ` Felix von Leitner
2009-03-17 13:47 ` Vlad Yasevich
2009-03-17 14:14 ` Felix von Leitner
2009-03-17 14:57 ` Vlad Yasevich
2009-03-17 17:51 ` Felix von Leitner
2009-03-17 15:21 ` Eric Dumazet
2009-03-17 18:01 ` Felix von Leitner
2009-03-17 15:59 ` Brian Haley
[not found] ` <20090317180840.GC13270@codeblau.de>
2009-03-17 19:21 ` Brian Haley
2009-03-17 19:31 ` David Miller
2009-03-17 21:05 ` Vlad Yasevich
2009-03-17 21:05 ` [RFC PATCH 1/4] ipv6: Disallow binding to v4-mapped address on v6-only socket Vlad Yasevich
2009-03-17 21:06 ` [RFC PATCH 2/4] ipv6: Allow ipv4 wildcard binds after ipv6 address binds Vlad Yasevich
2009-03-17 21:06 ` [RFC PATCH 3/4] ipv6: Make v4-mapped bindings consitant with IPv4 Vlad Yasevich
2009-03-17 21:06 ` [RFC PATCH 4/4] ipv6: Fix conflict resolutions during ipv6 binding Vlad Yasevich
2009-03-18 9:13 ` socket api problem: can't bind an ipv6 socket to ::ffff:0.0.0.0 Jarek Poplawski
2009-03-18 21:36 ` David Miller
2009-03-18 21:53 ` Jarek Poplawski
2009-03-19 0:32 ` David Miller
2009-03-17 9:03 ` Bjørn Mork