* [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address
@ 2023-09-11 18:36 Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Kuniyuki Iwashima
` (7 more replies)
0 siblings, 8 replies; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:36 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Since bhash2 was introduced, bind() is broken in two cases related
to v4-mapped-v6 address.
This series fixes the regression and adds test to cover the cases.
Changes:
v2:
* Added patch 1 to factorise duplicated comparison (Eric Dumazet)
v1: https://lore.kernel.org/netdev/20230911165106.39384-1-kuniyu@amazon.com/
Kuniyuki Iwashima (6):
tcp: Factorise sk_family-independent comparison in
inet_bind2_bucket_match(_addr_any).
tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
selftest: tcp: Fix address length in bind_wildcard.c.
selftest: tcp: Move expected_errno into each test case in
bind_wildcard.c.
selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.
include/net/ipv6.h | 5 ++
net/ipv4/inet_hashtables.c | 36 ++++++-----
tools/testing/selftests/net/bind_wildcard.c | 68 +++++++++++++++++----
3 files changed, 82 insertions(+), 27 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
@ 2023-09-11 18:36 ` Kuniyuki Iwashima
2023-09-12 4:48 ` Eric Dumazet
2023-09-11 18:36 ` [PATCH v2 net 2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Kuniyuki Iwashima
` (6 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:36 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
This is a prep patch to make the following patches cleaner that touch
inet_bind2_bucket_match() and inet_bind2_bucket_match_addr_any().
Both functions have duplicated comparison for netns, port, and l3mdev.
Let's factorise them.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/ipv4/inet_hashtables.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 7876b7d703cb..5c54f2804174 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -815,41 +815,39 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
const struct net *net, unsigned short port,
int l3mdev, const struct sock *sk)
{
+ if (!net_eq(ib2_net(tb), net) || tb->port != port ||
+ tb->l3mdev != l3mdev)
+ return false;
+
#if IS_ENABLED(CONFIG_IPV6)
if (sk->sk_family != tb->family)
return false;
if (sk->sk_family == AF_INET6)
- return net_eq(ib2_net(tb), net) && tb->port == port &&
- tb->l3mdev == l3mdev &&
- ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
- else
+ return ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
#endif
- return net_eq(ib2_net(tb), net) && tb->port == port &&
- tb->l3mdev == l3mdev && tb->rcv_saddr == sk->sk_rcv_saddr;
+ return tb->rcv_saddr == sk->sk_rcv_saddr;
}
bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const struct net *net,
unsigned short port, int l3mdev, const struct sock *sk)
{
+ if (!net_eq(ib2_net(tb), net) || tb->port != port ||
+ tb->l3mdev != l3mdev)
+ return false;
+
#if IS_ENABLED(CONFIG_IPV6)
if (sk->sk_family != tb->family) {
if (sk->sk_family == AF_INET)
- return net_eq(ib2_net(tb), net) && tb->port == port &&
- tb->l3mdev == l3mdev &&
- ipv6_addr_any(&tb->v6_rcv_saddr);
+ return ipv6_addr_any(&tb->v6_rcv_saddr);
return false;
}
if (sk->sk_family == AF_INET6)
- return net_eq(ib2_net(tb), net) && tb->port == port &&
- tb->l3mdev == l3mdev &&
- ipv6_addr_any(&tb->v6_rcv_saddr);
- else
+ return ipv6_addr_any(&tb->v6_rcv_saddr);
#endif
- return net_eq(ib2_net(tb), net) && tb->port == port &&
- tb->l3mdev == l3mdev && tb->rcv_saddr == 0;
+ return tb->rcv_saddr == 0;
}
/* The socket's bhash2 hashbucket spinlock must be held when this is called */
--
2.30.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 net 2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Kuniyuki Iwashima
@ 2023-09-11 18:36 ` Kuniyuki Iwashima
2023-09-12 4:58 ` Eric Dumazet
2023-09-11 18:36 ` [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Kuniyuki Iwashima
` (5 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:36 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev,
Andrei Vagin
Andrei Vagin reported bind() regression with strace logs.
If we bind() a TCPv6 socket to ::FFFF:0.0.0.0 and then bind() a TCPv4
socket to 127.0.0.1, the 2nd bind() should fail but now succeeds.
from socket import *
s1 = socket(AF_INET6, SOCK_STREAM)
s1.bind(('::ffff:0.0.0.0', 0))
s2 = socket(AF_INET, SOCK_STREAM)
s2.bind(('127.0.0.1', s1.getsockname()[1]))
During the 2nd bind(), if tb->family is AF_INET6 and sk->sk_family is
AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check
if tb has the v4-mapped-v6 wildcard address.
The example above does not work after commit 5456262d2baa ("net: Fix
incorrect address comparison when searching for a bind2 bucket"), but
the blamed change is not the commit.
Before the commit, the leading zeros of ::FFFF:0.0.0.0 were treated
as 0.0.0.0, and the sequence above worked by chance. Technically, this
case has been broken since bhash2 was introduced.
Note that if we bind() two sockets to 127.0.0.1 and then ::FFFF:0.0.0.0,
the 2nd bind() fails properly because we fall back to using bhash to
detect conflicts for the v4-mapped-v6 address.
Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Reported-by: Andrei Vagin <avagin@google.com>
Closes: https://lore.kernel.org/netdev/ZPuYBOFC8zsK6r9T@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
include/net/ipv6.h | 5 +++++
net/ipv4/inet_hashtables.c | 3 ++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 0675be0f3fa0..56d8217ea6cf 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -784,6 +784,11 @@ static inline bool ipv6_addr_v4mapped(const struct in6_addr *a)
cpu_to_be32(0x0000ffff))) == 0UL;
}
+static inline bool ipv6_addr_v4mapped_any(const struct in6_addr *a)
+{
+ return ipv6_addr_v4mapped(a) && ipv4_is_zeronet(a->s6_addr32[3]);
+}
+
static inline bool ipv6_addr_v4mapped_loopback(const struct in6_addr *a)
{
return ipv6_addr_v4mapped(a) && ipv4_is_loopback(a->s6_addr32[3]);
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 5c54f2804174..a58b04052ca6 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -839,7 +839,8 @@ bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const
#if IS_ENABLED(CONFIG_IPV6)
if (sk->sk_family != tb->family) {
if (sk->sk_family == AF_INET)
- return ipv6_addr_any(&tb->v6_rcv_saddr);
+ return ipv6_addr_any(&tb->v6_rcv_saddr) ||
+ ipv6_addr_v4mapped_any(&tb->v6_rcv_saddr);
return false;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Kuniyuki Iwashima
@ 2023-09-11 18:36 ` Kuniyuki Iwashima
2023-09-12 4:59 ` Eric Dumazet
2023-09-12 7:25 ` Andrei Vagin
2023-09-11 18:36 ` [PATCH v2 net 4/6] selftest: tcp: Fix address length in bind_wildcard.c Kuniyuki Iwashima
` (4 subsequent siblings)
7 siblings, 2 replies; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:36 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Since bhash2 was introduced, the example below does not work as expected.
These two bind() should conflict, but the 2nd bind() now succeeds.
from socket import *
s1 = socket(AF_INET6, SOCK_STREAM)
s1.bind(('::ffff:127.0.0.1', 0))
s2 = socket(AF_INET, SOCK_STREAM)
s2.bind(('127.0.0.1', s1.getsockname()[1]))
During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
a new tb2 for the 2nd socket. Then, we call inet_csk_bind_conflict() that
checks conflicts in the new tb2 by inet_bhash2_conflict(). However, the
new tb2 does not include the 1st socket, thus the bind() finally succeeds.
In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
returns the 1st socket's tb2.
Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
the 2nd bind() fails properly for the same reason mentinoed in the previous
commit.
Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/ipv4/inet_hashtables.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index a58b04052ca6..c32f5e28758b 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -820,8 +820,13 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
return false;
#if IS_ENABLED(CONFIG_IPV6)
- if (sk->sk_family != tb->family)
+ if (sk->sk_family != tb->family) {
+ if (sk->sk_family == AF_INET)
+ return ipv6_addr_v4mapped(&tb->v6_rcv_saddr) &&
+ tb->v6_rcv_saddr.s6_addr32[3] == sk->sk_rcv_saddr;
+
return false;
+ }
if (sk->sk_family == AF_INET6)
return ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
--
2.30.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 net 4/6] selftest: tcp: Fix address length in bind_wildcard.c.
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
` (2 preceding siblings ...)
2023-09-11 18:36 ` [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Kuniyuki Iwashima
@ 2023-09-11 18:36 ` Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 5/6] selftest: tcp: Move expected_errno into each test case " Kuniyuki Iwashima
` (3 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:36 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
The selftest passes the IPv6 address length for an IPv4 address.
We should pass the correct length.
Note inet_bind_sk() does not check if the size is larger than
sizeof(struct sockaddr_in), so there is no real bug in this
selftest.
Fixes: 13715acf8ab5 ("selftest: Add test for bind() conflicts.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
tools/testing/selftests/net/bind_wildcard.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/bind_wildcard.c b/tools/testing/selftests/net/bind_wildcard.c
index 58edfc15d28b..e7ebe72e879d 100644
--- a/tools/testing/selftests/net/bind_wildcard.c
+++ b/tools/testing/selftests/net/bind_wildcard.c
@@ -100,7 +100,7 @@ void bind_sockets(struct __test_metadata *_metadata,
TEST_F(bind_wildcard, v4_v6)
{
bind_sockets(_metadata, self,
- (struct sockaddr *)&self->addr4, sizeof(self->addr6),
+ (struct sockaddr *)&self->addr4, sizeof(self->addr4),
(struct sockaddr *)&self->addr6, sizeof(self->addr6));
}
--
2.30.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 net 5/6] selftest: tcp: Move expected_errno into each test case in bind_wildcard.c.
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
` (3 preceding siblings ...)
2023-09-11 18:36 ` [PATCH v2 net 4/6] selftest: tcp: Fix address length in bind_wildcard.c Kuniyuki Iwashima
@ 2023-09-11 18:36 ` Kuniyuki Iwashima
2023-09-11 18:37 ` [PATCH v2 net 6/6] selftest: tcp: Add v4-mapped-v6 cases " Kuniyuki Iwashima
` (2 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:36 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
This is a preparation patch for the following patch.
Let's define expected_errno in each test case so that we can add other test
cases easily.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
tools/testing/selftests/net/bind_wildcard.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/net/bind_wildcard.c b/tools/testing/selftests/net/bind_wildcard.c
index e7ebe72e879d..81f694536099 100644
--- a/tools/testing/selftests/net/bind_wildcard.c
+++ b/tools/testing/selftests/net/bind_wildcard.c
@@ -10,37 +10,41 @@ FIXTURE(bind_wildcard)
{
struct sockaddr_in addr4;
struct sockaddr_in6 addr6;
- int expected_errno;
};
FIXTURE_VARIANT(bind_wildcard)
{
const __u32 addr4_const;
const struct in6_addr *addr6_const;
+ int expected_errno;
};
FIXTURE_VARIANT_ADD(bind_wildcard, v4_any_v6_any)
{
.addr4_const = INADDR_ANY,
.addr6_const = &in6addr_any,
+ .expected_errno = EADDRINUSE,
};
FIXTURE_VARIANT_ADD(bind_wildcard, v4_any_v6_local)
{
.addr4_const = INADDR_ANY,
.addr6_const = &in6addr_loopback,
+ .expected_errno = 0,
};
FIXTURE_VARIANT_ADD(bind_wildcard, v4_local_v6_any)
{
.addr4_const = INADDR_LOOPBACK,
.addr6_const = &in6addr_any,
+ .expected_errno = EADDRINUSE,
};
FIXTURE_VARIANT_ADD(bind_wildcard, v4_local_v6_local)
{
.addr4_const = INADDR_LOOPBACK,
.addr6_const = &in6addr_loopback,
+ .expected_errno = 0,
};
FIXTURE_SETUP(bind_wildcard)
@@ -52,11 +56,6 @@ FIXTURE_SETUP(bind_wildcard)
self->addr6.sin6_family = AF_INET6;
self->addr6.sin6_port = htons(0);
self->addr6.sin6_addr = *variant->addr6_const;
-
- if (variant->addr6_const == &in6addr_any)
- self->expected_errno = EADDRINUSE;
- else
- self->expected_errno = 0;
}
FIXTURE_TEARDOWN(bind_wildcard)
@@ -65,6 +64,7 @@ FIXTURE_TEARDOWN(bind_wildcard)
void bind_sockets(struct __test_metadata *_metadata,
FIXTURE_DATA(bind_wildcard) *self,
+ int expected_errno,
struct sockaddr *addr1, socklen_t addrlen1,
struct sockaddr *addr2, socklen_t addrlen2)
{
@@ -86,9 +86,9 @@ void bind_sockets(struct __test_metadata *_metadata,
ASSERT_GT(fd[1], 0);
ret = bind(fd[1], addr2, addrlen2);
- if (self->expected_errno) {
+ if (expected_errno) {
ASSERT_EQ(ret, -1);
- ASSERT_EQ(errno, self->expected_errno);
+ ASSERT_EQ(errno, expected_errno);
} else {
ASSERT_EQ(ret, 0);
}
@@ -99,14 +99,14 @@ void bind_sockets(struct __test_metadata *_metadata,
TEST_F(bind_wildcard, v4_v6)
{
- bind_sockets(_metadata, self,
+ bind_sockets(_metadata, self, variant->expected_errno,
(struct sockaddr *)&self->addr4, sizeof(self->addr4),
(struct sockaddr *)&self->addr6, sizeof(self->addr6));
}
TEST_F(bind_wildcard, v6_v4)
{
- bind_sockets(_metadata, self,
+ bind_sockets(_metadata, self, variant->expected_errno,
(struct sockaddr *)&self->addr6, sizeof(self->addr6),
(struct sockaddr *)&self->addr4, sizeof(self->addr4));
}
--
2.30.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 net 6/6] selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
` (4 preceding siblings ...)
2023-09-11 18:36 ` [PATCH v2 net 5/6] selftest: tcp: Move expected_errno into each test case " Kuniyuki Iwashima
@ 2023-09-11 18:37 ` Kuniyuki Iwashima
2023-09-11 18:54 ` [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Martin KaFai Lau
2023-09-13 6:20 ` patchwork-bot+netdevbpf
7 siblings, 0 replies; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-11 18:37 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern
Cc: Joanne Koong, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We add these 8 test cases in bind_wildcard.c to check bind() conflicts.
1st bind() 2nd bind()
--------- ---------
0.0.0.0 ::FFFF:0.0.0.0
::FFFF:0.0.0.0 0.0.0.0
0.0.0.0 ::FFFF:127.0.0.1
::FFFF:127.0.0.1 0.0.0.0
127.0.0.1 ::FFFF:0.0.0.0
::FFFF:0.0.0.0 127.0.0.1
127.0.0.1 ::FFFF:127.0.0.1
::FFFF:127.0.0.1 127.0.0.1
All test passed without bhash2 and with bhash2 and this series.
Before bhash2:
$ uname -r
6.0.0-rc1-00393-g0bf73255d3a3
$ ./bind_wildcard
...
# PASSED: 16 / 16 tests passed.
Just after bhash2:
$ uname -r
6.0.0-rc1-00394-g28044fc1d495
$ ./bind_wildcard
...
ok 15 bind_wildcard.v4_local_v6_v4mapped_local.v4_v6
not ok 16 bind_wildcard.v4_local_v6_v4mapped_local.v6_v4
# FAILED: 15 / 16 tests passed.
On net.git:
$ ./bind_wildcard
...
not ok 14 bind_wildcard.v4_local_v6_v4mapped_any.v6_v4
not ok 16 bind_wildcard.v4_local_v6_v4mapped_local.v6_v4
# FAILED: 13 / 16 tests passed.
With this series:
$ ./bind_wildcard
...
# PASSED: 16 / 16 tests passed.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
tools/testing/selftests/net/bind_wildcard.c | 46 +++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/tools/testing/selftests/net/bind_wildcard.c b/tools/testing/selftests/net/bind_wildcard.c
index 81f694536099..a2662348cdb1 100644
--- a/tools/testing/selftests/net/bind_wildcard.c
+++ b/tools/testing/selftests/net/bind_wildcard.c
@@ -6,6 +6,24 @@
#include "../kselftest_harness.h"
+struct in6_addr in6addr_v4mapped_any = {
+ .s6_addr = {
+ 0, 0, 0, 0,
+ 0, 0, 0, 0,
+ 0, 0, 255, 255,
+ 0, 0, 0, 0
+ }
+};
+
+struct in6_addr in6addr_v4mapped_loopback = {
+ .s6_addr = {
+ 0, 0, 0, 0,
+ 0, 0, 0, 0,
+ 0, 0, 255, 255,
+ 127, 0, 0, 1
+ }
+};
+
FIXTURE(bind_wildcard)
{
struct sockaddr_in addr4;
@@ -33,6 +51,20 @@ FIXTURE_VARIANT_ADD(bind_wildcard, v4_any_v6_local)
.expected_errno = 0,
};
+FIXTURE_VARIANT_ADD(bind_wildcard, v4_any_v6_v4mapped_any)
+{
+ .addr4_const = INADDR_ANY,
+ .addr6_const = &in6addr_v4mapped_any,
+ .expected_errno = EADDRINUSE,
+};
+
+FIXTURE_VARIANT_ADD(bind_wildcard, v4_any_v6_v4mapped_local)
+{
+ .addr4_const = INADDR_ANY,
+ .addr6_const = &in6addr_v4mapped_loopback,
+ .expected_errno = EADDRINUSE,
+};
+
FIXTURE_VARIANT_ADD(bind_wildcard, v4_local_v6_any)
{
.addr4_const = INADDR_LOOPBACK,
@@ -47,6 +79,20 @@ FIXTURE_VARIANT_ADD(bind_wildcard, v4_local_v6_local)
.expected_errno = 0,
};
+FIXTURE_VARIANT_ADD(bind_wildcard, v4_local_v6_v4mapped_any)
+{
+ .addr4_const = INADDR_LOOPBACK,
+ .addr6_const = &in6addr_v4mapped_any,
+ .expected_errno = EADDRINUSE,
+};
+
+FIXTURE_VARIANT_ADD(bind_wildcard, v4_local_v6_v4mapped_local)
+{
+ .addr4_const = INADDR_LOOPBACK,
+ .addr6_const = &in6addr_v4mapped_loopback,
+ .expected_errno = EADDRINUSE,
+};
+
FIXTURE_SETUP(bind_wildcard)
{
self->addr4.sin_family = AF_INET;
--
2.30.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
` (5 preceding siblings ...)
2023-09-11 18:37 ` [PATCH v2 net 6/6] selftest: tcp: Add v4-mapped-v6 cases " Kuniyuki Iwashima
@ 2023-09-11 18:54 ` Martin KaFai Lau
2023-09-13 6:20 ` patchwork-bot+netdevbpf
7 siblings, 0 replies; 15+ messages in thread
From: Martin KaFai Lau @ 2023-09-11 18:54 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Joanne Koong, Kuniyuki Iwashima, netdev, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
On 9/11/23 11:36 AM, Kuniyuki Iwashima wrote:
> Since bhash2 was introduced, bind() is broken in two cases related
> to v4-mapped-v6 address.
>
> This series fixes the regression and adds test to cover the cases.
Thanks for the fixes.
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
2023-09-11 18:36 ` [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Kuniyuki Iwashima
@ 2023-09-12 4:48 ` Eric Dumazet
0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2023-09-12 4:48 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
Joanne Koong, Kuniyuki Iwashima, netdev
On Mon, Sep 11, 2023 at 8:37 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> This is a prep patch to make the following patches cleaner that touch
> inet_bind2_bucket_match() and inet_bind2_bucket_match_addr_any().
>
> Both functions have duplicated comparison for netns, port, and l3mdev.
> Let's factorise them.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
2023-09-11 18:36 ` [PATCH v2 net 2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Kuniyuki Iwashima
@ 2023-09-12 4:58 ` Eric Dumazet
0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2023-09-12 4:58 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
Joanne Koong, Kuniyuki Iwashima, netdev, Andrei Vagin
On Mon, Sep 11, 2023 at 8:38 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Andrei Vagin reported bind() regression with strace logs.
>
> If we bind() a TCPv6 socket to ::FFFF:0.0.0.0 and then bind() a TCPv4
> socket to 127.0.0.1, the 2nd bind() should fail but now succeeds.
>
> from socket import *
>
> s1 = socket(AF_INET6, SOCK_STREAM)
> s1.bind(('::ffff:0.0.0.0', 0))
>
> s2 = socket(AF_INET, SOCK_STREAM)
> s2.bind(('127.0.0.1', s1.getsockname()[1]))
>
> During the 2nd bind(), if tb->family is AF_INET6 and sk->sk_family is
> AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check
> if tb has the v4-mapped-v6 wildcard address.
>
> The example above does not work after commit 5456262d2baa ("net: Fix
> incorrect address comparison when searching for a bind2 bucket"), but
> the blamed change is not the commit.
>
> Before the commit, the leading zeros of ::FFFF:0.0.0.0 were treated
> as 0.0.0.0, and the sequence above worked by chance. Technically, this
> case has been broken since bhash2 was introduced.
>
> Note that if we bind() two sockets to 127.0.0.1 and then ::FFFF:0.0.0.0,
> the 2nd bind() fails properly because we fall back to using bhash to
> detect conflicts for the v4-mapped-v6 address.
>
> Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> Reported-by: Andrei Vagin <avagin@google.com>
> Closes: https://lore.kernel.org/netdev/ZPuYBOFC8zsK6r9T@google.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
2023-09-11 18:36 ` [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Kuniyuki Iwashima
@ 2023-09-12 4:59 ` Eric Dumazet
2023-09-12 7:25 ` Andrei Vagin
1 sibling, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2023-09-12 4:59 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
Joanne Koong, Kuniyuki Iwashima, netdev
On Mon, Sep 11, 2023 at 8:38 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Since bhash2 was introduced, the example below does not work as expected.
> These two bind() should conflict, but the 2nd bind() now succeeds.
>
> from socket import *
>
> s1 = socket(AF_INET6, SOCK_STREAM)
> s1.bind(('::ffff:127.0.0.1', 0))
>
> s2 = socket(AF_INET, SOCK_STREAM)
> s2.bind(('127.0.0.1', s1.getsockname()[1]))
>
> During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
> fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
> a new tb2 for the 2nd socket. Then, we call inet_csk_bind_conflict() that
> checks conflicts in the new tb2 by inet_bhash2_conflict(). However, the
> new tb2 does not include the 1st socket, thus the bind() finally succeeds.
>
> In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
> the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
> returns the 1st socket's tb2.
>
> Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
> the 2nd bind() fails properly for the same reason mentinoed in the previous
> commit.
>
> Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
2023-09-11 18:36 ` [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Kuniyuki Iwashima
2023-09-12 4:59 ` Eric Dumazet
@ 2023-09-12 7:25 ` Andrei Vagin
2023-09-12 7:59 ` Kuniyuki Iwashima
1 sibling, 1 reply; 15+ messages in thread
From: Andrei Vagin @ 2023-09-12 7:25 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
David Ahern, Joanne Koong, Kuniyuki Iwashima, netdev
On Mon, Sep 11, 2023 at 11:36:57AM -0700, Kuniyuki Iwashima wrote:
> Since bhash2 was introduced, the example below does not work as expected.
> These two bind() should conflict, but the 2nd bind() now succeeds.
>
> from socket import *
>
> s1 = socket(AF_INET6, SOCK_STREAM)
> s1.bind(('::ffff:127.0.0.1', 0))
>
> s2 = socket(AF_INET, SOCK_STREAM)
> s2.bind(('127.0.0.1', s1.getsockname()[1]))
>
> During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
> fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
> a new tb2 for the 2nd socket. Then, we call inet_csk_bind_conflict() that
> checks conflicts in the new tb2 by inet_bhash2_conflict(). However, the
> new tb2 does not include the 1st socket, thus the bind() finally succeeds.
>
> In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
> the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
> returns the 1st socket's tb2.
>
> Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
> the 2nd bind() fails properly for the same reason mentinoed in the previous
> commit.
>
> Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
> net/ipv4/inet_hashtables.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index a58b04052ca6..c32f5e28758b 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -820,8 +820,13 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
Should we fix inet_bind2_bucket_addr_match too?
> return false;
>
> #if IS_ENABLED(CONFIG_IPV6)
> - if (sk->sk_family != tb->family)
> + if (sk->sk_family != tb->family) {
> + if (sk->sk_family == AF_INET)
> + return ipv6_addr_v4mapped(&tb->v6_rcv_saddr) &&
> + tb->v6_rcv_saddr.s6_addr32[3] == sk->sk_rcv_saddr;
I was wondering why we don't check a case when sk is AF_INET6 and tb is
AF_INET. I tried to run the next test:
import socket
sk4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
sk6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
sk4.bind(("127.0.0.1", 32773))
sk6.bind(("::ffff:127.0.0.1", 32773))
The second bind returned EADDRINUSE. It works as expected only because
inet_use_bhash2_on_bind returns false for all v4mapped addresses. This
doesn't look right, and I am not sure it was intentional. I think it can
to be changed this way:
@@ -158,7 +158,7 @@ static bool inet_use_bhash2_on_bind(const struct sock *sk)
int addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
return addr_type != IPV6_ADDR_ANY &&
- addr_type != IPV6_ADDR_MAPPED;
+ !ipv6_addr_v4mapped_any(&sk->sk_v6_rcv_saddr);
}
#endif
return sk->sk_rcv_saddr != htonl(INADDR_ANY);
As for this patch, I think it may be a good idea if bind2 buckets for
v4-mapped addresses are created with the AF_INET family and matching
ipv4 addresses.
> +
> return false;
> + }
>
> if (sk->sk_family == AF_INET6)
> return ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
2023-09-12 7:25 ` Andrei Vagin
@ 2023-09-12 7:59 ` Kuniyuki Iwashima
2023-09-12 14:29 ` Andrei Vagin
0 siblings, 1 reply; 15+ messages in thread
From: Kuniyuki Iwashima @ 2023-09-12 7:59 UTC (permalink / raw)
To: avagin
Cc: davem, dsahern, edumazet, joannelkoong, kuba, kuni1840, kuniyu,
netdev, pabeni
From: Andrei Vagin <avagin@gmail.com>
Date: Tue, 12 Sep 2023 00:25:58 -0700
> On Mon, Sep 11, 2023 at 11:36:57AM -0700, Kuniyuki Iwashima wrote:
> > Since bhash2 was introduced, the example below does not work as expected.
> > These two bind() should conflict, but the 2nd bind() now succeeds.
> >
> > from socket import *
> >
> > s1 = socket(AF_INET6, SOCK_STREAM)
> > s1.bind(('::ffff:127.0.0.1', 0))
> >
> > s2 = socket(AF_INET, SOCK_STREAM)
> > s2.bind(('127.0.0.1', s1.getsockname()[1]))
> >
> > During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
> > fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
> > a new tb2 for the 2nd socket. Then, we call inet_csk_bind_conflict() that
> > checks conflicts in the new tb2 by inet_bhash2_conflict(). However, the
> > new tb2 does not include the 1st socket, thus the bind() finally succeeds.
> >
> > In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
> > the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
> > returns the 1st socket's tb2.
> >
> > Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
> > the 2nd bind() fails properly for the same reason mentinoed in the previous
> > commit.
> >
> > Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> > net/ipv4/inet_hashtables.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> > index a58b04052ca6..c32f5e28758b 100644
> > --- a/net/ipv4/inet_hashtables.c
> > +++ b/net/ipv4/inet_hashtables.c
> > @@ -820,8 +820,13 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
>
> Should we fix inet_bind2_bucket_addr_match too?
No, there's no real bug.
I have this patch in my local branch and will post it against
net-next after this series is merged.
---8<---
commit 06333d4b0d053e4c0d40090b29e5a8546b42bb50
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Sun Sep 10 19:01:23 2023 +0000
tcp: Remove redundant sk_family check in inet_bind2_bucket_addr_match().
Commit 5456262d2baa ("net: Fix incorrect address comparison when
searching for a bind2 bucket") added the test for the KMSAN report.
However, the condition never be true as tb2 is listener's
inet_csk(sk)->icsk_bind2_hash and its sk_family always matches with
child->sk_family.
Link: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index c32f5e28758b..dfb1c61c0c2b 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -149,9 +149,6 @@ static bool inet_bind2_bucket_addr_match(const struct inet_bind2_bucket *tb2,
const struct sock *sk)
{
#if IS_ENABLED(CONFIG_IPV6)
- if (sk->sk_family != tb2->family)
- return false;
-
if (sk->sk_family == AF_INET6)
return ipv6_addr_equal(&tb2->v6_rcv_saddr,
&sk->sk_v6_rcv_saddr);
---8<---
>
> > return false;
> >
> > #if IS_ENABLED(CONFIG_IPV6)
> > - if (sk->sk_family != tb->family)
> > + if (sk->sk_family != tb->family) {
> > + if (sk->sk_family == AF_INET)
> > + return ipv6_addr_v4mapped(&tb->v6_rcv_saddr) &&
> > + tb->v6_rcv_saddr.s6_addr32[3] == sk->sk_rcv_saddr;
>
> I was wondering why we don't check a case when sk is AF_INET6 and tb is
> AF_INET. I tried to run the next test:
>
> import socket
> sk4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
> sk6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
> sk4.bind(("127.0.0.1", 32773))
> sk6.bind(("::ffff:127.0.0.1", 32773))
>
> The second bind returned EADDRINUSE. It works as expected only because
> inet_use_bhash2_on_bind returns false for all v4mapped addresses. This
> doesn't look right, and I am not sure it was intentional. I think it can
> to be changed this way:
>
> @@ -158,7 +158,7 @@ static bool inet_use_bhash2_on_bind(const struct sock *sk)
> int addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
>
> return addr_type != IPV6_ADDR_ANY &&
> - addr_type != IPV6_ADDR_MAPPED;
> + !ipv6_addr_v4mapped_any(&sk->sk_v6_rcv_saddr);
> }
> #endif
> return sk->sk_rcv_saddr != htonl(INADDR_ANY);
>
> As for this patch, I think it may be a good idea if bind2 buckets for
> v4-mapped addresses are created with the AF_INET family and matching
> ipv4 addresses.
Let's say we create tb2 with AF_INET for v4-mapped address. If we bind
::ffff:127.0.0.1 and 127.0.0.1, in the second bind(), both tb->family and
sk->sk_family is AF_INET. So, we can remove this AF_INET test.
if (sk->sk_family != tb->family) {
if (sk->sk_family == AF_INET)
But what about 127.0.0.1 and then ::ffff:127.0.0.1 ? There tb->family is
AF_INET and sk->sk_family is AF_INET6. We need to add another AF_INET6
test in the same place.
So, finally we need to check the special case in either way.
Also, as you may notice, we need to change inet_bind2_bucket_addr_match()
as well. As mentioned in my patch above, sk->sk_family always match
tb2->family there, but v4-mapped AF_INET tb2 breaks the rule.
Using bhash2 for v4-mapped-v6 address could be done but churns code a lot.
So, I think we should not include such changes as fix at least.
>
> > +
> > return false;
> > + }
> >
> > if (sk->sk_family == AF_INET6)
> > return ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
2023-09-12 7:59 ` Kuniyuki Iwashima
@ 2023-09-12 14:29 ` Andrei Vagin
0 siblings, 0 replies; 15+ messages in thread
From: Andrei Vagin @ 2023-09-12 14:29 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, dsahern, edumazet, joannelkoong, kuba, kuni1840, netdev,
pabeni
On Tue, Sep 12, 2023 at 12:59 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Andrei Vagin <avagin@gmail.com>
> Date: Tue, 12 Sep 2023 00:25:58 -0700
> > On Mon, Sep 11, 2023 at 11:36:57AM -0700, Kuniyuki Iwashima wrote:
> > > Since bhash2 was introduced, the example below does not work as expected.
> > > These two bind() should conflict, but the 2nd bind() now succeeds.
> > >
> > > from socket import *
> > >
> > > s1 = socket(AF_INET6, SOCK_STREAM)
> > > s1.bind(('::ffff:127.0.0.1', 0))
> > >
> > > s2 = socket(AF_INET, SOCK_STREAM)
> > > s2.bind(('127.0.0.1', s1.getsockname()[1]))
> > >
> > > During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
> > > fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
> > > a new tb2 for the 2nd socket. Then, we call inet_csk_bind_conflict() that
> > > checks conflicts in the new tb2 by inet_bhash2_conflict(). However, the
> > > new tb2 does not include the 1st socket, thus the bind() finally succeeds.
> > >
> > > In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
> > > the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
> > > returns the 1st socket's tb2.
> > >
> > > Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
> > > the 2nd bind() fails properly for the same reason mentinoed in the previous
> > > commit.
> > >
> > > Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > ---
> > > net/ipv4/inet_hashtables.c | 7 ++++++-
> > > 1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> > > index a58b04052ca6..c32f5e28758b 100644
> > > --- a/net/ipv4/inet_hashtables.c
> > > +++ b/net/ipv4/inet_hashtables.c
> > > @@ -820,8 +820,13 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
> >
> > Should we fix inet_bind2_bucket_addr_match too?
>
> No, there's no real bug.
>
> I have this patch in my local branch and will post it against
> net-next after this series is merged.
>
> ---8<---
> commit 06333d4b0d053e4c0d40090b29e5a8546b42bb50
> Author: Kuniyuki Iwashima <kuniyu@amazon.com>
> Date: Sun Sep 10 19:01:23 2023 +0000
>
> tcp: Remove redundant sk_family check in inet_bind2_bucket_addr_match().
>
> Commit 5456262d2baa ("net: Fix incorrect address comparison when
> searching for a bind2 bucket") added the test for the KMSAN report.
>
> However, the condition never be true as tb2 is listener's
> inet_csk(sk)->icsk_bind2_hash and its sk_family always matches with
> child->sk_family.
>
> Link: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
>
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index c32f5e28758b..dfb1c61c0c2b 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -149,9 +149,6 @@ static bool inet_bind2_bucket_addr_match(const struct inet_bind2_bucket *tb2,
> const struct sock *sk)
> {
> #if IS_ENABLED(CONFIG_IPV6)
> - if (sk->sk_family != tb2->family)
> - return false;
> -
> if (sk->sk_family == AF_INET6)
> return ipv6_addr_equal(&tb2->v6_rcv_saddr,
> &sk->sk_v6_rcv_saddr);
> ---8<---
>
>
> >
> > > return false;
> > >
> > > #if IS_ENABLED(CONFIG_IPV6)
> > > - if (sk->sk_family != tb->family)
> > > + if (sk->sk_family != tb->family) {
> > > + if (sk->sk_family == AF_INET)
> > > + return ipv6_addr_v4mapped(&tb->v6_rcv_saddr) &&
> > > + tb->v6_rcv_saddr.s6_addr32[3] == sk->sk_rcv_saddr;
> >
> > I was wondering why we don't check a case when sk is AF_INET6 and tb is
> > AF_INET. I tried to run the next test:
> >
> > import socket
> > sk4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
> > sk6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
> > sk4.bind(("127.0.0.1", 32773))
> > sk6.bind(("::ffff:127.0.0.1", 32773))
> >
> > The second bind returned EADDRINUSE. It works as expected only because
> > inet_use_bhash2_on_bind returns false for all v4mapped addresses. This
> > doesn't look right, and I am not sure it was intentional. I think it can
> > to be changed this way:
> >
> > @@ -158,7 +158,7 @@ static bool inet_use_bhash2_on_bind(const struct sock *sk)
> > int addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
> >
> > return addr_type != IPV6_ADDR_ANY &&
> > - addr_type != IPV6_ADDR_MAPPED;
> > + !ipv6_addr_v4mapped_any(&sk->sk_v6_rcv_saddr);
> > }
> > #endif
> > return sk->sk_rcv_saddr != htonl(INADDR_ANY);
> >
> > As for this patch, I think it may be a good idea if bind2 buckets for
> > v4-mapped addresses are created with the AF_INET family and matching
> > ipv4 addresses.
>
> Let's say we create tb2 with AF_INET for v4-mapped address. If we bind
> ::ffff:127.0.0.1 and 127.0.0.1, in the second bind(), both tb->family and
> sk->sk_family is AF_INET. So, we can remove this AF_INET test.
>
> if (sk->sk_family != tb->family) {
> if (sk->sk_family == AF_INET)
>
> But what about 127.0.0.1 and then ::ffff:127.0.0.1 ? There tb->family is
> AF_INET and sk->sk_family is AF_INET6. We need to add another AF_INET6
> test in the same place.
>
> So, finally we need to check the special case in either way.
>
> Also, as you may notice, we need to change inet_bind2_bucket_addr_match()
> as well. As mentioned in my patch above, sk->sk_family always match
> tb2->family there, but v4-mapped AF_INET tb2 breaks the rule.
Thanks for the explanation. It looks quite reasonable.
>
> Using bhash2 for v4-mapped-v6 address could be done but churns code a lot.
> So, I think we should not include such changes as fix at least.
My mistake was to suppose that it was done unintentionally. I didn't find any
explanations in commit messages and in the code. Sorry if I missed something.
I think a comment in the code could help to avoid such questions.
The patch looks good to me.
Acked-by: Andrei Vagin <avagin@gmail.com>
Thanks,
Andrei
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
` (6 preceding siblings ...)
2023-09-11 18:54 ` [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Martin KaFai Lau
@ 2023-09-13 6:20 ` patchwork-bot+netdevbpf
7 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-09-13 6:20 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, edumazet, kuba, pabeni, dsahern, joannelkoong, kuni1840,
netdev
Hello:
This series was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:
On Mon, 11 Sep 2023 11:36:54 -0700 you wrote:
> Since bhash2 was introduced, bind() is broken in two cases related
> to v4-mapped-v6 address.
>
> This series fixes the regression and adds test to cover the cases.
>
>
> Changes:
> v2:
> * Added patch 1 to factorise duplicated comparison (Eric Dumazet)
>
> [...]
Here is the summary with links:
- [v2,net,1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
https://git.kernel.org/netdev/net/c/c6d277064b1d
- [v2,net,2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
https://git.kernel.org/netdev/net/c/aa99e5f87bd5
- [v2,net,3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
https://git.kernel.org/netdev/net/c/c48ef9c4aed3
- [v2,net,4/6] selftest: tcp: Fix address length in bind_wildcard.c.
https://git.kernel.org/netdev/net/c/0071d15517b4
- [v2,net,5/6] selftest: tcp: Move expected_errno into each test case in bind_wildcard.c.
https://git.kernel.org/netdev/net/c/2895d879dd41
- [v2,net,6/6] selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.
https://git.kernel.org/netdev/net/c/8637d8e8b653
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-09-13 6:20 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-11 18:36 [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 1/6] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Kuniyuki Iwashima
2023-09-12 4:48 ` Eric Dumazet
2023-09-11 18:36 ` [PATCH v2 net 2/6] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Kuniyuki Iwashima
2023-09-12 4:58 ` Eric Dumazet
2023-09-11 18:36 ` [PATCH v2 net 3/6] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Kuniyuki Iwashima
2023-09-12 4:59 ` Eric Dumazet
2023-09-12 7:25 ` Andrei Vagin
2023-09-12 7:59 ` Kuniyuki Iwashima
2023-09-12 14:29 ` Andrei Vagin
2023-09-11 18:36 ` [PATCH v2 net 4/6] selftest: tcp: Fix address length in bind_wildcard.c Kuniyuki Iwashima
2023-09-11 18:36 ` [PATCH v2 net 5/6] selftest: tcp: Move expected_errno into each test case " Kuniyuki Iwashima
2023-09-11 18:37 ` [PATCH v2 net 6/6] selftest: tcp: Add v4-mapped-v6 cases " Kuniyuki Iwashima
2023-09-11 18:54 ` [PATCH v2 net 0/6] tcp: Fix bind() regression for v4-mapped-v6 address Martin KaFai Lau
2023-09-13 6:20 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).