* [PATCH net 0/2] udplookup: Rescan udp hash chains if cross-linked
@ 2025-01-27 19:40 David Laight
2025-01-27 19:40 ` [PATCH net 1/2] udp4: " David Laight
0 siblings, 1 reply; 5+ messages in thread
From: David Laight @ 2025-01-27 19:40 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, netdev,
linux-kernel
Cc: David Laight, David Ahern, Paolo Abeni, Simon Horman, Tom Herbert,
Gabriel Krisman Bertazi, Lorenz Bauer, Kuniyuki Iwashima
udp_lib_rehash() can get called at any time and will move a
socket to a different hash2 chain.
This can cause udp[46]_lib_lookup2() (processing incoming UDP) to
fail to find a socket and an ICMP port unreachable be sent.
Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
that the 'end if list' marker was on the correct list.
The cross-linking can definitely happen (see earlier issues with
it looping forever because gcc cached the list head).
I can't see an easy way around adding another two parameters to
udp[46]_lib lookup().
However once udp-lite is removed 'mask' can be obtained from 'net'.
'net' itself could be obtained from 'sk'.
Not addressed here, but the 'reuseport' code doesn't look right to me.
David Laight (2):
IPv4: Rescan the hash2 list if the hash chains have got cross-linked.
IPv6: Rescan the hash2 list if the hash chains have got cross-linked.
net/ipv4/udp.c | 19 +++++++++++++++++--
net/ipv6/udp.c | 22 +++++++++++++++++++---
2 files changed, 36 insertions(+), 5 deletions(-)
--
2.39.5
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH net 1/2] udp4: Rescan udp hash chains if cross-linked.
2025-01-27 19:40 [PATCH net 0/2] udplookup: Rescan udp hash chains if cross-linked David Laight
@ 2025-01-27 19:40 ` David Laight
2025-01-27 19:40 ` [PATCH net 2/2] udp6: " David Laight
2025-01-27 20:33 ` [PATCH net 1/2] udp4: " Kuniyuki Iwashima
0 siblings, 2 replies; 5+ messages in thread
From: David Laight @ 2025-01-27 19:40 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, netdev,
linux-kernel
Cc: David Laight, David Ahern, Paolo Abeni, Simon Horman, Tom Herbert,
Gabriel Krisman Bertazi, Lorenz Bauer, Kuniyuki Iwashima
udp_lib_rehash() can get called at any time and will move a
socket to a different hash2 chain.
This can cause udp4_lib_lookup2() (processing incoming UDP) to
fail to find a socket and an ICMP port unreachable be sent.
Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
that the 'end if list' marker was on the correct list.
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
net/ipv4/udp.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 86d282618515..a8e2b431d348 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -425,16 +425,21 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
__be32 saddr, __be16 sport,
__be32 daddr, unsigned int hnum,
int dif, int sdif,
+ unsigned int hash2, unsigned int mask,
struct udp_hslot *hslot2,
struct sk_buff *skb)
{
+ unsigned int hash2_rescan;
struct sock *sk, *result;
int score, badness;
bool need_rescore;
+rescan:
+ hash2_rescan = hash2;
result = NULL;
badness = 0;
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
+ hash2_rescan = udp_sk(sk)->udp_portaddr_hash;
need_rescore = false;
rescore:
score = compute_score(need_rescore ? result : sk, net, saddr,
@@ -475,6 +480,16 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
goto rescore;
}
}
+
+ /* udp sockets can get moved to a different hash chain.
+ * If the chains have got crossed then rescan.
+ */
+ if ((hash2_rescan ^ hash2) & mask) {
+ /* Ensure hslot2->head is reread */
+ barrier();
+ goto rescan;
+ }
+
return result;
}
@@ -654,7 +669,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
/* Lookup connected or non-wildcard socket */
result = udp4_lib_lookup2(net, saddr, sport,
daddr, hnum, dif, sdif,
- hslot2, skb);
+ hash2, udptable->mask, hslot2, skb);
if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED)
goto done;
@@ -680,7 +695,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
result = udp4_lib_lookup2(net, saddr, sport,
htonl(INADDR_ANY), hnum, dif, sdif,
- hslot2, skb);
+ hash2, udptable->mask, hslot2, skb);
done:
if (IS_ERR(result))
return NULL;
--
2.39.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH net 2/2] udp6: Rescan udp hash chains if cross-linked.
2025-01-27 19:40 ` [PATCH net 1/2] udp4: " David Laight
@ 2025-01-27 19:40 ` David Laight
2025-01-27 20:33 ` [PATCH net 1/2] udp4: " Kuniyuki Iwashima
1 sibling, 0 replies; 5+ messages in thread
From: David Laight @ 2025-01-27 19:40 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, netdev,
linux-kernel
Cc: David Laight, David Ahern, Paolo Abeni, Simon Horman, Tom Herbert,
Gabriel Krisman Bertazi, Lorenz Bauer, Kuniyuki Iwashima
udp_lib_rehash() can get called at any time and will move a
socket to a different hash2 chain.
This can cause udp6_lib_lookup2() (processing incoming UDP) to
fail to find a socket and an ICMP port unreachable be sent.
Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
that the 'end if list' marker was on the correct list.
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
net/ipv6/udp.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d766fd798ecf..903ccb46495a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -174,16 +174,22 @@ static int compute_score(struct sock *sk, const struct net *net,
static struct sock *udp6_lib_lookup2(const struct net *net,
const struct in6_addr *saddr, __be16 sport,
const struct in6_addr *daddr, unsigned int hnum,
- int dif, int sdif, struct udp_hslot *hslot2,
+ int dif, int sdif,
+ unsigned int hash2, unsigned int mask,
+ struct udp_hslot *hslot2,
struct sk_buff *skb)
{
+ unsigned int hash2_rescan;
struct sock *sk, *result;
int score, badness;
bool need_rescore;
+rescan:
+ hash2_rescan = hash2;
result = NULL;
badness = -1;
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
+ hash2_rescan = udp_sk(sk)->udp_portaddr_hash;
need_rescore = false;
rescore:
score = compute_score(need_rescore ? result : sk, net, saddr,
@@ -224,6 +230,16 @@ static struct sock *udp6_lib_lookup2(const struct net *net,
goto rescore;
}
}
+
+ /* udp sockets can get moved to a different hash chain.
+ * If the chains have got crossed then rescan.
+ */
+ if ((hash2_rescan ^ hash2) & mask) {
+ /* Ensure hslot2->head is reread */
+ barrier();
+ goto rescan;
+ }
+
return result;
}
@@ -320,7 +336,7 @@ struct sock *__udp6_lib_lookup(const struct net *net,
/* Lookup connected or non-wildcard sockets */
result = udp6_lib_lookup2(net, saddr, sport,
daddr, hnum, dif, sdif,
- hslot2, skb);
+ hash2, udptable->mask, hslot2, skb);
if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED)
goto done;
@@ -346,7 +362,7 @@ struct sock *__udp6_lib_lookup(const struct net *net,
result = udp6_lib_lookup2(net, saddr, sport,
&in6addr_any, hnum, dif, sdif,
- hslot2, skb);
+ hash2, udptable->mask, hslot2, skb);
done:
if (IS_ERR(result))
return NULL;
--
2.39.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH net 1/2] udp4: Rescan udp hash chains if cross-linked.
2025-01-27 19:40 ` [PATCH net 1/2] udp4: " David Laight
2025-01-27 19:40 ` [PATCH net 2/2] udp6: " David Laight
@ 2025-01-27 20:33 ` Kuniyuki Iwashima
2025-01-27 21:26 ` David Laight
1 sibling, 1 reply; 5+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-27 20:33 UTC (permalink / raw)
To: david.laight.linux
Cc: davem, dsahern, edumazet, horms, krisman, kuba, kuniyu,
linux-kernel, lmb, netdev, pabeni, tom
From: David Laight <david.laight.linux@gmail.com>
Date: Mon, 27 Jan 2025 19:40:23 +0000
> udp_lib_rehash() can get called at any time and will move a
> socket to a different hash2 chain.
> This can cause udp4_lib_lookup2() (processing incoming UDP) to
> fail to find a socket and an ICMP port unreachable be sent.
>
> Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
> that the 'end if list' marker was on the correct list.
I think we should use hlist_nulls for hash2 as hash4.
---8<---
commit dab78a1745ab3c6001e1e4d50a9d09efef8e260d
Author: Philo Lu <lulie@linux.alibaba.com>
Date: Thu Nov 14 18:52:05 2024 +0800
net/udp: Add 4-tuple hash list basis
...
hash4 uses hlist_nulls to avoid moving wrongly onto another hlist due to
concurrent rehash, because rehash() can happen with lookup().
---8<---
Also, Fixes: tag is missing in both patches.
>
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> net/ipv4/udp.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 86d282618515..a8e2b431d348 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -425,16 +425,21 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
> __be32 saddr, __be16 sport,
> __be32 daddr, unsigned int hnum,
> int dif, int sdif,
> + unsigned int hash2, unsigned int mask,
> struct udp_hslot *hslot2,
> struct sk_buff *skb)
> {
> + unsigned int hash2_rescan;
> struct sock *sk, *result;
> int score, badness;
> bool need_rescore;
>
> +rescan:
> + hash2_rescan = hash2;
> result = NULL;
> badness = 0;
> udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> + hash2_rescan = udp_sk(sk)->udp_portaddr_hash;
> need_rescore = false;
> rescore:
> score = compute_score(need_rescore ? result : sk, net, saddr,
> @@ -475,6 +480,16 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
> goto rescore;
> }
> }
> +
> + /* udp sockets can get moved to a different hash chain.
> + * If the chains have got crossed then rescan.
> + */
nit: trailing spaces here ^^^^^^^^
> + if ((hash2_rescan ^ hash2) & mask) {
> + /* Ensure hslot2->head is reread */
> + barrier();
> + goto rescan;
> + }
> +
> return result;
> }
>
> @@ -654,7 +669,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
> /* Lookup connected or non-wildcard socket */
> result = udp4_lib_lookup2(net, saddr, sport,
> daddr, hnum, dif, sdif,
> - hslot2, skb);
> + hash2, udptable->mask, hslot2, skb);
> if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED)
> goto done;
>
> @@ -680,7 +695,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
>
> result = udp4_lib_lookup2(net, saddr, sport,
> htonl(INADDR_ANY), hnum, dif, sdif,
> - hslot2, skb);
> + hash2, udptable->mask, hslot2, skb);
> done:
> if (IS_ERR(result))
> return NULL;
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net 1/2] udp4: Rescan udp hash chains if cross-linked.
2025-01-27 20:33 ` [PATCH net 1/2] udp4: " Kuniyuki Iwashima
@ 2025-01-27 21:26 ` David Laight
0 siblings, 0 replies; 5+ messages in thread
From: David Laight @ 2025-01-27 21:26 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, dsahern, edumazet, horms, krisman, kuba, linux-kernel, lmb,
netdev, pabeni, tom
On Mon, 27 Jan 2025 12:33:04 -0800
Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> From: David Laight <david.laight.linux@gmail.com>
> Date: Mon, 27 Jan 2025 19:40:23 +0000
> > udp_lib_rehash() can get called at any time and will move a
> > socket to a different hash2 chain.
> > This can cause udp4_lib_lookup2() (processing incoming UDP) to
> > fail to find a socket and an ICMP port unreachable be sent.
> >
> > Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
> > that the 'end if list' marker was on the correct list.
>
> I think we should use hlist_nulls for hash2 as hash4.
From what I remember when I first wrote this patch (mid 2023) using
hlist_nulls doesn't make much difference.
The code just did a rescan when the 'wrong NULL' was found rather than
when the last item wasn't on the starting hash chain.
ISTR it was removed to simplify other code paths.
>
> ---8<---
> commit dab78a1745ab3c6001e1e4d50a9d09efef8e260d
> Author: Philo Lu <lulie@linux.alibaba.com>
> Date: Thu Nov 14 18:52:05 2024 +0800
>
> net/udp: Add 4-tuple hash list basis
> ...
> hash4 uses hlist_nulls to avoid moving wrongly onto another hlist due to
> concurrent rehash, because rehash() can happen with lookup().
> ---8<---
>
>
> Also, Fixes: tag is missing in both patches.
Semi-deliberate to stop it being immediately backported.
While I think we have a system/test that fails it is running Ubuntu on a Dell
server and I don't think the raid controller driver is in the main kernel tree.
(We're definitely seeing unexpected ICMP on localhost - hard to get otherwise.)
David
>
> >
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > net/ipv4/udp.c | 19 +++++++++++++++++--
> > 1 file changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 86d282618515..a8e2b431d348 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -425,16 +425,21 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
> > __be32 saddr, __be16 sport,
> > __be32 daddr, unsigned int hnum,
> > int dif, int sdif,
> > + unsigned int hash2, unsigned int mask,
> > struct udp_hslot *hslot2,
> > struct sk_buff *skb)
> > {
> > + unsigned int hash2_rescan;
> > struct sock *sk, *result;
> > int score, badness;
> > bool need_rescore;
> >
> > +rescan:
> > + hash2_rescan = hash2;
> > result = NULL;
> > badness = 0;
> > udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> > + hash2_rescan = udp_sk(sk)->udp_portaddr_hash;
> > need_rescore = false;
> > rescore:
> > score = compute_score(need_rescore ? result : sk, net, saddr,
> > @@ -475,6 +480,16 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
> > goto rescore;
> > }
> > }
> > +
> > + /* udp sockets can get moved to a different hash chain.
> > + * If the chains have got crossed then rescan.
> > + */
>
> nit: trailing spaces here ^^^^^^^^
>
>
> > + if ((hash2_rescan ^ hash2) & mask) {
> > + /* Ensure hslot2->head is reread */
> > + barrier();
> > + goto rescan;
> > + }
> > +
> > return result;
> > }
> >
> > @@ -654,7 +669,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
> > /* Lookup connected or non-wildcard socket */
> > result = udp4_lib_lookup2(net, saddr, sport,
> > daddr, hnum, dif, sdif,
> > - hslot2, skb);
> > + hash2, udptable->mask, hslot2, skb);
> > if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED)
> > goto done;
> >
> > @@ -680,7 +695,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
> >
> > result = udp4_lib_lookup2(net, saddr, sport,
> > htonl(INADDR_ANY), hnum, dif, sdif,
> > - hslot2, skb);
> > + hash2, udptable->mask, hslot2, skb);
> > done:
> > if (IS_ERR(result))
> > return NULL;
> > --
> > 2.39.5
> >
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-01-27 21:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-27 19:40 [PATCH net 0/2] udplookup: Rescan udp hash chains if cross-linked David Laight
2025-01-27 19:40 ` [PATCH net 1/2] udp4: " David Laight
2025-01-27 19:40 ` [PATCH net 2/2] udp6: " David Laight
2025-01-27 20:33 ` [PATCH net 1/2] udp4: " Kuniyuki Iwashima
2025-01-27 21:26 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).