* [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table.
@ 2022-11-11 4:00 Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 1/6] udp: Clean up some functions Kuniyuki Iwashima
` (5 more replies)
0 siblings, 6 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
This series is the UDP version of the per-netns ehash series [0],
which was initially part of the same patch set. [1]
The notable difference from TCP is that the max table size is 64K.
This is because the possible hash range of udp_hashfn() always fits
in 64K within the same netns. Also, the UDP per-netns table isolates
both the 1-tuple and 2-tuple tables.
For details, please see the last patch.
patch 1 - 4: prep for per-netns hash table
patch 5: allocate bitmap beforehand for udp_lib_get_port() and smaller
hash table
patch 6: add per-netns hash table
[0]: https://lore.kernel.org/netdev/20220908011022.45342-1-kuniyu@amazon.com/
[1]: https://lore.kernel.org/netdev/20220826000445.46552-1-kuniyu@amazon.com/
Changes:
v3:
* Drop get_port() fix (posted separately later)
* Patch 3
* Fix CONFIG_PROC_FS=n build failure (kernel test robot)
* Patch 5
* Allocate bitmap when creating netns (Paolo Abeni)
v2: https://lore.kernel.org/netdev/20221104190612.24206-1-kuniyu@amazon.com/
v1: [1]
Kuniyuki Iwashima (6):
udp: Clean up some functions.
udp: Set NULL to sk->sk_prot->h.udp_table.
udp: Set NULL to udp_seq_afinfo.udp_table.
udp: Access &udp_table via net.
udp: Add bitmap in udp_table.
udp: Introduce optional per-netns hash table.
Documentation/networking/ip-sysctl.rst | 27 ++++
include/linux/udp.h | 2 +
include/net/netns/ipv4.h | 3 +
include/net/udp.h | 20 +++
net/core/filter.c | 4 +-
net/ipv4/sysctl_net_ipv4.c | 38 +++++
net/ipv4/udp.c | 205 ++++++++++++++++++++-----
net/ipv4/udp_diag.c | 6 +-
net/ipv4/udp_offload.c | 5 +-
net/ipv6/udp.c | 31 ++--
net/ipv6/udp_offload.c | 5 +-
11 files changed, 287 insertions(+), 59 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 net-next 1/6] udp: Clean up some functions.
2022-11-11 4:00 [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
@ 2022-11-11 4:00 ` Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 2/6] udp: Set NULL to sk->sk_prot->h.udp_table Kuniyuki Iwashima
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
This patch adds no functional change; it only cleans up some functions
that the following patches touch, so that they stay tidy and are easy
to review/revert. The change mainly keeps the reverse christmas tree
order of local variable declarations.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/ipv4/udp.c | 39 +++++++++++++++++++++++----------------
net/ipv6/udp.c | 12 ++++++++----
2 files changed, 31 insertions(+), 20 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b859d6c8298e..a34de263e9ce 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -232,16 +232,16 @@ static int udp_reuseport_add_sock(struct sock *sk, struct udp_hslot *hslot)
int udp_lib_get_port(struct sock *sk, unsigned short snum,
unsigned int hash2_nulladdr)
{
- struct udp_hslot *hslot, *hslot2;
struct udp_table *udptable = sk->sk_prot->h.udp_table;
- int error = 1;
+ struct udp_hslot *hslot, *hslot2;
struct net *net = sock_net(sk);
+ int error = 1;
if (!snum) {
+ DECLARE_BITMAP(bitmap, PORTS_PER_CHAIN);
+ unsigned short first, last;
int low, high, remaining;
unsigned int rand;
- unsigned short first, last;
- DECLARE_BITMAP(bitmap, PORTS_PER_CHAIN);
inet_get_local_port_range(net, &low, &high);
remaining = (high - low) + 1;
@@ -2519,10 +2519,13 @@ static struct sock *__udp4_lib_mcast_demux_lookup(struct net *net,
__be16 rmt_port, __be32 rmt_addr,
int dif, int sdif)
{
- struct sock *sk, *result;
unsigned short hnum = ntohs(loc_port);
- unsigned int slot = udp_hashfn(net, hnum, udp_table.mask);
- struct udp_hslot *hslot = &udp_table.hash[slot];
+ struct sock *sk, *result;
+ struct udp_hslot *hslot;
+ unsigned int slot;
+
+ slot = udp_hashfn(net, hnum, udp_table.mask);
+ hslot = &udp_table.hash[slot];
/* Do not bother scanning a too big list */
if (hslot->count > 10)
@@ -2550,14 +2553,18 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
__be16 rmt_port, __be32 rmt_addr,
int dif, int sdif)
{
- unsigned short hnum = ntohs(loc_port);
- unsigned int hash2 = ipv4_portaddr_hash(net, loc_addr, hnum);
- unsigned int slot2 = hash2 & udp_table.mask;
- struct udp_hslot *hslot2 = &udp_table.hash2[slot2];
INET_ADDR_COOKIE(acookie, rmt_addr, loc_addr);
- const __portpair ports = INET_COMBINED_PORTS(rmt_port, hnum);
+ unsigned short hnum = ntohs(loc_port);
+ unsigned int hash2, slot2;
+ struct udp_hslot *hslot2;
+ __portpair ports;
struct sock *sk;
+ hash2 = ipv4_portaddr_hash(net, loc_addr, hnum);
+ slot2 = hash2 & udp_table.mask;
+ hslot2 = &udp_table.hash2[slot2];
+ ports = INET_COMBINED_PORTS(rmt_port, hnum);
+
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
if (inet_match(net, sk, acookie, ports, dif, sdif))
return sk;
@@ -2970,10 +2977,10 @@ EXPORT_SYMBOL(udp_prot);
static struct sock *udp_get_first(struct seq_file *seq, int start)
{
- struct sock *sk;
- struct udp_seq_afinfo *afinfo;
struct udp_iter_state *state = seq->private;
struct net *net = seq_file_net(seq);
+ struct udp_seq_afinfo *afinfo;
+ struct sock *sk;
if (state->bpf_seq_afinfo)
afinfo = state->bpf_seq_afinfo;
@@ -3004,9 +3011,9 @@ static struct sock *udp_get_first(struct seq_file *seq, int start)
static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
{
- struct udp_seq_afinfo *afinfo;
struct udp_iter_state *state = seq->private;
struct net *net = seq_file_net(seq);
+ struct udp_seq_afinfo *afinfo;
if (state->bpf_seq_afinfo)
afinfo = state->bpf_seq_afinfo;
@@ -3062,8 +3069,8 @@ EXPORT_SYMBOL(udp_seq_next);
void udp_seq_stop(struct seq_file *seq, void *v)
{
- struct udp_seq_afinfo *afinfo;
struct udp_iter_state *state = seq->private;
+ struct udp_seq_afinfo *afinfo;
if (state->bpf_seq_afinfo)
afinfo = state->bpf_seq_afinfo;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index e2de3d906c82..727de67e4c90 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1064,12 +1064,16 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
int dif, int sdif)
{
unsigned short hnum = ntohs(loc_port);
- unsigned int hash2 = ipv6_portaddr_hash(net, loc_addr, hnum);
- unsigned int slot2 = hash2 & udp_table.mask;
- struct udp_hslot *hslot2 = &udp_table.hash2[slot2];
- const __portpair ports = INET_COMBINED_PORTS(rmt_port, hnum);
+ unsigned int hash2, slot2;
+ struct udp_hslot *hslot2;
+ __portpair ports;
struct sock *sk;
+ hash2 = ipv6_portaddr_hash(net, loc_addr, hnum);
+ slot2 = hash2 & udp_table.mask;
+ hslot2 = &udp_table.hash2[slot2];
+ ports = INET_COMBINED_PORTS(rmt_port, hnum);
+
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
if (sk->sk_state == TCP_ESTABLISHED &&
inet6_match(net, sk, rmt_addr, loc_addr, ports, dif, sdif))
--
2.30.2
* [PATCH v3 net-next 2/6] udp: Set NULL to sk->sk_prot->h.udp_table.
2022-11-11 4:00 [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 1/6] udp: Clean up some functions Kuniyuki Iwashima
@ 2022-11-11 4:00 ` Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 3/6] udp: Set NULL to udp_seq_afinfo.udp_table Kuniyuki Iwashima
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will soon introduce an optional per-netns hash table
for UDP.
This means we cannot use the global sk->sk_prot->h.udp_table
to fetch a UDP hash table.
Instead, set sk->sk_prot->h.udp_table to NULL for UDP and fetch the
proper table from net->ipv4.udp_table.
Note that we still need sk->sk_prot->h.udp_table for UDP LITE.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
include/net/netns/ipv4.h | 1 +
net/ipv4/udp.c | 15 +++++++++++----
net/ipv6/udp.c | 2 +-
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 25f90bba4889..e4cc4d3cacc4 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -43,6 +43,7 @@ struct tcp_fastopen_context;
struct netns_ipv4 {
struct inet_timewait_death_row tcp_death_row;
+ struct udp_table *udp_table;
#ifdef CONFIG_SYSCTL
struct ctl_table_header *forw_hdr;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index a34de263e9ce..6206c27a1659 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -131,6 +131,11 @@ EXPORT_PER_CPU_SYMBOL_GPL(udp_memory_per_cpu_fw_alloc);
#define MAX_UDP_PORTS 65536
#define PORTS_PER_CHAIN (MAX_UDP_PORTS / UDP_HTABLE_SIZE_MIN)
+static struct udp_table *udp_get_table_prot(struct sock *sk)
+{
+ return sk->sk_prot->h.udp_table ? : sock_net(sk)->ipv4.udp_table;
+}
+
static int udp_lib_lport_inuse(struct net *net, __u16 num,
const struct udp_hslot *hslot,
unsigned long *bitmap,
@@ -232,7 +237,7 @@ static int udp_reuseport_add_sock(struct sock *sk, struct udp_hslot *hslot)
int udp_lib_get_port(struct sock *sk, unsigned short snum,
unsigned int hash2_nulladdr)
{
- struct udp_table *udptable = sk->sk_prot->h.udp_table;
+ struct udp_table *udptable = udp_get_table_prot(sk);
struct udp_hslot *hslot, *hslot2;
struct net *net = sock_net(sk);
int error = 1;
@@ -1999,7 +2004,7 @@ EXPORT_SYMBOL(udp_disconnect);
void udp_lib_unhash(struct sock *sk)
{
if (sk_hashed(sk)) {
- struct udp_table *udptable = sk->sk_prot->h.udp_table;
+ struct udp_table *udptable = udp_get_table_prot(sk);
struct udp_hslot *hslot, *hslot2;
hslot = udp_hashslot(udptable, sock_net(sk),
@@ -2030,7 +2035,7 @@ EXPORT_SYMBOL(udp_lib_unhash);
void udp_lib_rehash(struct sock *sk, u16 newhash)
{
if (sk_hashed(sk)) {
- struct udp_table *udptable = sk->sk_prot->h.udp_table;
+ struct udp_table *udptable = udp_get_table_prot(sk);
struct udp_hslot *hslot, *hslot2, *nhslot2;
hslot2 = udp_hashslot2(udptable, udp_sk(sk)->udp_portaddr_hash);
@@ -2967,7 +2972,7 @@ struct proto udp_prot = {
.sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min),
.sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_udp_rmem_min),
.obj_size = sizeof(struct udp_sock),
- .h.udp_table = &udp_table,
+ .h.udp_table = NULL,
.diag_destroy = udp_abort,
};
EXPORT_SYMBOL(udp_prot);
@@ -3280,6 +3285,8 @@ EXPORT_SYMBOL(udp_flow_hashrnd);
static int __net_init udp_sysctl_init(struct net *net)
{
+ net->ipv4.udp_table = &udp_table;
+
net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 727de67e4c90..bbd6dc398f3b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1774,7 +1774,7 @@ struct proto udpv6_prot = {
.sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_udp_wmem_min),
.sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_udp_rmem_min),
.obj_size = sizeof(struct udp6_sock),
- .h.udp_table = &udp_table,
+ .h.udp_table = NULL,
.diag_destroy = udp_abort,
};
--
2.30.2
* [PATCH v3 net-next 3/6] udp: Set NULL to udp_seq_afinfo.udp_table.
2022-11-11 4:00 [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 1/6] udp: Clean up some functions Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 2/6] udp: Set NULL to sk->sk_prot->h.udp_table Kuniyuki Iwashima
@ 2022-11-11 4:00 ` Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 4/6] udp: Access &udp_table via net Kuniyuki Iwashima
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will soon introduce an optional per-netns hash table
for UDP.
This means we cannot use the global udp_seq_afinfo.udp_table
to fetch a UDP hash table.
Instead, set udp_seq_afinfo.udp_table to NULL for UDP and fetch the
proper table from net->ipv4.udp_table.
Note that we still need udp_seq_afinfo.udp_table for UDP LITE.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/ipv4/udp.c | 32 ++++++++++++++++++++++++--------
net/ipv6/udp.c | 2 +-
2 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6206c27a1659..a1a15eb76304 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2980,11 +2980,18 @@ EXPORT_SYMBOL(udp_prot);
/* ------------------------------------------------------------------------ */
#ifdef CONFIG_PROC_FS
+static struct udp_table *udp_get_table_afinfo(struct udp_seq_afinfo *afinfo,
+ struct net *net)
+{
+ return afinfo->udp_table ? : net->ipv4.udp_table;
+}
+
static struct sock *udp_get_first(struct seq_file *seq, int start)
{
struct udp_iter_state *state = seq->private;
struct net *net = seq_file_net(seq);
struct udp_seq_afinfo *afinfo;
+ struct udp_table *udptable;
struct sock *sk;
if (state->bpf_seq_afinfo)
@@ -2992,9 +2999,11 @@ static struct sock *udp_get_first(struct seq_file *seq, int start)
else
afinfo = pde_data(file_inode(seq->file));
- for (state->bucket = start; state->bucket <= afinfo->udp_table->mask;
+ udptable = udp_get_table_afinfo(afinfo, net);
+
+ for (state->bucket = start; state->bucket <= udptable->mask;
++state->bucket) {
- struct udp_hslot *hslot = &afinfo->udp_table->hash[state->bucket];
+ struct udp_hslot *hslot = &udptable->hash[state->bucket];
if (hlist_empty(&hslot->head))
continue;
@@ -3019,6 +3028,7 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
struct udp_iter_state *state = seq->private;
struct net *net = seq_file_net(seq);
struct udp_seq_afinfo *afinfo;
+ struct udp_table *udptable;
if (state->bpf_seq_afinfo)
afinfo = state->bpf_seq_afinfo;
@@ -3032,8 +3042,11 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
sk->sk_family != afinfo->family)));
if (!sk) {
- if (state->bucket <= afinfo->udp_table->mask)
- spin_unlock_bh(&afinfo->udp_table->hash[state->bucket].lock);
+ udptable = udp_get_table_afinfo(afinfo, net);
+
+ if (state->bucket <= udptable->mask)
+ spin_unlock_bh(&udptable->hash[state->bucket].lock);
+
return udp_get_first(seq, state->bucket + 1);
}
return sk;
@@ -3076,14 +3089,17 @@ void udp_seq_stop(struct seq_file *seq, void *v)
{
struct udp_iter_state *state = seq->private;
struct udp_seq_afinfo *afinfo;
+ struct udp_table *udptable;
if (state->bpf_seq_afinfo)
afinfo = state->bpf_seq_afinfo;
else
afinfo = pde_data(file_inode(seq->file));
- if (state->bucket <= afinfo->udp_table->mask)
- spin_unlock_bh(&afinfo->udp_table->hash[state->bucket].lock);
+ udptable = udp_get_table_afinfo(afinfo, seq_file_net(seq));
+
+ if (state->bucket <= udptable->mask)
+ spin_unlock_bh(&udptable->hash[state->bucket].lock);
}
EXPORT_SYMBOL(udp_seq_stop);
@@ -3196,7 +3212,7 @@ EXPORT_SYMBOL(udp_seq_ops);
static struct udp_seq_afinfo udp4_seq_afinfo = {
.family = AF_INET,
- .udp_table = &udp_table,
+ .udp_table = NULL,
};
static int __net_init udp4_proc_init_net(struct net *net)
@@ -3316,7 +3332,7 @@ static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux)
return -ENOMEM;
afinfo->family = AF_UNSPEC;
- afinfo->udp_table = &udp_table;
+ afinfo->udp_table = NULL;
st->bpf_seq_afinfo = afinfo;
ret = bpf_iter_init_seq_net(priv_data, aux);
if (ret)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index bbd6dc398f3b..c3dee1f8d3bd 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1724,7 +1724,7 @@ EXPORT_SYMBOL(udp6_seq_ops);
static struct udp_seq_afinfo udp6_seq_afinfo = {
.family = AF_INET6,
- .udp_table = &udp_table,
+ .udp_table = NULL,
};
int __net_init udp6_proc_init(struct net *net)
--
2.30.2
* [PATCH v3 net-next 4/6] udp: Access &udp_table via net.
2022-11-11 4:00 [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
` (2 preceding siblings ...)
2022-11-11 4:00 ` [PATCH v3 net-next 3/6] udp: Set NULL to udp_seq_afinfo.udp_table Kuniyuki Iwashima
@ 2022-11-11 4:00 ` Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 5/6] udp: Add bitmap in udp_table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 6/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
5 siblings, 0 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We will soon introduce an optional per-netns hash table
for UDP.
This means we cannot use udp_table directly in most places.
Instead, access it via net->ipv4.udp_table.
Direct access to the global udp_table remains valid only while
initialising udp_table itself and while creating/destroying each netns.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
net/core/filter.c | 4 ++--
net/ipv4/udp.c | 23 +++++++++++++----------
net/ipv4/udp_diag.c | 6 +++---
net/ipv4/udp_offload.c | 5 +++--
net/ipv6/udp.c | 19 +++++++++++--------
net/ipv6/udp_offload.c | 5 +++--
6 files changed, 35 insertions(+), 27 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index bb0136e7a8e4..2acd44c0c2b4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6428,7 +6428,7 @@ static struct sock *sk_lookup(struct net *net, struct bpf_sock_tuple *tuple,
else
sk = __udp4_lib_lookup(net, src4, tuple->ipv4.sport,
dst4, tuple->ipv4.dport,
- dif, sdif, &udp_table, NULL);
+ dif, sdif, net->ipv4.udp_table, NULL);
#if IS_ENABLED(CONFIG_IPV6)
} else {
struct in6_addr *src6 = (struct in6_addr *)&tuple->ipv6.saddr;
@@ -6444,7 +6444,7 @@ static struct sock *sk_lookup(struct net *net, struct bpf_sock_tuple *tuple,
src6, tuple->ipv6.sport,
dst6, tuple->ipv6.dport,
dif, sdif,
- &udp_table, NULL);
+ net->ipv4.udp_table, NULL);
#endif
}
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index a1a15eb76304..37e79158d145 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -472,7 +472,7 @@ static struct sock *udp4_lookup_run_bpf(struct net *net,
struct sock *sk, *reuse_sk;
bool no_reuseport;
- if (udptable != &udp_table)
+ if (udptable != net->ipv4.udp_table)
return NULL; /* only UDP is supported */
no_reuseport = bpf_sk_lookup_run_v4(net, IPPROTO_UDP, saddr, sport,
@@ -553,10 +553,11 @@ struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
__be16 sport, __be16 dport)
{
const struct iphdr *iph = ip_hdr(skb);
+ struct net *net = dev_net(skb->dev);
- return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
+ return __udp4_lib_lookup(net, iph->saddr, sport,
iph->daddr, dport, inet_iif(skb),
- inet_sdif(skb), &udp_table, NULL);
+ inet_sdif(skb), net->ipv4.udp_table, NULL);
}
/* Must be called under rcu_read_lock().
@@ -569,7 +570,7 @@ struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
struct sock *sk;
sk = __udp4_lib_lookup(net, saddr, sport, daddr, dport,
- dif, 0, &udp_table, NULL);
+ dif, 0, net->ipv4.udp_table, NULL);
if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
return sk;
@@ -807,7 +808,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
int udp_err(struct sk_buff *skb, u32 info)
{
- return __udp4_lib_err(skb, info, &udp_table);
+ return __udp4_lib_err(skb, info, dev_net(skb->dev)->ipv4.udp_table);
}
/*
@@ -2524,13 +2525,14 @@ static struct sock *__udp4_lib_mcast_demux_lookup(struct net *net,
__be16 rmt_port, __be32 rmt_addr,
int dif, int sdif)
{
+ struct udp_table *udptable = net->ipv4.udp_table;
unsigned short hnum = ntohs(loc_port);
struct sock *sk, *result;
struct udp_hslot *hslot;
unsigned int slot;
- slot = udp_hashfn(net, hnum, udp_table.mask);
- hslot = &udp_table.hash[slot];
+ slot = udp_hashfn(net, hnum, udptable->mask);
+ hslot = &udptable->hash[slot];
/* Do not bother scanning a too big list */
if (hslot->count > 10)
@@ -2558,6 +2560,7 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
__be16 rmt_port, __be32 rmt_addr,
int dif, int sdif)
{
+ struct udp_table *udptable = net->ipv4.udp_table;
INET_ADDR_COOKIE(acookie, rmt_addr, loc_addr);
unsigned short hnum = ntohs(loc_port);
unsigned int hash2, slot2;
@@ -2566,8 +2569,8 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
struct sock *sk;
hash2 = ipv4_portaddr_hash(net, loc_addr, hnum);
- slot2 = hash2 & udp_table.mask;
- hslot2 = &udp_table.hash2[slot2];
+ slot2 = hash2 & udptable->mask;
+ hslot2 = &udptable->hash2[slot2];
ports = INET_COMBINED_PORTS(rmt_port, hnum);
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
@@ -2649,7 +2652,7 @@ int udp_v4_early_demux(struct sk_buff *skb)
int udp_rcv(struct sk_buff *skb)
{
- return __udp4_lib_rcv(skb, &udp_table, IPPROTO_UDP);
+ return __udp4_lib_rcv(skb, dev_net(skb->dev)->ipv4.udp_table, IPPROTO_UDP);
}
void udp_destroy_sock(struct sock *sk)
diff --git a/net/ipv4/udp_diag.c b/net/ipv4/udp_diag.c
index 1ed8c4d78e5c..de3f2d31f510 100644
--- a/net/ipv4/udp_diag.c
+++ b/net/ipv4/udp_diag.c
@@ -147,13 +147,13 @@ static void udp_dump(struct udp_table *table, struct sk_buff *skb,
static void udp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
const struct inet_diag_req_v2 *r)
{
- udp_dump(&udp_table, skb, cb, r);
+ udp_dump(sock_net(cb->skb->sk)->ipv4.udp_table, skb, cb, r);
}
static int udp_diag_dump_one(struct netlink_callback *cb,
const struct inet_diag_req_v2 *req)
{
- return udp_dump_one(&udp_table, cb, req);
+ return udp_dump_one(sock_net(cb->skb->sk)->ipv4.udp_table, cb, req);
}
static void udp_diag_get_info(struct sock *sk, struct inet_diag_msg *r,
@@ -225,7 +225,7 @@ static int __udp_diag_destroy(struct sk_buff *in_skb,
static int udp_diag_destroy(struct sk_buff *in_skb,
const struct inet_diag_req_v2 *req)
{
- return __udp_diag_destroy(in_skb, req, &udp_table);
+ return __udp_diag_destroy(in_skb, req, sock_net(in_skb->sk)->ipv4.udp_table);
}
static int udplite_diag_destroy(struct sk_buff *in_skb,
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 6d1a4bec2614..aedde65e2268 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -600,10 +600,11 @@ static struct sock *udp4_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
__be16 dport)
{
const struct iphdr *iph = skb_gro_network_header(skb);
+ struct net *net = dev_net(skb->dev);
- return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
+ return __udp4_lib_lookup(net, iph->saddr, sport,
iph->daddr, dport, inet_iif(skb),
- inet_sdif(skb), &udp_table, NULL);
+ inet_sdif(skb), net->ipv4.udp_table, NULL);
}
INDIRECT_CALLABLE_SCOPE
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index c3dee1f8d3bd..9fb2f33ee3a7 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -217,7 +217,7 @@ static inline struct sock *udp6_lookup_run_bpf(struct net *net,
struct sock *sk, *reuse_sk;
bool no_reuseport;
- if (udptable != &udp_table)
+ if (udptable != net->ipv4.udp_table)
return NULL; /* only UDP is supported */
no_reuseport = bpf_sk_lookup_run_v6(net, IPPROTO_UDP, saddr, sport,
@@ -298,10 +298,11 @@ struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
__be16 sport, __be16 dport)
{
const struct ipv6hdr *iph = ipv6_hdr(skb);
+ struct net *net = dev_net(skb->dev);
- return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
+ return __udp6_lib_lookup(net, &iph->saddr, sport,
&iph->daddr, dport, inet6_iif(skb),
- inet6_sdif(skb), &udp_table, NULL);
+ inet6_sdif(skb), net->ipv4.udp_table, NULL);
}
/* Must be called under rcu_read_lock().
@@ -314,7 +315,7 @@ struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be
struct sock *sk;
sk = __udp6_lib_lookup(net, saddr, sport, daddr, dport,
- dif, 0, &udp_table, NULL);
+ dif, 0, net->ipv4.udp_table, NULL);
if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
return sk;
@@ -689,7 +690,8 @@ static __inline__ int udpv6_err(struct sk_buff *skb,
struct inet6_skb_parm *opt, u8 type,
u8 code, int offset, __be32 info)
{
- return __udp6_lib_err(skb, opt, type, code, offset, info, &udp_table);
+ return __udp6_lib_err(skb, opt, type, code, offset, info,
+ dev_net(skb->dev)->ipv4.udp_table);
}
static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
@@ -1063,6 +1065,7 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
__be16 rmt_port, const struct in6_addr *rmt_addr,
int dif, int sdif)
{
+ struct udp_table *udptable = net->ipv4.udp_table;
unsigned short hnum = ntohs(loc_port);
unsigned int hash2, slot2;
struct udp_hslot *hslot2;
@@ -1070,8 +1073,8 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
struct sock *sk;
hash2 = ipv6_portaddr_hash(net, loc_addr, hnum);
- slot2 = hash2 & udp_table.mask;
- hslot2 = &udp_table.hash2[slot2];
+ slot2 = hash2 & udptable->mask;
+ hslot2 = &udptable->hash2[slot2];
ports = INET_COMBINED_PORTS(rmt_port, hnum);
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
@@ -1127,7 +1130,7 @@ void udp_v6_early_demux(struct sk_buff *skb)
INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
{
- return __udp6_lib_rcv(skb, &udp_table, IPPROTO_UDP);
+ return __udp6_lib_rcv(skb, dev_net(skb->dev)->ipv4.udp_table, IPPROTO_UDP);
}
/*
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 7720d04ed396..e0e10f6bcdc1 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -116,10 +116,11 @@ static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
__be16 dport)
{
const struct ipv6hdr *iph = skb_gro_network_header(skb);
+ struct net *net = dev_net(skb->dev);
- return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
+ return __udp6_lib_lookup(net, &iph->saddr, sport,
&iph->daddr, dport, inet6_iif(skb),
- inet6_sdif(skb), &udp_table, NULL);
+ inet6_sdif(skb), net->ipv4.udp_table, NULL);
}
INDIRECT_CALLABLE_SCOPE
--
2.30.2
* [PATCH v3 net-next 5/6] udp: Add bitmap in udp_table.
2022-11-11 4:00 [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
` (3 preceding siblings ...)
2022-11-11 4:00 ` [PATCH v3 net-next 4/6] udp: Access &udp_table via net Kuniyuki Iwashima
@ 2022-11-11 4:00 ` Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v3 net-next 6/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
5 siblings, 0 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We use a bitmap in udp_lib_get_port() to search for an available
port. Currently, the bitmap size is fixed and has enough room for
UDP_HTABLE_SIZE_MIN.
The following patch adds a per-netns hash table for UDP, whose size
can be smaller than UDP_HTABLE_SIZE_MIN. If we declared a large enough
bitmap on the stack, the frame size would exceed CONFIG_FRAME_WARN.
To avoid that, we allocate a bitmap for each udp_table->hash[slot] in
advance.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
include/linux/udp.h | 1 +
include/net/udp.h | 20 ++++++++++++++++++++
net/ipv4/udp.c | 12 +++++++++---
3 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/include/linux/udp.h b/include/linux/udp.h
index dea57aa37df6..779a7c065a32 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -23,6 +23,7 @@ static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
return (struct udphdr *)skb_transport_header(skb);
}
+#define UDP_MAX_PORT_LOG 16
#define UDP_HTABLE_SIZE_MIN (CONFIG_BASE_SMALL ? 128 : 256)
static inline u32 udp_hashfn(const struct net *net, u32 num, u32 mask)
diff --git a/include/net/udp.h b/include/net/udp.h
index de4b528522bb..314dd51a2cc6 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -72,11 +72,31 @@ struct udp_hslot {
struct udp_table {
struct udp_hslot *hash;
struct udp_hslot *hash2;
+ unsigned long *bitmap;
unsigned int mask;
unsigned int log;
};
extern struct udp_table udp_table;
void udp_table_init(struct udp_table *, const char *);
+
+static inline unsigned int udp_bitmap_size(struct udp_table *table)
+{
+ return 1 << (UDP_MAX_PORT_LOG - table->log);
+}
+
+static inline unsigned long *udp_hashbitmap(struct udp_table *table,
+ struct net *net, unsigned int num)
+{
+ unsigned long *bitmap;
+ unsigned int size;
+
+ size = udp_bitmap_size(table);
+ bitmap = &table->bitmap[udp_hashfn(net, num, table->mask) * BITS_TO_LONGS(size)];
+ bitmap_zero(bitmap, size);
+
+ return bitmap;
+}
+
static inline struct udp_hslot *udp_hashslot(struct udp_table *table,
struct net *net, unsigned int num)
{
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 37e79158d145..42d7b84a5f16 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -129,7 +129,6 @@ DEFINE_PER_CPU(int, udp_memory_per_cpu_fw_alloc);
EXPORT_PER_CPU_SYMBOL_GPL(udp_memory_per_cpu_fw_alloc);
#define MAX_UDP_PORTS 65536
-#define PORTS_PER_CHAIN (MAX_UDP_PORTS / UDP_HTABLE_SIZE_MIN)
static struct udp_table *udp_get_table_prot(struct sock *sk)
{
@@ -243,9 +242,9 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
int error = 1;
if (!snum) {
- DECLARE_BITMAP(bitmap, PORTS_PER_CHAIN);
unsigned short first, last;
int low, high, remaining;
+ unsigned long *bitmap;
unsigned int rand;
inet_get_local_port_range(net, &low, &high);
@@ -260,8 +259,8 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
last = first + udptable->mask + 1;
do {
hslot = udp_hashslot(udptable, net, first);
- bitmap_zero(bitmap, PORTS_PER_CHAIN);
spin_lock_bh(&hslot->lock);
+ bitmap = udp_hashbitmap(udptable, net, first);
udp_lib_lport_inuse(net, snum, hslot, bitmap, sk,
udptable->log);
@@ -3290,6 +3289,13 @@ void __init udp_table_init(struct udp_table *table, const char *name)
table->hash2[i].count = 0;
spin_lock_init(&table->hash2[i].lock);
}
+
+ table->bitmap = kmalloc_array(table->mask + 1,
+ BITS_TO_LONGS(udp_bitmap_size(table)) *
+ sizeof(unsigned long),
+ GFP_KERNEL);
+ if (!table->bitmap)
+ panic("UDP: failed to alloc bitmap\n");
}
u32 udp_flow_hashrnd(void)
--
2.30.2
* [PATCH v3 net-next 6/6] udp: Introduce optional per-netns hash table.
2022-11-11 4:00 [PATCH v3 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
` (4 preceding siblings ...)
2022-11-11 4:00 ` [PATCH v3 net-next 5/6] udp: Add bitmap in udp_table Kuniyuki Iwashima
@ 2022-11-11 4:00 ` Kuniyuki Iwashima
2022-11-11 8:53 ` Paolo Abeni
5 siblings, 1 reply; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-11 4:00 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
The maximum hash table size is 64K due to the nature of the protocol. [0]
It's smaller than TCP's, so it takes fewer sockets to cause a
performance drop. On an EC2 c5.24xlarge instance (192 GiB memory),
while running iperf3 in a different netns, creating 32Mi sockets
without data transfer in the root netns degrades the iperf3
connection's throughput:
  uhash_entries  sockets  length  Gbps
            64K        1       1  5.69
                     1Mi      16  5.27
                     2Mi      32  4.90
                     4Mi      64  4.09
                     8Mi     128  2.96
                    16Mi     256  2.06
                    32Mi     512  1.12
The per-netns hash table breaks the lengthy lists into shorter ones. It is
useful on a multi-tenant system with thousands of netns. With smaller hash
tables, we can look up sockets faster, isolate noisy neighbours, and reduce
lock contention.
The max size of the per-netns table is 64K as well. This is because
the possible hash range of udp_hashfn() always fits in 64K within the
same netns, so we could not make full use of a table with more than
64K buckets:
/* 0 < num < 64K -> X < hash < X + 64K */
(num + net_hash_mix(net)) & mask;
The sysctl usage is the same as TCP's:
$ dmesg | cut -d ' ' -f 6- | grep "UDP hash"
UDP hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc)
# sysctl net.ipv4.udp_hash_entries
net.ipv4.udp_hash_entries = 65536 # can be changed by uhash_entries
# sysctl net.ipv4.udp_child_hash_entries
net.ipv4.udp_child_hash_entries = 0 # disabled by default
# ip netns add test1
# ip netns exec test1 sysctl net.ipv4.udp_hash_entries
net.ipv4.udp_hash_entries = -65536 # share the global table
# sysctl -w net.ipv4.udp_child_hash_entries=100
net.ipv4.udp_child_hash_entries = 100
# ip netns add test2
# ip netns exec test2 sysctl net.ipv4.udp_hash_entries
net.ipv4.udp_hash_entries = 128 # own a per-netns table with 2^n buckets
We could optimise the hash table lookup/iteration further by removing
the netns comparison for the per-netns one in the future. Also, we
could optimise the sparse udp_hslot layout by putting it in udp_table.
[0]: https://lore.kernel.org/netdev/4ACC2815.7010101@gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Documentation/networking/ip-sysctl.rst | 27 +++++++
include/linux/udp.h | 1 +
include/net/netns/ipv4.h | 2 +
net/ipv4/sysctl_net_ipv4.c | 38 ++++++++++
net/ipv4/udp.c | 100 +++++++++++++++++++++++--
5 files changed, 163 insertions(+), 5 deletions(-)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 815efc89ad73..ea788ef4def0 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1177,6 +1177,33 @@ udp_rmem_min - INTEGER
udp_wmem_min - INTEGER
UDP does not have tx memory accounting and this tunable has no effect.
+udp_hash_entries - INTEGER
+ Show the number of hash buckets for UDP sockets in the current
+ networking namespace.
+
+ A negative value means the networking namespace does not own its
+ hash buckets and shares the initial networking namespace's one.
+
+udp_child_hash_entries - INTEGER
+ Control the number of hash buckets for UDP sockets in the child
+ networking namespace, which must be set before clone() or unshare().
+
+ If the value is not 0, the kernel uses a value rounded up to 2^n
+ as the actual hash bucket size. 0 is a special value, meaning
+ the child networking namespace will share the initial networking
+ namespace's hash buckets.
+
+ Note that the child will use the global one in case the kernel
+ fails to allocate enough memory. In addition, the global hash
+ buckets are spread over available NUMA nodes, but the allocation
+ of the child hash table depends on the current process's NUMA
+ policy, which could result in performance differences.
+
+ Possible values: 0, 2^n (n: 0 - 16 (64K))
+
+ Default: 0
+
+
RAW variables
=============
diff --git a/include/linux/udp.h b/include/linux/udp.h
index 779a7c065a32..18c8c9b7e39a 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -25,6 +25,7 @@ static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
#define UDP_MAX_PORT_LOG 16
#define UDP_HTABLE_SIZE_MIN (CONFIG_BASE_SMALL ? 128 : 256)
+#define UDP_HTABLE_SIZE_MAX (1 << UDP_MAX_PORT_LOG)
static inline u32 udp_hashfn(const struct net *net, u32 num, u32 mask)
{
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index e4cc4d3cacc4..db762e35aca9 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -208,6 +208,8 @@ struct netns_ipv4 {
atomic_t dev_addr_genid;
+ unsigned int sysctl_udp_child_hash_entries;
+
#ifdef CONFIG_SYSCTL
unsigned long *sysctl_local_reserved_ports;
int sysctl_ip_prot_sock;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0af28cedd071..34a601b9e57d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -40,6 +40,7 @@ static int one_day_secs = 24 * 3600;
static u32 fib_multipath_hash_fields_all_mask __maybe_unused =
FIB_MULTIPATH_HASH_FIELD_ALL_MASK;
static unsigned int tcp_child_ehash_entries_max = 16 * 1024 * 1024;
+static unsigned int udp_child_hash_entries_max = UDP_HTABLE_SIZE_MAX;
static int tcp_plb_max_rounds = 31;
static int tcp_plb_max_cong_thresh = 256;
@@ -408,6 +409,28 @@ static int proc_tcp_ehash_entries(struct ctl_table *table, int write,
return proc_dointvec(&tbl, write, buffer, lenp, ppos);
}
+static int proc_udp_hash_entries(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct net *net = container_of(table->data, struct net,
+ ipv4.sysctl_udp_child_hash_entries);
+ int udp_hash_entries;
+ struct ctl_table tbl;
+
+ udp_hash_entries = net->ipv4.udp_table->mask + 1;
+
+ /* A negative number indicates that the child netns
+ * shares the global udp_table.
+ */
+ if (!net_eq(net, &init_net) && net->ipv4.udp_table == &udp_table)
+ udp_hash_entries *= -1;
+
+ tbl.data = &udp_hash_entries;
+ tbl.maxlen = sizeof(int);
+
+ return proc_dointvec(&tbl, write, buffer, lenp, ppos);
+}
+
#ifdef CONFIG_IP_ROUTE_MULTIPATH
static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
void *buffer, size_t *lenp,
@@ -1361,6 +1384,21 @@ static struct ctl_table ipv4_net_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = &tcp_child_ehash_entries_max,
},
+ {
+ .procname = "udp_hash_entries",
+ .data = &init_net.ipv4.sysctl_udp_child_hash_entries,
+ .mode = 0444,
+ .proc_handler = proc_udp_hash_entries,
+ },
+ {
+ .procname = "udp_child_hash_entries",
+ .data = &init_net.ipv4.sysctl_udp_child_hash_entries,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = &udp_child_hash_entries_max,
+ },
{
.procname = "udp_rmem_min",
.data = &init_net.ipv4.sysctl_udp_rmem_min,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 42d7b84a5f16..c76a4d7ee74e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -3276,7 +3276,7 @@ void __init udp_table_init(struct udp_table *table, const char *name)
&table->log,
&table->mask,
UDP_HTABLE_SIZE_MIN,
- 64 * 1024);
+ UDP_HTABLE_SIZE_MAX);
table->hash2 = table->hash + (table->mask + 1);
for (i = 0; i <= table->mask; i++) {
@@ -3308,22 +3308,112 @@ u32 udp_flow_hashrnd(void)
}
EXPORT_SYMBOL(udp_flow_hashrnd);
-static int __net_init udp_sysctl_init(struct net *net)
+static void __net_init udp_sysctl_init(struct net *net)
{
- net->ipv4.udp_table = &udp_table;
-
net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
#ifdef CONFIG_NET_L3_MASTER_DEV
net->ipv4.sysctl_udp_l3mdev_accept = 0;
#endif
+}
+
+static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_entries)
+{
+ unsigned long hash_size, bitmap_size;
+ struct udp_table *udptable;
+ int i;
+
+ udptable = kmalloc(sizeof(*udptable), GFP_KERNEL);
+ if (!udptable)
+ goto out;
+
+ udptable->log = ilog2(hash_entries);
+ udptable->mask = hash_entries - 1;
+
+ hash_size = L1_CACHE_ALIGN(hash_entries * 2 * sizeof(struct udp_hslot));
+ bitmap_size = hash_entries *
+ BITS_TO_LONGS(udp_bitmap_size(udptable)) * sizeof(unsigned long);
+
+ udptable->hash = vmalloc_huge(hash_size + bitmap_size, GFP_KERNEL_ACCOUNT);
+ if (!udptable->hash)
+ goto free_table;
+
+ udptable->hash2 = udptable->hash + hash_entries;
+ udptable->bitmap = (void *)udptable->hash + hash_size;
+
+ for (i = 0; i < hash_entries; i++) {
+ INIT_HLIST_HEAD(&udptable->hash[i].head);
+ udptable->hash[i].count = 0;
+ spin_lock_init(&udptable->hash[i].lock);
+
+ INIT_HLIST_HEAD(&udptable->hash2[i].head);
+ udptable->hash2[i].count = 0;
+ spin_lock_init(&udptable->hash2[i].lock);
+ }
+
+ return udptable;
+
+free_table:
+ kfree(udptable);
+out:
+ return NULL;
+}
+
+static void __net_exit udp_pernet_table_free(struct net *net)
+{
+ struct udp_table *udptable = net->ipv4.udp_table;
+
+ if (udptable == &udp_table)
+ return;
+
+ kvfree(udptable->hash);
+ kfree(udptable);
+}
+
+static void __net_init udp_set_table(struct net *net)
+{
+ struct udp_table *udptable;
+ unsigned int hash_entries;
+ struct net *old_net;
+
+ if (net_eq(net, &init_net))
+ goto fallback;
+
+ old_net = current->nsproxy->net_ns;
+ hash_entries = READ_ONCE(old_net->ipv4.sysctl_udp_child_hash_entries);
+ if (!hash_entries)
+ goto fallback;
+
+ hash_entries = roundup_pow_of_two(hash_entries);
+ udptable = udp_pernet_table_alloc(hash_entries);
+ if (udptable) {
+ net->ipv4.udp_table = udptable;
+ } else {
+ pr_warn("Failed to allocate UDP hash table (entries: %u) "
+ "for a netns, fallback to the global one\n",
+ hash_entries);
+fallback:
+ net->ipv4.udp_table = &udp_table;
+ }
+}
+
+static int __net_init udp_pernet_init(struct net *net)
+{
+ udp_sysctl_init(net);
+ udp_set_table(net);
return 0;
}
+static void __net_exit udp_pernet_exit(struct net *net)
+{
+ udp_pernet_table_free(net);
+}
+
static struct pernet_operations __net_initdata udp_sysctl_ops = {
- .init = udp_sysctl_init,
+ .init = udp_pernet_init,
+ .exit = udp_pernet_exit,
};
#if defined(CONFIG_BPF_SYSCALL) && defined(CONFIG_PROC_FS)
--
2.30.2
* Re: [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table.
2022-11-11 4:00 ` [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
@ 2022-11-11 8:53 ` Paolo Abeni
2022-11-14 20:21 ` Kuniyuki Iwashima
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2022-11-11 8:53 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski
Cc: Kuniyuki Iwashima, netdev
On Thu, 2022-11-10 at 20:00 -0800, Kuniyuki Iwashima wrote:
> @@ -408,6 +409,28 @@ static int proc_tcp_ehash_entries(struct ctl_table *table, int write,
> return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> }
>
> +static int proc_udp_hash_entries(struct ctl_table *table, int write,
> + void *buffer, size_t *lenp, loff_t *ppos)
> +{
> + struct net *net = container_of(table->data, struct net,
> + ipv4.sysctl_udp_child_hash_entries);
> + int udp_hash_entries;
> + struct ctl_table tbl;
> +
> + udp_hash_entries = net->ipv4.udp_table->mask + 1;
> +
> + /* A negative number indicates that the child netns
> + * shares the global udp_table.
> + */
> + if (!net_eq(net, &init_net) && net->ipv4.udp_table == &udp_table)
> + udp_hash_entries *= -1;
> +
> + tbl.data = &udp_hash_entries;
> + tbl.maxlen = sizeof(int);
I see the procfs code below will only use tbl.data and tbl.maxlen, but
perhaps it is cleaner to initially memset tbl to 0 explicitly
>
> +
> + return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> +}
> +
> #ifdef CONFIG_IP_ROUTE_MULTIPATH
> static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
> void *buffer, size_t *lenp,
[...]
> @@ -3308,22 +3308,112 @@ u32 udp_flow_hashrnd(void)
> }
> EXPORT_SYMBOL(udp_flow_hashrnd);
>
> -static int __net_init udp_sysctl_init(struct net *net)
> +static void __net_init udp_sysctl_init(struct net *net)
> {
> - net->ipv4.udp_table = &udp_table;
> -
> net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
> net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
>
> #ifdef CONFIG_NET_L3_MASTER_DEV
> net->ipv4.sysctl_udp_l3mdev_accept = 0;
> #endif
> +}
> +
> +static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_entries)
> +{
> + unsigned long hash_size, bitmap_size;
> + struct udp_table *udptable;
> + int i;
> +
> + udptable = kmalloc(sizeof(*udptable), GFP_KERNEL);
> + if (!udptable)
> + goto out;
> +
> + udptable->log = ilog2(hash_entries);
> + udptable->mask = hash_entries - 1;
> +
> + hash_size = L1_CACHE_ALIGN(hash_entries * 2 * sizeof(struct udp_hslot));
> + bitmap_size = hash_entries *
> + BITS_TO_LONGS(udp_bitmap_size(udptable)) * sizeof(unsigned long);
Ouch, I'm very sorry. I did not realize we need a bitmap per hash
bucket. This leads to a constant 8k additional memory overhead per
netns, independently of the arch's long bit size.
I guess it's still acceptable, but perhaps worth mentioning in the
commit message?
Again sorry for the back && forth, I'm reconsidering all the above
given my dumb misunderstanding.
I see that a minimum size of 256 hash buckets does not match your use
case, still... if lower values of the per-netns hash size are inflated
to e.g. 128 and we keep the bitmap on the stack (it should be 64 bytes
wide, I guess still an acceptable value), the per-netns memory overhead
will be 128 * 2 * <hash bucket size> = 8K, less than what we get with
the above schema and any smaller hash - a single hash bucket leads to a
32 + 8K memory overhead.
TL;DR: what about accepting any per netns hash table size, but always
allocate at least 128 buckets and keep the bitmap on the stack?
Thanks,
Paolo
* Re: [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table.
2022-11-11 8:53 ` Paolo Abeni
@ 2022-11-14 20:21 ` Kuniyuki Iwashima
2022-11-14 20:55 ` Paolo Abeni
0 siblings, 1 reply; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-14 20:21 UTC (permalink / raw)
To: pabeni; +Cc: davem, edumazet, kuba, kuni1840, kuniyu, netdev
From: Paolo Abeni <pabeni@redhat.com>
Date: Fri, 11 Nov 2022 09:53:31 +0100
> On Thu, 2022-11-10 at 20:00 -0800, Kuniyuki Iwashima wrote:
> > @@ -408,6 +409,28 @@ static int proc_tcp_ehash_entries(struct ctl_table *table, int write,
> > return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > }
> >
> > +static int proc_udp_hash_entries(struct ctl_table *table, int write,
> > + void *buffer, size_t *lenp, loff_t *ppos)
> > +{
> > + struct net *net = container_of(table->data, struct net,
> > + ipv4.sysctl_udp_child_hash_entries);
> > + int udp_hash_entries;
> > + struct ctl_table tbl;
> > +
> > + udp_hash_entries = net->ipv4.udp_table->mask + 1;
> > +
> > + /* A negative number indicates that the child netns
> > + * shares the global udp_table.
> > + */
> > + if (!net_eq(net, &init_net) && net->ipv4.udp_table == &udp_table)
> > + udp_hash_entries *= -1;
> > +
> > + tbl.data = &udp_hash_entries;
> > + tbl.maxlen = sizeof(int);
>
> I see the procfs code below will only use tbl.data and tbl.maxlen, but
> perhaps it is cleaner to initially memset tbl to 0 explicitly
Will add memset()
>
> >
> > +
> > + return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > +}
> > +
> > #ifdef CONFIG_IP_ROUTE_MULTIPATH
> > static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
> > void *buffer, size_t *lenp,
>
> [...]
>
> > @@ -3308,22 +3308,112 @@ u32 udp_flow_hashrnd(void)
> > }
> > EXPORT_SYMBOL(udp_flow_hashrnd);
> >
> > -static int __net_init udp_sysctl_init(struct net *net)
> > +static void __net_init udp_sysctl_init(struct net *net)
> > {
> > - net->ipv4.udp_table = &udp_table;
> > -
> > net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
> > net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
> >
> > #ifdef CONFIG_NET_L3_MASTER_DEV
> > net->ipv4.sysctl_udp_l3mdev_accept = 0;
> > #endif
> > +}
> > +
> > +static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_entries)
> > +{
> > + unsigned long hash_size, bitmap_size;
> > + struct udp_table *udptable;
> > + int i;
> > +
> > + udptable = kmalloc(sizeof(*udptable), GFP_KERNEL);
> > + if (!udptable)
> > + goto out;
> > +
> > + udptable->log = ilog2(hash_entries);
> > + udptable->mask = hash_entries - 1;
> > +
> > + hash_size = L1_CACHE_ALIGN(hash_entries * 2 * sizeof(struct udp_hslot));
> > + bitmap_size = hash_entries *
> > + BITS_TO_LONGS(udp_bitmap_size(udptable)) * sizeof(unsigned long);
>
> Ouch, I'm very sorry. I did not realize we need a bitmap per hash
> bucket. This leads to a constant 8k additional memory overhead per
> netns, independently of the arch's long bit size.
Ugh, it will be 64K per netns ... ?
hash_entries : 2 ^ n
BITS_TO_LONGS : 2 ^ -m # arch specific
udp_bitmap_size(udptable) : 2 ^ (16 - n)
sizeof(unsigned long) : 2 ^ m # arch specific
(2 ^ n) * (2 ^ -m) * (2 ^ (16 - n)) * (2 ^ m)
= 2 ^ (n - m + 16 - n + m)
= 2 ^ 16
= 64 K
>
> I guess it's still acceptable, but perhaps worth mentioning in the
> commit message?
>
> Again sorry for the back && forth, I'm reconsidering all the above
> given my dumb misunderstanding.
>
> I see that a minimum size of 256 hash buckets does not match your use
> case, still... if lower values of the per-netns hash size are inflated
> to e.g. 128 and we keep the bitmap on the stack (it should be 64 bytes
> wide, I guess still an acceptable value), the per-netns memory overhead
> will be 128 * 2 * <hash bucket size> = 8K, less than what we get with
> the above schema and any smaller hash - a single hash bucket leads to a
> 32 + 8K memory overhead.
>
> TL;DR: what about accepting any per netns hash table size, but always
> allocate at least 128 buckets and keep the bitmap on the stack?
Sure, I'll change the min to 128.
128 * 2 * 16 = 4096 = 4K
---
$ pahole -C udp_hslot vmlinux
struct udp_hslot {
struct hlist_head head; /* 0 8 */
int count; /* 8 4 */
spinlock_t lock; /* 12 4 */
/* size: 16, cachelines: 1, members: 3 */
/* last cacheline: 16 bytes */
} __attribute__((__aligned__(16)));
---
Thank you!
>
> Thanks,
>
> Paolo
* Re: [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table.
2022-11-14 20:21 ` Kuniyuki Iwashima
@ 2022-11-14 20:55 ` Paolo Abeni
2022-11-14 21:08 ` Kuniyuki Iwashima
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2022-11-14 20:55 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: davem, edumazet, kuba, kuni1840, netdev
On Mon, 2022-11-14 at 12:21 -0800, Kuniyuki Iwashima wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Fri, 11 Nov 2022 09:53:31 +0100
> > On Thu, 2022-11-10 at 20:00 -0800, Kuniyuki Iwashima wrote:
> > > @@ -408,6 +409,28 @@ static int proc_tcp_ehash_entries(struct ctl_table *table, int write,
> > > return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > > }
> > >
> > > +static int proc_udp_hash_entries(struct ctl_table *table, int write,
> > > + void *buffer, size_t *lenp, loff_t *ppos)
> > > +{
> > > + struct net *net = container_of(table->data, struct net,
> > > + ipv4.sysctl_udp_child_hash_entries);
> > > + int udp_hash_entries;
> > > + struct ctl_table tbl;
> > > +
> > > + udp_hash_entries = net->ipv4.udp_table->mask + 1;
> > > +
> > > + /* A negative number indicates that the child netns
> > > + * shares the global udp_table.
> > > + */
> > > + if (!net_eq(net, &init_net) && net->ipv4.udp_table == &udp_table)
> > > + udp_hash_entries *= -1;
> > > +
> > > + tbl.data = &udp_hash_entries;
> > > + tbl.maxlen = sizeof(int);
> >
> > I see the procfs code below will only use tbl.data and tbl.maxlen, but
> > perhaps it is cleaner to initially memset tbl to 0 explicitly
>
> Will add memset()
>
>
> >
> > >
> > > +
> > > + return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > > +}
> > > +
> > > #ifdef CONFIG_IP_ROUTE_MULTIPATH
> > > static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
> > > void *buffer, size_t *lenp,
> >
> > [...]
> >
> > > @@ -3308,22 +3308,112 @@ u32 udp_flow_hashrnd(void)
> > > }
> > > EXPORT_SYMBOL(udp_flow_hashrnd);
> > >
> > > -static int __net_init udp_sysctl_init(struct net *net)
> > > +static void __net_init udp_sysctl_init(struct net *net)
> > > {
> > > - net->ipv4.udp_table = &udp_table;
> > > -
> > > net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
> > > net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
> > >
> > > #ifdef CONFIG_NET_L3_MASTER_DEV
> > > net->ipv4.sysctl_udp_l3mdev_accept = 0;
> > > #endif
> > > +}
> > > +
> > > +static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_entries)
> > > +{
> > > + unsigned long hash_size, bitmap_size;
> > > + struct udp_table *udptable;
> > > + int i;
> > > +
> > > + udptable = kmalloc(sizeof(*udptable), GFP_KERNEL);
> > > + if (!udptable)
> > > + goto out;
> > > +
> > > + udptable->log = ilog2(hash_entries);
> > > + udptable->mask = hash_entries - 1;
> > > +
> > > + hash_size = L1_CACHE_ALIGN(hash_entries * 2 * sizeof(struct udp_hslot));
> > > + bitmap_size = hash_entries *
> > > + BITS_TO_LONGS(udp_bitmap_size(udptable)) * sizeof(unsigned long);
> >
> > Ouch, I'm very sorry. I did not realize we need a bitmap per hash
> > bucket. This leads to a constant 8k additional memory overhead per
> > netns, independently of the arch's long bit size.
>
> Ugh, it will be 64K per netns ... ?
>
> hash_entries : 2 ^ n
> BITS_TO_LONGS : 2 ^ -m # arch specific
> udp_bitmap_size(udptable) : 2 ^ (16 - n)
> sizeof(unsigned long) : 2 ^ m # arch specific
>
> (2 ^ n) * (2 ^ -m) * (2 ^ (16 - n)) * (2 ^ m)
> = 2 ^ (n - m + 16 - n + m)
> = 2 ^ 16
> = 64 K
For the record, I still think it's 8k ;)
BITS_TO_LONGS(n) * sizeof(unsigned long) is always equal to n/8
regardless of the arch, while the above math gives BITS_TO_LONGS(n) *
sizeof(unsigned long) == n.
Cheers,
Paolo
* Re: [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table.
2022-11-14 20:55 ` Paolo Abeni
@ 2022-11-14 21:08 ` Kuniyuki Iwashima
0 siblings, 0 replies; 11+ messages in thread
From: Kuniyuki Iwashima @ 2022-11-14 21:08 UTC (permalink / raw)
To: pabeni; +Cc: davem, edumazet, kuba, kuni1840, kuniyu, netdev
From: Paolo Abeni <pabeni@redhat.com>
Date: Mon, 14 Nov 2022 21:55:11 +0100
> On Mon, 2022-11-14 at 12:21 -0800, Kuniyuki Iwashima wrote:
> > From: Paolo Abeni <pabeni@redhat.com>
> > Date: Fri, 11 Nov 2022 09:53:31 +0100
> > > On Thu, 2022-11-10 at 20:00 -0800, Kuniyuki Iwashima wrote:
> > > > @@ -408,6 +409,28 @@ static int proc_tcp_ehash_entries(struct ctl_table *table, int write,
> > > > return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > > > }
> > > >
> > > > +static int proc_udp_hash_entries(struct ctl_table *table, int write,
> > > > + void *buffer, size_t *lenp, loff_t *ppos)
> > > > +{
> > > > + struct net *net = container_of(table->data, struct net,
> > > > + ipv4.sysctl_udp_child_hash_entries);
> > > > + int udp_hash_entries;
> > > > + struct ctl_table tbl;
> > > > +
> > > > + udp_hash_entries = net->ipv4.udp_table->mask + 1;
> > > > +
> > > > + /* A negative number indicates that the child netns
> > > > + * shares the global udp_table.
> > > > + */
> > > > + if (!net_eq(net, &init_net) && net->ipv4.udp_table == &udp_table)
> > > > + udp_hash_entries *= -1;
> > > > +
> > > > + tbl.data = &udp_hash_entries;
> > > > + tbl.maxlen = sizeof(int);
> > >
> > > I see the procfs code below will only use tbl.data and tbl.maxlen, but
> > > perhaps it is cleaner to initially memset tbl to 0 explicitly
> >
> > Will add memset()
> >
> >
> > >
> > > >
> > > > +
> > > > + return proc_dointvec(&tbl, write, buffer, lenp, ppos);
> > > > +}
> > > > +
> > > > #ifdef CONFIG_IP_ROUTE_MULTIPATH
> > > > static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
> > > > void *buffer, size_t *lenp,
> > >
> > > [...]
> > >
> > > > @@ -3308,22 +3308,112 @@ u32 udp_flow_hashrnd(void)
> > > > }
> > > > EXPORT_SYMBOL(udp_flow_hashrnd);
> > > >
> > > > -static int __net_init udp_sysctl_init(struct net *net)
> > > > +static void __net_init udp_sysctl_init(struct net *net)
> > > > {
> > > > - net->ipv4.udp_table = &udp_table;
> > > > -
> > > > net->ipv4.sysctl_udp_rmem_min = PAGE_SIZE;
> > > > net->ipv4.sysctl_udp_wmem_min = PAGE_SIZE;
> > > >
> > > > #ifdef CONFIG_NET_L3_MASTER_DEV
> > > > net->ipv4.sysctl_udp_l3mdev_accept = 0;
> > > > #endif
> > > > +}
> > > > +
> > > > +static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_entries)
> > > > +{
> > > > + unsigned long hash_size, bitmap_size;
> > > > + struct udp_table *udptable;
> > > > + int i;
> > > > +
> > > > + udptable = kmalloc(sizeof(*udptable), GFP_KERNEL);
> > > > + if (!udptable)
> > > > + goto out;
> > > > +
> > > > + udptable->log = ilog2(hash_entries);
> > > > + udptable->mask = hash_entries - 1;
> > > > +
> > > > + hash_size = L1_CACHE_ALIGN(hash_entries * 2 * sizeof(struct udp_hslot));
> > > > + bitmap_size = hash_entries *
> > > > + BITS_TO_LONGS(udp_bitmap_size(udptable)) * sizeof(unsigned long);
> > >
> > > Ouch, I'm very sorry. I did not realize we need a bitmap per hash
> > > bucket. This leads to a constant 8k additional memory overhead per
> > > netns, independently of the arch's long bit size.
> >
> > Ugh, it will be 64K per netns ... ?
> >
> > hash_entries : 2 ^ n
> > BITS_TO_LONGS : 2 ^ -m # arch specific
-(m + 3)
> > udp_bitmap_size(udptable) : 2 ^ (16 - n)
> > sizeof(unsigned long) : 2 ^ m # arch specific
> >
> > (2 ^ n) * (2 ^ -m) * (2 ^ (16 - n)) * (2 ^ m)
(-m - 3)
> > = 2 ^ (n - m + 16 - n + m)
13
> > = 2 ^ 16
13
> > = 64 K
8 K
>
> For the record, I still think it's 8k ;)
>
> BITS_TO_LONGS(n) * sizeof(unsigned long) is always equal to n/8
> regardless of the arch, while the above math gives BITS_TO_LONGS(n) *
> sizeof(unsigned long) == n.
Ah, right!
My math was bad :p
Thank you!
end of thread, other threads:[~2022-11-14 21:09 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-11 4:00 [PATCH v2 net-next 0/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v2 net-next 1/6] udp: Clean up some functions Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v2 net-next 2/6] udp: Set NULL to sk->sk_prot->h.udp_table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v2 net-next 3/6] udp: Set NULL to udp_seq_afinfo.udp_table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v2 net-next 4/6] udp: Access &udp_table via net Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v2 net-next 5/6] udp: Add bitmap in udp_table Kuniyuki Iwashima
2022-11-11 4:00 ` [PATCH v2 net-next 6/6] udp: Introduce optional per-netns hash table Kuniyuki Iwashima
2022-11-11 8:53 ` Paolo Abeni
2022-11-14 20:21 ` Kuniyuki Iwashima
2022-11-14 20:55 ` Paolo Abeni
2022-11-14 21:08 ` Kuniyuki Iwashima