netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] inet: add bound ports statistic
@ 2018-03-01  2:01 Stephen Hemminger
  2018-03-01  2:28 ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2018-03-01  2:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

This adds a number of bound ports in a network namespace which is
a useful for socket summary (ss) command. It adds one additional
field onto /proc/net/sockstat.

Since collecting these kind of counters can be sensitive for large
machines, the impact is placed on the reading side which will be
much rarer.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 include/net/inet_connection_sock.h |  2 +-
 net/ipv4/inet_connection_sock.c    | 22 ++++++++++++++++++++++
 net/ipv4/proc.c                    |  5 +++--
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c1a93ce35e62..3044deec73ce 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -265,7 +265,7 @@ inet_csk_rto_backoff(const struct inet_connection_sock *icsk,
 }
 
 struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern);
-
+unsigned int inet_csk_count_ports(struct net *net, struct proto *prot);
 int inet_csk_get_port(struct sock *sk, unsigned short snum);
 
 struct dst_entry *inet_csk_route_req(const struct sock *sk, struct flowi4 *fl4,
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 881ac6d046f2..2418abce4d50 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -424,6 +424,28 @@ static int inet_csk_wait_for_connect(struct sock *sk, long timeo)
 	return err;
 }
 
+/* Count how many any entries are in the bind hash table */
+unsigned int inet_csk_count_ports(struct net *net, struct proto *prot)
+{
+	struct inet_hashinfo *hinfo = prot->h.hashinfo;
+	struct inet_bind_hashbucket *head;
+	struct inet_bind_bucket *tb;
+	unsigned int i, ports = 0;
+
+	for (i = 0; i < hinfo->bhash_size; i++) {
+		head = &hinfo->bhash[i];
+
+		spin_lock_bh(&head->lock);
+		inet_bind_bucket_for_each(tb, &head->chain) {
+			if (net_eq(ib_net(tb), net))
+				++ports;
+		}
+		spin_unlock_bh(&head->lock);
+	}
+
+	return ports;
+}
+
 /*
  * This will accept the next outstanding connection.
  */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index fdabc70283b6..7cd781026224 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -61,10 +61,11 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
 	sockets = proto_sockets_allocated_sum_positive(&tcp_prot);
 
 	socket_seq_show(seq);
-	seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %ld\n",
+	seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %ld ports %u\n",
 		   sock_prot_inuse_get(net, &tcp_prot), orphans,
 		   atomic_read(&net->ipv4.tcp_death_row.tw_count), sockets,
-		   proto_memory_allocated(&tcp_prot));
+		   proto_memory_allocated(&tcp_prot),
+		   inet_csk_count_ports(net, &tcp_prot));
 	seq_printf(seq, "UDP: inuse %d mem %ld\n",
 		   sock_prot_inuse_get(net, &udp_prot),
 		   proto_memory_allocated(&udp_prot));
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] inet: add bound ports statistic
  2018-03-01  2:01 [PATCH] inet: add bound ports statistic Stephen Hemminger
@ 2018-03-01  2:28 ` Eric Dumazet
  2018-03-01  3:32   ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2018-03-01  2:28 UTC (permalink / raw)
  To: Stephen Hemminger, davem; +Cc: netdev, Stephen Hemminger

On Wed, 2018-02-28 at 18:01 -0800, Stephen Hemminger wrote:
> This adds a number of bound ports in a network namespace which is
> a useful for socket summary (ss) command. It adds one additional
> field onto /proc/net/sockstat.
> 
> Since collecting these kind of counters can be sensitive for large
> machines, the impact is placed on the reading side which will be
> much rarer.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  include/net/inet_connection_sock.h |  2 +-
>  net/ipv4/inet_connection_sock.c    | 22 ++++++++++++++++++++++
>  net/ipv4/proc.c                    |  5 +++--
>  3 files changed, 26 insertions(+), 3 deletions(-)
> 
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index c1a93ce35e62..3044deec73ce 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -265,7 +265,7 @@ inet_csk_rto_backoff(const struct inet_connection_sock *icsk,
>  }
>  
>  struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern);
> -
> +unsigned int inet_csk_count_ports(struct net *net, struct proto *prot);
>  int inet_csk_get_port(struct sock *sk, unsigned short snum);
>  
>  struct dst_entry *inet_csk_route_req(const struct sock *sk, struct flowi4 *fl4,
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 881ac6d046f2..2418abce4d50 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -424,6 +424,28 @@ static int inet_csk_wait_for_connect(struct sock *sk, long timeo)
>  	return err;
>  }
>  
> +/* Count how many any entries are in the bind hash table */

How useful it is to report this information ?

Given REUSEADDR and REUSEPORT, I really wonder what can be derived from
this counter.

It seems its semantic is weak.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] inet: add bound ports statistic
  2018-03-01  2:28 ` Eric Dumazet
@ 2018-03-01  3:32   ` David Miller
  2018-03-01  4:28     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2018-03-01  3:32 UTC (permalink / raw)
  To: eric.dumazet; +Cc: stephen, netdev, sthemmin

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Feb 2018 18:28:02 -0800

> How useful it is to report this information ?
> 
> Given REUSEADDR and REUSEPORT, I really wonder what can be derived from
> this counter.
> 
> It seems its semantic is weak.

To me none of this really matters.

What matters is that iproute2 reported this via slabinfo for longer
than a decade.

It broke recently when SLAB started merging caches just like SLUB
always did.

So alternative to Stephen's patch is turning off SLAB cache merging
for certain networking caches, which is definitely a non-starter.

I have directed Stephen to handle the situation in this way.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] inet: add bound ports statistic
  2018-03-01  3:32   ` David Miller
@ 2018-03-01  4:28     ` Eric Dumazet
  2018-03-01 16:16       ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2018-03-01  4:28 UTC (permalink / raw)
  To: David Miller; +Cc: stephen, netdev, sthemmin

On Wed, 2018-02-28 at 22:32 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 28 Feb 2018 18:28:02 -0800
> 
> > How useful it is to report this information ?
> > 
> > Given REUSEADDR and REUSEPORT, I really wonder what can be derived from
> > this counter.
> > 
> > It seems its semantic is weak.
> 
> To me none of this really matters.
> 
> What matters is that iproute2 reported this via slabinfo for longer
> than a decade.
> 
> It broke recently when SLAB started merging caches just like SLUB
> always did.


Linus himself removed some info that was much more useful in
commit a5ad88ce8c7fae7d ("mm: get rid of 'vmalloc_info' from
/proc/meminfo")

# egrep "VmallocUsed|VmallocChunk" /proc/meminfo
VmallocUsed:           0 kB
VmallocChunk:          0 kB

So I vote for not re-adding another loop in the kernel with no
preemption point.

Simply taking spinlocks like Stephen did is going to slow down the
other threads, lets face it.

This implementation has a high cost, and provides something that made
no sense in the first place.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] inet: add bound ports statistic
  2018-03-01  4:28     ` Eric Dumazet
@ 2018-03-01 16:16       ` Stephen Hemminger
  2018-03-01 17:09         ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2018-03-01 16:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, sthemmin

On Wed, 28 Feb 2018 20:28:15 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Wed, 2018-02-28 at 22:32 -0500, David Miller wrote:
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Wed, 28 Feb 2018 18:28:02 -0800
> >   
> > > How useful it is to report this information ?
> > > 
> > > Given REUSEADDR and REUSEPORT, I really wonder what can be derived from
> > > this counter.
> > > 
> > > It seems its semantic is weak.  
> > 
> > To me none of this really matters.
> > 
> > What matters is that iproute2 reported this via slabinfo for longer
> > than a decade.
> > 
> > It broke recently when SLAB started merging caches just like SLUB
> > always did.  
> 
> 
> Linus himself removed some info that was much more useful in
> commit a5ad88ce8c7fae7d ("mm: get rid of 'vmalloc_info' from
> /proc/meminfo")
> 
> # egrep "VmallocUsed|VmallocChunk" /proc/meminfo
> VmallocUsed:           0 kB
> VmallocChunk:          0 kB
> 
> So I vote for not re-adding another loop in the kernel with no
> preemption point.
> 
> Simply taking spinlocks like Stephen did is going to slow down the
> other threads, lets face it.
> 
> This implementation has a high cost, and provides something that made
> no sense in the first place.
> 

I went through a several possible alternatives.
   1. Add a counter in the hash bucket head (like listen already has).
      But not namespace aware
   2. Add a percpu counter in network namespace (new struct tcp_netns)
      Logical and adds place to move tcp open sockets as well.
      But more expensive and several places in code don't have easy
      access to namespace.
   3. Counting entries in userspace; defeats the purpose of -s flag.

Agree it is not an urgent statistic, it is just it got broken; willing to
just drop it.

What about adding cond_resched between buckets like other places do?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] inet: add bound ports statistic
  2018-03-01 16:16       ` Stephen Hemminger
@ 2018-03-01 17:09         ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2018-03-01 17:09 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev, sthemmin

On Thu, 2018-03-01 at 08:16 -0800, Stephen Hemminger wrote:
> 
> I went through a several possible alternatives.
>    1. Add a counter in the hash bucket head (like listen already has).
>       But not namespace aware
>    2. Add a percpu counter in network namespace (new struct tcp_netns)
>       Logical and adds place to move tcp open sockets as well.
>       But more expensive and several places in code don't have easy
>       access to namespace.

__this_cpu_inc() should not be very expensive.

We always have access to socket namespace when inserting/deleting a
bind entry.

Since we take the spinlock at that times, inserting
__this_cpu_inc(sock_net(sk)->ipv4.bind_counter) would add few
instructions.

>    3. Counting entries in userspace; defeats the purpose of -s flag.
> 
> Agree it is not an urgent statistic, it is just it got broken; willing to
> just drop it.

Not counting the fact that it has really no purpose.

> 
> What about adding cond_resched between buckets like other places do?

This still is dirtying at least 1MB of memory and performs ~64K atomic
operations.
Waste of cpu caches and memory bus.

If we consider converting the whole /proc/net/socketstat to provide a
netns view only, we might use a percpu structure holding all TCP socket
counters, to make sure no more than one cache line per cpu is used
while folding all the cpus counters.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-03-01 17:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-03-01  2:01 [PATCH] inet: add bound ports statistic Stephen Hemminger
2018-03-01  2:28 ` Eric Dumazet
2018-03-01  3:32   ` David Miller
2018-03-01  4:28     ` Eric Dumazet
2018-03-01 16:16       ` Stephen Hemminger
2018-03-01 17:09         ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).