netdev.vger.kernel.org archive mirror
* [PATCH net-next 1/1] inet_diag: fetch cong algo info when socket is destroyed
@ 2018-04-26 17:58 Jamal Hadi Salim
  2018-04-30  0:31 ` David Miller
  0 siblings, 1 reply; 3+ messages in thread
From: Jamal Hadi Salim @ 2018-04-26 17:58 UTC (permalink / raw)
  To: davem
  Cc: kraig, netdev, eric.dumazet, kernel, Jamal Hadi Salim,
	Jamal Hadi Salim

From: Jamal Hadi Salim <hadi@mojatatu.com>

When a user dumps the state of an existing established TCP socket
via inet_diag, it is possible to retrieve the congestion control
details.
When the sock is destroyed, the generated event has all the
details available in the dump sans congestion control info.
This patch fixes it.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 net/core/sock_diag.c |  3 +++
 net/ipv4/inet_diag.c | 48 ++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c
index c37b5be7c5e4..0bf64dd70aee 100644
--- a/net/core/sock_diag.c
+++ b/net/core/sock_diag.c
@@ -7,6 +7,7 @@
 #include <net/net_namespace.h>
 #include <linux/module.h>
 #include <net/sock.h>
+#include <net/tcp.h>
 #include <linux/kernel.h>
 #include <linux/tcp.h>
 #include <linux/workqueue.h>
@@ -112,6 +113,8 @@ static size_t sock_diag_nlmsg_size(void)
 {
 	return NLMSG_ALIGN(sizeof(struct inet_diag_msg)
 	       + nla_total_size(sizeof(u8)) /* INET_DIAG_PROTOCOL */
+	       + nla_total_size(TCP_CA_NAME_MAX) /* INET_DIAG_CONG */
+	       + nla_total_size(sizeof(union tcp_cc_info))
 	       + nla_total_size_64bit(sizeof(struct tcp_info))); /* INET_DIAG_INFO */
 }
 
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 4e5bc4b2f14e..9722f31cc9c5 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -159,6 +159,35 @@ int inet_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb,
 }
 EXPORT_SYMBOL_GPL(inet_diag_msg_attrs_fill);
 
+static int inet_csk_cong_fill(struct sock *sk, struct sk_buff *skb, int ext)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	const struct tcp_congestion_ops *ca_ops;
+	union tcp_cc_info info;
+	int attr, err = 0;
+	size_t sz = 0;
+
+	rcu_read_lock();
+	ca_ops = READ_ONCE(icsk->icsk_ca_ops);
+	if (ca_ops) {
+		if (ca_ops->get_info)
+			sz = ca_ops->get_info(sk, ext, &attr, &info);
+		if (ext & (1 << (INET_DIAG_CONG - 1))) {
+			err = nla_put_string(skb, INET_DIAG_CONG, ca_ops->name);
+			if (err < 0) {
+				rcu_read_unlock();
+				return err;
+			}
+		}
+	}
+	rcu_read_unlock();
+
+	if (sz)
+		err = nla_put(skb, attr, sz, &info);
+
+	return err;
+}
+
 int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk,
 		      struct sk_buff *skb, const struct inet_diag_req_v2 *req,
 		      struct user_namespace *user_ns,
@@ -274,16 +303,7 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk,
 			goto errout;
 
 	if (sk->sk_state < TCP_TIME_WAIT) {
-		union tcp_cc_info info;
-		size_t sz = 0;
-		int attr;
-
-		rcu_read_lock();
-		ca_ops = READ_ONCE(icsk->icsk_ca_ops);
-		if (ca_ops && ca_ops->get_info)
-			sz = ca_ops->get_info(sk, ext, &attr, &info);
-		rcu_read_unlock();
-		if (sz && nla_put(skb, attr, sz, &info) < 0)
+		if (inet_csk_cong_fill(sk, skb, ext))
 			goto errout;
 	}
 
@@ -1215,6 +1235,14 @@ int inet_diag_handler_get_info(struct sk_buff *skb, struct sock *sk)
 	if (attr)
 		info = nla_data(attr);
 
+#define EXT_MASK (1 << (INET_DIAG_VEGASINFO - 1) | 1 << (INET_DIAG_CONG - 1))
+	err = inet_csk_cong_fill(sk, skb, EXT_MASK);
+	if (err) {
+		inet_diag_unlock_handler(handler);
+		nlmsg_cancel(skb, nlh);
+		return err;
+	}
+
 	handler->idiag_get_info(sk, r, info);
 	inet_diag_unlock_handler(handler);
 
-- 
2.11.0


* Re: [PATCH net-next 1/1] inet_diag: fetch cong algo info when socket is destroyed
  2018-04-26 17:58 [PATCH net-next 1/1] inet_diag: fetch cong algo info when socket is destroyed Jamal Hadi Salim
@ 2018-04-30  0:31 ` David Miller
  2018-04-30 17:33   ` Jamal Hadi Salim
  0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2018-04-30  0:31 UTC (permalink / raw)
  To: jhs; +Cc: kraig, netdev, eric.dumazet, kernel, hadi

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Thu, 26 Apr 2018 13:58:05 -0400

> From: Jamal Hadi Salim <hadi@mojatatu.com>
> 
> When a user dumps the state of an existing established TCP socket
> via inet_diag, it is possible to retrieve the congestion control
> details.
> When the sock is destroyed, the generated event has all the
> details available in the dump sans congestion control info.
> This patch fixes it.
> 
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

Well, two things:

1) The congestion control info is opt-in, meaning that the user gets
   it in the dump if they ask for it.

   This information is opt-in, because otherwise the dumps get really
   large.

   Therefore, emitting this stuff by default on destroys is a
   non-starter.

2) The TCP_TIME_WAIT test is not there for looks.  You need to add it
   also to the destroy case, and guess what?  All the sockets you will
   see will not pass that test.
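
For illustration, a minimal sketch of the opt-in mechanism from point 1,
assuming the extension request is a bitmask in which attribute N is
selected by bit (N - 1), as in the patch's `1 << (INET_DIAG_CONG - 1)`
test. The enum and helper names below are hypothetical local stand-ins;
the values 3 and 4 are assumed to mirror INET_DIAG_VEGASINFO and
INET_DIAG_CONG in <linux/inet_diag.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local mirrors of the uapi attribute numbers
 * (assumed: INET_DIAG_VEGASINFO == 3, INET_DIAG_CONG == 4). */
enum { DIAG_VEGASINFO = 3, DIAG_CONG = 4 };

/* Bit that selects attribute `attr` in an 8-bit extension mask. */
static inline uint8_t diag_ext_bit(int attr)
{
	assert(attr >= 1 && attr <= 8);	/* only 8 bits are available */
	return (uint8_t)(1u << (attr - 1));
}

/* Non-zero if the requester opted in to attribute `attr`. */
static inline int diag_ext_wants(uint8_t ext, int attr)
{
	return (ext & diag_ext_bit(attr)) != 0;
}
```

A requester that wants both attributes would set
`diag_ext_bit(DIAG_VEGASINFO) | diag_ext_bit(DIAG_CONG)` in its request;
with a mask of 0 neither attribute is emitted, which is the default
behaviour described above.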

I'm not applying this, sorry.  I really think things are good as-is, and
if you really truly want the congestion control information you can
ask for it while the socket is still alive, and is in the proper state
to sample the congestion control state before you kill it off.

Thanks.


* Re: [PATCH net-next 1/1] inet_diag: fetch cong algo info when socket is destroyed
  2018-04-30  0:31 ` David Miller
@ 2018-04-30 17:33   ` Jamal Hadi Salim
  0 siblings, 0 replies; 3+ messages in thread
From: Jamal Hadi Salim @ 2018-04-30 17:33 UTC (permalink / raw)
  To: David Miller; +Cc: kraig, netdev, eric.dumazet, kernel, hadi

On 29/04/18 08:31 PM, David Miller wrote:

> Well, two things:
> 
> 1) The congestion control info is opt-in, meaning that the user gets
>     it in the dump if they ask for it.
> 
>     This information is opt-in, because otherwise the dumps get really
>     large.
> 
>     Therefore, emitting this stuff by default on destroys is a
>     non-starter.
> 

There are two options that I investigated:
Add a setsockopt() for a new group that indicates "give me the congestion
info in addition" or add a similar knob at bind() time. Either of those
approaches would require bigger surgery. If you think either of those
is reasonable I will work in that direction.

Note: Vegas adds 4 32-bit words; BBR 5 32-bit words; the congestion
name another 16B worst case.
In the larger scope of things that is very small extra data and saves
all the complexity of the other approaches.
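
As a rough sketch of that arithmetic, assuming the usual netlink
attribute layout (a 4-byte attribute header, total length rounded up to
4 bytes) and using the word counts quoted above. All names here are
hypothetical stand-ins, not the kernel's own macros or structs.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_HDR_LEN	4u	/* assumed netlink attribute header size */
#define ATTR_ALIGN_TO	4u	/* assumed netlink attribute alignment */
#define CA_NAME_MAX	16u	/* worst-case congestion algo name, per the note above */

/* Stand-ins sized from the counts above: 4 and 5 32-bit words. */
struct vegas_words { uint32_t w[4]; };
struct bbr_words   { uint32_t w[5]; };

/* Header plus payload, rounded up to the attribute alignment. */
static size_t attr_total_size(size_t payload)
{
	size_t len = ATTR_HDR_LEN + payload;

	return (len + ATTR_ALIGN_TO - 1) & ~(size_t)(ATTR_ALIGN_TO - 1);
}

/* Extra bytes per destroy event: the name attribute plus one info attribute. */
static size_t cong_extra_bytes(size_t cc_payload)
{
	return attr_total_size(CA_NAME_MAX) + attr_total_size(cc_payload);
}
```

Under these assumptions the extra data per event comes to 40 bytes for
Vegas and 44 bytes for BBR, attribute headers included.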

> 2) The TCP_TIME_WAIT test is not there for looks.  You need to add it
>     also to the destroy case, and guess what?  All the sockets you will
>     see will not pass that test.
> 

The TCP_TIME_WAIT test makes sense for a live socket. This sock is
past that stage.

> I'm not applying this, sorry.  I really think things are good as-is, and
> if you really truly want the congestion control information you can
> ask for it while the socket is still alive, and is in the proper state
> to sample the congestion control state before you kill it off.

I am avoiding polling for scaling reasons. Polling worked fine for a
small number of sockets.

cheers,
jamal

