* [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
@ 2015-05-01 17:43 Eric B Munson
[not found] ` <1430502237-5619-1-git-send-email-emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Eric B Munson @ 2015-05-01 17:43 UTC (permalink / raw)
To: David S. Miller
Cc: Eric B Munson, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In order to enable policy decisions in userspace, the data contained in
the SYN packet would be useful for tracking or identifying connections.
Only parts of this data are available to userspace after the hand shake
is completed. This patch exposes a new setsockopt() option that will,
when used with a listening socket, ask the kernel to cache the skb
holding the SYN packet for retrieval later. The SYN skbs will not be
saved while the kernel is in syn cookie mode.
The same option will ask the kernel for the packet headers when used
with getsockopt() with the socket returned from accept(). The cached
packet will only be available for the first getsockopt() call, the skb
is consumed after the requested data is copied to userspace. Subsequent
calls will return -ENOENT. Because of this behavior, getsockopt() will
return -E2BIG if the caller supplied a buffer that is too small to hold
the skb header.
Signed-off-by: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
Cc: Alexey Kuznetsov <kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org>
Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
Cc: Hideaki YOSHIFUJI <yoshfuji-VfPWfsRibaP+Ru+s062T9g@public.gmane.org>
Cc: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
include/linux/tcp.h | 4 +++-
include/net/inet_sock.h | 1 +
include/uapi/linux/tcp.h | 1 +
net/ipv4/inet_connection_sock.c | 33 +++++++++++++++++++--------------
net/ipv4/tcp.c | 41 +++++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_input.c | 4 ++++
net/ipv4/tcp_ipv4.c | 1 +
net/ipv4/tcp_minisocks.c | 1 +
net/ipv6/tcp_ipv6.c | 1 +
9 files changed, 72 insertions(+), 15 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 0caa3a2..2c39d07 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -191,7 +191,8 @@ struct tcp_sock {
syn_fastopen:1, /* SYN includes Fast Open option */
syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
- is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
+ is_cwnd_limited:1,/* forward progress limited by snd_cwnd? */
+ saved_syn:1;/* keep a copy of the syn packet */
u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. */
/* RTT measurement */
@@ -318,6 +319,7 @@ struct tcp_sock {
* socket. Used to retransmit SYNACKs etc.
*/
struct request_sock *fastopen_rsk;
+ struct sk_buff *syn_skb;
};
enum tsq_flags {
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index b6c3737..cc0c18b 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -98,6 +98,7 @@ struct inet_request_sock {
struct ip_options_rcu *opt;
struct sk_buff *pktopts;
};
+ struct sk_buff *syn_skb;
};
static inline struct inet_request_sock *inet_rsk(const struct request_sock *sk)
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 3b97183..5d32550 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -112,6 +112,7 @@ enum {
#define TCP_FASTOPEN 23 /* Enable FastOpen on listeners */
#define TCP_TIMESTAMP 24
#define TCP_NOTSENT_LOWAT 25 /* limit number of unsent bytes in write queue */
+#define TCP_SAVED_SYN 26 /* cache SYN packets for retrieval by userspace */
struct tcp_repair_opt {
__u32 opt_code;
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 8976ca4..2abcd50 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -325,21 +325,26 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err)
newsk = req->sk;
sk_acceptq_removed(sk);
- if (sk->sk_protocol == IPPROTO_TCP &&
- tcp_rsk(req)->tfo_listener &&
- queue->fastopenq) {
- spin_lock_bh(&queue->fastopenq->lock);
- if (tcp_rsk(req)->tfo_listener) {
- /* We are still waiting for the final ACK from 3WHS
- * so can't free req now. Instead, we set req->sk to
- * NULL to signify that the child socket is taken
- * so reqsk_fastopen_remove() will free the req
- * when 3WHS finishes (or is aborted).
- */
- req->sk = NULL;
- req = NULL;
+ if (sk->sk_protocol == IPPROTO_TCP) {
+ tcp_sk(newsk)->saved_syn = tcp_sk(sk)->saved_syn;
+ if (inet_rsk(req)->syn_skb)
+ tcp_sk(newsk)->syn_skb = skb_get(inet_rsk(req)->syn_skb);
+
+ if (tcp_rsk(req)->tfo_listener && queue->fastopenq) {
+ spin_lock_bh(&queue->fastopenq->lock);
+ if (tcp_rsk(req)->tfo_listener) {
+ /* We are still waiting for the final ACK from
+ * 3WHS so can't free req now. Instead, we set
+ * req->sk to NULL to signify that the child
+ * socket is taken so reqsk_fastopen_remove()
+ * will free the req when 3WHS finishes (or is
+ * aborted).
+ */
+ req->sk = NULL;
+ req = NULL;
+ }
+ spin_unlock_bh(&queue->fastopenq->lock);
}
- spin_unlock_bh(&queue->fastopenq->lock);
}
out:
release_sock(sk);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8c5cd9e..dcfc0b7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2227,6 +2227,8 @@ EXPORT_SYMBOL(tcp_disconnect);
void tcp_sock_destruct(struct sock *sk)
{
+ consume_skb(tcp_sk(sk)->syn_skb);
+
inet_sock_destruct(sk);
kfree(inet_csk(sk)->icsk_accept_queue.fastopenq);
@@ -2558,6 +2560,11 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
tp->notsent_lowat = val;
sk->sk_write_space(sk);
break;
+ case TCP_SAVED_SYN:
+ if (!((1 << sk->sk_state) & TCPF_LISTEN))
+ err = -EINVAL;
+ tp->saved_syn = !!(val);
+ break;
default:
err = -ENOPROTOOPT;
break;
@@ -2738,6 +2745,40 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
val = !icsk->icsk_ack.pingpong;
break;
+ case TCP_SAVED_SYN: {
+ struct sk_buff *syn = xchg(&tp->syn_skb, NULL);
+ int bufsz;
+ int ret = -EFAULT;
+
+ if (get_user(len, optlen))
+ goto reset;
+
+ ret = -EINVAL;
+ if ((1 << sk->sk_state) & TCPF_LISTEN)
+ goto reset;
+ if (!tp->saved_syn)
+ goto reset;
+ ret = -ENOENT;
+ if (!syn)
+ goto reset;
+ bufsz = (unsigned long)skb_tail_pointer(syn) - (unsigned long)eth_hdr(syn);
+ ret = -E2BIG;
+ if (len < bufsz)
+ goto reset;
+
+ ret = -EFAULT;
+ if (put_user(bufsz, optlen))
+ goto reset;
+ if (copy_to_user(optval, eth_hdr(syn), bufsz))
+ goto reset;
+ consume_skb(syn);
+
+ return 0;
+reset:
+ tp->syn_skb = syn;
+ return ret;
+ }
+
case TCP_CONGESTION:
if (get_user(len, optlen))
return -EFAULT;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3a4d9b34..b5a61d2 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6005,6 +6005,7 @@ struct request_sock *inet_reqsk_alloc(const struct request_sock_ops *ops,
kmemcheck_annotate_bitfield(ireq, flags);
ireq->opt = NULL;
+ ireq->syn_skb = NULL;
atomic64_set(&ireq->ir_cookie, 0);
ireq->ireq_state = TCP_NEW_SYN_RECV;
write_pnet(&ireq->ireq_net, sock_net(sk_listener));
@@ -6163,6 +6164,9 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
inet_rsk(req)->ecn_ok = 0;
}
+ if (!want_cookie && tp->saved_syn)
+ inet_rsk(req)->syn_skb = skb_get(skb);
+
tcp_rsk(req)->snt_isn = isn;
tcp_openreq_init_rwin(req, sk, dst);
fastopen = !want_cookie &&
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fc1c658..c63661d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -853,6 +853,7 @@ static int tcp_v4_send_synack(struct sock *sk, struct dst_entry *dst,
*/
static void tcp_v4_reqsk_destructor(struct request_sock *req)
{
+ consume_skb(inet_rsk(req)->syn_skb);
kfree(inet_rsk(req)->opt);
}
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e5d7649..b3ffa73 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -535,6 +535,7 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
tcp_ecn_openreq_child(newtp, req);
newtp->fastopen_rsk = NULL;
newtp->syn_data_acked = 0;
+ newtp->syn_skb = NULL;
TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_PASSIVEOPENS);
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index b6575d6..400ea2e 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -475,6 +475,7 @@ done:
static void tcp_v6_reqsk_destructor(struct request_sock *req)
{
+ consume_skb(inet_rsk(req)->syn_skb);
kfree_skb(inet_rsk(req)->pktopts);
}
--
1.9.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <1430502237-5619-1-git-send-email-emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
@ 2015-05-01 18:42 ` Eric Dumazet
[not found] ` <1430505777.3711.135.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-01 19:27 ` [PATCH] Allow TCP connections to cache SYN packet for userspace inspection Andy Lutomirski
1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2015-05-01 18:42 UTC (permalink / raw)
To: Eric B Munson
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote:
> In order to enable policy decisions in userspace, the data contained in
> the SYN packet would be useful for tracking or identifying connections.
> Only parts of this data are available to userspace after the hand shake
> is completed. This patch exposes a new setsockopt() option that will,
> when used with a listening socket, ask the kernel to cache the skb
> holding the SYN packet for retrieval later. The SYN skbs will not be
> saved while the kernel is in syn cookie mode.
>
> The same option will ask the kernel for the packet headers when used
> with getsockopt() with the socket returned from accept(). The cached
> packet will only be available for the first getsockopt() call, the skb
> is consumed after the requested data is copied to userspace. Subsequent
> calls will return -ENOENT. Because of this behavior, getsockopt() will
> return -E2BIG if the caller supplied a buffer that is too small to hold
> the skb header.
>
> Signed-off-by: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
> Cc: Alexey Kuznetsov <kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org>
> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji-VfPWfsRibaP+Ru+s062T9g@public.gmane.org>
> Cc: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
We have a similar patch here at Google, but we do not hold one skb and
dst per saved syn. That can be ~4KB for some drivers.
Only a kmalloc() with the needed part (headers), usually less than 128
bytes. We store the length in first byte of this allocation.
This has a huge difference if you want to have ~4 million request socks.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <1430502237-5619-1-git-send-email-emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-05-01 18:42 ` Eric Dumazet
@ 2015-05-01 19:27 ` Andy Lutomirski
[not found] ` <CALCETrWi6h3DRu6Z8jJ_-MiWqRRyKZHntpJFNON=GpAjMDYXmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 23+ messages in thread
From: Andy Lutomirski @ 2015-05-01 19:27 UTC (permalink / raw)
To: Eric B Munson
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Network Development,
Linux API, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, May 1, 2015 at 10:43 AM, Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org> wrote:
> In order to enable policy decisions in userspace, the data contained in
> the SYN packet would be useful for tracking or identifying connections.
> Only parts of this data are available to userspace after the hand shake
> is completed. This patch exposes a new setsockopt() option that will,
> when used with a listening socket, ask the kernel to cache the skb
> holding the SYN packet for retrieval later. The SYN skbs will not be
> saved while the kernel is in syn cookie mode.
>
> The same option will ask the kernel for the packet headers when used
> with getsockopt() with the socket returned from accept(). The cached
> packet will only be available for the first getsockopt() call, the skb
> is consumed after the requested data is copied to userspace. Subsequent
> calls will return -ENOENT. Because of this behavior, getsockopt() will
> return -E2BIG if the caller supplied a buffer that is too small to hold
> the skb header.
What's the purpose and what headers are you returning?
There was a bit of a mixup with tx timestamps where the set of headers
returned was possibly excessive and incompletely thought out the first
time around.
--Andy
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <1430505777.3711.135.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-01 19:55 ` Tom Herbert
[not found] ` <CALx6S34ftz_wDoPwcJg_cMQu4QtnBJF-=d+gF5ieTA=d=r31-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Tom Herbert @ 2015-05-01 19:55 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric B Munson, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
Linux Kernel Network Developers, linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Fri, May 1, 2015 at 11:42 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote:
>> In order to enable policy decisions in userspace, the data contained in
>> the SYN packet would be useful for tracking or identifying connections.
>> Only parts of this data are available to userspace after the hand shake
>> is completed. This patch exposes a new setsockopt() option that will,
>> when used with a listening socket, ask the kernel to cache the skb
>> holding the SYN packet for retrieval later. The SYN skbs will not be
>> saved while the kernel is in syn cookie mode.
>>
>> The same option will ask the kernel for the packet headers when used
>> with getsockopt() with the socket returned from accept(). The cached
>> packet will only be available for the first getsockopt() call, the skb
>> is consumed after the requested data is copied to userspace. Subsequent
>> calls will return -ENOENT. Because of this behavior, getsockopt() will
>> return -E2BIG if the caller supplied a buffer that is too small to hold
>> the skb header.
>>
>> Signed-off-by: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
>> Cc: Alexey Kuznetsov <kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org>
>> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
>> Cc: Hideaki YOSHIFUJI <yoshfuji-VfPWfsRibaP+Ru+s062T9g@public.gmane.org>
>> Cc: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
>> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> ---
>
> We have a similar patch here at Google, but we do not hold one skb and
> dst per saved syn. That can be ~4KB for some drivers.
>
> Only a kmalloc() with the needed part (headers), usually less than 128
> bytes. We store the length in first byte of this allocation.
>
> This has a huge difference if you want to have ~4 million request socks.
>
+1 on kmalloc solution. I posted a similar patch a couple of years ago
https://patchwork.ozlabs.org/patch/146034/. There was pushback on
memory usage and this having to narrow of a use case.
Tom
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <CALCETrWi6h3DRu6Z8jJ_-MiWqRRyKZHntpJFNON=GpAjMDYXmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-05-01 20:01 ` Eric B Munson
2015-05-01 20:28 ` Andy Lutomirski
0 siblings, 1 reply; 23+ messages in thread
From: Eric B Munson @ 2015-05-01 20:01 UTC (permalink / raw)
To: Andy Lutomirski
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Network Development,
Linux API, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 1892 bytes --]
On Fri, 01 May 2015, Andy Lutomirski wrote:
> On Fri, May 1, 2015 at 10:43 AM, Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org> wrote:
> > In order to enable policy decisions in userspace, the data contained in
> > the SYN packet would be useful for tracking or identifying connections.
> > Only parts of this data are available to userspace after the hand shake
> > is completed. This patch exposes a new setsockopt() option that will,
> > when used with a listening socket, ask the kernel to cache the skb
> > holding the SYN packet for retrieval later. The SYN skbs will not be
> > saved while the kernel is in syn cookie mode.
> >
> > The same option will ask the kernel for the packet headers when used
> > with getsockopt() with the socket returned from accept(). The cached
> > packet will only be available for the first getsockopt() call, the skb
> > is consumed after the requested data is copied to userspace. Subsequent
> > calls will return -ENOENT. Because of this behavior, getsockopt() will
> > return -E2BIG if the caller supplied a buffer that is too small to hold
> > the skb header.
>
> What's the purpose and what headers are you returning?
Currently the ethernet, IP, and TCP headers are being returned. The IP
and TCP headers will be used by userspace to make decisions on how to
handle incoming connections. The ethernet headers are being returned
for completeness, I would be fine not including them in what is copied
if that is a concern, however the team requesting this change here
requires the IP and TCP headers.
>
> There was a bit of a mixup with tx timestamps where the set of headers
> returned was possibly excessive and incompletely thought out the first
> time around.
With this in mind, we could drop copying the ethernet headers and simply
hold onto the IP and TCP headers.
>
> --Andy
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <CALx6S34ftz_wDoPwcJg_cMQu4QtnBJF-=d+gF5ieTA=d=r31-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-05-01 20:14 ` Eric B Munson
[not found] ` <20150501201417.GB6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Eric B Munson @ 2015-05-01 20:14 UTC (permalink / raw)
To: Tom Herbert
Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
Linux Kernel Network Developers, linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
[-- Attachment #1: Type: text/plain, Size: 2846 bytes --]
On Fri, 01 May 2015, Tom Herbert wrote:
> On Fri, May 1, 2015 at 11:42 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote:
> >> In order to enable policy decisions in userspace, the data contained in
> >> the SYN packet would be useful for tracking or identifying connections.
> >> Only parts of this data are available to userspace after the hand shake
> >> is completed. This patch exposes a new setsockopt() option that will,
> >> when used with a listening socket, ask the kernel to cache the skb
> >> holding the SYN packet for retrieval later. The SYN skbs will not be
> >> saved while the kernel is in syn cookie mode.
> >>
> >> The same option will ask the kernel for the packet headers when used
> >> with getsockopt() with the socket returned from accept(). The cached
> >> packet will only be available for the first getsockopt() call, the skb
> >> is consumed after the requested data is copied to userspace. Subsequent
> >> calls will return -ENOENT. Because of this behavior, getsockopt() will
> >> return -E2BIG if the caller supplied a buffer that is too small to hold
> >> the skb header.
> >>
> >> Signed-off-by: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
> >> Cc: Alexey Kuznetsov <kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org>
> >> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
> >> Cc: Hideaki YOSHIFUJI <yoshfuji-VfPWfsRibaP+Ru+s062T9g@public.gmane.org>
> >> Cc: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
> >> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >> Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >> ---
> >
> > We have a similar patch here at Google, but we do not hold one skb and
> > dst per saved syn. That can be ~4KB for some drivers.
> >
> > Only a kmalloc() with the needed part (headers), usually less than 128
> > bytes. We store the length in first byte of this allocation.
> >
> > This has a huge difference if you want to have ~4 million request socks.
> >
> +1 on kmalloc solution. I posted a similar patch a couple of years ago
> https://patchwork.ozlabs.org/patch/146034/. There was pushback on
> memory usage and this having to narrow of a use case.
>
> Tom
>
I cached the skb largely to take advantage of the built in reference
counting and avoid having to manage allocating memory and ownership of
said memory. For V2, how about I keep the skb reference in the request
structure and kmalloc() a buffer, to be owned by the tcp sock structure,
when the new tcp socket is created? This would also simplify the
getsockopt() so that the data was available to all callers until the
socket is closed.
Eric
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <20150501201417.GB6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
@ 2015-05-01 20:23 ` Eric Dumazet
[not found] ` <1430511800.3711.138.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2015-05-01 20:23 UTC (permalink / raw)
To: Eric B Munson
Cc: Tom Herbert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
Linux Kernel Network Developers, linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Fri, 2015-05-01 at 16:14 -0400, Eric B Munson wrote:
> On Fri, 01 May 2015, Tom Herbert wrote:
>
> > On Fri, May 1, 2015 at 11:42 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > > On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote:
> > >> In order to enable policy decisions in userspace, the data contained in
> > >> the SYN packet would be useful for tracking or identifying connections.
> > >> Only parts of this data are available to userspace after the hand shake
> > >> is completed. This patch exposes a new setsockopt() option that will,
> > >> when used with a listening socket, ask the kernel to cache the skb
> > >> holding the SYN packet for retrieval later. The SYN skbs will not be
> > >> saved while the kernel is in syn cookie mode.
> > >>
> > >> The same option will ask the kernel for the packet headers when used
> > >> with getsockopt() with the socket returned from accept(). The cached
> > >> packet will only be available for the first getsockopt() call, the skb
> > >> is consumed after the requested data is copied to userspace. Subsequent
> > >> calls will return -ENOENT. Because of this behavior, getsockopt() will
> > >> return -E2BIG if the caller supplied a buffer that is too small to hold
> > >> the skb header.
> > >>
> > >> Signed-off-by: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
> > >> Cc: Alexey Kuznetsov <kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org>
> > >> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
> > >> Cc: Hideaki YOSHIFUJI <yoshfuji-VfPWfsRibaP+Ru+s062T9g@public.gmane.org>
> > >> Cc: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
> > >> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > >> Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > >> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > >> ---
> > >
> > > We have a similar patch here at Google, but we do not hold one skb and
> > > dst per saved syn. That can be ~4KB for some drivers.
> > >
> > > Only a kmalloc() with the needed part (headers), usually less than 128
> > > bytes. We store the length in first byte of this allocation.
> > >
> > > This has a huge difference if you want to have ~4 million request socks.
> > >
> > +1 on kmalloc solution. I posted a similar patch a couple of years ago
> > https://patchwork.ozlabs.org/patch/146034/. There was pushback on
> > memory usage and this having to narrow of a use case.
> >
> > Tom
> >
>
> I cached the skb largely to take advantage of the built in reference
> counting and avoid having to manage allocating memory and ownership of
> said memory. For V2, how about I keep the skb reference in the request
> structure and kmalloc() a buffer, to be owned by the tcp sock structure,
> when the new tcp socket is created? This would also simplify the
> getsockopt() so that the data was available to all callers until the
> socket is closed.
Please do not keep a reference on skb. This has a too big cost.
Have you read that we plan to have up to 4 or 10 million request socks ?
skb also holds a dst.
We can upstream our implementation (based on Tom prior patch), we have
been using it more than 2 years with success.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
2015-05-01 20:01 ` Eric B Munson
@ 2015-05-01 20:28 ` Andy Lutomirski
0 siblings, 0 replies; 23+ messages in thread
From: Andy Lutomirski @ 2015-05-01 20:28 UTC (permalink / raw)
To: Eric B Munson
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Network Development,
Linux API, linux-kernel@vger.kernel.org
On Fri, May 1, 2015 at 1:01 PM, Eric B Munson <emunson@akamai.com> wrote:
> On Fri, 01 May 2015, Andy Lutomirski wrote:
>
>> On Fri, May 1, 2015 at 10:43 AM, Eric B Munson <emunson@akamai.com> wrote:
>> > In order to enable policy decisions in userspace, the data contained in
>> > the SYN packet would be useful for tracking or identifying connections.
>> > Only parts of this data are available to userspace after the hand shake
>> > is completed. This patch exposes a new setsockopt() option that will,
>> > when used with a listening socket, ask the kernel to cache the skb
>> > holding the SYN packet for retrieval later. The SYN skbs will not be
>> > saved while the kernel is in syn cookie mode.
>> >
>> > The same option will ask the kernel for the packet headers when used
>> > with getsockopt() with the socket returned from accept(). The cached
>> > packet will only be available for the first getsockopt() call, the skb
>> > is consumed after the requested data is copied to userspace. Subsequent
>> > calls will return -ENOENT. Because of this behavior, getsockopt() will
>> > return -E2BIG if the caller supplied a buffer that is too small to hold
>> > the skb header.
>>
>> What's the purpose and what headers are you returning?
>
> Currently the ethernet, IP, and TCP headers are being returned. The IP
> and TCP headers will be used by userspace to make decisions on how to
> handle incoming connections. The ethernet headers are being returned
> for completeness, I would be fine not including them in what is copied
> if that is a concern, however the team requesting this change here
> requires the IP and TCP headers.
>
>>
>> There was a bit of a mixup with tx timestamps where the set of headers
>> returned was possibly excessive and incompletely thought out the first
>> time around.
>
> With this in mind, we could drop copying the ethernet headers and simply
> hold onto the IP and TCP headers.
That's probably better. If nothing else, you avoid breaking userspace
when you're not using Ethernet.
--Andy
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <1430511800.3711.138.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-01 20:29 ` Eric B Munson
[not found] ` <20150501202908.GC6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Eric B Munson @ 2015-05-01 20:29 UTC (permalink / raw)
To: Eric Dumazet
Cc: Tom Herbert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
Linux Kernel Network Developers, linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
[-- Attachment #1: Type: text/plain, Size: 3793 bytes --]
On Fri, 01 May 2015, Eric Dumazet wrote:
> On Fri, 2015-05-01 at 16:14 -0400, Eric B Munson wrote:
> > On Fri, 01 May 2015, Tom Herbert wrote:
> >
> > > On Fri, May 1, 2015 at 11:42 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > > > On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote:
> > > >> In order to enable policy decisions in userspace, the data contained in
> > > >> the SYN packet would be useful for tracking or identifying connections.
> > > >> Only parts of this data are available to userspace after the hand shake
> > > >> is completed. This patch exposes a new setsockopt() option that will,
> > > >> when used with a listening socket, ask the kernel to cache the skb
> > > >> holding the SYN packet for retrieval later. The SYN skbs will not be
> > > >> saved while the kernel is in syn cookie mode.
> > > >>
> > > >> The same option will ask the kernel for the packet headers when used
> > > >> with getsockopt() with the socket returned from accept(). The cached
> > > >> packet will only be available for the first getsockopt() call, the skb
> > > >> is consumed after the requested data is copied to userspace. Subsequent
> > > >> calls will return -ENOENT. Because of this behavior, getsockopt() will
> > > >> return -E2BIG if the caller supplied a buffer that is too small to hold
> > > >> the skb header.
> > > >>
> > > >> Signed-off-by: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
> > > >> Cc: Alexey Kuznetsov <kuznet-v/Mj1YrvjDBInbfyfbPRSQ@public.gmane.org>
> > > >> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
> > > >> Cc: Hideaki YOSHIFUJI <yoshfuji-VfPWfsRibaP+Ru+s062T9g@public.gmane.org>
> > > >> Cc: Patrick McHardy <kaber-dcUjhNyLwpNeoWH0uzbU5w@public.gmane.org>
> > > >> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > >> Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > >> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > >> ---
> > > >
> > > > We have a similar patch here at Google, but we do not hold one skb and
> > > > dst per saved syn. That can be ~4KB for some drivers.
> > > >
> > > > Only a kmalloc() with the needed part (headers), usually less than 128
> > > > bytes. We store the length in first byte of this allocation.
> > > >
> > > > This has a huge difference if you want to have ~4 million request socks.
> > > >
> > > +1 on kmalloc solution. I posted a similar patch a couple of years ago
> > > https://patchwork.ozlabs.org/patch/146034/. There was pushback on
> > > memory usage and this having to narrow of a use case.
> > >
> > > Tom
> > >
> >
> > I cached the skb largely to take advantage of the built in reference
> > counting and avoid having to manage allocating memory and ownership of
> > said memory. For V2, how about I keep the skb reference in the request
> > structure and kmalloc() a buffer, to be owned by the tcp sock structure,
> > when the new tcp socket is created? This would also simplify the
> > getsockopt() so that the data was available to all callers until the
> > socket is closed.
>
> Please do not keep a reference on skb. This has a too big cost.
>
> Have you read that we plan to have up to 4 or 10 million request socks ?
>
> skb also holds a dst.
>
> We can upstream our implementation (based on Tom prior patch), we have
> been using it more than 2 years with success.
>
>
As long as your implementation provides the IP and TCP headers, I would
be happy with that. I am also happy to rework my implementation to
extract and cache information when the request structure is built. If
you all have an implementation that you want to post, I will add my ack
if it meets our needs as well.
Eric
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Allow TCP connections to cache SYN packet for userspace inspection
[not found] ` <20150501202908.GC6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
@ 2015-05-01 20:41 ` Eric Dumazet
[not found] ` <1430512894.3711.140.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2015-05-01 20:41 UTC (permalink / raw)
To: Eric B Munson
Cc: Tom Herbert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy,
Linux Kernel Network Developers, linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Fri, 2015-05-01 at 16:29 -0400, Eric B Munson wrote:
>
>
> As long as your implementation provides the IP and TCP headers, I would
> be happy with that. I am also happy to rework my implementation to
> extract and cache information when the request structure is built. If
> you all have an implementation that you want to post, I will add my ack
> if it meets our needs as well.
Yes, I believe it will be easier we provide our implementation instead
of reviewing yours ;)
For example you had :
+ case TCP_SAVED_SYN:
+ if (!((1 << sk->sk_state) & TCPF_LISTEN))
+ err = -EINVAL;
+ tp->saved_syn = !!(val);
+ break;
But if you return an error, tp->saved_syn should be left unchanged.
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430512894.3711.140.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-04 4:34 ` Eric Dumazet
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-05 20:05 ` David Miller
0 siblings, 2 replies; 23+ messages in thread
From: Eric Dumazet @ 2015-05-04 4:34 UTC (permalink / raw)
To: Eric B Munson
Cc: Tom Herbert, David S. Miller, linux-api-u79uwXL29TY76Z2rM5mHXA,
netdev
From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
This patch allows a server application to get the TCP SYN headers for
its passive connections. This is useful if the server is doing
fingerprinting of clients based on SYN packet contents.
Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
The first is used on a socket to enable saving the SYN headers
for child connections. This can be set before or after the listen()
call.
The latter is used to retrieve the SYN headers for passive connections,
if the parent listener has enabled TCP_SAVE_SYN.
TCP_SAVED_SYN is read once, it frees the saved SYN headers.
The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
headers.
Original patch was written by Tom Herbert, I changed it to not hold
a full skb (and associated dst and conntracking reference).
We have used such patch for about 3 years at Google.
Signed-off-by: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
---
include/linux/tcp.h | 8 ++++++++
include/net/request_sock.h | 4 +++-
include/uapi/linux/tcp.h | 2 ++
net/ipv4/tcp.c | 35 +++++++++++++++++++++++++++++++++++
net/ipv4/tcp_input.c | 18 ++++++++++++++++++
net/ipv4/tcp_ipv4.c | 1 +
net/ipv4/tcp_minisocks.c | 3 +++
7 files changed, 70 insertions(+), 1 deletion(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 3b2911502a8c..e6fb5df22db1 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -199,6 +199,7 @@ struct tcp_sock {
syn_fastopen:1, /* SYN includes Fast Open option */
syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
+ save_syn:1, /* Save headers of SYN packet */
is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. */
@@ -326,6 +327,7 @@ struct tcp_sock {
* socket. Used to retransmit SYNACKs etc.
*/
struct request_sock *fastopen_rsk;
+ u32 *saved_syn;
};
enum tsq_flags {
@@ -393,4 +395,10 @@ static inline int fastopen_init_queue(struct sock *sk, int backlog)
return 0;
}
+static inline void tcp_saved_syn_free(struct tcp_sock *tp)
+{
+ kfree(tp->saved_syn);
+ tp->saved_syn = NULL;
+}
+
#endif /* _LINUX_TCP_H */
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index 9f4265ce8892..87935cad2f7b 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -64,6 +64,7 @@ struct request_sock {
struct timer_list rsk_timer;
const struct request_sock_ops *rsk_ops;
struct sock *sk;
+ u32 *saved_syn;
u32 secid;
u32 peer_secid;
};
@@ -77,7 +78,7 @@ reqsk_alloc(const struct request_sock_ops *ops, struct sock *sk_listener)
req->rsk_ops = ops;
sock_hold(sk_listener);
req->rsk_listener = sk_listener;
-
+ req->saved_syn = NULL;
/* Following is temporary. It is coupled with debugging
* helpers in reqsk_put() & reqsk_free()
*/
@@ -104,6 +105,7 @@ static inline void reqsk_free(struct request_sock *req)
req->rsk_ops->destructor(req);
if (req->rsk_listener)
sock_put(req->rsk_listener);
+ kfree(req->saved_syn);
kmem_cache_free(req->rsk_ops->slab, req);
}
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index faa72f4fa547..51ebedba577f 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -113,6 +113,8 @@ enum {
#define TCP_TIMESTAMP 24
#define TCP_NOTSENT_LOWAT 25 /* limit number of unsent bytes in write queue */
#define TCP_CC_INFO 26 /* Get Congestion Control (optional) info */
+#define TCP_SAVE_SYN 27 /* Record SYN headers for new connections */
+#define TCP_SAVED_SYN 28 /* Get SYN headers recorded for connection */
struct tcp_repair_opt {
__u32 opt_code;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 46efa03d2b11..ecccfdc50d76 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2482,6 +2482,13 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
icsk->icsk_syn_retries = val;
break;
+ case TCP_SAVE_SYN:
+ if (val < 0 || val > 1)
+ err = -EINVAL;
+ else
+ tp->save_syn = val;
+ break;
+
case TCP_LINGER2:
if (val < 0)
tp->linger2 = -1;
@@ -2818,6 +2825,34 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
case TCP_NOTSENT_LOWAT:
val = tp->notsent_lowat;
break;
+ case TCP_SAVE_SYN:
+ val = tp->save_syn;
+ break;
+ case TCP_SAVED_SYN: {
+ if (get_user(len, optlen))
+ return -EFAULT;
+
+ lock_sock(sk);
+ if (tp->saved_syn) {
+ len = min_t(unsigned int, tp->saved_syn[0], len);
+ if (put_user(len, optlen)) {
+ release_sock(sk);
+ return -EFAULT;
+ }
+ if (copy_to_user(optval, tp->saved_syn + 1, len)) {
+ release_sock(sk);
+ return -EFAULT;
+ }
+ tcp_saved_syn_free(tp);
+ release_sock(sk);
+ } else {
+ release_sock(sk);
+ len = 0;
+ if (put_user(len, optlen))
+ return -EFAULT;
+ }
+ return 0;
+ }
default:
return -ENOPROTOOPT;
}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 09bdc4abfcbb..df2ca615cd0c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6060,6 +6060,23 @@ static bool tcp_syn_flood_action(struct sock *sk,
return want_cookie;
}
+static void tcp_reqsk_record_syn(const struct sock *sk,
+ struct request_sock *req,
+ const struct sk_buff *skb)
+{
+ if (tcp_sk(sk)->save_syn) {
+ u32 len = skb_network_header_len(skb) + tcp_hdrlen(skb);
+ u32 *copy;
+
+ copy = kmalloc(len + sizeof(u32), GFP_ATOMIC);
+ if (copy) {
+ copy[0] = len;
+ memcpy(©[1], skb_network_header(skb), len);
+ req->saved_syn = copy;
+ }
+ }
+}
+
int tcp_conn_request(struct request_sock_ops *rsk_ops,
const struct tcp_request_sock_ops *af_ops,
struct sock *sk, struct sk_buff *skb)
@@ -6192,6 +6209,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
tcp_rsk(req)->tfo_listener = false;
af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
}
+ tcp_reqsk_record_syn(sk, req, skb);
return 0;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fc1c658ec6c1..91cb4768a860 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1802,6 +1802,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
/* If socket is aborted during connect operation */
tcp_free_fastopen_req(tp);
+ tcp_saved_syn_free(tp);
sk_sockets_allocated_dec(sk);
sock_release_memcg(sk);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e5d7649136fc..ebe2ab2596ed 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -536,6 +536,9 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
newtp->fastopen_rsk = NULL;
newtp->syn_data_acked = 0;
+ newtp->saved_syn = req->saved_syn;
+ req->saved_syn = NULL;
+
TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_PASSIVEOPENS);
}
return newsk;
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-04 6:47 ` Michael Kerrisk (man-pages)
[not found] ` <CAKgNAkiUOkjsE96E1DN_zwJAjJGLWME7-XGnFDszic7p7C=g7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-04 14:02 ` Neal Cardwell
` (2 subsequent siblings)
3 siblings, 1 reply; 23+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-05-04 6:47 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric B Munson, Tom Herbert, David S. Miller, Linux API, netdev
Eric,
On 4 May 2015 at 06:34, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
>
> This patch allows a server application to get the TCP SYN headers for
> its passive connections. This is useful if the server is doing
> fingerprinting of clients based on SYN packet contents.
>
> Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
>
> The first is used on a socket to enable saving the SYN headers
> for child connections. This can be set before or after the listen()
> call.
>
> The latter is used to retrieve the SYN headers for passive connections,
> if the parent listener has enabled TCP_SAVE_SYN.
>
> TCP_SAVED_SYN is read once, it frees the saved SYN headers.
>
> The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> headers.
This description is a little thin, so I'm unclear on one or two
points. TCP_SAVE_SYN is clearly applied to the listening socket. But
what about TCP_SAVED_SYN? Is that applied to the connected socket
returned by accept()?
The highly similar naming of these two seems unfortunate. At the very
least, it makes for easy confusion in conversations about the two
options. It would be better to have names that were more distinct.
Perhaps the latter could be TCP_CONN_SYN or TCP_CONN_SAVED_SYN, for
example?
Thanks,
Michael
> Original patch was written by Tom Herbert, I changed it to not hold
> a full skb (and associated dst and conntracking reference).
>
> We have used such patch for about 3 years at Google.
>
> Signed-off-by: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> ---
> include/linux/tcp.h | 8 ++++++++
> include/net/request_sock.h | 4 +++-
> include/uapi/linux/tcp.h | 2 ++
> net/ipv4/tcp.c | 35 +++++++++++++++++++++++++++++++++++
> net/ipv4/tcp_input.c | 18 ++++++++++++++++++
> net/ipv4/tcp_ipv4.c | 1 +
> net/ipv4/tcp_minisocks.c | 3 +++
> 7 files changed, 70 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 3b2911502a8c..e6fb5df22db1 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -199,6 +199,7 @@ struct tcp_sock {
> syn_fastopen:1, /* SYN includes Fast Open option */
> syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
> syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
> + save_syn:1, /* Save headers of SYN packet */
> is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
> u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. */
>
> @@ -326,6 +327,7 @@ struct tcp_sock {
> * socket. Used to retransmit SYNACKs etc.
> */
> struct request_sock *fastopen_rsk;
> + u32 *saved_syn;
> };
>
> enum tsq_flags {
> @@ -393,4 +395,10 @@ static inline int fastopen_init_queue(struct sock *sk, int backlog)
> return 0;
> }
>
> +static inline void tcp_saved_syn_free(struct tcp_sock *tp)
> +{
> + kfree(tp->saved_syn);
> + tp->saved_syn = NULL;
> +}
> +
> #endif /* _LINUX_TCP_H */
> diff --git a/include/net/request_sock.h b/include/net/request_sock.h
> index 9f4265ce8892..87935cad2f7b 100644
> --- a/include/net/request_sock.h
> +++ b/include/net/request_sock.h
> @@ -64,6 +64,7 @@ struct request_sock {
> struct timer_list rsk_timer;
> const struct request_sock_ops *rsk_ops;
> struct sock *sk;
> + u32 *saved_syn;
> u32 secid;
> u32 peer_secid;
> };
> @@ -77,7 +78,7 @@ reqsk_alloc(const struct request_sock_ops *ops, struct sock *sk_listener)
> req->rsk_ops = ops;
> sock_hold(sk_listener);
> req->rsk_listener = sk_listener;
> -
> + req->saved_syn = NULL;
> /* Following is temporary. It is coupled with debugging
> * helpers in reqsk_put() & reqsk_free()
> */
> @@ -104,6 +105,7 @@ static inline void reqsk_free(struct request_sock *req)
> req->rsk_ops->destructor(req);
> if (req->rsk_listener)
> sock_put(req->rsk_listener);
> + kfree(req->saved_syn);
> kmem_cache_free(req->rsk_ops->slab, req);
> }
>
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index faa72f4fa547..51ebedba577f 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -113,6 +113,8 @@ enum {
> #define TCP_TIMESTAMP 24
> #define TCP_NOTSENT_LOWAT 25 /* limit number of unsent bytes in write queue */
> #define TCP_CC_INFO 26 /* Get Congestion Control (optional) info */
> +#define TCP_SAVE_SYN 27 /* Record SYN headers for new connections */
> +#define TCP_SAVED_SYN 28 /* Get SYN headers recorded for connection */
>
> struct tcp_repair_opt {
> __u32 opt_code;
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 46efa03d2b11..ecccfdc50d76 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2482,6 +2482,13 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> icsk->icsk_syn_retries = val;
> break;
>
> + case TCP_SAVE_SYN:
> + if (val < 0 || val > 1)
> + err = -EINVAL;
> + else
> + tp->save_syn = val;
> + break;
> +
> case TCP_LINGER2:
> if (val < 0)
> tp->linger2 = -1;
> @@ -2818,6 +2825,34 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
> case TCP_NOTSENT_LOWAT:
> val = tp->notsent_lowat;
> break;
> + case TCP_SAVE_SYN:
> + val = tp->save_syn;
> + break;
> + case TCP_SAVED_SYN: {
> + if (get_user(len, optlen))
> + return -EFAULT;
> +
> + lock_sock(sk);
> + if (tp->saved_syn) {
> + len = min_t(unsigned int, tp->saved_syn[0], len);
> + if (put_user(len, optlen)) {
> + release_sock(sk);
> + return -EFAULT;
> + }
> + if (copy_to_user(optval, tp->saved_syn + 1, len)) {
> + release_sock(sk);
> + return -EFAULT;
> + }
> + tcp_saved_syn_free(tp);
> + release_sock(sk);
> + } else {
> + release_sock(sk);
> + len = 0;
> + if (put_user(len, optlen))
> + return -EFAULT;
> + }
> + return 0;
> + }
> default:
> return -ENOPROTOOPT;
> }
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 09bdc4abfcbb..df2ca615cd0c 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6060,6 +6060,23 @@ static bool tcp_syn_flood_action(struct sock *sk,
> return want_cookie;
> }
>
> +static void tcp_reqsk_record_syn(const struct sock *sk,
> + struct request_sock *req,
> + const struct sk_buff *skb)
> +{
> + if (tcp_sk(sk)->save_syn) {
> + u32 len = skb_network_header_len(skb) + tcp_hdrlen(skb);
> + u32 *copy;
> +
> + copy = kmalloc(len + sizeof(u32), GFP_ATOMIC);
> + if (copy) {
> + copy[0] = len;
> + memcpy(©[1], skb_network_header(skb), len);
> + req->saved_syn = copy;
> + }
> + }
> +}
> +
> int tcp_conn_request(struct request_sock_ops *rsk_ops,
> const struct tcp_request_sock_ops *af_ops,
> struct sock *sk, struct sk_buff *skb)
> @@ -6192,6 +6209,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> tcp_rsk(req)->tfo_listener = false;
> af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
> }
> + tcp_reqsk_record_syn(sk, req, skb);
>
> return 0;
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index fc1c658ec6c1..91cb4768a860 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1802,6 +1802,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
>
> /* If socket is aborted during connect operation */
> tcp_free_fastopen_req(tp);
> + tcp_saved_syn_free(tp);
>
> sk_sockets_allocated_dec(sk);
> sock_release_memcg(sk);
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index e5d7649136fc..ebe2ab2596ed 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -536,6 +536,9 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
> newtp->fastopen_rsk = NULL;
> newtp->syn_data_acked = 0;
>
> + newtp->saved_syn = req->saved_syn;
> + req->saved_syn = NULL;
> +
> TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_PASSIVEOPENS);
> }
> return newsk;
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <CAKgNAkiUOkjsE96E1DN_zwJAjJGLWME7-XGnFDszic7p7C=g7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-05-04 13:53 ` Eric Dumazet
0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2015-05-04 13:53 UTC (permalink / raw)
To: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w
Cc: Eric B Munson, Tom Herbert, David S. Miller, Linux API, netdev
On Mon, 2015-05-04 at 08:47 +0200, Michael Kerrisk (man-pages) wrote:
> Eric,
>
> On 4 May 2015 at 06:34, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> >
> > This patch allows a server application to get the TCP SYN headers for
> > its passive connections. This is useful if the server is doing
> > fingerprinting of clients based on SYN packet contents.
> >
> > Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
> >
> > The first is used on a socket to enable saving the SYN headers
> > for child connections. This can be set before or after the listen()
> > call.
> >
> > The latter is used to retrieve the SYN headers for passive connections,
> > if the parent listener has enabled TCP_SAVE_SYN.
> >
> > TCP_SAVED_SYN is read once, it frees the saved SYN headers.
> >
> > The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> > headers.
>
> This description is a little thin, so I'm unclear on one or two
> points. TCP_SAVE_SYN is clearly applied to the listening socket. But
> what about TCP_SAVED_SYN? Is that applied to the connected socket
> returned by accept()?
>
> The highly similar naming of these two seems unfortunate. At the very
> least, it makes for easy confusion in conversations about the two
> options. It would be better to have names that were more distinct.
> Perhaps the latter could be TCP_CONN_SYN or TCP_CONN_SAVED_SYN, for
> example?
>
> Thanks,
TCP_CONN_SYN is rather confusing, because of the analogy with connect().
Maybe I can send the test Neal wrote 3 years ago, part of our automated
non regression tests we run here.
Let me know if you need more information, thanks !
/*
* Copyright 2012 Google Inc. All Rights Reserved.
* Author: ncardwell-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org (Neal Cardwell)
*
* A basic test for the TCP_SAVE_SYN and TCP_SAVED_SYN socket options.
*/
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#ifndef TCP_SAVE_SYN
#define TCP_SAVE_SYN 27
#endif
#ifndef TCP_SAVED_SYN
#define TCP_SAVED_SYN 28
#endif
typedef char bool;
typedef enum bool_t {
false = 0,
true = 1,
} bool_t;
static void fail(const char *msg)
{
fprintf(stderr, "%s\n", msg);
exit(1);
}
static void fail_perror(const char *msg)
{
perror(msg);
exit(1);
}
/*
* Once every ~5000 connections, connect fails with ENOBUFS. This is
* not even in the manpage, and seem to be transient. For now, just retry.
*/
static void connect_reliably(int fd, struct sockaddr *daddr, int dlen)
{
int ret, max_runs = 5;
do {
ret = connect(fd, daddr, dlen);
} while (ret == -1 && errno == ENOBUFS && --max_runs);
if (ret)
fail_perror("reliable connect");
}
/* Get and validate the saved SYN. */
static void read_saved_syn(int fd, int address_family)
{
unsigned char syn[500];
socklen_t syn_len = sizeof(syn);
memset(syn, 0, sizeof(syn));
/* Read the saved SYN. */
if (getsockopt(fd, IPPROTO_TCP, TCP_SAVED_SYN, syn, &syn_len) != 0)
fail_perror("first getsockopt TCP_SAVED_SYN failed");
/* Check the length and first byte of the SYN. */
if (address_family == AF_INET) {
assert(syn_len == 60);
assert(syn[0] >> 4 == 0x4); /* IPv4 */
} else if (address_family == AF_INET6) {
assert(syn_len == 80);
assert(syn[0] >> 4 == 0x6); /* IPv6 */
} else {
assert(!"bad address family");
}
/* Check the last few bytes of the SYN, which will be TCP options. */
assert(syn[syn_len-4] == 0x01); /* TCP option: kind = NOP */
assert(syn[syn_len-3] == 0x03); /* TCP option: kind = window scale */
assert(syn[syn_len-2] == 0x03); /* TCP option: length = 3 */
assert(syn[syn_len-1] == 0x06 || syn[syn_len-1] == 0x07); /* TCP option: window scale = 6 or 7 */
/* If we try TCP_SAVED_SYN again it should succeed with 0 length. */
if (getsockopt(fd, IPPROTO_TCP, TCP_SAVED_SYN, syn, &syn_len) != 0)
fail("repeated getsockopt TCP_SAVED_SYN failed");
assert(syn_len == 0);
}
/* Open server and client socket and test TCP_SAVE_SYN and TCP_SAVED_SYN. */
static void do_test(struct sockaddr *srv_addr, uint16_t *srv_port, int srv_len,
struct sockaddr *cli_addr, uint16_t *cli_port, int cli_len,
bool get_saved_syn)
{
int fd_listen = -1, fd_accept = -1, fd_connect = -1;
int one = 1;
fd_listen = socket(srv_addr->sa_family, SOCK_STREAM, 0);
if (fd_listen == -1)
fail_perror("open fd_listen");
if (setsockopt(fd_listen, SOL_SOCKET, SO_REUSEADDR,
&one, sizeof(one)) < 0)
fail_perror("setsockopt SO_REUSEADDR");
if (bind(fd_listen, srv_addr, srv_len))
fail_perror("bind fd_listen");
if (getsockname(fd_listen, srv_addr, (socklen_t *)&srv_len))
fail_perror("getsockname fd_listen");
*cli_port = *srv_port;
if (setsockopt(fd_listen, IPPROTO_TCP, TCP_SAVE_SYN,
&one, sizeof(one)) < 0)
fail_perror("setsockopt TCP_SAVE_SYN");
if (listen(fd_listen, 1))
fail_perror("listen fd_listen");
fd_connect = socket(cli_addr->sa_family, SOCK_STREAM, 0);
if (fd_connect == -1)
fail_perror("open fd_connect");
connect_reliably(fd_connect, cli_addr, cli_len);
fd_accept = accept(fd_listen, NULL, 0);
if (fd_accept == -1)
fail_perror("accept fd_listen");
if (get_saved_syn) {
read_saved_syn(fd_accept, cli_addr->sa_family);
}
if (close(fd_listen))
fail_perror("close fd_listen");
if (close(fd_accept))
fail_perror("close fd_accept");
if (close(fd_connect))
fail_perror("close fd_connect");
}
static void test_ipv4(bool get_saved_syn)
{
struct sockaddr_in srv_addr = {
.sin_family = AF_INET,
.sin_addr.s_addr = INADDR_ANY,
.sin_port = 0,
};
struct sockaddr_in cli_addr = {
.sin_family = AF_INET,
.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
};
printf(" testing IPv4 ...\n");
do_test((struct sockaddr *)&srv_addr, &srv_addr.sin_port,
sizeof(srv_addr),
(struct sockaddr *)&cli_addr, &cli_addr.sin_port,
sizeof(cli_addr),
get_saved_syn);
}
static void test_ipv6(bool get_saved_syn)
{
struct sockaddr_in6 srv_addr = {
.sin6_family = AF_INET6,
.sin6_addr = IN6ADDR_ANY_INIT,
.sin6_port = 0,
};
struct sockaddr_in6 cli_addr = {
.sin6_family = AF_INET6,
.sin6_addr = IN6ADDR_LOOPBACK_INIT,
};
printf(" testing IPv6 ...\n");
do_test((struct sockaddr *)&srv_addr, &srv_addr.sin6_port,
sizeof(srv_addr),
(struct sockaddr *)&cli_addr, &cli_addr.sin6_port,
sizeof(cli_addr),
get_saved_syn);
}
static void test_ipv4_mapped_ipv6(bool get_saved_syn)
{
struct sockaddr_in6 srv_addr = {
.sin6_family = AF_INET6,
.sin6_addr = IN6ADDR_ANY_INIT,
.sin6_port = 0,
};
struct sockaddr_in cli_addr = {
.sin_family = AF_INET,
.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
};
printf(" testing IPv4-mapped-IPv6 (srv=AF_INET6, cli=AF_INET)...\n");
do_test((struct sockaddr *)&srv_addr, &srv_addr.sin6_port,
sizeof(srv_addr),
(struct sockaddr *)&cli_addr, &cli_addr.sin_port,
sizeof(cli_addr),
get_saved_syn);
}
static void test_all_address_families(bool get_saved_syn)
{
test_ipv4(get_saved_syn);
test_ipv6(get_saved_syn);
test_ipv4_mapped_ipv6(get_saved_syn);
}
static void run_all_tests(void)
{
bool get_saved_syn;
/* Test normal behavior, when we ask the kernel to record the
* SYN and then read it using the TCP_SAVED_SYN getsockopt().
*/
printf("test: reading the saved SYN...\n");
get_saved_syn = true;
test_all_address_families(get_saved_syn);
/* Test behavior when we ask the kernel to record the SYN and
* then never actually use the TCP_SAVED_SYN getsockopt() to
* extract the saved SYN.
*/
printf("test: not reading the saved SYN...\n");
get_saved_syn = false;
test_all_address_families(get_saved_syn);
}
static int usage(const char *executable_path)
{
fprintf(stderr, "usage: %s\n", executable_path);
return 1;
}
int main(int argc, char **argv)
{
if (getuid() != 0 || getgid() != 0)
fail("must run as root\n");
if (argc != 1)
return usage(argv[0]);
system("sysctl net.ipv4.tcp_timestamps=1");
system("sysctl net.ipv4.tcp_sack=1");
system("sysctl net.ipv4.tcp_window_scaling=1");
run_all_tests();
printf("OK. All tests passed.\n");
return 0;
}
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-04 6:47 ` Michael Kerrisk (man-pages)
@ 2015-05-04 14:02 ` Neal Cardwell
2015-05-04 14:21 ` Eric B Munson
2015-05-04 14:41 ` John Heffner
3 siblings, 0 replies; 23+ messages in thread
From: Neal Cardwell @ 2015-05-04 14:02 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric B Munson, Tom Herbert, David S. Miller,
linux-api-u79uwXL29TY76Z2rM5mHXA, netdev
On Mon, May 4, 2015 at 12:34 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
>
> This patch allows a server application to get the TCP SYN headers for
> its passive connections. This is useful if the server is doing
> fingerprinting of clients based on SYN packet contents.
>
> Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
>
> The first is used on a socket to enable saving the SYN headers
> for child connections. This can be set before or after the listen()
> call.
>
> The latter is used to retrieve the SYN headers for passive connections,
> if the parent listener has enabled TCP_SAVE_SYN.
>
> TCP_SAVED_SYN is read once, it frees the saved SYN headers.
>
> The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> headers.
>
> Original patch was written by Tom Herbert, I changed it to not hold
> a full skb (and associated dst and conntracking reference).
>
> We have used such patch for about 3 years at Google.
>
> Signed-off-by: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> ---
Acked-by: Neal Cardwell <ncardwell-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Tested-by: Neal Cardwell <ncardwell-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
The code looks good to me, and I re-ran the test I wrote (and which I
see Eric posted), and double-checked that it passes on net-next with
this patch applied.
Personally I like the socket option names from this patch.
TCP_SAVE_SYN means "Please save SYN headers for child sockets of this
listener". And TCP_SAVED_SYN means "Please give me the saved SYN for
this accepted child."
Thanks, Eric!
neal
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-04 6:47 ` Michael Kerrisk (man-pages)
2015-05-04 14:02 ` Neal Cardwell
@ 2015-05-04 14:21 ` Eric B Munson
[not found] ` <20150504142155.GD6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-05-04 14:41 ` John Heffner
3 siblings, 1 reply; 23+ messages in thread
From: Eric B Munson @ 2015-05-04 14:21 UTC (permalink / raw)
To: Eric Dumazet
Cc: Tom Herbert, David S. Miller, linux-api-u79uwXL29TY76Z2rM5mHXA,
netdev
[-- Attachment #1: Type: text/plain, Size: 8264 bytes --]
On Sun, 03 May 2015, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
>
> This patch allows a server application to get the TCP SYN headers for
> its passive connections. This is useful if the server is doing
> fingerprinting of clients based on SYN packet contents.
>
> Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
>
> The first is used on a socket to enable saving the SYN headers
> for child connections. This can be set before or after the listen()
> call.
>
> The latter is used to retrieve the SYN headers for passive connections,
> if the parent listener has enabled TCP_SAVE_SYN.
>
> TCP_SAVED_SYN is read once, it frees the saved SYN headers.
>
> The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> headers.
>
> Original patch was written by Tom Herbert, I changed it to not hold
> a full skb (and associated dst and conntracking reference).
>
> We have used such patch for about 3 years at Google.
>
> Signed-off-by: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> ---
> include/linux/tcp.h | 8 ++++++++
> include/net/request_sock.h | 4 +++-
> include/uapi/linux/tcp.h | 2 ++
> net/ipv4/tcp.c | 35 +++++++++++++++++++++++++++++++++++
> net/ipv4/tcp_input.c | 18 ++++++++++++++++++
> net/ipv4/tcp_ipv4.c | 1 +
> net/ipv4/tcp_minisocks.c | 3 +++
> 7 files changed, 70 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 3b2911502a8c..e6fb5df22db1 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -199,6 +199,7 @@ struct tcp_sock {
> syn_fastopen:1, /* SYN includes Fast Open option */
> syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
> syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
> + save_syn:1, /* Save headers of SYN packet */
> is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
> u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. */
>
> @@ -326,6 +327,7 @@ struct tcp_sock {
> * socket. Used to retransmit SYNACKs etc.
> */
> struct request_sock *fastopen_rsk;
> + u32 *saved_syn;
> };
>
> enum tsq_flags {
> @@ -393,4 +395,10 @@ static inline int fastopen_init_queue(struct sock *sk, int backlog)
> return 0;
> }
>
> +static inline void tcp_saved_syn_free(struct tcp_sock *tp)
> +{
> + kfree(tp->saved_syn);
> + tp->saved_syn = NULL;
> +}
> +
> #endif /* _LINUX_TCP_H */
> diff --git a/include/net/request_sock.h b/include/net/request_sock.h
> index 9f4265ce8892..87935cad2f7b 100644
> --- a/include/net/request_sock.h
> +++ b/include/net/request_sock.h
> @@ -64,6 +64,7 @@ struct request_sock {
> struct timer_list rsk_timer;
> const struct request_sock_ops *rsk_ops;
> struct sock *sk;
> + u32 *saved_syn;
> u32 secid;
> u32 peer_secid;
> };
> @@ -77,7 +78,7 @@ reqsk_alloc(const struct request_sock_ops *ops, struct sock *sk_listener)
> req->rsk_ops = ops;
> sock_hold(sk_listener);
> req->rsk_listener = sk_listener;
> -
> + req->saved_syn = NULL;
> /* Following is temporary. It is coupled with debugging
> * helpers in reqsk_put() & reqsk_free()
> */
> @@ -104,6 +105,7 @@ static inline void reqsk_free(struct request_sock *req)
> req->rsk_ops->destructor(req);
> if (req->rsk_listener)
> sock_put(req->rsk_listener);
> + kfree(req->saved_syn);
> kmem_cache_free(req->rsk_ops->slab, req);
> }
>
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index faa72f4fa547..51ebedba577f 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -113,6 +113,8 @@ enum {
> #define TCP_TIMESTAMP 24
> #define TCP_NOTSENT_LOWAT 25 /* limit number of unsent bytes in write queue */
> #define TCP_CC_INFO 26 /* Get Congestion Control (optional) info */
> +#define TCP_SAVE_SYN 27 /* Record SYN headers for new connections */
> +#define TCP_SAVED_SYN 28 /* Get SYN headers recorded for connection */
The getsockopt() for TCP_SAVE_SYN doesn't offer much over returning 0
from TCP_SAVED_SYN if the headers are not available. Why not collapse
these into a single option? If you need to track the difference between
the headers are not avialable and should have been and the headers
should not be available it can be done through error codes from
getsockopt().
>
> struct tcp_repair_opt {
> __u32 opt_code;
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 46efa03d2b11..ecccfdc50d76 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2482,6 +2482,13 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> icsk->icsk_syn_retries = val;
> break;
>
> + case TCP_SAVE_SYN:
> + if (val < 0 || val > 1)
> + err = -EINVAL;
> + else
> + tp->save_syn = val;
> + break;
> +
> case TCP_LINGER2:
> if (val < 0)
> tp->linger2 = -1;
> @@ -2818,6 +2825,34 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
> case TCP_NOTSENT_LOWAT:
> val = tp->notsent_lowat;
> break;
> + case TCP_SAVE_SYN:
> + val = tp->save_syn;
> + break;
> + case TCP_SAVED_SYN: {
> + if (get_user(len, optlen))
> + return -EFAULT;
> +
> + lock_sock(sk);
> + if (tp->saved_syn) {
What happens when userspace uses a buffer that is too small to hold the
saved headers? They get a partial result back but they do not get a
chance to come back for the rest (or really have any way of knowing
there was more beyond missing some of the tcp_hdr struct).
> + len = min_t(unsigned int, tp->saved_syn[0], len);
> + if (put_user(len, optlen)) {
> + release_sock(sk);
> + return -EFAULT;
> + }
> + if (copy_to_user(optval, tp->saved_syn + 1, len)) {
> + release_sock(sk);
> + return -EFAULT;
> + }
> + tcp_saved_syn_free(tp);
> + release_sock(sk);
> + } else {
> + release_sock(sk);
> + len = 0;
> + if (put_user(len, optlen))
> + return -EFAULT;
> + }
> + return 0;
> + }
> default:
> return -ENOPROTOOPT;
> }
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 09bdc4abfcbb..df2ca615cd0c 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6060,6 +6060,23 @@ static bool tcp_syn_flood_action(struct sock *sk,
> return want_cookie;
> }
>
> +static void tcp_reqsk_record_syn(const struct sock *sk,
> + struct request_sock *req,
> + const struct sk_buff *skb)
> +{
> + if (tcp_sk(sk)->save_syn) {
> + u32 len = skb_network_header_len(skb) + tcp_hdrlen(skb);
> + u32 *copy;
> +
> + copy = kmalloc(len + sizeof(u32), GFP_ATOMIC);
> + if (copy) {
> + copy[0] = len;
> + memcpy(©[1], skb_network_header(skb), len);
> + req->saved_syn = copy;
> + }
> + }
> +}
> +
> int tcp_conn_request(struct request_sock_ops *rsk_ops,
> const struct tcp_request_sock_ops *af_ops,
> struct sock *sk, struct sk_buff *skb)
> @@ -6192,6 +6209,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> tcp_rsk(req)->tfo_listener = false;
> af_ops->queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
> }
> + tcp_reqsk_record_syn(sk, req, skb);
>
> return 0;
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index fc1c658ec6c1..91cb4768a860 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1802,6 +1802,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
>
> /* If socket is aborted during connect operation */
> tcp_free_fastopen_req(tp);
> + tcp_saved_syn_free(tp);
>
> sk_sockets_allocated_dec(sk);
> sock_release_memcg(sk);
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index e5d7649136fc..ebe2ab2596ed 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -536,6 +536,9 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
> newtp->fastopen_rsk = NULL;
> newtp->syn_data_acked = 0;
>
> + newtp->saved_syn = req->saved_syn;
> + req->saved_syn = NULL;
> +
> TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_PASSIVEOPENS);
> }
> return newsk;
>
>
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <20150504142155.GD6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
@ 2015-05-04 14:31 ` Eric Dumazet
[not found] ` <1430749912.3711.173.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
0 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2015-05-04 14:31 UTC (permalink / raw)
To: Eric B Munson
Cc: Tom Herbert, David S. Miller, linux-api-u79uwXL29TY76Z2rM5mHXA,
netdev
On Mon, 2015-05-04 at 10:21 -0400, Eric B Munson wrote:
> The getsockopt() for TCP_SAVE_SYN doesn't offer much over returning 0
> from TCP_SAVED_SYN if the headers are not available. Why not collapse
> these into a single option? If you need to track the difference between
> the headers are not avialable and should have been and the headers
> should not be available it can be done through error codes from
> getsockopt().
A single option is not good, because it adds confusion.
Note that we need to set the bit on any socket state, and doing this
with a single option is not very easy. ( You got this wrong in your
patch btw)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430749912.3711.173.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-04 14:36 ` Eric Dumazet
0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2015-05-04 14:36 UTC (permalink / raw)
To: Eric B Munson
Cc: Tom Herbert, David S. Miller, linux-api-u79uwXL29TY76Z2rM5mHXA,
netdev
On Mon, 2015-05-04 at 07:31 -0700, Eric Dumazet wrote:
> On Mon, 2015-05-04 at 10:21 -0400, Eric B Munson wrote:
>
> > The getsockopt() for TCP_SAVE_SYN doesn't offer much over returning 0
> > from TCP_SAVED_SYN if the headers are not available. Why not collapse
> > these into a single option? If you need to track the difference between
> > the headers are not avialable and should have been and the headers
> > should not be available it can be done through error codes from
> > getsockopt().
>
> A single option is not good, because it adds confusion.
>
> Note that we need to set the bit on any socket state, and doing this
> with a single option is not very easy. ( You got this wrong in your
> patch btw)
Another argument for having two options is for TCP_REPAIR support.
If you want to properly check/restore a socket, including its
TCP_SAVE_SYN state, then you definitely need 2 options.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
` (2 preceding siblings ...)
2015-05-04 14:21 ` Eric B Munson
@ 2015-05-04 14:41 ` John Heffner
[not found] ` <CABrhC0nmsfAyHgJX8zEBDBVfFN=2qXKy7cO0Kbp9R9UCeEYowg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
3 siblings, 1 reply; 23+ messages in thread
From: John Heffner @ 2015-05-04 14:41 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric B Munson, Tom Herbert, David S. Miller,
linux-api-u79uwXL29TY76Z2rM5mHXA, netdev
On Mon, May 4, 2015 at 12:34 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
>
> This patch allows a server application to get the TCP SYN headers for
> its passive connections. This is useful if the server is doing
> fingerprinting of clients based on SYN packet contents.
>
> Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
>
> The first is used on a socket to enable saving the SYN headers
> for child connections. This can be set before or after the listen()
> call.
>
> The latter is used to retrieve the SYN headers for passive connections,
> if the parent listener has enabled TCP_SAVE_SYN.
>
> TCP_SAVED_SYN is read once, it frees the saved SYN headers.
>
> The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> headers.
>
> Original patch was written by Tom Herbert, I changed it to not hold
> a full skb (and associated dst and conntracking reference).
>
> We have used such patch for about 3 years at Google.
Nice idea, seems handy. But a couple (somewhat related) questions:
* Other than convenience, are there reasons not use an existing, more
general-purpose and portable mechanism like pcap? (Permissions, I
guess?)
* Are there conditions where, for security purposes, you don't want an
application to have access to the raw SYNs?
Thanks,
-John
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <CABrhC0nmsfAyHgJX8zEBDBVfFN=2qXKy7cO0Kbp9R9UCeEYowg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-05-04 14:58 ` Eric B Munson
2015-05-04 15:12 ` Eric Dumazet
1 sibling, 0 replies; 23+ messages in thread
From: Eric B Munson @ 2015-05-04 14:58 UTC (permalink / raw)
To: John Heffner
Cc: Eric Dumazet, Tom Herbert, David S. Miller,
linux-api-u79uwXL29TY76Z2rM5mHXA, netdev
[-- Attachment #1: Type: text/plain, Size: 2047 bytes --]
On Mon, 04 May 2015, John Heffner wrote:
> On Mon, May 4, 2015 at 12:34 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > From: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> >
> > This patch allows a server application to get the TCP SYN headers for
> > its passive connections. This is useful if the server is doing
> > fingerprinting of clients based on SYN packet contents.
> >
> > Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
> >
> > The first is used on a socket to enable saving the SYN headers
> > for child connections. This can be set before or after the listen()
> > call.
> >
> > The latter is used to retrieve the SYN headers for passive connections,
> > if the parent listener has enabled TCP_SAVE_SYN.
> >
> > TCP_SAVED_SYN is read once, it frees the saved SYN headers.
> >
> > The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> > headers.
> >
> > Original patch was written by Tom Herbert, I changed it to not hold
> > a full skb (and associated dst and conntracking reference).
> >
> > We have used such patch for about 3 years at Google.
>
>
> Nice idea, seems handy. But a couple (somewhat related) questions:
>
> * Other than convenience, are there reasons not use an existing, more
> general-purpose and portable mechanism like pcap? (Permissions, I
> guess?)
Eric Dumazet can speak for their reasons, but we had a difficult time
guaranteeing that the data collected by libpcap was avaialble to the
accepting thread when accept() returned without making the accepting
thread wait on that data. With this implementation, we guarantee that
userspace has access to this data when accept() returns.
> * Are there conditions where, for security purposes, you don't want an
> application to have access to the raw SYNs?
In this implementation, userspace does not have access to the raw SYN,
rather it can request a copy of the TCP and IP headers.
Eric
>
> Thanks,
> -John
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <CABrhC0nmsfAyHgJX8zEBDBVfFN=2qXKy7cO0Kbp9R9UCeEYowg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-04 14:58 ` Eric B Munson
@ 2015-05-04 15:12 ` Eric Dumazet
[not found] ` <1430752330.3711.180.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
1 sibling, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2015-05-04 15:12 UTC (permalink / raw)
To: John Heffner
Cc: Eric B Munson, Tom Herbert, David S. Miller,
linux-api-u79uwXL29TY76Z2rM5mHXA, netdev
On Mon, 2015-05-04 at 10:41 -0400, John Heffner wrote:
> Nice idea, seems handy. But a couple (somewhat related) questions:
>
> * Other than convenience, are there reasons not use an existing, more
> general-purpose and portable mechanism like pcap? (Permissions, I
> guess?)
Very hard to synchronize when say you have 32 listeners sharing a single
port (SO_REUSEPORT), and receive one million SYN per second (when my TCP
listener scaling work is finished).
libpcap here would be a serious bottleneck, even with a clever FANIN
support on the af_packet sockets, considering use of multiqueue NIC.
> * Are there conditions where, for security purposes, you don't want an
> application to have access to the raw SYNs?
Not that we are aware of : We restrict the access to IP + TCP headers,
for the passive part. All information that is available there was
provided by the remote peer on a 'open way' anyway.
Thanks.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <1430752330.3711.180.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-05 3:07 ` John Heffner
0 siblings, 0 replies; 23+ messages in thread
From: John Heffner @ 2015-05-05 3:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric B Munson, Tom Herbert, David S. Miller,
linux-api-u79uwXL29TY76Z2rM5mHXA, netdev
On Mon, May 4, 2015 at 11:12 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, 2015-05-04 at 10:41 -0400, John Heffner wrote:
>
>> Nice idea, seems handy. But a couple (somewhat related) questions:
>>
>> * Other than convenience, are there reasons not use an existing, more
>> general-purpose and portable mechanism like pcap? (Permissions, I
>> guess?)
>
> Very hard to synchronize when say you have 32 listeners sharing a single
> port (SO_REUSEPORT), and receive one million SYN per second (when my TCP
> listener scaling work is finished).
>
> libpcap here would be a serious bottleneck, even with a clever FANIN
> support on the af_packet sockets, considering use of multiqueue NIC.
>
>> * Are there conditions where, for security purposes, you don't want an
>> application to have access to the raw SYNs?
>
> Not that we are aware of : We restrict the access to IP + TCP headers,
> for the passive part. All information that is available there was
> provided by the remote peer on a 'open way' anyway.
Makes sense, thanks.
-John
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
2015-05-04 4:34 ` [PATCH net-next] tcp: provide SYN headers for passive connections Eric Dumazet
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
@ 2015-05-05 20:05 ` David Miller
[not found] ` <20150505.160535.1034497188259706110.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
1 sibling, 1 reply; 23+ messages in thread
From: David Miller @ 2015-05-05 20:05 UTC (permalink / raw)
To: eric.dumazet; +Cc: emunson, tom, linux-api, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 03 May 2015 21:34:46 -0700
> From: Eric Dumazet <edumazet@google.com>
>
> This patch allows a server application to get the TCP SYN headers for
> its passive connections. This is useful if the server is doing
> fingerprinting of clients based on SYN packet contents.
>
> Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.
>
> The first is used on a socket to enable saving the SYN headers
> for child connections. This can be set before or after the listen()
> call.
>
> The latter is used to retrieve the SYN headers for passive connections,
> if the parent listener has enabled TCP_SAVE_SYN.
>
> TCP_SAVED_SYN is read once, it frees the saved SYN headers.
>
> The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
> headers.
>
> Original patch was written by Tom Herbert, I changed it to not hold
> a full skb (and associated dst and conntracking reference).
>
> We have used such patch for about 3 years at Google.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, but I think you should give maybe a little bit more thought
into the behavior when a not-large-enough buffer is supplied by the
user.
Since we're talking about things on the order of 60 bytes or so at
most, it seems clear to me that signalling an error is probably the
most advisable thing to do in this situation.
Maybe you could signal an error, yet write the actual length to
"optlen". That way the user can figure out how much they need. Then
they at least would have the option of retrying with a larger buffer.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next] tcp: provide SYN headers for passive connections
[not found] ` <20150505.160535.1034497188259706110.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2015-05-05 21:02 ` Eric Dumazet
0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2015-05-05 21:02 UTC (permalink / raw)
To: David Miller
Cc: emunson-JqFfY2XvxFXQT0dZR+AlfA, tom-BjP2VixgY4xUbtYUoyoikg,
linux-api-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
On Tue, 2015-05-05 at 16:05 -0400, David Miller wrote:
> Applied, but I think you should give maybe a little bit more thought
> into the behavior when a not-large-enough buffer is supplied by the
> user.
>
> Since we're talking about things on the order of 60 bytes or so at
> most, it seems clear to me that signalling an error is probably the
> most advisable thing to do in this situation.
>
> Maybe you could signal an error, yet write the actual length to
> "optlen". That way the user can figure out how much they need. Then
> they at least would have the option of retrying with a larger buffer.
Seems fine to me, I'll provide a patch implementing your suggestions.
Thanks.
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2015-05-05 21:02 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-01 17:43 [PATCH] Allow TCP connections to cache SYN packet for userspace inspection Eric B Munson
[not found] ` <1430502237-5619-1-git-send-email-emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-05-01 18:42 ` Eric Dumazet
[not found] ` <1430505777.3711.135.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-01 19:55 ` Tom Herbert
[not found] ` <CALx6S34ftz_wDoPwcJg_cMQu4QtnBJF-=d+gF5ieTA=d=r31-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-01 20:14 ` Eric B Munson
[not found] ` <20150501201417.GB6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-05-01 20:23 ` Eric Dumazet
[not found] ` <1430511800.3711.138.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-01 20:29 ` Eric B Munson
[not found] ` <20150501202908.GC6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-05-01 20:41 ` Eric Dumazet
[not found] ` <1430512894.3711.140.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-04 4:34 ` [PATCH net-next] tcp: provide SYN headers for passive connections Eric Dumazet
[not found] ` <1430714086.3711.165.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-04 6:47 ` Michael Kerrisk (man-pages)
[not found] ` <CAKgNAkiUOkjsE96E1DN_zwJAjJGLWME7-XGnFDszic7p7C=g7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-04 13:53 ` Eric Dumazet
2015-05-04 14:02 ` Neal Cardwell
2015-05-04 14:21 ` Eric B Munson
[not found] ` <20150504142155.GD6113-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-05-04 14:31 ` Eric Dumazet
[not found] ` <1430749912.3711.173.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-04 14:36 ` Eric Dumazet
2015-05-04 14:41 ` John Heffner
[not found] ` <CABrhC0nmsfAyHgJX8zEBDBVfFN=2qXKy7cO0Kbp9R9UCeEYowg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-04 14:58 ` Eric B Munson
2015-05-04 15:12 ` Eric Dumazet
[not found] ` <1430752330.3711.180.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2015-05-05 3:07 ` John Heffner
2015-05-05 20:05 ` David Miller
[not found] ` <20150505.160535.1034497188259706110.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2015-05-05 21:02 ` Eric Dumazet
2015-05-01 19:27 ` [PATCH] Allow TCP connections to cache SYN packet for userspace inspection Andy Lutomirski
[not found] ` <CALCETrWi6h3DRu6Z8jJ_-MiWqRRyKZHntpJFNON=GpAjMDYXmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-01 20:01 ` Eric B Munson
2015-05-01 20:28 ` Andy Lutomirski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).