netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
       [not found] <20111010145206.6f7e9ee2@nehalam.linuxnetplumber.net>
@ 2011-10-12  1:28 ` Muraliraja Muniraju
  2011-10-12  2:48   ` Eric Dumazet
  2011-10-12 15:52   ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Stephen Hemminger
  0 siblings, 2 replies; 12+ messages in thread
From: Muraliraja Muniraju @ 2011-10-12  1:28 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy <kabe
  Cc: linux-kernel, netdev, Murali Raja

From: Murali Raja <muralira@google.com>

This patch exposes the tos value for the TCP sockets when the TOS flag
is requested in the ext_flags for the inet_diag request. This would mainly be
used to expose TOS values for both for TCP and UDP sockets. Currently it is
supported for TCP. When netlink support for UDP would be added the support
to expose the TOS values would alse be done.

Signed-off-by: Murali Raja <muralira@google.com>
---
Changelog since v2:
- Adding support for IPV6 class and using right API's
Changelog since v1:
- Removing reserved field 

 include/linux/inet_diag.h |    9 ++++++++-
 net/ipv4/inet_diag.c      |    5 +++++
 2 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
index bc8c490..e36093d 100644
--- a/include/linux/inet_diag.h
+++ b/include/linux/inet_diag.h
@@ -97,9 +97,10 @@ enum {
 	INET_DIAG_INFO,
 	INET_DIAG_VEGASINFO,
 	INET_DIAG_CONG,
+	INET_DIAG_TOS,
 };
 
-#define INET_DIAG_MAX INET_DIAG_CONG
+#define INET_DIAG_MAX INET_DIAG_TOS
 
 
 /* INET_DIAG_MEM */
@@ -120,6 +121,12 @@ struct tcpvegas_info {
 	__u32	tcpv_minrtt;
 };
 
+/* INET_DIAG_TOS */
+
+struct inet_diag_tos {
+	__u8	idiag_tos;
+};
+
 #ifdef __KERNEL__
 struct sock;
 struct inet_hashinfo;
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 389a2e6..f5e2bda 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -108,6 +108,9 @@ static int inet_csk_diag_fill(struct sock *sk,
 		       icsk->icsk_ca_ops->name);
 	}
 
+	if ((ext & (1 << (INET_DIAG_TOS - 1))) && (sk->sk_family != AF_INET6))
+		RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
+
 	r->idiag_family = sk->sk_family;
 	r->idiag_state = sk->sk_state;
 	r->idiag_timer = 0;
@@ -130,6 +133,8 @@ static int inet_csk_diag_fill(struct sock *sk,
 			       &np->rcv_saddr);
 		ipv6_addr_copy((struct in6_addr *)r->id.idiag_dst,
 			       &np->daddr);
+		if (ext & (1 << (INET_DIAG_TOS - 1)))
+			RTA_PUT_U8(skb, INET_DIAG_TOS, np->tclass);
 	}
 #endif
 
-- 
1.7.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12  1:28 ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Muraliraja Muniraju
@ 2011-10-12  2:48   ` Eric Dumazet
  2011-10-12  4:16     ` MuraliRaja Muniraju
  2011-10-12 15:52   ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Stephen Hemminger
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2011-10-12  2:48 UTC (permalink / raw)
  To: Muraliraja Muniraju
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

Le mardi 11 octobre 2011 à 18:28 -0700, Muraliraja Muniraju a écrit :
> From: Murali Raja <muralira@google.com>
> 
> This patch exposes the tos value for the TCP sockets when the TOS flag
> is requested in the ext_flags for the inet_diag request. This would mainly be
> used to expose TOS values for both for TCP and UDP sockets. Currently it is
> supported for TCP. When netlink support for UDP would be added the support
> to expose the TOS values would alse be done.
> 

You could mention TCLASS support for IPv6

> Signed-off-by: Murali Raja <muralira@google.com>
> ---
> Changelog since v2:
> - Adding support for IPV6 class and using right API's
> Changelog since v1:
> - Removing reserved field 
> 
>  include/linux/inet_diag.h |    9 ++++++++-
>  net/ipv4/inet_diag.c      |    5 +++++
>  2 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
> index bc8c490..e36093d 100644
> --- a/include/linux/inet_diag.h
> +++ b/include/linux/inet_diag.h
> @@ -97,9 +97,10 @@ enum {
>  	INET_DIAG_INFO,
>  	INET_DIAG_VEGASINFO,
>  	INET_DIAG_CONG,
> +	INET_DIAG_TOS,
>  };
>  
> -#define INET_DIAG_MAX INET_DIAG_CONG
> +#define INET_DIAG_MAX INET_DIAG_TOS
>  
> 
>  /* INET_DIAG_MEM */
> @@ -120,6 +121,12 @@ struct tcpvegas_info {
>  	__u32	tcpv_minrtt;
>  };
>  
> +/* INET_DIAG_TOS */
> +
> +struct inet_diag_tos {
> +	__u8	idiag_tos;
> +};

Are you sure its still needed ?

I am now wondering what is done in TIME_WAIT state.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12  2:48   ` Eric Dumazet
@ 2011-10-12  4:16     ` MuraliRaja Muniraju
  2011-10-12  4:31       ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: MuraliRaja Muniraju @ 2011-10-12  4:16 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

Eric,
     I think it would be useful to expose the values this way.
     I shall make the changes to add TCLASS for ipv6 in the description.

     Regarding the TIME_WAIT state, I think it would be better of not
exposing the values because there would hardly be anything to transmit
during this time.

Thanks,
Murali

On Tue, Oct 11, 2011 at 7:48 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 11 octobre 2011 à 18:28 -0700, Muraliraja Muniraju a écrit :
>> From: Murali Raja <muralira@google.com>
>>
>> This patch exposes the tos value for the TCP sockets when the TOS flag
>> is requested in the ext_flags for the inet_diag request. This would mainly be
>> used to expose TOS values for both for TCP and UDP sockets. Currently it is
>> supported for TCP. When netlink support for UDP would be added the support
>> to expose the TOS values would alse be done.
>>
>
> You could mention TCLASS support for IPv6
>
>> Signed-off-by: Murali Raja <muralira@google.com>
>> ---
>> Changelog since v2:
>> - Adding support for IPV6 class and using right API's
>> Changelog since v1:
>> - Removing reserved field
>>
>>  include/linux/inet_diag.h |    9 ++++++++-
>>  net/ipv4/inet_diag.c      |    5 +++++
>>  2 files changed, 13 insertions(+), 1 deletions(-)
>>
>> diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
>> index bc8c490..e36093d 100644
>> --- a/include/linux/inet_diag.h
>> +++ b/include/linux/inet_diag.h
>> @@ -97,9 +97,10 @@ enum {
>>       INET_DIAG_INFO,
>>       INET_DIAG_VEGASINFO,
>>       INET_DIAG_CONG,
>> +     INET_DIAG_TOS,
>>  };
>>
>> -#define INET_DIAG_MAX INET_DIAG_CONG
>> +#define INET_DIAG_MAX INET_DIAG_TOS
>>
>>
>>  /* INET_DIAG_MEM */
>> @@ -120,6 +121,12 @@ struct tcpvegas_info {
>>       __u32   tcpv_minrtt;
>>  };
>>
>> +/* INET_DIAG_TOS */
>> +
>> +struct inet_diag_tos {
>> +     __u8    idiag_tos;
>> +};
>
> Are you sure its still needed ?
>
> I am now wondering what is done in TIME_WAIT state.
>
>
>



-- 
Thanks,
Murali

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12  4:16     ` MuraliRaja Muniraju
@ 2011-10-12  4:31       ` Eric Dumazet
  2011-10-12 19:00         ` MuraliRaja Muniraju
  2011-10-21 10:29         ` [PATCH net-next] ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT Eric Dumazet
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2011-10-12  4:31 UTC (permalink / raw)
  To: MuraliRaja Muniraju
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

Le mardi 11 octobre 2011 à 21:16 -0700, MuraliRaja Muniraju a écrit :
> Eric,
>      I think it would be useful to expose the values this way.
>      I shall make the changes to add TCLASS for ipv6 in the description.
> 
>      Regarding the TIME_WAIT state, I think it would be better of not
> exposing the values because there would hardly be anything to transmit
> during this time.
> 

My remark had nothing to do with your patch actually :

We dont store TOS/TCLASS on TIME_WAIT sockets, so you cannot report
value in inet_diag.

But we do send packets on behalf of TIME_WAIT sockets ;)

Think about ACK messages. They probably are sent with a 0 TOS/TCLASS
field... I am not sure its a problem or not.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12  1:28 ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Muraliraja Muniraju
  2011-10-12  2:48   ` Eric Dumazet
@ 2011-10-12 15:52   ` Stephen Hemminger
  2011-10-12 19:00     ` [PATCH v4] " Muraliraja Muniraju
  1 sibling, 1 reply; 12+ messages in thread
From: Stephen Hemminger @ 2011-10-12 15:52 UTC (permalink / raw)
  To: Muraliraja Muniraju
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

On Tue, 11 Oct 2011 18:28:07 -0700
Muraliraja Muniraju <muralira@google.com> wrote:

> +/* INET_DIAG_TOS */
> +
> +struct inet_diag_tos {
> +	__u8	idiag_tos;
> +};
> +

This structure is not used.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12  4:31       ` Eric Dumazet
@ 2011-10-12 19:00         ` MuraliRaja Muniraju
  2011-10-21 10:29         ` [PATCH net-next] ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT Eric Dumazet
  1 sibling, 0 replies; 12+ messages in thread
From: MuraliRaja Muniraju @ 2011-10-12 19:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

Hi Eric,
     I am planning to not add code for the TIME_WAIT case as the tos
values are zero as they are not saved. Please let me if this is OK.
Thanks,
Murali

On Tue, Oct 11, 2011 at 9:31 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 11 octobre 2011 à 21:16 -0700, MuraliRaja Muniraju a écrit :
>> Eric,
>>      I think it would be useful to expose the values this way.
>>      I shall make the changes to add TCLASS for ipv6 in the description.
>>
>>      Regarding the TIME_WAIT state, I think it would be better of not
>> exposing the values because there would hardly be anything to transmit
>> during this time.
>>
>
> My remark had nothing to do with your patch actually :
>
> We dont store TOS/TCLASS on TIME_WAIT sockets, so you cannot report
> value in inet_diag.
>
> But we do send packets on behalf of TIME_WAIT sockets ;)
>
> Think about ACK messages. They probably are sent with a 0 TOS/TCLASS
> field... I am not sure its a problem or not.
>
>
>
>



-- 
Thanks,
Murali

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12 15:52   ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Stephen Hemminger
@ 2011-10-12 19:00     ` Muraliraja Muniraju
  2011-10-12 19:08       ` Stephen Hemminger
  2011-10-12 19:15       ` Eric Dumazet
  0 siblings, 2 replies; 12+ messages in thread
From: Muraliraja Muniraju @ 2011-10-12 19:00 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy <kabe
  Cc: linux-kernel, netdev, Murali Raja

From: Murali Raja <muralira@google.com>

This patch exposes the tos value for the TCP sockets when the TOS flag
is requested in the ext_flags for the inet_diag request. This would mainly be
used to expose TOS values for both for TCP and UDP sockets. Currently it is
supported for TCP. When netlink support for UDP would be added the support
to expose the TOS values would alse be done. For IPV4 tos value is exposed
and for IPV6 tclass value is exposed.

Signed-off-by: Murali Raja <muralira@google.com>
---
Changelog since v3:
- Removed the tos structure from the inet_diag.h

Changelog since v2:
- Added support for IPv6 and used better API.

Changelog since v3:
- Removed reserved fields.

 include/linux/inet_diag.h |    3 ++-
 net/ipv4/inet_diag.c      |    5 +++++
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
index bc8c490..80b480c 100644
--- a/include/linux/inet_diag.h
+++ b/include/linux/inet_diag.h
@@ -97,9 +97,10 @@ enum {
 	INET_DIAG_INFO,
 	INET_DIAG_VEGASINFO,
 	INET_DIAG_CONG,
+	INET_DIAG_TOS,
 };
 
-#define INET_DIAG_MAX INET_DIAG_CONG
+#define INET_DIAG_MAX INET_DIAG_TOS
 
 
 /* INET_DIAG_MEM */
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 389a2e6..f5e2bda 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -108,6 +108,9 @@ static int inet_csk_diag_fill(struct sock *sk,
 		       icsk->icsk_ca_ops->name);
 	}
 
+	if ((ext & (1 << (INET_DIAG_TOS - 1))) && (sk->sk_family != AF_INET6))
+		RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
+
 	r->idiag_family = sk->sk_family;
 	r->idiag_state = sk->sk_state;
 	r->idiag_timer = 0;
@@ -130,6 +133,8 @@ static int inet_csk_diag_fill(struct sock *sk,
 			       &np->rcv_saddr);
 		ipv6_addr_copy((struct in6_addr *)r->id.idiag_dst,
 			       &np->daddr);
+		if (ext & (1 << (INET_DIAG_TOS - 1)))
+			RTA_PUT_U8(skb, INET_DIAG_TOS, np->tclass);
 	}
 #endif
 
-- 
1.7.3.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12 19:00     ` [PATCH v4] " Muraliraja Muniraju
@ 2011-10-12 19:08       ` Stephen Hemminger
  2011-10-12 19:15       ` Eric Dumazet
  1 sibling, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2011-10-12 19:08 UTC (permalink / raw)
  To: Muraliraja Muniraju
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

On Wed, 12 Oct 2011 12:00:35 -0700
Muraliraja Muniraju <muralira@google.com> wrote:

> From: Murali Raja <muralira@google.com>
> 
> This patch exposes the tos value for the TCP sockets when the TOS flag
> is requested in the ext_flags for the inet_diag request. This would mainly be
> used to expose TOS values for both for TCP and UDP sockets. Currently it is
> supported for TCP. When netlink support for UDP would be added the support
> to expose the TOS values would alse be done. For IPV4 tos value is exposed
> and for IPV6 tclass value is exposed.
> 
> Signed-off-by: Murali Raja <muralira@google.com>
> ---
> Changelog since v3:
> - Removed the tos structure from the inet_diag.h
> 
> Changelog since v2:
> - Added support for IPv6 and used better API.
> 
> Changelog since v3:
> - Removed reserved fields.


After this is accepted; it is trivial to add support to ss command.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12 19:00     ` [PATCH v4] " Muraliraja Muniraju
  2011-10-12 19:08       ` Stephen Hemminger
@ 2011-10-12 19:15       ` Eric Dumazet
  2011-10-12 23:10         ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2011-10-12 19:15 UTC (permalink / raw)
  To: Muraliraja Muniraju
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev

Le mercredi 12 octobre 2011 à 12:00 -0700, Muraliraja Muniraju a écrit :
> From: Murali Raja <muralira@google.com>
> 
> This patch exposes the tos value for the TCP sockets when the TOS flag
> is requested in the ext_flags for the inet_diag request. This would mainly be
> used to expose TOS values for both for TCP and UDP sockets. Currently it is
> supported for TCP. When netlink support for UDP would be added the support
> to expose the TOS values would alse be done. For IPV4 tos value is exposed
> and for IPV6 tclass value is exposed.
> 
> Signed-off-by: Murali Raja <muralira@google.com>
> ---

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] net-netlink: Add a new attribute to expose TOS values via netlink
  2011-10-12 19:15       ` Eric Dumazet
@ 2011-10-12 23:10         ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2011-10-12 23:10 UTC (permalink / raw)
  To: eric.dumazet
  Cc: muralira, kuznet, jmorris, yoshfuji, kaber, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 12 Oct 2011 21:15:34 +0200

> Le mercredi 12 octobre 2011 à 12:00 -0700, Muraliraja Muniraju a écrit :
>> From: Murali Raja <muralira@google.com>
>> 
>> This patch exposes the tos value for the TCP sockets when the TOS flag
>> is requested in the ext_flags for the inet_diag request. This would mainly be
>> used to expose TOS values for both for TCP and UDP sockets. Currently it is
>> supported for TCP. When netlink support for UDP would be added the support
>> to expose the TOS values would alse be done. For IPV4 tos value is exposed
>> and for IPV6 tclass value is exposed.
>> 
>> Signed-off-by: Murali Raja <muralira@google.com>
>> ---
> 
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks everyone.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next] ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
  2011-10-12  4:31       ` Eric Dumazet
  2011-10-12 19:00         ` MuraliRaja Muniraju
@ 2011-10-21 10:29         ` Eric Dumazet
  2011-10-24  7:06           ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2011-10-21 10:29 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, MuraliRaja Muniraju, Andi Kleen, KOVACS Krisztian

There is a long standing bug in linux tcp stack, about ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect TOS field of
incoming message, and this might break some setups.

Example of things that were broken :
  - Routing using TOS as a selector
  - Firewalls
  - Trafic classification / shaping

We now remember in timewait structure the inet tos field and use it in
ACK generation, and route lookup.

Notes :
 - We still reflect incoming TOS in RST messages.
 - We could extend MuraliRaja Muniraju patch to report TOS value in
netlink messages for TIME_WAIT sockets.
 - A patch is needed for IPv6

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: MuraliRaja Muniraju <muralira@google.com>
CC: Andi Kleen <ak@linux.intel.com>
CC: KOVACS Krisztian <hidden@balabit.hu>
---
 include/net/inet_timewait_sock.h |    3 ++-
 include/net/ip.h                 |    3 ++-
 net/ipv4/inet_timewait_sock.c    |    1 +
 net/ipv4/ip_output.c             |    6 +++---
 net/ipv4/tcp_ipv4.c              |   11 +++++++----
 5 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index f1a7709..180231c 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -126,7 +126,8 @@ struct inet_timewait_sock {
 	/* And these are ours. */
 	unsigned int		tw_ipv6only     : 1,
 				tw_transparent  : 1,
-				tw_pad		: 14,	/* 14 bits hole */
+				tw_pad		: 6,	/* 6 bits hole */
+				tw_tos		: 8,
 				tw_ipv6_offset  : 16;
 	kmemcheck_bitfield_end(flags);
 	unsigned long		tw_ttd;
diff --git a/include/net/ip.h b/include/net/ip.h
index c7e066a..eca0ef7 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -165,6 +165,7 @@ struct ip_reply_arg {
 	int	    csumoffset; /* u16 offset of csum in iov[0].iov_base */
 				/* -1 if not needed */ 
 	int	    bound_dev_if;
+	u8  	    tos;
 }; 
 
 #define IP_REPLY_ARG_NOSRCCHECK 1
@@ -175,7 +176,7 @@ static inline __u8 ip_reply_arg_flowi_flags(const struct ip_reply_arg *arg)
 }
 
 void ip_send_reply(struct sock *sk, struct sk_buff *skb, __be32 daddr,
-		   struct ip_reply_arg *arg, unsigned int len);
+		   const struct ip_reply_arg *arg, unsigned int len);
 
 struct ipv4_config {
 	int	log_martians;
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 3c8dfa1..44d65d5 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -183,6 +183,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
 		tw->tw_daddr	    = inet->inet_daddr;
 		tw->tw_rcv_saddr    = inet->inet_rcv_saddr;
 		tw->tw_bound_dev_if = sk->sk_bound_dev_if;
+		tw->tw_tos	    = inet->tos;
 		tw->tw_num	    = inet->inet_num;
 		tw->tw_state	    = TCP_TIME_WAIT;
 		tw->tw_substate	    = state;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e1374ab..0bc95f3 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1466,7 +1466,7 @@ static int ip_reply_glue_bits(void *dptr, char *to, int offset,
  *     	structure to pass arguments.
  */
 void ip_send_reply(struct sock *sk, struct sk_buff *skb, __be32 daddr,
-		   struct ip_reply_arg *arg, unsigned int len)
+		   const struct ip_reply_arg *arg, unsigned int len)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ip_options_data replyopts;
@@ -1489,7 +1489,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, __be32 daddr,
 	}
 
 	flowi4_init_output(&fl4, arg->bound_dev_if, 0,
-			   RT_TOS(ip_hdr(skb)->tos),
+			   RT_TOS(arg->tos),
 			   RT_SCOPE_UNIVERSE, sk->sk_protocol,
 			   ip_reply_arg_flowi_flags(arg),
 			   daddr, rt->rt_spec_dst,
@@ -1506,7 +1506,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, __be32 daddr,
 	   with locally disabled BH and that sk cannot be already spinlocked.
 	 */
 	bh_lock_sock(sk);
-	inet->tos = ip_hdr(skb)->tos;
+	inet->tos = arg->tos;
 	sk->sk_priority = skb->priority;
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 955c925..16687b3 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -652,6 +652,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 	arg.flags = (sk && inet_sk(sk)->transparent) ? IP_REPLY_ARG_NOSRCCHECK : 0;
 
 	net = dev_net(skb_dst(skb)->dev);
+	arg.tos = ip_hdr(skb)->tos;
 	ip_send_reply(net->ipv4.tcp_sock, skb, ip_hdr(skb)->saddr,
 		      &arg, arg.iov[0].iov_len);
 
@@ -666,7 +667,7 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 			    u32 win, u32 ts, int oif,
 			    struct tcp_md5sig_key *key,
-			    int reply_flags)
+			    int reply_flags, u8 tos)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct {
@@ -726,7 +727,7 @@ static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 	arg.csumoffset = offsetof(struct tcphdr, check) / 2;
 	if (oif)
 		arg.bound_dev_if = oif;
-
+	arg.tos = tos;
 	ip_send_reply(net->ipv4.tcp_sock, skb, ip_hdr(skb)->saddr,
 		      &arg, arg.iov[0].iov_len);
 
@@ -743,7 +744,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			tcptw->tw_ts_recent,
 			tw->tw_bound_dev_if,
 			tcp_twsk_md5_key(tcptw),
-			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0
+			tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0,
+			tw->tw_tos
 			);
 
 	inet_twsk_put(tw);
@@ -757,7 +759,8 @@ static void tcp_v4_reqsk_send_ack(struct sock *sk, struct sk_buff *skb,
 			req->ts_recent,
 			0,
 			tcp_v4_md5_do_lookup(sk, ip_hdr(skb)->daddr),
-			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0);
+			inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
+			ip_hdr(skb)->tos);
 }
 
 /*

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
  2011-10-21 10:29         ` [PATCH net-next] ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT Eric Dumazet
@ 2011-10-24  7:06           ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2011-10-24  7:06 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, muralira, ak, hidden

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 21 Oct 2011 12:29:25 +0200

> There is a long standing bug in linux tcp stack, about ACK messages sent
> on behalf of TIME_WAIT sockets.
> 
> In the IP header of the ACK message, we choose to reflect TOS field of
> incoming message, and this might break some setups.
> 
> Example of things that were broken :
>   - Routing using TOS as a selector
>   - Firewalls
>   - Trafic classification / shaping
> 
> We now remember in timewait structure the inet tos field and use it in
> ACK generation, and route lookup.
> 
> Notes :
>  - We still reflect incoming TOS in RST messages.
>  - We could extend MuraliRaja Muniraju patch to report TOS value in
> netlink messages for TIME_WAIT sockets.
>  - A patch is needed for IPv6
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks a lot Eric.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-10-24  7:06 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20111010145206.6f7e9ee2@nehalam.linuxnetplumber.net>
2011-10-12  1:28 ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Muraliraja Muniraju
2011-10-12  2:48   ` Eric Dumazet
2011-10-12  4:16     ` MuraliRaja Muniraju
2011-10-12  4:31       ` Eric Dumazet
2011-10-12 19:00         ` MuraliRaja Muniraju
2011-10-21 10:29         ` [PATCH net-next] ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT Eric Dumazet
2011-10-24  7:06           ` David Miller
2011-10-12 15:52   ` [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink Stephen Hemminger
2011-10-12 19:00     ` [PATCH v4] " Muraliraja Muniraju
2011-10-12 19:08       ` Stephen Hemminger
2011-10-12 19:15       ` Eric Dumazet
2011-10-12 23:10         ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).