* [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg()
@ 2013-09-24 13:43 Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 1/2] ipv4: IP_TOS and IP_TTL can be specified as ancillary data Francesco Fusco
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Francesco Fusco @ 2013-09-24 13:43 UTC (permalink / raw)
To: davem; +Cc: netdev
There is no way to set the IP_TOS field on a per-packet basis in IPv4, while
IPv6 has such a mechanism. Therefore one has to fall back to the setsockopt()
in case of IPv4.
Using the existing per-socket option is not convenient particularly in the
situations where multiple threads have to use the same socket data requiring
per-thread TOS values. In fact this would involve calling setsockopt() before
sendmsg() every time.
Francesco Fusco (2):
ipv4: IP_TOS and IP_TTL can be specified as ancillary data
ipv4: processing ancillary IP_TOS or IP_TTL
include/net/inet_sock.h | 3 +++
include/net/ip.h | 14 ++++++++++++++
include/net/route.h | 1 +
net/ipv4/icmp.c | 5 +++++
net/ipv4/ip_output.c | 13 ++++++++++---
net/ipv4/ip_sockglue.c | 20 +++++++++++++++++++-
net/ipv4/ping.c | 4 +++-
net/ipv4/raw.c | 4 +++-
net/ipv4/udp.c | 4 +++-
9 files changed, 61 insertions(+), 7 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH net-next v3 1/2] ipv4: IP_TOS and IP_TTL can be specified as ancillary data
2013-09-24 13:43 [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() Francesco Fusco
@ 2013-09-24 13:43 ` Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 2/2] ipv4: processing ancillary IP_TOS or IP_TTL Francesco Fusco
2013-09-28 22:22 ` [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() David Miller
2 siblings, 0 replies; 4+ messages in thread
From: Francesco Fusco @ 2013-09-24 13:43 UTC (permalink / raw)
To: davem; +Cc: netdev
This patch enables the IP_TTL and IP_TOS values passed from userspace to
be stored in the ipcm_cookie struct. Three fields are added to the struct:
- the TTL, expressed as __u8.
The allowed values are in the [1-255].
A value of 0 means that the TTL is not specified.
- the TOS, expressed as __s16.
The allowed values are in the range [0,255].
A value of -1 means that the TOS is not specified.
- the priority, expressed as a char and computed when
handling the ancillary data.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
---
v1->v2
- changed the icmp_cookie ttl field from __s16 to __u8.
A value of 0 means that the TTL has not been specified
- to tos field is still __s16. The user can specify
values in the range 0-255 included, therefore I use
a value of -1 as a flag saying that the value has
not been specified
- the priority it is now a char instead of a __u32,
which is the return type of rt_tos2priority
- improved commit message
v1->v2
- no code changes, rebase to net-next
include/net/ip.h | 3 +++
net/ipv4/ip_sockglue.c | 20 +++++++++++++++++++-
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/include/net/ip.h b/include/net/ip.h
index c1f192b..0135f38 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -56,6 +56,9 @@ struct ipcm_cookie {
int oif;
struct ip_options_rcu *opt;
__u8 tx_flags;
+ __u8 ttl;
+ __s16 tos;
+ char priority;
};
#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index d9c4f11..56e3445 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -189,7 +189,7 @@ EXPORT_SYMBOL(ip_cmsg_recv);
int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc)
{
- int err;
+ int err, val;
struct cmsghdr *cmsg;
for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
@@ -215,6 +215,24 @@ int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc)
ipc->addr = info->ipi_spec_dst.s_addr;
break;
}
+ case IP_TTL:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 1 || val > 255)
+ return -EINVAL;
+ ipc->ttl = val;
+ break;
+ case IP_TOS:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
+ return -EINVAL;
+ val = *(int *)CMSG_DATA(cmsg);
+ if (val < 0 || val > 255)
+ return -EINVAL;
+ ipc->tos = val;
+ ipc->priority = rt_tos2priority(ipc->tos);
+ break;
+
default:
return -EINVAL;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH net-next v3 2/2] ipv4: processing ancillary IP_TOS or IP_TTL
2013-09-24 13:43 [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 1/2] ipv4: IP_TOS and IP_TTL can be specified as ancillary data Francesco Fusco
@ 2013-09-24 13:43 ` Francesco Fusco
2013-09-28 22:22 ` [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() David Miller
2 siblings, 0 replies; 4+ messages in thread
From: Francesco Fusco @ 2013-09-24 13:43 UTC (permalink / raw)
To: davem; +Cc: netdev
If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
packets with the specified TTL or TOS overriding the socket values specified
with the traditional setsockopt().
The struct inet_cork stores the values of TOS, TTL and priority that are
passed through the struct ipcm_cookie. If there are user-specified TOS
(tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
used to override the per-socket values. In case of TOS also the priority
is changed accordingly.
Two helper functions get_rttos and get_rtconn_flags are defined to take
into account the presence of a user specified TOS value when computing
RT_TOS and RT_CONN_FLAGS.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
---
v1->v2
- reworked the entire patch
- modified the ttl field in the struct inet_cork from __s16 to __u8:
0 means that the TTL is not specified
- the tos field in the struct inet_cork is still __s16:
-1 means tha the tos is not set
- modified the priority field in the struct inet_cork from __u32 to
char.
- introduced the get_rttos and get_rtconn_flags functions
v2->v3
- no code changes, rebase to net-next
include/net/inet_sock.h | 3 +++
include/net/ip.h | 11 +++++++++++
include/net/route.h | 1 +
net/ipv4/icmp.c | 5 +++++
net/ipv4/ip_output.c | 13 ++++++++++---
net/ipv4/ping.c | 4 +++-
net/ipv4/raw.c | 4 +++-
net/ipv4/udp.c | 4 +++-
8 files changed, 39 insertions(+), 6 deletions(-)
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 636d203..f314177 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -103,6 +103,9 @@ struct inet_cork {
int length; /* Total length of all frames */
struct dst_entry *dst;
u8 tx_flags;
+ __u8 ttl;
+ __s16 tos;
+ char priority;
};
struct inet_cork_full {
diff --git a/include/net/ip.h b/include/net/ip.h
index 0135f38..77b4f9b 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -28,6 +28,7 @@
#include <linux/skbuff.h>
#include <net/inet_sock.h>
+#include <net/route.h>
#include <net/snmp.h>
#include <net/flow.h>
@@ -140,6 +141,16 @@ static inline struct sk_buff *ip_finish_skb(struct sock *sk, struct flowi4 *fl4)
return __ip_make_skb(sk, fl4, &sk->sk_write_queue, &inet_sk(sk)->cork.base);
}
+static inline __u8 get_rttos(struct ipcm_cookie* ipc, struct inet_sock *inet)
+{
+ return (ipc->tos != -1) ? RT_TOS(ipc->tos) : RT_TOS(inet->tos);
+}
+
+static inline __u8 get_rtconn_flags(struct ipcm_cookie* ipc, struct sock* sk)
+{
+ return (ipc->tos != -1) ? RT_CONN_FLAGS_TOS(sk, ipc->tos) : RT_CONN_FLAGS(sk);
+}
+
/* datagram.c */
int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len);
diff --git a/include/net/route.h b/include/net/route.h
index 6f572ca..0ad8e01 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -39,6 +39,7 @@
#define RTO_ONLINK 0x01
#define RT_CONN_FLAGS(sk) (RT_TOS(inet_sk(sk)->tos) | sock_flag(sk, SOCK_LOCALROUTE))
+#define RT_CONN_FLAGS_TOS(sk,tos) (RT_TOS(tos) | sock_flag(sk, SOCK_LOCALROUTE))
struct fib_nh;
struct fib_info;
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 5f7d11a..5c0e8bc 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -353,6 +353,9 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb)
saddr = fib_compute_spec_dst(skb);
ipc.opt = NULL;
ipc.tx_flags = 0;
+ ipc.ttl = 0;
+ ipc.tos = -1;
+
if (icmp_param->replyopts.opt.opt.optlen) {
ipc.opt = &icmp_param->replyopts.opt;
if (ipc.opt->opt.srr)
@@ -608,6 +611,8 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
ipc.addr = iph->saddr;
ipc.opt = &icmp_param->replyopts.opt;
ipc.tx_flags = 0;
+ ipc.ttl = 0;
+ ipc.tos = -1;
rt = icmp_route_lookup(net, &fl4, skb_in, iph, saddr, tos,
type, code, icmp_param);
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index a04d872..7d8357b 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1060,6 +1060,9 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
rt->dst.dev->mtu : dst_mtu(&rt->dst);
cork->dst = &rt->dst;
cork->length = 0;
+ cork->ttl = ipc->ttl;
+ cork->tos = ipc->tos;
+ cork->priority = ipc->priority;
cork->tx_flags = ipc->tx_flags;
return 0;
@@ -1311,7 +1314,9 @@ struct sk_buff *__ip_make_skb(struct sock *sk,
if (cork->flags & IPCORK_OPT)
opt = cork->opt;
- if (rt->rt_type == RTN_MULTICAST)
+ if (cork->ttl != 0)
+ ttl = cork->ttl;
+ else if (rt->rt_type == RTN_MULTICAST)
ttl = inet->mc_ttl;
else
ttl = ip_select_ttl(inet, &rt->dst);
@@ -1319,7 +1324,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk,
iph = ip_hdr(skb);
iph->version = 4;
iph->ihl = 5;
- iph->tos = inet->tos;
+ iph->tos = (cork->tos != -1) ? cork->tos : inet->tos;
iph->frag_off = df;
iph->ttl = ttl;
iph->protocol = sk->sk_protocol;
@@ -1331,7 +1336,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk,
ip_options_build(skb, opt, cork->addr, rt, 0);
}
- skb->priority = sk->sk_priority;
+ skb->priority = (cork->tos != -1) ? cork->priority: sk->sk_priority;
skb->mark = sk->sk_mark;
/*
* Steal rt from cork.dst to avoid a pair of atomic_inc/atomic_dec
@@ -1481,6 +1486,8 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
ipc.addr = daddr;
ipc.opt = NULL;
ipc.tx_flags = 0;
+ ipc.ttl = 0;
+ ipc.tos = -1;
if (replyopts.opt.opt.optlen) {
ipc.opt = &replyopts.opt;
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index d7d9882..706d108e 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -713,6 +713,8 @@ int ping_v4_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
ipc.opt = NULL;
ipc.oif = sk->sk_bound_dev_if;
ipc.tx_flags = 0;
+ ipc.ttl = 0;
+ ipc.tos = -1;
sock_tx_timestamp(sk, &ipc.tx_flags);
@@ -744,7 +746,7 @@ int ping_v4_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
return -EINVAL;
faddr = ipc.opt->opt.faddr;
}
- tos = RT_TOS(inet->tos);
+ tos = get_rttos(&ipc, inet);
if (sock_flag(sk, SOCK_LOCALROUTE) ||
(msg->msg_flags & MSG_DONTROUTE) ||
(ipc.opt && ipc.opt->opt.is_strictroute)) {
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index bfec521..a3fe534 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -517,6 +517,8 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
ipc.addr = inet->inet_saddr;
ipc.opt = NULL;
ipc.tx_flags = 0;
+ ipc.ttl = 0;
+ ipc.tos = -1;
ipc.oif = sk->sk_bound_dev_if;
if (msg->msg_controllen) {
@@ -556,7 +558,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
daddr = ipc.opt->opt.faddr;
}
}
- tos = RT_CONN_FLAGS(sk);
+ tos = get_rtconn_flags(&ipc, sk);
if (msg->msg_flags & MSG_DONTROUTE)
tos |= RTO_ONLINK;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 74d2c95..22462d94 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -855,6 +855,8 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
ipc.opt = NULL;
ipc.tx_flags = 0;
+ ipc.ttl = 0;
+ ipc.tos = -1;
getfrag = is_udplite ? udplite_getfrag : ip_generic_getfrag;
@@ -938,7 +940,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
faddr = ipc.opt->opt.faddr;
connected = 0;
}
- tos = RT_TOS(inet->tos);
+ tos = get_rttos(&ipc, inet);
if (sock_flag(sk, SOCK_LOCALROUTE) ||
(msg->msg_flags & MSG_DONTROUTE) ||
(ipc.opt && ipc.opt->opt.is_strictroute)) {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg()
2013-09-24 13:43 [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 1/2] ipv4: IP_TOS and IP_TTL can be specified as ancillary data Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 2/2] ipv4: processing ancillary IP_TOS or IP_TTL Francesco Fusco
@ 2013-09-28 22:22 ` David Miller
2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2013-09-28 22:22 UTC (permalink / raw)
To: ffusco; +Cc: netdev
From: Francesco Fusco <ffusco@redhat.com>
Date: Tue, 24 Sep 2013 15:43:07 +0200
> There is no way to set the IP_TOS field on a per-packet basis in IPv4, while
> IPv6 has such a mechanism. Therefore one has to fall back to the setsockopt()
> in case of IPv4.
>
> Using the existing per-socket option is not convenient particularly in the
> situations where multiple threads have to use the same socket data requiring
> per-thread TOS values. In fact this would involve calling setsockopt() before
> sendmsg() every time.
Series applied, thank you.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-09-28 22:15 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-24 13:43 [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 1/2] ipv4: IP_TOS and IP_TTL can be specified as ancillary data Francesco Fusco
2013-09-24 13:43 ` [PATCH net-next v3 2/2] ipv4: processing ancillary IP_TOS or IP_TTL Francesco Fusco
2013-09-28 22:22 ` [PATCH net-next v3 0/2] ipv4: per-datagram IP_TOS and IP_TTL via sendmsg() David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).