[RFC net-next 0/4] Support UID range routing.

netdev.vger.kernel.org archive mirror

* [RFC net-next 0/4] Support UID range routing.
@ 2014-04-26  4:48 Lorenzo Colitti
  2014-04-26  4:48 ` [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output Lorenzo Colitti
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-26  4:48 UTC
  To: netdev; +Cc: hannes, davem, jpa, Lorenzo Colitti

In some environments it is useful to route packets differently
based on the user ID. This can be done with iptables owner match
and MASQUERADE, but that forces the use of iptables to fix up
parameters such as MSS and imposes a per-packet cost, plus it
breaks applications that expect end-to-end connectivity.

This patch series adds support for routing on the UID that owns
the socket and allows userspace to configure routing rules based
on UID ranges.
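
As a rough illustration of the rule semantics described above, here is a
minimal userspace sketch, not the kernel code. It assumes a rule either
carries no UID range (and so matches every UID) or carries an inclusive
[start, end] range, which is how fib_uid_range_match in patch 2/4 behaves.
The names uid_rule, uid_rule_match, uid_lookup_table and UID_INVALID are
hypothetical, and plain uint32_t values stand in for kuid_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for an invalid kuid_t. */
#define UID_INVALID UINT32_MAX

struct uid_rule {
	uint32_t uid_start;	/* first UID covered, or UID_INVALID */
	uint32_t uid_end;	/* last UID covered (inclusive), or UID_INVALID */
	int table;		/* routing table selected on a match */
};

/* A rule with no range matches every UID; otherwise the range is inclusive. */
static bool uid_rule_match(const struct uid_rule *r, uint32_t uid)
{
	if (r->uid_start == UID_INVALID && r->uid_end == UID_INVALID)
		return true;
	return uid >= r->uid_start && uid <= r->uid_end;
}

/* Return the table of the first matching rule, or -1 if none matches. */
static int uid_lookup_table(const struct uid_rule *rules, size_t n,
			    uint32_t uid)
{
	assert(rules != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (uid_rule_match(&rules[i], uid))
			return rules[i].table;
	return -1;
}
```

With a rule covering UIDs 1000-1999 that selects table 100, followed by a
rule with no range that selects table 254 (both table numbers are made up
for the example), UID 1500 resolves to table 100 and UID 0 falls through
to table 254. The sketch ignores every other rule selector (mark,
interface, FIB_RULE_INVERT and so on).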

Points I'd like feedback on:

1. The code uses sock_i_uid, which grabs sk_callback_lock.
   Is that necessary? For example, xt_owner doesn't grab it - it
   just dereferences sk->sk_socket->file. If it is necessary, I
   don't know how much contention it can cause. Should UID
   routing be made a config option as a result?
2. This patch defines new fib attributes (FRA_UID_START and
   FRA_UID_END) at the end of the currently-defined range.
   Should it instead replace some FRA_UNUSED_x attributes?
3. Is it a bad idea to use two attributes? I played around with
   making this an array of two integers, or a struct, but the
   results seemed uglier than the current code.
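
To make the two-attribute question concrete, here is a small userspace
sketch of the validation that a pair of separate start/end attributes
requires. It follows the checks in fib_nl_newrule in patch 2/4 (both
bounds present or both absent, both valid, start <= end). The names
uid_range, uid_range_parse and UID_INVALID are hypothetical, and the
choice of -EINVAL as the error code is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for INVALID_UID. */
#define UID_INVALID UINT32_MAX

struct uid_range {
	uint32_t start;
	uint32_t end;
};

/*
 * start and end point at the attribute payloads, or are NULL when the
 * attribute was not supplied. Returns 0 on success and -EINVAL on a
 * malformed pair; on failure *out is left with no range set.
 */
static int uid_range_parse(const uint32_t *start, const uint32_t *end,
			   struct uid_range *out)
{
	assert(out != NULL);
	out->start = out->end = UID_INVALID;

	if (!start && !end)
		return 0;		/* no range: rule applies to all UIDs */
	if (!start || !end)
		return -EINVAL;		/* only one bound was given */
	if (*start == UID_INVALID || *end == UID_INVALID || *start > *end)
		return -EINVAL;		/* invalid bound or inverted range */

	out->start = *start;
	out->end = *end;
	return 0;
}
```

A single struct-valued attribute would make the "only one bound given"
case impossible by construction, which is the main trade-off against the
simpler per-attribute policy entries.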

Limitations:

1. Sockets that have been closed have no UID any more. I think
   xt_owner also has this limitation - it's because the UID
   is in the struct socket, which is gone at that point. This
   could be fixed by writing the UID back into the struct sock
   when orphaning the socket.
2. Path MTU discovery does not (yet) specify the UID in the
   routing lookup to clone the route. This is not hard to fix
   but I haven't gotten around to it yet. A packet too big or DF
   needed packet will still affect the MTU of the socket that
   caused it though.

Tested:

Black-box tested using user-mode Linux by pointing different
UIDs to different TAP interfaces. Tested the following in IPv4
and IPv6:

- TCP inbound and outbound connections
- UDP send and connect+send
- Ping
- Userspace communication using a patched ip binary:
  - UID range rule add / delete
  - Route lookup with a UID

Lorenzo Colitti (4):
  net: ipv6: Introduce flowi6_init_output.
  net: core: Add a UID range to fib rules.
  net: core: Add the UID to flowi[46]_init_output.
  net: core: Add a RTA_UID attribute to routes.

 include/net/fib_rules.h          |  6 ++++-
 include/net/flow.h               | 31 ++++++++++++++++++++++-
 include/net/ip.h                 |  1 +
 include/net/route.h              |  5 ++--
 include/uapi/linux/fib_rules.h   |  2 ++
 include/uapi/linux/rtnetlink.h   |  1 +
 net/core/fib_rules.c             | 53 ++++++++++++++++++++++++++++++++++++++--
 net/ipv4/fib_frontend.c          |  1 +
 net/ipv4/inet_connection_sock.c  |  6 +++--
 net/ipv4/ip_output.c             |  3 ++-
 net/ipv4/ping.c                  |  3 ++-
 net/ipv4/raw.c                   |  3 ++-
 net/ipv4/route.c                 | 19 +++++++++-----
 net/ipv4/syncookies.c            |  3 ++-
 net/ipv4/udp.c                   |  3 ++-
 net/ipv6/af_inet6.c              | 13 ++++------
 net/ipv6/datagram.c              | 12 ++++-----
 net/ipv6/inet6_connection_sock.c | 25 ++++++++-----------
 net/ipv6/raw.c                   |  1 +
 net/ipv6/route.c                 |  7 ++++++
 net/ipv6/syncookies.c            | 13 +++++-----
 net/ipv6/tcp_ipv6.c              | 12 ++++-----
 net/ipv6/udp.c                   |  1 +
 23 files changed, 161 insertions(+), 63 deletions(-)

-- 
1.9.1.423.g4596e3a


* [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output.
  2014-04-26  4:48 [RFC net-next 0/4] Support UID range routing Lorenzo Colitti
@ 2014-04-26  4:48 ` Lorenzo Colitti
  2014-04-26  5:56   ` Julian Anastasov
  2014-04-26  4:48 ` [RFC net-next 2/4] net: core: Add a UID range to fib rules Lorenzo Colitti
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-26  4:48 UTC
  To: netdev; +Cc: hannes, davem, jpa, Lorenzo Colitti

This is consistent with IPv4, and is a bit more compact. Also, by
forcing all common flowi6 parameters to be explicitly specified,
it makes it easier to see which parameters are being set and
which are being defaulted to zero. So, for example, no more
forgetting to do "fl6.fl6_mark = sk->sk_mark" and having to fix
it later like in net-next bf439b3.
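
The idea can be shown outside the kernel with a simplified sketch. The
struct flow_key and flow_key_init_output below are hypothetical stand-ins
for struct flowi6 and flowi6_init_output, with far fewer fields, and the
addresses are passed by pointer rather than by value as in the patch. The
point is only that a caller cannot compile without supplying every
parameter, so an omitted mark becomes a build error instead of a silent
zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct flowi6. */
struct flow_key {
	int      oif;
	uint32_t mark;
	uint8_t  proto;
	uint8_t  flags;
	uint32_t flowlabel;
	uint8_t  daddr[16];
	uint8_t  saddr[16];
	uint16_t dport;
	uint16_t sport;
};

/* Every common field is an explicit argument; none is defaulted silently. */
static inline void flow_key_init_output(struct flow_key *k, int oif,
					uint32_t mark, uint8_t proto,
					uint8_t flags, uint32_t flowlabel,
					const uint8_t daddr[16],
					const uint8_t saddr[16],
					uint16_t dport, uint16_t sport)
{
	assert(k != NULL && daddr != NULL && saddr != NULL);
	k->oif = oif;
	k->mark = mark;
	k->proto = proto;
	k->flags = flags;
	k->flowlabel = flowlabel;
	memcpy(k->daddr, daddr, sizeof(k->daddr));
	memcpy(k->saddr, saddr, sizeof(k->saddr));
	k->dport = dport;
	k->sport = sport;
}
```

A call site then reads as one statement listing all ten values, with a
literal 0 wherever a field is intentionally left unset.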

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/flow.h               | 20 ++++++++++++++++++++
 net/ipv6/af_inet6.c              | 12 ++++--------
 net/ipv6/datagram.c              | 11 ++++-------
 net/ipv6/inet6_connection_sock.c | 23 ++++++++---------------
 net/ipv6/syncookies.c            | 12 +++++-------
 net/ipv6/tcp_ipv6.c              | 11 ++++-------
 6 files changed, 45 insertions(+), 44 deletions(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index 8109a15..84044af 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -150,6 +150,26 @@ struct flowidn {
 #define fld_dport		uli.ports.dport
 } __attribute__((__aligned__(BITS_PER_LONG/8)));
 
+static inline void flowi6_init_output(struct flowi6 *fl6, int oif,
+				      __u32 mark, __u8 proto, __u8 flags,
+				      __be32 flowlabel,
+				      struct in6_addr daddr,
+				      struct in6_addr saddr,
+				      __be16 dport, __be16 sport)
+{
+	fl6->flowi6_oif = oif;
+	fl6->flowi6_iif = 0;
+	fl6->flowi6_mark = mark;
+	fl6->flowi6_proto = proto;
+	fl6->flowi6_flags = flags;
+	fl6->flowi6_secid = 0;
+	fl6->daddr = daddr;
+	fl6->saddr = saddr;
+	fl6->flowlabel = flowlabel;
+	fl6->fl6_dport = dport;
+	fl6->fl6_sport = sport;
+}
+
 struct flowi {
 	union {
 		struct flowi_common	__fl_common;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index d935889..f8c11d2 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -649,14 +649,10 @@ int inet6_sk_rebuild_header(struct sock *sk)
 		struct flowi6 fl6;
 
 		memset(&fl6, 0, sizeof(fl6));
-		fl6.flowi6_proto = sk->sk_protocol;
-		fl6.daddr = sk->sk_v6_daddr;
-		fl6.saddr = np->saddr;
-		fl6.flowlabel = np->flow_label;
-		fl6.flowi6_oif = sk->sk_bound_dev_if;
-		fl6.flowi6_mark = sk->sk_mark;
-		fl6.fl6_dport = inet->inet_dport;
-		fl6.fl6_sport = inet->inet_sport;
+		flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
+				   sk->sk_protocol, 0, np->flow_label,
+				   sk->sk_v6_daddr, np->saddr,
+				   inet->inet_dport, inet->inet_sport);
 		security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
 
 		final_p = fl6_update_dst(&fl6, np->opt, &final);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index c3bf2d2..f15c165 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -154,13 +154,10 @@ ipv4_connected:
 	 *	destination cache for it.
 	 */
 
-	fl6.flowi6_proto = sk->sk_protocol;
-	fl6.daddr = sk->sk_v6_daddr;
-	fl6.saddr = np->saddr;
-	fl6.flowi6_oif = sk->sk_bound_dev_if;
-	fl6.flowi6_mark = sk->sk_mark;
-	fl6.fl6_dport = inet->inet_dport;
-	fl6.fl6_sport = inet->inet_sport;
+	flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
+			   sk->sk_protocol, 0, fl6.flowlabel,
+			   sk->sk_v6_daddr, np->saddr,
+			   inet->inet_dport, inet->inet_sport);
 
 	if (!fl6.flowi6_oif && (addr_type&IPV6_ADDR_MULTICAST))
 		fl6.flowi6_oif = np->mcast_oif;
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index d4ade34..47f2272 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -76,14 +76,11 @@ struct dst_entry *inet6_csk_route_req(struct sock *sk,
 	struct dst_entry *dst;
 
 	memset(fl6, 0, sizeof(*fl6));
-	fl6->flowi6_proto = IPPROTO_TCP;
-	fl6->daddr = ireq->ir_v6_rmt_addr;
+	flowi6_init_output(fl6, ireq->ir_iif, sk->sk_mark,
+			   IPPROTO_TCP, 0, 0,
+			   ireq->ir_v6_rmt_addr, ireq->ir_v6_loc_addr,
+			   ireq->ir_rmt_port, htons(ireq->ir_num));
 	final_p = fl6_update_dst(fl6, np->opt, &final);
-	fl6->saddr = ireq->ir_v6_loc_addr;
-	fl6->flowi6_oif = ireq->ir_iif;
-	fl6->flowi6_mark = sk->sk_mark;
-	fl6->fl6_dport = ireq->ir_rmt_port;
-	fl6->fl6_sport = htons(ireq->ir_num);
 	security_req_classify_flow(req, flowi6_to_flowi(fl6));
 
 	dst = ip6_dst_lookup_flow(sk, fl6, final_p);
@@ -201,15 +198,11 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
 	struct dst_entry *dst;
 
 	memset(fl6, 0, sizeof(*fl6));
-	fl6->flowi6_proto = sk->sk_protocol;
-	fl6->daddr = sk->sk_v6_daddr;
-	fl6->saddr = np->saddr;
-	fl6->flowlabel = np->flow_label;
+	flowi6_init_output(fl6, sk->sk_bound_dev_if, sk->sk_mark,
+			   sk->sk_protocol, 0, np->flow_label,
+			   sk->sk_v6_daddr, np->saddr,
+			   inet->inet_dport, inet->inet_sport);
 	IP6_ECN_flow_xmit(sk, fl6->flowlabel);
-	fl6->flowi6_oif = sk->sk_bound_dev_if;
-	fl6->flowi6_mark = sk->sk_mark;
-	fl6->fl6_sport = inet->inet_sport;
-	fl6->fl6_dport = inet->inet_dport;
 	security_sk_classify_flow(sk, flowi6_to_flowi(fl6));
 
 	final_p = fl6_update_dst(fl6, np->opt, &final);
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index bb53a5e7..09bb685 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -237,14 +237,12 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
 		struct in6_addr *final_p, final;
 		struct flowi6 fl6;
 		memset(&fl6, 0, sizeof(fl6));
-		fl6.flowi6_proto = IPPROTO_TCP;
-		fl6.daddr = ireq->ir_v6_rmt_addr;
+		flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
+				   IPPROTO_TCP, 0, 0,
+				   ireq->ir_v6_rmt_addr, ireq->ir_v6_loc_addr,
+				   ireq->ir_rmt_port, inet_sk(sk)->inet_sport);
+
 		final_p = fl6_update_dst(&fl6, np->opt, &final);
-		fl6.saddr = ireq->ir_v6_loc_addr;
-		fl6.flowi6_oif = sk->sk_bound_dev_if;
-		fl6.flowi6_mark = sk->sk_mark;
-		fl6.fl6_dport = ireq->ir_rmt_port;
-		fl6.fl6_sport = inet_sk(sk)->inet_sport;
 		security_req_classify_flow(req, flowi6_to_flowi(&fl6));
 
 		dst = ip6_dst_lookup_flow(sk, &fl6, final_p);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e289830..8f4f68a 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -243,13 +243,10 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 	if (!ipv6_addr_any(&sk->sk_v6_rcv_saddr))
 		saddr = &sk->sk_v6_rcv_saddr;
 
-	fl6.flowi6_proto = IPPROTO_TCP;
-	fl6.daddr = sk->sk_v6_daddr;
-	fl6.saddr = saddr ? *saddr : np->saddr;
-	fl6.flowi6_oif = sk->sk_bound_dev_if;
-	fl6.flowi6_mark = sk->sk_mark;
-	fl6.fl6_dport = usin->sin6_port;
-	fl6.fl6_sport = inet->inet_sport;
+	flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
+			   IPPROTO_TCP, 0, fl6.flowlabel,
+			   sk->sk_v6_daddr, saddr ? *saddr : np->saddr,
+			   usin->sin6_port, inet->inet_sport);
 
 	final_p = fl6_update_dst(&fl6, np->opt, &final);
 
-- 
1.9.1.423.g4596e3a


* [RFC net-next 2/4] net: core: Add a UID range to fib rules.
  2014-04-26  4:48 [RFC net-next 0/4] Support UID range routing Lorenzo Colitti
  2014-04-26  4:48 ` [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output Lorenzo Colitti
@ 2014-04-26  4:48 ` Lorenzo Colitti
  2014-04-26  4:48 ` [RFC net-next 3/4] net: core: Add the UID to flowi[46]_init_output Lorenzo Colitti
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-26  4:48 UTC
  To: netdev; +Cc: hannes, davem, jpa, Lorenzo Colitti

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/fib_rules.h        |  6 ++++-
 include/net/flow.h             |  5 ++++
 include/uapi/linux/fib_rules.h |  2 ++
 net/core/fib_rules.c           | 53 ++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index e584de1..cb4470d 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -28,6 +28,8 @@ struct fib_rule {
 	int			suppress_prefixlen;
 	char			iifname[IFNAMSIZ];
 	char			oifname[IFNAMSIZ];
+	kuid_t			uid_start;
+	kuid_t			uid_end;
 	struct rcu_head		rcu;
 };
 
@@ -88,7 +90,9 @@ struct fib_rules_ops {
 	[FRA_TABLE]     = { .type = NLA_U32 }, \
 	[FRA_SUPPRESS_PREFIXLEN] = { .type = NLA_U32 }, \
 	[FRA_SUPPRESS_IFGROUP] = { .type = NLA_U32 }, \
-	[FRA_GOTO]	= { .type = NLA_U32 }
+	[FRA_GOTO]	= { .type = NLA_U32 }, \
+	[FRA_UID_START]	= { .type = NLA_U32 }, \
+	[FRA_UID_END]	= { .type = NLA_U32 }
 
 static inline void fib_rule_get(struct fib_rule *rule)
 {
diff --git a/include/net/flow.h b/include/net/flow.h
index 84044af..9828829 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -10,6 +10,7 @@
 #include <linux/socket.h>
 #include <linux/in6.h>
 #include <linux/atomic.h>
+#include <linux/uidgid.h>
 
 /*
  * ifindex generation is per-net namespace, and loopback is
@@ -30,6 +31,7 @@ struct flowi_common {
 #define FLOWI_FLAG_ANYSRC		0x01
 #define FLOWI_FLAG_KNOWN_NH		0x02
 	__u32	flowic_secid;
+	kuid_t	flowic_uid;
 };
 
 union flowi_uli {
@@ -66,6 +68,7 @@ struct flowi4 {
 #define flowi4_proto		__fl_common.flowic_proto
 #define flowi4_flags		__fl_common.flowic_flags
 #define flowi4_secid		__fl_common.flowic_secid
+#define flowi4_uid		__fl_common.flowic_uid
 
 	/* (saddr,daddr) must be grouped, same order as in IP header */
 	__be32			saddr;
@@ -122,6 +125,7 @@ struct flowi6 {
 #define flowi6_proto		__fl_common.flowic_proto
 #define flowi6_flags		__fl_common.flowic_flags
 #define flowi6_secid		__fl_common.flowic_secid
+#define flowi6_uid		__fl_common.flowic_uid
 	struct in6_addr		daddr;
 	struct in6_addr		saddr;
 	__be32			flowlabel;
@@ -185,6 +189,7 @@ struct flowi {
 #define flowi_proto	u.__fl_common.flowic_proto
 #define flowi_flags	u.__fl_common.flowic_flags
 #define flowi_secid	u.__fl_common.flowic_secid
+#define flowi_uid	u.__fl_common.flowic_uid
 } __attribute__((__aligned__(BITS_PER_LONG/8)));
 
 static inline struct flowi *flowi4_to_flowi(struct flowi4 *fl4)
diff --git a/include/uapi/linux/fib_rules.h b/include/uapi/linux/fib_rules.h
index 2b82d7e..743e300 100644
--- a/include/uapi/linux/fib_rules.h
+++ b/include/uapi/linux/fib_rules.h
@@ -49,6 +49,8 @@ enum {
 	FRA_TABLE,	/* Extended table id */
 	FRA_FWMASK,	/* mask for netfilter mark */
 	FRA_OIFNAME,
+	FRA_UID_START,	/* UID range */
+	FRA_UID_END,
 	__FRA_MAX
 };
 
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 185c341..5cbcdfd 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -31,6 +31,8 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
 	r->pref = pref;
 	r->table = table;
 	r->flags = flags;
+	r->uid_start = INVALID_UID;
+	r->uid_end = INVALID_UID;
 	r->fr_net = hold_net(ops->fro_net);
 
 	r->suppress_prefixlen = -1;
@@ -182,6 +184,23 @@ void fib_rules_unregister(struct fib_rules_ops *ops)
 }
 EXPORT_SYMBOL_GPL(fib_rules_unregister);
 
+static inline kuid_t fib_nl_uid(struct nlattr *nla)
+{
+	return make_kuid(current_user_ns(), nla_get_u32(nla));
+}
+
+static int nla_put_uid(struct sk_buff *skb, int idx, kuid_t uid)
+{
+	return nla_put_u32(skb, idx, from_kuid_munged(current_user_ns(), uid));
+}
+
+static int fib_uid_range_match(struct flowi *fl, struct fib_rule *rule)
+{
+	return (!uid_valid(rule->uid_start) && !uid_valid(rule->uid_end)) ||
+	       (uid_gte(fl->flowi_uid, rule->uid_start) &&
+		uid_lte(fl->flowi_uid, rule->uid_end));
+}
+
 static int fib_rule_match(struct fib_rule *rule, struct fib_rules_ops *ops,
 			  struct flowi *fl, int flags)
 {
@@ -196,6 +215,9 @@ static int fib_rule_match(struct fib_rule *rule, struct fib_rules_ops *ops,
 	if ((rule->mark ^ fl->flowi_mark) & rule->mark_mask)
 		goto out;
 
+	if (!fib_uid_range_match(fl, rule))
+		goto out;
+
 	ret = ops->match(rule, fl, flags);
 out:
 	return (rule->flags & FIB_RULE_INVERT) ? !ret : ret;
@@ -378,6 +400,19 @@ static int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh)
 	} else if (rule->action == FR_ACT_GOTO)
 		goto errout_free;
 
+	/* UID start and end must either both be valid or both unspecified. */
+	rule->uid_start = rule->uid_end = INVALID_UID;
+	if (tb[FRA_UID_START] || tb[FRA_UID_END]) {
+		if (tb[FRA_UID_START] && tb[FRA_UID_END]) {
+			rule->uid_start = fib_nl_uid(tb[FRA_UID_START]);
+			rule->uid_end = fib_nl_uid(tb[FRA_UID_END]);
+		}
+		if (!uid_valid(rule->uid_start) ||
+		    !uid_valid(rule->uid_end) ||
+		    !uid_lte(rule->uid_start, rule->uid_end))
+			goto errout_free;
+	}
+
 	err = ops->configure(rule, skb, frh, tb);
 	if (err < 0)
 		goto errout_free;
@@ -484,6 +519,14 @@ static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh)
 		    (rule->mark_mask != nla_get_u32(tb[FRA_FWMASK])))
 			continue;
 
+		if (tb[FRA_UID_START] &&
+		    !uid_eq(rule->uid_start, fib_nl_uid(tb[FRA_UID_START])))
+			continue;
+
+		if (tb[FRA_UID_END] &&
+		    !uid_eq(rule->uid_end, fib_nl_uid(tb[FRA_UID_END])))
+			continue;
+
 		if (!ops->compare(rule, frh, tb))
 			continue;
 
@@ -542,7 +585,9 @@ static inline size_t fib_rule_nlmsg_size(struct fib_rules_ops *ops,
 			 + nla_total_size(4) /* FRA_SUPPRESS_PREFIXLEN */
 			 + nla_total_size(4) /* FRA_SUPPRESS_IFGROUP */
 			 + nla_total_size(4) /* FRA_FWMARK */
-			 + nla_total_size(4); /* FRA_FWMASK */
+			 + nla_total_size(4) /* FRA_FWMASK */
+			 + nla_total_size(4) /* FRA_UID_START */
+			 + nla_total_size(4); /* FRA_UID_END */
 
 	if (ops->nlmsg_payload)
 		payload += ops->nlmsg_payload(rule);
@@ -598,7 +643,11 @@ static int fib_nl_fill_rule(struct sk_buff *skb, struct fib_rule *rule,
 	    ((rule->mark_mask || rule->mark) &&
 	     nla_put_u32(skb, FRA_FWMASK, rule->mark_mask)) ||
 	    (rule->target &&
-	     nla_put_u32(skb, FRA_GOTO, rule->target)))
+	     nla_put_u32(skb, FRA_GOTO, rule->target)) ||
+	    (uid_valid(rule->uid_start) &&
+	     nla_put_uid(skb, FRA_UID_START, rule->uid_start)) ||
+	    (uid_valid(rule->uid_end) &&
+	     nla_put_uid(skb, FRA_UID_END, rule->uid_end)))
 		goto nla_put_failure;
 
 	if (rule->suppress_ifgroup != -1) {
-- 
1.9.1.423.g4596e3a


* [RFC net-next 3/4] net: core: Add the UID to flowi[46]_init_output.
  2014-04-26  4:48 [RFC net-next 0/4] Support UID range routing Lorenzo Colitti
  2014-04-26  4:48 ` [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output Lorenzo Colitti
  2014-04-26  4:48 ` [RFC net-next 2/4] net: core: Add a UID range to fib rules Lorenzo Colitti
@ 2014-04-26  4:48 ` Lorenzo Colitti
  2014-04-26  4:48 ` [RFC net-next 4/4] net: core: Add a RTA_UID attribute to routes Lorenzo Colitti
  2014-04-26 13:14 ` [RFC net-next 0/4] Support UID range routing David Newall
  4 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-26  4:48 UTC
  To: netdev; +Cc: hannes, davem, jpa, Lorenzo Colitti

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/flow.h               |  8 ++++++--
 include/net/ip.h                 |  1 +
 include/net/route.h              |  5 +++--
 net/ipv4/inet_connection_sock.c  |  6 ++++--
 net/ipv4/ip_output.c             |  3 ++-
 net/ipv4/ping.c                  |  3 ++-
 net/ipv4/raw.c                   |  3 ++-
 net/ipv4/route.c                 | 14 ++++++++------
 net/ipv4/syncookies.c            |  3 ++-
 net/ipv4/udp.c                   |  3 ++-
 net/ipv6/af_inet6.c              |  3 ++-
 net/ipv6/datagram.c              |  3 ++-
 net/ipv6/inet6_connection_sock.c |  6 ++++--
 net/ipv6/raw.c                   |  1 +
 net/ipv6/syncookies.c            |  3 ++-
 net/ipv6/tcp_ipv6.c              |  3 ++-
 net/ipv6/udp.c                   |  1 +
 17 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index 9828829..da9b806 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -88,7 +88,8 @@ static inline void flowi4_init_output(struct flowi4 *fl4, int oif,
 				      __u32 mark, __u8 tos, __u8 scope,
 				      __u8 proto, __u8 flags,
 				      __be32 daddr, __be32 saddr,
-				      __be16 dport, __be16 sport)
+				      __be16 dport, __be16 sport,
+				      kuid_t uid)
 {
 	fl4->flowi4_oif = oif;
 	fl4->flowi4_iif = LOOPBACK_IFINDEX;
@@ -98,6 +99,7 @@ static inline void flowi4_init_output(struct flowi4 *fl4, int oif,
 	fl4->flowi4_proto = proto;
 	fl4->flowi4_flags = flags;
 	fl4->flowi4_secid = 0;
+	fl4->flowi4_uid = uid;
 	fl4->daddr = daddr;
 	fl4->saddr = saddr;
 	fl4->fl4_dport = dport;
@@ -159,7 +161,8 @@ static inline void flowi6_init_output(struct flowi6 *fl6, int oif,
 				      __be32 flowlabel,
 				      struct in6_addr daddr,
 				      struct in6_addr saddr,
-				      __be16 dport, __be16 sport)
+				      __be16 dport, __be16 sport,
+				      kuid_t uid)
 {
 	fl6->flowi6_oif = oif;
 	fl6->flowi6_iif = 0;
@@ -167,6 +170,7 @@ static inline void flowi6_init_output(struct flowi6 *fl6, int oif,
 	fl6->flowi6_proto = proto;
 	fl6->flowi6_flags = flags;
 	fl6->flowi6_secid = 0;
+	fl6->flowi6_uid = uid;
 	fl6->daddr = daddr;
 	fl6->saddr = saddr;
 	fl6->flowlabel = flowlabel;
diff --git a/include/net/ip.h b/include/net/ip.h
index 3ec2b0f..0123f78 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -170,6 +170,7 @@ struct ip_reply_arg {
 				/* -1 if not needed */ 
 	int	    bound_dev_if;
 	u8  	    tos;
+	kuid_t	    uid;
 }; 
 
 #define IP_REPLY_ARG_NOSRCCHECK 1
diff --git a/include/net/route.h b/include/net/route.h
index b17cf28..22a231c 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -140,7 +140,7 @@ static inline struct rtable *ip_route_output_ports(struct net *net, struct flowi
 	flowi4_init_output(fl4, oif, sk ? sk->sk_mark : 0, tos,
 			   RT_SCOPE_UNIVERSE, proto,
 			   sk ? inet_sk_flowi_flags(sk) : 0,
-			   daddr, saddr, dport, sport);
+			   daddr, saddr, dport, sport, sock_i_uid(sk));
 	if (sk)
 		security_sk_classify_flow(sk, flowi4_to_flowi(fl4));
 	return ip_route_output_flow(net, fl4, sk);
@@ -249,7 +249,8 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst, __be32
 		flow_flags |= FLOWI_FLAG_ANYSRC;
 
 	flowi4_init_output(fl4, oif, sk->sk_mark, tos, RT_SCOPE_UNIVERSE,
-			   protocol, flow_flags, dst, src, dport, sport);
+			   protocol, flow_flags, dst, src, dport, sport,
+			   sock_i_uid(sk));
 }
 
 static inline struct rtable *ip_route_connect(struct flowi4 *fl4,
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 0d1e2cb..b184140 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -413,7 +413,8 @@ struct dst_entry *inet_csk_route_req(struct sock *sk,
 			   sk->sk_protocol,
 			   flags,
 			   (opt && opt->opt.srr) ? opt->opt.faddr : ireq->ir_rmt_addr,
-			   ireq->ir_loc_addr, ireq->ir_rmt_port, inet_sk(sk)->inet_sport);
+			   ireq->ir_loc_addr, ireq->ir_rmt_port, inet_sk(sk)->inet_sport,
+			   sock_i_uid(sk));
 	security_req_classify_flow(req, flowi4_to_flowi(fl4));
 	rt = ip_route_output_flow(net, fl4, sk);
 	if (IS_ERR(rt))
@@ -449,7 +450,8 @@ struct dst_entry *inet_csk_route_child_sock(struct sock *sk,
 			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
 			   sk->sk_protocol, inet_sk_flowi_flags(sk),
 			   (opt && opt->opt.srr) ? opt->opt.faddr : ireq->ir_rmt_addr,
-			   ireq->ir_loc_addr, ireq->ir_rmt_port, inet_sk(sk)->inet_sport);
+			   ireq->ir_loc_addr, ireq->ir_rmt_port, inet_sk(sk)->inet_sport,
+			   sock_i_uid(sk));
 	security_req_classify_flow(req, flowi4_to_flowi(fl4));
 	rt = ip_route_output_flow(net, fl4, sk);
 	if (IS_ERR(rt))
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 1cbeba5..49998a9 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1506,7 +1506,8 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 			   RT_SCOPE_UNIVERSE, ip_hdr(skb)->protocol,
 			   ip_reply_arg_flowi_flags(arg),
 			   daddr, saddr,
-			   tcp_hdr(skb)->source, tcp_hdr(skb)->dest);
+			   tcp_hdr(skb)->source, tcp_hdr(skb)->dest,
+			   arg->uid);
 	security_skb_classify_flow(skb, flowi4_to_flowi(&fl4));
 	rt = ip_route_output_key(net, &fl4);
 	if (IS_ERR(rt))
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 8210964..8a912b8 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -778,7 +778,8 @@ static int ping_v4_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 
 	flowi4_init_output(&fl4, ipc.oif, sk->sk_mark, tos,
 			   RT_SCOPE_UNIVERSE, sk->sk_protocol,
-			   inet_sk_flowi_flags(sk), faddr, saddr, 0, 0);
+			   inet_sk_flowi_flags(sk), faddr, saddr, 0, 0,
+			   sock_i_uid(sk));
 
 	security_sk_classify_flow(sk, flowi4_to_flowi(&fl4));
 	rt = ip_route_output_flow(net, &fl4, sk);
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index a9dbe58..1b56f9a 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -577,7 +577,8 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 			   inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
 			   inet_sk_flowi_flags(sk) |
 			    (inet->hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
-			   daddr, saddr, 0, 0);
+			   daddr, saddr, 0, 0,
+			   sock_i_uid(sk));
 
 	if (!inet->hdrincl) {
 		err = raw_probe_proto_opt(&fl4, msg);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index db1e0da..58017b1 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -492,7 +492,7 @@ void __ip_select_ident(struct iphdr *iph, struct dst_entry *dst, int more)
 }
 EXPORT_SYMBOL(__ip_select_ident);
 
-static void __build_flow_key(struct flowi4 *fl4, const struct sock *sk,
+static void __build_flow_key(struct flowi4 *fl4, struct sock *sk,
 			     const struct iphdr *iph,
 			     int oif, u8 tos,
 			     u8 prot, u32 mark, int flow_flags)
@@ -508,11 +508,12 @@ static void __build_flow_key(struct flowi4 *fl4, const struct sock *sk,
 	flowi4_init_output(fl4, oif, mark, tos,
 			   RT_SCOPE_UNIVERSE, prot,
 			   flow_flags,
-			   iph->daddr, iph->saddr, 0, 0);
+			   iph->daddr, iph->saddr, 0, 0,
+			   sock_i_uid(sk));
 }
 
 static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
-			       const struct sock *sk)
+			       struct sock *sk)
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	int oif = skb->dev->ifindex;
@@ -523,7 +524,7 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
 	__build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
 }
 
-static void build_sk_flow_key(struct flowi4 *fl4, const struct sock *sk)
+static void build_sk_flow_key(struct flowi4 *fl4, struct sock *sk)
 {
 	const struct inet_sock *inet = inet_sk(sk);
 	const struct ip_options_rcu *inet_opt;
@@ -537,11 +538,12 @@ static void build_sk_flow_key(struct flowi4 *fl4, const struct sock *sk)
 			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
 			   inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
 			   inet_sk_flowi_flags(sk),
-			   daddr, inet->inet_saddr, 0, 0);
+			   daddr, inet->inet_saddr, 0, 0,
+			   sock_i_uid(sk));
 	rcu_read_unlock();
 }
 
-static void ip_rt_build_flow_key(struct flowi4 *fl4, const struct sock *sk,
+static void ip_rt_build_flow_key(struct flowi4 *fl4, struct sock *sk,
 				 const struct sk_buff *skb)
 {
 	if (skb)
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index f2ed13c..fc15bca 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -343,7 +343,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, IPPROTO_TCP,
 			   inet_sk_flowi_flags(sk),
 			   (opt && opt->srr) ? opt->faddr : ireq->ir_rmt_addr,
-			   ireq->ir_loc_addr, th->source, th->dest);
+			   ireq->ir_loc_addr, th->source, th->dest,
+			   sock_i_uid(sk));
 	security_req_classify_flow(req, flowi4_to_flowi(&fl4));
 	rt = ip_route_output_key(sock_net(sk), &fl4);
 	if (IS_ERR(rt)) {
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4468e1a..4776196 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -988,7 +988,8 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
 				   RT_SCOPE_UNIVERSE, sk->sk_protocol,
 				   inet_sk_flowi_flags(sk),
-				   faddr, saddr, dport, inet->inet_sport);
+				   faddr, saddr, dport, inet->inet_sport,
+				   sock_i_uid(sk));
 
 		security_sk_classify_flow(sk, flowi4_to_flowi(fl4));
 		rt = ip_route_output_flow(net, fl4, sk);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index f8c11d2..585859f 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -652,7 +652,8 @@ int inet6_sk_rebuild_header(struct sock *sk)
 		flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
 				   sk->sk_protocol, 0, np->flow_label,
 				   sk->sk_v6_daddr, np->saddr,
-				   inet->inet_dport, inet->inet_sport);
+				   inet->inet_dport, inet->inet_sport,
+				   sock_i_uid(sk));
 		security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
 
 		final_p = fl6_update_dst(&fl6, np->opt, &final);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index f15c165..156f1ea 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -157,7 +157,8 @@ ipv4_connected:
 	flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
 			   sk->sk_protocol, 0, fl6.flowlabel,
 			   sk->sk_v6_daddr, np->saddr,
-			   inet->inet_dport, inet->inet_sport);
+			   inet->inet_dport, inet->inet_sport,
+			   sock_i_uid(sk));
 
 	if (!fl6.flowi6_oif && (addr_type&IPV6_ADDR_MULTICAST))
 		fl6.flowi6_oif = np->mcast_oif;
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index 47f2272..057ff9d 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -79,7 +79,8 @@ struct dst_entry *inet6_csk_route_req(struct sock *sk,
 	flowi6_init_output(fl6, ireq->ir_iif, sk->sk_mark,
 			   IPPROTO_TCP, 0, 0,
 			   ireq->ir_v6_rmt_addr, ireq->ir_v6_loc_addr,
-			   ireq->ir_rmt_port, htons(ireq->ir_num));
+			   ireq->ir_rmt_port, htons(ireq->ir_num),
+			   sock_i_uid(sk));
 	final_p = fl6_update_dst(fl6, np->opt, &final);
 	security_req_classify_flow(req, flowi6_to_flowi(fl6));
 
@@ -201,7 +202,8 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
 	flowi6_init_output(fl6, sk->sk_bound_dev_if, sk->sk_mark,
 			   sk->sk_protocol, 0, np->flow_label,
 			   sk->sk_v6_daddr, np->saddr,
-			   inet->inet_dport, inet->inet_sport);
+			   inet->inet_dport, inet->inet_sport,
+			   sock_i_uid(sk));
 	IP6_ECN_flow_xmit(sk, fl6->flowlabel);
 	security_sk_classify_flow(sk, flowi6_to_flowi(fl6));
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 1f29996..77f2d1a 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -770,6 +770,7 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 	memset(&fl6, 0, sizeof(fl6));
 
 	fl6.flowi6_mark = sk->sk_mark;
+	fl6.flowi6_uid = sock_i_uid(sk);
 
 	if (sin6) {
 		if (addr_len < SIN6_LEN_RFC2133)
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index 09bb685..99f7b1a 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -240,7 +240,8 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
 		flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
 				   IPPROTO_TCP, 0, 0,
 				   ireq->ir_v6_rmt_addr, ireq->ir_v6_loc_addr,
-				   ireq->ir_rmt_port, inet_sk(sk)->inet_sport);
+				   ireq->ir_rmt_port, inet_sk(sk)->inet_sport,
+				   sock_i_uid(sk));
 
 		final_p = fl6_update_dst(&fl6, np->opt, &final);
 		security_req_classify_flow(req, flowi6_to_flowi(&fl6));
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8f4f68a..a044154 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -246,7 +246,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 	flowi6_init_output(&fl6, sk->sk_bound_dev_if, sk->sk_mark,
 			   IPPROTO_TCP, 0, fl6.flowlabel,
 			   sk->sk_v6_daddr, saddr ? *saddr : np->saddr,
-			   usin->sin6_port, inet->inet_sport);
+			   usin->sin6_port, inet->inet_sport,
+			   sock_i_uid(sk));
 
 	final_p = fl6_update_dst(&fl6, np->opt, &final);
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 1e586d9..6838cd1 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1177,6 +1177,7 @@ do_udp_sendmsg:
 		fl6.flowi6_oif = np->sticky_pktinfo.ipi6_ifindex;
 
 	fl6.flowi6_mark = sk->sk_mark;
+	fl6.flowi6_uid = sock_i_uid(sk);
 
 	if (msg->msg_controllen) {
 		opt = &opt_space;
-- 
1.9.1.423.g4596e3a

* [RFC net-next 4/4] net: core: Add a RTA_UID attribute to routes.
  2014-04-26  4:48 [RFC net-next 0/4] Support UID range routing Lorenzo Colitti
                   ` (2 preceding siblings ...)
  2014-04-26  4:48 ` [RFC net-next 3/4] net: core: Add the UID to flowi[46]_init_output Lorenzo Colitti
@ 2014-04-26  4:48 ` Lorenzo Colitti
  2014-04-26 13:14 ` [RFC net-next 0/4] Support UID range routing David Newall
  4 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-26  4:48 UTC (permalink / raw)
  To: netdev; +Cc: hannes, davem, jpa, Lorenzo Colitti

This is so that userspace can do per-UID route lookups.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/uapi/linux/rtnetlink.h | 1 +
 net/ipv4/fib_frontend.c        | 1 +
 net/ipv4/route.c               | 5 +++++
 net/ipv6/route.c               | 7 +++++++
 4 files changed, 14 insertions(+)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index eb0f1a5..01757b7 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -297,6 +297,7 @@ enum rtattr_type_t {
 	RTA_TABLE,
 	RTA_MARK,
 	RTA_MFC_STATS,
+	RTA_UID,
 	__RTA_MAX
 };
 
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 255aa99..dca307c 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -531,6 +531,7 @@ const struct nla_policy rtm_ipv4_policy[RTA_MAX + 1] = {
 	[RTA_METRICS]		= { .type = NLA_NESTED },
 	[RTA_MULTIPATH]		= { .len = sizeof(struct rtnexthop) },
 	[RTA_FLOW]		= { .type = NLA_U32 },
+	[RTA_UID]		= { .type = NLA_U32 },
 };
 
 static int rtm_to_fib_config(struct net *net, struct sk_buff *skb,
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 58017b1..57daf60 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2385,6 +2385,7 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh)
 	int err;
 	int mark;
 	struct sk_buff *skb;
+	kuid_t uid;
 
 	err = nlmsg_parse(nlh, sizeof(*rtm), tb, RTA_MAX, rtm_ipv4_policy);
 	if (err < 0)
@@ -2412,6 +2413,9 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh)
 	dst = tb[RTA_DST] ? nla_get_be32(tb[RTA_DST]) : 0;
 	iif = tb[RTA_IIF] ? nla_get_u32(tb[RTA_IIF]) : 0;
 	mark = tb[RTA_MARK] ? nla_get_u32(tb[RTA_MARK]) : 0;
+	uid = tb[RTA_UID] ?
+		make_kuid(current_user_ns(), nla_get_u32(tb[RTA_UID])) :
+		current_uid();
 
 	memset(&fl4, 0, sizeof(fl4));
 	fl4.daddr = dst;
@@ -2419,6 +2423,7 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh)
 	fl4.flowi4_tos = rtm->rtm_tos;
 	fl4.flowi4_oif = tb[RTA_OIF] ? nla_get_u32(tb[RTA_OIF]) : 0;
 	fl4.flowi4_mark = mark;
+	fl4.flowi4_uid = uid;
 
 	if (iif) {
 		struct net_device *dev;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4011617..75a5d41 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2321,6 +2321,7 @@ static const struct nla_policy rtm_ipv6_policy[RTA_MAX+1] = {
 	[RTA_PRIORITY]          = { .type = NLA_U32 },
 	[RTA_METRICS]           = { .type = NLA_NESTED },
 	[RTA_MULTIPATH]		= { .len = sizeof(struct rtnexthop) },
+	[RTA_UID]		= { .type = NLA_U32 },
 };
 
 static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -2707,6 +2708,12 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh)
 	if (tb[RTA_OIF])
 		oif = nla_get_u32(tb[RTA_OIF]);
 
+	if (tb[RTA_UID])
+		fl6.flowi6_uid = make_kuid(current_user_ns(),
+					   nla_get_u32(tb[RTA_UID]));
+	else
+		fl6.flowi6_uid = current_uid();
+
 	if (iif) {
 		struct net_device *dev;
 		int flags = 0;
-- 
1.9.1.423.g4596e3a
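For illustration, a userspace request that exercises this attribute might be assembled roughly as follows. This is a minimal sketch, not part of the patch: the helper names nl_put_u32 and build_getroute_for_uid are hypothetical, and the numeric attribute type is passed in by the caller because its value depends on the uapi headers of the kernel that carries the change. Per the policy entries above, the payload is a single u32; when the attribute is absent, the patched getroute handlers fall back to current_uid().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Append one u32 attribute to a netlink message held in a buffer of
 * bufsize bytes. Returns 0, or -1 if the attribute would not fit. */
static int nl_put_u32(struct nlmsghdr *nlh, size_t bufsize,
		      unsigned short type, uint32_t value)
{
	size_t off = NLMSG_ALIGN(nlh->nlmsg_len);
	size_t len = RTA_LENGTH(sizeof(value));
	struct rtattr *rta;

	assert(nlh->nlmsg_len >= (uint32_t)NLMSG_HDRLEN);
	if (off + RTA_ALIGN(len) > bufsize)
		return -1;

	rta = (struct rtattr *)((char *)nlh + off);
	rta->rta_type = type;
	rta->rta_len = (unsigned short)len;
	memcpy(RTA_DATA(rta), &value, sizeof(value));
	nlh->nlmsg_len = (uint32_t)(off + RTA_ALIGN(len));
	return 0;
}

/* Build an RTM_GETROUTE request carrying a UID attribute.
 * uid_attr is an assumption of this sketch: pass the RTA_UID value
 * defined by the headers of the kernel you are talking to. */
static int build_getroute_for_uid(void *buf, size_t bufsize, int family,
				  unsigned short uid_attr, uint32_t uid)
{
	struct nlmsghdr *nlh = buf;
	struct rtmsg *rtm;

	if (bufsize < NLMSG_LENGTH(sizeof(*rtm)))
		return -1;

	memset(buf, 0, bufsize);
	nlh->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm));
	nlh->nlmsg_type = RTM_GETROUTE;
	nlh->nlmsg_flags = NLM_F_REQUEST;

	rtm = NLMSG_DATA(nlh);
	rtm->rtm_family = (unsigned char)family;

	return nl_put_u32(nlh, bufsize, uid_attr, uid);
}
```

The resulting buffer would then be sent on an AF_NETLINK / NETLINK_ROUTE socket, typically together with an RTA_DST attribute; sending and reply parsing are left out here.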

* Re: [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output.
  2014-04-26  4:48 ` [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output Lorenzo Colitti
@ 2014-04-26  5:56   ` Julian Anastasov
  2014-04-27  4:03     ` Lorenzo Colitti
  0 siblings, 1 reply; 21+ messages in thread
From: Julian Anastasov @ 2014-04-26  5:56 UTC (permalink / raw)
  To: Lorenzo Colitti; +Cc: netdev, hannes, davem, jpa


	Hello,

On Sat, 26 Apr 2014, Lorenzo Colitti wrote:

> +static inline void flowi6_init_output(struct flowi6 *fl6, int oif,
> +				      __u32 mark, __u8 proto, __u8 flags,
> +				      __be32 flowlabel,
> +				      struct in6_addr daddr,
> +				      struct in6_addr saddr,
> +				      __be16 dport, __be16 sport)
> +{
> +	fl6->flowi6_oif = oif;
> +	fl6->flowi6_iif = 0;

	Make sure LOOPBACK_IFINDEX is provided in
flowi6_iif instead of 0, recent commit fixed such
FIB rule lookups.

Regards

--
Julian Anastasov <ja@ssi.bg>

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-04-26  4:48 [RFC net-next 0/4] Support UID range routing Lorenzo Colitti
                   ` (3 preceding siblings ...)
  2014-04-26  4:48 ` [RFC net-next 4/4] net: core: Add a RTA_UID attribute to routes Lorenzo Colitti
@ 2014-04-26 13:14 ` David Newall
  2014-04-28 14:38   ` Lorenzo Colitti
  4 siblings, 1 reply; 21+ messages in thread
From: David Newall @ 2014-04-26 13:14 UTC (permalink / raw)
  To: Lorenzo Colitti, netdev; +Cc: hannes, davem, jpa

Hi Lorenzo,

> In some environments it is useful to route packets differently
> based on the user ID.

I don't understand what your patch will do because, as I understand the 
word, "routing" means forwarding a packet one hop closer to the 
destination, so I expect you must reach the destination before you could 
know the UID.  I'm probably missing something quite obvious. Could you 
illustrate a use-case?

Thanks,

David

* Re: [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output.
  2014-04-26  5:56   ` Julian Anastasov
@ 2014-04-27  4:03     ` Lorenzo Colitti
  2014-04-28  7:07       ` Julian Anastasov
  0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-27  4:03 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: netdev@vger.kernel.org, Hannes Frederic Sowa, David Miller,
	JP Abgrall

On Sat, Apr 26, 2014 at 2:56 PM, Julian Anastasov <ja@ssi.bg> wrote:
>> +     fl6->flowi6_oif = oif;
>> +     fl6->flowi6_iif = 0;
>
>         Make sure LOOPBACK_IFINDEX is provided in
> flowi6_iif instead of 0, recent commit fixed such
> FIB rule lookups.

I'm assuming that was for IPv4? All the IPv6 code in current net-next
still seems to set flowi6_iif to zero. It's later set to
LOOPBACK_IFINDEX in ip6_route_output.

You could argue that if this patch is accepted, then setting
flowi6_iif to LOOPBACK_IFINDEX belongs there for consistency with
IPv4. That seems reasonable, but I'd prefer not doing that in this
patch because 1) this patch is semantically a no-op and I'd like to
keep it that way, 2) it would be a bigger change, because even with
this patch, there are still codepaths that initialize their own flowi6
and memset them to zero.

* Re: [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output.
  2014-04-27  4:03     ` Lorenzo Colitti
@ 2014-04-28  7:07       ` Julian Anastasov
  0 siblings, 0 replies; 21+ messages in thread
From: Julian Anastasov @ 2014-04-28  7:07 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: netdev@vger.kernel.org, Hannes Frederic Sowa, David Miller,
	JP Abgrall


	Hello,

On Sun, 27 Apr 2014, Lorenzo Colitti wrote:

> On Sat, Apr 26, 2014 at 2:56 PM, Julian Anastasov <ja@ssi.bg> wrote:
> >> +     fl6->flowi6_oif = oif;
> >> +     fl6->flowi6_iif = 0;
> >
> >         Make sure LOOPBACK_IFINDEX is provided in
> > flowi6_iif instead of 0, recent commit fixed such
> > FIB rule lookups.
> 
> I'm assuming that was for IPv4? All the IPv6 code in current net-next
> still seems to set flowi6_iif to zero. It's later set to
> LOOPBACK_IFINDEX in ip6_route_output.

	I see, it seems more IPv6 places need a fix.

> You could argue that if this patch is accepted, then setting
> flowi6_iif to LOOPBACK_IFINDEX belongs there for consistency with
> IPv4. That seems reasonable, but I'd prefer not doing that in this
> patch because 1) this patch is semantically a no-op and I'd like to
> keep it that way, 2) it would be a bigger change, because even with
> this patch, there are still codepaths that initialize their own flowi6
> and memset them to zero.

	OK. I'll try a patch to fix the remaining flowi6_iif
places...

Regards

--
Julian Anastasov <ja@ssi.bg>

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-04-26 13:14 ` [RFC net-next 0/4] Support UID range routing David Newall
@ 2014-04-28 14:38   ` Lorenzo Colitti
       [not found]     ` <20140428.125807.409036177577836732.davem@davemloft.net>
  2014-04-30  4:36     ` Lorenzo Colitti
  0 siblings, 2 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-28 14:38 UTC (permalink / raw)
  To: David Newall
  Cc: netdev@vger.kernel.org, Hannes Frederic Sowa, David Miller,
	JP Abgrall

On Sat, Apr 26, 2014 at 10:14 PM, David Newall <davidn@davidnewall.com> wrote:
> I don't understand what your patch will do because, as I understand the
> word, "routing" means forwarding a packet one hop closer to the destination,
> so I expect you must reach the destination before you could know the UID.
> I'm probably missing something quite obvious. Could you illustrate a
> use-case?

This is useful for originated packets only, not forwarded packets.

It can be used if a host has more than one way to reach a destination
(multiple physical interfaces, multiple GRE tunnels, multiple real or
virtual routers on link, etc.) and wants to decide which to use based
on the user ID of the sending application.

The user ID could identify a service (e.g., mail vs. web), different
users/customers on a shared server / machine, different users /
applications on a mobile device, etc.
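The selection described here can be pictured with a small model. This is a minimal sketch under stated assumptions, not the kernel's fib rules code: struct uid_rule and select_table are hypothetical names, the range bounds are assumed inclusive, and the rules are assumed to be already sorted by priority.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a policy rule carrying a UID range. */
struct uid_rule {
	uint32_t priority;	/* lower values are evaluated first */
	uint32_t uid_start;	/* first UID matched (assumed inclusive) */
	uint32_t uid_end;	/* last UID matched (assumed inclusive) */
	uint32_t table;		/* routing table consulted on a match */
};

/* Walk the rules in order and return the table of the first rule whose
 * range contains uid, or default_table when nothing matches. */
static uint32_t select_table(const struct uid_rule *rules, size_t n,
			     uint32_t uid, uint32_t default_table)
{
	assert(rules != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (uid >= rules[i].uid_start && uid <= rules[i].uid_end)
			return rules[i].table;
	}
	return default_table;
}
```

With one rule per user or per service, each originating socket's owner selects a table, and the routes in that table decide the interface, tunnel, or next hop.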

* Re: [RFC net-next 0/4] Support UID range routing.
       [not found]     ` <20140428.125807.409036177577836732.davem@davemloft.net>
@ 2014-04-28 19:01       ` Lorenzo Colitti
  2014-05-02 19:15         ` Lorenzo Colitti
  0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-28 19:01 UTC (permalink / raw)
  To: David Miller
  Cc: davidn, netdev@vger.kernel.org, Hannes Frederic Sowa, JP Abgrall

On Tue, Apr 29, 2014 at 1:58 AM, David Miller <davem@davemloft.net> wrote:
> There is absolutely no such thing as the UID of a socket.

Sorry - perhaps I should have said "socket creator" or "socket owner".
Basically, what this patch calls "UID" is what the xt_owner module and
xt_LOG iptables modules consider to be the "owner" of a socket, what
nfqueue presents as the user ID, what shows up in
/proc/net/{udp,tcp,raw} in the "uid" column, etc. In most cases this
is the effective UID that made the call to socket() or accept().

This patch allows using that concept in routing. This can be done
today with "iptables -t mangle -A OUTPUT -m owner --uid-owner 12345
-j MARK --set-mark 0xbeef; ip rule add fwmark 0xbeef lookup 100", but
that has the limitations I set out in my original message (e.g.,
incorrect source address).

> And in software interrupt context, sending TCP ACKs for example, at
> best you have just the socket.  There is no appropriate UID to choose
> in such situations.

For as long as a kernel socket has a corresponding userspace socket,
the kernel socket has a pointer to the userspace socket object
(sk->sk_socket), which has the owner UID, right? So that information
is still accessible even outside process context. It's true that when
the userspace socket is closed that information goes away, but in
theory, it could be written back to the kernel socket in sock_orphan.
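The write-back idea can be modelled in a few lines. This is a toy userspace model under stated assumptions, not kernel code: model_socket and model_sock stand in for struct socket and struct sock, and the saved_uid field is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for the userspace-visible socket object that holds the owner. */
struct model_socket {
	uint32_t owner_uid;
};

/* Stands in for the kernel socket; socket is NULL once userspace closed it. */
struct model_sock {
	struct model_socket *socket;
	uint32_t saved_uid;	/* hypothetical write-back field */
	int has_saved_uid;
};

/* On orphaning, copy the owner UID into the kernel-side object first. */
static void model_orphan(struct model_sock *sk)
{
	assert(sk != NULL);

	if (sk->socket) {
		sk->saved_uid = sk->socket->owner_uid;
		sk->has_saved_uid = 1;
	}
	sk->socket = NULL;
}

/* Look up the owner: prefer the live socket, else the saved copy.
 * Returns 0 and fills *uid, or -1 when no owner is known. */
static int model_sock_uid(const struct model_sock *sk, uint32_t *uid)
{
	if (sk->socket) {
		*uid = sk->socket->owner_uid;
		return 0;
	}
	if (sk->has_saved_uid) {
		*uid = sk->saved_uid;
		return 0;
	}
	return -1;
}
```

In this model the owner remains available after the close, which is the case that has no UID today.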

> I absolutely do not want to see a feature like this added to the tree,
> as it exemplifies a rather deep misunderstanding of how the hierarchy
> of networking stack objects are arranged and relate to each other.

The examples I cite above are all in the tree. Do you consider them to
be misguided? In particular, the semantics of the iptables owner match
module seem quite similar to me - it allows making decisions
(including routing decisions) on a packet based on the socket owner.

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-04-28 14:38   ` Lorenzo Colitti
       [not found]     ` <20140428.125807.409036177577836732.davem@davemloft.net>
@ 2014-04-30  4:36     ` Lorenzo Colitti
  2014-04-30  7:52       ` David Newall
  1 sibling, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-30  4:36 UTC (permalink / raw)
  To: David Newall
  Cc: netdev@vger.kernel.org, Hannes Frederic Sowa, David Miller,
	JP Abgrall

On Mon, Apr 28, 2014 at 11:38 PM, Lorenzo Colitti <lorenzo@google.com> wrote:
> The user ID could identify a service (e.g., mail vs. web), different
> users/customers on a shared server / machine, different users /
> applications on a mobile device, etc.

Example real-world use case: the Android VPN framework currently uses
iptables owner matching to mark packets, fwmark routing, and
masquerade to send traffic on the correct VPN for the user (Android
tablets are multi-user devices).

The use of NAT forces the system to use MSS rewriting instead of PMTUD
and makes it impossible for the app to know its real IP address and
port, breaking apps like SIP clients (in addition to requiring
conntrack to keep state for locally-originated connections). Per-UID
routing would solve all these problems in a much cleaner way.

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-04-30  4:36     ` Lorenzo Colitti
@ 2014-04-30  7:52       ` David Newall
  2014-04-30  8:04         ` Lorenzo Colitti
  0 siblings, 1 reply; 21+ messages in thread
From: David Newall @ 2014-04-30  7:52 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: netdev@vger.kernel.org, Hannes Frederic Sowa, David Miller,
	JP Abgrall

On 30/04/14 14:06, Lorenzo Colitti wrote:
> The use of NAT [...] makes it impossible for the app to know its
> real IP address and port

The original address *is* the real address.  NAT breaks IP's design and 
is a very mixed blessing.  NAT isn't needed nor used with IPv6, and 
being in IPv4's twilight years, an argument predicated on NAT is not 
very convincing.

I feel that describing the patch as routing is misleading, as it 
performs only outbound link selection.  It fosters an expectation of 
bi-directionality, and that is not the case.  It will often result in 
asymmetric routes.

Is it possible you are trying to solve a problem which has already been 
solved, for example by IGP?

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-04-30  7:52       ` David Newall
@ 2014-04-30  8:04         ` Lorenzo Colitti
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-04-30  8:04 UTC (permalink / raw)
  To: David Newall
  Cc: netdev@vger.kernel.org, Hannes Frederic Sowa, David Miller,
	JP Abgrall

On Wed, Apr 30, 2014 at 4:52 PM, David Newall <davidn@davidnewall.com> wrote:
> The original address *is* the real address.  NAT breaks IP's design and is a very mixed blessing.  NAT isn't needed nor used with IPv6, and being in IPv4's twilight years, an argument predicated on NAT is not very convincing.

Right. NAT is what the code does today. This change allows getting rid of it.

> I feel that describing the patch as routing is misleading, as it performs only outbound link selection.  It fosters an expectation of bi-directionality, and that is not the case.  It will often result in asymmetric routes.

The patch adds the capability to take into account the user ID when
doing a routing lookup. The routing lookup affects outbound interface,
but it also affects source address selection, MTU and advertised TCP
MSS, and a variety of other parameters that are configurable on a
per-route basis (e.g., congestion window).

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-04-28 19:01       ` Lorenzo Colitti
@ 2014-05-02 19:15         ` Lorenzo Colitti
  2014-05-02 19:24           ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-05-02 19:15 UTC (permalink / raw)
  To: David Miller
  Cc: David Newall, netdev@vger.kernel.org, Hannes Frederic Sowa,
	JP Abgrall

On Tue, Apr 29, 2014 at 4:01 AM, Lorenzo Colitti <lorenzo@google.com> wrote:
> Basically, what this patch calls "UID" is what the xt_owner module and
> xt_LOG iptables modules consider to be the "owner" of a socket, what
> nfqueue presents as the user ID, what shows up in
> /proc/net/{udp,tcp,raw} in the "uid" column, etc. In most cases this
> is the effective UID that made the call to socket() or accept().
>
> This patch allows using that concept in routing. This can be done
> today with "iptables -m owner --uid-owner 12345 -j MARK --set-mark
> 0xbeef; ip rule from fwmark 0xbeef lookup 100", but that has the
> limitations I set out in my original message (e.g., incorrect source
> address).

David,

did that help clarify what I'm proposing here? Does this patch still
seem misguided to you even though its semantics match existing
functionality?

Lorenzo

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-05-02 19:15         ` Lorenzo Colitti
@ 2014-05-02 19:24           ` David Miller
  2014-05-07  3:59             ` Lorenzo Colitti
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2014-05-02 19:24 UTC (permalink / raw)
  To: lorenzo; +Cc: davidn, netdev, hannes, jpa

From: Lorenzo Colitti <lorenzo@google.com>
Date: Sat, 3 May 2014 04:15:38 +0900

> On Tue, Apr 29, 2014 at 4:01 AM, Lorenzo Colitti <lorenzo@google.com> wrote:
>> Basically, what this patch calls "UID" is what the xt_owner module and
>> xt_LOG iptables modules consider to be the "owner" of a socket, what
>> nfqueue presents as the user ID, what shows up in
>> /proc/net/{udp,tcp,raw} in the "uid" column, etc. In most cases this
>> is the effective UID that made the call to socket() or accept().
>>
>> This patch allows using that concept in routing. This can be done
>> today with "iptables -m owner --uid-owner 12345 -j MARK --set-mark
>> 0xbeef; ip rule from fwmark 0xbeef lookup 100", but that has the
>> limitations I set out in my original message (e.g., incorrect source
>> address).
> 
> did that help clarify what I'm proposing here? Does this patch still
> seem misguided to you even though its semantics match existing
> functionality?

I understand what you're trying to achieve, but it still leaves a very
bad taste in my mouth.

And I question whether you absolutely cannot get the desired source
address set with appropriate adjustments to the netfilter config,
netfilter can mangle the packet any way you desire it to so why can't
it do so to the source address?

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-05-02 19:24           ` David Miller
@ 2014-05-07  3:59             ` Lorenzo Colitti
  2014-05-07  9:24               ` Hannes Frederic Sowa
  0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-05-07  3:59 UTC (permalink / raw)
  To: David Miller
  Cc: David Newall, netdev@vger.kernel.org, Hannes Frederic Sowa,
	JP Abgrall

On Sat, May 3, 2014 at 4:24 AM, David Miller <davem@davemloft.net> wrote:
> > did that help clarify what I'm proposing here? Does this patch still
> > seem misguided to you even though its semantics match existing
> > functionality?
>
> I understand what you're trying to achieve, but it still leaves a very
> bad taste in my mouth.
>
> And I question whether you absolutely cannot get the desired source
> address set with appropriate adjustments to the netfilter config,
> netfilter can mangle the packet any way you desire it to so why can't
> it do so to the source address?

The problem is not that netfilter picks the wrong source address. The
problem is that that address doesn't match the socket's source
address. The source address in the socket is selected by the initial
routing lookup (e.g., in tcp_v[46]_connect), but the packet source
address is from the output interface. So if the application calls
getsockname(), it will get an IP address that's not the one that is
actually being used to send its packets (and in fact is on a different
interface). This breaks applications that need end-to-end
connectivity. It also doesn't allow applications to select their
source address using things like bind(), IPV6_PKTINFO,
IPV6_ADDR_PREFERENCES,  etc.
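To make the mismatch concrete, this is roughly how an application learns "its" address. A minimal sketch using only POSIX socket calls; the function name selected_source is hypothetical. connect() on a UDP socket performs the route lookup without sending anything, and getsockname() then reports the source address that lookup chose, which is the value that later NAT can silently invalidate.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Report the IPv4 source address the kernel selected for reaching dst.
 * Returns 0 and fills *src on success, -1 on any error. */
static int selected_source(const char *dst, struct in_addr *src)
{
	struct sockaddr_in peer, local;
	socklen_t len = sizeof(local);
	int fd, ret = -1;

	assert(dst != NULL && src != NULL);

	memset(&peer, 0, sizeof(peer));
	peer.sin_family = AF_INET;
	peer.sin_port = htons(9);	/* arbitrary port; no packet is sent */
	if (inet_pton(AF_INET, dst, &peer.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	/* connect() triggers the routing lookup and source selection. */
	if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) == 0 &&
	    getsockname(fd, (struct sockaddr *)&local, &len) == 0) {
		*src = local.sin_addr;
		ret = 0;
	}
	close(fd);
	return ret;
}
```

An application that advertises this address to a peer (a SIP client, for example) depends on it being the address actually used on the wire.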

This doesn't just affect the source address, it similarly affects any
other parameters that are taken from route lookups and stored in the
socket, such as MSS, initial cwnd / rwnd / RTO, etc. Some (like MSS)
can be fixed up with netfilter, but not all. Also, every connection
made through this scheme takes up conntrack state, is affected by
conntrack timeouts, etc.

Assuming you agree that this is a sane use case, then I think UID-based
routing rules are a much cleaner way to do it than netfilter. The
kernel has all the information it needs to do this correctly, but
today we do not use that information in the initial lookup - we just
use it later on to do NAT. That breaks app expectations before the
packet even hits the wire, and that leaves a bad taste in my mouth.

I can see why we wouldn't want to add new rule types when things can
be done with netfilter. But I think that when it comes to making
routing decisions based on information that's in the socket structure
(i.e., from the application), those have to be made using routing
rules. For output packets, netfilter only gets to influence things too
late, so it can at best patch things up later.

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-05-07  3:59             ` Lorenzo Colitti
@ 2014-05-07  9:24               ` Hannes Frederic Sowa
  2014-05-07 10:58                 ` Lorenzo Colitti
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-05-07  9:24 UTC (permalink / raw)
  To: Lorenzo Colitti, David Miller; +Cc: David Newall, netdev, JP Abgrall

Hi,

On Tue, May 6, 2014, at 20:59, Lorenzo Colitti wrote:
> This doesn't just affect the source address, it similarly affects any
> other parameters that are taken from route lookups and stored in the
> socket, such as MSS, initial cwnd / rwnd / RTO, etc. Some (like MSS)
> can be fixed up with netfilter, but not all. Also, every connection
> made through this scheme takes up conntrack state, is affected by
> conntrack timeouts, etc.

I question the abstraction of using UIDs for matching routing rules.
E.g. FreeBSD uses setfib[1] to alter the view of the routing table per
process. E.g. an interface like ip rule exec (action ACTION)+ PROGRAM
would be much nicer in combination with a prctl, maybe? I would much
rather enjoy an interface not based on UIDs. Would something like that
solve your initial problem?

The other possibility that came to my mind would be that it is possible
to share interfaces and ip addresses per netns but it seems more
difficult to implement.

Greetings,

  Hannes

[1]
http://www.freebsd.org/cgi/man.cgi?query=setfib&apropos=0&sektion=0&manpath=FreeBSD+10.0-RELEASE&arch=default&format=html

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-05-07  9:24               ` Hannes Frederic Sowa
@ 2014-05-07 10:58                 ` Lorenzo Colitti
  2014-05-11 21:45                   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Colitti @ 2014-05-07 10:58 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Miller, David Newall, netdev@vger.kernel.org, JP Abgrall

On Wed, May 7, 2014 at 6:24 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> I question the abstraction of using UIDs for matching routing rules.
> E.g. freebsd uses setfib[1] to alter the view of the routing table per
> process. E.g. an interface like ip rule exec (action ACTION)+ PROGRAM
> would be much nicer in combination with a prctl, maybe? I would much
> rather enjoy an interface not based on UIDs. Would something like that
> solve your initial problem?

So you're suggesting something that would still be an ip rule, but
would match a new identifier ("fibgroup") rather than the uid? I think
that would work, though obviously it's a much bigger change than what
I am proposing here.

It would require defining a new identifier, figuring out what its
semantics are, setting it when socket objects are created, attaching
it to sockets across accept/fork, etc. Userspace code would have to be
updated to set it on processes (whereas the uid is already dealt
with by existing tools), etc.

If you're proposing something that's not an ip rule, then that
seems like a step backwards, because it won't allow the rich policy
allowed by processing rules in priority order, throw routes, FRA_GOTO,
etc.

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-05-07 10:58                 ` Lorenzo Colitti
@ 2014-05-11 21:45                   ` Hannes Frederic Sowa
  2014-05-12 20:25                     ` Lorenzo Colitti
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-05-11 21:45 UTC (permalink / raw)
  To: Lorenzo Colitti; +Cc: David Miller, David Newall, netdev, JP Abgrall

On Wed, May 7, 2014, at 3:58, Lorenzo Colitti wrote:
> On Wed, May 7, 2014 at 6:24 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > I question the abstraction of using UIDs for matching routing rules.
> > E.g. freebsd uses setfib[1] to alter the view of the routing table per
> > process. E.g. an interface like ip rule exec (action ACTION)+ PROGRAM
> > would be much nicer in combination with a prctl, maybe? I would much
> > rather enjoy an interface not based on UIDs. Would something like that
> > solve your initial problem?
> 
> So you're suggesting something that would still be an ip rule, but
> would match a new identifier ("fibgroup") rather than the uid? I think
> that would work, though obviously it's a much bigger change than what
> I am proposing here.
> 
> It would require defining a new identifier, figuring out what its
> semantics are, setting it when socket objects are created, attaching
> it to sockets across accept/fork, etc. Userspace code would have to be
> updated to set it on processes (whereas the uid is already dealt
> with by existing tools), etc.

That was my idea, yes. Having some kind of opaque identifier with
user-friendly names a la /etc/iproute2/rt_tables which can be used in ip
rule matches.

I see, it would require very heavy lifting, but in the end seems to be
more user-friendly to me than uids. But I guess task_struct is something
like sk_buff, where you need to find very good arguments to expand it.

Maybe something like cgroup/net_cls.classid would be possible, but I am
not familiar with how to interact with cgroups internally and don't know
how much work that would be (and if more network cgroup interaction is
actually desirable).

> If you're proposing something that's not an ip rule, then that
> seems like a step backwards, because it won't allow the rich policy
> allowed by processing rules in priority order, throw routes, FRA_GOTO,
> etc.

Agreed.

Bye,

  Hannes

* Re: [RFC net-next 0/4] Support UID range routing.
  2014-05-11 21:45                   ` Hannes Frederic Sowa
@ 2014-05-12 20:25                     ` Lorenzo Colitti
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Colitti @ 2014-05-12 20:25 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Miller, David Newall, netdev@vger.kernel.org, JP Abgrall

On Mon, May 12, 2014 at 6:45 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
>> It would require defining a new identifier, figuring out what its
>> semantics are, setting it when socket objects are created, attaching
>> it to sockets across accept/fork, etc. Userspace code would have to be
>> updated to set it on processes (whereas the uid is already dealt
>> with by existing tools), etc.
>
> That was my idea, yes. Having some kind of opaque identifier with
> user-friendly names a la /etc/iproute2/rt_tables which can be used in ip
> rule matches.
>
> I see, it would require very heavy lifting, but in the end seems to be
> more user-friendly to me than uids.

Well, I don't know. Today we have an iptables module in tree that does
UID matching, which means that presumably people do find it useful. In
addition to the Android use case there seem to be other people doing
it as well [1][2][3]. On the other hand, we don't have a similar one
for PIDs or a generic "fib group". nftables meta seems to support UID
and gid, but not PID.

For people that currently use the socket owner to do routing, this
patch is a strictly better way to do it, because it doesn't have any
of the drawbacks of NAT (information in the socket structures that
doesn't correspond to reality; more conntrack state; NAT breaking
end-to-end; etc.).

Whether a new fibgroup option would serve these use cases better is
hard to say. If what people want to do is per-process, then perhaps an
opaque fibgroup is better. But if what they actually want to do is
per-user, then if you implement fibgroups you still have to build
machinery in userspace to make sure that each user's processes got put
into the right fib group.

[1] http://blog.sebastien.raveau.name/2009/04/per-process-routing.html
[2] http://arstechnica.com/civis/viewtopic.php?f=16&t=1195455
[3] http://www.niftiestsoftware.com/2011/08/28/making-all-network-traffic-for-a-linux-user-use-a-specific-network-interface/

Thread overview: 21+ messages
2014-04-26  4:48 [RFC net-next 0/4] Support UID range routing Lorenzo Colitti
2014-04-26  4:48 ` [RFC net-next 1/4] net: ipv6: Introduce flowi6_init_output Lorenzo Colitti
2014-04-26  5:56   ` Julian Anastasov
2014-04-27  4:03     ` Lorenzo Colitti
2014-04-28  7:07       ` Julian Anastasov
2014-04-26  4:48 ` [RFC net-next 2/4] net: core: Add a UID range to fib rules Lorenzo Colitti
2014-04-26  4:48 ` [RFC net-next 3/4] net: core: Add the UID to flowi[46]_init_output Lorenzo Colitti
2014-04-26  4:48 ` [RFC net-next 4/4] net: core: Add a RTA_UID attribute to routes Lorenzo Colitti
2014-04-26 13:14 ` [RFC net-next 0/4] Support UID range routing David Newall
2014-04-28 14:38   ` Lorenzo Colitti
     [not found]     ` <20140428.125807.409036177577836732.davem@davemloft.net>
2014-04-28 19:01       ` Lorenzo Colitti
2014-05-02 19:15         ` Lorenzo Colitti
2014-05-02 19:24           ` David Miller
2014-05-07  3:59             ` Lorenzo Colitti
2014-05-07  9:24               ` Hannes Frederic Sowa
2014-05-07 10:58                 ` Lorenzo Colitti
2014-05-11 21:45                   ` Hannes Frederic Sowa
2014-05-12 20:25                     ` Lorenzo Colitti
2014-04-30  4:36     ` Lorenzo Colitti
2014-04-30  7:52       ` David Newall
2014-04-30  8:04         ` Lorenzo Colitti