netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic
@ 2010-09-29  9:05 Arnaud Ebalard
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 1/5] XFRM,IPv6: Remove xfrm_spi_hash() dependency on destination address Arnaud Ebalard
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Arnaud Ebalard @ 2010-09-29  9:05 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Herbert Xu, Hideaki YOSHIFUJI; +Cc: netdev

Hi,

This an updated version of the patches. For reference, introduction of
the feature is here http://thread.gmane.org/gmane.linux.network/172941

This version 3 now also builds with ipv6 modular. To do that, a helper
(input_addr_check()) has been added to struct xfrm_state_afinfo. To
avoid the penalty of xfrm_state_get/put_afinfo() calls from xfrm_input(),
I spent some time in the sources and came up with the idea of accessing
it safely as follows:

 x = xfrm_state_lookup(net, skb->mark, NULL, spi, nexthdr, family);
 if (x == NULL ||
     x->outer_mode->afinfo->input_addr_check(skb, x)) {
     ...

Tell me if I missed something.

Comments welcome.

Cheers,

a+

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCHv3 net-next-2.6 1/5] XFRM,IPv6: Remove xfrm_spi_hash() dependency on destination address
  2010-09-29  9:05 [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic Arnaud Ebalard
@ 2010-09-29  9:05 ` Arnaud Ebalard
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 2/5] XFRM,IPv6: Introduce receive sockopts to access IRO remapped src/dst addresses Arnaud Ebalard
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Arnaud Ebalard @ 2010-09-29  9:05 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Herbert Xu, Hideaki YOSHIFUJI; +Cc: netdev


In the new IPsec architecture [RFC4301], "for an SA used to carry
unicast traffic, the Security Parameters Index (SPI) by itself
suffices to specify an SA".  Section 4.1 of [RFC4301] provides
additional guidance on the topic.

In the old IPsec architecture [RFC2401], a SA "is uniquely identified
by a triple consisting of a Security Parameter Index (SPI), an IP
Destination Address and a security protocol (AH or ESP) identifier".

If an IPsec stack only supports the behavior mandated by the old
IPsec architecture, SAD lookup on inbound packets require the use of
both the SPI and the destination address of the SA.

For inbound IPsec traffic, IRO remapping rules may exist on the MN to
remap the destination address (CoA) into the HoA.  In that case, by
design, the address found in the destination address field of the
packet (CoA) does not match the one in the SA (HoA).

At the moment, Linux XFRM stack includes the address when computing
the hash to perform state lookup by SPI. This patch changes XFRM
state hash computation to prevent destination address to be
used. This will later allow finding states for packets w/ mangled
destination addresses.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
---
 net/xfrm/xfrm_hash.h  |   21 +--------------------
 net/xfrm/xfrm_state.c |   20 ++++++++------------
 2 files changed, 9 insertions(+), 32 deletions(-)

diff --git a/net/xfrm/xfrm_hash.h b/net/xfrm/xfrm_hash.h
index 8e69533..19eeee7 100644
--- a/net/xfrm/xfrm_hash.h
+++ b/net/xfrm/xfrm_hash.h
@@ -4,16 +4,6 @@
 #include <linux/xfrm.h>
 #include <linux/socket.h>
 
-static inline unsigned int __xfrm4_addr_hash(xfrm_address_t *addr)
-{
-	return ntohl(addr->a4);
-}
-
-static inline unsigned int __xfrm6_addr_hash(xfrm_address_t *addr)
-{
-	return ntohl(addr->a6[2] ^ addr->a6[3]);
-}
-
 static inline unsigned int __xfrm4_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr)
 {
 	u32 sum = (__force u32)daddr->a4 + (__force u32)saddr->a4;
@@ -60,18 +50,9 @@ static inline unsigned __xfrm_src_hash(xfrm_address_t *daddr,
 }
 
 static inline unsigned int
-__xfrm_spi_hash(xfrm_address_t *daddr, __be32 spi, u8 proto, unsigned short family,
-		unsigned int hmask)
+__xfrm_spi_hash(__be32 spi, u8 proto, unsigned int hmask)
 {
 	unsigned int h = (__force u32)spi ^ proto;
-	switch (family) {
-	case AF_INET:
-		h ^= __xfrm4_addr_hash(daddr);
-		break;
-	case AF_INET6:
-		h ^= __xfrm6_addr_hash(daddr);
-		break;
-	}
 	return (h ^ (h >> 10) ^ (h >> 20)) & hmask;
 }
 
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index eb96ce5..b6a4d8d 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -30,7 +30,7 @@
 
 /* Each xfrm_state may be linked to two tables:
 
-   1. Hash table by (spi,daddr,ah/esp) to find SA by SPI. (input,ctl)
+   1. Hash table by (spi,ah/esp) to find SA by SPI. (input,ctl)
    2. Hash table by (daddr,family,reqid) to find what SAs exist for given
       destination/tunnel endpoint. (output)
  */
@@ -67,9 +67,9 @@ static inline unsigned int xfrm_src_hash(struct net *net,
 }
 
 static inline unsigned int
-xfrm_spi_hash(struct net *net, xfrm_address_t *daddr, __be32 spi, u8 proto, unsigned short family)
+xfrm_spi_hash(struct net *net, __be32 spi, u8 proto)
 {
-	return __xfrm_spi_hash(daddr, spi, proto, family, net->xfrm.state_hmask);
+	return __xfrm_spi_hash(spi, proto, net->xfrm.state_hmask);
 }
 
 static void xfrm_hash_transfer(struct hlist_head *list,
@@ -95,9 +95,7 @@ static void xfrm_hash_transfer(struct hlist_head *list,
 		hlist_add_head(&x->bysrc, nsrctable+h);
 
 		if (x->id.spi) {
-			h = __xfrm_spi_hash(&x->id.daddr, x->id.spi,
-					    x->id.proto, x->props.family,
-					    nhashmask);
+			h = __xfrm_spi_hash(x->id.spi, x->id.proto, nhashmask);
 			hlist_add_head(&x->byspi, nspitable+h);
 		}
 	}
@@ -679,7 +677,7 @@ xfrm_init_tempstate(struct xfrm_state *x, struct flowi *fl,
 
 static struct xfrm_state *__xfrm_state_lookup(struct net *net, u32 mark, xfrm_address_t *daddr, __be32 spi, u8 proto, unsigned short family)
 {
-	unsigned int h = xfrm_spi_hash(net, daddr, spi, proto, family);
+	unsigned int h = xfrm_spi_hash(net, spi, proto);
 	struct xfrm_state *x;
 	struct hlist_node *entry;
 
@@ -868,7 +866,7 @@ found:
 			h = xfrm_src_hash(net, daddr, saddr, encap_family);
 			hlist_add_head(&x->bysrc, net->xfrm.state_bysrc+h);
 			if (x->id.spi) {
-				h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto, encap_family);
+				h = xfrm_spi_hash(net, x->id.spi, x->id.proto);
 				hlist_add_head(&x->byspi, net->xfrm.state_byspi+h);
 			}
 			x->lft.hard_add_expires_seconds = net->xfrm.sysctl_acq_expires;
@@ -942,9 +940,7 @@ static void __xfrm_state_insert(struct xfrm_state *x)
 	hlist_add_head(&x->bysrc, net->xfrm.state_bysrc+h);
 
 	if (x->id.spi) {
-		h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto,
-				  x->props.family);
-
+		h = xfrm_spi_hash(net, x->id.spi, x->id.proto);
 		hlist_add_head(&x->byspi, net->xfrm.state_byspi+h);
 	}
 
@@ -1535,7 +1531,7 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 high)
 	}
 	if (x->id.spi) {
 		spin_lock_bh(&xfrm_state_lock);
-		h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto, x->props.family);
+		h = xfrm_spi_hash(net, x->id.spi, x->id.proto);
 		hlist_add_head(&x->byspi, net->xfrm.state_byspi+h);
 		spin_unlock_bh(&xfrm_state_lock);
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv3 net-next-2.6 2/5] XFRM,IPv6: Introduce receive sockopts to access IRO remapped src/dst addresses
  2010-09-29  9:05 [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic Arnaud Ebalard
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 1/5] XFRM,IPv6: Remove xfrm_spi_hash() dependency on destination address Arnaud Ebalard
@ 2010-09-29  9:05 ` Arnaud Ebalard
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers Arnaud Ebalard
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Arnaud Ebalard @ 2010-09-29  9:05 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Herbert Xu, Hideaki YOSHIFUJI; +Cc: netdev


This patch introduces IRO recv sockopts, in order for userland processes
(e.g. UMIP) to access on-wire source or destination addresses found in
incoming (IPsec-protected) packets as they were before remapping by IRO.
The socket options are respectively IPV6_RECVIROSRC and IPV6_RECVIRODST.

Basically, the two recv socket options are similar in their purpose to
their generic RH2/HAO counterparts defined in RFC 3542 (IPV6_RECVIROSRC
<->  IPV6_RECVDSTOPTS, IPV6_RECVIRODST <-> IPV6_RECVRTHDR). They differ
on the following aspects:

 - IRO reporting sockopts only work on incoming IPsec-protected packets
   Userspace will never get IRO remapped address report for common
   (non protected) packets.
 - The receiver gets the original source/desination address (IRO
   remapping) from its IPsec stack.
 - as IRO sockopts only deal with addresses, no specific structure is
   defined, i.e. struct in6_addr is used to pass info.

As we only interact with IPsec protected packets, struct sec_path is
used to carry information (addresses) for incoming packets that have
undergone remapping process.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
---
 include/linux/in6.h      |    7 +++++++
 include/linux/ipv6.h     |    4 +++-
 include/net/xfrm.h       |    5 +++++
 net/ipv6/datagram.c      |   18 ++++++++++++++++++
 net/ipv6/ipv6_sockglue.c |   26 ++++++++++++++++++++++++++
 5 files changed, 59 insertions(+), 1 deletions(-)

diff --git a/include/linux/in6.h b/include/linux/in6.h
index c4bf46f..52a98ab 100644
--- a/include/linux/in6.h
+++ b/include/linux/in6.h
@@ -283,4 +283,11 @@ struct in6_flowlabel_req {
  * MRT6_PIM			208
  * (reserved)			209
  */
+
+/* IRO (IPsec Route Optimization) sockopts */
+#define IPV6_RECVIROSRC         74
+#define IPV6_IROSRC		75
+#define IPV6_RECVIRODST         76
+#define IPV6_IRODST		77
+
 #endif
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index e62683b..55289ee 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -341,7 +341,9 @@ struct ipv6_pinfo {
 				odstopts:1,
                                 rxflow:1,
 				rxtclass:1,
-				rxpmtu:1;
+				rxpmtu:1,
+				irosrc:1,
+				irodst:1;
 		} bits;
 		__u16		all;
 	} rxopt;
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 4f53532..e6a753c 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -909,6 +909,11 @@ struct sec_path {
 	atomic_t		refcnt;
 	int			len;
 	struct xfrm_state	*xvec[XFRM_MAX_DEPTH];
+
+#ifdef CONFIG_XFRM_SUB_POLICY
+	struct in6_addr         irosrc;
+	struct in6_addr         irodst;
+#endif
 };
 
 static inline struct sec_path *
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index ef371aa..2952c9e 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -29,6 +29,7 @@
 #include <net/transp_v6.h>
 #include <net/ip6_route.h>
 #include <net/tcp_states.h>
+#include <net/xfrm.h>
 
 #include <linux/errqueue.h>
 #include <asm/uaccess.h>
@@ -504,6 +505,23 @@ int datagram_recv_ctl(struct sock *sk, struct msghdr *msg, struct sk_buff *skb)
 		put_cmsg(msg, SOL_IPV6, IPV6_HOPOPTS, (ptr[1]+1)<<3, ptr);
 	}
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+	/* If access to IRO-remapped source or destination address has been
+	 * requested and it has indeed been remapped, provide the on-wire
+	 * address to userland */
+	if (skb_sec_path(skb)) {
+		struct sec_path *sp = skb_sec_path(skb);
+
+		if (np->rxopt.bits.irosrc && !ipv6_addr_any(&sp->irosrc))
+			put_cmsg(msg, SOL_IPV6, IPV6_IROSRC,
+				 sizeof(sp->irosrc), &sp->irosrc);
+
+		if (np->rxopt.bits.irodst && !ipv6_addr_any(&sp->irodst))
+			put_cmsg(msg, SOL_IPV6, IPV6_IRODST,
+				 sizeof(sp->irodst), &sp->irodst);
+	}
+#endif
+
 	if (opt->lastopt &&
 	    (np->rxopt.bits.dstopts || np->rxopt.bits.srcrt)) {
 		/*
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index a7f66bc..722a49f 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -302,6 +302,22 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		retv = 0;
 		break;
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+	case IPV6_RECVIROSRC:
+		if (optlen < sizeof(int))
+			goto e_inval;
+		np->rxopt.bits.irosrc = valbool;
+		retv = 0;
+		break;
+
+	case IPV6_RECVIRODST:
+		if (optlen < sizeof(int))
+			goto e_inval;
+		np->rxopt.bits.irodst = valbool;
+		retv = 0;
+		break;
+#endif
+
 	case IPV6_2292DSTOPTS:
 		if (optlen < sizeof(int))
 			goto e_inval;
@@ -1056,6 +1072,16 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		val = np->rxopt.bits.dstopts;
 		break;
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+	case IPV6_RECVIROSRC:
+		val = np->rxopt.bits.irosrc;
+		break;
+
+	case IPV6_RECVIRODST:
+		val = np->rxopt.bits.irodst;
+		break;
+#endif
+
 	case IPV6_2292DSTOPTS:
 		val = np->rxopt.bits.odstopts;
 		break;
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-09-29  9:05 [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic Arnaud Ebalard
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 1/5] XFRM,IPv6: Remove xfrm_spi_hash() dependency on destination address Arnaud Ebalard
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 2/5] XFRM,IPv6: Introduce receive sockopts to access IRO remapped src/dst addresses Arnaud Ebalard
@ 2010-09-29  9:05 ` Arnaud Ebalard
  2010-09-30  3:16   ` David Miller
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 4/5] XFRM,IPv6: Add IRO remapping hook in xfrm_input() Arnaud Ebalard
  2010-09-29  9:06 ` [PATCHv3 net-next-2.6 5/5] XFRM,IPv6: Add IRO remapping capability via socket ancillary data path Arnaud Ebalard
  4 siblings, 1 reply; 13+ messages in thread
From: Arnaud Ebalard @ 2010-09-29  9:05 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Herbert Xu, Hideaki YOSHIFUJI; +Cc: netdev


Add IRO source and destination remapping XFRM types and associated
input/output handlers. This allows userland to install such states
in order to support remapping of source or destination address
of packet. They basically work like existing RH2 and HAO ones; the
main difference is that output handlers do not expand the packet by
adding an extension header: they simply change the source or
destination in place. Input handlers are almost the same as RH2/HAO
version in their behavior, but they are triggered differently. RH2
and HAO handlers are triggered based on structures found in the
packet. On input, IRO states (and associated handlers) are looked
up when processing an IPsec-protected packet, when there is an
address mismatch.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
---
 include/net/xfrm.h       |    2 +
 net/ipv6/mip6.c          |  153 ++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/xfrm6_mode_ro.c |   11 +++-
 net/xfrm/xfrm_user.c     |    4 +
 4 files changed, 169 insertions(+), 1 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index e6a753c..05b2b1f 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -35,6 +35,8 @@
 #define XFRM_PROTO_IPV6		41
 #define XFRM_PROTO_ROUTING	IPPROTO_ROUTING
 #define XFRM_PROTO_DSTOPTS	IPPROTO_DSTOPTS
+#define XFRM_PROTO_IRO_SRC      127
+#define XFRM_PROTO_IRO_DST      128
 
 #define XFRM_ALIGN8(len)	(((len) + 7) & ~7)
 #define MODULE_ALIAS_XFRM_MODE(family, encap) \
diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index d6e9599..04b9e1d 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -477,6 +477,131 @@ static const struct xfrm_type mip6_rthdr_type =
 	.hdr_offset	= mip6_rthdr_offset,
 };
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+/* IRO equivalent of mip6_destopt_input(): handles incoming packet with a
+ * source address different from the one expected in the SA: check that
+ * received source address is indeed the CoA we expected (or any address
+ * if the state references the unspecified address '::') */
+static int mip6_iro_src_input(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+	int err = 1;
+
+	spin_lock(&x->lock);
+	if (!ipv6_addr_equal(&iph->saddr, (struct in6_addr *)x->coaddr) &&
+	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
+		err = -ENOENT;
+	spin_unlock(&x->lock);
+
+	return err;
+}
+
+/* IRO equivalent of mip6_destopt_output(): replaces current source address
+ * of outgoing packet by state's CoA. */
+static int mip6_iro_src_output(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+
+	spin_lock_bh(&x->lock);
+	memcpy(&iph->saddr, x->coaddr, sizeof(iph->saddr));
+	spin_unlock_bh(&x->lock);
+
+	return 0;
+}
+
+static int mip6_iro_src_reject(struct xfrm_state *x, struct sk_buff *skb, struct flowi *fl)
+{
+	int err = 0;
+
+	/* XXX We may need some reject handler at some point but it is not
+	 * critical yet: see xfrm_secpath_reject() in net/xfrm/xfrm_policy.c
+	 * and aslo what mip6_destopt_reject() implements */
+
+	printk("XXX FIXME: mip6_iro_src_reject() called\n");
+
+	return err;
+}
+
+/* This is the IRO equivalent of mip6_rthdr_input(): handles incoming packet
+ * with a destination address different from the one expected in the SA:
+ * check that received destination address is indeed the CoA we expected
+ * (or any address if the state references the unspecified address '::') */
+static int mip6_iro_dst_input(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+	int err = 1;
+
+	spin_lock(&x->lock);
+	if (!ipv6_addr_equal(&iph->daddr, (struct in6_addr *)x->coaddr) &&
+	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
+		err = -ENOENT;
+	spin_unlock(&x->lock);
+
+	return err;
+}
+
+/* IRO equivalent of mip6_rthdr_output(): replaces current destination
+ * address of outgoing packet with state's CoA */
+static int mip6_iro_dst_output(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct ipv6hdr *iph = ipv6_hdr(skb);
+
+	spin_lock_bh(&x->lock);
+	memcpy(&iph->daddr, x->coaddr, sizeof(iph->daddr));
+	spin_unlock_bh(&x->lock);
+
+	return 0;
+}
+
+/* Common to iro src and dst remapping states. */
+static int mip6_iro_init_state(struct xfrm_state *x)
+{
+	if (x->id.spi) {
+		printk(KERN_INFO "%s: spi is not 0: %u\n", __func__,
+		       x->id.spi);
+		return -EINVAL;
+	}
+	if (x->props.mode != XFRM_MODE_ROUTEOPTIMIZATION) {
+		printk(KERN_INFO "%s: state's mode is not %u: %u\n",
+		       __func__, XFRM_MODE_ROUTEOPTIMIZATION,
+		       x->props.mode);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/* Unlike common IPsec protocols, nothing to do when destroying */
+static void mip6_iro_destroy(struct xfrm_state *x)
+{
+}
+
+static const struct xfrm_type mip6_iro_src_type =
+{
+	.description	= "MIP6_IRO_SRC",
+	.owner		= THIS_MODULE,
+	.proto	     	= XFRM_PROTO_IRO_SRC,
+	.flags		= XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_LOCAL_COADDR,
+	.init_state	= mip6_iro_init_state,
+	.destructor	= mip6_iro_destroy,
+	.input		= mip6_iro_src_input,
+	.output		= mip6_iro_src_output,
+	.reject         = mip6_iro_src_reject,
+};
+
+static const struct xfrm_type mip6_iro_dst_type =
+{
+	.description	= "MIP6_IRO_DST",
+	.owner		= THIS_MODULE,
+	.proto	     	= XFRM_PROTO_IRO_DST,
+	.flags		= XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_REMOTE_COADDR,
+	.init_state	= mip6_iro_init_state,
+	.destructor	= mip6_iro_destroy,
+	.input		= mip6_iro_dst_input,
+	.output		= mip6_iro_dst_output,
+};
+#endif /* CONFIG_XFRM_SUB_POLICY */
+
 static int __init mip6_init(void)
 {
 	printk(KERN_INFO "Mobile IPv6\n");
@@ -489,6 +614,20 @@ static int __init mip6_init(void)
 		printk(KERN_INFO "%s: can't add xfrm type(rthdr)\n", __func__);
 		goto mip6_rthdr_xfrm_fail;
 	}
+
+#ifdef CONFIG_XFRM_SUB_POLICY
+	if (xfrm_register_type(&mip6_iro_src_type, AF_INET6) < 0) {
+		printk(KERN_INFO "%s: can't add xfrm type(IRO src remap)\n",
+		       __func__);
+		goto mip6_iro_src_remap_xfrm_fail;
+	}
+	if (xfrm_register_type(&mip6_iro_dst_type, AF_INET6) < 0) {
+		printk(KERN_INFO "%s: can't add xfrm type(IRO dst remap)\n",
+		       __func__);
+		goto mip6_iro_dst_remap_xfrm_fail;
+	}
+#endif
+
 	if (rawv6_mh_filter_register(mip6_mh_filter) < 0) {
 		printk(KERN_INFO "%s: can't add rawv6 mh filter\n", __func__);
 		goto mip6_rawv6_mh_fail;
@@ -498,6 +637,12 @@ static int __init mip6_init(void)
 	return 0;
 
  mip6_rawv6_mh_fail:
+#ifdef CONFIG_XFRM_SUB_POLICY
+	xfrm_unregister_type(&mip6_iro_dst_type, AF_INET6);
+ mip6_iro_dst_remap_xfrm_fail:
+	xfrm_unregister_type(&mip6_iro_src_type, AF_INET6);
+ mip6_iro_src_remap_xfrm_fail:
+#endif
 	xfrm_unregister_type(&mip6_rthdr_type, AF_INET6);
  mip6_rthdr_xfrm_fail:
 	xfrm_unregister_type(&mip6_destopt_type, AF_INET6);
@@ -509,6 +654,14 @@ static void __exit mip6_fini(void)
 {
 	if (rawv6_mh_filter_unregister(mip6_mh_filter) < 0)
 		printk(KERN_INFO "%s: can't remove rawv6 mh filter\n", __func__);
+#ifdef CONFIG_XFRM_SUB_POLICY
+	if (xfrm_unregister_type(&mip6_iro_dst_type, AF_INET6) < 0)
+		printk(KERN_INFO "%s: can't remove xfrm type(IRO dst remap)\n",
+		       __func__);
+	if (xfrm_unregister_type(&mip6_iro_src_type, AF_INET6) < 0)
+		printk(KERN_INFO "%s: can't remove xfrm type(IRO src remap)\n",
+		       __func__);
+#endif
 	if (xfrm_unregister_type(&mip6_rthdr_type, AF_INET6) < 0)
 		printk(KERN_INFO "%s: can't remove xfrm type(rthdr)\n", __func__);
 	if (xfrm_unregister_type(&mip6_destopt_type, AF_INET6) < 0)
diff --git a/net/ipv6/xfrm6_mode_ro.c b/net/ipv6/xfrm6_mode_ro.c
index 63d5d49..ea33178 100644
--- a/net/ipv6/xfrm6_mode_ro.c
+++ b/net/ipv6/xfrm6_mode_ro.c
@@ -45,6 +45,15 @@ static int xfrm6_ro_output(struct xfrm_state *x, struct sk_buff *skb)
 	u8 *prevhdr;
 	int hdr_len;
 
+	/* Unlike RH2 (IPPROTO_ROUTING) and HAO in DstOpt
+	 * (IPPROTO_DSTOPTS), IRO remapping states do not
+	 * add extension header to the packet. Source
+	 * and/or destination addresses are simply modified
+	 * in place. */
+	if (x->id.proto == XFRM_PROTO_IRO_SRC ||
+	    x->id.proto == XFRM_PROTO_IRO_DST)
+		goto out;
+
 	iph = ipv6_hdr(skb);
 
 	hdr_len = x->type->hdr_offset(x, skb, &prevhdr);
@@ -54,8 +63,8 @@ static int xfrm6_ro_output(struct xfrm_state *x, struct sk_buff *skb)
 	__skb_pull(skb, hdr_len);
 	memmove(ipv6_hdr(skb), iph, hdr_len);
 
+ out:
 	x->lastused = get_seconds();
-
 	return 0;
 }
 
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 8bae6b2..2aecd40 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -179,6 +179,10 @@ static int verify_newsa_info(struct xfrm_usersa_info *p,
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 	case IPPROTO_DSTOPTS:
 	case IPPROTO_ROUTING:
+#ifdef CONFIG_XFRM_SUB_POLICY
+	case XFRM_PROTO_IRO_SRC:
+	case XFRM_PROTO_IRO_DST:
+#endif
 		if (attrs[XFRMA_ALG_COMP]	||
 		    attrs[XFRMA_ALG_AUTH]	||
 		    attrs[XFRMA_ALG_AUTH_TRUNC]	||
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv3 net-next-2.6 4/5] XFRM,IPv6: Add IRO remapping hook in xfrm_input()
  2010-09-29  9:05 [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic Arnaud Ebalard
                   ` (2 preceding siblings ...)
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers Arnaud Ebalard
@ 2010-09-29  9:05 ` Arnaud Ebalard
  2010-09-30  3:17   ` David Miller
  2010-09-29  9:06 ` [PATCHv3 net-next-2.6 5/5] XFRM,IPv6: Add IRO remapping capability via socket ancillary data path Arnaud Ebalard
  4 siblings, 1 reply; 13+ messages in thread
From: Arnaud Ebalard @ 2010-09-29  9:05 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Herbert Xu, Hideaki YOSHIFUJI; +Cc: netdev


Add a hook in xfrm_input() to allow IRO remapping to occur when
an incoming packet matching an existing SA (based on SPI) with
an unexpected destination or source address is received.
Because IRO does not consume additional bits in a packet (that's
the point), there is no way to demultiplex based on something
like nh or spi. Instead, IRO input handlers (for source and
destination address remapping) are called upon address mismatch
during IPsec processing.
For that to work, we rely on the fact that SPI values generated
locally are no more linked to destination address (first patch
of the set) and we postpone a bit the expected address check in
xfrm_input() (inside xfrm_state_lookup() against daddr param) by
introducing a call to the input_addr_check() handler from the
struct xfrm_state_afinfo associated with the address family.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
---
 include/net/xfrm.h     |    5 +++
 net/ipv4/xfrm4_input.c |   12 ++++++++
 net/ipv4/xfrm4_state.c |    1 +
 net/ipv6/xfrm6_input.c |   70 +++++++++++++++++++++++++++++++++++++++++++++++-
 net/ipv6/xfrm6_state.c |    1 +
 net/xfrm/xfrm_input.c  |    5 ++-
 net/xfrm/xfrm_state.c  |    2 +-
 7 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 05b2b1f..5b84c19 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -312,6 +312,8 @@ struct xfrm_state_afinfo {
 						  struct sk_buff *skb);
 	int			(*transport_finish)(struct sk_buff *skb,
 						    int async);
+	int			(*input_addr_check)(struct sk_buff *skb,
+						    struct xfrm_state *x);
 };
 
 extern int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo);
@@ -623,6 +625,7 @@ struct xfrm_spi_skb_cb {
 		struct inet6_skb_parm h6;
 	} header;
 
+	unsigned int saddroff;
 	unsigned int daddroff;
 	unsigned int family;
 };
@@ -1405,6 +1408,7 @@ extern int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
 			   int encap_type);
 extern int xfrm4_transport_finish(struct sk_buff *skb, int async);
 extern int xfrm4_rcv(struct sk_buff *skb);
+extern int xfrm4_input_addr_check(struct sk_buff *skb, struct xfrm_state *x);
 
 static inline int xfrm4_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi)
 {
@@ -1423,6 +1427,7 @@ extern int xfrm6_transport_finish(struct sk_buff *skb, int async);
 extern int xfrm6_rcv(struct sk_buff *skb);
 extern int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
 			    xfrm_address_t *saddr, u8 proto);
+extern int xfrm6_input_addr_check(struct sk_buff *skb, struct xfrm_state *x);
 extern int xfrm6_tunnel_register(struct xfrm6_tunnel *handler, unsigned short family);
 extern int xfrm6_tunnel_deregister(struct xfrm6_tunnel *handler, unsigned short family);
 extern __be32 xfrm6_tunnel_alloc_spi(struct net *net, xfrm_address_t *saddr);
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index 06814b6..82e23ec 100644
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -41,6 +41,7 @@ int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
 		    int encap_type)
 {
 	XFRM_SPI_SKB_CB(skb)->family = AF_INET;
+	XFRM_SPI_SKB_CB(skb)->saddroff = offsetof(struct iphdr, saddr);
 	XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct iphdr, daddr);
 	return xfrm_input(skb, nexthdr, spi, encap_type);
 }
@@ -164,3 +165,14 @@ int xfrm4_rcv(struct sk_buff *skb)
 	return xfrm4_rcv_spi(skb, ip_hdr(skb)->protocol, 0);
 }
 EXPORT_SYMBOL(xfrm4_rcv);
+
+int xfrm4_input_addr_check(struct sk_buff *skb, struct xfrm_state *x)
+{
+	xfrm_address_t *daddr;
+
+	daddr = (xfrm_address_t *)(skb_network_header(skb) +
+				   XFRM_SPI_SKB_CB(skb)->daddroff);
+
+	return xfrm_addr_cmp(&x->id.daddr, daddr, AF_INET);
+}
+EXPORT_SYMBOL(xfrm4_input_addr_check);
diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c
index 4794762..c6b038a 100644
--- a/net/ipv4/xfrm4_state.c
+++ b/net/ipv4/xfrm4_state.c
@@ -79,6 +79,7 @@ static struct xfrm_state_afinfo xfrm4_state_afinfo = {
 	.extract_input		= xfrm4_extract_input,
 	.extract_output		= xfrm4_extract_output,
 	.transport_finish	= xfrm4_transport_finish,
+	.input_addr_check	= xfrm4_input_addr_check,
 };
 
 void __init xfrm4_state_init(void)
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index f8c3cf8..754ecf7 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -15,6 +15,7 @@
 #include <linux/netfilter_ipv6.h>
 #include <net/ipv6.h>
 #include <net/xfrm.h>
+#include <net/ip6_route.h> /* XXX for ip6_route_input() */
 
 int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb)
 {
@@ -24,6 +25,7 @@ int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb)
 int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi)
 {
 	XFRM_SPI_SKB_CB(skb)->family = AF_INET6;
+	XFRM_SPI_SKB_CB(skb)->saddroff = offsetof(struct ipv6hdr, saddr);
 	XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct ipv6hdr, daddr);
 	return xfrm_input(skb, nexthdr, spi, 0);
 }
@@ -142,5 +144,71 @@ int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
 drop:
 	return -1;
 }
-
 EXPORT_SYMBOL(xfrm6_input_addr);
+
+#if defined(CONFIG_XFRM_SUB_POLICY)
+/* Perform check on source and destination addresses and possibly IRO
+ * address remapping upon mismatch and if matching IRO state exists. */
+int xfrm6_input_addr_check(struct sk_buff *skb, struct xfrm_state *x)
+{
+	xfrm_address_t *saddr, *exp_saddr, *daddr, *exp_daddr;
+
+	saddr = (xfrm_address_t *)(skb_network_header(skb) +
+				   XFRM_SPI_SKB_CB(skb)->saddroff);
+	daddr = (xfrm_address_t *)(skb_network_header(skb) +
+				   XFRM_SPI_SKB_CB(skb)->daddroff);
+
+	exp_daddr = &x->id.daddr;
+	if (xfrm_addr_cmp(exp_daddr, daddr, AF_INET6)) {
+		/* Destination address mismatch: check if we have an IRO
+		 * destination remapping state to explain that.
+		 *
+		 * Note: saddr is provided as a hint. If source address
+		 * is also a remapped one, xfrm6_input_addr() will manage
+		 * to find IRO destination remapping state */
+		if (xfrm6_input_addr(skb, exp_daddr, saddr,
+				     XFRM_PROTO_IRO_DST) < 0)
+			return -1;
+
+		/* Copy destination address to sec_path for sock opts and
+		 * replace packet destination address with expected HoA */
+		ipv6_addr_copy(&skb->sp->irodst, (struct in6_addr *)daddr);
+		ipv6_addr_copy((struct in6_addr *)daddr,
+			       (struct in6_addr *)exp_daddr);
+
+		skb_dst_drop(skb);
+		ip6_route_input(skb);
+		if (skb_dst(skb)->error)
+			return -1;
+	}
+
+	exp_saddr = &x->props.saddr;
+	if (xfrm_addr_cmp(exp_saddr, saddr, AF_INET6)) {
+		/* Source address mismatch: check if we have an IRO
+		 * source remapping state to explain that.
+		 *
+		 * Note: unlike for destination addresses above, a
+		 * source mismatch is not considered fatal */
+		if (xfrm6_input_addr(skb, daddr, exp_saddr,
+				     XFRM_PROTO_IRO_SRC) < 0)
+			return 0;
+
+		/* Copy destination address to sec_path for sock opts and
+		 * then replace source address with expected peer's HoA */
+		ipv6_addr_copy(&skb->sp->irosrc, (struct in6_addr *)saddr);
+		ipv6_addr_copy((struct in6_addr *)saddr,
+			       (struct in6_addr *)exp_saddr);
+	}
+
+	return 0;
+}
+#else
+int xfrm6_input_addr_check(struct sk_buff *skb, struct xfrm_state *x)
+{
+	xfrm_address_t *daddr;
+	daddr = (xfrm_address_t *)(skb_network_header(skb) +
+				   XFRM_SPI_SKB_CB(skb)->daddroff);
+	return xfrm_addr_cmp(&x->id.daddr, daddr, AF_INET6);
+}
+#endif
+EXPORT_SYMBOL(xfrm6_input_addr_check);
diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index a67575d..aeb4688 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -179,6 +179,7 @@ static struct xfrm_state_afinfo xfrm6_state_afinfo = {
 	.extract_input		= xfrm6_extract_input,
 	.extract_output		= xfrm6_extract_output,
 	.transport_finish	= xfrm6_transport_finish,
+	.input_addr_check	= xfrm6_input_addr_check,
 };
 
 int __init xfrm6_state_init(void)
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 45f1c98..9ff65f6 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -152,8 +152,9 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 			goto drop;
 		}
 
-		x = xfrm_state_lookup(net, skb->mark, daddr, spi, nexthdr, family);
-		if (x == NULL) {
+		x = xfrm_state_lookup(net, skb->mark, NULL, spi, nexthdr, family);
+		if (x == NULL ||
+		    x->outer_mode->afinfo->input_addr_check(skb, x)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
 			xfrm_audit_state_notfound(skb, family, spi, seq);
 			goto drop;
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index b6a4d8d..b8f7c08 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -685,7 +685,7 @@ static struct xfrm_state *__xfrm_state_lookup(struct net *net, u32 mark, xfrm_ad
 		if (x->props.family != family ||
 		    x->id.spi       != spi ||
 		    x->id.proto     != proto ||
-		    xfrm_addr_cmp(&x->id.daddr, daddr, family))
+		    (daddr && xfrm_addr_cmp(&x->id.daddr, daddr, family)))
 			continue;
 
 		if ((mark & x->mark.m) != x->mark.v)
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv3 net-next-2.6 5/5] XFRM,IPv6: Add IRO remapping capability via socket ancillary data path
  2010-09-29  9:05 [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic Arnaud Ebalard
                   ` (3 preceding siblings ...)
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 4/5] XFRM,IPv6: Add IRO remapping hook in xfrm_input() Arnaud Ebalard
@ 2010-09-29  9:06 ` Arnaud Ebalard
  4 siblings, 0 replies; 13+ messages in thread
From: Arnaud Ebalard @ 2010-09-29  9:06 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Herbert Xu, Hideaki YOSHIFUJI; +Cc: netdev


This provides the ability to remap src/dst address using IRO
via ancillary data passed to sockets. This is the IRO equivalent
of what is done for RH2/HAO (i.e. IPV6_RTHDR/IPV6_DSTOPTS).
This is used by UMIP during BA emission when acting as a Home
Agent.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
---
 include/net/ipv6.h       |    4 ++++
 net/ipv6/datagram.c      |   20 ++++++++++++++++++++
 net/ipv6/exthdrs.c       |   19 +++++++++++++------
 net/ipv6/ip6_flowlabel.c |    7 +++++++
 net/ipv6/ip6_output.c    |   22 ++++++++++++++++++++++
 net/ipv6/raw.c           |    3 ++-
 6 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 4a3cd2c..2ba96d8 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -188,6 +188,10 @@ struct ipv6_txoptions {
 	struct ipv6_rt_hdr	*srcrt;	/* Routing Header */
 	struct ipv6_opt_hdr	*dst1opt;
 
+	/* XXX protect those via some ifdef e.g. CONFIG_XFRM_SUB_POLICY ? */
+	struct in6_addr         *iro_src;
+	struct in6_addr         *iro_dst;
+
 	/* Option buffer, as read by IPV6_PKTOPTIONS, starts here. */
 };
 
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2952c9e..0ac7adf 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -757,6 +757,26 @@ int datagram_send_ctl(struct net *net,
 			}
 			break;
 
+#ifdef CONFIG_XFRM_SUB_POLICY
+		case IPV6_IROSRC:
+		case IPV6_IRODST:
+			if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct in6_addr))) {
+				err = -EINVAL;
+				goto exit_f;
+			}
+
+			if (!capable(CAP_NET_RAW)) {
+				err = -EPERM;
+				goto exit_f;
+			}
+
+			if (cmsg->cmsg_type == IPV6_IROSRC)
+				opt->iro_src = (struct in6_addr *)CMSG_DATA(cmsg);
+			else
+				opt->iro_dst = (struct in6_addr *)CMSG_DATA(cmsg);
+			break;
+#endif
+
 		case IPV6_2292RTHDR:
 		case IPV6_RTHDR:
 			if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct ipv6_rt_hdr))) {
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 262f105..e480b06 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -750,6 +750,10 @@ ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
 			*((char**)&opt2->dst1opt) += dif;
 		if (opt2->srcrt)
 			*((char**)&opt2->srcrt) += dif;
+		if (opt2->iro_src)
+			*((char**)&opt2->iro_src) += dif;
+		if (opt2->iro_dst)
+			*((char**)&opt2->iro_dst) += dif;
 	}
 	return opt2;
 }
@@ -874,24 +878,27 @@ struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 
 /**
  * fl6_update_dst - update flowi destination address with info given
- *                  by srcrt option, if any.
+ *                  by srcrt/iro_dst option, if any.
  *
  * @fl: flowi for which fl6_dst is to be updated
- * @opt: struct ipv6_txoptions in which to look for srcrt opt
+ * @opt: struct ipv6_txoptions in which to look for srcrt/iro_dst opt
  * @orig: copy of original fl6_dst address if modified
  *
- * Returns NULL if no txoptions or no srcrt, otherwise returns orig
- * and initial value of fl->fl6_dst set in orig
+ * Returns NULL if no txoptions or no options to change flowi destination
+ * (srcrt or IRO destination remapping rule), otherwise returns orig and
+ * initial value of fl->fl6_dst set in orig
  */
 struct in6_addr *fl6_update_dst(struct flowi *fl,
 				const struct ipv6_txoptions *opt,
 				struct in6_addr *orig)
 {
-	if (!opt || !opt->srcrt)
+	if (!opt || (!opt->srcrt && !opt->iro_dst))
 		return NULL;
 
 	ipv6_addr_copy(orig, &fl->fl6_dst);
-	ipv6_addr_copy(&fl->fl6_dst, ((struct rt0_hdr *)opt->srcrt)->addr);
+	ipv6_addr_copy(&fl->fl6_dst,
+		       opt->iro_dst ?: ((struct rt0_hdr *)opt->srcrt)->addr);
+
 	return orig;
 }
 
diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
index 1365468..dbf9c29 100644
--- a/net/ipv6/ip6_flowlabel.c
+++ b/net/ipv6/ip6_flowlabel.c
@@ -280,6 +280,9 @@ struct ipv6_txoptions *fl6_merge_options(struct ipv6_txoptions * opt_space,
 		opt_space->hopopt = fl_opt->hopopt;
 		opt_space->dst0opt = fl_opt->dst0opt;
 		opt_space->srcrt = fl_opt->srcrt;
+		/* XXX protect those via some ifdef - see net/ipv6.h */
+		opt_space->iro_src = fl_opt->iro_src;
+		opt_space->iro_dst = fl_opt->iro_dst;
 		opt_space->opt_nflen = fl_opt->opt_nflen;
 	} else {
 		if (fopt->opt_nflen == 0)
@@ -287,6 +290,9 @@ struct ipv6_txoptions *fl6_merge_options(struct ipv6_txoptions * opt_space,
 		opt_space->hopopt = NULL;
 		opt_space->dst0opt = NULL;
 		opt_space->srcrt = NULL;
+		/* XXX protect those via some ifdef - see net/ipv6.h */
+		opt_space->iro_src = NULL;
+		opt_space->iro_dst = NULL;
 		opt_space->opt_nflen = 0;
 	}
 	opt_space->dst1opt = fopt->dst1opt;
@@ -456,6 +462,7 @@ static int ipv6_opt_cmp(struct ipv6_txoptions *o1, struct ipv6_txoptions *o2)
 		return 1;
 	if (ipv6_hdr_cmp((struct ipv6_opt_hdr *)o1->srcrt, (struct ipv6_opt_hdr *)o2->srcrt))
 		return 1;
+
 	return 0;
 }
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 99157b4..210f269 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -222,6 +222,8 @@ int ip6_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,
 			ipv6_push_frag_opts(skb, opt, &proto);
 		if (opt->opt_nflen)
 			ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop);
+		if (opt->iro_dst)
+			first_hop = opt->iro_dst;
 	}
 
 	skb_push(skb, sizeof(struct ipv6hdr));
@@ -1106,6 +1108,12 @@ static inline struct ipv6_rt_hdr *ip6_rthdr_dup(struct ipv6_rt_hdr *src,
 	return src ? kmemdup(src, (src->hdrlen + 1) * 8, gfp) : NULL;
 }
 
+static inline struct in6_addr *ip6_iro_addr_dup(struct in6_addr *addr,
+						gfp_t gfp)
+{
+	return addr ? kmemdup(addr, sizeof(struct in6_addr), gfp) : NULL;
+}
+
 int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	int offset, int len, int odd, struct sk_buff *skb),
 	void *from, int length, int transhdrlen,
@@ -1162,6 +1170,16 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 			if (opt->srcrt && !np->cork.opt->srcrt)
 				return -ENOBUFS;
 
+			np->cork.opt->iro_src = ip6_iro_addr_dup(opt->iro_src,
+								 sk->sk_allocation);
+			if (opt->iro_src && !np->cork.opt->iro_src)
+				return -ENOBUFS;
+
+			np->cork.opt->iro_dst = ip6_iro_addr_dup(opt->iro_dst,
+								 sk->sk_allocation);
+			if (opt->iro_dst && !np->cork.opt->iro_dst)
+				return -ENOBUFS;
+
 			/* need source address above miyazawa*/
 		}
 		dst_hold(&rt->dst);
@@ -1440,6 +1458,8 @@ static void ip6_cork_release(struct inet_sock *inet, struct ipv6_pinfo *np)
 		kfree(np->cork.opt->dst1opt);
 		kfree(np->cork.opt->hopopt);
 		kfree(np->cork.opt->srcrt);
+		kfree(np->cork.opt->iro_src);
+		kfree(np->cork.opt->iro_dst);
 		kfree(np->cork.opt);
 		np->cork.opt = NULL;
 	}
@@ -1495,6 +1515,8 @@ int ip6_push_pending_frames(struct sock *sk)
 		ipv6_push_frag_opts(skb, opt, &proto);
 	if (opt && opt->opt_nflen)
 		ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst);
+	if (opt && opt->iro_dst)
+		final_dst = opt->iro_dst;
 
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 45e6efb..1a11bd5 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -828,7 +828,8 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 			if (flowlabel == NULL)
 				return -EINVAL;
 		}
-		if (!(opt->opt_nflen|opt->opt_flen))
+		if (!(opt->opt_nflen|opt->opt_flen) &&
+		    (!opt->iro_src && !opt->iro_dst))
 			opt = NULL;
 	}
 	if (opt == NULL)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers Arnaud Ebalard
@ 2010-09-30  3:16   ` David Miller
  2010-10-02 10:17     ` Arnaud Ebalard
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2010-09-30  3:16 UTC (permalink / raw)
  To: arno; +Cc: eric.dumazet, herbert, yoshfuji, netdev

From: Arnaud Ebalard <arno@natisbad.org>
Date: Wed, 29 Sep 2010 11:05:47 +0200

> +static int mip6_iro_src_reject(struct xfrm_state *x, struct sk_buff *skb, struct flowi *fl)
> +{
> +	int err = 0;
> +
> +	/* XXX We may need some reject handler at some point but it is not
> +	 * critical yet: see xfrm_secpath_reject() in net/xfrm/xfrm_policy.c
> +	 * and aslo what mip6_destopt_reject() implements */
> +
> +	printk("XXX FIXME: mip6_iro_src_reject() called\n");

pr_debug() or pr_err() or get rid of it altogher and use WARN_ON() or
similar.

> +	spin_lock(&x->lock);
> +	if (!ipv6_addr_equal(&iph->daddr, (struct in6_addr *)x->coaddr) &&
> +	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
> +		err = -ENOENT;
> +	spin_unlock(&x->lock);

What are you actually protecting with this lock?  The moment you drop
it the x->coaddr can change which changes the result you should return
here.

I suspect you either don't need the lock, or you need to lock at a higher
level.

> +		printk(KERN_INFO "%s: spi is not 0: %u\n", __func__,

pr_info()

> +		printk(KERN_INFO "%s: state's mode is not %u: %u\n",

pr_info()

> +		       __func__, XFRM_MODE_ROUTEOPTIMIZATION,

Printing decimal values for CPP macro constants does not make log
messages very readable.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 4/5] XFRM,IPv6: Add IRO remapping hook in xfrm_input()
  2010-09-29  9:05 ` [PATCHv3 net-next-2.6 4/5] XFRM,IPv6: Add IRO remapping hook in xfrm_input() Arnaud Ebalard
@ 2010-09-30  3:17   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2010-09-30  3:17 UTC (permalink / raw)
  To: arno; +Cc: eric.dumazet, herbert, yoshfuji, netdev

From: Arnaud Ebalard <arno@natisbad.org>
Date: Wed, 29 Sep 2010 11:05:59 +0200

> +EXPORT_SYMBOL(xfrm4_input_addr_check);

> +EXPORT_SYMBOL(xfrm6_input_addr_check);

net/ipv{4,6}/xfrm{4,6}_{state,input}.c will be built together as a
group, so there is no need to export the address check symbol to
modules.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-09-30  3:16   ` David Miller
@ 2010-10-02 10:17     ` Arnaud Ebalard
  2010-10-02 10:32       ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaud Ebalard @ 2010-10-02 10:17 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, herbert, yoshfuji, netdev

Hi,

David Miller <davem@davemloft.net> writes:

>> +static int mip6_iro_src_reject(struct xfrm_state *x, struct sk_buff *skb, struct flowi *fl)
>> +{
>> +	int err = 0;
>> +
>> +	/* XXX We may need some reject handler at some point but it is not
>> +	 * critical yet: see xfrm_secpath_reject() in net/xfrm/xfrm_policy.c
>> +	 * and aslo what mip6_destopt_reject() implements */
>> +
>> +	printk("XXX FIXME: mip6_iro_src_reject() called\n");
>
> pr_debug() or pr_err() or get rid of it altogher and use WARN_ON() or
> similar.

I will take a look at this reject handler tomorrow (implement or remove it).


>> +	spin_lock(&x->lock);
>> +	if (!ipv6_addr_equal(&iph->daddr, (struct in6_addr *)x->coaddr) &&
>> +	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
>> +		err = -ENOENT;
>> +	spin_unlock(&x->lock);
>
> What are you actually protecting with this lock?  The moment you drop
> it the x->coaddr can change which changes the result you should return
> here.
>
> I suspect you either don't need the lock, or you need to lock at a higher
> level.

I basically trusted RH2 input handler code and reused it as a basis:

  static int mip6_rthdr_input(struct xfrm_state *x, struct sk_buff *skb)
  {
  	struct ipv6hdr *iph = ipv6_hdr(skb);
  	struct rt2_hdr *rt2 = (struct rt2_hdr *)skb->data;
  	int err = rt2->rt_hdr.nexthdr;
  
  	spin_lock(&x->lock);
  	if (!ipv6_addr_equal(&iph->daddr, (struct in6_addr *)x->coaddr) &&
  	    !ipv6_addr_any((struct in6_addr *)x->coaddr))
  		err = -ENOENT;
  	spin_unlock(&x->lock);
  
  	return err;
  }

*At that time*, I considered the lock useful to prevent changes on coaddr
during the two checks, i.e. to make it coherent. But I think you are
right and I see no reason for the lock not to be at a higher level:
I may have missed somthing but AFAICT, from a look at the code, there is
nothing preventing x->coaddr to  be updated (via xfrm_sa_update()) just
before or just after the checks.

I took a look at the callers for mip6 handlers and if I am not mistaken
there is *only* xfrm6_input_addr() because xfrm_input() only handles
esp, ah and ipcomp extension headers and not mip6-related ones
(i.e. only IPsec-related ones, those with a SPI). Here is a snippet of
the interesting (lock-wise) part of xfrm6_input_addr():

>	for (i = 0; i < 3; i++) {
>
>               <....snip....>
>
> 		spin_lock(&x->lock);
> 
> 		if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
> 		    likely(x->km.state == XFRM_STATE_VALID) &&
> 		    !xfrm_state_check_expire(x)) {
> 			spin_unlock(&x->lock);
> 			if (x->type->input(x, skb) > 0) {
> 				/* found a valid state */
> 				break;
> 			}
> 		} else
> 			spin_unlock(&x->lock);
> 
> 		xfrm_state_put(x);
> 		x = NULL;
> 	}
> 
> 	if (!x) {
> 		XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
> 		xfrm_audit_state_notfound_simple(skb, AF_INET6);
> 		goto drop;
> 	}
> 
> 	skb->sp->xvec[skb->sp->len++] = x;
> 
> 	spin_lock(&x->lock);
> 
> 	x->curlft.bytes += skb->len;
> 	x->curlft.packets++;
> 
> 	spin_unlock(&x->lock);

and I see no reason not to keep the lock we have on the state until the
end of the function when the state is valid (when we break), instead of
releasing it to get it again later. Something like the following would
allow removing the spin_lock()/spin_unlock() calls from all mip6 input
handlers (mip6_{destopt,rthdr,iro_src,iro_dst}_input()):

> 		spin_lock(&x->lock);
>
> 		if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
> 		    likely(x->km.state == XFRM_STATE_VALID) &&
> 		    !xfrm_state_check_expire(x)) {
> 			if (x->type->input(x, skb) > 0) {
> 				/* found a valid state */
> 				break;
> 			} 
> 		}
>
> 		spin_unlock(&x->lock);
>
> 		xfrm_state_put(x);
> 		x = NULL;
> 	}
> 
> 	if (!x) {
> 		XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
> 		xfrm_audit_state_notfound_simple(skb, AF_INET6);
> 		goto drop;
> 	}
> 
> 	skb->sp->xvec[skb->sp->len++] = x;
> 
> 	x->curlft.bytes += skb->len;
> 	x->curlft.packets++;
> 
> 	spin_unlock(&x->lock);

If this is ok, I will add a patch to the set to do that and also remove
the locks from the input handlers I introduce.

Cheers,

a+

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-10-02 10:17     ` Arnaud Ebalard
@ 2010-10-02 10:32       ` Herbert Xu
  2010-10-03 13:41         ` Arnaud Ebalard
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2010-10-02 10:32 UTC (permalink / raw)
  To: Arnaud Ebalard; +Cc: David Miller, eric.dumazet, yoshfuji, netdev

On Sat, Oct 02, 2010 at 12:17:35PM +0200, Arnaud Ebalard wrote:
>
> and I see no reason not to keep the lock we have on the state until the
> end of the function when the state is valid (when we break), instead of
> releasing it to get it again later. Something like the following would
> allow removing the spin_lock()/spin_unlock() calls from all mip6 input
> handlers (mip6_{destopt,rthdr,iro_src,iro_dst}_input()):

No I moved the state lock down precisely because it should not
be taken at a higher level as that breaks asynchronous IPsec
processing and the fact that it isn't needed in most places.

If your code needs it then you should take it rather than impose
it on real IPsec users.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-10-02 10:32       ` Herbert Xu
@ 2010-10-03 13:41         ` Arnaud Ebalard
  2010-10-03 15:12           ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaud Ebalard @ 2010-10-03 13:41 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, eric.dumazet, yoshfuji, netdev

Hi Herbert,

Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Sat, Oct 02, 2010 at 12:17:35PM +0200, Arnaud Ebalard wrote:
>>
>> and I see no reason not to keep the lock we have on the state until the
>> end of the function when the state is valid (when we break), instead of
>> releasing it to get it again later. Something like the following would
>> allow removing the spin_lock()/spin_unlock() calls from all mip6 input
>> handlers (mip6_{destopt,rthdr,iro_src,iro_dst}_input()):
>
> No I moved the state lock down precisely because it should not
> be taken at a higher level as that breaks asynchronous IPsec
> processing and the fact that it isn't needed in most places.
>
> If your code needs it then you should take it rather than impose
> it on real IPsec users.

Understood. Note that I am on your side with this: my primary concern
while pushing the feature is *to not break or slow down standard IPsec*.
I do not expect my code to be accepted or even read otherwise.

As for the current point raised by David on the position of the locks in
my input handlers, they are based on the position of the locks in the
*existing* RH2 (mip6_rthdr_input()) and HAO (mip6_destopt_input())
handlers. As they serve the same purpose (src/dst address check against
state's address) and the code is basically the same, I have no reason to
do things differently as what is currently upstream.

After your reply, I took a (too long) look at the history of
xfrm6_input_addr() to understand why it is as it is. If it can spare you
some time, here is what I think happened:

 - Initially (commit fbd9a5b4, Aug 23 2006), the checks on the status of
   state, the call to x->type->input() and the changes on state's
   processing stats (x->curlft changes) were *globally* protected by a
   call to spin_lock(). The same day, a related commit (3d126890) added
   support for RH2/HAO input handler. No lock inside the handler. The
   content of xfrm6_input_addr() was:

		spin_lock(&x->lock);

                <...snip...>

		nh = x->type->input(x, skb);
		if (nh <= 0) {
			spin_unlock(&x->lock);
			xfrm_state_put(x);
			x = NULL;
			continue;
		}

		x->curlft.bytes += skb->len;
		x->curlft.packets++;

		spin_unlock(&x->lock);

 - Then, as you wrote, the state lock was moved in all input handlers
   (commit 0ebea8ef, Nov 13 2007), including RH2/HAO ones:

   @@ -128,12 +128,15 @@ static int mip6_destopt_input(struct xfrm_state *x, struct sk_buff *skb)
    {
           struct ipv6hdr *iph = ipv6_hdr(skb);
           struct ipv6_destopt_hdr *destopt = (struct ipv6_destopt_hdr *)skb->data;
   +       int err = destopt->nexthdr;
    
   +       spin_lock(&x->lock);
           if (!ipv6_addr_equal(&iph->saddr, (struct in6_addr *)x->coaddr) &&
               !ipv6_addr_any((struct in6_addr *)x->coaddr))
   -               return -ENOENT;
   +               err = -ENOENT;
   +       spin_unlock(&x->lock);
    
   -       return destopt->nexthdr;
   +       return err;
    }

   With that commit, I think a deadlock was introduced in MIPv6 code
   because xfm6_input_addr() was left unchanged, i.e. x->type->input()
   was called with the lock held. Am I correct?

 - The code of xfrm6_input_addr() was then optimized by commit a002c6fd
   in such a way that x->type->input() was then put outside the
   protection of the lock, which (if I am not mistaken) removed the
   deadlock: 

	spin_lock(&x->lock);

	if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
	    likely(x->km.state == XFRM_STATE_VALID) &&
	    !xfrm_state_check_expire(x)) {
		spin_unlock(&x->lock);
		if (x->type->input(x, skb) > 0) {
			/* found a valid state */
			break;
		}
	} else
		spin_unlock(&x->lock);

   I don't know if this is was intentional.

   But the main question remains on the position of the lock. Here,
   checks are done on the status of the state, lock is released,
   reacquired in the input handler to do additional check and then
   released again, to be reacquired later in the function to act on
   statistics. Is my reading of the code correct?


Herbert, you certainly have a better understanding of XFRM code than I
have and can probably tell if the locking behavior above is valid or
buggy. Yoshifuji-san, David or Eric may also have good ideas on that.


As a side note (I think I was not explicit enough in my previous email),
I think the possible changes to xfrm_input_addr() and MIPv6 handlers we
are discussing are not expected to impact standard IPsec code because
there are 2 different cases in which states input handlers are called 
(i.e. x->type->input()):

 - xfrm_input(): for standard IPsec case (incl. async resumption). This 
   is only for esp, ah, ipcomp and tunneling.
 - xfrm6_input_addr(): for MIPv6 extension header, i.e. RH2 and HAO in
   destopt.

and we are discussing the second.


David, as for my patches, if this is ok for you, I will keep the code of
my input handlers aligned on the code of RH2/HAO handlers and will modify
it later based on the possible corrections made on those upstream.

Don't hesitate to slap me if I made some mistakes in my analysis ;-)

Cheers,

a+

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-10-03 13:41         ` Arnaud Ebalard
@ 2010-10-03 15:12           ` Herbert Xu
  2010-10-03 21:25             ` Arnaud Ebalard
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2010-10-03 15:12 UTC (permalink / raw)
  To: Arnaud Ebalard; +Cc: David Miller, eric.dumazet, yoshfuji, netdev

On Sun, Oct 03, 2010 at 03:41:04PM +0200, Arnaud Ebalard wrote:
>
> After your reply, I took a (too long) look at the history of
> xfrm6_input_addr() to understand why it is as it is. If it can spare you
> some time, here is what I think happened:

...

>  - Then, as you wrote, the state lock was moved in all input handlers
>    (commit 0ebea8ef, Nov 13 2007), including RH2/HAO ones:

...

>    With that commit, I think a deadlock was introduced in MIPv6 code
>    because xfm6_input_addr() was left unchanged, i.e. x->type->input()
>    was called with the lock held. Am I correct?
> 
>  - The code of xfrm6_input_addr() was then optimized by commit a002c6fd
>    in such a way that x->type->input() was then put outside the
>    protection of the lock, which (if I am not mistaken) removed the
>    deadlock: 

...

>    I don't know if this is was intentional.

Indeed MIPv6 was completely out of action for three months and
nobody noticed :)

>    But the main question remains on the position of the lock. Here,
>    checks are done on the status of the state, lock is released,
>    reacquired in the input handler to do additional check and then
>    released again, to be reacquired later in the function to act on
>    statistics. Is my reading of the code correct?

When I moved the lock down I chose the safest option and added
it to every single input function.  So it may well be the case
that the lock isn't needed at all on the MIPv6 path.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers
  2010-10-03 15:12           ` Herbert Xu
@ 2010-10-03 21:25             ` Arnaud Ebalard
  0 siblings, 0 replies; 13+ messages in thread
From: Arnaud Ebalard @ 2010-10-03 21:25 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, eric.dumazet, yoshfuji, netdev

Hello,

Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Sun, Oct 03, 2010 at 03:41:04PM +0200, Arnaud Ebalard wrote:
>>
>> After your reply, I took a (too long) look at the history of
>> xfrm6_input_addr() to understand why it is as it is. If it can spare you
>> some time, here is what I think happened:
>
> ...
>
>>  - Then, as you wrote, the state lock was moved in all input handlers
>>    (commit 0ebea8ef, Nov 13 2007), including RH2/HAO ones:
>
> ...
>
>>    With that commit, I think a deadlock was introduced in MIPv6 code
>>    because xfm6_input_addr() was left unchanged, i.e. x->type->input()
>>    was called with the lock held. Am I correct?
>> 
>>  - The code of xfrm6_input_addr() was then optimized by commit a002c6fd
>>    in such a way that x->type->input() was then put outside the
>>    protection of the lock, which (if I am not mistaken) removed the
>>    deadlock: 
>
> ...
>
>>    I don't know if this is was intentional.
>
> Indeed MIPv6 was completely out of action for three months and
> nobody noticed :)

hehe ;-) Just to correct a missing waypoint in my history, which is in
fact the real fix for the deadlock:

   commit 9473e1f631de339c50bde1e3bd09e1045fe90fd5
   Author: Masahide NAKAMURA <nakam@linux-ipv6.org>
   Date:   Thu Dec 20 20:41:57 2007 -0800
   
       [XFRM] MIPv6: Fix to input RO state correctly.
       
       Disable spin_lock during xfrm_type.input() function.
       Follow design as IPsec inbound does.
    
       Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
       Signed-off-by: David S. Miller <davem@davemloft.net>

>>    But the main question remains on the position of the lock. Here,
>>    checks are done on the status of the state, lock is released,
>>    reacquired in the input handler to do additional check and then
>>    released again, to be reacquired later in the function to act on
>>    statistics. Is my reading of the code correct?
>
> When I moved the lock down I chose the safest option and added
> it to every single input function.  So it may well be the case
> that the lock isn't needed at all on the MIPv6 path.

I don't have any technical argument to support the removal of the locks,
i.e. don't see what would prevent changes during the check. I will try
and spend more time on it, but meanwhile I think it's safe to keep
things the way they are.

Thanks for your time, Herbert.

Cheers,

a+

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2010-10-03 21:25 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-29  9:05 [PATCHv3 net-next-2.6 0/5] XFRM,IPv6: Removal of RH2/HAO from IPsec-protected MIPv6 traffic Arnaud Ebalard
2010-09-29  9:05 ` [PATCHv3 net-next-2.6 1/5] XFRM,IPv6: Remove xfrm_spi_hash() dependency on destination address Arnaud Ebalard
2010-09-29  9:05 ` [PATCHv3 net-next-2.6 2/5] XFRM,IPv6: Introduce receive sockopts to access IRO remapped src/dst addresses Arnaud Ebalard
2010-09-29  9:05 ` [PATCHv3 net-next-2.6 3/5] XFRM,IPv6: Add IRO src/dst address remapping XFRM types and i/o handlers Arnaud Ebalard
2010-09-30  3:16   ` David Miller
2010-10-02 10:17     ` Arnaud Ebalard
2010-10-02 10:32       ` Herbert Xu
2010-10-03 13:41         ` Arnaud Ebalard
2010-10-03 15:12           ` Herbert Xu
2010-10-03 21:25             ` Arnaud Ebalard
2010-09-29  9:05 ` [PATCHv3 net-next-2.6 4/5] XFRM,IPv6: Add IRO remapping hook in xfrm_input() Arnaud Ebalard
2010-09-30  3:17   ` David Miller
2010-09-29  9:06 ` [PATCHv3 net-next-2.6 5/5] XFRM,IPv6: Add IRO remapping capability via socket ancillary data path Arnaud Ebalard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).