Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v3 ipsec-next 3/3] xfrm: wrap xfrmdev_ops with offload config
From: Marcelo Ricardo Leitner @ 2017-12-20 17:20 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: steffen.klassert, netdev
In-Reply-To: <eba8e3f2-f9f3-fe6a-5bfa-e4532ca478c1@oracle.com>

On Wed, Dec 20, 2017 at 08:22:40AM -0800, Shannon Nelson wrote:
> On 12/20/2017 8:03 AM, Marcelo Ricardo Leitner wrote:
> > On Tue, Dec 19, 2017 at 03:35:49PM -0800, Shannon Nelson wrote:
> > > There's no reason to define netdev->xfrmdev_ops if
> > > the offload facility is not CONFIG'd in.
> > > 
> > > Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> > 
> > This one could use a Fixes tag perhaps:
> > Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
> > 
> > as in theory the build was broken since then, as it added:
> > +#ifdef CONFIG_XFRM_OFFLOAD
> > +struct xfrmdev_ops {
> > ...
> > +#ifdef CONFIG_XFRM
> > +       const struct xfrmdev_ops *xfrmdev_ops;
> > 
> > So the pointer would have an undefined type
> >    if CONFIG_XFRM && !CONFIG_XFRM_OFFLOAD
> > Though I couldn't reproduce this, not sure why.
> 
> Hmmm, I don't think this requires a "Fixes" tag, as the code all worked just
> fine, I'm just doing a little cleaning.

I still don't get how it works, but okay.

> 
> Patch 2/3 adds a more intense look at the data structure, so I needed to
> change it to the CONFIG_XFRM_OFFLOAD so as to not break the build. Since the
> xfrmdev_ops field is now never used unless we have CONFIG_XFRM_OFFLOAD, we
> can change the net_device definition to be just a bit smaller without it.
> 
> > 
> > But.. is it buildable with this patch? I mine failed:
> > 
> > obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_hash.o \
> >                        xfrm_input.o xfrm_output.o \
> >                        xfrm_sysctl.o xfrm_replay.o xfrm_device.o
> > 
> > so xfrm_device is always in if CONFIG_XFRM is there,
> > xfrm_dev_init(), via xfrm_dev_notifier -> xfrm_dev_event() ->
> >    xfrm_dev_register() and then:
> > 
> > static int xfrm_dev_register(struct net_device *dev)
> > {
> >          if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)
> 
> This looks like you haven't applied version 3 of the 2nd patch "xfrm: check
> for xdo_dev_ops add and delete".  I missed this in the earlier version (not
> enough compile tests), but version 3 of patch 2/3  should address it.

Right you are, missed it here.

Thanks,
Marcelo

^ permalink raw reply

* Re: [RFC ipsec-next 3/4] net: xfrm: support multiple VTI tunnels
From: Lorenzo Colitti @ 2017-12-20 17:12 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Steffen Klassert, Subash Abhinov Kasiviswanathan,
	Nathan Harold
In-Reply-To: <20171218.125642.639075398593924537.davem@davemloft.net>

On Tue, Dec 19, 2017 at 2:56 AM, David Miller <davem@davemloft.net> wrote:
> > - ICMP errors are similar to input, except the search is for the
> >   outbound XFRM state, because the only data that is available is
> >   the outbound SPI. Thus, ICMP errors are only processed if the
> >   ikey is the same as the same as the okey. AFAICS this is
> >   consistent with GRE tunnels, but not with existing VTI
> >   behaviour.
>
> I think you will need to sort out the VTI ICMP behavior difference
> with what exists now.

Thanks for the feedback. I've sent out a new series that addresses this.

I had to make some minor changes to the common ip tunnel lookup
functions to make it work, because currently, a tunnel can only be
looked up by i_key. https://patchwork.ozlabs.org/patch/851558/ .

^ permalink raw reply

* [PATCH net,stable] s390/qeth: fix error handling in checksum cmd callback
From: Julian Wiedmann @ 2017-12-20 17:07 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann

Make sure to check both return code fields before processing the
response. Otherwise we risk operating on invalid data.

Fixes: c9475369bd2b ("s390/qeth: rework RX/TX checksum offload")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
 drivers/s390/net/qeth_core_main.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 6c815207f4f5..3614df68830f 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -5386,6 +5386,13 @@ int qeth_poll(struct napi_struct *napi, int budget)
 }
 EXPORT_SYMBOL_GPL(qeth_poll);
 
+static int qeth_setassparms_inspect_rc(struct qeth_ipa_cmd *cmd)
+{
+	if (!cmd->hdr.return_code)
+		cmd->hdr.return_code = cmd->data.setassparms.hdr.return_code;
+	return cmd->hdr.return_code;
+}
+
 int qeth_setassparms_cb(struct qeth_card *card,
 			struct qeth_reply *reply, unsigned long data)
 {
@@ -6242,7 +6249,7 @@ static int qeth_ipa_checksum_run_cmd_cb(struct qeth_card *card,
 				(struct qeth_checksum_cmd *)reply->param;
 
 	QETH_CARD_TEXT(card, 4, "chkdoccb");
-	if (cmd->hdr.return_code)
+	if (qeth_setassparms_inspect_rc(cmd))
 		return 0;
 
 	memset(chksum_cb, 0, sizeof(*chksum_cb));
-- 
2.13.5

^ permalink raw reply related

* [PATCH ipsec-next 7/7] net: xfrm: Don't pass tunnel objects to xfrm6_rcv_spi.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

This change removes the tunnel parameter from xfrm6_rcv_spi and
deletes xfrm6_rcv_tnl. These were only used by the VTI code and
are now unused.

Tested: https://android-review.googlesource.com/571524
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/xfrm.h      |  4 +---
 net/ipv6/ip6_vti.c      |  2 +-
 net/ipv6/xfrm6_input.c  | 13 +++----------
 net/ipv6/xfrm6_tunnel.c |  2 +-
 4 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 3d245f2f6f..fc19dda73c 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1638,10 +1638,8 @@ int xfrm4_tunnel_deregister(struct xfrm_tunnel *handler, unsigned short family);
 void xfrm4_local_error(struct sk_buff *skb, u32 mtu);
 int xfrm6_extract_header(struct sk_buff *skb);
 int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb);
-int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi,
-		  struct ip6_tnl *t);
+int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi);
 int xfrm6_transport_finish(struct sk_buff *skb, int async);
-int xfrm6_rcv_tnl(struct sk_buff *skb, struct ip6_tnl *t);
 int xfrm6_rcv(struct sk_buff *skb);
 int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
 		     xfrm_address_t *saddr, u8 proto);
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index 18c2695dc3..2ac0bfff0f 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -401,7 +401,7 @@ static int vti6_rcv(struct sk_buff *skb)
 	int nexthdr = skb_network_header(skb)[IP6CB(skb)->nhoff];
 
 	XFRM_TUNNEL_SKB_CB(skb)->tunnel.lookup = vti6_lookup;
-	return xfrm6_rcv_spi(skb, nexthdr, 0, NULL);
+	return xfrm6_rcv_spi(skb, nexthdr, 0);
 }
 
 static int vti6_rcv_cb(struct sk_buff *skb, int err)
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index 6d1b734fef..5f20e30926 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -22,8 +22,7 @@ int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb)
 	return xfrm6_extract_header(skb);
 }
 
-int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi,
-		  struct ip6_tnl *t)
+int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi)
 {
 	XFRM_SPI_SKB_CB(skb)->family = AF_INET6;
 	XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct ipv6hdr, daddr);
@@ -59,16 +58,10 @@ int xfrm6_transport_finish(struct sk_buff *skb, int async)
 	return -1;
 }
 
-int xfrm6_rcv_tnl(struct sk_buff *skb, struct ip6_tnl *t)
-{
-	return xfrm6_rcv_spi(skb, skb_network_header(skb)[IP6CB(skb)->nhoff],
-			     0, t);
-}
-EXPORT_SYMBOL(xfrm6_rcv_tnl);
-
 int xfrm6_rcv(struct sk_buff *skb)
 {
-	return xfrm6_rcv_tnl(skb, NULL);
+	return xfrm6_rcv_spi(skb, skb_network_header(skb)[IP6CB(skb)->nhoff],
+			     0);
 }
 EXPORT_SYMBOL(xfrm6_rcv);
 int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c
index f85f0d7480..02161543a9 100644
--- a/net/ipv6/xfrm6_tunnel.c
+++ b/net/ipv6/xfrm6_tunnel.c
@@ -236,7 +236,7 @@ static int xfrm6_tunnel_rcv(struct sk_buff *skb)
 	__be32 spi;
 
 	spi = xfrm6_tunnel_spi_lookup(net, (const xfrm_address_t *)&iph->saddr);
-	return xfrm6_rcv_spi(skb, IPPROTO_IPV6, spi, NULL);
+	return xfrm6_rcv_spi(skb, IPPROTO_IPV6, spi);
 }
 
 static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 6/7] net: xfrm: Allow userspace to configure keyed VTI tunnels.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

This commit allows userspace to configure keyed VTI tunnels by
adding a IFLA_VTI_FLAGS attribute and a VTI_KEYED flag. When set,
the flag causes the tunnel parameter i_flags to be set to
TUNNEL_KEY.

Creating both a non-keyed VTI and a keyed VTI on the same IP
src+dst pair is not useful. Because non-keyed VTIs always accept
packets, in such a configuration the keyed VTI would not receive
any traffic. This is disallowed by modifying the ip_tunnel_find
and vti6_locate functions to treat VTIs on the same src+dst pair
as identical unless they are both keyed (in which case they can
coexist, by design). So attempts to create such duplicate tunnels
- or to change one tunnel in such a way that it would duplicate
another - will fail with EEXIST.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/uapi/linux/if_tunnel.h |  4 ++++
 net/ipv4/ip_tunnel.c           | 10 +++++++++-
 net/ipv4/ip_vti.c              | 26 +++++++++++++++++++++++---
 net/ipv6/ip6_vti.c             | 33 +++++++++++++++++++++++++++++++--
 4 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
index 1b3d148c45..b431b1c209 100644
--- a/include/uapi/linux/if_tunnel.h
+++ b/include/uapi/linux/if_tunnel.h
@@ -148,6 +148,9 @@ enum {
 /* VTI-mode i_flags */
 #define VTI_ISVTI ((__force __be16)0x0001)
 
+/* VTI netlink iflags. */
+#define VTI_KEYED 0x0001
+
 enum {
 	IFLA_VTI_UNSPEC,
 	IFLA_VTI_LINK,
@@ -156,6 +159,7 @@ enum {
 	IFLA_VTI_LOCAL,
 	IFLA_VTI_REMOTE,
 	IFLA_VTI_FWMARK,
+	IFLA_VTI_FLAGS,
 	__IFLA_VTI_MAX,
 };
 
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index f45968bb81..9a0a56b491 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -84,6 +84,14 @@ static bool ip_tunnel_key_match(const struct ip_tunnel_parm *p,
 		return !(flags & TUNNEL_KEY);
 }
 
+static bool ip_tunnel_match(const struct ip_tunnel_parm *p,
+			    __be32 flags, u8 lookup_flags, __be32 key)
+{
+	return ip_tunnel_key_match(p, flags, lookup_flags, key) ||
+	       ((p->i_flags & flags & VTI_ISVTI) &&
+		!(p->i_flags & flags & TUNNEL_KEY));
+}
+
 /* Fallback tunnel: no source, no destination, no key, no options
 
    Tunnel hash table:
@@ -242,7 +250,7 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
 		    remote == t->parms.iph.daddr &&
 		    link == t->parms.link &&
 		    type == t->dev->type &&
-		    ip_tunnel_key_match(&t->parms, flags, 0, key))
+		    ip_tunnel_match(&t->parms, flags, 0, key))
 			break;
 	}
 	return t;
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 9d28433a60..1f52719228 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -385,6 +385,16 @@ static int vti4_err(struct sk_buff *skb, u32 info)
 	return tunnel ? 0 : -1;
 }
 
+static __be16 vti_flags_to_tnl_flags(__u16 flags)
+{
+	return VTI_ISVTI | ((flags & VTI_KEYED) ? TUNNEL_KEY : 0);
+}
+
+static __u16 tnl_flags_to_vti_flags(__be16 i_flags)
+{
+	return (i_flags & TUNNEL_KEY) ? VTI_KEYED : 0;
+}
+
 static int
 vti_tunnel_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
@@ -525,6 +535,8 @@ static void vti_netlink_parms(struct nlattr *data[],
 			      struct ip_tunnel_parm *parms,
 			      __u32 *fwmark)
 {
+	__u16 flags = 0;
+
 	memset(parms, 0, sizeof(*parms));
 
 	parms->iph.protocol = IPPROTO_IPIP;
@@ -532,8 +544,6 @@ static void vti_netlink_parms(struct nlattr *data[],
 	if (!data)
 		return;
 
-	parms->i_flags = VTI_ISVTI;
-
 	if (data[IFLA_VTI_LINK])
 		parms->link = nla_get_u32(data[IFLA_VTI_LINK]);
 
@@ -551,6 +561,11 @@ static void vti_netlink_parms(struct nlattr *data[],
 
 	if (data[IFLA_VTI_FWMARK])
 		*fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]);
+
+	if (data[IFLA_VTI_FLAGS])
+		flags = nla_get_u16(data[IFLA_VTI_FLAGS]);
+
+	parms->i_flags = vti_flags_to_tnl_flags(flags);
 }
 
 static int vti_newlink(struct net *src_net, struct net_device *dev,
@@ -591,6 +606,8 @@ static size_t vti_get_size(const struct net_device *dev)
 		nla_total_size(4) +
 		/* IFLA_VTI_FWMARK */
 		nla_total_size(4) +
+		/* IFLA_VTI_FLAGS */
+		nla_total_size(2) +
 		0;
 }
 
@@ -604,7 +621,9 @@ static int vti_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	    nla_put_be32(skb, IFLA_VTI_OKEY, p->o_key) ||
 	    nla_put_in_addr(skb, IFLA_VTI_LOCAL, p->iph.saddr) ||
 	    nla_put_in_addr(skb, IFLA_VTI_REMOTE, p->iph.daddr) ||
-	    nla_put_u32(skb, IFLA_VTI_FWMARK, t->fwmark))
+	    nla_put_u32(skb, IFLA_VTI_FWMARK, t->fwmark) ||
+	    nla_put_u16(skb, IFLA_VTI_FLAGS,
+			tnl_flags_to_vti_flags(p->i_flags)))
 		return -EMSGSIZE;
 
 	return 0;
@@ -617,6 +636,7 @@ static const struct nla_policy vti_policy[IFLA_VTI_MAX + 1] = {
 	[IFLA_VTI_LOCAL]	= { .len = FIELD_SIZEOF(struct iphdr, saddr) },
 	[IFLA_VTI_REMOTE]	= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
 	[IFLA_VTI_FWMARK]	= { .type = NLA_U32 },
+	[IFLA_VTI_FLAGS]	= { .type = NLA_U16 },
 };
 
 static struct rtnl_link_ops vti_link_ops __read_mostly = {
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index bf64821b8a..18c2695dc3 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -86,6 +86,13 @@ static bool vti6_match_key(const struct ip6_tnl *t, __be32 key, bool in)
 	return !(flags & TUNNEL_KEY) || tunnel_key == key;
 }
 
+static bool vti6_match_tunnel(const struct ip6_tnl *t, struct __ip6_tnl_parm *p)
+{
+	return !(t->parms.i_flags & TUNNEL_KEY) ||
+	       !(p->i_flags & TUNNEL_KEY) ||
+	       vti6_match_key(t, p->i_key, true);
+}
+
 /**
  * vti6_tnl_lookup - fetch tunnel matching the end-point addresses and key
  *   @net: network namespace
@@ -280,7 +287,7 @@ static struct ip6_tnl *vti6_locate(struct net *net, struct __ip6_tnl_parm *p,
 	     tp = &t->next) {
 		if (ipv6_addr_equal(local, &t->parms.laddr) &&
 		    ipv6_addr_equal(remote, &t->parms.raddr) &&
-		    vti6_match_key(t, p->i_key, true)) {
+		    vti6_match_tunnel(t, p)) {
 			if (create)
 				return NULL;
 
@@ -990,9 +997,21 @@ static int vti6_validate(struct nlattr *tb[], struct nlattr *data[],
 	return 0;
 }
 
+static __be16 vti_flags_to_tnl_flags(__u16 i_flags)
+{
+	return VTI_ISVTI | ((i_flags & VTI_KEYED) ? TUNNEL_KEY : 0);
+}
+
+static __u16 tnl_flags_to_vti_flags(__be16 i_flags)
+{
+	return (i_flags & TUNNEL_KEY) ? VTI_KEYED : 0;
+}
+
 static void vti6_netlink_parms(struct nlattr *data[],
 			       struct __ip6_tnl_parm *parms)
 {
+	__u16 flags = 0;
+
 	memset(parms, 0, sizeof(*parms));
 
 	if (!data)
@@ -1015,6 +1034,11 @@ static void vti6_netlink_parms(struct nlattr *data[],
 
 	if (data[IFLA_VTI_FWMARK])
 		parms->fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]);
+
+	if (data[IFLA_VTI_FLAGS])
+		flags = nla_get_u16(data[IFLA_VTI_FLAGS]);
+
+	parms->i_flags = vti_flags_to_tnl_flags(flags);
 }
 
 static int vti6_newlink(struct net *src_net, struct net_device *dev,
@@ -1084,6 +1108,8 @@ static size_t vti6_get_size(const struct net_device *dev)
 		nla_total_size(4) +
 		/* IFLA_VTI_FWMARK */
 		nla_total_size(4) +
+		/* IFLA_VTI_FLAGS */
+		nla_total_size(2) +
 		0;
 }
 
@@ -1097,7 +1123,9 @@ static int vti6_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	    nla_put_in6_addr(skb, IFLA_VTI_REMOTE, &parm->raddr) ||
 	    nla_put_be32(skb, IFLA_VTI_IKEY, parm->i_key) ||
 	    nla_put_be32(skb, IFLA_VTI_OKEY, parm->o_key) ||
-	    nla_put_u32(skb, IFLA_VTI_FWMARK, parm->fwmark))
+	    nla_put_u32(skb, IFLA_VTI_FWMARK, parm->fwmark) ||
+	    nla_put_u16(skb, IFLA_VTI_FLAGS,
+			tnl_flags_to_vti_flags(parm->i_flags)))
 		goto nla_put_failure;
 	return 0;
 
@@ -1112,6 +1140,7 @@ static const struct nla_policy vti6_policy[IFLA_VTI_MAX + 1] = {
 	[IFLA_VTI_IKEY]		= { .type = NLA_U32 },
 	[IFLA_VTI_OKEY]		= { .type = NLA_U32 },
 	[IFLA_VTI_FWMARK]	= { .type = NLA_U32 },
+	[IFLA_VTI_FLAGS]	= { .type = NLA_U16 },
 };
 
 static struct rtnl_link_ops vti6_link_ops __read_mostly = {
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 5/7] net: xfrm: Deliver packets to keyed VTI tunnels.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

- Input works as follows:
  1. Attempt to match a regular VTI by IP addresses only. If that
     succeeds, use the i_key as the mark to look up the xfrm
     state.
  2. If the match failed, do an XFRM state lookup that ignores
     the mark. If that finds an state, then use the state match's
     mark to find the tunnel by its i_key.
- ICMP errors: similar to input, except the search is for the
  outbound XFRM state and the tunnel is found by o_key instead of
  by i_key.
- The output path is the same as existing VTIs. A routing lookup
  matches a VTI interface. The VTI uses its o_key as the mark to
  select an XFRM state. The state transforms the packet.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 net/ipv4/ip_vti.c  | 52 ++++++++++++++++++++++++++++++------------
 net/ipv6/ip6_vti.c | 67 ++++++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 88 insertions(+), 31 deletions(-)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 21f93e398e..9d28433a60 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -63,6 +63,18 @@ vti4_find_tunnel(struct sk_buff *skb, __be32 spi, struct xfrm_state **x)
 		*x = xfrm_state_lookup(net, be32_to_cpu(tunnel->parms.i_key),
 				       (xfrm_address_t *)&iph->daddr,
 				       spi, iph->protocol, AF_INET);
+	} else {
+		*x = xfrm_state_lookup_loose(net, skb->mark,
+					     (xfrm_address_t *)&iph->daddr,
+					     spi, iph->protocol, AF_INET);
+		if (!*x)
+			return NULL;
+		tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_KEY,
+					  TUNNEL_LOOKUP_NO_KEY,
+					  iph->saddr, iph->daddr,
+					  cpu_to_be32((*x)->mark.v));
+		if (!tunnel)
+			xfrm_state_put(*x);
 	}
 
 	return tunnel;
@@ -302,7 +314,6 @@ static netdev_tx_t vti_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 static int vti4_err(struct sk_buff *skb, u32 info)
 {
 	__be32 spi;
-	__u32 mark;
 	struct xfrm_state *x;
 	struct ip_tunnel *tunnel;
 	struct ip_esp_hdr *esph;
@@ -313,13 +324,6 @@ static int vti4_err(struct sk_buff *skb, u32 info)
 	int protocol = iph->protocol;
 	struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
 
-	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
-				  iph->daddr, iph->saddr, 0);
-	if (!tunnel)
-		return -1;
-
-	mark = be32_to_cpu(tunnel->parms.o_key);
-
 	switch (protocol) {
 	case IPPROTO_ESP:
 		esph = (struct ip_esp_hdr *)(skb->data+(iph->ihl<<2));
@@ -347,18 +351,38 @@ static int vti4_err(struct sk_buff *skb, u32 info)
 		return 0;
 	}
 
-	x = xfrm_state_lookup(net, mark, (const xfrm_address_t *)&iph->daddr,
-			      spi, protocol, AF_INET);
-	if (!x)
-		return 0;
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
+				  iph->daddr, iph->saddr, 0);
+	if (tunnel) {
+		x = xfrm_state_lookup(net, be32_to_cpu(tunnel->parms.o_key),
+				      (xfrm_address_t *)&iph->daddr,
+				      spi, iph->protocol, AF_INET);
+	} else {
+		x = xfrm_state_lookup_loose(net, skb->mark,
+					    (xfrm_address_t *)&iph->daddr,
+					    spi, iph->protocol, AF_INET);
+		if (!x)
+			goto out;
+		tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_KEY,
+					  TUNNEL_LOOKUP_NO_KEY |
+					  TUNNEL_LOOKUP_OKEY,
+					  iph->daddr, iph->saddr,
+					  cpu_to_be32(x->mark.v));
+	}
+
+	if (!tunnel || !x)
+		goto out;
 
 	if (icmp_hdr(skb)->type == ICMP_DEST_UNREACH)
 		ipv4_update_pmtu(skb, net, info, 0, 0, protocol, 0);
 	else
 		ipv4_redirect(skb, net, 0, 0, protocol, 0);
-	xfrm_state_put(x);
 
-	return 0;
+out:
+	if (x)
+		xfrm_state_put(x);
+
+	return tunnel ? 0 : -1;
 }
 
 static int
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index 5994fedd19..bf64821b8a 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -78,11 +78,21 @@ struct vti6_net {
 #define for_each_vti6_tunnel_rcu(start) \
 	for (t = rcu_dereference(start); t; t = rcu_dereference(t->next))
 
+static bool vti6_match_key(const struct ip6_tnl *t, __be32 key, bool in)
+{
+	__be16 tunnel_key = in ? t->parms.i_key : t->parms.o_key;
+	__be16 flags = in ? t->parms.i_flags : t->parms.o_flags;
+
+	return !(flags & TUNNEL_KEY) || tunnel_key == key;
+}
+
 /**
- * vti6_tnl_lookup - fetch tunnel matching the end-point addresses
+ * vti6_tnl_lookup - fetch tunnel matching the end-point addresses and key
  *   @net: network namespace
  *   @remote: the address of the tunnel exit-point
  *   @local: the address of the tunnel entry-point
+ *   @key: the key of the tunnel
+ *   @in: whether to match i_key or i_key
  *
  * Return:
  *   tunnel matching given end-points if found,
@@ -91,7 +101,7 @@ struct vti6_net {
  **/
 static struct ip6_tnl *
 vti6_tnl_lookup(struct net *net, const struct in6_addr *remote,
-		const struct in6_addr *local)
+		const struct in6_addr *local, __be32 key, bool in)
 {
 	unsigned int hash = HASH(remote, local);
 	struct ip6_tnl *t;
@@ -101,6 +111,7 @@ vti6_tnl_lookup(struct net *net, const struct in6_addr *remote,
 	for_each_vti6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
 		if (ipv6_addr_equal(local, &t->parms.laddr) &&
 		    ipv6_addr_equal(remote, &t->parms.raddr) &&
+		    vti6_match_key(t, key, in) &&
 		    (t->dev->flags & IFF_UP))
 			return t;
 	}
@@ -109,6 +120,7 @@ vti6_tnl_lookup(struct net *net, const struct in6_addr *remote,
 	hash = HASH(&any, local);
 	for_each_vti6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
 		if (ipv6_addr_equal(local, &t->parms.laddr) &&
+		    vti6_match_key(t, key, in) &&
 		    (t->dev->flags & IFF_UP))
 			return t;
 	}
@@ -116,6 +128,7 @@ vti6_tnl_lookup(struct net *net, const struct in6_addr *remote,
 	hash = HASH(remote, &any);
 	for_each_vti6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
 		if (ipv6_addr_equal(remote, &t->parms.raddr) &&
+		    vti6_match_key(t, key, in) &&
 		    (t->dev->flags & IFF_UP))
 			return t;
 	}
@@ -266,7 +279,8 @@ static struct ip6_tnl *vti6_locate(struct net *net, struct __ip6_tnl_parm *p,
 	     (t = rtnl_dereference(*tp)) != NULL;
 	     tp = &t->next) {
 		if (ipv6_addr_equal(local, &t->parms.laddr) &&
-		    ipv6_addr_equal(remote, &t->parms.raddr)) {
+		    ipv6_addr_equal(remote, &t->parms.raddr) &&
+		    vti6_match_key(t, p->i_key, true)) {
 			if (create)
 				return NULL;
 
@@ -304,11 +318,21 @@ vti6_find_tunnel(struct sk_buff *skb, __be32 spi, struct xfrm_state **x)
 	struct net *net = dev_net(skb->dev);
 	struct ip6_tnl *t;
 
-	t = vti6_tnl_lookup(net, &ipv6h->saddr, &ipv6h->daddr);
+	t = vti6_tnl_lookup(net, &ipv6h->saddr, &ipv6h->daddr, 0, true);
 	if (t) {
 		*x = xfrm_state_lookup(net, be32_to_cpu(t->parms.i_key),
 				       (xfrm_address_t *)&ipv6h->daddr,
 				       spi, ipv6h->nexthdr, AF_INET6);
+	} else {
+		*x = xfrm_state_lookup_loose(net, skb->mark,
+					     (xfrm_address_t *)&ipv6h->daddr,
+					     spi, ipv6h->nexthdr, AF_INET6);
+		if (!*x)
+			return NULL;
+		t =  vti6_tnl_lookup(net, &ipv6h->saddr, &ipv6h->daddr,
+				     cpu_to_be32((*x)->mark.v), true);
+		if (!t)
+			xfrm_state_put(*x);
 	}
 
 	return t;
@@ -613,7 +637,6 @@ static int vti6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		    u8 type, u8 code, int offset, __be32 info)
 {
 	__be32 spi;
-	__u32 mark;
 	struct xfrm_state *x;
 	struct ip6_tnl *t;
 	struct ip_esp_hdr *esph;
@@ -623,12 +646,6 @@ static int vti6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	const struct ipv6hdr *iph = (const struct ipv6hdr *)skb->data;
 	int protocol = iph->nexthdr;
 
-	t = vti6_tnl_lookup(dev_net(skb->dev), &iph->daddr, &iph->saddr);
-	if (!t)
-		return -1;
-
-	mark = be32_to_cpu(t->parms.o_key);
-
 	switch (protocol) {
 	case IPPROTO_ESP:
 		esph = (struct ip_esp_hdr *)(skb->data + offset);
@@ -650,19 +667,35 @@ static int vti6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	    type != NDISC_REDIRECT)
 		return 0;
 
-	x = xfrm_state_lookup(net, mark, (const xfrm_address_t *)&iph->daddr,
-			      spi, protocol, AF_INET6);
-	if (!x)
-		return 0;
+	t = vti6_tnl_lookup(net, &iph->daddr, &iph->saddr, 0, false);
+	if (t) {
+		x = xfrm_state_lookup(net, be32_to_cpu(t->parms.o_key),
+				      (xfrm_address_t *)&iph->daddr,
+				      spi, protocol, AF_INET6);
+	} else {
+		x = xfrm_state_lookup_loose(net, skb->mark,
+					    (xfrm_address_t *)&iph->daddr,
+					    spi, protocol, AF_INET6);
+		if (!x)
+			goto out;
+		t = vti6_tnl_lookup(net, &iph->daddr, &iph->saddr,
+				    cpu_to_be32(x->mark.v), false);
+	}
+
+	if (!t || !x)
+		goto out;
 
 	if (type == NDISC_REDIRECT)
 		ip6_redirect(skb, net, skb->dev->ifindex, 0,
 			     sock_net_uid(net, NULL));
 	else
 		ip6_update_pmtu(skb, net, info, 0, 0, sock_net_uid(net, NULL));
-	xfrm_state_put(x);
 
-	return 0;
+out:
+	if (x)
+		xfrm_state_put(x);
+
+	return t ? 0 : -1;
 }
 
 static void vti6_link_config(struct ip6_tnl *t)
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 4/7] net: xfrm: Find VTI interfaces from xfrm_input.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

Currently, the VTI input path works by first looking up the VTI
by its IP addresses, then setting the tunnel pointer in the
XFRM_TUNNEL_SKB_CB, and then having xfrm_input override the mark
with the mark in the tunnel.

This patch changes the order so that the tunnel is found by a
callback from xfrm_input. Each tunnel type (currently only ip_vti
and ip6_vti) implements a lookup function pointer that finds the
tunnel and sets it in the CB, and also does a state lookup.

This has the advantage that much more information is available to
the tunnel lookup function, including the looked-up XFRM state.
This will be used in a future change to allow finding the tunnel
based on the result of the xfrm lookup and not just on IP
addresses, which will allow multiple tunnels on the same IP
address pair.

The lookup function pointer occupies the same space in the
XFRM_TUNNEL_SKB_CB as the IPv4/IPv6 tunnel pointer. The semantics
of the field are:
- When not running a handler that uses tunnels: always null.
- At the beginning of xfrm_input: lookup function pointer.
- After xfrm_input calls the lookup function: tunnel if found,
  else null.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/xfrm.h     |  2 ++
 net/ipv4/ip_vti.c      | 43 ++++++++++++++++++++++++++++++++++++----
 net/ipv6/ip6_vti.c     | 53 +++++++++++++++++++++++++++++++++++++++++++++-----
 net/ipv6/xfrm6_input.c |  1 -
 net/xfrm/xfrm_input.c  | 34 +++++++++++++++++++-------------
 5 files changed, 109 insertions(+), 24 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 9d3b7c0ac6..3d245f2f6f 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -653,6 +653,8 @@ struct xfrm_tunnel_skb_cb {
 	} header;
 
 	union {
+		int (*lookup)(struct sk_buff *skb, int nexthdr, __be32 spi,
+			      __be32 seq, struct xfrm_state **x);
 		struct ip_tunnel *ip4;
 		struct ip6_tnl *ip6;
 	} tunnel;
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 804cee8126..21f93e398e 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -49,8 +49,8 @@ static struct rtnl_link_ops vti_link_ops __read_mostly;
 static unsigned int vti_net_id __read_mostly;
 static int vti_tunnel_init(struct net_device *dev);
 
-static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
-		     int encap_type)
+static struct ip_tunnel *
+vti4_find_tunnel(struct sk_buff *skb, __be32 spi, struct xfrm_state **x)
 {
 	struct ip_tunnel *tunnel;
 	const struct iphdr *iph = ip_hdr(skb);
@@ -59,19 +59,52 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 
 	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
 				  iph->saddr, iph->daddr, 0);
+	if (tunnel) {
+		*x = xfrm_state_lookup(net, be32_to_cpu(tunnel->parms.i_key),
+				       (xfrm_address_t *)&iph->daddr,
+				       spi, iph->protocol, AF_INET);
+	}
+
+	return tunnel;
+}
+
+static int vti_lookup(struct sk_buff *skb, int nexthdr, __be32 spi, __be32 seq,
+		      struct xfrm_state **x)
+{
+	struct net *net = dev_net(skb->dev);
+	struct ip_tunnel *tunnel;
+
+	tunnel = vti4_find_tunnel(skb, spi, x);
 	if (tunnel) {
 		if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
 			goto drop;
 
+		if (!*x) {
+			XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
+			xfrm_audit_state_notfound(skb, AF_INET, spi, seq);
+			tunnel->dev->stats.rx_errors++;
+			tunnel->dev->stats.rx_dropped++;
+			goto drop;
+		}
+
 		XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = tunnel;
 
-		return xfrm_input(skb, nexthdr, spi, encap_type);
+		return 0;
 	}
 
 	return -EINVAL;
 drop:
+	if (*x)
+		xfrm_state_put(*x);
 	kfree_skb(skb);
-	return 0;
+	return -ESRCH;
+}
+
+static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
+		     int encap_type)
+{
+	XFRM_TUNNEL_SKB_CB(skb)->tunnel.lookup = vti_lookup;
+	return xfrm_input(skb, nexthdr, spi, encap_type);
 }
 
 static int vti_rcv(struct sk_buff *skb)
@@ -93,6 +126,8 @@ static int vti_rcv_cb(struct sk_buff *skb, int err)
 	u32 orig_mark = skb->mark;
 	int ret;
 
+	XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = NULL;
+
 	if (!tunnel)
 		return 1;
 
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index dbb74f3c57..5994fedd19 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -297,13 +297,33 @@ static void vti6_dev_uninit(struct net_device *dev)
 	dev_put(dev);
 }
 
-static int vti6_rcv(struct sk_buff *skb)
+static struct ip6_tnl *
+vti6_find_tunnel(struct sk_buff *skb, __be32 spi, struct xfrm_state **x)
 {
+	const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	struct net *net = dev_net(skb->dev);
 	struct ip6_tnl *t;
+
+	t = vti6_tnl_lookup(net, &ipv6h->saddr, &ipv6h->daddr);
+	if (t) {
+		*x = xfrm_state_lookup(net, be32_to_cpu(t->parms.i_key),
+				       (xfrm_address_t *)&ipv6h->daddr,
+				       spi, ipv6h->nexthdr, AF_INET6);
+	}
+
+	return t;
+}
+
+static int
+vti6_lookup(struct sk_buff *skb, int nexthdr, __be32 spi, __be32 seq,
+	    struct xfrm_state **x)
+{
 	const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	struct net *net = dev_net(skb->dev);
+	struct ip6_tnl *t;
 
 	rcu_read_lock();
-	t = vti6_tnl_lookup(dev_net(skb->dev), &ipv6h->saddr, &ipv6h->daddr);
+	t = vti6_find_tunnel(skb, spi, x);
 	if (t) {
 		if (t->parms.proto != IPPROTO_IPV6 && t->parms.proto != 0) {
 			rcu_read_unlock();
@@ -312,7 +332,7 @@ static int vti6_rcv(struct sk_buff *skb)
 
 		if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
 			rcu_read_unlock();
-			return 0;
+			goto discard;
 		}
 
 		if (!ip6_tnl_rcv_ctl(t, &ipv6h->daddr, &ipv6h->saddr)) {
@@ -321,15 +341,36 @@ static int vti6_rcv(struct sk_buff *skb)
 			goto discard;
 		}
 
+		if (!*x) {
+			XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
+			xfrm_audit_state_notfound(skb, AF_INET6, spi, seq);
+			t->dev->stats.rx_errors++;
+			t->dev->stats.rx_dropped++;
+			rcu_read_unlock();
+			goto discard;
+		}
+
 		rcu_read_unlock();
 
-		return xfrm6_rcv_tnl(skb, t);
+		XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = t;
+
+		return 0;
 	}
 	rcu_read_unlock();
 	return -EINVAL;
 discard:
+	if (*x)
+		xfrm_state_put(*x);
 	kfree_skb(skb);
-	return 0;
+	return -ESRCH;
+}
+
+static int vti6_rcv(struct sk_buff *skb)
+{
+	int nexthdr = skb_network_header(skb)[IP6CB(skb)->nhoff];
+
+	XFRM_TUNNEL_SKB_CB(skb)->tunnel.lookup = vti6_lookup;
+	return xfrm6_rcv_spi(skb, nexthdr, 0, NULL);
 }
 
 static int vti6_rcv_cb(struct sk_buff *skb, int err)
@@ -343,6 +384,8 @@ static int vti6_rcv_cb(struct sk_buff *skb, int err)
 	u32 orig_mark = skb->mark;
 	int ret;
 
+	XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = NULL;
+
 	if (!t)
 		return 1;
 
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index fe04e23af9..6d1b734fef 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -25,7 +25,6 @@ int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb)
 int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi,
 		  struct ip6_tnl *t)
 {
-	XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = t;
 	XFRM_SPI_SKB_CB(skb)->family = AF_INET6;
 	XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct ipv6hdr, daddr);
 	return xfrm_input(skb, nexthdr, spi, 0);
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index ac277b97e0..7b54f58454 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -267,18 +267,6 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 
 	family = XFRM_SPI_SKB_CB(skb)->family;
 
-	/* if tunnel is present override skb->mark value with tunnel i_key */
-	switch (family) {
-	case AF_INET:
-		if (XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4)
-			mark = be32_to_cpu(XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4->parms.i_key);
-		break;
-	case AF_INET6:
-		if (XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6)
-			mark = be32_to_cpu(XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6->parms.i_key);
-		break;
-	}
-
 	err = secpath_set(skb);
 	if (err) {
 		XFRM_INC_STATS(net, LINUX_MIB_XFRMINERROR);
@@ -293,14 +281,29 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 
 	daddr = (xfrm_address_t *)(skb_network_header(skb) +
 				   XFRM_SPI_SKB_CB(skb)->daddroff);
+
+	if (XFRM_TUNNEL_SKB_CB(skb)->tunnel.lookup) {
+		err = XFRM_TUNNEL_SKB_CB(skb)->tunnel.lookup(skb, nexthdr,
+							     spi, seq, &x);
+		if (err) {
+			XFRM_TUNNEL_SKB_CB(skb)->tunnel.lookup = NULL;
+			return err;
+		}
+	}
+
 	do {
 		if (skb->sp->len == XFRM_MAX_DEPTH) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINBUFFERERROR);
+			if (x)
+				xfrm_state_put(x);
 			goto drop;
 		}
 
-		x = xfrm_state_lookup(net, mark, daddr, spi, nexthdr, family);
-		if (x == NULL) {
+		if (!x)
+			x = xfrm_state_lookup(net, mark, daddr, spi, nexthdr,
+					      family);
+
+		if (!x) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINNOSTATES);
 			xfrm_audit_state_notfound(skb, family, spi, seq);
 			goto drop;
@@ -420,6 +423,9 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINHDRERROR);
 			goto drop;
 		}
+
+		if (!err)
+			x = NULL;
 	} while (!err);
 
 	err = xfrm_rcv_cb(skb, family, x->type->proto, 0);
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 3/7] net: xfrm: Add an xfrm lookup that ignores the mark.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

The xfrm inbound and ICMP error paths can match inbound XFRM states
that have a mark, but only if the skb mark is already correctly set
to match the state mark. This typically requires iptables rules
(potentially even per SA iptables rules), which impose configuration
complexity.

In some cases, it may be useful to match such an SA anyway. An example
is when processing an ICMP error to an ESP packet that we previously
sent. In this case, the only information available to match the SA are
the IP addresses and the outbound SPI. Therefore, if the output SA has
a mark, the lookup will fail and the ICMP packet cannot be processed
unless the packet is somehow already marked.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/xfrm.h    |  4 ++++
 net/xfrm/xfrm_state.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 1ec0c47606..9d3b7c0ac6 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1550,6 +1550,10 @@ struct xfrm_state *xfrm_state_lookup_byaddr(struct net *net, u32 mark,
 					    const xfrm_address_t *saddr,
 					    u8 proto,
 					    unsigned short family);
+struct xfrm_state *xfrm_state_lookup_loose(struct net *net, u32 mark,
+					   const xfrm_address_t *daddr,
+					   __be32 spi, u8 proto,
+					   unsigned short family);
 #ifdef CONFIG_XFRM_SUB_POLICY
 int xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n,
 		   unsigned short family, struct net *net);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 1b7856be3e..cff151c714 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -839,6 +839,39 @@ static struct xfrm_state *__xfrm_state_lookup(struct net *net, u32 mark,
 	return NULL;
 }
 
+struct xfrm_state *xfrm_state_lookup_loose(struct net *net, u32 mark,
+					   const xfrm_address_t *daddr,
+					   __be32 spi, u8 proto,
+					   unsigned short family)
+{
+	unsigned int h = xfrm_spi_hash(net, daddr, spi, proto, family);
+	struct xfrm_state *x, *cand = NULL;
+
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(x, net->xfrm.state_byspi + h, byspi) {
+		if (x->props.family != family ||
+		    x->id.spi       != spi ||
+		    x->id.proto     != proto ||
+		    !xfrm_addr_equal(&x->id.daddr, daddr, family))
+			continue;
+
+		if (((mark & x->mark.m) == x->mark.v) &&
+		    xfrm_state_hold_rcu(x)) {
+			if (cand)
+				xfrm_state_put(cand);
+			rcu_read_unlock();
+			return x;
+		}
+
+		if (!cand && xfrm_state_hold_rcu(x))
+			cand = x;
+	}
+
+	rcu_read_unlock();
+	return cand;
+}
+EXPORT_SYMBOL(xfrm_state_lookup_loose);
+
 static struct xfrm_state *__xfrm_state_lookup_byaddr(struct net *net, u32 mark,
 						     const xfrm_address_t *daddr,
 						     const xfrm_address_t *saddr,
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 2/7] net: ipv4: Add new flags to tunnel lookup.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

This patch adds support for two new flags to ip_tunnel_lookup.

- TUNNEL_LOOKUP_NO_KEY: ignores the tunnel's i_key when
  determining which hash bucket to look up the tunnel in. This is
  useful for tunnels such as VTI, which have i_keys, but are
  always hashed into bucket 0.
- TUNNEL_LOOKUP_OKEY: finds the tunnel by o_key instead of i_key.

Together, these flags allow processing ICMP errors correctly on
keyed tunnels where i_key != o_key. If such tunnels receive an
ICMP error, the only information available in the packet is the
o_key, so we must be able to find a tunnel by o_key alone. For
that to work, the tunnel hash must not depend on the i_key,
because if it does, we won't be able to find it by o_key alone.

TUNNEL_LOOKUP_NO_KEY is very similar to TUNNEL_NO_KEY so it might
be possible just to use TUNNEL_NO_KEY instead. However, it might
be confusing to see code simultaneously pass in both TUNNEL_NO_KEY
and TUNNEL_KEY.

These flags are numbered separately from tunnel flags because
they are not tunnel properties but properties of tunnel lookups.
(Also, the tunnel flags are 16 bits and all but one is unused.)

The flags are passed into ip_lookup by adding a new parameter.
This could also be done by expanding the existing flags parameter
from __be16 to __be32 and ensuring that the new flags are all
above the 16-bit boundary.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 include/net/ip_tunnels.h |  6 +++++-
 net/ipv4/ip_gre.c        |  6 +++---
 net/ipv4/ip_tunnel.c     | 22 +++++++++++++---------
 net/ipv4/ip_vti.c        |  4 ++--
 net/ipv4/ipip.c          |  6 +++---
 5 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 1f16773cfd..19d97b993a 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -163,6 +163,10 @@ struct ip_tunnel {
 #define TUNNEL_OPTIONS_PRESENT \
 		(TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT | TUNNEL_ERSPAN_OPT)
 
+/* Flags for ip_tunnel_lookup. */
+#define TUNNEL_LOOKUP_NO_KEY	0x01
+#define TUNNEL_LOOKUP_OKEY	0x02
+
 struct tnl_ptk_info {
 	__be16 flags;
 	__be16 proto;
@@ -276,7 +280,7 @@ int ip_tunnel_change_mtu(struct net_device *dev, int new_mtu);
 void ip_tunnel_get_stats64(struct net_device *dev,
 			   struct rtnl_link_stats64 *tot);
 struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
-				   int link, __be16 flags,
+				   int link, __be16 flags, u8 lookup_flags,
 				   __be32 remote, __be32 local,
 				   __be32 key);
 
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index fd4d6e96da..f16a46cb19 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -182,7 +182,7 @@ static void ipgre_err(struct sk_buff *skb, u32 info,
 		itn = net_generic(net, ipgre_net_id);
 
 	iph = (const struct iphdr *)(icmp_hdr(skb) + 1);
-	t = ip_tunnel_lookup(itn, skb->dev->ifindex, tpi->flags,
+	t = ip_tunnel_lookup(itn, skb->dev->ifindex, tpi->flags, 0,
 			     iph->daddr, iph->saddr, tpi->key);
 
 	if (!t)
@@ -280,7 +280,7 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
 	 */
 	tpi->key = cpu_to_be32(ntohs(ershdr->session_id) & ID_MASK);
 	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex,
-				  tpi->flags | TUNNEL_KEY,
+				  tpi->flags | TUNNEL_KEY, 0,
 				  iph->saddr, iph->daddr, tpi->key);
 
 	if (tunnel) {
@@ -356,7 +356,7 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi,
 	struct ip_tunnel *tunnel;
 
 	iph = ip_hdr(skb);
-	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, tpi->flags,
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, tpi->flags, 0,
 				  iph->saddr, iph->daddr, tpi->key);
 
 	if (tunnel) {
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 539c8f22c4..f45968bb81 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -70,11 +70,13 @@ static unsigned int ip_tunnel_hash(__be32 key, __be32 remote)
 }
 
 static bool ip_tunnel_key_match(const struct ip_tunnel_parm *p,
-				__be16 flags, __be32 key)
+				__be32 flags, u8 lookup_flags, __be32 key)
 {
+	__be32 tunnel_key = (lookup_flags & TUNNEL_LOOKUP_OKEY) ? p->o_key :
+								  p->i_key;
 	if (p->i_flags & TUNNEL_KEY) {
 		if (flags & TUNNEL_KEY)
-			return key == p->i_key;
+			return key == tunnel_key;
 		else
 			/* key expected, none present */
 			return false;
@@ -94,15 +96,17 @@ static bool ip_tunnel_key_match(const struct ip_tunnel_parm *p,
    Given src, dst and key, find appropriate for input tunnel.
 */
 struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
-				   int link, __be16 flags,
+				   int link, __be16 flags, u8 lookup_flags,
 				   __be32 remote, __be32 local,
 				   __be32 key)
 {
 	unsigned int hash;
 	struct ip_tunnel *t, *cand = NULL;
 	struct hlist_head *head;
+	__be32 hash_key;
 
-	hash = ip_tunnel_hash(key, remote);
+	hash_key = (lookup_flags & TUNNEL_LOOKUP_NO_KEY) ? 0 : key;
+	hash = ip_tunnel_hash(hash_key, remote);
 	head = &itn->tunnels[hash];
 
 	hlist_for_each_entry_rcu(t, head, hash_node) {
@@ -111,7 +115,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
 		    !(t->dev->flags & IFF_UP))
 			continue;
 
-		if (!ip_tunnel_key_match(&t->parms, flags, key))
+		if (!ip_tunnel_key_match(&t->parms, flags, lookup_flags, key))
 			continue;
 
 		if (t->parms.link == link)
@@ -126,7 +130,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
 		    !(t->dev->flags & IFF_UP))
 			continue;
 
-		if (!ip_tunnel_key_match(&t->parms, flags, key))
+		if (!ip_tunnel_key_match(&t->parms, flags, lookup_flags, key))
 			continue;
 
 		if (t->parms.link == link)
@@ -135,7 +139,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
 			cand = t;
 	}
 
-	hash = ip_tunnel_hash(key, 0);
+	hash = ip_tunnel_hash(hash_key, 0);
 	head = &itn->tunnels[hash];
 
 	hlist_for_each_entry_rcu(t, head, hash_node) {
@@ -146,7 +150,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
 		if (!(t->dev->flags & IFF_UP))
 			continue;
 
-		if (!ip_tunnel_key_match(&t->parms, flags, key))
+		if (!ip_tunnel_key_match(&t->parms, flags, lookup_flags, key))
 			continue;
 
 		if (t->parms.link == link)
@@ -238,7 +242,7 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
 		    remote == t->parms.iph.daddr &&
 		    link == t->parms.link &&
 		    type == t->dev->type &&
-		    ip_tunnel_key_match(&t->parms, flags, key))
+		    ip_tunnel_key_match(&t->parms, flags, 0, key))
 			break;
 	}
 	return t;
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 949f432a5f..804cee8126 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -57,7 +57,7 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 	struct net *net = dev_net(skb->dev);
 	struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
 
-	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
 				  iph->saddr, iph->daddr, 0);
 	if (tunnel) {
 		if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
@@ -278,7 +278,7 @@ static int vti4_err(struct sk_buff *skb, u32 info)
 	int protocol = iph->protocol;
 	struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
 
-	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
 				  iph->daddr, iph->saddr, 0);
 	if (!tunnel)
 		return -1;
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index c891235b49..81f94ffb92 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -167,7 +167,7 @@ static int ipip_err(struct sk_buff *skb, u32 info)
 		goto out;
 	}
 
-	t = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+	t = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
 			     iph->daddr, iph->saddr, 0);
 	if (!t) {
 		err = -ENOENT;
@@ -224,8 +224,8 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
 	const struct iphdr *iph;
 
 	iph = ip_hdr(skb);
-	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
-			iph->saddr, iph->daddr, 0);
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY, 0,
+				  iph->saddr, iph->daddr, 0);
 	if (tunnel) {
 		const struct tnl_ptk_info *tpi;
 
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 1/7] net: xfrm: Don't check for TUNNEL_KEY when hashing VTI tunnels.
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem, Lorenzo Colitti
In-Reply-To: <20171220170607.41516-1-lorenzo@google.com>

Currently, ip_bucket sets the lookup i_key to 0 if the tunnel's
i_flags have VTI_ISVTI flag set but not TUNNEL_KEY. However, it
can can never happen that TUNNEL_KEY is set if VTI_ISVTI is also
set (see below). Therefore, just drop the check for TUNNEL_KEY
and only set i_key to 0 on VTI_ISVTI.

This will allow the VTI code to set TUNNEL_KEY on certain
tunnels in a future change.

None of the callers of ip_bucket pass in TUNNEL_KEY | VTI_ISVTI.
The call graph is as follows:

- ip_tunnel_add
  - ip_tunnel_create
    - ip_tunnel_ioctl
      - ipgre_tunnel_ioctl: can set TUNNEL_KEY but not VTI_ISVTI
      - ipip_tunnel_ioctl: hardcodes i_flags to 0
      - vti_tunnel_ioctl: hardcodes i_flags to VTI_ISVTI
  - ip_tunnel_update: doesn't touch i_flags
  - ip_tunnel_init_net: memsets flags to 0
  - ip_tunnel_newlink
    - ipgre_newlink
      - ipgre_netlink_parms: can set TUNNEL_KEY but not VTI_ISVTI
    - vti_newlink: hardcodes i_flags to VTI_ISVTI
  - ip_tunnel_changelink: doesn't set flags
- ip_tunnel_find
  - ip_tunnel_ioctl (see above)
  - ip_tunnel_newlink (see above)
  - ip_tunnel_changelink (see above)

VTI_ISVTI has the same value as TUNNEL_DONT_FRAGMENT, but that
is never set into tunnel parameters.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 net/ipv4/ip_tunnel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 5ddb1cb52b..539c8f22c4 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -198,7 +198,7 @@ static struct hlist_head *ip_bucket(struct ip_tunnel_net *itn,
 	else
 		remote = 0;
 
-	if (!(parms->i_flags & TUNNEL_KEY) && (parms->i_flags & VTI_ISVTI))
+	if (parms->i_flags & VTI_ISVTI)
 		i_key = 0;
 
 	h = ip_tunnel_hash(i_key, remote);
-- 
2.15.1.620.gb9897f4670-goog

^ permalink raw reply related

* [PATCH ipsec-next 0/7]: Support multiple VTIs with the same src+dst pair
From: Lorenzo Colitti @ 2017-12-20 17:06 UTC (permalink / raw)
  To: netdev; +Cc: steffen.klassert, subashab, nharold, davem

When using IPsec tunnel mode, VTIs provide many benefits compared
to direct configuration of xfrm policies / states. However, one
limitation is that there can only be one VTI between a given pair
of IP addresses. This does not allow configuring multiple IPsec
tunnels to the same security gateway. This is required by some
deployments, for example I-WLAN [3GPP TS 24.327].

This patchset introduces a new VTI_KEYED flag that allows
configuration of multiple VTIs between the same IP address
pairs. The semantics are as follows:

- The output path is the same as current VTI behaviour, where a
  routing lookup selects a VTI interface, and the VTI's okey
  specifies the mark to use in the XFRM lookup.
- The input and ICMP error paths instead work by first looking up
  an SA with a loose match that ignores the mark. That mark is
  then used to find the tunnel by ikey (for input packets) or
  okey (for ICMP errors).

In order for ICMP errors to work, flags are added to the common
IP lookup functions to ignore the tunnel ikey and to look up
tunnels by okey instead of ikey.

On the same IP address pair, keyed VTIs can coexist with each
other (as long as the ikeys are different), but cannot coexist
with keyless VTIs. This is because the existing keyless VTI
behaviour (which this series does not change) is to always accept
packets matching an input policy, regardless of whether there is
any matching XFRM state. Thus, the keyless VTI would accept the
traffic for the keyed tunnel and drop it because it would not
match the keyed tunnel's state.

Changes from RFC series:
- Processing of ICMP errors now works when ikey != okey.
- Series now contains changes to the common tunnel lookup
  functions to match tunnels by okey and to ignore ikey when
  matching.
- Fixed missing EXPORT_SYMBOL for xfrm_state_lookup_loose.
- Made vti6_lookup static as it should have been.

^ permalink raw reply

* Re: [PATCH net-next] qed*: Utilize FW 8.33.1.0
From: David Miller @ 2017-12-20 16:54 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: Tomer.Tayar, netdev, linux-rdma, linux-scsi, Ariel.Elior,
	Michal.Kalderon, Yuval.Bason, Ram.Amrani, Manish.Chopra,
	Chad.Dupuis, Manish.Rangankar
In-Reply-To: <20171219154651.30930d2a@cakuba.netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Tue, 19 Dec 2017 15:46:51 -0800

> On Tue, 19 Dec 2017 16:05:23 +0200, Tomer Tayar wrote:
>> Sorry for the very long patch.
>> The firmware changes are spread all over w/o a good modularity.
> 
> Rings false.  Significant portion of this patch is just whitespace 
> and comment changes.

Totally agreed.

This thing is beyond huge already, adding unrelated whitespace and
comment changes make reviewing it even more impossible.

^ permalink raw reply

* Re: [PATCH net-next v4] ip6_vti: adjust vti mtu according to mtu of lower device
From: David Miller @ 2017-12-20 16:53 UTC (permalink / raw)
  To: alexey.kodanev; +Cc: netdev, steffen.klassert, pvorel, shannon.nelson
In-Reply-To: <1513691961-19692-1-git-send-email-alexey.kodanev@oracle.com>

From: Alexey Kodanev <alexey.kodanev@oracle.com>
Date: Tue, 19 Dec 2017 16:59:21 +0300

> LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams over
> ip6_vti that require fragmentation and the underlying device has an
> MTU smaller than 1500 plus some extra space for headers. This happens
> because ip6_vti, by default, sets MTU to ETH_DATA_LEN and not updating
> it depending on a destination address or link parameter. Further
> attempts to send UDP packets may succeed because pmtu gets updated on
> ICMPV6_PKT_TOOBIG in vti6_err().
> 
> In case the lower device has larger MTU size, e.g. 9000, ip6_vti works
> but not using the possible maximum size, output packets have 1500 limit.
> 
> The above cases require manual MTU setup after ip6_vti creation. However
> ip_vti already updates MTU based on lower device with ip_tunnel_bind_dev().
> 
> Here is the example when the lower device MTU is set to 9000:
 ...
> Reported-by: Petr Vorel <pvorel@suse.cz>
> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>

Applied, thanks Alexey.

^ permalink raw reply

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
From: David Miller @ 2017-12-20 16:36 UTC (permalink / raw)
  To: ilyal; +Cc: borisp, jiri, netdev, davejwatson, tom, hannes, aviadye, liranl
In-Reply-To: <AM4PR0501MB27231C4EC47FB287D7FF1EC0D40C0@AM4PR0501MB2723.eurprd05.prod.outlook.com>

From: Ilya Lesokhin <ilyal@mellanox.com>
Date: Wed, 20 Dec 2017 16:23:03 +0000

>> Whether you provide the API addition patches and the user in the same patch
>> series, or a separate one, doesn't really matter.  What is important is that this
>> is accessible to the reviewer at the same time.
> 
> Note that we did provide a user in an accessible place.

That is not accessible for people reading netdev, it needs to be posted
on the netdev list.

It is never appropriate to require a reviewer to look at some external
site to review a patch series posted here.

^ permalink raw reply

* Re: [PATCHv3 net-next 00/14] net: sched: sch: introduce extack support
From: Alexander Aring @ 2017-12-20 16:35 UTC (permalink / raw)
  To: David Miller
  Cc: Jamal Hadi Salim, Cong Wang, Jiří Pírko, netdev,
	kernel, David Ahern
In-Reply-To: <20171220.113249.514392331552783606.davem@davemloft.net>

Hi,

On Wed, Dec 20, 2017 at 11:32 AM, David Miller <davem@davemloft.net> wrote:
> From: Alexander Aring <aring@mojatatu.com>
> Date: Mon, 18 Dec 2017 17:44:59 -0500
>
>> this patch series basically add support for extack in common qdisc handling.
>> Additional it adds extack pointer to common qdisc callback handling this
>> offers per qdisc implementation to setting the extack message for each
>> failure over netlink.
>
> This patch series doesn't apply cleanly to net-next.

okay, I will rebase it and send v4.

Thanks.

- Alex

^ permalink raw reply

* Re: [PATCHv3 net-next 00/14] net: sched: sch: introduce extack support
From: David Miller @ 2017-12-20 16:32 UTC (permalink / raw)
  To: aring; +Cc: jhs, xiyou.wangcong, jiri, netdev, kernel, dsahern
In-Reply-To: <20171218224513.29836-1-aring@mojatatu.com>

From: Alexander Aring <aring@mojatatu.com>
Date: Mon, 18 Dec 2017 17:44:59 -0500

> this patch series basically add support for extack in common qdisc handling.
> Additional it adds extack pointer to common qdisc callback handling this
> offers per qdisc implementation to setting the extack message for each
> failure over netlink.

This patch series doesn't apply cleanly to net-next.

^ permalink raw reply

* [PATCH net] ip6_gre: fix device features for ioctl setup
From: Alexey Kodanev @ 2017-12-20 16:36 UTC (permalink / raw)
  To: netdev; +Cc: Alexander Duyck, David Miller, Alexey Kodanev

When ip6gre is created using ioctl, its features, such as
scatter-gather, GSO and tx-checksumming will be turned off:

  # ip -f inet6 tunnel add gre6 mode ip6gre remote fd00::1
  # ethtool -k gre6 (truncated output)
    tx-checksumming: off
    scatter-gather: off
    tcp-segmentation-offload: off
    generic-segmentation-offload: off [requested on]

But when netlink is used, they will be enabled:
  # ip link add gre6 type ip6gre remote fd00::1
  # ethtool -k gre6 (truncated output)
    tx-checksumming: on
    scatter-gather: on
    tcp-segmentation-offload: on
    generic-segmentation-offload: on

This results in a loss of performance when gre6 is created via ioctl.
The issue was found with LTP/gre tests.

Fix it by moving the setup of device features to a separate function
and invoke it with ndo_init callback because both netlink and ioctl
will eventually call it via register_netdevice():

   register_netdevice()
       - ndo_init() callback -> ip6gre_tunnel_init() or ip6gre_tap_init()
           - ip6gre_tunnel_init_common()
                - ip6gre_tnl_init_features()

The moved code also contains two minor style fixes:
  * removed needless tab from GRE6_FEATURES on NETIF_F_HIGHDMA line.
  * fixed the issue reported by checkpatch: "Unnecessary parentheses around
    'nt->encap.type == TUNNEL_ENCAP_NONE'"

Fixes: ac4eb009e477 ("ip6gre: Add support for basic offloads offloads excluding GSO")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---
 net/ipv6/ip6_gre.c |   57 +++++++++++++++++++++++++++++----------------------
 1 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 4cfd8e0..9a4b376 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1014,6 +1014,36 @@ static void ip6gre_tunnel_setup(struct net_device *dev)
 	eth_random_addr(dev->perm_addr);
 }
 
+#define GRE6_FEATURES (NETIF_F_SG |		\
+		       NETIF_F_FRAGLIST |	\
+		       NETIF_F_HIGHDMA |	\
+		       NETIF_F_HW_CSUM)
+
+static void ip6gre_tnl_init_features(struct net_device *dev)
+{
+	struct ip6_tnl *nt = netdev_priv(dev);
+
+	dev->features		|= GRE6_FEATURES;
+	dev->hw_features	|= GRE6_FEATURES;
+
+	if (!(nt->parms.o_flags & TUNNEL_SEQ)) {
+		/* TCP offload with GRE SEQ is not supported, nor
+		 * can we support 2 levels of outer headers requiring
+		 * an update.
+		 */
+		if (!(nt->parms.o_flags & TUNNEL_CSUM) ||
+		    nt->encap.type == TUNNEL_ENCAP_NONE) {
+			dev->features    |= NETIF_F_GSO_SOFTWARE;
+			dev->hw_features |= NETIF_F_GSO_SOFTWARE;
+		}
+
+		/* Can use a lockless transmit, unless we generate
+		 * output sequences
+		 */
+		dev->features |= NETIF_F_LLTX;
+	}
+}
+
 static int ip6gre_tunnel_init_common(struct net_device *dev)
 {
 	struct ip6_tnl *tunnel;
@@ -1048,6 +1078,8 @@ static int ip6gre_tunnel_init_common(struct net_device *dev)
 	if (!(tunnel->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
 		dev->mtu -= 8;
 
+	ip6gre_tnl_init_features(dev);
+
 	return 0;
 }
 
@@ -1298,11 +1330,6 @@ static int ip6gre_tap_init(struct net_device *dev)
 	.ndo_get_iflink = ip6_tnl_get_iflink,
 };
 
-#define GRE6_FEATURES (NETIF_F_SG |		\
-		       NETIF_F_FRAGLIST |	\
-		       NETIF_F_HIGHDMA |		\
-		       NETIF_F_HW_CSUM)
-
 static void ip6gre_tap_setup(struct net_device *dev)
 {
 
@@ -1382,26 +1409,6 @@ static int ip6gre_newlink(struct net *src_net, struct net_device *dev,
 	nt->net = dev_net(dev);
 	ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
 
-	dev->features		|= GRE6_FEATURES;
-	dev->hw_features	|= GRE6_FEATURES;
-
-	if (!(nt->parms.o_flags & TUNNEL_SEQ)) {
-		/* TCP offload with GRE SEQ is not supported, nor
-		 * can we support 2 levels of outer headers requiring
-		 * an update.
-		 */
-		if (!(nt->parms.o_flags & TUNNEL_CSUM) ||
-		    (nt->encap.type == TUNNEL_ENCAP_NONE)) {
-			dev->features    |= NETIF_F_GSO_SOFTWARE;
-			dev->hw_features |= NETIF_F_GSO_SOFTWARE;
-		}
-
-		/* Can use a lockless transmit, unless we generate
-		 * output sequences
-		 */
-		dev->features |= NETIF_F_LLTX;
-	}
-
 	err = register_netdevice(dev);
 	if (err)
 		goto out;
-- 
1.7.1

^ permalink raw reply related

* Re: [QUESTION] Doubt about NAPI_GRO_CB(skb)->is_atomic in tcpv4 gro process
From: Alexander Duyck @ 2017-12-20 16:24 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: netdev@vger.kernel.org, davem@davemloft.net, linuxarm@huawei.com,
	yuxiaowu, wzhen.wang, Xuehuahu
In-Reply-To: <ca367d0d-b0d7-3357-7196-c0da17ef9890@huawei.com>

On Wed, Dec 20, 2017 at 1:09 AM, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> Hi, all
>         I have some doubt about NAPI_GRO_CB(skb)->is_atomic when
> analyzing the tcpv4 gro process:
>
> Firstly we set NAPI_GRO_CB(skb)->is_atomic to 1 in dev_gro_receive:
> https://elixir.free-electrons.com/linux/v4.15-rc4/source/net/core/dev.c#L4838
>
> And then in inet_gro_receive, we check the NAPI_GRO_CB(skb)->is_atomic
> before setting NAPI_GRO_CB(skb)->is_atomic according to IP_DF bit in the ip header:
> https://elixir.free-electrons.com/linux/v4.15-rc4/source/net/ipv4/af_inet.c#L1319
>
> struct sk_buff **inet_gro_receive(struct sk_buff **head, struct sk_buff *skb)
> {
> .....................
>         for (p = *head; p; p = p->next) {
> ........................
>
>                 /* If the previous IP ID value was based on an atomic
>                  * datagram we can overwrite the value and ignore it.
>                  */
>                 if (NAPI_GRO_CB(skb)->is_atomic)                      //we check it here
>                         NAPI_GRO_CB(p)->flush_id = flush_id;
>                 else
>                         NAPI_GRO_CB(p)->flush_id |= flush_id;
>         }
>
>         NAPI_GRO_CB(skb)->is_atomic = !!(iph->frag_off & htons(IP_DF));  //we set it here
>         NAPI_GRO_CB(skb)->flush |= flush;
>         skb_set_network_header(skb, off);
> ................................
> }
>
> My question is whether we should check the NAPI_GRO_CB(skb)->is_atomic or NAPI_GRO_CB(p)->is_atomic?
> If we should check NAPI_GRO_CB(skb)->is_atomic, then maybe it is unnecessary because it is alway true.
> If we should check NAPI_GRO_CB(p)->is_atomic, maybe there is a bug here.
>
> So what is the logic here? I am just start analyzing the gro, maybe I miss something obvious here.

The logic there is to address the multiple IP header case where there
are 2 or more IP headers due to things like VXLAN or GRE tunnels. So
what will happen is that an outer IP header will end up being sent
with DF not set and will clear the is_atomic value then we want to OR
in the next header that is applied. It defaults to assignment on
is_atomic because the first IP header will encounter flush_id with no
previous configuration occupying it.

The part I am not sure about is if we should be using assignment for
is_atomic or using an "&=" to clear the bit and leave it cleared. I
don't know if there has been much testing of multiple levels of tunnel
header.

Thanks.

- Alex

^ permalink raw reply

* Re: [PATCH v3 ipsec-next 3/3] xfrm: wrap xfrmdev_ops with offload config
From: Shannon Nelson @ 2017-12-20 16:22 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: steffen.klassert, netdev
In-Reply-To: <20171220160240.GK6122@localhost.localdomain>

On 12/20/2017 8:03 AM, Marcelo Ricardo Leitner wrote:
> On Tue, Dec 19, 2017 at 03:35:49PM -0800, Shannon Nelson wrote:
>> There's no reason to define netdev->xfrmdev_ops if
>> the offload facility is not CONFIG'd in.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> 
> This one could use a Fixes tag perhaps:
> Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
> 
> as in theory the build was broken since then, as it added:
> +#ifdef CONFIG_XFRM_OFFLOAD
> +struct xfrmdev_ops {
> ...
> +#ifdef CONFIG_XFRM
> +       const struct xfrmdev_ops *xfrmdev_ops;
> 
> So the pointer would have an undefined type
>    if CONFIG_XFRM && !CONFIG_XFRM_OFFLOAD
> Though I couldn't reproduce this, not sure why.

Hmmm, I don't think this requires a "Fixes" tag, as the code all worked 
just fine, I'm just doing a little cleaning.

Patch 2/3 adds a more intense look at the data structure, so I needed to 
change it to the CONFIG_XFRM_OFFLOAD so as to not break the build. 
Since the xfrmdev_ops field is now never used unless we have 
CONFIG_XFRM_OFFLOAD, we can change the net_device definition to be just 
a bit smaller without it.

> 
> But.. is it buildable with this patch? I mine failed:
> 
> obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_hash.o \
>                        xfrm_input.o xfrm_output.o \
>                        xfrm_sysctl.o xfrm_replay.o xfrm_device.o
> 
> so xfrm_device is always in if CONFIG_XFRM is there,
> xfrm_dev_init(), via xfrm_dev_notifier -> xfrm_dev_event() ->
>    xfrm_dev_register() and then:
> 
> static int xfrm_dev_register(struct net_device *dev)
> {
>          if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)

This looks like you haven't applied version 3 of the 2nd patch "xfrm: 
check for xdo_dev_ops add and delete".  I missed this in the earlier 
version (not enough compile tests), but version 3 of patch 2/3  should 
address it.

sln

>                                                   ^^^^^^^^^^^^^^^^
> 
> We can't control CONFIG_XFRM_OFFLOAD directly, so unless you
> unselected other offloadings such as INET_ESP_OFFLOAD, it is still on.
> 
> linux/net/xfrm/xfrm_device.c: In function ‘xfrm_dev_register’:
> linux/net/xfrm/xfrm_device.c:147:48: error: ‘struct net_device’ has no member named ‘xfrmdev_ops’; did you mean ‘netdev_ops’?
>    if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)
>                                                  ^~~~~~~~~~~
>                                                  netdev_ops
> 
> 
>> ---
>>   include/linux/netdevice.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 2eaac7d..145d0de 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -1697,7 +1697,7 @@ struct net_device {
>>   	const struct ndisc_ops *ndisc_ops;
>>   #endif
>>   
>> -#ifdef CONFIG_XFRM
>> +#ifdef CONFIG_XFRM_OFFLOAD
>>   	const struct xfrmdev_ops *xfrmdev_ops;
>>   #endif
>>   
>> -- 
>> 2.7.4
>>

^ permalink raw reply

* RE: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
From: Ilya Lesokhin @ 2017-12-20 16:23 UTC (permalink / raw)
  To: David Miller, Boris Pismenny
  Cc: jiri@resnulli.us, netdev@vger.kernel.org, davejwatson@fb.com,
	tom@herbertland.com, hannes@stressinduktion.org, Aviad Yehezkel,
	Liran Liss
In-Reply-To: <20171220.111208.1328340432834146497.davem@davemloft.net>

> 
> > Dave, would you prefer to get the driver patches that use this infra
> > before the infra?
> 
> The arguments you present are silly.
> 
> In order to analyze any proposed API, the users of it must be presented for the
> reviewers to see as well.
> 
> Logically, you must have tried to make use of the APIs to see how well they
> work and are usable for at least one such user, right?
Right, we agree.
> 
> Therefore, the use case exists, and you must present it alongside the API
> proposal.
> 
> Whether you provide the API addition patches and the user in the same patch
> series, or a separate one, doesn't really matter.  What is important is that this
> is accessible to the reviewer at the same time.

Note that we did provide a user in an accessible place.
https://github.com/Mellanox/tls-offload/tree/tls_device_v3
The link was at the bottom of the cover letter.

We just feel that the code there is not yet ready for upstream submission, and it might have
conflicts with other stuff submitted by Mellanox.

Would it be better if we submitted the mlx5e TLS support as an RFC alongside the TLS
Infrastructure patches?

^ permalink raw reply

* Re: sparc64 verifier failures..
From: David Miller @ 2017-12-20 16:17 UTC (permalink / raw)
  To: daniel; +Cc: netdev, alexei.starovoitov
In-Reply-To: <58565df5-2c45-6ab3-fb48-de011da70e6a@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed, 20 Dec 2017 14:05:36 +0100

> On 12/19/2017 09:36 PM, David Miller wrote:
>> 
>> I'm getting about 100 verifier failures on sparc64.
>> 
>> The vast majority of them seem to be due to misaligned packet
>> accesses.  Here is a sample of some of the failures.
> 
> Thanks, I'll check it next days. It would probably make sense to have
> a --strict-align mode for test_verifier that enables it on all tests
> unconditionally so selftests suite would always run in both modes (at
> least on archs that don't have this restriction).

Indeed, and for some reason I thought we were already doing this.

^ permalink raw reply

* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
From: David Miller @ 2017-12-20 16:12 UTC (permalink / raw)
  To: borisp; +Cc: jiri, ilyal, netdev, davejwatson, tom, hannes, aviadye, liranl
In-Reply-To: <HE1PR0501MB223554DC2840A95B594ED0D1B00C0@HE1PR0501MB2235.eurprd05.prod.outlook.com>

From: Boris Pismenny <borisp@mellanox.com>
Date: Wed, 20 Dec 2017 08:28:03 +0000

> Dave, would you prefer to get the driver patches that use this infra
> before the infra?

The arguments you present are silly.

In order to analyze any proposed API, the users of it must be presented
for the reviewers to see as well.

Logically, you must have tried to make use of the APIs to see how well
they work and are usable for at least one such user, right?

Therefore, the use case exists, and you must present it alongside the
API proposal.

Whether you provide the API addition patches and the user in the same
patch series, or a separate one, doesn't really matter.  What is
important is that this is accessible to the reviewer at the same
time.

^ permalink raw reply

* Re: [PATCH net-next] dev: Correctly get length of alias string in dev_set_alias()
From: David Miller @ 2017-12-20 16:05 UTC (permalink / raw)
  To: serhe.popovych; +Cc: netdev
In-Reply-To: <1513633115-16940-1-git-send-email-serhe.popovych@gmail.com>

From: Serhey Popovych <serhe.popovych@gmail.com>
Date: Mon, 18 Dec 2017 23:38:35 +0200

> We supply number of bytes available in @alias via @len
> parameter to dev_set_alias() which is not the same
> as zero terminated string length that can be shorter.
> 
> Both dev_set_alias() users (rtnetlink and sysfs) can
> submit number of bytes up to IFALIASZ with actual string
> length slightly shorter by putting '\0' not at @len - 1.
> 
> Use strnlen() to get length of zero terminated string
> and not access beyond @len. Correct comment about @len
> and explain how to unset alias (i.e. use zero for @len).
> 
> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>

I don't really see this as useful, really.

In the sysfs case, we are not presented with a NULL terminated string.
Instead, the net sysfs code gives us a length that goes up until the
trailing newline character.  The sysfs case is never larger than the
actual string size + 1.

The netlink attribute is usually sized appropriately for whatever the
string length actually is.

This therefore just seems to add an new strnlen() unnecessarily to
this code path, which rarely does anything helpful.

Thanks.

^ permalink raw reply

* Re: [PATCH v3 ipsec-next 3/3] xfrm: wrap xfrmdev_ops with offload config
From: Marcelo Ricardo Leitner @ 2017-12-20 16:03 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: steffen.klassert, netdev
In-Reply-To: <1513726549-7065-4-git-send-email-shannon.nelson@oracle.com>

On Tue, Dec 19, 2017 at 03:35:49PM -0800, Shannon Nelson wrote:
> There's no reason to define netdev->xfrmdev_ops if
> the offload facility is not CONFIG'd in.
> 
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>

This one could use a Fixes tag perhaps:
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")

as in theory the build was broken since then, as it added:
+#ifdef CONFIG_XFRM_OFFLOAD
+struct xfrmdev_ops {
...
+#ifdef CONFIG_XFRM
+       const struct xfrmdev_ops *xfrmdev_ops;

So the pointer would have an undefined type
  if CONFIG_XFRM && !CONFIG_XFRM_OFFLOAD
Though I couldn't reproduce this, not sure why.

But.. is it buildable with this patch? I mine failed:

obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_hash.o \
                      xfrm_input.o xfrm_output.o \
                      xfrm_sysctl.o xfrm_replay.o xfrm_device.o

so xfrm_device is always in if CONFIG_XFRM is there,
xfrm_dev_init(), via xfrm_dev_notifier -> xfrm_dev_event() ->
  xfrm_dev_register() and then:

static int xfrm_dev_register(struct net_device *dev)
{
        if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)
                                                 ^^^^^^^^^^^^^^^^

We can't control CONFIG_XFRM_OFFLOAD directly, so unless you
unselected other offloadings such as INET_ESP_OFFLOAD, it is still on.

linux/net/xfrm/xfrm_device.c: In function ‘xfrm_dev_register’:
linux/net/xfrm/xfrm_device.c:147:48: error: ‘struct net_device’ has no member named ‘xfrmdev_ops’; did you mean ‘netdev_ops’?
  if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)
                                                ^~~~~~~~~~~
                                                netdev_ops


> ---
>  include/linux/netdevice.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 2eaac7d..145d0de 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1697,7 +1697,7 @@ struct net_device {
>  	const struct ndisc_ops *ndisc_ops;
>  #endif
>  
> -#ifdef CONFIG_XFRM
> +#ifdef CONFIG_XFRM_OFFLOAD
>  	const struct xfrmdev_ops *xfrmdev_ops;
>  #endif
>  
> -- 
> 2.7.4
> 

^ permalink raw reply

* [PATCH v2] net: ibm: emac: support RGMII-[RX|TX]ID phymode
From: Christian Lamparter @ 2017-12-20 16:02 UTC (permalink / raw)
  To: netdev; +Cc: David S . Miller, Andrew Lunn, Christophe Jaillet

The RGMII spec allows compliance for devices that implement an internal
delay on TXC and/or RXC inside the transmitter. This patch adds the
necessary RGMII_[RX|TX]ID mode code to handle such PHYs with the
emac driver.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>

---
v2: - utilize phy_interface_mode_is_rgmii()
---
 drivers/net/ethernet/ibm/emac/core.c  |  4 ++--
 drivers/net/ethernet/ibm/emac/emac.h  |  3 +++
 drivers/net/ethernet/ibm/emac/rgmii.c | 10 ++++++++--
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index 7feff2450ed6..043e72e28bba 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -199,8 +199,8 @@ static void __emac_set_multicast_list(struct emac_instance *dev);
 
 static inline int emac_phy_supports_gige(int phy_mode)
 {
-	return  phy_mode == PHY_MODE_GMII ||
-		phy_mode == PHY_MODE_RGMII ||
+	return  phy_interface_mode_is_rgmii(phy_mode) ||
+		phy_mode == PHY_MODE_GMII ||
 		phy_mode == PHY_MODE_SGMII ||
 		phy_mode == PHY_MODE_TBI ||
 		phy_mode == PHY_MODE_RTBI;
diff --git a/drivers/net/ethernet/ibm/emac/emac.h b/drivers/net/ethernet/ibm/emac/emac.h
index 5afcc27ceebb..8c6d2af7281b 100644
--- a/drivers/net/ethernet/ibm/emac/emac.h
+++ b/drivers/net/ethernet/ibm/emac/emac.h
@@ -112,6 +112,9 @@ struct emac_regs {
 #define PHY_MODE_RMII	PHY_INTERFACE_MODE_RMII
 #define PHY_MODE_SMII	PHY_INTERFACE_MODE_SMII
 #define PHY_MODE_RGMII	PHY_INTERFACE_MODE_RGMII
+#define PHY_MODE_RGMII_ID	PHY_INTERFACE_MODE_RGMII_ID
+#define PHY_MODE_RGMII_RXID	PHY_INTERFACE_MODE_RGMII_RXID
+#define PHY_MODE_RGMII_TXID	PHY_INTERFACE_MODE_RGMII_TXID
 #define PHY_MODE_TBI	PHY_INTERFACE_MODE_TBI
 #define PHY_MODE_GMII	PHY_INTERFACE_MODE_GMII
 #define PHY_MODE_RTBI	PHY_INTERFACE_MODE_RTBI
diff --git a/drivers/net/ethernet/ibm/emac/rgmii.c b/drivers/net/ethernet/ibm/emac/rgmii.c
index c4a1ac38bba8..124b0473d2b7 100644
--- a/drivers/net/ethernet/ibm/emac/rgmii.c
+++ b/drivers/net/ethernet/ibm/emac/rgmii.c
@@ -52,9 +52,9 @@
 /* RGMII bridge supports only GMII/TBI and RGMII/RTBI PHYs */
 static inline int rgmii_valid_mode(int phy_mode)
 {
-	return  phy_mode == PHY_MODE_GMII ||
+	return  phy_interface_mode_is_rgmii(phy_mode) ||
+		phy_mode == PHY_MODE_GMII ||
 		phy_mode == PHY_MODE_MII ||
-		phy_mode == PHY_MODE_RGMII ||
 		phy_mode == PHY_MODE_TBI ||
 		phy_mode == PHY_MODE_RTBI;
 }
@@ -63,6 +63,9 @@ static inline const char *rgmii_mode_name(int mode)
 {
 	switch (mode) {
 	case PHY_MODE_RGMII:
+	case PHY_MODE_RGMII_ID:
+	case PHY_MODE_RGMII_RXID:
+	case PHY_MODE_RGMII_TXID:
 		return "RGMII";
 	case PHY_MODE_TBI:
 		return "TBI";
@@ -81,6 +84,9 @@ static inline u32 rgmii_mode_mask(int mode, int input)
 {
 	switch (mode) {
 	case PHY_MODE_RGMII:
+	case PHY_MODE_RGMII_ID:
+	case PHY_MODE_RGMII_RXID:
+	case PHY_MODE_RGMII_TXID:
 		return RGMII_FER_RGMII(input);
 	case PHY_MODE_TBI:
 		return RGMII_FER_TBI(input);
-- 
2.15.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox