* Re: [PATCH 03/17] netfilter: add namespace support for l3proto
From: Pablo Neira Ayuso @ 2012-05-24 10:04 UTC (permalink / raw)
To: Gao feng; +Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano
In-Reply-To: <4FBD95AA.8070301@cn.fujitsu.com>
On Thu, May 24, 2012 at 09:58:02AM +0800, Gao feng wrote:
> On 2012-05-23 18:29, Pablo Neira Ayuso wrote:
> > On Mon, May 14, 2012 at 04:52:13PM +0800, Gao feng wrote:
[...]
> >> diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
> >> index 6d68727..7ee6653 100644
> >> --- a/net/netfilter/nf_conntrack_proto.c
> >> +++ b/net/netfilter/nf_conntrack_proto.c
> >> @@ -170,85 +170,116 @@ static int kill_l4proto(struct nf_conn *i, void *data)
> >> nf_ct_l3num(i) == l4proto->l3proto;
> >> }
> >>
> >> -static int nf_ct_l3proto_register_sysctl(struct nf_conntrack_l3proto *l3proto)
> >> +static struct nf_ip_net *nf_ct_l3proto_net(struct net *net,
> >> + struct nf_conntrack_l3proto *l3proto)
> >> +{
> >> + if (l3proto->l3proto == PF_INET)
> >> + return &net->ct.proto;
> >> + else
> >> + return NULL;
> >> +}
> >> +
> >> +static int nf_ct_l3proto_register_sysctl(struct net *net,
> >> + struct nf_conntrack_l3proto *l3proto)
> >> {
> >> int err = 0;
> >> + struct nf_ip_net *in = nf_ct_l3proto_net(net, l3proto);
> >>
> >> -#ifdef CONFIG_SYSCTL
> >> - if (l3proto->ctl_table != NULL) {
> >> - err = nf_ct_register_sysctl(&init_net,
> >> - &l3proto->ctl_table_header,
> >> + if (in == NULL)
> >> + return 0;
> >
> > Under what circumstances can 'in' be NULL?
>
> Because l3proto_ipv6 doesn't need sysctl, l3proto_ipv6's nf_ip_net is NULL.
> Please see the function nf_ct_l3proto_net above.
Then, please add a comment there to explain that some per-net protocol
information may be missing since no sysctl is supported.
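A minimal user-space sketch of what such a comment could look like follows. All mock_* names and types are hypothetical stand-ins for the kernel structures, not the real definitions, and the registration step is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum { MOCK_PF_INET = 2, MOCK_PF_INET6 = 10 };

struct mock_ip_net { int sysctl_registered; };
struct mock_net { struct mock_ip_net proto; };
struct mock_l3proto { int l3proto; };

/*
 * Return the per-net data of an L3 protocol tracker.
 *
 * Only IPv4 keeps per-net sysctl data here. Other trackers (e.g. IPv6)
 * support no sysctl, so they have no per-net data and NULL is returned.
 */
static struct mock_ip_net *mock_l3proto_net(struct mock_net *net,
					    const struct mock_l3proto *l3proto)
{
	assert(net != NULL && l3proto != NULL);

	if (l3proto->l3proto == MOCK_PF_INET)
		return &net->proto;
	return NULL;
}

/* Registration is a successful no-op when there is no per-net data. */
static int mock_l3proto_register_sysctl(struct mock_net *net,
					const struct mock_l3proto *l3proto)
{
	struct mock_ip_net *in = mock_l3proto_net(net, l3proto);

	if (in == NULL)
		return 0;

	in->sysctl_registered = 1;
	return 0;
}
```

With the comment placed on the lookup helper, the early "return 0" in the caller no longer needs its own explanation.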
^ permalink raw reply
* Re: [PATCH 02/17] netfilter: add namespace support for l4proto
From: Pablo Neira Ayuso @ 2012-05-24 10:00 UTC (permalink / raw)
To: Gao feng
Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano,
Gao feng
In-Reply-To: <4FBD9473.5050304@cn.fujitsu.com>
On Thu, May 24, 2012 at 09:52:51AM +0800, Gao feng wrote:
> On 2012-05-23 18:25, Pablo Neira Ayuso wrote:
> > On Mon, May 14, 2012 at 04:52:12PM +0800, Gao feng wrote:
> >> From: Gao feng <gaofeng@cn.fujitus.com>
[...]
> >> @@ -243,137 +253,172 @@ void nf_conntrack_l3proto_unregister(struct nf_conntrack_l3proto *proto)
> >> }
> >> EXPORT_SYMBOL_GPL(nf_conntrack_l3proto_unregister);
> >>
> >> -static int nf_ct_l4proto_register_sysctl(struct nf_conntrack_l4proto *l4proto)
> >> +static struct nf_proto_net *nf_ct_l4proto_net(struct net *net,
> >> + struct nf_conntrack_l4proto *l4proto)
> >> {
> >> - int err = 0;
> >> + if (l4proto->net_id)
> >> + return net_generic(net, *l4proto->net_id);
> >> + else
> >> + return NULL;
> >> +}
> >>
> >> +int nf_ct_l4proto_register_sysctl(struct net *net,
> >> + struct nf_conntrack_l4proto *l4proto)
> >> +{
> >> + int err = 0;
> >> + struct nf_proto_net *pn = nf_ct_l4proto_net(net, l4proto);
> >> + if (pn == NULL)
> >> + return 0;
> >> #ifdef CONFIG_SYSCTL
> >> - if (l4proto->ctl_table != NULL) {
> >> - err = nf_ct_register_sysctl(l4proto->ctl_table_header,
> >> + if (pn->ctl_table != NULL) {
> >> + err = nf_ct_register_sysctl(net,
> >> + &pn->ctl_table_header,
> >> "net/netfilter",
> >> - l4proto->ctl_table,
> >> - l4proto->ctl_table_users);
> >> - if (err < 0)
> >> + pn->ctl_table,
> >> + &pn->users);
> >> + if (err < 0) {
> >> + kfree(pn->ctl_table);
> >> + pn->ctl_table = NULL;
> > ^^^^^^^^^^^
> > Do you really need to set this above to NULL? Is there any existing
> > bug trap? If not, it's superfluous, please, remove it.
> >
> Yes. The ctl_table of l4proto_tcp (and udp, icmp) is stored in netns_ct.proto,
> so when registering l4proto_tcp's sysctl fails, ctl_table would still
> point to the kfreed memory. This would cause a panic the next
> time we register l4proto_tcp's sysctl.
I see, thanks for the clarification.
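The hazard discussed above can be illustrated with a small user-space model. This is only a sketch: the mock_* names are hypothetical, malloc/free stand in for kmemdup/kfree, and it assumes the per-net table is allocated only when the pointer is NULL, which is what makes the stale pointer dangerous.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for per-net data that outlives a single registration attempt. */
struct mock_proto_net {
	int *ctl_table;		/* per-net copy of a template table, or NULL */
};

static const int mock_template[] = { 1, 2, 3 };

/* Hypothetical registration step; fails when 'fail' is non-zero. */
static int mock_register(const int *table, int fail)
{
	(void)table;
	return fail ? -1 : 0;
}

static int mock_register_sysctl(struct mock_proto_net *pn, int fail)
{
	assert(pn != NULL);

	/* Duplicate the template only if no per-net copy exists yet. */
	if (pn->ctl_table == NULL) {
		pn->ctl_table = malloc(sizeof(mock_template));
		if (pn->ctl_table == NULL)
			return -1;
		memcpy(pn->ctl_table, mock_template, sizeof(mock_template));
	}

	if (mock_register(pn->ctl_table, fail) < 0) {
		free(pn->ctl_table);
		/* Without this reset, the next call would reuse freed memory. */
		pn->ctl_table = NULL;
		return -1;
	}
	return 0;
}
```

Because the pointer lives in long-lived per-net storage rather than on the stack, resetting it after the free is what lets a later registration attempt start from a clean state.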
^ permalink raw reply
* Re: [PATCH 01/17] netfilter: add struct nf_proto_net for register l4proto sysctl
From: Pablo Neira Ayuso @ 2012-05-24 9:58 UTC (permalink / raw)
To: Gao feng
Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano,
Gao feng
In-Reply-To: <4FBD9076.6060309@cn.fujitsu.com>
On Thu, May 24, 2012 at 09:35:50AM +0800, Gao feng wrote:
> Hi pablo:
>
> On 2012-05-23 18:12, Pablo Neira Ayuso wrote:
> > On Mon, May 14, 2012 at 04:52:11PM +0800, Gao feng wrote:
> >> From: Gao feng <gaofeng@cn.fujitus.com>
> >>
> >> the struct nf_proto_net stores proto's ctl_table_header and ctl_table,
> >> nf_ct_l4proto_(un)register_sysctl use it to register sysctl.
> >>
> >> there are some changes for struct nf_conntrack_l4proto:
> >> - add field compat to identify if this proto should do compat.
> >> - the net_id field is used to store the pernet_operations id
> >> that belongs to l4proto.
> >> - init_net will be used to initialize the proto's pernet data
> >>
> >> and add init_net for struct nf_conntrack_l3proto too.
> >
> > This patchset looks better but there are still things that we have to
> > resolve.
> >
> > The first one (regarding this patch 1/17) changes in:
> > * include/net/netfilter/nf_conntrack_l4proto.h
> > * include/net/netns/conntrack.h
> >
> > should be included in:
> > [PATCH] netfilter: add namespace support for l4proto
> >
> > And changes in:
> > * include/net/netfilter/nf_conntrack_l3proto.h
> >
> > should be included in:
> > [PATCH] netfilter: add namespace support for l3proto
> >
> > I already told you: a patch that adds a structure without using it
> > is not good. The structure has to go together with the code that uses it.
> >
>
> It seems this patch should be merged into "netfilter: add namespace support for l4proto";
> the struct nf_proto_net is first used there.
>
> > More comments below.
> >
> >> Acked-by: Eric W. Biederman <ebiederm@xmission.com>
> >> Signed-off-by: Gao feng <gaofeng@cn.fujitus.com>
> >> ---
> >> include/net/netfilter/nf_conntrack_l3proto.h | 3 +++
> >> include/net/netfilter/nf_conntrack_l4proto.h | 6 ++++++
> >> include/net/netns/conntrack.h | 12 ++++++++++++
> >> 3 files changed, 21 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/include/net/netfilter/nf_conntrack_l3proto.h b/include/net/netfilter/nf_conntrack_l3proto.h
> >> index 9699c02..9766005 100644
> >> --- a/include/net/netfilter/nf_conntrack_l3proto.h
> >> +++ b/include/net/netfilter/nf_conntrack_l3proto.h
> >> @@ -69,6 +69,9 @@ struct nf_conntrack_l3proto {
> >> struct ctl_table *ctl_table;
> >> #endif /* CONFIG_SYSCTL */
> >>
> >> + /* Init l3proto pernet data */
> >> + int (*init_net)(struct net *net);
> >> +
> >> /* Module (if any) which this is connected to. */
> >> struct module *me;
> >> };
> >> diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
> >> index 3b572bb..a90eab5 100644
> >> --- a/include/net/netfilter/nf_conntrack_l4proto.h
> >> +++ b/include/net/netfilter/nf_conntrack_l4proto.h
> >> @@ -22,6 +22,8 @@ struct nf_conntrack_l4proto {
> >> /* L4 Protocol number. */
> >> u_int8_t l4proto;
> >>
> >> + u_int8_t compat;
> >
> > I don't see why we need this new field.
> >
> > It seems to be set to 1 in each structure that has set:
> >
> > .ctl_compat_table
> >
> > to non-NULL. So, it's redundant.
> >
> > Moreover, you already know from the protocol tracker itself if you
> > have to allocate the compat ctl table or not.
> >
> > In other words: You set compat to 1 for nf_conntrack_l4proto_generic.
> > Then, you pass that compat value to generic_init_net via ->init_net
> > again, but this information (that determines if the compat has to be
> > done or not) is already in the scope of the protocol tracker.
> >
>
> Because some protocols, such as l4proto_tcp6 and l4proto_tcp, use the same init_net
> function. l4proto_tcp6 doesn't need compat sysctl, so we should use this new
> field to identify whether we should kmemdup compat_sysctl_table.
Then, could you use two init_net functions? One for TCP over IPv4 and another
for TCP over IPv6?
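One way to read this suggestion is sketched below in plain C. It is not the kernel code: the mock_* types and function names are hypothetical, and the sysctl tables are reduced to flags. The point is only that each tracker can carry its own init_net callback, so no compat argument has to be passed around.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-net TCP data; tables are reduced to flags here. */
struct mock_tcp_net {
	int has_table;		/* regular sysctl table set up */
	int has_compat_table;	/* legacy compat sysctl table set up */
};

/* Setup shared by both address families. */
static int mock_tcp_init_common(struct mock_tcp_net *tn)
{
	assert(tn != NULL);

	tn->has_table = 1;
	tn->has_compat_table = 0;
	return 0;
}

/* TCP over IPv4: common setup plus the compat table. */
static int mock_tcpv4_init_net(struct mock_tcp_net *tn)
{
	int err = mock_tcp_init_common(tn);

	if (err < 0)
		return err;
	tn->has_compat_table = 1;
	return 0;
}

/* TCP over IPv6: common setup only, no compat table. */
static int mock_tcpv6_init_net(struct mock_tcp_net *tn)
{
	return mock_tcp_init_common(tn);
}

/* Each tracker points at the variant it needs; no compat flag is stored. */
struct mock_l4proto {
	const char *name;
	int (*init_net)(struct mock_tcp_net *tn);
};

static const struct mock_l4proto mock_tcp4 = { "tcp", mock_tcpv4_init_net };
static const struct mock_l4proto mock_tcp6 = { "tcp", mock_tcpv6_init_net };
```

The cost is one extra thin wrapper per protocol that has both an IPv4 and an IPv6 tracker; the shared logic stays in a single helper.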
> And because protocols will have pernet ctl_compat_table and ctl_table, the .ctl_compat_table
> field will be deleted in patch 15/17, so we should use the new compat field.
>
> Actually, we don't need to pass the compat value to generic_init_net, because
> we know l4proto_generic needs compat. But considering l4proto_tcp(6), and in order to keep
> the code readable, I prefer to add the compat field and pass it to init_net.
>
> > You have to fix this.
> >
> >> +
> >> /* Try to fill in the third arg: dataoff is offset past network protocol
> >> hdr. Return true if possible. */
> >> bool (*pkt_to_tuple)(const struct sk_buff *skb, unsigned int dataoff,
> >> @@ -103,6 +105,10 @@ struct nf_conntrack_l4proto {
> >> struct ctl_table *ctl_compat_table;
> >> #endif
> >> #endif
> >> + int *net_id;
> >> + /* Init l4proto pernet data */
> >> + int (*init_net)(struct net *net, u_int8_t compat);
> >> +
> >> /* Protocol name */
> >> const char *name;
> >>
> >> diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
> >> index a053a19..1f53038 100644
> >> --- a/include/net/netns/conntrack.h
> >> +++ b/include/net/netns/conntrack.h
> >> @@ -8,6 +8,18 @@
> >> struct ctl_table_header;
> >> struct nf_conntrack_ecache;
> >>
> >> +struct nf_proto_net {
> >> +#ifdef CONFIG_SYSCTL
> >> + struct ctl_table_header *ctl_table_header;
> >> + struct ctl_table *ctl_table;
> >> +#ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
> >> + struct ctl_table_header *ctl_compat_header;
> >> + struct ctl_table *ctl_compat_table;
> >> +#endif
> >> +#endif
> >> + unsigned int users;
> >> +};
> >> +
> >> struct netns_ct {
> >> atomic_t count;
> >> unsigned int expect_count;
> > --
> > To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
^ permalink raw reply
* Re: [PATCH 15/17] netfilter: cleanup sysctl for l4proto and l3proto
From: Pablo Neira Ayuso @ 2012-05-24 9:56 UTC (permalink / raw)
To: Gao feng; +Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano
In-Reply-To: <4FBD87E6.6000402@cn.fujitsu.com>
On Thu, May 24, 2012 at 08:59:18AM +0800, Gao feng wrote:
> Hi pablo:
>
> On 2012-05-23 18:38, Pablo Neira Ayuso wrote:
> > On Mon, May 14, 2012 at 04:52:25PM +0800, Gao feng wrote:
> >> delete the now-useless sysctl data for l4proto and l3proto.
> >>
> >> Acked-by: Eric W. Biederman <ebiederm@xmission.com>
> >> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> >> ---
> >> include/net/netfilter/nf_conntrack_l3proto.h | 2 --
> >> include/net/netfilter/nf_conntrack_l4proto.h | 10 ----------
> >> net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 1 -
> >> net/ipv4/netfilter/nf_conntrack_proto_icmp.c | 8 --------
> >> net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c | 5 -----
> >> net/netfilter/nf_conntrack_proto_generic.c | 8 --------
> >> net/netfilter/nf_conntrack_proto_sctp.c | 15 ---------------
> >> net/netfilter/nf_conntrack_proto_tcp.c | 15 ---------------
> >> net/netfilter/nf_conntrack_proto_udp.c | 15 ---------------
> >> net/netfilter/nf_conntrack_proto_udplite.c | 12 ------------
> >> 10 files changed, 0 insertions(+), 91 deletions(-)
> >>
> >> diff --git a/include/net/netfilter/nf_conntrack_l3proto.h b/include/net/netfilter/nf_conntrack_l3proto.h
> >> index d6df8c7..6f7c13f 100644
> >> --- a/include/net/netfilter/nf_conntrack_l3proto.h
> >> +++ b/include/net/netfilter/nf_conntrack_l3proto.h
> >> @@ -64,9 +64,7 @@ struct nf_conntrack_l3proto {
> >> size_t nla_size;
> >>
> >> #ifdef CONFIG_SYSCTL
> >> - struct ctl_table_header *ctl_table_header;
> >> const char *ctl_table_path;
> >> - struct ctl_table *ctl_table;
> >> #endif /* CONFIG_SYSCTL */
> >>
> >> /* Init l3proto pernet data */
> >> diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
> >> index 0d329b9..4881df34 100644
> >> --- a/include/net/netfilter/nf_conntrack_l4proto.h
> >> +++ b/include/net/netfilter/nf_conntrack_l4proto.h
> >> @@ -95,16 +95,6 @@ struct nf_conntrack_l4proto {
> >> const struct nla_policy *nla_policy;
> >> } ctnl_timeout;
> >> #endif
> >> -
> >> -#ifdef CONFIG_SYSCTL
> >> - struct ctl_table_header **ctl_table_header;
> >> - struct ctl_table *ctl_table;
> >> - unsigned int *ctl_table_users;
> >> -#ifdef CONFIG_NF_CONNTRACK_PROC_COMPAT
> >> - struct ctl_table_header *ctl_compat_table_header;
> >> - struct ctl_table *ctl_compat_table;
> >> -#endif
> >> -#endif
> >
> > Interesting. This structure is added in patch 1/17, then it's removed
> > in patch 15/17.
> >
> > Probably I'm missing something, but why are you doing it like that?
>
> By "this structure", do you mean ctl_table_header, ctl_table and so on?
>
> I added these fields to struct nf_proto_net in patch 1/17, so those fields in
> struct nf_conntrack_l4proto are useless; this patch is just some cleanup.
>
> The same applies to nf_conntrack_l3proto.
I see, then it's OK. Please elaborate a bit more in the patch
description to explain that this structure is not required anymore.
^ permalink raw reply
* Re: [PATCH 04/17] netfilter: add namespace support for l4proto_generic
From: Pablo Neira Ayuso @ 2012-05-24 9:52 UTC (permalink / raw)
To: Gao feng; +Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano
In-Reply-To: <4FBD8B40.4020303@cn.fujitsu.com>
On Thu, May 24, 2012 at 09:13:36AM +0800, Gao feng wrote:
> On 2012-05-23 18:32, Pablo Neira Ayuso wrote:
> > On Mon, May 14, 2012 at 04:52:14PM +0800, Gao feng wrote:
> >> implement and export nf_conntrack_proto_generic_[init,fini],
> >> nf_conntrack_[init,cleanup]_net call them to register or unregister
> >> the sysctl of generic proto.
> >>
> >> implement generic_net_init, it's used to initialize the pernet
> >> data for generic proto.
> >>
> >> and use nf_generic_net.timeout to replace nf_ct_generic_timeout in
> >> get_timeouts function.
> >>
> >> Acked-by: Eric W. Biederman <ebiederm@xmission.com>
> >> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> >> ---
> >> include/net/netfilter/nf_conntrack_l4proto.h | 2 +
> >> include/net/netns/conntrack.h | 6 +++
> >> net/netfilter/nf_conntrack_core.c | 8 +++-
> >> net/netfilter/nf_conntrack_proto.c | 21 +++++-----
> >> net/netfilter/nf_conntrack_proto_generic.c | 55 ++++++++++++++++++++++++-
> >> 5 files changed, 76 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
> >> index a93dcd5..0d329b9 100644
> >> --- a/include/net/netfilter/nf_conntrack_l4proto.h
> >> +++ b/include/net/netfilter/nf_conntrack_l4proto.h
> >> @@ -118,6 +118,8 @@ struct nf_conntrack_l4proto {
> >>
> >> /* Existing built-in generic protocol */
> >> extern struct nf_conntrack_l4proto nf_conntrack_l4proto_generic;
> >> +extern int nf_conntrack_proto_generic_init(struct net *net);
> >> +extern void nf_conntrack_proto_generic_fini(struct net *net);
> >>
> >> #define MAX_NF_CT_PROTO 256
> >>
> >> diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
> >> index 94992e9..3381b80 100644
> >> --- a/include/net/netns/conntrack.h
> >> +++ b/include/net/netns/conntrack.h
> >> @@ -20,7 +20,13 @@ struct nf_proto_net {
> >> unsigned int users;
> >> };
> >>
> >> +struct nf_generic_net {
> >> + struct nf_proto_net pn;
> >> + unsigned int timeout;
> >> +};
> >> +
> >> struct nf_ip_net {
> >> + struct nf_generic_net generic;
> >> #if defined(CONFIG_SYSCTL) && defined(CONFIG_NF_CONNTRACK_PROC_COMPAT)
> >> struct ctl_table_header *ctl_table_header;
> >> struct ctl_table *ctl_table;
> >> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> >> index 32c5909..fd33e91 100644
> >> --- a/net/netfilter/nf_conntrack_core.c
> >> +++ b/net/netfilter/nf_conntrack_core.c
> >> @@ -1353,6 +1353,7 @@ static void nf_conntrack_cleanup_net(struct net *net)
> >> }
> >>
> >> nf_ct_free_hashtable(net->ct.hash, net->ct.htable_size);
> >> + nf_conntrack_proto_generic_fini(net);
> >> nf_conntrack_helper_fini(net);
> >> nf_conntrack_timeout_fini(net);
> >> nf_conntrack_ecache_fini(net);
> >> @@ -1586,9 +1587,12 @@ static int nf_conntrack_init_net(struct net *net)
> >> ret = nf_conntrack_helper_init(net);
> >> if (ret < 0)
> >> goto err_helper;
> >> -
> >> + ret = nf_conntrack_proto_generic_init(net);
> >> + if (ret < 0)
> >> + goto err_generic;
> >> return 0;
> >> -
> >> +err_generic:
> >> + nf_conntrack_helper_fini(net);
> >> err_helper:
> >> nf_conntrack_timeout_fini(net);
> >> err_timeout:
> >> diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
> >> index 7ee6653..9b4bf6d 100644
> >> --- a/net/netfilter/nf_conntrack_proto.c
> >> +++ b/net/netfilter/nf_conntrack_proto.c
> >> @@ -287,10 +287,16 @@ EXPORT_SYMBOL_GPL(nf_conntrack_l3proto_unregister);
> >> static struct nf_proto_net *nf_ct_l4proto_net(struct net *net,
> >> struct nf_conntrack_l4proto *l4proto)
> >> {
> >> - if (l4proto->net_id)
> >> - return net_generic(net, *l4proto->net_id);
> >> - else
> >> - return NULL;
> >> + switch (l4proto->l4proto) {
> >> + case 255: /* l4proto_generic */
> >> + return (struct nf_proto_net *)&net->ct.proto.generic;
> >> + default:
> >> + if (l4proto->net_id)
> >> + return net_generic(net, *l4proto->net_id);
> >> + else
> >> + return NULL;
> >> + }
> >> + return NULL;
> >> }
> >>
> >> int nf_ct_l4proto_register_sysctl(struct net *net,
> >> @@ -457,11 +463,6 @@ EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_unregister);
> >> int nf_conntrack_proto_init(void)
> >> {
> >> unsigned int i;
> >> - int err;
> >> -
> >> - err = nf_ct_l4proto_register_sysctl(&init_net, &nf_conntrack_l4proto_generic);
> >> - if (err < 0)
> >> - return err;
> >
> > I like that all protocol sysctls are registered by
> > nf_conntrack_proto_init. Can you keep using that?
>
> You mean the per-net generic_proto sysctl is registered by
> nf_conntrack_proto_init?
>
> such as
>
> int nf_conntrack_proto_init(struct net *net)
> {
> ...
> err = nf_ct_l4proto_register_sysctl(net, &nf_conntrack_l4proto_generic);
Yes, with all protocol trackers included in nf_conntrack_proto_init:
err = nf_conntrack_proto_generic_init(net);
...
err = nf_conntrack_proto_tcp_init(net);
...
and so on.
> ...
> }
>
> If my understanding is right, my answer is yes, we can ;)
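The arrangement agreed on here, a single per-net entry point that initializes every protocol tracker, could be sketched as follows. This is a user-space model under stated assumptions: the mock_* names, the fixed tracker count and the fail_at knob are hypothetical, and the unwinding on failure is an addition for illustration that the thread does not spell out.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NPROTO 3	/* e.g. generic, tcp, udp; illustrative only */

struct mock_net {
	int inited[MOCK_NPROTO];	/* 1 while tracker i is set up */
	int fail_at;			/* index that fails, or -1 for none */
};

/* Hypothetical per-tracker init; fails for the index in net->fail_at. */
static int mock_proto_one_init(struct mock_net *net, int idx)
{
	if (idx == net->fail_at)
		return -1;
	net->inited[idx] = 1;
	return 0;
}

static void mock_proto_one_fini(struct mock_net *net, int idx)
{
	net->inited[idx] = 0;
}

/* Single per-net entry point: set up every tracker, unwind on failure. */
static int mock_conntrack_proto_init(struct mock_net *net)
{
	int i, err;

	assert(net != NULL);

	for (i = 0; i < MOCK_NPROTO; i++) {
		err = mock_proto_one_init(net, i);
		if (err < 0)
			goto unwind;
	}
	return 0;

unwind:
	/* Tear down, in reverse order, only what was set up before the failure. */
	while (--i >= 0)
		mock_proto_one_fini(net, i);
	return err;
}
```

Keeping every tracker behind one function means nf_conntrack_init_net needs only one call and one error label for all protocol sysctls.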
^ permalink raw reply
* [PATCH] net: qmi_wwan: Add Sierra Wireless device IDs
From: Bjørn Mork @ 2012-05-24 9:19 UTC (permalink / raw)
To: netdev; +Cc: linux-usb, Bjørn Mork
Some additional Gobi3K IDs found in the BSD/GPL licensed
out-of-tree GobiNet driver from Sierra Wireless.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
drivers/net/usb/qmi_wwan.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 380dbea..3b20678 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -547,6 +547,8 @@ static const struct usb_device_id products[] = {
{QMI_GOBI_DEVICE(0x16d8, 0x8002)}, /* CMDTech Gobi 2000 Modem device (VU922) */
{QMI_GOBI_DEVICE(0x05c6, 0x9205)}, /* Gobi 2000 Modem device */
{QMI_GOBI_DEVICE(0x1199, 0x9013)}, /* Sierra Wireless Gobi 3000 Modem device (MC8355) */
+ {QMI_GOBI_DEVICE(0x1199, 0x9015)}, /* Sierra Wireless Gobi 3000 Modem device */
+ {QMI_GOBI_DEVICE(0x1199, 0x9019)}, /* Sierra Wireless Gobi 3000 Modem device */
{ } /* END */
};
MODULE_DEVICE_TABLE(usb, products);
--
1.7.2.5
^ permalink raw reply related
* [PATCH 16/21] datapath: remove tunnel cache
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
As tunndevs no longer have a daddr the cache can no longer be built in this way.
Furthermore, it's not clear to me what the value of keeping the cache is in
the context of moving towards allowing use of in-tree tunnelling.
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 384 +++---------------------------------------------------
datapath/tunnel.h | 52 --------
2 files changed, 20 insertions(+), 416 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index cdcb0a7..b997cb8 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -52,43 +52,9 @@
#include "vport-generic.h"
#include "vport-internal_dev.h"
-#ifdef NEED_CACHE_TIMEOUT
-/*
- * On kernels where we can't quickly detect changes in the rest of the system
- * we use an expiration time to invalidate the cache. A shorter expiration
- * reduces the length of time that we may potentially blackhole packets while
- * a longer time increases performance by reducing the frequency that the
- * cache needs to be rebuilt. A variety of factors may cause the cache to be
- * invalidated before the expiration time but this is the maximum. The time
- * is expressed in jiffies.
- */
-#define MAX_CACHE_EXP HZ
-#endif
-
-/*
- * Interval to check for and remove caches that are no longer valid. Caches
- * are checked for validity before they are used for packet encapsulation and
- * old caches are removed at that time. However, if no packets are sent through
- * the tunnel then the cache will never be destroyed. Since it holds
- * references to a number of system objects, the cache will continue to use
- * system resources by not allowing those objects to be destroyed. The cache
- * cleaner is periodically run to free invalid caches. It does not
- * significantly affect system performance. A lower interval will release
- * resources faster but will itself consume resources by requiring more frequent
- * checks. A longer interval may result in messages being printed to the kernel
- * message buffer about unreleased resources. The interval is expressed in
- * jiffies.
- */
-#define CACHE_CLEANER_INTERVAL (5 * HZ)
-
-#define CACHE_DATA_ALIGN 16
#define PORT_TABLE_SIZE 1024
static struct hlist_head *port_table __read_mostly;
-static int port_table_count;
-
-static void cache_cleaner(struct work_struct *work);
-static DECLARE_DELAYED_WORK(cache_cleaner_wq, cache_cleaner);
/*
* These are just used as an optimization: they don't require any kind of
@@ -108,60 +74,17 @@ static unsigned int multicast_ports __read_mostly;
#define rt_dst(rt) (rt->u.dst)
#endif
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,1,0)
-static struct hh_cache *rt_hh(struct rtable *rt)
-{
- struct neighbour *neigh = dst_get_neighbour_noref(&rt->dst);
- if (!neigh || !(neigh->nud_state & NUD_CONNECTED) ||
- !neigh->hh.hh_len)
- return NULL;
- return &neigh->hh;
-}
-#else
-#define rt_hh(rt) (rt_dst(rt).hh)
-#endif
-
static struct vport *tnl_vport_to_vport(const struct tnl_vport *tnl_vport)
{
return vport_from_priv(tnl_vport);
}
-/* This is analogous to rtnl_dereference for the tunnel cache. It checks that
- * cache_lock is held, so it is only for update side code.
- */
-static struct tnl_cache *cache_dereference(struct tnl_vport *tnl_vport)
-{
- return rcu_dereference_protected(tnl_vport->cache,
- lockdep_is_held(&tnl_vport->cache_lock));
-}
-
-static void schedule_cache_cleaner(void)
-{
- schedule_delayed_work(&cache_cleaner_wq, CACHE_CLEANER_INTERVAL);
-}
-
-static void free_cache(struct tnl_cache *cache)
-{
- if (!cache)
- return;
-
- ovs_flow_put(cache->flow);
- ip_rt_put(cache->rt);
- kfree(cache);
-}
-
static void free_config_rcu(struct rcu_head *rcu)
{
struct tnl_mutable_config *c = container_of(rcu, struct tnl_mutable_config, rcu);
kfree(c);
}
-static void free_cache_rcu(struct rcu_head *rcu)
-{
- struct tnl_cache *c = container_of(rcu, struct tnl_cache, rcu);
- free_cache(c);
-}
-
static void assign_config_rcu(struct vport *vport,
struct tnl_mutable_config *new_config)
{
@@ -174,18 +97,6 @@ static void assign_config_rcu(struct vport *vport,
call_rcu(&old_config->rcu, free_config_rcu);
}
-static void assign_cache_rcu(struct vport *vport, struct tnl_cache *new_cache)
-{
- struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- struct tnl_cache *old_cache;
-
- old_cache = cache_dereference(tnl_vport);
- rcu_assign_pointer(tnl_vport->cache, new_cache);
-
- if (old_cache)
- call_rcu(&old_cache->rcu, free_cache_rcu);
-}
-
static unsigned int *find_port_pool(const struct tnl_mutable_config *mutable)
{
bool is_multicast = ipv4_is_multicast(mutable->key.daddr);
@@ -223,13 +134,9 @@ static void port_table_add_port(struct vport *vport)
const struct tnl_mutable_config *mutable;
u32 hash;
- if (port_table_count == 0)
- schedule_cache_cleaner();
-
mutable = rtnl_dereference(tnl_vport->mutable);
hash = port_hash(&mutable->key);
hlist_add_head_rcu(&tnl_vport->hash_node, find_bucket(hash));
- port_table_count++;
(*find_port_pool(rtnl_dereference(tnl_vport->mutable)))++;
}
@@ -240,10 +147,6 @@ static void port_table_remove_port(struct vport *vport)
hlist_del_init_rcu(&tnl_vport->hash_node);
- port_table_count--;
- if (port_table_count == 0)
- cancel_delayed_work_sync(&cache_cleaner_wq);
-
(*find_port_pool(rtnl_dereference(tnl_vport->mutable)))--;
}
@@ -780,11 +683,6 @@ static void create_tunnel_header(const struct vport *vport,
tnl_vport->tnl_ops->build_header(vport, mutable, iph + 1);
}
-static void *get_cached_header(const struct tnl_cache *cache)
-{
- return (void *)cache + ALIGN(sizeof(struct tnl_cache), CACHE_DATA_ALIGN);
-}
-
#ifdef HAVE_RT_GENID
static inline int rt_genid(struct net *net)
{
@@ -792,184 +690,6 @@ static inline int rt_genid(struct net *net)
}
#endif
-static bool check_cache_valid(const struct tnl_cache *cache,
- const struct tnl_mutable_config *mutable)
-{
- struct hh_cache *hh;
-
- if (!cache)
- return false;
-
- hh = rt_hh(cache->rt);
- return hh &&
-#ifdef NEED_CACHE_TIMEOUT
- time_before(jiffies, cache->expiration) &&
-#endif
-#ifdef HAVE_RT_GENID
- rt_genid(dev_net(rt_dst(cache->rt).dev)) == cache->rt->rt_genid &&
-#endif
-#ifdef HAVE_HH_SEQ
- hh->hh_lock.sequence == cache->hh_seq &&
-#endif
- mutable->seq == cache->mutable_seq &&
- (!ovs_is_internal_dev(rt_dst(cache->rt).dev) ||
- (cache->flow && !cache->flow->dead));
-}
-
-static void __cache_cleaner(struct tnl_vport *tnl_vport)
-{
- const struct tnl_mutable_config *mutable =
- rcu_dereference(tnl_vport->mutable);
- const struct tnl_cache *cache = rcu_dereference(tnl_vport->cache);
-
- if (cache && !check_cache_valid(cache, mutable) &&
- spin_trylock_bh(&tnl_vport->cache_lock)) {
- assign_cache_rcu(tnl_vport_to_vport(tnl_vport), NULL);
- spin_unlock_bh(&tnl_vport->cache_lock);
- }
-}
-
-static void cache_cleaner(struct work_struct *work)
-{
- int i;
-
- schedule_cache_cleaner();
-
- rcu_read_lock();
- for (i = 0; i < PORT_TABLE_SIZE; i++) {
- struct hlist_node *n;
- struct hlist_head *bucket;
- struct tnl_vport *tnl_vport;
-
- bucket = &port_table[i];
- hlist_for_each_entry_rcu(tnl_vport, n, bucket, hash_node)
- __cache_cleaner(tnl_vport);
- }
- rcu_read_unlock();
-}
-
-static void create_eth_hdr(struct tnl_cache *cache, struct hh_cache *hh)
-{
- void *cache_data = get_cached_header(cache);
- int hh_off;
-
-#ifdef HAVE_HH_SEQ
- unsigned hh_seq;
-
- do {
- hh_seq = read_seqbegin(&hh->hh_lock);
- hh_off = HH_DATA_ALIGN(hh->hh_len) - hh->hh_len;
- memcpy(cache_data, (void *)hh->hh_data + hh_off, hh->hh_len);
- cache->hh_len = hh->hh_len;
- } while (read_seqretry(&hh->hh_lock, hh_seq));
-
- cache->hh_seq = hh_seq;
-#else
- read_lock(&hh->hh_lock);
- hh_off = HH_DATA_ALIGN(hh->hh_len) - hh->hh_len;
- memcpy(cache_data, (void *)hh->hh_data + hh_off, hh->hh_len);
- cache->hh_len = hh->hh_len;
- read_unlock(&hh->hh_lock);
-#endif
-}
-
-static struct tnl_cache *build_cache(struct vport *vport,
- const struct tnl_mutable_config *mutable,
- struct rtable *rt)
-{
- struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- struct tnl_cache *cache;
- void *cache_data;
- int cache_len;
- struct hh_cache *hh;
-
- if (!(mutable->flags & TNL_F_HDR_CACHE))
- return NULL;
-
- /*
- * If there is no entry in the ARP cache or if this device does not
- * support hard header caching just fall back to the IP stack.
- */
-
- hh = rt_hh(rt);
- if (!hh)
- return NULL;
-
- /*
- * If lock is contended fall back to directly building the header.
- * We're not going to help performance by sitting here spinning.
- */
- if (!spin_trylock(&tnl_vport->cache_lock))
- return NULL;
-
- cache = cache_dereference(tnl_vport);
- if (check_cache_valid(cache, mutable))
- goto unlock;
- else
- cache = NULL;
-
- cache_len = LL_RESERVED_SPACE(rt_dst(rt).dev) + mutable->tunnel_hlen;
-
- cache = kzalloc(ALIGN(sizeof(struct tnl_cache), CACHE_DATA_ALIGN) +
- cache_len, GFP_ATOMIC);
- if (!cache)
- goto unlock;
-
- create_eth_hdr(cache, hh);
- cache_data = get_cached_header(cache) + cache->hh_len;
- cache->len = cache->hh_len + mutable->tunnel_hlen;
-
- create_tunnel_header(vport, mutable, rt, cache_data);
-
- cache->mutable_seq = mutable->seq;
- cache->rt = rt;
-#ifdef NEED_CACHE_TIMEOUT
- cache->expiration = jiffies + tnl_vport->cache_exp_interval;
-#endif
-
- if (ovs_is_internal_dev(rt_dst(rt).dev)) {
- struct sw_flow_key flow_key;
- struct vport *dst_vport;
- struct sk_buff *skb;
- int err;
- int flow_key_len;
- struct sw_flow *flow;
-
- dst_vport = ovs_internal_dev_get_vport(rt_dst(rt).dev);
- if (!dst_vport)
- goto done;
-
- skb = alloc_skb(cache->len, GFP_ATOMIC);
- if (!skb)
- goto done;
-
- __skb_put(skb, cache->len);
- memcpy(skb->data, get_cached_header(cache), cache->len);
-
- err = ovs_flow_extract(skb, dst_vport->port_no, &flow_key,
- &flow_key_len);
-
- consume_skb(skb);
- if (err)
- goto done;
-
- flow = ovs_flow_tbl_lookup(rcu_dereference(dst_vport->dp->table),
- &flow_key, flow_key_len);
- if (flow) {
- cache->flow = flow;
- ovs_flow_hold(flow);
- }
- }
-
-done:
- assign_cache_rcu(vport, cache);
-
-unlock:
- spin_unlock(&tnl_vport->cache_lock);
-
- return cache;
-}
-
static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
u8 ipproto, __be32 daddr, __be32 saddr,
u8 tos)
@@ -1001,33 +721,19 @@ static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
static struct rtable *find_route(struct vport *vport,
const struct tnl_mutable_config *mutable,
- u8 tos, __be32 daddr, __be32 saddr,
- struct tnl_cache **cache)
+ u8 tos, __be32 daddr, __be32 saddr)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- struct tnl_cache *cur_cache = rcu_dereference(tnl_vport->cache);
+ struct rtable *rt;
- *cache = NULL;
tos = RT_TOS(tos);
- if (daddr == mutable->key.daddr && saddr == mutable->key.saddr &&
- tos == RT_TOS(mutable->tos) &&
- check_cache_valid(cur_cache, mutable)) {
- *cache = cur_cache;
- return cur_cache->rt;
- } else {
- struct rtable *rt;
-
- rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto,
- daddr, saddr, tos);
- if (IS_ERR(rt))
- return NULL;
-
- if (likely(tos == RT_TOS(mutable->tos)))
- *cache = build_cache(vport, mutable, rt);
+ rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto,
+ daddr, saddr, tos);
+ if (IS_ERR(rt))
+ return NULL;
- return rt;
- }
+ return rt;
}
static bool need_linearize(const struct sk_buff *skb)
@@ -1152,7 +858,6 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
enum vport_err_type err = VPORT_E_TX_ERROR;
struct rtable *rt;
struct dst_entry *unattached_dst = NULL;
- struct tnl_cache *cache;
int sent_len = 0;
__be16 frag_off = 0;
__be32 daddr;
@@ -1210,11 +915,10 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
}
/* Route lookup */
- rt = find_route(vport, mutable, tos, daddr, saddr, &cache);
+ rt = find_route(vport, mutable, tos, daddr, saddr);
if (unlikely(!rt))
goto error_free;
- if (unlikely(!cache))
- unattached_dst = &rt_dst(rt);
+ unattached_dst = &rt_dst(rt);
tos = INET_ECN_encapsulate(tos, inner_tos);
@@ -1239,11 +943,9 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
* If we are over the MTU, allow the IP stack to handle fragmentation.
* Fragmentation is a slow path anyways.
*/
- if (unlikely(skb->len + mutable->tunnel_hlen > dst_mtu(&rt_dst(rt)) &&
- cache)) {
+ if (unlikely(skb->len + mutable->tunnel_hlen > dst_mtu(&rt_dst(rt)))) {
unattached_dst = &rt_dst(rt);
dst_hold(unattached_dst);
- cache = NULL;
}
/* TTL */
@@ -1270,23 +972,15 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (unlikely(vlan_deaccel_tag(skb)))
goto next;
- if (likely(cache)) {
- skb_push(skb, cache->len);
- memcpy(skb->data, get_cached_header(cache), cache->len);
- skb_reset_mac_header(skb);
- skb_set_network_header(skb, cache->hh_len);
-
- } else {
- skb_push(skb, mutable->tunnel_hlen);
- create_tunnel_header(vport, mutable, rt, skb->data);
- skb_reset_network_header(skb);
-
- if (next_skb)
- skb_dst_set(skb, dst_clone(unattached_dst));
- else {
- skb_dst_set(skb, unattached_dst);
- unattached_dst = NULL;
- }
+ skb_push(skb, mutable->tunnel_hlen);
+ create_tunnel_header(vport, mutable, rt, skb->data);
+ skb_reset_network_header(skb);
+
+ if (next_skb)
+ skb_dst_set(skb, dst_clone(unattached_dst));
+ else {
+ skb_dst_set(skb, unattached_dst);
+ unattached_dst = NULL;
}
skb_set_transport_header(skb, skb_network_offset(skb) + sizeof(struct iphdr));
@@ -1301,37 +995,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (unlikely(!skb))
goto next;
- if (likely(cache)) {
- int orig_len = skb->len - cache->len;
- struct vport *cache_vport;
-
- cache_vport = ovs_internal_dev_get_vport(rt_dst(rt).dev);
- skb->protocol = htons(ETH_P_IP);
- iph = ip_hdr(skb);
- iph->tot_len = htons(skb->len - skb_network_offset(skb));
- ip_send_check(iph);
-
- if (cache_vport) {
- if (unlikely(compute_ip_summed(skb, true))) {
- kfree_skb(skb);
- goto next;
- }
-
- OVS_CB(skb)->flow = cache->flow;
- ovs_vport_receive(cache_vport, skb);
- sent_len += orig_len;
- } else {
- int xmit_err;
-
- skb->dev = rt_dst(rt).dev;
- xmit_err = dev_queue_xmit(skb);
-
- if (likely(net_xmit_eval(xmit_err) == 0))
- sent_len += orig_len;
- }
- } else
- sent_len += send_frags(skb, mutable);
-
+ sent_len += send_frags(skb, mutable);
next:
skb = next_skb;
}
@@ -1414,13 +1078,6 @@ struct vport *ovs_tnl_create(const struct vport_parms *parms,
if (err)
goto error_free_mutable;
- spin_lock_init(&tnl_vport->cache_lock);
-
-#ifdef NEED_CACHE_TIMEOUT
- tnl_vport->cache_exp_interval = MAX_CACHE_EXP -
- (net_random() % (MAX_CACHE_EXP / 2));
-#endif
-
rcu_assign_pointer(tnl_vport->mutable, mutable);
port_table_add_port(vport);
@@ -1439,7 +1096,6 @@ static void free_port_rcu(struct rcu_head *rcu)
struct tnl_vport *tnl_vport = container_of(rcu,
struct tnl_vport, rcu);
- free_cache((struct tnl_cache __force *)tnl_vport->cache);
kfree((struct tnl_mutable __force *)tnl_vport->mutable);
ovs_vport_free(tnl_vport_to_vport(tnl_vport));
}
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index 0af27ac..ed3b4ec 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -172,58 +172,6 @@ struct tnl_ops {
/* If we can't detect all system changes directly we need to use a timeout. */
#define NEED_CACHE_TIMEOUT
#endif
-struct tnl_cache {
- struct rcu_head rcu;
-
- int len; /* Length of data to be memcpy'd from cache. */
- int hh_len; /* Hardware hdr length, cached from hh_cache. */
-
- /* Sequence number of mutable->seq from which this cache was
- * generated. */
- unsigned mutable_seq;
-
-#ifdef HAVE_HH_SEQ
- /*
- * The sequence number from the seqlock protecting the hardware header
- * cache (in the ARP cache). Since every write increments the counter
- * this gives us an easy way to tell if it has changed.
- */
- unsigned hh_seq;
-#endif
-
-#ifdef NEED_CACHE_TIMEOUT
- /*
- * If we don't have direct mechanisms to detect all important changes in
- * the system fall back to an expiration time. This expiration time
- * can be relatively short since at high rates there will be millions of
- * packets per second, so we'll still get plenty of benefit from the
- * cache. Note that if something changes we may blackhole packets
- * until the expiration time (depending on what changed and the kernel
- * version we may be able to detect the change sooner). Expiration is
- * expressed as a time in jiffies.
- */
- unsigned long expiration;
-#endif
-
- /*
- * The routing table entry that is the result of looking up the tunnel
- * endpoints. It also contains a sequence number (called a generation
- * ID) that can be compared to a global sequence to tell if the routing
- * table has changed (and therefore there is a potential that this
- * cached route has been invalidated).
- */
- struct rtable *rt;
-
- /*
- * If the output device for tunnel traffic is an OVS internal device,
- * the flow of that datapath. Since all tunnel traffic will have the
- * same headers this allows us to cache the flow lookup. NULL if the
- * output device is not OVS or if there is no flow installed.
- */
- struct sw_flow *flow;
-
- /* The cached header follows after padding for alignment. */
-};
struct tnl_vport {
struct rcu_head rcu;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 20/21] datapath: Use tun_key flags for id and csum settings on transmit
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
The use of these flags in the tnl_mutable_config structure
is no longer correct, as a tunnel device may be used to
transmit packets for many different tunnels.
This change restores the checksum and out key behavior of
tunneling.
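As a rough userspace illustration of the per-packet header length computation described above (not the datapath code itself), the sketch below mirrors the logic of gre_hdr_len() and tunnel_hlen() in this patch. The sketch_* names, the flag bit values and the fixed 20-byte IP header are assumptions made for the example only.

```c
#include <assert.h>
#include <stdint.h>

#define GRE_HEADER_SECTION 4

/* Hypothetical flag bits; the real TNL_F_* values live in tunnel.h. */
#define SKETCH_F_CSUM           (1u << 0)
#define SKETCH_F_OUT_KEY_ACTION (1u << 1)

/* Assumed size of an IPv4 header without options. */
#define SKETCH_IPHDR_LEN 20

/* Stand-in for the per-packet tunnel key carried with each skb. */
struct sketch_tun_key {
	uint64_t tun_id;
	uint32_t tun_flags;
};

/* GRE header length derived from the packet's own key and flags. */
static int sketch_gre_hdr_len(const struct sketch_tun_key *key)
{
	int len = GRE_HEADER_SECTION;	/* base header */

	if (key->tun_flags & SKETCH_F_CSUM)
		len += GRE_HEADER_SECTION;	/* checksum section */

	if (key->tun_id || (key->tun_flags & SKETCH_F_OUT_KEY_ACTION))
		len += GRE_HEADER_SECTION;	/* key section */

	return len;
}

/* Full tunnel header length: protocol header plus outer IP header. */
static int sketch_tunnel_hlen(const struct sketch_tun_key *key)
{
	int len = sketch_gre_hdr_len(key);

	if (len < 0)
		return len;	/* propagate an error from the protocol hook */
	return len + SKETCH_IPHDR_LEN;
}
```

Because the length now depends on the packet rather than on the port configuration, two packets sent through the same vport may need different amounts of headroom.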
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 58 ++++++++++++++++++++++++-------------------------
datapath/tunnel.h | 12 +++-------
datapath/vport-capwap.c | 28 ++++++++++++------------
datapath/vport-gre.c | 33 ++++++++++++++--------------
4 files changed, 63 insertions(+), 68 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index a303d8d..982de25 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -500,7 +500,7 @@ bool ovs_tnl_frag_needed(struct vport *vport,
static bool check_mtu(struct sk_buff *skb,
struct vport *vport,
- const struct tnl_mutable_config *mutable,
+ const struct tnl_mutable_config *mutable, int tun_hlen,
const struct rtable *rt, __be16 *frag_offp)
{
bool df_inherit = mutable->flags & TNL_F_DF_INHERIT;
@@ -524,10 +524,7 @@ static bool check_mtu(struct sk_buff *skb,
eth_hdr(skb)->h_proto == htons(ETH_P_8021Q))
vlan_header = VLAN_HLEN;
- mtu = dst_mtu(&rt_dst(rt))
- - ETH_HLEN
- - mutable->tunnel_hlen
- - vlan_header;
+ mtu = dst_mtu(&rt_dst(rt)) - ETH_HLEN - tun_hlen - vlan_header;
}
if (skb->protocol == htons(ETH_P_IP)) {
@@ -569,11 +566,10 @@ static bool check_mtu(struct sk_buff *skb,
}
static void create_tunnel_header(const struct vport *vport,
- const struct tnl_mutable_config *mutable,
- const struct rtable *rt, void *header)
+ const struct rtable *rt, struct sk_buff *skb)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- struct iphdr *iph = header;
+ struct iphdr *iph = (struct iphdr *)skb->data;
iph->version = 4;
iph->ihl = sizeof(struct iphdr) >> 2;
@@ -584,7 +580,7 @@ static void create_tunnel_header(const struct vport *vport,
if (!iph->ttl)
iph->ttl = ip4_dst_hoplimit(&rt_dst(rt));
- tnl_vport->tnl_ops->build_header(vport, mutable, iph + 1);
+ tnl_vport->tnl_ops->build_header(vport, skb);
}
#ifdef HAVE_RT_GENID
@@ -657,16 +653,14 @@ static bool need_linearize(const struct sk_buff *skb)
return false;
}
-static struct sk_buff *handle_offloads(struct sk_buff *skb,
- const struct tnl_mutable_config *mutable,
+static struct sk_buff *handle_offloads(struct sk_buff *skb, int tun_hlen,
const struct rtable *rt)
{
int min_headroom;
int err;
min_headroom = LL_RESERVED_SPACE(rt_dst(rt).dev) + rt_dst(rt).header_len
- + mutable->tunnel_hlen
- + (vlan_tx_tag_present(skb) ? VLAN_HLEN : 0);
+ + tun_hlen + (vlan_tx_tag_present(skb) ? VLAN_HLEN : 0);
if (skb_headroom(skb) < min_headroom || skb_header_cloned(skb)) {
int head_delta = SKB_DATA_ALIGN(min_headroom -
@@ -719,15 +713,14 @@ error:
return ERR_PTR(err);
}
-static int send_frags(struct sk_buff *skb,
- const struct tnl_mutable_config *mutable)
+static int send_frags(struct sk_buff *skb, int tun_hlen)
{
int sent_len;
sent_len = 0;
while (skb) {
struct sk_buff *next = skb->next;
- int frag_len = skb->len - mutable->tunnel_hlen;
+ int frag_len = skb->len - tun_hlen;
int err;
skb->next = NULL;
@@ -752,6 +745,14 @@ free_frags:
return sent_len;
}
+static int tunnel_hlen(struct tnl_vport *tnl_vport, struct sk_buff *skb)
+{
+ int tun_hlen = tnl_vport->tnl_ops->hdr_len(skb);
+ if (tun_hlen < 0)
+ return tun_hlen;
+ return tun_hlen + sizeof(struct iphdr);
+}
+
int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
@@ -765,6 +766,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
u8 ttl;
u8 inner_tos;
u8 tos;
+ int tun_hlen;
if (!OVS_CB(skb)->tun_key)
goto error_free;
@@ -822,13 +824,17 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
skb_dst_drop(skb);
skb_clear_rxhash(skb);
+ tun_hlen = tunnel_hlen(tnl_vport, skb);
+ if (unlikely(tun_hlen < 0))
+ goto error;
+
/* Offloading */
- skb = handle_offloads(skb, mutable, rt);
+ skb = handle_offloads(skb, tun_hlen, rt);
if (IS_ERR(skb))
goto error;
/* MTU */
- if (unlikely(!check_mtu(skb, vport, mutable, rt, &frag_off))) {
+ if (unlikely(!check_mtu(skb, vport, mutable, tun_hlen, rt, &frag_off))) {
err = VPORT_E_TX_DROPPED;
goto error_free;
}
@@ -837,7 +843,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
* If we are over the MTU, allow the IP stack to handle fragmentation.
* Fragmentation is a slow path anyways.
*/
- if (unlikely(skb->len + mutable->tunnel_hlen > dst_mtu(&rt_dst(rt)))) {
+ if (unlikely(skb->len + tun_hlen > dst_mtu(&rt_dst(rt)))) {
unattached_dst = &rt_dst(rt);
dst_hold(unattached_dst);
}
@@ -862,8 +868,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (unlikely(vlan_deaccel_tag(skb)))
goto next;
- skb_push(skb, mutable->tunnel_hlen);
- create_tunnel_header(vport, mutable, rt, skb->data);
+ skb_push(skb, tun_hlen);
+ create_tunnel_header(vport, rt, skb);
skb_reset_network_header(skb);
if (next_skb)
@@ -880,12 +886,12 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
iph->frag_off = frag_off;
ip_select_ident(iph, &rt_dst(rt), NULL);
- skb = tnl_vport->tnl_ops->update_header(vport, mutable,
+ skb = tnl_vport->tnl_ops->update_header(vport, tun_hlen,
&rt_dst(rt), skb);
if (unlikely(!skb))
goto next;
- sent_len += send_frags(skb, mutable);
+ sent_len += send_frags(skb, tun_hlen);
next:
skb = next_skb;
}
@@ -917,12 +923,6 @@ static int tnl_set_config(struct net *net,
port_key_set_net(&mutable->key, net);
mutable->key.tunnel_type = tnl_ops->tunnel_type;
- mutable->tunnel_hlen = tnl_ops->hdr_len(mutable);
- if (mutable->tunnel_hlen < 0)
- return mutable->tunnel_hlen;
-
- mutable->tunnel_hlen += sizeof(struct iphdr);
-
old_vport = port_table_lookup(&mutable->key);
if (old_vport && old_vport != cur_vport)
return -EEXIST;
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index cddb88e..a32241f 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -84,10 +84,8 @@ static inline void port_key_set_net(struct port_lookup_key *key, struct net *net
* attributes.
* @rcu: RCU callback head for deferred destruction.
* @seq: Sequence number for distinguishing configuration versions.
- * @tunnel_hlen: Tunnel header length.
* @eth_addr: Source address for packets generated by tunnel itself
* (e.g. ICMP fragmentation needed messages).
- * @out_key: Key to use on output, 0 if this tunnel has no fixed output key.
* @flags: TNL_F_* flags.
*/
struct tnl_mutable_config {
@@ -96,12 +94,9 @@ struct tnl_mutable_config {
unsigned seq;
- unsigned tunnel_hlen;
-
unsigned char eth_addr[ETH_ALEN];
/* Configured via OVS_TUNNEL_ATTR_* attributes. */
- __be64 out_key;
u32 flags;
};
@@ -114,7 +109,7 @@ struct tnl_ops {
* build_header() (i.e. excludes the IP header). Returns a negative
* error code if the configuration is invalid.
*/
- int (*hdr_len)(const struct tnl_mutable_config *);
+ int (*hdr_len)(struct sk_buff *skb);
/*
* Builds the static portion of the tunnel header, which is stored in
@@ -124,8 +119,7 @@ struct tnl_ops {
* in some circumstances caching is disabled and this function will be
* called for every packet, so try not to make it too slow.
*/
- void (*build_header)(const struct vport *,
- const struct tnl_mutable_config *, void *header);
+ void (*build_header)(const struct vport *, struct sk_buff *);
/*
* Updates the cached header of a packet to match the actual packet
@@ -136,7 +130,7 @@ struct tnl_ops {
* of fragmentation).
*/
struct sk_buff *(*update_header)(const struct vport *,
- const struct tnl_mutable_config *,
+ int tun_hlen,
struct dst_entry *, struct sk_buff *);
};
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index a180b87..102a207 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -155,16 +155,17 @@ static struct inet_frags frag_state = {
.secret_interval = CAPWAP_FRAG_SECRET_INTERVAL,
};
-static int capwap_hdr_len(const struct tnl_mutable_config *mutable)
+static int capwap_hdr_len(struct sk_buff *skb)
{
int size = CAPWAP_MIN_HLEN;
/* CAPWAP has no checksums. */
- if (mutable->flags & TNL_F_CSUM)
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_CSUM)
return -EINVAL;
/* if keys are specified, then add WSI field */
- if (mutable->out_key || (mutable->flags & TNL_F_OUT_KEY_ACTION)) {
+ if (OVS_CB(skb)->tun_key->tun_id ||
+ OVS_CB(skb)->tun_key->tun_flags & TNL_F_OUT_KEY_ACTION) {
size += sizeof(struct capwaphdr_wsi) +
sizeof(struct capwaphdr_wsi_key);
}
@@ -172,11 +173,10 @@ static int capwap_hdr_len(const struct tnl_mutable_config *mutable)
return size;
}
-static void capwap_build_header(const struct vport *vport,
- const struct tnl_mutable_config *mutable,
- void *header)
+static void capwap_build_header(const struct vport *vport, struct sk_buff *skb)
{
- struct udphdr *udph = header;
+ struct iphdr *iph = (struct iphdr *)skb->data;
+ struct udphdr *udph = (struct udphdr *)(iph + 1);
struct capwaphdr *cwh = (struct capwaphdr *)(udph + 1);
udph->source = htons(CAPWAP_SRC_PORT);
@@ -186,7 +186,8 @@ static void capwap_build_header(const struct vport *vport,
cwh->frag_id = 0;
cwh->frag_off = 0;
- if (mutable->out_key || (mutable->flags & TNL_F_OUT_KEY_ACTION)) {
+ if (OVS_CB(skb)->tun_key->tun_id ||
+ OVS_CB(skb)->tun_key->tun_flags & TNL_F_OUT_KEY_ACTION) {
struct capwaphdr_wsi *wsi = (struct capwaphdr_wsi *)(cwh + 1);
cwh->begin = CAPWAP_KEYED;
@@ -197,9 +198,9 @@ static void capwap_build_header(const struct vport *vport,
wsi->flags = CAPWAP_WSI_F_KEY64;
wsi->reserved_padding = 0;
- if (mutable->out_key) {
+ if (OVS_CB(skb)->tun_key->tun_id) {
struct capwaphdr_wsi_key *opt = (struct capwaphdr_wsi_key *)(wsi + 1);
- opt->key = mutable->out_key;
+ opt->key = OVS_CB(skb)->tun_key->tun_id;
}
} else {
/* make packet readable by old capwap code */
@@ -208,13 +209,12 @@ static void capwap_build_header(const struct vport *vport,
}
static struct sk_buff *capwap_update_header(const struct vport *vport,
- const struct tnl_mutable_config *mutable,
- struct dst_entry *dst,
+ int tun_hlen, struct dst_entry *dst,
struct sk_buff *skb)
{
struct udphdr *udph = udp_hdr(skb);
- if (mutable->flags & TNL_F_OUT_KEY_ACTION) {
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_OUT_KEY_ACTION) {
/* first field in WSI is key */
struct capwaphdr *cwh = (struct capwaphdr *)(udph + 1);
struct capwaphdr_wsi *wsi = (struct capwaphdr_wsi *)(cwh + 1);
@@ -226,7 +226,7 @@ static struct sk_buff *capwap_update_header(const struct vport *vport,
udph->len = htons(skb->len - skb_transport_offset(skb));
if (unlikely(skb->len - skb_network_offset(skb) > dst_mtu(dst))) {
- unsigned int hlen = skb_transport_offset(skb) + capwap_hdr_len(mutable);
+ unsigned int hlen = skb_transport_offset(skb) + capwap_hdr_len(skb);
skb = fragment(skb, vport, dst, hlen);
}
diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index 8fab193..b6a4308 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -45,16 +45,17 @@ struct gre_base_hdr {
__be16 protocol;
};
-static int gre_hdr_len(const struct tnl_mutable_config *mutable)
+static int gre_hdr_len(struct sk_buff *skb)
{
int len;
len = GRE_HEADER_SECTION;
- if (mutable->flags & TNL_F_CSUM)
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_CSUM)
len += GRE_HEADER_SECTION;
- if (mutable->out_key || mutable->flags & TNL_F_OUT_KEY_ACTION)
+ if (OVS_CB(skb)->tun_key->tun_id ||
+ OVS_CB(skb)->tun_key->tun_flags & TNL_F_OUT_KEY_ACTION)
len += GRE_HEADER_SECTION;
return len;
@@ -70,41 +71,41 @@ static __be32 be64_get_low32(__be64 x)
#endif
}
-static void gre_build_header(const struct vport *vport,
- const struct tnl_mutable_config *mutable,
- void *header)
+static void gre_build_header(const struct vport *vport, struct sk_buff *skb)
{
- struct gre_base_hdr *greh = header;
+ struct iphdr *iph = (struct iphdr *)skb->data;
+ struct gre_base_hdr *greh = (struct gre_base_hdr *)(iph + 1);
__be32 *options = (__be32 *)(greh + 1);
greh->protocol = htons(ETH_P_TEB);
greh->flags = 0;
- if (mutable->flags & TNL_F_CSUM) {
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_CSUM) {
greh->flags |= GRE_CSUM;
*options = 0;
options++;
}
- if (mutable->out_key || mutable->flags & TNL_F_OUT_KEY_ACTION)
+ if (OVS_CB(skb)->tun_key->tun_id ||
+ OVS_CB(skb)->tun_key->tun_flags & TNL_F_OUT_KEY_ACTION)
greh->flags |= GRE_KEY;
- if (mutable->out_key)
- *options = be64_get_low32(mutable->out_key);
+ if (OVS_CB(skb)->tun_key->tun_id)
+ *options = be64_get_low32(OVS_CB(skb)->tun_key->tun_id);
}
static struct sk_buff *gre_update_header(const struct vport *vport,
- const struct tnl_mutable_config *mutable,
- struct dst_entry *dst,
+ int tun_hlen, struct dst_entry *dst,
struct sk_buff *skb)
{
- __be32 *options = (__be32 *)(skb_network_header(skb) + mutable->tunnel_hlen
+ __be32 *options = (__be32 *)(skb_network_header(skb) + tun_hlen
- GRE_HEADER_SECTION);
- if (mutable->out_key || mutable->flags & TNL_F_OUT_KEY_ACTION)
+ if (OVS_CB(skb)->tun_key->tun_id ||
+ OVS_CB(skb)->tun_key->tun_flags & TNL_F_OUT_KEY_ACTION)
options--;
- if (mutable->flags & TNL_F_CSUM)
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_CSUM)
*(__sum16 *)options = csum_fold(skb_checksum(skb,
skb_transport_offset(skb),
skb->len - skb_transport_offset(skb),
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 21/21] datapath: Always use tun_key flags
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
These flags should always be valid, which allows the flags
element of tnl_mutable_config to be removed.
The flags in mutable were actually not being set due to a previous patch in
this series, so all flag-related features, except outgoing key and csum
which were restored in a previous patch, were disabled.
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 13 ++++++-------
datapath/tunnel.h | 4 ----
2 files changed, 6 insertions(+), 11 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 982de25..a91e319 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -482,7 +482,7 @@ bool ovs_tnl_frag_needed(struct vport *vport,
* not symmetric then PMTUD needs to be disabled since we won't have
* any way of synthesizing packets.
*/
- if ((mutable->flags & (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) ==
+ if ((OVS_CB(skb)->tun_key->tun_flags & (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) ==
(TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) {
ntun_key = *tun_key;
OVS_CB(nskb)->tun_key = &ntun_key;
@@ -503,9 +503,9 @@ static bool check_mtu(struct sk_buff *skb,
const struct tnl_mutable_config *mutable, int tun_hlen,
const struct rtable *rt, __be16 *frag_offp)
{
- bool df_inherit = mutable->flags & TNL_F_DF_INHERIT;
- bool pmtud = mutable->flags & TNL_F_PMTUD;
- __be16 frag_off = mutable->flags & TNL_F_DF_DEFAULT ? htons(IP_DF) : 0;
+ bool df_inherit = OVS_CB(skb)->tun_key->tun_flags & TNL_F_DF_INHERIT;
+ bool pmtud = OVS_CB(skb)->tun_key->tun_flags & TNL_F_PMTUD;
+ __be16 frag_off = OVS_CB(skb)->tun_key->tun_flags & TNL_F_DF_DEFAULT ? htons(IP_DF) : 0;
int mtu = 0;
unsigned int packet_length = skb->len - ETH_HLEN;
@@ -804,7 +804,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
else
inner_tos = 0;
- if (mutable->flags & TNL_F_TOS_INHERIT)
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_TOS_INHERIT)
tos = inner_tos;
else
tos = OVS_CB(skb)->tun_key->ipv4_tos;
@@ -851,7 +851,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
if (!ttl)
ttl = ip4_dst_hoplimit(&rt_dst(rt));
- if (mutable->flags & TNL_F_TTL_INHERIT) {
+ if (OVS_CB(skb)->tun_key->tun_flags & TNL_F_TTL_INHERIT) {
if (skb->protocol == htons(ETH_P_IP))
ttl = ip_hdr(skb)->ttl;
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
@@ -919,7 +919,6 @@ static int tnl_set_config(struct net *net,
{
const struct vport *old_vport;
- mutable->flags = 0;
port_key_set_net(&mutable->key, net);
mutable->key.tunnel_type = tnl_ops->tunnel_type;
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index a32241f..4893903 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -86,7 +86,6 @@ static inline void port_key_set_net(struct port_lookup_key *key, struct net *net
* @seq: Sequence number for distinguishing configuration versions.
* @eth_addr: Source address for packets generated by tunnel itself
* (e.g. ICMP fragmentation needed messages).
- * @flags: TNL_F_* flags.
*/
struct tnl_mutable_config {
struct port_lookup_key key;
@@ -95,9 +94,6 @@ struct tnl_mutable_config {
unsigned seq;
unsigned char eth_addr[ETH_ALEN];
-
- /* Configured via OVS_TUNNEL_ATTR_* attributes. */
- u32 flags;
};
struct tnl_ops {
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 19/21] datapath: Simplify vport lookup
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
The lookup is now only based on the net and tunnel type.
It should be possible to either get rid of the lookup altogether
or push it into the GRE and CAPWAP implementations, but this
change is simpler for now.
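To make the reduced lookup concrete, here is a minimal userspace sketch, assuming a key that holds only a namespace pointer and a tunnel type. The sketch_* names are hypothetical, and a linear scan stands in for the hashed port_table used by the datapath.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified key: only the namespace and the tunnel type remain. */
struct sketch_key {
	const void *net;	/* stand-in for struct net * */
	uint32_t tunnel_type;
};

struct sketch_port {
	struct sketch_key key;
	int id;
};

/* Zero the whole key first so that padding bytes compare equal in memcmp(). */
static void sketch_key_init(struct sketch_key *key, const void *net,
			    uint32_t tunnel_type)
{
	memset(key, 0, sizeof(*key));
	key->net = net;
	key->tunnel_type = tunnel_type;
}

/* Return the first port whose key matches byte for byte, or NULL. */
static const struct sketch_port *
sketch_lookup(const struct sketch_port *ports, size_t n,
	      const struct sketch_key *key)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!memcmp(&ports[i].key, key, sizeof(*key)))
			return &ports[i];
	return NULL;
}
```

With addresses and in_key gone from the key, at most one port per (net, tunnel type) pair can be registered, which matches the -EEXIST check in tnl_set_config().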
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 110 +++---------------------------------------------
datapath/tunnel.h | 18 ++------
datapath/vport-capwap.c | 7 +--
datapath/vport-gre.c | 10 ++---
4 files changed, 16 insertions(+), 129 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 39aa2af..a303d8d 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -56,18 +56,6 @@
static struct hlist_head *port_table __read_mostly;
-/*
- * These are just used as an optimization: they don't require any kind of
- * synchronization because we could have just as easily read the value before
- * the port change happened.
- */
-static unsigned int key_local_remote_ports __read_mostly;
-static unsigned int key_remote_ports __read_mostly;
-static unsigned int key_multicast_ports __read_mostly;
-static unsigned int local_remote_ports __read_mostly;
-static unsigned int remote_ports __read_mostly;
-static unsigned int multicast_ports __read_mostly;
-
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,36)
#define rt_dst(rt) (rt->dst)
#else
@@ -97,27 +85,6 @@ static void assign_config_rcu(struct vport *vport,
call_rcu(&old_config->rcu, free_config_rcu);
}
-static unsigned int *find_port_pool(const struct tnl_mutable_config *mutable)
-{
- bool is_multicast = ipv4_is_multicast(mutable->key.daddr);
-
- if (mutable->flags & TNL_F_IN_KEY_MATCH) {
- if (mutable->key.saddr)
- return &local_remote_ports;
- else if (is_multicast)
- return &multicast_ports;
- else
- return &remote_ports;
- } else {
- if (mutable->key.saddr)
- return &key_local_remote_ports;
- else if (is_multicast)
- return &key_multicast_ports;
- else
- return &key_remote_ports;
- }
-}
-
static u32 port_hash(const struct port_lookup_key *key)
{
return jhash2((u32 *)key, (PORT_KEY_LEN / sizeof(u32)), 0);
@@ -137,8 +104,6 @@ static void port_table_add_port(struct vport *vport)
mutable = rtnl_dereference(tnl_vport->mutable);
hash = port_hash(&mutable->key);
hlist_add_head_rcu(&tnl_vport->hash_node, find_bucket(hash));
-
- (*find_port_pool(rtnl_dereference(tnl_vport->mutable)))++;
}
static void port_table_remove_port(struct vport *vport)
@@ -146,12 +111,9 @@ static void port_table_remove_port(struct vport *vport)
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
hlist_del_init_rcu(&tnl_vport->hash_node);
-
- (*find_port_pool(rtnl_dereference(tnl_vport->mutable)))--;
}
-static struct vport *port_table_lookup(struct port_lookup_key *key,
- const struct tnl_mutable_config **pmutable)
+static struct vport *port_table_lookup(struct port_lookup_key *key)
{
struct hlist_node *n;
struct hlist_head *bucket;
@@ -164,79 +126,21 @@ static struct vport *port_table_lookup(struct port_lookup_key *key,
struct tnl_mutable_config *mutable;
mutable = rcu_dereference_rtnl(tnl_vport->mutable);
- if (!memcmp(&mutable->key, key, PORT_KEY_LEN)) {
- *pmutable = mutable;
+ if (!memcmp(&mutable->key, key, PORT_KEY_LEN))
return tnl_vport_to_vport(tnl_vport);
- }
}
return NULL;
}
-struct vport *ovs_tnl_find_port(struct net *net, __be32 saddr, __be32 daddr,
- __be64 key, int tunnel_type,
- const struct tnl_mutable_config **mutable)
+struct vport *ovs_tnl_find_port(struct net *net, u32 tunnel_type)
{
struct port_lookup_key lookup;
- struct vport *vport;
- bool is_multicast = ipv4_is_multicast(saddr);
port_key_set_net(&lookup, net);
- lookup.saddr = saddr;
- lookup.daddr = daddr;
-
- /* First try for exact match on in_key. */
- lookup.in_key = key;
- lookup.tunnel_type = tunnel_type | TNL_T_KEY_EXACT;
- if (!is_multicast && key_local_remote_ports) {
- vport = port_table_lookup(&lookup, mutable);
- if (vport)
- return vport;
- }
- if (key_remote_ports) {
- lookup.saddr = 0;
- vport = port_table_lookup(&lookup, mutable);
- if (vport)
- return vport;
-
- lookup.saddr = saddr;
- }
-
- /* Then try matches that wildcard in_key. */
- lookup.in_key = 0;
- lookup.tunnel_type = tunnel_type | TNL_T_KEY_MATCH;
- if (!is_multicast && local_remote_ports) {
- vport = port_table_lookup(&lookup, mutable);
- if (vport)
- return vport;
- }
- if (remote_ports) {
- lookup.saddr = 0;
- vport = port_table_lookup(&lookup, mutable);
- if (vport)
- return vport;
- }
+ lookup.tunnel_type = tunnel_type;
- if (is_multicast) {
- lookup.saddr = 0;
- lookup.daddr = saddr;
- if (key_multicast_ports) {
- lookup.tunnel_type = tunnel_type | TNL_T_KEY_EXACT;
- lookup.in_key = key;
- vport = port_table_lookup(&lookup, mutable);
- if (vport)
- return vport;
- }
- if (multicast_ports) {
- lookup.tunnel_type = tunnel_type | TNL_T_KEY_MATCH;
- lookup.in_key = 0;
- vport = port_table_lookup(&lookup, mutable);
- if (vport)
- return vport;
- }
- }
-
- return NULL;
+ return port_table_lookup(&lookup);
}
static void ecn_decapsulate(struct sk_buff *skb)
@@ -1008,11 +912,9 @@ static int tnl_set_config(struct net *net,
struct tnl_mutable_config *mutable)
{
const struct vport *old_vport;
- const struct tnl_mutable_config *old_mutable;
mutable->flags = 0;
port_key_set_net(&mutable->key, net);
- mutable->key.daddr = htonl(0);
mutable->key.tunnel_type = tnl_ops->tunnel_type;
mutable->tunnel_hlen = tnl_ops->hdr_len(mutable);
@@ -1021,7 +923,7 @@ static int tnl_set_config(struct net *net,
mutable->tunnel_hlen += sizeof(struct iphdr);
- old_vport = port_table_lookup(&mutable->key, &old_mutable);
+ old_vport = port_table_lookup(&mutable->key);
if (old_vport && old_vport != cur_vport)
return -EEXIST;
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index 330df27..cddb88e 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -35,16 +35,9 @@
/*
* One of these goes in struct tnl_ops and in tnl_find_port().
- * These values are in the same namespace as other TNL_T_* values, so
- * only the least significant 10 bits are available to define protocol
- * identifiers.
*/
-#define TNL_T_PROTO_GRE 0
-#define TNL_T_PROTO_CAPWAP 1
-
-/* These flags are only needed when calling tnl_find_port(). */
-#define TNL_T_KEY_EXACT (1 << 10)
-#define TNL_T_KEY_MATCH (1 << 11)
+#define TNL_T_PROTO_GRE 0
+#define TNL_T_PROTO_CAPWAP 1
/* Private flags not exposed to userspace in this form. */
#define TNL_F_IN_KEY_MATCH (1 << 16) /* Store the key in tun_id to
@@ -66,12 +59,9 @@
* @tunnel_type: Set of TNL_T_* flags that define lookup.
*/
struct port_lookup_key {
- __be64 in_key;
#ifdef CONFIG_NET_NS
struct net *net;
#endif
- __be32 saddr;
- __be32 daddr;
u32 tunnel_type;
};
@@ -212,9 +202,7 @@ const unsigned char *ovs_tnl_get_addr(const struct vport *vport);
int ovs_tnl_send(struct vport *vport, struct sk_buff *skb);
void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb);
-struct vport *ovs_tnl_find_port(struct net *net, __be32 saddr, __be32 daddr,
- __be64 key, int tunnel_type,
- const struct tnl_mutable_config **mutable);
+struct vport *ovs_tnl_find_port(struct net *net, u32 tunnel_type);
bool ovs_tnl_frag_needed(struct vport *vport,
const struct tnl_mutable_config *mutable,
struct sk_buff *skb, unsigned int mtu,
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index f26a7d2..a180b87 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -314,7 +314,6 @@ error:
static int capwap_rcv(struct sock *sk, struct sk_buff *skb)
{
struct vport *vport;
- const struct tnl_mutable_config *mutable;
struct iphdr *iph;
struct ovs_key_ipv4_tunnel tun_key;
__be64 key = 0;
@@ -327,15 +326,13 @@ static int capwap_rcv(struct sock *sk, struct sk_buff *skb)
goto out;
iph = ip_hdr(skb);
- vport = ovs_tnl_find_port(sock_net(sk), iph->daddr, iph->saddr, key,
- TNL_T_PROTO_CAPWAP, &mutable);
+ vport = ovs_tnl_find_port(dev_net(skb->dev), TNL_T_PROTO_CAPWAP);
if (unlikely(!vport)) {
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
goto error;
}
- tun_key_init(&tun_key, iph,
- mutable->flags & TNL_F_IN_KEY_MATCH ? key : 0);
+ tun_key_init(&tun_key, iph, key);
OVS_CB(skb)->tun_key = &tun_key;
ovs_tnl_rcv(vport, skb);
diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index f610097..8fab193 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -170,6 +170,8 @@ static int parse_header(struct iphdr *iph, __be16 *flags, __be64 *key)
/* Called with rcu_read_lock and BH disabled. */
static void gre_err(struct sk_buff *skb, u32 info)
{
+#warning fix gre_err
+#if 0
struct vport *vport;
const struct tnl_mutable_config *mutable;
const int type = icmp_hdr(skb)->type;
@@ -292,6 +294,7 @@ out:
skb_set_mac_header(skb, orig_mac_header);
skb_set_network_header(skb, orig_nw_header);
skb->protocol = htons(ETH_P_IP);
+#endif
}
static bool check_checksum(struct sk_buff *skb)
@@ -324,7 +327,6 @@ static bool check_checksum(struct sk_buff *skb)
static int gre_rcv(struct sk_buff *skb)
{
struct vport *vport;
- const struct tnl_mutable_config *mutable;
int hdr_len;
struct iphdr *iph;
struct ovs_key_ipv4_tunnel tun_key;
@@ -345,16 +347,14 @@ static int gre_rcv(struct sk_buff *skb)
goto error;
iph = ip_hdr(skb);
- vport = ovs_tnl_find_port(dev_net(skb->dev), iph->daddr, iph->saddr, key,
- TNL_T_PROTO_GRE, &mutable);
+ vport = ovs_tnl_find_port(dev_net(skb->dev), TNL_T_PROTO_GRE);
if (unlikely(!vport)) {
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
goto error;
}
- tun_key_init(&tun_key, iph,
- mutable->flags & TNL_F_IN_KEY_MATCH ? key : 0);
+ tun_key_init(&tun_key, iph, key);
OVS_CB(skb)->tun_key = &tun_key;
__skb_pull(skb, hdr_len);
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 17/21] datapath: Always use tun_key addresses for route lookup
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
The tun_key should always be present and correct.
Mutable no longer stores correct address information
and the saddr and daddr fields will be removed.
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 42 +++++++++++++++++-------------------------
1 file changed, 17 insertions(+), 25 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index b997cb8..ba18055 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -690,46 +690,44 @@ static inline int rt_genid(struct net *net)
}
#endif
-static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
- u8 ipproto, __be32 daddr, __be32 saddr,
- u8 tos)
+static struct rtable *__find_route(struct net *net, u8 ipproto,
+ struct ovs_key_ipv4_tunnel *tun_key, u8 tos)
{
/* Tunnel configuration keeps DSCP part of TOS bits, But Linux
* router expect RT_TOS bits only. */
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,39)
struct flowi fl = { .nl_u = { .ip4_u = {
- .daddr = daddr,
- .saddr = saddr,
+ .daddr = tun_key->ipv4_dst,
+ .saddr = tun_key->ipv4_src,
.tos = RT_TOS(tos) } },
.proto = ipproto };
struct rtable *rt;
- if (unlikely(ip_route_output_key(port_key_get_net(&mutable->key), &rt, &fl)))
+ if (unlikely(ip_route_output_key(net, &rt, &fl)))
return ERR_PTR(-EADDRNOTAVAIL);
return rt;
#else
- struct flowi4 fl = { .daddr = daddr,
- .saddr = saddr,
+ struct flowi4 fl = { .daddr = tun_key->ipv4_dst,
+ .saddr = tun_key->ipv4_src,
.flowi4_tos = RT_TOS(tos),
.flowi4_proto = ipproto };
- return ip_route_output_key(port_key_get_net(&mutable->key), &fl);
+ return ip_route_output_key(net, &fl);
#endif
}
-static struct rtable *find_route(struct vport *vport,
- const struct tnl_mutable_config *mutable,
- u8 tos, __be32 daddr, __be32 saddr)
+static struct rtable *find_route(struct vport *vport, struct net *net,
+ struct ovs_key_ipv4_tunnel *tun_key, u8 tos)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
struct rtable *rt;
tos = RT_TOS(tos);
- rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto,
- daddr, saddr, tos);
+ rt = __find_route(net, tnl_vport->tnl_ops->ipproto,
+ tun_key, tos);
if (IS_ERR(rt))
return NULL;
@@ -860,12 +858,13 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
struct dst_entry *unattached_dst = NULL;
int sent_len = 0;
__be16 frag_off = 0;
- __be32 daddr;
- __be32 saddr;
u8 ttl;
u8 inner_tos;
u8 tos;
+ if (!OVS_CB(skb)->tun_key)
+ goto error_free;
+
/* Validate the protocol headers before we try to use them. */
if (skb->protocol == htons(ETH_P_8021Q) &&
!vlan_tx_tag_present(skb)) {
@@ -906,16 +905,9 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
else
tos = mutable->tos;
- if (OVS_CB(skb)->tun_key) {
- daddr = OVS_CB(skb)->tun_key->ipv4_dst;
- saddr = OVS_CB(skb)->tun_key->ipv4_src;
- } else {
- daddr = mutable->key.daddr;
- saddr = mutable->key.saddr;
- }
-
/* Route lookup */
- rt = find_route(vport, mutable, tos, daddr, saddr);
+ rt = find_route(vport, port_key_get_net(&mutable->key),
+ OVS_CB(skb)->tun_key, tos);
if (unlikely(!rt))
goto error_free;
unattached_dst = &rt_dst(rt);
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 08/21] ofproto: Add realdev_to_txdev()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
This is used to map tunnel or VLAN realdevs to
tundevs and vlandevs respectively. It is used
on transmit to map from the interface used
in user-space to the interface used in the datapath.
In the case where an interface is not a tunnel
and does not have VLAN splinters configured,
an identity map is made.
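For illustration only, the mapping described above can be sketched in
self-contained C as follows. struct port_sketch and its fields are
simplified stand-ins invented for this sketch, not the real
struct ofport_dpif, and the VLAN lookup is reduced to a single field
rather than a lookup keyed on vlan_tci.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified port; not the real struct ofport_dpif. */
struct port_sketch {
    uint32_t odp_port;        /* datapath port of the realdev */
    const uint32_t *tundev;   /* non-NULL if this is a tunnel realdev */
    uint32_t vlandev;         /* vlandev port, or 0 if no VLAN splinter */
};

/* Returns the datapath port that a packet should be transmitted on. */
static uint32_t
realdev_to_txdev_sketch(const struct port_sketch *p)
{
    assert(p != NULL);
    if (p->tundev) {
        return *p->tundev;    /* tunnel: transmit on the tundev */
    }
    if (p->vlandev) {
        return p->vlandev;    /* VLAN splinter: transmit on the vlandev */
    }
    return p->odp_port;       /* otherwise the identity map */
}
```

The tunnel case is checked first, which matches the order used by
realdev_to_txdev() in the diff.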
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
ofproto/ofproto-dpif.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 642b508..c7ea391 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -539,8 +539,6 @@ struct vlan_splinter {
int vid;
};
-static uint32_t vsp_realdev_to_vlandev(const struct ofproto_dpif *,
- uint32_t realdev, ovs_be16 vlan_tci);
static bool vsp_adjust_flow(const struct ofproto_dpif *, struct flow *);
static void vsp_remove(struct ofport_dpif *);
static void vsp_add(struct ofport_dpif *, uint16_t realdev_ofp_port, int vid);
@@ -555,6 +553,10 @@ static unsigned multicast_ports;
static int set_tunnelling(struct ofport *ofport_, uint16_t realdev_ofp_port,
const struct tunnel_settings *s);
+static uint32_t
+realdev_to_txdev(const struct ofproto_dpif *ofproto,
+ const struct ofport_dpif *ofport, ovs_be16 vlan_tci);
+
static struct ofport_dpif *
ofport_dpif_cast(const struct ofport *ofport)
{
@@ -4700,9 +4702,8 @@ send_packet(const struct ofport_dpif *ofport, struct ofpbuf *packet)
int error;
flow_extract((struct ofpbuf *) packet, 0, 0, 0, &flow);
- odp_port = vsp_realdev_to_vlandev(ofproto, ofport->odp_port,
- flow.vlan_tci);
- if (odp_port != ofport->odp_port) {
+ odp_port = realdev_to_txdev(ofproto, ofport, flow.vlan_tci);
+ if (odp_port != ofport->odp_port && !ofport->tun) {
eth_pop_vlan(packet);
flow.vlan_tci = htons(0);
}
@@ -4909,9 +4910,8 @@ compose_output_action__(struct action_xlate_ctx *ctx, uint16_t ofp_port,
* later and we're pre-populating the flow table. */
}
- out_port = vsp_realdev_to_vlandev(ctx->ofproto, odp_port,
- ctx->flow.vlan_tci);
- if (out_port != odp_port) {
+ out_port = realdev_to_txdev(ctx->ofproto, ofport, ctx->flow.vlan_tci);
+ if (out_port != odp_port && !ofport->tun) {
ctx->flow.vlan_tci = htons(0);
}
commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions);
@@ -7211,6 +7211,21 @@ set_tunnelling(struct ofport *ofport_, uint16_t tundev_ofp_port,
return 0;
}
+
+/* Maps a port to the port that it should be transmitted on.
+ * If tunneling is enabled then the associated tunnel port is returned.
+ * If VLAN splintering is enabled then the ofp_port of the vlandev is
+ * returned.
+ * Otherwise no mapping is in effect and ofport->odp_port is returned. */
+static uint32_t
+realdev_to_txdev(const struct ofproto_dpif *ofproto,
+ const struct ofport_dpif *ofport, ovs_be16 vlan_tci)
+{
+ if (ofport->tun) {
+ return ofp_port_to_odp_port(ofport->tun->tundev_ofp_port);
+ }
+ return vsp_realdev_to_vlandev(ofproto, ofport->odp_port, vlan_tci);
+}
\f
const struct ofproto_class ofproto_dpif_class = {
enumerate_types,
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 02/21] datapath: Use tun_key on transmit
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
Use the tun_key, which is the basis of flow-based tunnelling, on transmit.
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 45 ++++++++++++++++++++++++++++++++-------------
1 file changed, 32 insertions(+), 13 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 010e513..61add96 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -1002,15 +1002,16 @@ unlock:
}
static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
- u8 ipproto, u8 tos)
+ u8 ipproto, __be32 daddr, __be32 saddr,
+ u8 tos)
{
/* Tunnel configuration keeps DSCP part of TOS bits, But Linux
* router expect RT_TOS bits only. */
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,39)
struct flowi fl = { .nl_u = { .ip4_u = {
- .daddr = mutable->key.daddr,
- .saddr = mutable->key.saddr,
+ .daddr = daddr,
+ .saddr = saddr,
.tos = RT_TOS(tos) } },
.proto = ipproto };
struct rtable *rt;
@@ -1020,8 +1021,8 @@ static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
return rt;
#else
- struct flowi4 fl = { .daddr = mutable->key.daddr,
- .saddr = mutable->key.saddr,
+ struct flowi4 fl = { .daddr = daddr,
+ .saddr = saddr,
.flowi4_tos = RT_TOS(tos),
.flowi4_proto = ipproto };
@@ -1031,7 +1032,8 @@ static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
static struct rtable *find_route(struct vport *vport,
const struct tnl_mutable_config *mutable,
- u8 tos, struct tnl_cache **cache)
+ u8 tos, __be32 daddr, __be32 saddr,
+ struct tnl_cache **cache)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
struct tnl_cache *cur_cache = rcu_dereference(tnl_vport->cache);
@@ -1039,14 +1041,16 @@ static struct rtable *find_route(struct vport *vport,
*cache = NULL;
tos = RT_TOS(tos);
- if (likely(tos == RT_TOS(mutable->tos) &&
- check_cache_valid(cur_cache, mutable))) {
+ if (daddr == mutable->key.daddr && saddr == mutable->key.saddr &&
+ tos == RT_TOS(mutable->tos) &&
+ check_cache_valid(cur_cache, mutable)) {
*cache = cur_cache;
return cur_cache->rt;
} else {
struct rtable *rt;
- rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto, tos);
+ rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto,
+ daddr, saddr, tos);
if (IS_ERR(rt))
return NULL;
@@ -1182,6 +1186,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
struct tnl_cache *cache;
int sent_len = 0;
__be16 frag_off = 0;
+ __be32 daddr;
+ __be32 saddr;
u8 ttl;
u8 inner_tos;
u8 tos;
@@ -1221,11 +1227,21 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (mutable->flags & TNL_F_TOS_INHERIT)
tos = inner_tos;
+ else if (OVS_CB(skb)->tun_key)
+ tos = OVS_CB(skb)->tun_key->ipv4_tos;
else
tos = mutable->tos;
+ if (OVS_CB(skb)->tun_key) {
+ daddr = OVS_CB(skb)->tun_key->ipv4_dst;
+ saddr = OVS_CB(skb)->tun_key->ipv4_src;
+ } else {
+ daddr = mutable->key.daddr;
+ saddr = mutable->key.saddr;
+ }
+
/* Route lookup */
- rt = find_route(vport, mutable, tos, &cache);
+ rt = find_route(vport, mutable, tos, daddr, saddr, &cache);
if (unlikely(!rt))
goto error_free;
if (unlikely(!cache))
@@ -1262,10 +1278,12 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
}
/* TTL */
- ttl = mutable->ttl;
+ if (OVS_CB(skb)->tun_key)
+ ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
+ else
+ ttl = mutable->ttl;
if (!ttl)
ttl = ip4_dst_hoplimit(&rt_dst(rt));
-
if (mutable->flags & TNL_F_TTL_INHERIT) {
if (skb->protocol == htons(ETH_P_IP))
ttl = ip_hdr(skb)->ttl;
@@ -1444,7 +1462,8 @@ static int tnl_set_config(struct net *net, struct nlattr *options,
struct net_device *dev;
struct rtable *rt;
- rt = __find_route(mutable, tnl_ops->ipproto, mutable->tos);
+ rt = __find_route(mutable, tnl_ops->ipproto, mutable->key.daddr,
+ mutable->key.saddr, mutable->tos);
if (IS_ERR(rt))
return -EADDRNOTAVAIL;
dev = rt_dst(rt).dev;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [RFC v4 00/21] Flow Based Tunneling for Open vSwitch
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery
Hi,
This series comprises a fresh batch of proposed changes to introduce
flow-based tunnelling.
At the heart of these changes is the following structure, which
is attached as a pointer to skb->cb.
struct ovs_key_ipv4_tunnel {
__be64 tun_id;
__u32 tun_flags;
__be32 ipv4_src;
__be32 ipv4_dst;
__u8 ipv4_tos;
__u8 ipv4_ttl;
__u8 pad[2];
};
This series does not introduce use of in-tree kernel tunneling code
by Open vSwitch. However, it is intended as preliminary work
for that goal and I believe attaching a structure similar
to the one above to skb->cb could be a mechanism to achieve that.
I have CCed netdev for any comment on that.
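As a rough user-space sketch of the idea (not kernel code), the following
stores a pointer to a tunnel key in a packet's control block. struct
pkt_sketch, pkt_set_tun_key() and pkt_tun_key() are hypothetical names
made up for this example, and the key uses fixed-width types in place of
__be64/__be32/__u8.

```c
#include <stdint.h>
#include <string.h>

/* Layout-only stand-in for struct ovs_key_ipv4_tunnel. */
struct tun_key_sketch {
    uint64_t tun_id;
    uint32_t tun_flags;
    uint32_t ipv4_src;
    uint32_t ipv4_dst;
    uint8_t ipv4_tos;
    uint8_t ipv4_ttl;
    uint8_t pad[2];
};

/* Hypothetical packet with a small opaque control block. */
struct pkt_sketch {
    char cb[48];
};

/* Only the pointer is stored in cb, so it must fit there. */
_Static_assert(sizeof(const struct tun_key_sketch *) <= 48,
               "pointer must fit in the control block");

/* Store a pointer to 'key' (not a copy of it) in the control block. */
static void
pkt_set_tun_key(struct pkt_sketch *pkt, const struct tun_key_sketch *key)
{
    memcpy(pkt->cb, &key, sizeof key);
}

/* Fetch the pointer stored by pkt_set_tun_key(). */
static const struct tun_key_sketch *
pkt_tun_key(const struct pkt_sketch *pkt)
{
    const struct tun_key_sketch *key;

    memcpy(&key, pkt->cb, sizeof key);
    return key;
}
```

Because only a pointer is attached, the key must outlive every use of the
packet's control block. In the receive paths in this series the key is a
local variable of the receive function, so the pointer is only meaningful
for the duration of that call.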
Some details of the implementation follow; they are not
particularly related to the use of in-tree kernel tunneling code.
Overview:
In general the approach that I have taken in user-space is to split
tunneling into realdevs and tundevs. Tunnel realdevs are devices that look
to users like the existing port-based tunnelling implementation. Tunnel
tundevs exist in the datapath and are where tx and rx occur. Tunnel
tundevs have very little configuration and are unable to operate without
flow information that describes at least the remote IP.
Changes:
* Do not attempt to configure a tundev realport; it will fail, which
results in ovs-vswitchd failing to start. I had not noticed this as
ovs-vswitchd will start if there are no tundevs present in the database
when it starts, and I usually test on a fresh install.
* Add a flags field to ovs_key_ipv4_tunnel (above) and use it
to reinstate the functionality of various flags e.g. tunnel checksum,
tunnel out key. Previously these flags were set on the 'mutable' of
a tunnel device in the kernel, however this is no longer appropriate
as a tunnel device may now handle multiple tunnels.
* Cleaned up output and parsing of tunnel flows.
Test Suite enhancements to come.
* Do not use Linux kernel headers in lib/odp-util.c.
This is achieved by defining a new structure flow_tun_key
and using it instead of ovs_key_ipv4_tunnel. The structure
is currently the same internally as ovs_key_ipv4_tunnel.
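To illustrate the flags change listed above: once flags travel with the
tunnel key rather than with the device, a per-packet decision such as the
GRE header length can be derived from the key alone. This is only a
sketch; the SKETCH_F_* bits and the struct are invented for the example
and do not correspond to the real TNL_F_* values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only. */
#define SKETCH_F_CSUM    (1u << 0)   /* add a GRE checksum */
#define SKETCH_F_OUT_KEY (1u << 1)   /* add a GRE key taken from tun_id */

/* Cut-down key holding only the fields this sketch needs. */
struct tun_key_flags_sketch {
    uint64_t tun_id;
    uint32_t tun_flags;
};

/* Length in bytes of the GRE header implied by the key's flags. */
static size_t
gre_hdr_len_sketch(const struct tun_key_flags_sketch *key)
{
    size_t len = 4;                  /* base header: flags + protocol */

    assert(key != NULL);
    if (key->tun_flags & SKETCH_F_CSUM) {
        len += 4;                    /* checksum + reserved */
    }
    if (key->tun_flags & SKETCH_F_OUT_KEY) {
        len += 4;                    /* key */
    }
    return len;
}
```

Two flows sharing one tunnel device can then produce different headers,
which is not possible when the flags live on the device's 'mutable'.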
Limitations:
* In this series, realdevs exist in the kernel although I believe
it should not be necessary for them to do so. The reason that they are
there is to limit the changes that are needed to the user-space netdev
code and to allow review of the series before making those changes.
* PMTU discovery is broken and I'm unsure if it has been fixed.
Jesse Gross suggested that a user-space implementation of MSS clamping would
be a good solution to this. I have made a start on that and sent a
separate email about it.
* The header cache has been removed, but some remnants of the
API remain. In particular the tunnel header is still created and updated,
even though both occur for each transmit. It may make sense to
recombine those calls into a single call if the header cache is
to be permanently removed.
* Multicast could be implemented in user-space but currently isn't.
This means that multicast remote IP for tunneling is broken.
* I have not implemented matches for tun_keys. This means
that the current implementation only provides port-based tunneling
implemented on top of flow-based tunneling. It is not yet possible for a
controller to match on or set the tun_key of flows.
I expect this to be a small body of work to complete.
* The way that I have split the patches is still somewhat arbitrary.
I wanted to avoid one very large patch to aid review. But a lot of the
changes are inter-related, so a bisectable split seems rather difficult.
None the less, the split could be significantly improved.
----------------------------------------------------------------
Simon Horman (21):
datapath: tunnelling: Replace tun_id with tun_key
datapath: Use tun_key on transmit
odp-util: Add tun_key to parse_odp_key_attr()
vswitchd: Add iface_parse_tunnel
vswitchd: Add add_tunnel_ports()
ofproto: Add set_tunnelling()
vswitchd: Configure tunnel interfaces.
ofproto: Add realdev_to_txdev()
ofproto: Add tundev_to_realdev()
classifier: Convert struct flow flow_metadata to use tun_key
datapath, vport: Provide tunnel realdev and tundev classes and vports
lib: Replace commit_set_tun_id_action() with commit_set_tunnel_action()
global: Remove OVS_KEY_ATTR_TUN_ID
ofproto: Set flow tun_key in compose_output_action()
datapath: Remove mlink element from tnl_mutable_config
datapath: remove tunnel cache
datapath: Always use tun_key addresses for route lookup
dataptah: remove ttl and tos from tnl_mutable_config
datapath: Simplify vport lookup
datapath: Use tun_key flags for id and csum settings on transmit
datapath: Always use tun_key flags
datapath/Modules.mk | 3 +-
datapath/actions.c | 6 +-
datapath/datapath.c | 11 +-
datapath/datapath.h | 5 +-
datapath/flow.c | 35 +-
datapath/flow.h | 27 +-
datapath/tunnel.c | 782 +++++-----------------------------------
datapath/tunnel.h | 98 +----
datapath/vport-capwap.c | 45 +--
datapath/vport-gre.c | 62 ++--
datapath/vport-tunnel-realdev.c | 260 +++++++++++++
datapath/vport.c | 3 +-
datapath/vport.h | 1 +
include/linux/openvswitch.h | 24 +-
include/openvswitch/tunnel.h | 4 +
lib/classifier.c | 8 +-
lib/dpif-linux.c | 2 +-
lib/dpif-netdev.c | 2 +-
lib/flow.c | 31 +-
lib/flow.h | 21 +-
lib/meta-flow.c | 4 +-
lib/netdev-vport.c | 333 ++++-------------
lib/nx-match.c | 2 +-
lib/odp-util.c | 72 +++-
lib/odp-util.h | 5 +-
lib/ofp-print.c | 12 +-
lib/ofp-util.c | 4 +-
ofproto/ofproto-dpif.c | 347 ++++++++++++++++--
ofproto/ofproto-provider.h | 12 +
ofproto/ofproto.c | 28 ++
ofproto/ofproto.h | 46 +++
tests/test-classifier.c | 7 +-
vswitchd/bridge.c | 350 ++++++++++++++++++
33 files changed, 1451 insertions(+), 1201 deletions(-)
create mode 100644 datapath/vport-tunnel-realdev.c
^ permalink raw reply
* [PATCH 03/21] odp-util: Add tun_key to parse_odp_key_attr()
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v4
Correct parsing of tunnel key in parse_odp_key_attr()
so that it matches the output of format_odp_key_attr()
TODO: fix test suite
v3
* Initial post
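As a standalone illustration of the sscanf()/%n technique used by the
parser in this patch, here is a minimal sketch. It is not the code from
lib/odp-util.c: the struct and function names are made up, the address
octets are read with plain %u instead of IP_SCAN_FMT, the octets are not
range-checked, and addresses are kept in host byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch. */
struct parsed_tun_key_sketch {
    uint64_t tun_id;
    unsigned long long flags;
    uint32_t src, dst;               /* host byte order in this sketch */
    int tos, ttl;
};

/* Returns the number of characters consumed, or -1 if 's' does not match. */
static int
parse_tunnel_sketch(const char *s, struct parsed_tun_key_sketch *out)
{
    char tun_id_s[32];
    unsigned int a[4], b[4];
    int n = -1;

    /* %n is stored only if everything before it matched, including the
     * closing parenthesis, so 'n > 0' confirms a complete match. */
    if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF]"
               ",flags=%llx,src=%u.%u.%u.%u,dst=%u.%u.%u.%u"
               ",tos=%i,ttl=%i)%n",
               tun_id_s, &out->flags,
               &a[0], &a[1], &a[2], &a[3],
               &b[0], &b[1], &b[2], &b[3],
               &out->tos, &out->ttl, &n) != 12 || n <= 0) {
        return -1;
    }
    out->tun_id = strtoull(tun_id_s, NULL, 0);
    out->src = a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3];
    out->dst = b[0] << 24 | b[1] << 16 | b[2] << 8 | b[3];
    return n;
}
```

Checking the conversion count as well as n is slightly stricter than the
"> 0" test in the patch; either way it is the n > 0 check that rejects
input missing the trailing ")".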
---
lib/odp-util.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 23d1efe..7cff00c 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -925,6 +925,35 @@ parse_odp_key_attr(const char *s, const struct simap *port_names,
}
{
+ ovs_be32 ipv4_src;
+ ovs_be32 ipv4_dst;
+ unsigned long long tun_flags;
+ int ipv4_tos;
+ int ipv4_ttl;
+ int n = -1;
+
+ if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF]"
+ ",flags=%llx,src="IP_SCAN_FMT",dst="IP_SCAN_FMT
+ ",tos=%i,ttl=%i)%n",
+ tun_id_s, &tun_flags,
+ IP_SCAN_ARGS(&ipv4_src), IP_SCAN_ARGS(&ipv4_dst),
+ &ipv4_tos, &ipv4_ttl, &n) > 0
+ && n > 0) {
+ struct ovs_key_ipv4_tunnel tun_key;
+
+ tun_key.tun_id = htonll(strtoull(tun_id_s, NULL, 0));
+ tun_key.tun_flags = tun_flags;
+ tun_key.ipv4_src = ipv4_src;
+ tun_key.ipv4_dst = ipv4_dst;
+ tun_key.ipv4_tos = ipv4_tos;
+ tun_key.ipv4_ttl = ipv4_ttl;
+ nl_msg_put_unspec(key, OVS_KEY_ATTR_IPV4_TUNNEL,
+ &tun_key, sizeof tun_key);
+ return n;
+ }
+ }
+
+ {
unsigned long long int in_port;
int n = -1;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 18/21] dataptah: remove ttl and tos from tnl_mutable_config
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
tun_key should always be present and correct in ovs_tnl_send().
It ought to be possible to handle the ttl entirely
in user-space. This is not implemented yet. However, the
TNL_F_TOS_INHERIT is currently never set.
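The TTL selection that remains in ovs_tnl_send() after this change can be
summarised by the following sketch. The function name and its parameters
are hypothetical; the real code reads these values from the tun_key, the
route and the inner packet, and only the inner IPv4 case is modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick the outer TTL: the key's TTL, else the route's hop limit, and
 * finally the inner packet's TTL if inheritance is requested. */
static uint8_t
select_ttl_sketch(uint8_t key_ttl, uint8_t route_hoplimit,
                  bool ttl_inherit, bool inner_is_ipv4, uint8_t inner_ttl)
{
    uint8_t ttl = key_ttl;

    if (!ttl) {
        ttl = route_hoplimit;    /* no fixed TTL in the key */
    }
    if (ttl_inherit && inner_is_ipv4) {
        ttl = inner_ttl;         /* inherit from the inner header */
    }
    return ttl;
}
```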
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
datapath/tunnel.c | 10 ++--------
datapath/tunnel.h | 4 ----
2 files changed, 2 insertions(+), 12 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index ba18055..39aa2af 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -900,10 +900,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (mutable->flags & TNL_F_TOS_INHERIT)
tos = inner_tos;
- else if (OVS_CB(skb)->tun_key)
- tos = OVS_CB(skb)->tun_key->ipv4_tos;
else
- tos = mutable->tos;
+ tos = OVS_CB(skb)->tun_key->ipv4_tos;
/* Route lookup */
rt = find_route(vport, port_key_get_net(&mutable->key),
@@ -940,11 +938,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
dst_hold(unattached_dst);
}
- /* TTL */
- if (OVS_CB(skb)->tun_key)
- ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
- else
- ttl = mutable->ttl;
+ ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
if (!ttl)
ttl = ip4_dst_hoplimit(&rt_dst(rt));
if (mutable->flags & TNL_F_TTL_INHERIT) {
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index ed3b4ec..330df27 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -99,8 +99,6 @@ static inline void port_key_set_net(struct port_lookup_key *key, struct net *net
* (e.g. ICMP fragmentation needed messages).
* @out_key: Key to use on output, 0 if this tunnel has no fixed output key.
* @flags: TNL_F_* flags.
- * @tos: IPv4 TOS value to use for tunnel, 0 if no fixed TOS.
- * @ttl: IPv4 TTL value to use for tunnel, 0 if no fixed TTL.
*/
struct tnl_mutable_config {
struct port_lookup_key key;
@@ -115,8 +113,6 @@ struct tnl_mutable_config {
/* Configured via OVS_TUNNEL_ATTR_* attributes. */
__be64 out_key;
u32 flags;
- u8 tos;
- u8 ttl;
};
struct tnl_ops {
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 15/21] datapath: Remove mlink element from tnl_mutable_config
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Multicast may be handled in user-space (but isn't yet).
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
datapath/tunnel.c | 22 ----------------------
datapath/tunnel.h | 3 ---
2 files changed, 25 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index f07ec69..cdcb0a7 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -162,21 +162,6 @@ static void free_cache_rcu(struct rcu_head *rcu)
free_cache(c);
}
-/* Frees the portion of 'mutable' that requires RTNL and thus can't happen
- * within an RCU callback. Fortunately this part doesn't require waiting for
- * an RCU grace period.
- */
-static void free_mutable_rtnl(struct tnl_mutable_config *mutable)
-{
- ASSERT_RTNL();
- if (ipv4_is_multicast(mutable->key.daddr) && mutable->mlink) {
- struct in_device *in_dev;
- in_dev = inetdev_by_index(port_key_get_net(&mutable->key), mutable->mlink);
- if (in_dev)
- ip_mc_dec_group(in_dev, mutable->key.daddr);
- }
-}
-
static void assign_config_rcu(struct vport *vport,
struct tnl_mutable_config *new_config)
{
@@ -186,7 +171,6 @@ static void assign_config_rcu(struct vport *vport,
old_config = rtnl_dereference(tnl_vport->mutable);
rcu_assign_pointer(tnl_vport->mutable, new_config);
- free_mutable_rtnl(old_config);
call_rcu(&old_config->rcu, free_config_rcu);
}
@@ -1391,8 +1375,6 @@ static int tnl_set_config(struct net *net,
if (old_vport && old_vport != cur_vport)
return -EEXIST;
- mutable->mlink = 0;
-
return 0;
}
@@ -1445,7 +1427,6 @@ struct vport *ovs_tnl_create(const struct vport_parms *parms,
return vport;
error_free_mutable:
- free_mutable_rtnl(mutable);
kfree(mutable);
error_free_vport:
ovs_vport_free(vport);
@@ -1470,7 +1451,6 @@ void ovs_tnl_destroy(struct vport *vport)
mutable = rtnl_dereference(tnl_vport->mutable);
port_table_remove_port(vport);
- free_mutable_rtnl(mutable);
call_rcu(&tnl_vport->rcu, free_port_rcu);
}
@@ -1484,8 +1464,6 @@ int ovs_tnl_set_addr(struct vport *vport, const unsigned char *addr)
if (!mutable)
return -ENOMEM;
- old_mutable->mlink = 0;
-
memcpy(mutable->eth_addr, addr, ETH_ALEN);
assign_config_rcu(vport, mutable);
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index 7d78297..0af27ac 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -117,9 +117,6 @@ struct tnl_mutable_config {
u32 flags;
u8 tos;
u8 ttl;
-
- /* Multicast configuration. */
- int mlink;
};
struct tnl_ops {
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 14/21] ofproto: Set flow tun_key in compose_output_action()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
In essence this attaches the tun_key, if any,
to the output processing of a packet. This allows
the packet to be transmitted using flow-based
tunneling as necessary.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* Set tun_flags field of flow.tun_key
* Remove debugging message
v3
* Initial release
datapath: Add flags to ovs_key_ipv4_tunnel
Add flags to ovs_key_ipv4_tunnel and set from
the tunnel's realdev flags. This allows the datapath
to have access to flags on transmit which can be
used to affect the transmission - e.g. add a tunnel id.
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
ofproto/ofproto-dpif.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 2a52f37..b1354a2 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4919,8 +4919,17 @@ compose_output_action__(struct action_xlate_ctx *ctx, uint16_t ofp_port,
}
out_port = realdev_to_txdev(ctx->ofproto, ofport, ctx->flow.vlan_tci);
- if (out_port != odp_port && !ofport->tun) {
- ctx->flow.vlan_tci = htons(0);
+ if (out_port != odp_port) {
+ if (ofport->tun) {
+ ctx->flow.tun_key.tun_id = ofport->tun->s.out_key;
+ ctx->flow.tun_key.tun_flags = ofport->tun->s.flags;
+ ctx->flow.tun_key.ipv4_src = ofport->tun->s.saddr;
+ ctx->flow.tun_key.ipv4_dst = ofport->tun->s.daddr;
+ ctx->flow.tun_key.ipv4_tos = ofport->tun->s.tos;
+ ctx->flow.tun_key.ipv4_ttl = ofport->tun->s.ttl;
+ } else {
+ ctx->flow.vlan_tci = htons(0);
+ }
}
commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions);
nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port);
@@ -5576,7 +5585,7 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
ctx->ofproto = ofproto;
ctx->flow = *flow;
ctx->base_flow = ctx->flow;
- ctx->base_flow.tun_key.ipv4_src = 0;
+ ctx->base_flow.tun_key.ipv4_src = htonl(0);
ctx->base_flow.vlan_tci = initial_tci;
ctx->rule = rule;
ctx->packet = packet;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 13/21] global: Remove OVS_KEY_ATTR_TUN_ID
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
OVS_KEY_ATTR_TUN_ID may now be removed as it is
no longer used in any meaningful way.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
datapath/datapath.c | 1 -
datapath/flow.c | 1 -
include/linux/openvswitch.h | 1 -
lib/dpif-netdev.c | 1 -
lib/odp-util.c | 18 ------------------
5 files changed, 22 deletions(-)
diff --git a/datapath/datapath.c b/datapath/datapath.c
index 65dfe79..dcff4c6 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -590,7 +590,6 @@ static int validate_set(const struct nlattr *a,
const struct ovs_key_ipv4_tunnel *tun_key;
case OVS_KEY_ATTR_PRIORITY:
- case OVS_KEY_ATTR_TUN_ID:
case OVS_KEY_ATTR_ETHERNET:
break;
diff --git a/datapath/flow.c b/datapath/flow.c
index 49c0dd8..9c898c6 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -847,7 +847,6 @@ const int ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
[OVS_KEY_ATTR_ND] = sizeof(struct ovs_key_nd),
/* Not upstream. */
- [OVS_KEY_ATTR_TUN_ID] = sizeof(__be64),
[OVS_KEY_ATTR_IPV4_TUNNEL] = sizeof(struct ovs_key_ipv4_tunnel),
};
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index f2d56ec..9de3f20 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -279,7 +279,6 @@ enum ovs_key_attr {
OVS_KEY_ATTR_ICMPV6, /* struct ovs_key_icmpv6 */
OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */
OVS_KEY_ATTR_ND, /* struct ovs_key_nd */
- OVS_KEY_ATTR_TUN_ID, /* be64 tunnel ID */
OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */
__OVS_KEY_ATTR_MAX
};
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d065a3a..ff00e05 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1162,7 +1162,6 @@ execute_set_action(struct ofpbuf *packet, const struct nlattr *a)
const struct ovs_key_udp *udp_key;
switch (type) {
- case OVS_KEY_ATTR_TUN_ID:
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_IPV6:
case OVS_KEY_ATTR_IPV4_TUNNEL:
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 11b7a1b..d1fe9d8 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -105,7 +105,6 @@ ovs_key_attr_to_string(enum ovs_key_attr attr)
case OVS_KEY_ATTR_ICMPV6: return "icmpv6";
case OVS_KEY_ATTR_ARP: return "arp";
case OVS_KEY_ATTR_ND: return "nd";
- case OVS_KEY_ATTR_TUN_ID: return "tun_id";
case OVS_KEY_ATTR_IPV4_TUNNEL: return "ipv4_tunnel";
case __OVS_KEY_ATTR_MAX:
@@ -602,7 +601,6 @@ odp_flow_key_attr_len(uint16_t type)
switch ((enum ovs_key_attr) type) {
case OVS_KEY_ATTR_ENCAP: return -2;
case OVS_KEY_ATTR_PRIORITY: return 4;
- case OVS_KEY_ATTR_TUN_ID: return 8;
case OVS_KEY_ATTR_IN_PORT: return 4;
case OVS_KEY_ATTR_ETHERNET: return sizeof(struct ovs_key_ethernet);
case OVS_KEY_ATTR_VLAN: return sizeof(ovs_be16);
@@ -697,10 +695,6 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds)
ds_put_format(ds, "(%"PRIu32")", nl_attr_get_u32(a));
break;
- case OVS_KEY_ATTR_TUN_ID:
- ds_put_format(ds, "(%#"PRIx64")", ntohll(nl_attr_get_be64(a)));
- break;
-
case OVS_KEY_ATTR_IPV4_TUNNEL:
ipv4_tun_key = nl_attr_get(a);
ds_put_format(ds, "(tun_id=%"PRIx64",flags=%"PRIx32
@@ -913,18 +907,6 @@ parse_odp_key_attr(const char *s, const struct simap *port_names,
}
{
- char tun_id_s[32];
- int n = -1;
-
- if (sscanf(s, "tun_id(%31[x0123456789abcdefABCDEF])%n",
- tun_id_s, &n) > 0 && n > 0) {
- uint64_t tun_id = strtoull(tun_id_s, NULL, 0);
- nl_msg_put_be64(key, OVS_KEY_ATTR_TUN_ID, htonll(tun_id));
- return n;
- }
- }
-
- {
ovs_be32 ipv4_src;
ovs_be32 ipv4_dst;
unsigned long long tun_flags;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 12/21] lib: Replace commit_set_tun_id_action() with commit_set_tunnel_action()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
include/linux/openvswitch.h | 11 +++++++++++
lib/odp-util.c | 12 ++++++------
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index 87a3e22..f2d56ec 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -372,6 +372,17 @@ struct ovs_key_ipv4_tunnel {
__u8 pad[2];
};
+static inline int
+ovs_key_ipv4_tunnel_equal(const struct ovs_key_ipv4_tunnel *a,
+ const struct ovs_key_ipv4_tunnel *b)
+{
+ return a->ipv4_dst == b->ipv4_dst &&
+ a->tun_id == b->tun_id &&
+ a->ipv4_src == b->ipv4_src &&
+ a->ipv4_tos == b->ipv4_tos &&
+ a->ipv4_ttl == b->ipv4_ttl;
+}
+
/**
* enum ovs_flow_attr - attributes for %OVS_FLOW_* commands.
* @OVS_FLOW_ATTR_KEY: Nested %OVS_KEY_ATTR_* attributes specifying the flow
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 5f76f5e..11b7a1b 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -1892,16 +1892,16 @@ commit_set_action(struct ofpbuf *odp_actions, enum ovs_key_attr key_type,
}
static void
-commit_set_tun_id_action(const struct flow *flow, struct flow *base,
+commit_set_tunnel_action(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions)
{
- if (base->tun_key.tun_id == flow->tun_key.tun_id) {
+ if (ovs_key_ipv4_tunnel_equal(&base->tun_key, &flow->tun_key)) {
return;
}
- base->tun_key.tun_id = flow->tun_key.tun_id;
+ base->tun_key = flow->tun_key;
- commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID,
- &base->tun_key.tun_id, sizeof(base->tun_key.tun_id));
+ commit_set_action(odp_actions, OVS_KEY_ATTR_IPV4_TUNNEL,
+ &base->tun_key, sizeof(base->tun_key));
}
static void
@@ -2072,7 +2072,7 @@ void
commit_odp_actions(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions)
{
- commit_set_tun_id_action(flow, base, odp_actions);
+ commit_set_tunnel_action(flow, base, odp_actions);
commit_set_ether_addr_action(flow, base, odp_actions);
commit_vlan_action(flow, base, odp_actions);
commit_set_nw_action(flow, base, odp_actions);
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 11/21] datapath, vport: Provide tunnel realdev and tundev classes and vports
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
On the user-space side of things, the existing tunnel classes become tunnel
realdev classes and new classes are added to provide tunnel tundevs.
On the datapath side of things, the existing tunnel vports are used as
tundev vports. A new vport is added for tunnel realdevs.
It should be possible to remove realdevs entirely from the datapath,
however that requires teaching the user-space netdev to exclude them from
kernel-related operations. I have avoided that at this time in order to
allow review of other aspects of the approach taken in my flow-based
tunneling prototype.
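As a rough illustration of the realdev/tundev split described above, the
sketch below shows how a single realdev vport type could be mapped to and
from the user-visible netdev type names using only the flags attribute.
It is not the code from this patch: the function names are hypothetical,
TNL_F_CAPWAP uses the value this patch adds to openvswitch/tunnel.h, and
the TNL_F_IPSEC value is an assumed stand-in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value added by this patch to include/openvswitch/tunnel.h. */
#define SKETCH_TNL_F_CAPWAP (1u << 10)
/* Assumed stand-in; the real TNL_F_IPSEC value is not shown in this patch. */
#define SKETCH_TNL_F_IPSEC  (1u << 7)

/* Hypothetical helper: choose realdev flags from the user-space type name,
 * in the spirit of the reduced parse_tunnel_config(). */
static uint32_t
sketch_realdev_flags(const char *type)
{
    uint32_t flags = 0;

    if (!strcmp(type, "ipsec_gre")) {
        flags |= SKETCH_TNL_F_IPSEC;
    } else if (!strcmp(type, "capwap")) {
        flags |= SKETCH_TNL_F_CAPWAP;
    }
    return flags;
}

/* Hypothetical helper: recover the netdev type name from realdev flags,
 * in the spirit of the OVS_VPORT_TYPE_TUNNEL_REALDEV case in
 * netdev_vport_get_netdev_type(). */
static const char *
sketch_realdev_type(uint32_t flags)
{
    if (flags & SKETCH_TNL_F_CAPWAP) {
        return "capwap";
    } else if (flags & SKETCH_TNL_F_IPSEC) {
        return "ipsec_gre";
    }
    return "gre";
}
```

The point of the round trip is that "gre", "ipsec_gre" and "capwap" can all
share one datapath vport type, with the flags alone distinguishing them.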
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
--
v4
* Tunnel tundevs should have a NULL set_config callback as their
parse_config callback is NULL. Otherwise, reconfiguration will fail and
ovs-vswitchd will exit if started with tundevs already configured.
* Remove unparse_tunnel_config, it is not used
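The NULL-callback handling noted for v4 can be sketched roughly as follows.
This is a simplified model rather than the netdev-vport.c code: the struct,
the function names and the integer "options" value are all hypothetical,
and only the guard around an optional parse_config callback is meant to
mirror the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for vport_class->parse_config. */
typedef int (*sketch_parse_config_fn)(const char *name, int *options_out);

/* Hypothetical, trimmed-down stand-in for struct vport_class. */
struct sketch_vport_class {
    const char *type;
    sketch_parse_config_fn parse_config;   /* NULL for tundev classes. */
};

/* Calls parse_config only when the class provides one, so a class without
 * the callback is treated as "nothing to parse" rather than dereferencing
 * a NULL function pointer. */
static int
sketch_set_config(const struct sketch_vport_class *class, const char *name,
                  int *options_out)
{
    int error = 0;

    *options_out = 0;
    if (class->parse_config) {
        error = class->parse_config(name, options_out);
    }
    return error;
}

/* Hypothetical parser used only to exercise the non-NULL path. */
static int
sketch_parse_ok(const char *name, int *options_out)
{
    (void) name;
    *options_out = 42;
    return 0;
}
```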
v3
* Initial Post
remove unparse_tunnel_config
---
datapath/Modules.mk | 3 +-
datapath/tunnel.c | 158 +------------------
datapath/vport-capwap.c | 2 -
datapath/vport-gre.c | 2 -
datapath/vport-tunnel-realdev.c | 260 +++++++++++++++++++++++++++++++
datapath/vport.c | 1 +
datapath/vport.h | 1 +
include/linux/openvswitch.h | 1 +
include/openvswitch/tunnel.h | 2 +
lib/netdev-vport.c | 333 +++++++++-------------------------------
10 files changed, 343 insertions(+), 420 deletions(-)
create mode 100644 datapath/vport-tunnel-realdev.c
diff --git a/datapath/Modules.mk b/datapath/Modules.mk
index 24c1075..9aed4c3 100644
--- a/datapath/Modules.mk
+++ b/datapath/Modules.mk
@@ -26,7 +26,8 @@ openvswitch_sources = \
vport-gre.c \
vport-internal_dev.c \
vport-netdev.c \
- vport-patch.c
+ vport-patch.c \
+ vport-tunnel-realdev.c
openvswitch_headers = \
checksum.h \
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 61add96..f07ec69 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -250,21 +250,6 @@ static void port_table_add_port(struct vport *vport)
(*find_port_pool(rtnl_dereference(tnl_vport->mutable)))++;
}
-static void port_table_move_port(struct vport *vport,
- struct tnl_mutable_config *new_mutable)
-{
- struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- u32 hash;
-
- hash = port_hash(&new_mutable->key);
- hlist_del_init_rcu(&tnl_vport->hash_node);
- hlist_add_head_rcu(&tnl_vport->hash_node, find_bucket(hash));
-
- (*find_port_pool(rtnl_dereference(tnl_vport->mutable)))--;
- assign_config_rcu(vport, new_mutable);
- (*find_port_pool(rtnl_dereference(tnl_vport->mutable)))++;
-}
-
static void port_table_remove_port(struct vport *vport)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
@@ -1381,71 +1366,20 @@ out:
return sent_len;
}
-static const struct nla_policy tnl_policy[OVS_TUNNEL_ATTR_MAX + 1] = {
- [OVS_TUNNEL_ATTR_FLAGS] = { .type = NLA_U32 },
- [OVS_TUNNEL_ATTR_DST_IPV4] = { .type = NLA_U32 },
- [OVS_TUNNEL_ATTR_SRC_IPV4] = { .type = NLA_U32 },
- [OVS_TUNNEL_ATTR_OUT_KEY] = { .type = NLA_U64 },
- [OVS_TUNNEL_ATTR_IN_KEY] = { .type = NLA_U64 },
- [OVS_TUNNEL_ATTR_TOS] = { .type = NLA_U8 },
- [OVS_TUNNEL_ATTR_TTL] = { .type = NLA_U8 },
-};
-
/* Sets OVS_TUNNEL_ATTR_* fields in 'mutable', which must initially be
* zeroed. */
-static int tnl_set_config(struct net *net, struct nlattr *options,
+static int tnl_set_config(struct net *net,
const struct tnl_ops *tnl_ops,
const struct vport *cur_vport,
struct tnl_mutable_config *mutable)
{
const struct vport *old_vport;
const struct tnl_mutable_config *old_mutable;
- struct nlattr *a[OVS_TUNNEL_ATTR_MAX + 1];
- int err;
-
- if (!options)
- return -EINVAL;
-
- err = nla_parse_nested(a, OVS_TUNNEL_ATTR_MAX, options, tnl_policy);
- if (err)
- return err;
-
- if (!a[OVS_TUNNEL_ATTR_FLAGS] || !a[OVS_TUNNEL_ATTR_DST_IPV4])
- return -EINVAL;
-
- mutable->flags = nla_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_PUBLIC;
+ mutable->flags = 0;
port_key_set_net(&mutable->key, net);
- mutable->key.daddr = nla_get_be32(a[OVS_TUNNEL_ATTR_DST_IPV4]);
- if (a[OVS_TUNNEL_ATTR_SRC_IPV4]) {
- if (ipv4_is_multicast(mutable->key.daddr))
- return -EINVAL;
- mutable->key.saddr = nla_get_be32(a[OVS_TUNNEL_ATTR_SRC_IPV4]);
- }
-
- if (a[OVS_TUNNEL_ATTR_TOS]) {
- mutable->tos = nla_get_u8(a[OVS_TUNNEL_ATTR_TOS]);
- /* Reject ToS config with ECN bits set. */
- if (mutable->tos & INET_ECN_MASK)
- return -EINVAL;
- }
-
- if (a[OVS_TUNNEL_ATTR_TTL])
- mutable->ttl = nla_get_u8(a[OVS_TUNNEL_ATTR_TTL]);
-
+ mutable->key.daddr = htonl(0);
mutable->key.tunnel_type = tnl_ops->tunnel_type;
- if (!a[OVS_TUNNEL_ATTR_IN_KEY]) {
- mutable->key.tunnel_type |= TNL_T_KEY_MATCH;
- mutable->flags |= TNL_F_IN_KEY_MATCH;
- } else {
- mutable->key.tunnel_type |= TNL_T_KEY_EXACT;
- mutable->key.in_key = nla_get_be64(a[OVS_TUNNEL_ATTR_IN_KEY]);
- }
-
- if (!a[OVS_TUNNEL_ATTR_OUT_KEY])
- mutable->flags |= TNL_F_OUT_KEY_ACTION;
- else
- mutable->out_key = nla_get_be64(a[OVS_TUNNEL_ATTR_OUT_KEY]);
mutable->tunnel_hlen = tnl_ops->hdr_len(mutable);
if (mutable->tunnel_hlen < 0)
@@ -1458,21 +1392,6 @@ static int tnl_set_config(struct net *net, struct nlattr *options,
return -EEXIST;
mutable->mlink = 0;
- if (ipv4_is_multicast(mutable->key.daddr)) {
- struct net_device *dev;
- struct rtable *rt;
-
- rt = __find_route(mutable, tnl_ops->ipproto, mutable->tos,
- mutable->key.daddr, mutable->key.saddr);
- if (IS_ERR(rt))
- return -EADDRNOTAVAIL;
- dev = rt_dst(rt).dev;
- ip_rt_put(rt);
- if (__in_dev_get_rtnl(dev) == NULL)
- return -EADDRNOTAVAIL;
- mutable->mlink = dev->ifindex;
- ip_mc_inc_group(__in_dev_get_rtnl(dev), mutable->key.daddr);
- }
return 0;
}
@@ -1509,8 +1428,7 @@ struct vport *ovs_tnl_create(const struct vport_parms *parms,
get_random_bytes(&initial_frag_id, sizeof(int));
atomic_set(&tnl_vport->frag_id, initial_frag_id);
- err = tnl_set_config(ovs_dp_get_net(parms->dp), parms->options, tnl_ops,
- NULL, mutable);
+ err = tnl_set_config(ovs_dp_get_net(parms->dp), tnl_ops, NULL, mutable);
if (err)
goto error_free_mutable;
@@ -1535,74 +1453,6 @@ error:
return ERR_PTR(err);
}
-int ovs_tnl_set_options(struct vport *vport, struct nlattr *options)
-{
- struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- const struct tnl_mutable_config *old_mutable;
- struct tnl_mutable_config *mutable;
- int err;
-
- mutable = kzalloc(sizeof(struct tnl_mutable_config), GFP_KERNEL);
- if (!mutable) {
- err = -ENOMEM;
- goto error;
- }
-
- /* Copy fields whose values should be retained. */
- old_mutable = rtnl_dereference(tnl_vport->mutable);
- mutable->seq = old_mutable->seq + 1;
- memcpy(mutable->eth_addr, old_mutable->eth_addr, ETH_ALEN);
-
- /* Parse the others configured by userspace. */
- err = tnl_set_config(ovs_dp_get_net(vport->dp), options, tnl_vport->tnl_ops,
- vport, mutable);
- if (err)
- goto error_free;
-
- if (port_hash(&mutable->key) != port_hash(&old_mutable->key))
- port_table_move_port(vport, mutable);
- else
- assign_config_rcu(vport, mutable);
-
- return 0;
-
-error_free:
- free_mutable_rtnl(mutable);
- kfree(mutable);
-error:
- return err;
-}
-
-int ovs_tnl_get_options(const struct vport *vport, struct sk_buff *skb)
-{
- const struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- const struct tnl_mutable_config *mutable = rcu_dereference_rtnl(tnl_vport->mutable);
-
- if (nla_put_u32(skb, OVS_TUNNEL_ATTR_FLAGS,
- mutable->flags & TNL_F_PUBLIC) ||
- nla_put_be32(skb, OVS_TUNNEL_ATTR_DST_IPV4, mutable->key.daddr))
- goto nla_put_failure;
-
- if (!(mutable->flags & TNL_F_IN_KEY_MATCH) &&
- nla_put_be64(skb, OVS_TUNNEL_ATTR_IN_KEY, mutable->key.in_key))
- goto nla_put_failure;
- if (!(mutable->flags & TNL_F_OUT_KEY_ACTION) &&
- nla_put_be64(skb, OVS_TUNNEL_ATTR_OUT_KEY, mutable->out_key))
- goto nla_put_failure;
- if (mutable->key.saddr &&
- nla_put_be32(skb, OVS_TUNNEL_ATTR_SRC_IPV4, mutable->key.saddr))
- goto nla_put_failure;
- if (mutable->tos && nla_put_u8(skb, OVS_TUNNEL_ATTR_TOS, mutable->tos))
- goto nla_put_failure;
- if (mutable->ttl && nla_put_u8(skb, OVS_TUNNEL_ATTR_TTL, mutable->ttl))
- goto nla_put_failure;
-
- return 0;
-
-nla_put_failure:
- return -EMSGSIZE;
-}
-
static void free_port_rcu(struct rcu_head *rcu)
{
struct tnl_vport *tnl_vport = container_of(rcu,
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index 1e08d5a..f26a7d2 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -835,8 +835,6 @@ const struct vport_ops ovs_capwap_vport_ops = {
.set_addr = ovs_tnl_set_addr,
.get_name = ovs_tnl_get_name,
.get_addr = ovs_tnl_get_addr,
- .get_options = ovs_tnl_get_options,
- .set_options = ovs_tnl_set_options,
.get_dev_flags = ovs_vport_gen_get_dev_flags,
.is_running = ovs_vport_gen_is_running,
.get_operstate = ovs_vport_gen_get_operstate,
diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index fd2b038..f610097 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -415,8 +415,6 @@ const struct vport_ops ovs_gre_vport_ops = {
.set_addr = ovs_tnl_set_addr,
.get_name = ovs_tnl_get_name,
.get_addr = ovs_tnl_get_addr,
- .get_options = ovs_tnl_get_options,
- .set_options = ovs_tnl_set_options,
.get_dev_flags = ovs_vport_gen_get_dev_flags,
.is_running = ovs_vport_gen_is_running,
.get_operstate = ovs_vport_gen_get_operstate,
diff --git a/datapath/vport-tunnel-realdev.c b/datapath/vport-tunnel-realdev.c
new file mode 100644
index 0000000..6225f70
--- /dev/null
+++ b/datapath/vport-tunnel-realdev.c
@@ -0,0 +1,260 @@
+/*
+ * Copyright (c) 2012 Horms Solution Ltd.
+ *
+ * Based on vport-patch.c
+ *
+ * Copyright (c) 2007-2012 Nicira, Inc.
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/jhash.h>
+#include <linux/list.h>
+#include <linux/rtnetlink.h>
+#include <net/net_namespace.h>
+
+#include "compat.h"
+#include "datapath.h"
+#include "vport.h"
+#include "vport-generic.h"
+
+struct realdev_config {
+ struct rcu_head rcu;
+
+ unsigned char eth_addr[ETH_ALEN];
+ __be32 daddr;
+ u32 flags;
+};
+
+struct realdev_vport {
+ struct rcu_head rcu;
+
+ char name[IFNAMSIZ];
+
+ struct realdev_config __rcu *realdevconf;
+};
+
+static struct realdev_vport *realdev_vport_priv(const struct vport *vport)
+{
+ return vport_priv(vport);
+}
+
+/* RCU callback. */
+static void free_config(struct rcu_head *rcu)
+{
+ struct realdev_config *c = container_of(rcu, struct realdev_config, rcu);
+ kfree(c);
+}
+
+static void assign_config_rcu(struct vport *vport,
+ struct realdev_config *new_config)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *old_config;
+
+ old_config = rtnl_dereference(realdev_vport->realdevconf);
+ rcu_assign_pointer(realdev_vport->realdevconf, new_config);
+ call_rcu(&old_config->rcu, free_config);
+}
+
+static int realdev_init(void)
+{
+ return 0;
+}
+
+static void realdev_exit(void)
+{
+}
+
+static const struct nla_policy realdev_policy[OVS_TUNNEL_ATTR_MAX + 1] = {
+ [OVS_TUNNEL_ATTR_FLAGS] = { .type = NLA_U32 },
+ [OVS_TUNNEL_ATTR_DST_IPV4] = { .type = NLA_U32 },
+};
+
+static int realdev_set_config(struct vport *vport, const struct nlattr *options,
+ struct realdev_config *realdevconf)
+{
+ struct nlattr *a[OVS_TUNNEL_ATTR_MAX + 1];
+ int err;
+
+ if (!options)
+ return -EINVAL;
+
+ err = nla_parse_nested(a, OVS_TUNNEL_ATTR_MAX, options, realdev_policy);
+ if (err)
+ return err;
+
+ if (!a[OVS_TUNNEL_ATTR_FLAGS] || !a[OVS_TUNNEL_ATTR_DST_IPV4])
+ return -EINVAL;
+
+ realdevconf->flags = nla_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]);
+ realdevconf->daddr = nla_get_u32(a[OVS_TUNNEL_ATTR_DST_IPV4]);
+
+ return 0;
+}
+
+
+static struct vport *realdev_create(const struct vport_parms *parms)
+{
+ struct vport *vport;
+ struct realdev_vport *realdev_vport;
+ struct realdev_config *realdevconf;
+ int err;
+
+ vport = ovs_vport_alloc(sizeof(struct realdev_vport),
+ &ovs_tunnel_realdev_vport_ops, parms);
+ if (IS_ERR(vport)) {
+ err = PTR_ERR(vport);
+ goto error;
+ }
+
+ realdev_vport = realdev_vport_priv(vport);
+
+ strcpy(realdev_vport->name, parms->name);
+
+ realdevconf = kmalloc(sizeof(struct realdev_config), GFP_KERNEL);
+ if (!realdevconf) {
+ err = -ENOMEM;
+ goto error_free_vport;
+ }
+
+ err = realdev_set_config(vport, parms->options, realdevconf);
+ if (err)
+ goto error_free_realdevconf;
+
+ random_ether_addr(realdevconf->eth_addr);
+
+ rcu_assign_pointer(realdev_vport->realdevconf, realdevconf);
+
+ return vport;
+
+error_free_realdevconf:
+ kfree(realdevconf);
+error_free_vport:
+ ovs_vport_free(vport);
+error:
+ return ERR_PTR(err);
+}
+
+static void free_port_rcu(struct rcu_head *rcu)
+{
+ struct realdev_vport *realdev_vport = container_of(rcu,
+ struct realdev_vport, rcu);
+
+ kfree((struct realdev_config __force *)realdev_vport->realdevconf);
+ ovs_vport_free(vport_from_priv(realdev_vport));
+}
+
+static void realdev_destroy(struct vport *vport)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ call_rcu(&realdev_vport->rcu, free_port_rcu);
+}
+
+static int realdev_set_addr(struct vport *vport, const unsigned char *addr)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *realdevconf;
+
+ realdevconf = kmemdup(rtnl_dereference(realdev_vport->realdevconf),
+ sizeof(struct realdev_config), GFP_KERNEL);
+ if (!realdevconf)
+ return -ENOMEM;
+
+ memcpy(realdevconf->eth_addr, addr, ETH_ALEN);
+ assign_config_rcu(vport, realdevconf);
+
+ return 0;
+}
+
+static int realdev_set_options(struct vport *vport, struct nlattr *options)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *realdevconf;
+ int err;
+
+ realdevconf = kmemdup(rtnl_dereference(realdev_vport->realdevconf),
+ sizeof(struct realdev_config), GFP_KERNEL);
+ if (!realdevconf) {
+ err = -ENOMEM;
+ goto error;
+ }
+
+ err = realdev_set_config(vport, options, realdevconf);
+ if (err)
+ goto error_free;
+
+ assign_config_rcu(vport, realdevconf);
+
+ return 0;
+error_free:
+ kfree(realdevconf);
+error:
+ return err;
+}
+
+static const char *realdev_get_name(const struct vport *vport)
+{
+ const struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ return realdev_vport->name;
+}
+
+static const unsigned char *realdev_get_addr(const struct vport *vport)
+{
+ const struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ return rcu_dereference_rtnl(realdev_vport->realdevconf)->eth_addr;
+}
+
+static int realdev_get_options(const struct vport *vport, struct sk_buff *skb)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *realdevconf =
+ rcu_dereference_rtnl(realdev_vport->realdevconf);
+ int err;
+
+ err = nla_put_u32(skb, OVS_TUNNEL_ATTR_FLAGS, realdevconf->flags);
+ if (err)
+ goto error;
+
+ err = nla_put_u32(skb, OVS_TUNNEL_ATTR_DST_IPV4, realdevconf->daddr);
+error:
+ return err;
+}
+
+static int realdev_send(struct vport *vport, struct sk_buff *skb)
+{
+ kfree_skb(skb);
+ ovs_vport_record_error(vport, VPORT_E_TX_DROPPED);
+ return 0;
+}
+
+const struct vport_ops ovs_tunnel_realdev_vport_ops = {
+ .type = OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ .init = realdev_init,
+ .exit = realdev_exit,
+ .create = realdev_create,
+ .destroy = realdev_destroy,
+ .set_addr = realdev_set_addr,
+ .get_name = realdev_get_name,
+ .get_addr = realdev_get_addr,
+ .get_options = realdev_get_options,
+ .set_options = realdev_set_options,
+ .get_dev_flags = ovs_vport_gen_get_dev_flags,
+ .is_running = ovs_vport_gen_is_running,
+ .get_operstate = ovs_vport_gen_get_operstate,
+ .send = realdev_send,
+};
diff --git a/datapath/vport.c b/datapath/vport.c
index 0c77a1b..7759e07 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -44,6 +44,7 @@ static const struct vport_ops *base_vport_ops_list[] = {
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
&ovs_capwap_vport_ops,
#endif
+ &ovs_tunnel_realdev_vport_ops,
};
static const struct vport_ops **vport_ops_list;
diff --git a/datapath/vport.h b/datapath/vport.h
index b0cdeae..893daaf 100644
--- a/datapath/vport.h
+++ b/datapath/vport.h
@@ -257,5 +257,6 @@ extern const struct vport_ops ovs_internal_vport_ops;
extern const struct vport_ops ovs_patch_vport_ops;
extern const struct vport_ops ovs_gre_vport_ops;
extern const struct vport_ops ovs_capwap_vport_ops;
+extern const struct vport_ops ovs_tunnel_realdev_vport_ops;
#endif /* vport.h */
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index c32bb58..87a3e22 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -185,6 +185,7 @@ enum ovs_vport_type {
OVS_VPORT_TYPE_PATCH = 100, /* virtual tunnel connecting two vports */
OVS_VPORT_TYPE_GRE, /* GRE tunnel */
OVS_VPORT_TYPE_CAPWAP, /* CAPWAP tunnel */
+ OVS_VPORT_TYPE_TUNNEL_REALDEV, /* real tunnel device */
__OVS_VPORT_TYPE_MAX
};
diff --git a/include/openvswitch/tunnel.h b/include/openvswitch/tunnel.h
index 5f55ecc..078a940 100644
--- a/include/openvswitch/tunnel.h
+++ b/include/openvswitch/tunnel.h
@@ -74,4 +74,6 @@ enum {
#define TNL_F_IN_KEY (1 << 8) /* Tunnel port has input key. */
#define TNL_F_OUT_KEY (1 << 9) /* Tunnel port has output key. */
+#define TNL_F_CAPWAP (1 << 10)
+
#endif /* openvswitch/tunnel.h */
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index a9eb3eb..7a9803b 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -155,15 +155,24 @@ netdev_vport_get_netdev_type(const struct dpif_linux_vport *vport)
return "patch";
case OVS_VPORT_TYPE_GRE:
- if (tnl_port_config_from_nlattr(vport->options, vport->options_len,
- a)) {
- break;
- }
- return (nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_IPSEC
- ? "ipsec_gre" : "gre");
+ return "gre-tundev";
case OVS_VPORT_TYPE_CAPWAP:
- return "capwap";
+ return "capwap-tundev";
+
+ case OVS_VPORT_TYPE_TUNNEL_REALDEV:
+ if (tnl_port_config_from_nlattr(vport->options,
+ vport->options_len, a)) {
+ return "no-config";
+ }
+
+ if (nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_CAPWAP) {
+ return "capwap";
+ } else if (nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_IPSEC) {
+ return "ipsec_gre";
+ } else {
+ return "gre";
+ }
case __OVS_VPORT_TYPE_MAX:
break;
@@ -248,6 +257,10 @@ netdev_vport_get_config(struct netdev_dev *dev_, struct shash *args)
ofpbuf_delete(buf);
}
+ if (!vport_class->unparse_config) {
+ return 0;
+ }
+
error = vport_class->unparse_config(name, netdev_class->type,
dev->options->data,
dev->options->size,
@@ -267,11 +280,13 @@ netdev_vport_set_config(struct netdev_dev *dev_, const struct shash *args)
struct netdev_dev_vport *dev = netdev_dev_vport_cast(dev_);
const char *name = netdev_dev_get_name(dev_);
struct ofpbuf *options;
- int error;
+ int error = 0;
options = ofpbuf_new(64);
- error = vport_class->parse_config(name, netdev_dev_get_type(dev_),
- args, options);
+ if (vport_class->parse_config) {
+ error = vport_class->parse_config(name, netdev_dev_get_type(dev_),
+ args, options);
+ }
if (!error
&& (!dev->options
|| options->size != dev->options->size
@@ -550,47 +565,18 @@ netdev_vport_poll_notify(const struct netdev *netdev)
\f
/* Code specific to individual vport types. */
-static void
-set_key(const struct shash *args, const char *name, uint16_t type,
- struct ofpbuf *options)
-{
- const char *s;
-
- s = shash_find_data(args, name);
- if (!s) {
- s = shash_find_data(args, "key");
- if (!s) {
- s = "0";
- }
- }
-
- if (!strcmp(s, "flow")) {
- /* This is the default if no attribute is present. */
- } else {
- nl_msg_put_be64(options, type, htonll(strtoull(s, NULL, 0)));
- }
-}
-
static int
parse_tunnel_config(const char *name, const char *type,
const struct shash *args, struct ofpbuf *options)
{
- bool is_gre = false;
- bool is_ipsec = false;
- struct shash_node *node;
- bool ipsec_mech_set = false;
ovs_be32 daddr = htonl(0);
- ovs_be32 saddr = htonl(0);
- uint32_t flags;
-
- flags = TNL_F_DF_DEFAULT | TNL_F_PMTUD | TNL_F_HDR_CACHE;
- if (!strcmp(type, "gre")) {
- is_gre = true;
- } else if (!strcmp(type, "ipsec_gre")) {
- is_gre = true;
- is_ipsec = true;
+ struct shash_node *node;
+ uint32_t flags = 0;
+
+ if (!strcmp(type, "ipsec_gre")) {
flags |= TNL_F_IPSEC;
- flags &= ~TNL_F_HDR_CACHE;
+ } else if (!strcmp(type, "capwap")) {
+ flags |= TNL_F_CAPWAP;
}
SHASH_FOR_EACH (node, args) {
@@ -601,112 +587,9 @@ parse_tunnel_config(const char *name, const char *type,
} else {
daddr = in_addr.s_addr;
}
- } else if (!strcmp(node->name, "local_ip")) {
- struct in_addr in_addr;
- if (lookup_ip(node->data, &in_addr)) {
- VLOG_WARN("%s: bad %s 'local_ip'", name, type);
- } else {
- saddr = in_addr.s_addr;
- }
- } else if (!strcmp(node->name, "tos")) {
- if (!strcmp(node->data, "inherit")) {
- flags |= TNL_F_TOS_INHERIT;
- } else {
- char *endptr;
- int tos;
- tos = strtol(node->data, &endptr, 0);
- if (*endptr == '\0') {
- nl_msg_put_u8(options, OVS_TUNNEL_ATTR_TOS, tos);
- }
- }
- } else if (!strcmp(node->name, "ttl")) {
- if (!strcmp(node->data, "inherit")) {
- flags |= TNL_F_TTL_INHERIT;
- } else {
- nl_msg_put_u8(options, OVS_TUNNEL_ATTR_TTL, atoi(node->data));
- }
- } else if (!strcmp(node->name, "csum") && is_gre) {
- if (!strcmp(node->data, "true")) {
- flags |= TNL_F_CSUM;
- }
- } else if (!strcmp(node->name, "df_inherit")) {
- if (!strcmp(node->data, "true")) {
- flags |= TNL_F_DF_INHERIT;
- }
- } else if (!strcmp(node->name, "df_default")) {
- if (!strcmp(node->data, "false")) {
- flags &= ~TNL_F_DF_DEFAULT;
- }
- } else if (!strcmp(node->name, "pmtud")) {
- if (!strcmp(node->data, "false")) {
- flags &= ~TNL_F_PMTUD;
- }
- } else if (!strcmp(node->name, "header_cache")) {
- if (!strcmp(node->data, "false")) {
- flags &= ~TNL_F_HDR_CACHE;
- }
- } else if (!strcmp(node->name, "peer_cert") && is_ipsec) {
- if (shash_find(args, "certificate")) {
- ipsec_mech_set = true;
- } else {
- const char *use_ssl_cert;
-
- /* If the "use_ssl_cert" is true, then "certificate" and
- * "private_key" will be pulled from the SSL table. The
- * use of this option is strongly discouraged, since it
- * will like be removed when multiple SSL configurations
- * are supported by OVS.
- */
- use_ssl_cert = shash_find_data(args, "use_ssl_cert");
- if (!use_ssl_cert || strcmp(use_ssl_cert, "true")) {
- VLOG_ERR("%s: 'peer_cert' requires 'certificate' argument",
- name);
- return EINVAL;
- }
- ipsec_mech_set = true;
- }
- } else if (!strcmp(node->name, "psk") && is_ipsec) {
- ipsec_mech_set = true;
- } else if (is_ipsec
- && (!strcmp(node->name, "certificate")
- || !strcmp(node->name, "private_key")
- || !strcmp(node->name, "use_ssl_cert"))) {
- /* Ignore options not used by the netdev. */
- } else if (!strcmp(node->name, "key") ||
- !strcmp(node->name, "in_key") ||
- !strcmp(node->name, "out_key")) {
- /* Handled separately below. */
- } else {
- VLOG_WARN("%s: unknown %s argument '%s'", name, type, node->name);
}
}
- if (is_ipsec) {
- char *file_name = xasprintf("%s/%s", ovs_rundir(),
- "ovs-monitor-ipsec.pid");
- pid_t pid = read_pidfile(file_name);
- free(file_name);
- if (pid < 0) {
- VLOG_ERR("%s: IPsec requires the ovs-monitor-ipsec daemon",
- name);
- return EINVAL;
- }
-
- if (shash_find(args, "peer_cert") && shash_find(args, "psk")) {
- VLOG_ERR("%s: cannot define both 'peer_cert' and 'psk'", name);
- return EINVAL;
- }
-
- if (!ipsec_mech_set) {
- VLOG_ERR("%s: IPsec requires an 'peer_cert' or psk' argument",
- name);
- return EINVAL;
- }
- }
-
- set_key(args, "in_key", OVS_TUNNEL_ATTR_IN_KEY, options);
- set_key(args, "out_key", OVS_TUNNEL_ATTR_OUT_KEY, options);
-
if (!daddr) {
VLOG_ERR("%s: %s type requires valid 'remote_ip' argument",
name, type);
@@ -714,14 +597,6 @@ parse_tunnel_config(const char *name, const char *type,
}
nl_msg_put_be32(options, OVS_TUNNEL_ATTR_DST_IPV4, daddr);
- if (saddr) {
- if (ip_is_multicast(daddr)) {
- VLOG_WARN("%s: remote_ip is multicast, ignoring local_ip", name);
- } else {
- nl_msg_put_be32(options, OVS_TUNNEL_ATTR_SRC_IPV4, saddr);
- }
- }
-
nl_msg_put_u32(options, OVS_TUNNEL_ATTR_FLAGS, flags);
return 0;
@@ -749,95 +624,6 @@ tnl_port_config_from_nlattr(const struct nlattr *options, size_t options_len,
}
return 0;
}
-
-static uint64_t
-get_be64_or_zero(const struct nlattr *a)
-{
- return a ? ntohll(nl_attr_get_be64(a)) : 0;
-}
-
-static int
-unparse_tunnel_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
- const struct nlattr *options, size_t options_len,
- struct shash *args)
-{
- struct nlattr *a[OVS_TUNNEL_ATTR_MAX + 1];
- ovs_be32 daddr;
- uint32_t flags;
- int error;
-
- error = tnl_port_config_from_nlattr(options, options_len, a);
- if (error) {
- return error;
- }
-
- flags = nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]);
- if (!(flags & TNL_F_HDR_CACHE) == !(flags & TNL_F_IPSEC)) {
- smap_add(args, "header_cache",
- flags & TNL_F_HDR_CACHE ? "true" : "false");
- }
-
- daddr = nl_attr_get_be32(a[OVS_TUNNEL_ATTR_DST_IPV4]);
- shash_add(args, "remote_ip", xasprintf(IP_FMT, IP_ARGS(&daddr)));
-
- if (a[OVS_TUNNEL_ATTR_SRC_IPV4]) {
- ovs_be32 saddr = nl_attr_get_be32(a[OVS_TUNNEL_ATTR_SRC_IPV4]);
- shash_add(args, "local_ip", xasprintf(IP_FMT, IP_ARGS(&saddr)));
- }
-
- if (!a[OVS_TUNNEL_ATTR_IN_KEY] && !a[OVS_TUNNEL_ATTR_OUT_KEY]) {
- smap_add(args, "key", "flow");
- } else {
- uint64_t in_key = get_be64_or_zero(a[OVS_TUNNEL_ATTR_IN_KEY]);
- uint64_t out_key = get_be64_or_zero(a[OVS_TUNNEL_ATTR_OUT_KEY]);
-
- if (in_key && in_key == out_key) {
- shash_add(args, "key", xasprintf("%"PRIu64, in_key));
- } else {
- if (!a[OVS_TUNNEL_ATTR_IN_KEY]) {
- smap_add(args, "in_key", "flow");
- } else if (in_key) {
- shash_add(args, "in_key", xasprintf("%"PRIu64, in_key));
- }
-
- if (!a[OVS_TUNNEL_ATTR_OUT_KEY]) {
- smap_add(args, "out_key", "flow");
- } else if (out_key) {
- shash_add(args, "out_key", xasprintf("%"PRIu64, out_key));
- }
- }
- }
-
- if (flags & TNL_F_TTL_INHERIT) {
- smap_add(args, "tos", "inherit");
- } else if (a[OVS_TUNNEL_ATTR_TTL]) {
- int ttl = nl_attr_get_u8(a[OVS_TUNNEL_ATTR_TTL]);
- shash_add(args, "tos", xasprintf("%d", ttl));
- }
-
- if (flags & TNL_F_TOS_INHERIT) {
- smap_add(args, "tos", "inherit");
- } else if (a[OVS_TUNNEL_ATTR_TOS]) {
- int tos = nl_attr_get_u8(a[OVS_TUNNEL_ATTR_TOS]);
- shash_add(args, "tos", xasprintf("0x%x", tos));
- }
-
- if (flags & TNL_F_CSUM) {
- smap_add(args, "csum", "true");
- }
- if (flags & TNL_F_DF_INHERIT) {
- smap_add(args, "df_inherit", "true");
- }
- if (!(flags & TNL_F_DF_DEFAULT)) {
- smap_add(args, "df_default", "false");
- }
- if (!(flags & TNL_F_PMTUD)) {
- smap_add(args, "pmtud", "false");
- }
-
- return 0;
-}
-
static int
parse_patch_config(const char *name, const char *type OVS_UNUSED,
const struct shash *args, struct ofpbuf *options)
@@ -894,15 +680,17 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
return 0;
}
\f
-#define VPORT_FUNCTIONS(GET_STATUS) \
+#define __VPORT_FUNCTIONS(RUN, WAIT, GET_CONFIG, \
+ SET_CONFIG, SEND, GET_STATS, \
+ SET_STATS, GET_STATUS) \
NULL, \
- netdev_vport_run, \
- netdev_vport_wait, \
+ RUN, \
+ WAIT, \
\
netdev_vport_create, \
netdev_vport_destroy, \
- netdev_vport_get_config, \
- netdev_vport_set_config, \
+ GET_CONFIG, \
+ SET_CONFIG, \
\
netdev_vport_open, \
netdev_vport_close, \
@@ -912,7 +700,7 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
NULL, /* recv_wait */ \
NULL, /* drain */ \
\
- netdev_vport_send, /* send */ \
+ SEND, /* send */ \
NULL, /* send_wait */ \
\
netdev_vport_set_etheraddr, \
@@ -923,8 +711,8 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
NULL, /* get_carrier */ \
NULL, /* get_carrier_resets */ \
NULL, /* get_miimon */ \
- netdev_vport_get_stats, \
- netdev_vport_set_stats, \
+ GET_STATS, \
+ SET_STATS, \
\
NULL, /* get_features */ \
NULL, /* set_advertisements */ \
@@ -953,24 +741,47 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
\
netdev_vport_change_seq
+#define VPORT_FUNCTIONS(SET_CONFIG, GET_STATUS) \
+ __VPORT_FUNCTIONS(netdev_vport_run, \
+ netdev_vport_wait, \
+ netdev_vport_get_config, \
+ SET_CONFIG, \
+ netdev_vport_send, \
+ netdev_vport_get_stats, \
+ netdev_vport_set_stats, \
+ GET_STATUS)
+
+#define VPORT_TUNNEL_REALDEV_FUNCTIONS \
+ __VPORT_FUNCTIONS(NULL, NULL, NULL, \
+ netdev_vport_set_config, \
+ NULL, NULL, NULL, NULL)
+
void
netdev_vport_register(void)
{
static const struct vport_class vport_classes[] = {
- { OVS_VPORT_TYPE_GRE,
- { "gre", VPORT_FUNCTIONS(netdev_vport_get_drv_info) },
- parse_tunnel_config, unparse_tunnel_config },
+ { OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ { "gre", VPORT_TUNNEL_REALDEV_FUNCTIONS },
+ parse_tunnel_config, NULL },
+
+ { OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ { "ipsec_gre", VPORT_TUNNEL_REALDEV_FUNCTIONS },
+ parse_tunnel_config, NULL },
{ OVS_VPORT_TYPE_GRE,
- { "ipsec_gre", VPORT_FUNCTIONS(netdev_vport_get_drv_info) },
- parse_tunnel_config, unparse_tunnel_config },
+ { "gre-tundev", VPORT_FUNCTIONS(NULL, netdev_vport_get_drv_info) },
+ NULL, NULL },
+
+ { OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ { "capwap", VPORT_TUNNEL_REALDEV_FUNCTIONS },
+ parse_tunnel_config, NULL },
{ OVS_VPORT_TYPE_CAPWAP,
- { "capwap", VPORT_FUNCTIONS(netdev_vport_get_drv_info) },
- parse_tunnel_config, unparse_tunnel_config },
+ { "capwap-tundev", VPORT_FUNCTIONS(NULL, netdev_vport_get_drv_info) },
+ NULL, NULL },
{ OVS_VPORT_TYPE_PATCH,
- { "patch", VPORT_FUNCTIONS(NULL) },
+ { "patch", VPORT_FUNCTIONS(netdev_vport_set_config, NULL) },
parse_patch_config, unparse_patch_config }
};
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 10/21] classifier: Convert struct flow flow_metadata to use tun_key
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
This allows the tun_key to be passed throughout user-space,
attached to a flow. This is the essence of flow-based tunneling.
This does not add tun_key to wildcards, other than the existing match for
the tun_id. It is envisaged that most if not all fields of the tun_key
could be wildcarded.
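A minimal sketch of the idea, assuming a tunnel key whose fields mirror
those of ovs_key_ipv4_tunnel: the key is embedded in the flow, and only
tun_id is subject to a wildcard mask, as in the tun_id comparison in
flow_equal_except(). The struct and function names below are hypothetical
and the structs are heavily trimmed; the real struct flow_tun_key is
defined in lib/flow.h by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct flow_tun_key. */
struct sketch_tun_key {
    uint64_t tun_id;      /* Network byte order in the real code. */
    uint32_t ipv4_src;
    uint32_t ipv4_dst;
    uint8_t ipv4_tos;
    uint8_t ipv4_ttl;
};

/* Hypothetical stand-in for struct flow, reduced to the tunnel key. */
struct sketch_flow {
    struct sketch_tun_key tun_key;
};

/* Masked comparison of tun_id only: bits cleared in 'tun_id_mask' are
 * wildcarded.  XOR-and-mask does not depend on byte order, so the same
 * expression works on network-order values. */
static bool
sketch_tun_id_equal_masked(const struct sketch_flow *a,
                           const struct sketch_flow *b,
                           uint64_t tun_id_mask)
{
    return !((a->tun_key.tun_id ^ b->tun_key.tun_id) & tun_id_mask);
}
```

With a zero mask any two flows match on tun_id, and with an all-ones mask
the IDs must be identical; the other tun_key fields are carried along with
the flow but are not yet matched on.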
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* flow_format() and ofp_print_packet_in() format strings:
- Make more consistent with each other and format_odp_key_attr()
- Update for flags field of tunnel
* Remove debugging message
* Add struct flow_tun_key to avoid needing to use
ovs_key_ipv4_tunnel which is defined in a Linux kernel header.
This code should be ofproto-provider agnostic.
v3
* Initial posting
classifier: don't use kernel tunnel structure
---
lib/classifier.c | 8 ++++----
lib/dpif-linux.c | 2 +-
lib/flow.c | 31 ++++++++++++++++++++++++++-----
lib/flow.h | 21 ++++++++++++++++-----
lib/meta-flow.c | 4 ++--
lib/nx-match.c | 2 +-
lib/odp-util.c | 24 ++++++++++++++++--------
lib/ofp-print.c | 12 ++++++++++--
lib/ofp-util.c | 4 ++--
ofproto/ofproto-dpif.c | 11 ++++++-----
tests/test-classifier.c | 7 ++++---
11 files changed, 88 insertions(+), 38 deletions(-)
diff --git a/lib/classifier.c b/lib/classifier.c
index e11a585..7dc6560 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -129,7 +129,7 @@ cls_rule_set_tun_id_masked(struct cls_rule *rule,
ovs_be64 tun_id, ovs_be64 mask)
{
rule->wc.tun_id_mask = mask;
- rule->flow.tun_id = tun_id & mask;
+ rule->flow.tun_key.tun_id = tun_id & mask;
}
void
@@ -563,11 +563,11 @@ cls_rule_format(const struct cls_rule *rule, struct ds *s)
case 0:
break;
case CONSTANT_HTONLL(UINT64_MAX):
- ds_put_format(s, "tun_id=%#"PRIx64",", ntohll(f->tun_id));
+ ds_put_format(s, "tun_id=%#"PRIx64",", ntohll(f->tun_key.tun_id));
break;
default:
ds_put_format(s, "tun_id=%#"PRIx64"/%#"PRIx64",",
- ntohll(f->tun_id), ntohll(wc->tun_id_mask));
+ ntohll(f->tun_key.tun_id), ntohll(wc->tun_id_mask));
break;
}
if (!(w & FWW_IN_PORT)) {
@@ -1187,7 +1187,7 @@ flow_equal_except(const struct flow *a, const struct flow *b,
}
}
- return (!((a->tun_id ^ b->tun_id) & wildcards->tun_id_mask)
+ return (!((a->tun_key.tun_id ^ b->tun_key.tun_id) & wildcards->tun_id_mask)
&& !((a->nw_src ^ b->nw_src) & wildcards->nw_src_mask)
&& !((a->nw_dst ^ b->nw_dst) & wildcards->nw_dst_mask)
&& (wc & FWW_IN_PORT || a->in_port == b->in_port)
diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c
index 256c9d6..0e5cdd2 100644
--- a/lib/dpif-linux.c
+++ b/lib/dpif-linux.c
@@ -1292,7 +1292,7 @@ dpif_linux_vport_send(int dp_ifindex, uint32_t port_no,
uint64_t action;
ofpbuf_use_const(&packet, data, size);
- flow_extract(&packet, 0, htonll(0), 0, &flow);
+ flow_extract(&packet, 0, NULL, 0, &flow);
ofpbuf_use_stack(&key, &keybuf, sizeof keybuf);
odp_flow_key_from_flow(&key, &flow);
diff --git a/lib/flow.c b/lib/flow.c
index fc61610..8645e7d 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -330,7 +330,8 @@ invalid:
* present and has a correct length, and otherwise NULL.
*/
void
-flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id,
+flow_extract(struct ofpbuf *packet, uint32_t skb_priority,
+ const struct flow_tun_key *tun_key,
uint16_t ofp_in_port, struct flow *flow)
{
struct ofpbuf b = *packet;
@@ -339,7 +340,9 @@ flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id,
COVERAGE_INC(flow_extract);
memset(flow, 0, sizeof *flow);
- flow->tun_id = tun_id;
+ if (tun_key) {
+ flow->tun_key = *tun_key;
+ }
flow->in_port = ofp_in_port;
flow->skb_priority = skb_priority;
@@ -449,7 +452,7 @@ flow_zero_wildcards(struct flow *flow, const struct flow_wildcards *wildcards)
for (i = 0; i < FLOW_N_REGS; i++) {
flow->regs[i] &= wildcards->reg_masks[i];
}
- flow->tun_id &= wildcards->tun_id_mask;
+ flow->tun_key.tun_id &= wildcards->tun_id_mask;
flow->nw_src &= wildcards->nw_src_mask;
flow->nw_dst &= wildcards->nw_dst_mask;
if (wc & FWW_IN_PORT) {
@@ -508,7 +511,7 @@ flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd)
{
BUILD_ASSERT_DECL(FLOW_WC_SEQ == 10);
- fmd->tun_id = flow->tun_id;
+ fmd->tun_key = flow->tun_key;
fmd->tun_id_mask = htonll(UINT64_MAX);
memcpy(fmd->regs, flow->regs, sizeof fmd->regs);
@@ -528,11 +531,13 @@ flow_to_string(const struct flow *flow)
void
flow_format(struct ds *ds, const struct flow *flow)
{
+ /* The tunnel key is also displayed as part of tunnel() below.
+ * It is here for backwards-compatibility */
ds_put_format(ds, "priority:%"PRIu32
",tunnel:%#"PRIx64
",in_port:%04"PRIx16,
flow->skb_priority,
- ntohll(flow->tun_id),
+ ntohll(flow->tun_key.tun_id),
flow->in_port);
ds_put_format(ds, ",tci(");
@@ -579,6 +584,22 @@ flow_format(struct ds *ds, const struct flow *flow)
ETH_ADDR_ARGS(flow->arp_sha),
ETH_ADDR_ARGS(flow->arp_tha));
}
+ if (!eth_addr_is_zero(flow->arp_sha) || !eth_addr_is_zero(flow->arp_tha)) {
+ ds_put_format(ds, " arp_ha("ETH_ADDR_FMT"->"ETH_ADDR_FMT")",
+ ETH_ADDR_ARGS(flow->arp_sha),
+ ETH_ADDR_ARGS(flow->arp_tha));
+ }
+ if (flow->tun_key.ipv4_dst != htonl(0)) {
+ ds_put_format(ds, " tunnel(tun_id:%"PRIx64",flags:%"PRIx32
+ ",ip("IP_FMT"->"IP_FMT")"
+ ",tos:%"PRIx8",ttl:%"PRIu8")",
+ ntohll(flow->tun_key.tun_id),
+ flow->tun_key.tun_flags,
+ IP_ARGS(&flow->tun_key.ipv4_src),
+ IP_ARGS(&flow->tun_key.ipv4_dst),
+ flow->tun_key.ipv4_tos, flow->tun_key.ipv4_ttl);
+ }
+
}
void
diff --git a/lib/flow.h b/lib/flow.h
index 7ee9a26..0b5932f 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -52,8 +52,18 @@ BUILD_ASSERT_DECL(FLOW_N_REGS <= NXM_NX_MAX_REGS);
BUILD_ASSERT_DECL(FLOW_NW_FRAG_ANY == NX_IP_FRAG_ANY);
BUILD_ASSERT_DECL(FLOW_NW_FRAG_LATER == NX_IP_FRAG_LATER);
+struct flow_tun_key {
+ ovs_be64 tun_id;
+ uint32_t tun_flags;
+ ovs_be32 ipv4_src;
+ ovs_be32 ipv4_dst;
+ uint8_t ipv4_tos;
+ uint8_t ipv4_ttl;
+ uint8_t pad[2];
+};
+
struct flow {
- ovs_be64 tun_id; /* Encapsulating tunnel ID. */
+ struct flow_tun_key tun_key;/* Encapsulating tunnel. */
struct in6_addr ipv6_src; /* IPv6 source address. */
struct in6_addr ipv6_dst; /* IPv6 destination address. */
struct in6_addr nd_target; /* IPv6 neighbor discovery (ND) target. */
@@ -82,7 +92,7 @@ struct flow {
* indicate which metadata fields are relevant in a given context. Typically
* they will be all 1 or all 0. */
struct flow_metadata {
- ovs_be64 tun_id; /* Encapsulating tunnel ID. */
+ struct flow_tun_key tun_key; /* Encapsulating tunnel. */
ovs_be64 tun_id_mask; /* 1-bit in each significant tun_id bit.*/
uint32_t regs[FLOW_N_REGS]; /* Registers. */
@@ -93,16 +103,17 @@ struct flow_metadata {
/* Assert that there are FLOW_SIG_SIZE bytes of significant data in "struct
* flow", followed by FLOW_PAD_SIZE bytes of padding. */
-#define FLOW_SIG_SIZE (110 + FLOW_N_REGS * 4)
+#define FLOW_SIG_SIZE (126 + FLOW_N_REGS * 4)
#define FLOW_PAD_SIZE 2
BUILD_ASSERT_DECL(offsetof(struct flow, nw_frag) == FLOW_SIG_SIZE - 1);
BUILD_ASSERT_DECL(sizeof(((struct flow *)0)->nw_frag) == 1);
BUILD_ASSERT_DECL(sizeof(struct flow) == FLOW_SIG_SIZE + FLOW_PAD_SIZE);
/* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
-BUILD_ASSERT_DECL(FLOW_SIG_SIZE == 142 && FLOW_WC_SEQ == 10);
+BUILD_ASSERT_DECL(FLOW_SIG_SIZE == 158 && FLOW_WC_SEQ == 10);
-void flow_extract(struct ofpbuf *, uint32_t priority, ovs_be64 tun_id,
+void flow_extract(struct ofpbuf *, uint32_t priority,
+ const struct flow_tun_key *,
uint16_t in_port, struct flow *);
void flow_zero_wildcards(struct flow *, const struct flow_wildcards *);
void flow_get_metadata(const struct flow *, struct flow_metadata *);
diff --git a/lib/meta-flow.c b/lib/meta-flow.c
index 8b60b35..0b47ea1 100644
--- a/lib/meta-flow.c
+++ b/lib/meta-flow.c
@@ -962,7 +962,7 @@ mf_get_value(const struct mf_field *mf, const struct flow *flow,
{
switch (mf->id) {
case MFF_TUN_ID:
- value->be64 = flow->tun_id;
+ value->be64 = flow->tun_key.tun_id;
break;
case MFF_IN_PORT:
@@ -1300,7 +1300,7 @@ mf_set_flow_value(const struct mf_field *mf,
{
switch (mf->id) {
case MFF_TUN_ID:
- flow->tun_id = value->be64;
+ flow->tun_key.tun_id = value->be64;
break;
case MFF_IN_PORT:
diff --git a/lib/nx-match.c b/lib/nx-match.c
index 34c8354..f97ef5d 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -541,7 +541,7 @@ nx_put_match(struct ofpbuf *b, const struct cls_rule *cr,
}
/* Tunnel ID. */
- nxm_put_64m(b, NXM_NX_TUN_ID, flow->tun_id, cr->wc.tun_id_mask);
+ nxm_put_64m(b, NXM_NX_TUN_ID, flow->tun_key.tun_id, cr->wc.tun_id_mask);
/* Registers. */
for (i = 0; i < FLOW_N_REGS; i++) {
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 7cff00c..5f76f5e 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -1299,8 +1299,12 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow)
nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority);
}
- if (flow->tun_id != htonll(0)) {
- nl_msg_put_be64(buf, OVS_KEY_ATTR_TUN_ID, flow->tun_id);
+ if (flow->tun_key.ipv4_dst != htonl(0)) {
+ struct flow_tun_key *tun_key;
+
+ tun_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4_TUNNEL,
+ sizeof *tun_key);
+ *tun_key = flow->tun_key;
}
if (flow->in_port != OFPP_NONE && flow->in_port != OFPP_CONTROLLER) {
@@ -1791,9 +1795,13 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len,
expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY;
}
- if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) {
- flow->tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]);
- expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID;
+ if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL)) {
+ const struct flow_tun_key *tun_key;
+
+ tun_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4_TUNNEL]);
+ flow->tun_key = *tun_key;
+
+ expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL;
}
if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) {
@@ -1887,13 +1895,13 @@ static void
commit_set_tun_id_action(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions)
{
- if (base->tun_id == flow->tun_id) {
+ if (base->tun_key.tun_id == flow->tun_key.tun_id) {
return;
}
- base->tun_id = flow->tun_id;
+ base->tun_key.tun_id = flow->tun_key.tun_id;
commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID,
- &base->tun_id, sizeof(base->tun_id));
+ &base->tun_key.tun_id, sizeof(base->tun_key.tun_id));
}
static void
diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index 1757a30..fff7454 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -106,11 +106,19 @@ ofp_print_packet_in(struct ds *string, const struct ofp_header *oh,
ds_put_format(string, " total_len=%"PRIu16" in_port=", pin.total_len);
ofputil_format_port(pin.fmd.in_port, string);
- if (pin.fmd.tun_id_mask) {
- ds_put_format(string, " tun_id=0x%"PRIx64, ntohll(pin.fmd.tun_id));
+ if (pin.fmd.tun_key.ipv4_dst != htonl(0)) {
+ ds_put_format(string, " tunnel(tun_id=0x%"PRIx64,
+ ntohll(pin.fmd.tun_key.tun_id));
if (pin.fmd.tun_id_mask != htonll(UINT64_MAX)) {
ds_put_format(string, "/0x%"PRIx64, ntohll(pin.fmd.tun_id_mask));
}
+ ds_put_format(string, ",flags=%"PRIx32",ip="IP_FMT"->"IP_FMT","
+ "tos=%"PRIx8",ttl=%"PRIu8")",
+ pin.fmd.tun_key.tun_flags,
+ IP_ARGS(&pin.fmd.tun_key.ipv4_src),
+ IP_ARGS(&pin.fmd.tun_key.ipv4_dst),
+ pin.fmd.tun_key.ipv4_tos,
+ pin.fmd.tun_key.ipv4_ttl);
}
for (i = 0; i < FLOW_N_REGS; i++) {
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 90124ec..652a6bf 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -2096,7 +2096,7 @@ ofputil_decode_packet_in(struct ofputil_packet_in *pin,
pin->fmd.in_port = rule.flow.in_port;
- pin->fmd.tun_id = rule.flow.tun_id;
+ pin->fmd.tun_key.tun_id = rule.flow.tun_key.tun_id;
pin->fmd.tun_id_mask = rule.wc.tun_id_mask;
memcpy(pin->fmd.regs, rule.flow.regs, sizeof pin->fmd.regs);
@@ -2149,7 +2149,7 @@ ofputil_encode_packet_in(const struct ofputil_packet_in *pin,
+ 2 + send_len);
cls_rule_init_catchall(&rule, 0);
- cls_rule_set_tun_id_masked(&rule, pin->fmd.tun_id,
+ cls_rule_set_tun_id_masked(&rule, pin->fmd.tun_key.tun_id,
pin->fmd.tun_id_mask);
for (i = 0; i < FLOW_N_REGS; i++) {
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 03a86bc..2a52f37 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3080,7 +3080,7 @@ handle_miss_upcalls(struct ofproto_dpif *ofproto, struct dpif_upcall *upcalls,
continue;
}
flow_extract(upcall->packet, miss->flow.skb_priority,
- miss->flow.tun_id, miss->flow.in_port, &miss->flow);
+ &miss->flow.tun_key, miss->flow.in_port, &miss->flow);
/* Add other packets to a to-do list. */
hash = flow_hash(&miss->flow, 0);
@@ -5464,7 +5464,7 @@ do_xlate_actions(const union ofp_action *in, size_t n_in,
case OFPUTIL_NXAST_SET_TUNNEL:
nast = (const struct nx_action_set_tunnel *) ia;
tun_id = htonll(ntohl(nast->tun_id));
- ctx->flow.tun_id = tun_id;
+ ctx->flow.tun_key.tun_id = tun_id;
break;
case OFPUTIL_NXAST_SET_QUEUE:
@@ -5492,7 +5492,7 @@ do_xlate_actions(const union ofp_action *in, size_t n_in,
case OFPUTIL_NXAST_SET_TUNNEL64:
tun_id = ((const struct nx_action_set_tunnel64 *) ia)->tun_id;
- ctx->flow.tun_id = tun_id;
+ ctx->flow.tun_key.tun_id = tun_id;
break;
case OFPUTIL_NXAST_MULTIPATH:
@@ -5576,7 +5576,7 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
ctx->ofproto = ofproto;
ctx->flow = *flow;
ctx->base_flow = ctx->flow;
- ctx->base_flow.tun_id = 0;
+ ctx->base_flow.tun_key.ipv4_src = 0;
ctx->base_flow.vlan_tci = initial_tci;
ctx->rule = rule;
ctx->packet = packet;
@@ -6739,6 +6739,7 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, int argc, const char *argv[],
const char *packet_s = argv[5];
uint16_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
ovs_be64 tun_id = htonll(strtoull(tun_id_s, NULL, 0));
+ struct flow_tun_key tun_key = { .tun_id = tun_id };
uint32_t priority = atoi(priority_s);
const char *msg;
@@ -6753,7 +6754,7 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, int argc, const char *argv[],
ds_put_cstr(&result, s);
free(s);
- flow_extract(packet, priority, tun_id, in_port, &flow);
+ flow_extract(packet, priority, &tun_key, in_port, &flow);
initial_tci = flow.vlan_tci;
} else {
unixctl_command_reply_error(conn, "Bad command syntax");
diff --git a/tests/test-classifier.c b/tests/test-classifier.c
index fcafdb2..5bb5df8 100644
--- a/tests/test-classifier.c
+++ b/tests/test-classifier.c
@@ -44,7 +44,7 @@
/* struct flow all-caps */ \
/* FWW_* bit(s) member name name */ \
/* -------------------------- ----------- -------- */ \
- CLS_FIELD(0, tun_id, TUN_ID) \
+ CLS_FIELD(0, tun_key.tun_id, TUN_ID) \
CLS_FIELD(0, nw_src, NW_SRC) \
CLS_FIELD(0, nw_dst, NW_DST) \
CLS_FIELD(FWW_IN_PORT, in_port, IN_PORT) \
@@ -206,7 +206,8 @@ match(const struct cls_rule *wild, const struct flow *fixed)
eq = !((fixed->vlan_tci ^ wild->flow.vlan_tci)
& wild->wc.vlan_tci_mask);
} else if (f_idx == CLS_F_IDX_TUN_ID) {
- eq = !((fixed->tun_id ^ wild->flow.tun_id) & wild->wc.tun_id_mask);
+ eq = !((fixed->tun_key.tun_id ^ wild->flow.tun_key.tun_id) &
+ wild->wc.tun_id_mask);
} else if (f_idx == CLS_F_IDX_NW_DSCP) {
eq = !((fixed->nw_tos ^ wild->flow.nw_tos) & IP_DSCP_MASK);
} else {
@@ -362,7 +363,7 @@ compare_classifiers(struct classifier *cls, struct tcls *tcls)
x = rand () % N_FLOW_VALUES;
flow.nw_src = nw_src_values[get_value(&x, N_NW_SRC_VALUES)];
flow.nw_dst = nw_dst_values[get_value(&x, N_NW_DST_VALUES)];
- flow.tun_id = tun_id_values[get_value(&x, N_TUN_ID_VALUES)];
+ flow.tun_key.tun_id = tun_id_values[get_value(&x, N_TUN_ID_VALUES)];
flow.in_port = in_port_values[get_value(&x, N_IN_PORT_VALUES)];
flow.vlan_tci = vlan_tci_values[get_value(&x, N_VLAN_TCI_VALUES)];
flow.dl_type = dl_type_values[get_value(&x, N_DL_TYPE_VALUES)];
--
1.7.10.2.484.gcd07cc5
* [PATCH 09/21] ofproto: Add tundev_to_realdev()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
In essence this is a duplication of ovs_tnl_find_port(),
copying code from the datapath to vswitchd. It is planned
that the datapath version will be removed.
It is used to map from the tundev interface that a
packet is received by in the datapath to the tunnel realdev
interface used in user-space. It is the tunnel realdev
that has the tunnel configuration attached.
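
The lookup order can be sketched roughly as follows: most specific match first, then progressively wildcarded ones. This is a simplified illustration under stated assumptions, not the function added below: the `sketch_*` names are hypothetical, a linear scan replaces the hmap, addresses are plain integers, and the multicast cases and per-pool counters are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of a configured tunnel realdev. */
struct sketch_realdev {
    int ofp_port;
    bool key_exact;   /* true: match in_key exactly; false: any key. */
    uint64_t in_key;  /* 0 when key_exact is false. */
    uint32_t saddr;   /* Local address, 0 meaning "any". */
    uint32_t daddr;   /* Remote address. */
};

/* Linear scan for an entry matching all four criteria exactly. */
static const struct sketch_realdev *
sketch_find(const struct sketch_realdev *devs, size_t n, bool key_exact,
            uint64_t key, uint32_t local, uint32_t remote)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct sketch_realdev *d = &devs[i];
        if (d->key_exact == key_exact && d->in_key == key
            && d->saddr == local && d->daddr == remote) {
            return d;
        }
    }
    return NULL;
}

/* The packet's destination is our local address and its source is the
 * remote end, hence the swap when building the lookup. */
static const struct sketch_realdev *
sketch_tundev_to_realdev(const struct sketch_realdev *devs, size_t n,
                         uint64_t tun_id, uint32_t pkt_src, uint32_t pkt_dst)
{
    const struct sketch_realdev *d;

    /* Exact key, local and remote address. */
    if ((d = sketch_find(devs, n, true, tun_id, pkt_dst, pkt_src))) {
        return d;
    }
    /* Exact key, remote address only. */
    if ((d = sketch_find(devs, n, true, tun_id, 0, pkt_src))) {
        return d;
    }
    /* Wildcarded key, local and remote address. */
    if ((d = sketch_find(devs, n, false, 0, pkt_dst, pkt_src))) {
        return d;
    }
    /* Wildcarded key, remote address only. */
    return sketch_find(devs, n, false, 0, 0, pkt_src);
}
```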
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
ofproto/ofproto-dpif.c | 194 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 174 insertions(+), 20 deletions(-)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index c7ea391..03a86bc 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -183,7 +183,7 @@ static void bundle_del_port(struct ofport_dpif *);
static void bundle_run(struct ofbundle *);
static void bundle_wait(struct ofbundle *);
static struct ofbundle *lookup_input_bundle(const struct ofproto_dpif *,
- uint16_t in_port, bool warn,
+ const struct flow *, bool warn,
struct ofport_dpif **in_ofportp);
/* A controller may use OFPP_NONE as the ingress port to indicate that
@@ -550,8 +550,12 @@ static unsigned remote_ports;
static unsigned key_multicast_ports;
static unsigned multicast_ports;
+static bool tunnel_adjust_flow(const struct ofproto_dpif *ofproto,
+ struct flow *flow);
static int set_tunnelling(struct ofport *ofport_, uint16_t realdev_ofp_port,
const struct tunnel_settings *s);
+static struct ofport_dpif *tundev_to_realdev(const struct ofproto_dpif *ofproto,
+ const struct flow *flow);
static uint32_t
realdev_to_txdev(const struct ofproto_dpif *ofproto,
@@ -2998,6 +3002,7 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto,
struct ofpbuf *packet)
{
enum odp_key_fitness fitness;
+ bool adjusted = false;
fitness = odp_flow_key_to_flow(key, key_len, flow);
if (fitness == ODP_FIT_ERROR) {
@@ -3005,7 +3010,9 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto,
}
*initial_tci = flow->vlan_tci;
- if (vsp_adjust_flow(ofproto, flow)) {
+ if (tunnel_adjust_flow(ofproto, flow)) {
+ adjusted = true;
+ } else if (vsp_adjust_flow(ofproto, flow)) {
if (packet) {
/* Make the packet resemble the flow, so that it gets sent to an
* OpenFlow controller properly, so that it looks correct for
@@ -3023,11 +3030,12 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto,
* since we don't need that header anymore. */
eth_push_vlan(packet, flow->vlan_tci);
}
+ adjusted = true;
+ }
- /* Let the caller know that we can't reproduce 'key' from 'flow'. */
- if (fitness == ODP_FIT_PERFECT) {
- fitness = ODP_FIT_TOO_MUCH;
- }
+ /* Let the caller know that we can't reproduce 'key' from 'flow'. */
+ if (adjusted && fitness == ODP_FIT_PERFECT) {
+ fitness = ODP_FIT_TOO_MUCH;
}
return fitness;
@@ -5934,7 +5942,7 @@ add_mirror_actions(struct action_xlate_ctx *ctx, const struct flow *orig_flow)
const struct nlattr *a;
size_t left;
- in_bundle = lookup_input_bundle(ctx->ofproto, orig_flow->in_port,
+ in_bundle = lookup_input_bundle(ctx->ofproto, orig_flow,
ctx->packet != NULL, NULL);
if (!in_bundle) {
return;
@@ -6095,13 +6103,17 @@ update_learning_table(struct ofproto_dpif *ofproto,
}
static struct ofbundle *
-lookup_input_bundle(const struct ofproto_dpif *ofproto, uint16_t in_port,
- bool warn, struct ofport_dpif **in_ofportp)
+lookup_input_bundle(const struct ofproto_dpif *ofproto,
+ const struct flow *flow, bool warn,
+ struct ofport_dpif **in_ofportp)
{
struct ofport_dpif *ofport;
/* Find the port and bundle for the received packet. */
- ofport = get_ofp_port(ofproto, in_port);
+ ofport = tundev_to_realdev(ofproto, flow);
+ if (!ofport) {
+ ofport = get_ofp_port(ofproto, flow->in_port);
+ }
if (in_ofportp) {
*in_ofportp = ofport;
}
@@ -6111,7 +6123,7 @@ lookup_input_bundle(const struct ofproto_dpif *ofproto, uint16_t in_port,
/* Special-case OFPP_NONE, which a controller may use as the ingress
* port for traffic that it is sourcing. */
- if (in_port == OFPP_NONE) {
+ if (flow->in_port == OFPP_NONE) {
return &ofpp_none_bundle;
}
@@ -6129,7 +6141,7 @@ lookup_input_bundle(const struct ofproto_dpif *ofproto, uint16_t in_port,
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
VLOG_WARN_RL(&rl, "bridge %s: received packet on unknown "
- "port %"PRIu16, ofproto->up.name, in_port);
+ "port %"PRIu16, ofproto->up.name, flow->in_port);
}
return NULL;
}
@@ -6196,7 +6208,7 @@ xlate_normal(struct action_xlate_ctx *ctx)
ctx->has_normal = true;
- in_bundle = lookup_input_bundle(ctx->ofproto, ctx->flow.in_port,
+ in_bundle = lookup_input_bundle(ctx->ofproto, &ctx->flow,
ctx->packet != NULL, &in_port);
if (!in_bundle) {
return;
@@ -7166,16 +7178,19 @@ tun_remove(struct ofport_dpif *ofport)
}
static void
-tun_add(struct ofport_dpif *ofport, uint16_t tundev_ofp_port,
- const struct tunnel_settings *s)
+tun_add(struct ofport_dpif *ofport)
{
struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
- ofport->tun->tundev_ofp_port = tundev_ofp_port;
- ofport->tun->s = *s;
+ /* Only add if the daddr is non-zero, in which case ofport is a
+ * realdev. Otherwise it is a tundev. */
+ if (ofport->tun->s.daddr == htonl(0)) {
+ return;
+ }
+
(*tun_port_pool(&ofport->tun->s))++;
hmap_insert(&ofproto->tundev_map, &ofport->tun->tundev_node,
- hash_int(tundev_ofp_port, 0));
+ hash_int(ofport->tun->tundev_ofp_port, 0));
}
static int
@@ -7203,15 +7218,154 @@ set_tunnelling(struct ofport *ofport_, uint16_t tundev_ofp_port,
if (ofport->tun->tundev_ofp_port == tundev_ofp_port &&
tunnel_settings_equal(&ofport->tun->s, s)) {
return 0;
- }
+ }
tun_remove(ofport);
}
- tun_add(ofport, tundev_ofp_port, s);
+ ofport->tun->s = *s;
+ ofport->tun->tundev_ofp_port = tundev_ofp_port;
+ tun_add(ofport);
return 0;
}
+struct tunnel_lookup_key {
+ ovs_be64 tun_id;
+ ovs_be32 ipv4_src;
+ ovs_be32 ipv4_dst;
+ uint8_t tun_type;
+};
+
+static struct ofport_dpif *
+tundev_find(const struct ofproto_dpif *ofproto, uint16_t tundev_ofp_port,
+ const struct tunnel_lookup_key *tun_key)
+{
+ struct ofport_dpif_tun *tun;
+
+ HMAP_FOR_EACH_WITH_HASH (tun, tundev_node, hash_int(tundev_ofp_port, 0),
+ &ofproto->tundev_map) {
+ if (tun_key->tun_type == tun->s.type &&
+ tun_key->ipv4_dst == tun->s.daddr &&
+ tun_key->tun_id == tun->s.in_key &&
+ tun_key->ipv4_src == tun->s.saddr) {
+ return tun->ofport;
+ }
+ }
+
+ return NULL;
+}
+
+/* Returns the ofport of the "real" device underlying the Linux
+ * tunnel device matching the tun_key of 'flow'.
+ *
+ * Returns NULL if no match is found. */
+static struct ofport_dpif *
+tundev_to_realdev(const struct ofproto_dpif *ofproto, const struct flow *flow)
+{
+ bool is_multicast = ipv4_is_multicast(flow->tun_key.ipv4_dst);
+ struct ofport_dpif *tundev_ofport;
+ struct ofport_dpif *realdev_ofport;
+ struct tunnel_lookup_key lookup;
+
+ /* Nothing to do if the packet wasn't decapsulated on receive */
+ if (!flow->tun_key.ipv4_dst) {
+ return NULL;
+ }
+
+ /* Nothing to do if there are no tunnel devices configured */
+ if (hmap_is_empty(&ofproto->tundev_map)) {
+ return NULL;
+ }
+
+ /* Give up if the tunnel device can't be found
+ * or isn't a tunnel tundev */
+ tundev_ofport = get_ofp_port(ofproto, flow->in_port);
+ if (!tundev_ofport || !tundev_ofport->tun || tundev_ofport->tun->s.daddr) {
+ return NULL;
+ }
+
+ lookup.tun_id = flow->tun_key.tun_id;
+ lookup.ipv4_src = flow->tun_key.ipv4_dst;
+ lookup.ipv4_dst = flow->tun_key.ipv4_src;
+
+ /* First try for an exact match on the tun_id */
+ lookup.tun_id = flow->tun_key.tun_id;
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_EXACT;
+ if (!is_multicast && key_local_remote_ports) {
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ if (key_remote_ports) {
+ lookup.ipv4_src = htonl(0);
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ lookup.ipv4_src = flow->tun_key.ipv4_dst;
+ }
+
+ /* Then try matches that wildcard the tun_id. */
+ lookup.tun_id = htonll(0);
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_MATCH;
+ if (!is_multicast && local_remote_ports) {
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ if (remote_ports) {
+ lookup.ipv4_src = htonl(0);
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+
+ if (is_multicast) {
+ lookup.ipv4_src = htonl(0);
+ lookup.ipv4_dst = flow->tun_key.ipv4_dst;
+ if (key_multicast_ports) {
+ lookup.tun_id = flow->tun_key.tun_id;
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_EXACT;
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ if (multicast_ports) {
+ lookup.tun_id = 0;
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_MATCH;
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ }
+
+ return NULL;
+}
+
+/* Given 'flow', a flow representing a packet received on 'ofproto', checks
+ * whether 'flow->in_port' represents a Linux tunnel device. If so, changes
+ * 'flow->in_port' to the "real" device backing the tunnel device, sets
+ * 'flow->tun_key' using the real device's tunnel settings, and returns true.
+ * Otherwise (which is always the case unless tunneling is enabled), returns
+ * false without making any changes. */
+static bool
+tunnel_adjust_flow(const struct ofproto_dpif *ofproto, struct flow *flow)
+{
+ const struct ofport_dpif *realdev_ofport = tundev_to_realdev(ofproto, flow);
+ if (!realdev_ofport) {
+ return false;
+ }
+
+ /* Cause the flow to be processed as if it came in on the real device with
+ * the tunnel's key. */
+ flow->in_port = ofp_port_to_odp_port(realdev_ofport->up.ofp_port);
+ flow->tun_key.tun_id = realdev_ofport->tun->s.out_key;
+ flow->tun_key.ipv4_src = realdev_ofport->tun->s.saddr;
+ flow->tun_key.ipv4_dst = realdev_ofport->tun->s.daddr;
+ flow->tun_key.ipv4_tos = realdev_ofport->tun->s.tos;
+ flow->tun_key.ipv4_ttl = realdev_ofport->tun->s.ttl;
+ return true;
+}
+
/* Maps a port to the port that it should be transmitted on.
* If tunneling is enabled then the associated tunnel port is returned.
* If VLAN splintering is enabled then the ofp_port of the vlandev is
--
1.7.10.2.484.gcd07cc5
* [PATCH 07/21] vswitchd: Configure tunnel interfaces.
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
For tunnel realdevs this sets the remote IP and type,
and optionally source IP, ttl and tos. The remote IP
must be non-zero.
For tunnel tundevs only the type is configured.
The remote IP must be zero.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
vswitchd/bridge.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 3d187f0..a67f391 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -242,6 +242,7 @@ static void iface_set_ofport(const struct ovsrec_interface *, int64_t ofport);
static void iface_clear_db_record(const struct ovsrec_interface *if_cfg);
static void iface_configure_qos(struct iface *, const struct ovsrec_qos *);
static void iface_configure_cfm(struct iface *);
+static void iface_configure_tunnel(struct iface *);
static void iface_refresh_cfm_stats(struct iface *);
static void iface_refresh_stats(struct iface *);
static void iface_refresh_status(struct iface *);
@@ -535,6 +536,7 @@ bridge_reconfigure_continue(const struct ovsrec_open_vswitch *ovs_cfg)
LIST_FOR_EACH (iface, port_elem, &port->ifaces) {
iface_configure_cfm(iface);
iface_configure_qos(iface, port->cfg->qos);
+ iface_configure_tunnel(iface);
iface_set_mac(iface);
}
}
@@ -627,6 +629,21 @@ bridge_update_ofprotos(void)
}
}
+is_tunnel_tundev(const char *type)
+{
+ return !strcmp(type, "gre-tundev") || !strcmp(type, "capwap-tundev");
+}
+
+static uint8_t
+tunnel_tundev_type_from_str(const char *type)
+{
+ if (!strcmp(type, "gre-tundev"))
+ return TNL_T_PROTO_GRE;
+ if (!strcmp(type, "capwap-tundev"))
+ return TNL_T_PROTO_CAPWAP;
+ NOT_REACHED();
+}
+
static bool
is_tunnel_realdev(const char *type)
{
@@ -648,6 +665,15 @@ port_configure(struct port *port)
return;
}
+ if (list_is_singleton(&port->ifaces)) {
+ iface = CONTAINER_OF(list_front(&port->ifaces),
+ struct iface, port_elem);
+ if (is_tunnel_tundev(iface->type)) {
+ ofproto_bundle_unregister(port->bridge->ofproto, port);
+ return;
+ }
+ }
+
/* Get name. */
s.name = port->name;
@@ -3686,6 +3712,49 @@ iface_configure_cfm(struct iface *iface)
ofproto_port_set_cfm(iface->port->bridge->ofproto, iface->ofp_port, &s);
}
+static void
+iface_configure_tunnel_tundev(struct iface *iface)
+{
+ const char *type = iface_get_type(iface->cfg, iface->port->bridge->cfg);
+ struct tunnel_settings s = { .type = tunnel_tundev_type_from_str(type) };
+
+ ofproto_port_set_tunnel(iface->port->bridge->ofproto, 0,
+ iface->ofp_port, &s);
+}
+
+static void
+iface_configure_tunnel_realdev(struct iface *iface)
+{
+ struct tunnel_settings s = { .tos = 0 };
+ const char *type = iface_get_type(iface->cfg, iface->port->bridge->cfg);
+ struct iface *tundev;
+
+ /* This will not fail as it has already been called
+ * to check for errors */
+ iface_parse_tunnel(iface->cfg, type, &s);
+
+ tundev = iface_lookup(iface->port->bridge, type);
+ assert(tundev);
+
+ ofproto_port_set_tunnel(iface->port->bridge->ofproto, tundev->ofp_port,
+ iface->ofp_port, &s);
+}
+
+static void
+iface_configure_tunnel(struct iface *iface)
+{
+ const char *type = iface_get_type(iface->cfg, iface->port->bridge->cfg);
+
+ if (is_tunnel_realdev(type)) {
+ return iface_configure_tunnel_realdev(iface);
+ } else if (is_tunnel_tundev(type)) {
+ return iface_configure_tunnel_tundev(iface);
+ }
+
+ /* Nothing to do */
+ return;
+}
+
/* Returns true if 'iface' is synthetic, that is, if we constructed it locally
* instead of obtaining it from the database. */
static bool
--
1.7.10.2.484.gcd07cc5
* [PATCH 06/21] ofproto: Add set_tunnelling()
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Allow configuration of tunneling in ofproto_port instances.
For tunnel realdevs this includes the remote IP and type of the tunnel,
and optionally the local IP, tos and ttl.
For tunnel tundevs it only includes the type.
realdevs and tundevs can be differentiated by examining the remote IP,
which is always zero for tundevs and always non-zero for realdevs.
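
A minimal sketch of that distinction, together with the multicast test used when choosing a port pool, might look like the following. The `sketch_*` names are hypothetical and addresses are assumed to be in host byte order for simplicity, whereas the patch works on network-order values with htonl().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_tun_role {
    SKETCH_TUNDEV,   /* Remote IP is zero: only the tunnel type is set. */
    SKETCH_REALDEV   /* Remote IP is non-zero: full tunnel settings. */
};

/* Classifies a tunnel port purely by its configured remote IP. */
static enum sketch_tun_role
sketch_classify_tun(uint32_t remote_ip)
{
    return remote_ip ? SKETCH_REALDEV : SKETCH_TUNDEV;
}

/* True for 224.0.0.0/4, with the address given in host byte order. */
static bool
sketch_ipv4_is_multicast(uint32_t addr)
{
    return (addr & 0xf0000000u) == 0xe0000000u;
}
```

With these definitions, a port configured with remote IP 10.0.0.1 (0x0a000001) is classified as a realdev, while one with remote IP 0 is a tundev.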
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
ofproto/ofproto-dpif.c | 116 +++++++++++++++++++++++++++++++++++++++++++++
ofproto/ofproto-provider.h | 12 +++++
ofproto/ofproto.c | 28 +++++++++++
ofproto/ofproto.h | 13 +++++
4 files changed, 169 insertions(+)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index f2c2ca9..642b508 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -476,6 +476,13 @@ static void facet_account(struct facet *);
static bool facet_is_controller_flow(struct facet *);
+struct ofport_dpif_tun {
+ struct tunnel_settings s;
+ uint16_t tundev_ofp_port;
+ struct hmap_node tundev_node;
+ struct ofport_dpif *ofport; /* Containing ofport_dpif */
+};
+
struct ofport_dpif {
struct ofport up;
@@ -503,6 +510,9 @@ struct ofport_dpif {
* widespread use, we will delete these interfaces. */
uint16_t realdev_ofp_port;
int vlandev_vid;
+
+ /* Tunneling */
+ struct ofport_dpif_tun *tun;
};
/* Node in 'ofport_dpif''s 'priorities' map. Used to maintain a map from
@@ -535,6 +545,16 @@ static bool vsp_adjust_flow(const struct ofproto_dpif *, struct flow *);
static void vsp_remove(struct ofport_dpif *);
static void vsp_add(struct ofport_dpif *, uint16_t realdev_ofp_port, int vid);
+static unsigned key_local_remote_ports;
+static unsigned key_remote_ports;
+static unsigned local_remote_ports;
+static unsigned remote_ports;
+static unsigned key_multicast_ports;
+static unsigned multicast_ports;
+
+static int set_tunnelling(struct ofport *ofport_, uint16_t realdev_ofp_port,
+ const struct tunnel_settings *s);
+
static struct ofport_dpif *
ofport_dpif_cast(const struct ofport *ofport)
{
@@ -612,6 +632,9 @@ struct ofproto_dpif {
/* VLAN splinters. */
struct hmap realdev_vid_map; /* (realdev,vid) -> vlandev. */
struct hmap vlandev_map; /* vlandev -> (realdev,vid). */
+
+ /* Tunnelling */
+ struct hmap tundev_map; /* tundev -> realdev */
};
/* Defer flow mod completion until "ovs-appctl ofproto/unclog"? (Useful only
@@ -771,6 +794,8 @@ construct(struct ofproto *ofproto_)
hmap_init(&ofproto->vlandev_map);
hmap_init(&ofproto->realdev_vid_map);
+ hmap_init(&ofproto->tundev_map);
+
hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node,
hash_string(ofproto->up.name, 0));
memset(&ofproto->stats, 0, sizeof ofproto->stats);
@@ -1153,6 +1178,7 @@ port_construct(struct ofport *port_)
hmap_init(&port->priorities);
port->realdev_ofp_port = 0;
port->vlandev_vid = 0;
+ port->tun = NULL;
port->carrier_seq = netdev_get_carrier_resets(port->up.netdev);
if (ofproto->sflow) {
@@ -1171,6 +1197,7 @@ port_destruct(struct ofport *port_)
ofproto->need_revalidate = true;
bundle_remove(port_);
set_cfm(port_, NULL);
+ set_tunnelling(port_, 0, NULL);
if (ofproto->sflow) {
dpif_sflow_del_port(ofproto->sflow, port->odp_port);
}
@@ -7097,6 +7124,94 @@ vsp_add(struct ofport_dpif *port, uint16_t realdev_ofp_port, int vid)
}
}
\f
+static inline bool
+ipv4_is_multicast(__be32 addr)
+{
+ return (addr & htonl(0xf0000000)) == htonl(0xe0000000);
+}
+
+static unsigned int *
+tun_port_pool(const struct tunnel_settings *s)
+{
+ bool is_multicast = ipv4_is_multicast(s->daddr);
+
+ if (s->type & TNL_T_KEY_MATCH) {
+ if (s->saddr)
+ return &local_remote_ports;
+ else if (is_multicast)
+ return &multicast_ports;
+ else
+ return &remote_ports;
+ } else {
+ if (s->saddr)
+ return &key_local_remote_ports;
+ else if (is_multicast)
+ return &key_multicast_ports;
+ else
+ return &key_remote_ports;
+ }
+}
+
+static void
+tun_remove(struct ofport_dpif *ofport)
+{
+ struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+
+ if (!ofport->tun) {
+ return;
+ }
+
+ hmap_remove(&ofproto->tundev_map, &ofport->tun->tundev_node);
+ (*tun_port_pool(&ofport->tun->s))--;
+}
+
+static void
+tun_add(struct ofport_dpif *ofport, uint16_t tundev_ofp_port,
+ const struct tunnel_settings *s)
+{
+ struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+
+ ofport->tun->tundev_ofp_port = tundev_ofp_port;
+ ofport->tun->s = *s;
+ (*tun_port_pool(&ofport->tun->s))++;
+ hmap_insert(&ofproto->tundev_map, &ofport->tun->tundev_node,
+ hash_int(tundev_ofp_port, 0));
+}
+
+static int
+set_tunnelling(struct ofport *ofport_, uint16_t tundev_ofp_port,
+ const struct tunnel_settings *s)
+{
+ struct ofport_dpif *ofport = ofport_dpif_cast(ofport_);
+
+ if (!s) {
+ tun_remove(ofport);
+ free(ofport->tun);
+ ofport->tun = NULL;
+ return 0;
+ }
+
+ if (!ofport->tun) {
+ struct ofproto_dpif *ofproto;
+
+ ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+ ofproto->need_revalidate = true;
+ ofport->tun = xzalloc(sizeof *ofport->tun);
+ ofport->tun->ofport = ofport;
+ }
+ else {
+ if (ofport->tun->tundev_ofp_port == tundev_ofp_port &&
+ tunnel_settings_equal(&ofport->tun->s, s)) {
+ return 0;
+ }
+ tun_remove(ofport);
+ }
+
+ tun_add(ofport, tundev_ofp_port, s);
+
+ return 0;
+}
+\f
const struct ofproto_class ofproto_dpif_class = {
enumerate_types,
enumerate_names,
@@ -7159,4 +7274,5 @@ const struct ofproto_class ofproto_dpif_class = {
forward_bpdu_changed,
set_mac_idle_time,
set_realdev,
+ set_tunnelling,
};
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index 1f3ad37..be39691 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -1168,6 +1168,18 @@ struct ofproto_class {
* it. */
int (*set_realdev)(struct ofport *ofport,
uint16_t realdev_ofp_port, int vid);
+
+ /* Configures tunneling for 'ofport'.
+ *
+ * If 's' is nonnull, configures tunneling according to its
+ * members.
+ *
+ * If 's' is null, then any tunnel configuration is
+ * removed.
+ *
+ * This function should be null if tunneling is not supported. */
+ int (*set_tunnelling)(struct ofport *ofport, uint16_t tundev_ofp_port,
+ const struct tunnel_settings *s);
};
extern const struct ofproto_class ofproto_dpif_class;
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 0bda06a..79f7a24 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -4184,3 +4184,31 @@ ofproto_port_set_realdev(struct ofproto *ofproto, uint16_t vlandev_ofp_port,
}
return error;
}
+
+/* Configures the tunneling parameters of port 'ofp_port' in 'ofproto'.
+ *
+ * Logs a warning and does nothing else if 'ofproto' has no port 'ofp_port'. */
+void
+ofproto_port_set_tunnel(struct ofproto *ofproto, uint16_t tundev_ofp_port,
+ uint16_t ofp_port, const struct tunnel_settings *s)
+{
+ struct ofport *ofport;
+ int error;
+
+ ofport = ofproto_get_port(ofproto, ofp_port);
+ if (!ofport) {
+ VLOG_WARN("%s: cannot configure tunnel on nonexistent port %"PRIu16,
+ ofproto->name, ofp_port);
+ return;
+ }
+
+ error = (ofproto->ofproto_class->set_tunnelling
+ ? ofproto->ofproto_class->set_tunnelling(ofport,
+ tundev_ofp_port, s)
+ : EOPNOTSUPP);
+ if (error) {
+ VLOG_WARN("%s: Tunnel configuration on port %"PRIu16" (%s) failed (%s)",
+ ofproto->name, ofp_port,
+ netdev_get_name(ofport->netdev), strerror(error));
+ }
+}
diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h
index d8739b0..147a588 100644
--- a/ofproto/ofproto.h
+++ b/ofproto/ofproto.h
@@ -398,6 +398,19 @@ struct tunnel_settings {
uint8_t type;
};
+static inline bool
+tunnel_settings_equal(const struct tunnel_settings *a,
+ const struct tunnel_settings *b)
+{
+ return a->daddr == b->daddr &&
+ a->in_key == b->in_key &&
+ a->out_key == b->out_key &&
+ a->saddr == b->saddr &&
+ a->flags == b->flags &&
+ a->tos == b->tos &&
+ a->ttl == b->ttl;
+}
+
void ofproto_port_set_tunnel(struct ofproto *ofproto, uint16_t tundev_ofp_port,
uint16_t realdev_ofp_port,
const struct tunnel_settings *s);
--
1.7.10.2.484.gcd07cc5
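
For readers following the pool accounting in tun_port_pool(), tun_add() and
tun_remove(), here is a minimal standalone sketch of the same selection logic.
It is not part of the patch. 'struct sketch_settings', its 'key_match' field
(standing in for the 's->type & TNL_T_KEY_MATCH' test) and all 'sketch_*'
names are hypothetical and only exist so the fragment compiles on its own;
the underflow assertion in sketch_remove() is an addition the patch does not
have.

```c
#include <arpa/inet.h>   /* htonl() */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's struct tunnel_settings; only the
 * fields that the pool selection reads are kept. */
struct sketch_settings {
    uint32_t saddr;   /* Local address, network byte order; 0 means unset. */
    uint32_t daddr;   /* Remote address, network byte order. */
    bool key_match;   /* Stands in for (s->type & TNL_T_KEY_MATCH). */
};

/* One counter per class of tunnel port, as in the patch. */
static unsigned int key_local_remote_ports, key_remote_ports;
static unsigned int key_multicast_ports;
static unsigned int local_remote_ports, remote_ports, multicast_ports;

/* True if 'addr' (network byte order) lies in 224.0.0.0/4. */
static bool
sketch_is_multicast(uint32_t addr)
{
    return (addr & htonl(0xf0000000)) == htonl(0xe0000000);
}

/* Returns the counter that tracks ports configured like 's'.  The branch
 * order mirrors tun_port_pool() in the patch. */
static unsigned int *
sketch_pool(const struct sketch_settings *s)
{
    bool is_multicast = sketch_is_multicast(s->daddr);

    if (s->key_match) {
        return s->saddr ? &local_remote_ports
            : is_multicast ? &multicast_ports
            : &remote_ports;
    } else {
        return s->saddr ? &key_local_remote_ports
            : is_multicast ? &key_multicast_ports
            : &key_remote_ports;
    }
}

/* Accounts for a newly configured port, as tun_add() does. */
static void
sketch_add(const struct sketch_settings *s)
{
    (*sketch_pool(s))++;
}

/* Drops the accounting for a port, as tun_remove() does.  The assertion is
 * an extra guard against counter underflow. */
static void
sketch_remove(const struct sketch_settings *s)
{
    unsigned int *pool = sketch_pool(s);

    assert(*pool > 0);
    (*pool)--;
}
```

Calling sketch_add() and then sketch_remove() with the same settings leaves
every counter unchanged, which is the property set_tunnelling() relies on when
it removes the old configuration before adding the new one.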