From: Robert Shearman <rshearma@brocade.com>
To: roopa <roopa@cumulusnetworks.com>
Cc: <netdev@vger.kernel.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Thomas Graf <tgraf@suug.ch>,
Dinesh Dutt <ddutt@cumulusnetworks.com>,
"Vivek Venkatraman" <vivek@cumulusnetworks.com>
Subject: Re: [RFC net-next 2/3] ipv4: storing and retrieval of per-nexthop encap
Date: Tue, 2 Jun 2015 17:35:18 +0100 [thread overview]
Message-ID: <556DDB46.9010701@brocade.com> (raw)
In-Reply-To: <556DD368.8070806@cumulusnetworks.com>
On 02/06/15 17:01, roopa wrote:
> On 6/1/15, 9:46 AM, Robert Shearman wrote:
>> Parse RTA_ENCAP attribute for one path and multipath routes. The encap
>> length is stored in a newly added field to fib_nh, nh_encap_len,
>> although this is added to a padding hole in the structure so that it
>> doesn't increase the size at all. The encap data itself is stored at
>> the end of the array of nexthops. Whilst this means that retrieval
>> isn't optimal, especially if there are multiple nexthops, this avoids
>> the memory cost of an extra pointer, as well as any potential change
>> to the cache or instruction layout that could cause a performance
>> impact.
>>
>> Currently, the dst structure allocated to represent the destination of
>> the packet and used for retrieving the encap by the encap-type
>> interface has been grown through the addition of the rt_encap_len and
>> rt_encap fields. This isn't desirable and could be fixed by defining a
>> new destination type with operations copied from the normal case,
>> other than the addition of the get_encap operation.
>>
>> Signed-off-by: Robert Shearman <rshearma@brocade.com>
...
>> @@ -434,6 +445,83 @@ static int fib_detect_death(struct fib_info *fi,
>> int order,
>> return 1;
>> }
>> +static int fib_total_encap(struct fib_config *cfg)
>> +{
>> + struct net *net = cfg->fc_nlinfo.nl_net;
>> + int total_encap_len = 0;
>> +
>> + if (cfg->fc_mp) {
>> + int remaining = cfg->fc_mp_len;
>> + struct rtnexthop *rtnh = cfg->fc_mp;
>> +
>> + while (rtnh_ok(rtnh, remaining)) {
>> + struct nlattr *nla, *attrs = rtnh_attrs(rtnh);
>> + int attrlen;
>> +
>> + attrlen = rtnh_attrlen(rtnh);
>> + nla = nla_find(attrs, attrlen, RTA_ENCAP);
>> + if (nla) {
>> + struct net_device *dev;
>> + int len;
>> +
>> + dev = __dev_get_by_index(net,
>> + rtnh->rtnh_ifindex);
>> + if (!dev)
>> + return -EINVAL;
>> +
>> + /* Determine space required */
>> + len = rtnl_parse_encap(dev, nla, NULL);
>> + if (len < 0)
>> + return len;
>> +
>> + total_encap_len += len;
>> + }
>> +
>> + rtnh = rtnh_next(rtnh, &remaining);
>> + }
>> + } else {
>> + if (cfg->fc_encap) {
>> + struct net_device *dev;
>> + int len;
>> +
>> + dev = __dev_get_by_index(net, cfg->fc_oif);
>> + if (!dev)
>> + return -EINVAL;
>> +
>> + /* Determine space required */
>> + len = rtnl_parse_encap(dev, cfg->fc_encap, NULL);
>> + if (len < 0)
>> + return len;
>> +
>> + total_encap_len += len;
>> + }
>> + }
>> +
>> + return total_encap_len;
>> +}
> we could avoid parsing and finding this device twice, if fib_nh just
> held a pointer to the encap_info
> (or tunnel info) ?. And the encap_info/tun_info could be refcounted and
> shared between
> nexthops ?. In my implementation i have just a pointer to parsed encap
> state
> in fib_nh
Right - I took the approach here to avoid any extra memory use if encap
isn't in use for the nexthop/route, and to avoid any potential
performance penalty caused by extra cache misses. It would certainly
make things simpler if those weren't concerns. I'd appreciate input from
the community on this.
>> @@ -1403,6 +1418,15 @@ static void rt_set_nexthop(struct rtable *rt,
>> __be32 daddr,
>> #ifdef CONFIG_IP_ROUTE_CLASSID
>> rt->dst.tclassid = nh->nh_tclassid;
>> #endif
>> +
>> + nh_encap = fib_get_nh_encap(fi, nh);
>> + if (unlikely(nh_encap)) {
>> + rt->rt_encap = kmemdup(nh_encap, nh->nh_encap_len,
>> + GFP_KERNEL);
>> + if (rt->rt_encap)
>> + rt->rt_encap_len = nh->nh_encap_len;
>> + }
>> +
>
> And..., you could make the rtable point to the same encap info.
Ack.
Thanks,
Rob
next prev parent reply other threads:[~2015-06-02 16:36 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-01 16:46 [RFC net-next 0/3] IP imposition of per-nh MPLS encap Robert Shearman
2015-06-01 16:46 ` [RFC net-next 1/3] net: infra for per-nexthop encap data Robert Shearman
2015-06-02 18:15 ` Eric W. Biederman
2015-06-01 16:46 ` [RFC net-next 2/3] ipv4: storing and retrieval of per-nexthop encap Robert Shearman
2015-06-02 16:01 ` roopa
2015-06-02 16:35 ` Robert Shearman [this message]
2015-06-01 16:46 ` [RFC net-next 3/3] mpls: new ipmpls device for encapsulating IP packets as mpls Robert Shearman
2015-06-02 16:15 ` roopa
2015-06-02 16:33 ` Robert Shearman
2015-06-02 18:57 ` roopa
2015-06-02 21:06 ` Robert Shearman
2015-06-03 18:43 ` Vivek Venkatraman
2015-06-04 18:46 ` Robert Shearman
2015-06-04 21:38 ` Vivek Venkatraman
2015-06-02 18:26 ` Eric W. Biederman
2015-06-02 21:37 ` Thomas Graf
2015-06-02 22:48 ` Eric W. Biederman
2015-06-02 23:23 ` Eric W. Biederman
2015-06-03 9:50 ` Thomas Graf
2015-06-02 0:06 ` [RFC net-next 0/3] IP imposition of per-nh MPLS encap Thomas Graf
2015-06-02 13:28 ` Robert Shearman
2015-06-02 21:43 ` Thomas Graf
2015-06-03 13:30 ` Robert Shearman
2015-06-02 15:31 ` roopa
2015-06-02 18:30 ` Eric W. Biederman
2015-06-02 18:39 ` roopa
2015-06-02 18:11 ` Eric W. Biederman
2015-06-02 20:57 ` Robert Shearman
2015-06-02 21:10 ` Eric W. Biederman
2015-06-02 22:15 ` Robert Shearman
2015-06-02 22:58 ` Eric W. Biederman
2015-06-04 15:12 ` Nicolas Dichtel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=556DDB46.9010701@brocade.com \
--to=rshearma@brocade.com \
--cc=ddutt@cumulusnetworks.com \
--cc=ebiederm@xmission.com \
--cc=netdev@vger.kernel.org \
--cc=roopa@cumulusnetworks.com \
--cc=tgraf@suug.ch \
--cc=vivek@cumulusnetworks.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.