From: Matthew Hall <mhall@mhcomputing.net>
To: Vladimir Medvedkin <medvedkinv@gmail.com>
Cc: "<dev@dpdk.org>" <dev@dpdk.org>
Subject: Re: rte_lpm with larger nexthops or another method?
Date: Mon, 22 Jun 2015 10:53:02 -0700
Message-ID: <20150622175302.GA15788@mhcomputing.net>
In-Reply-To: <CANDrEHnuEwX6WGBFbMiB0C38m6axMRiAe3z=9OcjsZkGbPZSSA@mail.gmail.com>
That's a lot better indeed! 16 million CIDR blocks would be a huge improvement.
Do you happen to know whether something similar is possible for rte_lpm6?
Matthew.
On Mon, Jun 22, 2015 at 01:11:14PM +0300, Vladimir Medvedkin wrote:
> Hi Matthew,
>
> I was just recently thinking about extending next_hop. For IPv4 we can do
> something like:
> struct rte_lpm_tbl24_entry {
>         /* Stores next hop or group index (i.e. gindex) into tbl8. */
>         union {
>                 uint32_t next_hop    :24;
>                 uint32_t tbl8_gindex :24;
>         } __attribute__((__packed__));
>         /* Using the remaining 8 bits to store 3 values. */
>         uint32_t valid     :1; /**< Validation flag. */
>         uint32_t ext_entry :1; /**< External entry. */
>         uint32_t depth     :6; /**< Rule depth. */
> };
> So we have 24 bits for next_hop.
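To make the 24-bit layout concrete, here is a minimal sketch that packs the same four fields into a plain uint32_t with shifts and masks instead of bitfields, since bitfield layout is implementation-defined. The bit positions and all names (entry_pack, NH_MASK, and so on) are assumptions made for illustration; they are not part of rte_lpm and the proposal above does not fix them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define NH_BITS     24u
#define NH_MASK     ((1u << NH_BITS) - 1u) /* low 24 bits: next hop or tbl8 gindex */
#define VALID_BIT   (1u << 24)             /* validation flag */
#define EXT_BIT     (1u << 25)             /* external entry flag */
#define DEPTH_SHIFT 26u
#define DEPTH_MASK  0x3Fu                  /* 6 bits: rule depth */

/* Pack the four fields into one 32-bit entry; oversized values are masked. */
static uint32_t entry_pack(uint32_t next_hop, int valid, int ext, uint32_t depth)
{
    return (next_hop & NH_MASK)
         | (valid ? VALID_BIT : 0u)
         | (ext ? EXT_BIT : 0u)
         | ((depth & DEPTH_MASK) << DEPTH_SHIFT);
}

static uint32_t entry_next_hop(uint32_t e) { return e & NH_MASK; }
static int      entry_valid(uint32_t e)    { return (e & VALID_BIT) != 0; }
static int      entry_ext(uint32_t e)      { return (e & EXT_BIT) != 0; }
static uint32_t entry_depth(uint32_t e)    { return (e >> DEPTH_SHIFT) & DEPTH_MASK; }
```

For example, entry_pack(0xABCDEF, 1, 0, 24) round-trips through the accessors unchanged. A 24-bit field can hold 2^24 = 16,777,216 distinct next hops, against 256 for an 8-bit one, and the entry still fits in 4 bytes.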
>
> 2015-06-22 5:29 GMT+03:00 Matthew Hall <mhall@mhcomputing.net>:
>
> > Hello,
> >
> > I have gone out on the internet for days looking at a bunch of different
> > radix tree implementations to see if I could figure a way to implement my
> > own tree, just to work around the really low 255 CIDR block limitation in
> > librte_lpm. Unfortunately every single one I could find falls into one of
> > these two annoying categories:
> >
> > 1) bloated with a lot of irrelevant kernel code I don't care about
> > (especially the Linux version but also the BSD one, which also makes the
> > odd assumption that every address object stores its length in byte 0 of the
> > address struct). These are hard to convert into something that plays nice
> > with raw packet data.
> >
> > 2) seemingly very simple code, which breaks horribly if you try to add
> > IPv6 support (such as the radix tree from University of Michigan / LLVM
> > compiler benchmark suite, and the one from the old unmaintained mrt daemon,
> > which includes a bizarre custom reference-counted memory manager that is
> > very convoluted). These are easy to set up, but cause a lot of weird
> > segfaults which I am having a difficult time debugging.
> >
> > So it seems like I am going nowhere with this approach. Instead, I'd like
> > to know, what would I need to do to add this support to my local copy of
> > librte_lpm? Let's assume for the sake of this discussion, that I don't care
> > one iota about any performance cost, and I am happy if I need to prefetch
> > two cachelines instead of just one (which I recall from a past thread is
> > why librte_lpm has such a low nexthop limit to start with).
> >
> > Failing that, does anybody have a known good userspace version of any of
> > these sorts of items:
> >
> > 1) Hash Based FIB (forwarding information base),
> > 2) Tree Based FIB,
> > 3) Patricia trie (which does not break horribly on IPv6 or make bad
> > assumptions about data format besides uint8_t* and length),
> > 4) Crit-Bit tree,
> > 5) any other good way of taking IPv4 and IPv6 and finding the longest
> > prefix match against a table of pre-loaded CIDR blocks?
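As a point of reference for the list above, a longest prefix match over raw bytes can be sketched with an uncompressed binary trie that only assumes a uint8_t pointer and a bit length, so the same code handles IPv4 (32 bits) and IPv6 (128 bits). This is a minimal, unoptimized illustration, not a Patricia or crit-bit tree: it uses one node per prefix bit, has no path compression, and omits delete and free. All names (lpm_node, lpm_add, lpm_lookup) are hypothetical, and it relies on calloc producing NULL pointers, which holds on common platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One node per prefix bit; child[0]/child[1] follow the next address bit. */
struct lpm_node {
    struct lpm_node *child[2];
    int has_value;      /* non-zero if a prefix ends at this node */
    uint32_t next_hop;
};

/* Return bit i of addr, where bit 0 is the most significant bit of byte 0. */
static int addr_bit(const uint8_t *addr, unsigned i)
{
    return (addr[i / 8] >> (7 - (i % 8))) & 1;
}

static struct lpm_node *lpm_new(void)
{
    return calloc(1, sizeof(struct lpm_node));
}

/* Insert prefix addr/depth; returns 0 on success, -1 on allocation failure. */
static int lpm_add(struct lpm_node *root, const uint8_t *addr,
                   unsigned depth, uint32_t next_hop)
{
    struct lpm_node *n = root;
    for (unsigned i = 0; i < depth; i++) {
        int b = addr_bit(addr, i);
        if (n->child[b] == NULL) {
            n->child[b] = lpm_new();
            if (n->child[b] == NULL)
                return -1;
        }
        n = n->child[b];
    }
    n->has_value = 1;
    n->next_hop = next_hop;
    return 0;
}

/* Walk at most addr_bits bits, remembering the deepest prefix seen.
 * Returns 1 and sets *next_hop on a match, 0 otherwise. */
static int lpm_lookup(const struct lpm_node *root, const uint8_t *addr,
                      unsigned addr_bits, uint32_t *next_hop)
{
    const struct lpm_node *n = root;
    int found = 0;
    for (unsigned i = 0; ; i++) {
        if (n->has_value) {
            *next_hop = n->next_hop;
            found = 1;
        }
        if (i == addr_bits)
            break;
        n = n->child[addr_bit(addr, i)];
        if (n == NULL)
            break;
    }
    return found;
}
```

In this sketch IPv4 and IPv6 prefixes should live in separate roots, since both are keyed only by their leading bits. Lookup cost is one pointer dereference per address bit, which is far slower than the tbl24/tbl8 scheme in librte_lpm but places no limit on the number of next hops.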
> >
> > I am really pulling out my hair trying to find a way to do something which
> > doesn't seem like it should have to be this difficult. I must be missing
> > a more obvious way to handle this.
> >
> > Thanks,
> > Matthew