From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Md. Islam" <mislam4@kent.edu>
Cc: Netdev <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>,
David Ahern <dsahern@gmail.com>,
Eric Dumazet <edumazet@google.com>,
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
Stephen Hemminger <stephen@networkplumber.org>,
makita.toshiaki@lab.ntt.co.jp, panda@hongo.wide.ad.jp,
yasuhiro.ohara@ntt.com, brouer@redhat.com,
John Fastabend <john.fastabend@gmail.com>
Subject: Re: [PATCH RFC net-next] net: Poptrie based FIB lookup
Date: Mon, 27 Aug 2018 17:33:34 +0200 [thread overview]
Message-ID: <20180827173334.16ff0673@redhat.com> (raw)
In-Reply-To: <CAFgPn1CHgSKRMy1RgutEh9sragUK7WtqO_COzzcqZEaugP8+7Q@mail.gmail.com>
On Sun, 26 Aug 2018 22:13:53 -0400 "Md. Islam" <mislam4@kent.edu> wrote:
> This patch implements Poptrie [1] based FIB lookup. It exhibits pretty
> impressive lookup performance compared to LC-trie. This poptrie
> implementation however somewhat deviates from the original
> implementation [2]. I tested this patch very rigorously with several
> FIB tables containing half a million routes. I got same result as
> LC-trie based fib_lookup().
It sounds really promising performance wise. The article [1] claim
lookup speeds up-to 240 Million lookups per second. That is a crazy
speed. This is 4.166 nanosec per lookup (1/240*1000), and their test
CPU is 3.9GHz, which gives them 16.25 CPU cycles for a lookup, with 3
insn per cycle, that gives them max 48 perfectly pipelined instructions
per lookup.
> Poptrie is intended to work in conjunction with LC-trie (not replace
> it). It is primarily designed to overcome many issues of TCAM based
> router [1]. [1] shows that the Poptrie can achieve very impressive
> lookup performance on CPU. This patch will mainly be used by XDP
> forwarding.
>
> 1. Asai, Hirochika, and Yasuhiro Ohara. "Poptrie: A compressed trie
> with population count for fast and scalable software IP routing table
> lookup." ACM SIGCOMM Computer Communication Review. 2015.
>
> 2. https://github.com/pixos/poptrie
>
> From c5e05ea66b06eb9313749bc8969b4c2798fcf96a Mon Sep 17 00:00:00 2001
> From: tamimcse <tamim@csebuet.org>
> Date: Sun, 26 Aug 2018 21:12:38 -0400
> Subject: [PATCH] Implented Poptrie
This above "commit-info" should not be part of the patch description.
> Signed-off-by: tamimcse <tamim@csebuet.org>
Use you real/full name here.
> ---
First of order of business: You need to conform to the kernels coding
standards!
https://www.kernel.org/doc/html/v4.18/process/coding-style.html
There is a script avail to check this called: scripts/checkpatch.pl
It summary says:
total: 139 errors, 238 warnings, 6 checks, 372 lines checked
(Not good, more error+warnings than lines...)
Please fix up those...
> include/net/ip_fib.h | 40 +++++++
> net/ipv4/Makefile | 2 +-
> net/ipv4/fib_poptrie.c | 295 +++++++++++++++++++++++++++++++++++++++++++++++++
> net/ipv4/fib_trie.c | 3 +
> 4 files changed, 339 insertions(+), 1 deletion(-)
> create mode 100644 net/ipv4/fib_poptrie.c
[...]
> diff --git a/net/ipv4/fib_poptrie.c b/net/ipv4/fib_poptrie.c
> new file mode 100644
> index 0000000..b3a88ab
> --- /dev/null
> +++ b/net/ipv4/fib_poptrie.c
> @@ -0,0 +1,295 @@
[....]
> +/*Insert a new node at index*/
> +static void insert_chield_node(struct poptrie_node *node,
> + char index)
> +{
You are misspelling "child" as "chield"
> + int i, j;
> + struct poptrie_node *arr;
> + int arr_size = (int)hweight64(node->nodevec);
> +
> + arr = kcalloc(arr_size + 1, sizeof(*arr), GFP_ATOMIC);
> + for (i = 0, j = 0; i < (arr_size + 1); i++) {
> + if (i != index && j < arr_size)
> + arr[i] = node->chield_nodes[j++];
> + }
> +
> + kfree(node->chield_nodes);
> + node->chield_nodes = arr;
> +}
[...]
> +/*We assume that pt->root is not NULL*/
> +void poptrie_lookup(struct poptrie *pt, __be32 dest, struct net_device **dev)
> +{
> + register u32 index;
> + register u64 bitmap, bitmask;
> + register unsigned long leaf_index;
> + register unsigned long node_index;
> + register struct poptrie_node *node = pt->root;
> + register u8 fib_index = pt->def_nh;
> + register u8 carry = 0;
> + register u8 carry_bit = 2;
Do you have performance data, that tell you that "register" is needed here?
> + while (1) {
> + /*Extract 6 bytes from dest */
> + if (likely(carry_bit != 8)) {
> + index = ((dest & 252) >> carry_bit) | carry;
> + carry = (dest & ((1 << carry_bit) - 1)) << (6 - carry_bit);
> + carry_bit = carry_bit + 2;
> + dest = dest >> 8;
> + } else {
> + index = carry;
> + carry = 0;
> + carry_bit = 2;
> + }
> +
> + /*Create a bitmap based on the the extracted value*/
> + bitmap = 1ULL << index;
> + bitmask = bitmap - 1;
> +
> + /*Find corresponding leaf*/
> + if (likely(node->vector & bitmap)) {
> + leaf_index = hweight64(node->leafvec & bitmask);
Just as help for reviewers, the popcnt instruction is here.
hweight64 == popcnt
> + if (!(node->leafvec & bitmap))
> + leaf_index--;
> + fib_index = node->leaves[leaf_index];
> + }
> +
> + /*Find corresponding node*/
> + if (likely(node->nodevec & bitmap)) {
> + node_index = hweight64(node->nodevec & bitmask);
And here.
> + node = &node->chield_nodes[node_index];
> + continue;
> + }
> +
> + *dev = get_fib(&pt->nhs, fib_index);
> + return;
> + }
> +}
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
prev parent reply other threads:[~2018-08-27 19:20 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-27 2:13 [PATCH RFC net-next] net: Poptrie based FIB lookup Md. Islam
2018-08-27 15:33 ` Jesper Dangaard Brouer [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180827173334.16ff0673@redhat.com \
--to=brouer@redhat.com \
--cc=davem@davemloft.net \
--cc=dsahern@gmail.com \
--cc=edumazet@google.com \
--cc=john.fastabend@gmail.com \
--cc=kuznet@ms2.inr.ac.ru \
--cc=makita.toshiaki@lab.ntt.co.jp \
--cc=mislam4@kent.edu \
--cc=netdev@vger.kernel.org \
--cc=panda@hongo.wide.ad.jp \
--cc=stephen@networkplumber.org \
--cc=yasuhiro.ohara@ntt.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.