From: "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com>
To: Maxime Leroy <maxime@leroys.fr>
Cc: <dev@dpdk.org>, <rjarry@redhat.com>, <nsaxena16@gmail.com>,
<mb@smartsharesystems.com>, <adwivedi@marvell.com>,
<jerinjacobk@gmail.com>
Subject: Re: [RFC PATCH 0/4] VRF support in FIB library
Date: Mon, 23 Mar 2026 12:49:11 +0000
Message-ID: <22bebf4a-3801-45e9-8ac5-726cb6c89721@intel.com>
In-Reply-To: <CAHHRULVmnHih3BqoCnMv-FEj=aC3LehgCFaVhh=0eQ-SLsovXg@mail.gmail.com>
Hi Maxime,
On 3/23/2026 11:27 AM, Maxime Leroy wrote:
> Hi Vladimir,
>
>
> On Sun, Mar 22, 2026 at 4:42 PM Vladimir Medvedkin
> <vladimir.medvedkin@intel.com> wrote:
>> This series adds multi-VRF support to both IPv4 and IPv6 FIB paths by
>> allowing a single FIB instance to host multiple isolated routing domains.
>>
>> Currently, a FIB instance represents one routing instance. For workloads that
>> need multiple VRFs, the only option is to create multiple FIB objects. In a
>> burst-oriented datapath, packets in the same batch can belong to different VRFs,
>> so the application either does per-packet lookups in different FIB instances or
>> regroups packets by VRF before lookup. Both approaches are expensive.
>>
>> To remove that cost, this series keeps all VRFs inside one FIB instance and
>> extends lookup input with per-packet VRF IDs.
>>
>> The design follows the existing fast-path structure for both families. IPv4 and
>> IPv6 use multi-ary trees with a 2^24-entry first-level table (tbl24). The
>> first-level table scales per configured VRF. This increases memory usage, but
>> keeps performance and lookup complexity on par with the non-VRF implementation.
>>
> Thanks for the RFC. Some thoughts below.
>
> Memory cost: the flat TBL24 replicates the entire table for every VRF
> (num_vrfs * 2^24 * nh_size). With 256 VRFs and 8B nexthops that is
> 32 GB for TBL24 alone. In grout we support up to 256 VRFs allocated
> on demand -- this approach forces the full cost upfront even if most
> VRFs are empty.
Yes, increased memory consumption is the trade-off. We make this
choice in DPDK quite often: pre-allocated mbufs, mempools and many
other things are allocated in advance to gain performance.
For FIB, I chose to replicate TBL24 per VRF for the same reason.
And, as Morten mentioned earlier, if memory is the priority, a table
instance per VRF allocated on demand is still supported.
The high memory cost stems from TBL24's design: for IPv4, it was
justified by the BGP filtering convention (no prefixes more specific
than /24 in BGPv4 full view), ensuring most lookups hit with just one
random memory access. For IPv6, we should likely switch to a 16-bit TRIE
scheme on all layers. For IPv4, alternative algorithms with smaller
footprints (like DXR or DIR16-8-8, as used in VPP) may be worth
exploring if BGP full view is not required for those VRFs.
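To put a number on that cost, here is a minimal sketch of the
footprint math (the real allocation adds alignment and metadata
overhead on top of this):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* flat per-VRF TBL24: num_vrfs * 2^24 entries of nh_size bytes */
static uint64_t
flat_tbl24_bytes(uint32_t num_vrfs, uint32_t nh_size)
{
	return (uint64_t)num_vrfs * (UINT64_C(1) << 24) * nh_size;
}

int main(void)
{
	/* your example: 256 VRFs, 8-byte entries -> 32 GiB */
	printf("%" PRIu64 " bytes\n", flat_tbl24_bytes(256, 8));
	return 0;
}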
>
> Per-packet VRF lookup: Rx bursts come from one port, thus one VRF.
> Mixed-VRF bulk lookups do not occur in practice. The three AVX512
> code paths add complexity for a scenario that does not exist, at
> least for a classic router. Am I missing a use-case?
That's not true; you're missing several well-established core use
cases that are at least two decades old (a sketch of the first one
follows the list):
- VLAN subinterface abstraction. Each subinterface may belong to a
separate VRF
- MPLS VPN
- Policy based routing
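To illustrate the first case: with VLAN subinterfaces mapped to VRFs,
a single Rx burst from one port is naturally mixed-VRF. A minimal
sketch, where vlan_to_vrf[], pkt_vlan_id(), pkt_dst_ip() and the
rte_fib_lookup_bulk_vrf() name are illustrative, not the RFC API:

#include <stdint.h>
#include <rte_fib.h>
#include <rte_mbuf.h>

#define MAX_BURST 64

/* hypothetical helpers: parse the VLAN ID / IPv4 dst of a packet */
uint16_t pkt_vlan_id(const struct rte_mbuf *m);
uint32_t pkt_dst_ip(const struct rte_mbuf *m);

static void
lookup_mixed_vrf_burst(struct rte_fib *fib, struct rte_mbuf **pkts,
		       uint16_t nb, const uint16_t *vlan_to_vrf)
{
	uint16_t vrf_ids[MAX_BURST];
	uint32_t ips[MAX_BURST];
	uint64_t next_hops[MAX_BURST];

	for (uint16_t i = 0; i < nb; i++) {
		/* each VLAN subinterface may belong to a separate VRF */
		vrf_ids[i] = vlan_to_vrf[pkt_vlan_id(pkts[i])];
		ips[i] = pkt_dst_ip(pkts[i]);
	}
	/* one bulk lookup over a burst that spans several VRFs;
	 * illustrative name for the per-packet-VRF lookup */
	rte_fib_lookup_bulk_vrf(fib, vrf_ids, ips, next_hops, nb);
}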
>
> I am not too familiar with DPDK FIB internals, but would it be
> possible to keep a separate TBL24 per VRF and only share the TBL8
> pool?
That is how it is implemented right now, with one note: the TBL24
tables are pre-allocated.
> Something like pre-allocating an array of max_vrfs TBL24
> pointers, allocating each TBL24 on demand at VRF add time,
and you are suggesting allocating TBL24 on demand, adding an extra
indirection layer. This will lead to lower performance, which I would
like to avoid.
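To spell out the difference on the hot path, a simplified sketch
(names illustrative):

#include <stdint.h>

/* flat pre-allocated layout: the VRF ID is folded into the tbl24
 * index, so the common case stays a single random memory access */
static inline uint64_t
lookup_flat(const uint64_t *tbl24, uint16_t vrf, uint32_t ip)
{
	return tbl24[((uint64_t)vrf << 24) | (ip >> 8)];
}

/* on-demand layout: a dependent load fetching the per-VRF table
 * pointer must complete before the entry itself can be read */
static inline uint64_t
lookup_on_demand(uint64_t *const *per_vrf_tbl24, uint16_t vrf, uint32_t ip)
{
	return per_vrf_tbl24[vrf][ip >> 8];
}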
> and
> having them all point into a shared TBL8 pool. The TBL8 index in
> TBL24 entries seems to already be global, so would that work without
> encoding changes?
>
> Going further: could the same idea extend to IPv6? The dir24_8 and
> trie seem to use the same TBL8 block format (256 entries, same
> (nh << 1) | ext_bit encoding, same size). Would unifying the TBL8
> allocator allow a single pool shared across IPv4, IPv6, and all
> VRFs? That could be a bigger win for /32-heavy and /128-heavy tables
> and maybe a good first step before multi-VRF.
So, you are suggesting merging IPv4 and IPv6 into a single unified FIB?
I'm not sure how this would be a bigger win; could you please
elaborate on this?
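For reference, the shared TBL8 entry encoding you describe, as a
minimal sketch (assuming a 4-byte entry; in the library the entry
width follows the configured nexthop size):

#include <stdint.h>

/* next hop in the upper bits, extension flag ("entry points into a
 * TBL8 group") in bit 0 -- the (nh << 1) | ext_bit encoding above */
static inline uint32_t entry_encode(uint32_t nh, uint32_t ext)
{
	return (nh << 1) | (ext & 1);
}

static inline uint32_t entry_nh(uint32_t e) { return e >> 1; }
static inline uint32_t entry_is_ext(uint32_t e) { return e & 1; }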
> Regards,
>
> Maxime Leroy
>
>> Vladimir Medvedkin (4):
>> fib: add multi-VRF support
>> fib: add VRF functional and unit tests
>> fib6: add multi-VRF support
>> fib6: add VRF functional and unit tests
>>
>> app/test-fib/main.c | 257 ++++++++++++++++++++++--
>> app/test/test_fib.c | 298 +++++++++++++++++++++++++++
>> app/test/test_fib6.c | 319 ++++++++++++++++++++++++++++-
>> lib/fib/dir24_8.c | 241 ++++++++++++++++------
>> lib/fib/dir24_8.h | 255 ++++++++++++++++--------
>> lib/fib/dir24_8_avx512.c | 420 +++++++++++++++++++++++++++++++--------
>> lib/fib/dir24_8_avx512.h | 80 +++++++-
>> lib/fib/rte_fib.c | 158 ++++++++++++---
>> lib/fib/rte_fib.h | 94 ++++++++-
>> lib/fib/rte_fib6.c | 166 +++++++++++++---
>> lib/fib/rte_fib6.h | 88 +++++++-
>> lib/fib/trie.c | 158 +++++++++++----
>> lib/fib/trie.h | 51 +++--
>> lib/fib/trie_avx512.c | 225 +++++++++++++++++++--
>> lib/fib/trie_avx512.h | 39 +++-
>> 15 files changed, 2453 insertions(+), 396 deletions(-)
>>
>> --
>> 2.43.0
>>
>
--
Regards,
Vladimir