public inbox for dev@dpdk.org
From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "rjarry@redhat.com" <rjarry@redhat.com>,
	"nsaxena16@gmail.com" <nsaxena16@gmail.com>,
	"mb@smartsharesystems.com" <mb@smartsharesystems.com>,
	 "adwivedi@marvell.com" <adwivedi@marvell.com>,
	"jerinjacobk@gmail.com" <jerinjacobk@gmail.com>,
	Maxime Leroy <maxime@leroys.fr>
Subject: RE: [RFC PATCH 1/4] fib: add multi-VRF support
Date: Thu, 26 Mar 2026 10:13:31 +0000	[thread overview]
Message-ID: <11250ee33c514310aa034c0f7ae0d8e5@huawei.com> (raw)
In-Reply-To: <688998df-fdeb-4748-8821-6cc0ed49ffd0@intel.com>



> >>>> Add VRF (Virtual Routing and Forwarding) support to the IPv4
> >>>> FIB library, allowing multiple independent routing tables
> >>>> within a single FIB instance.
> >>>>
> >>>> Introduce max_vrfs and vrf_default_nh fields in rte_fib_conf
> >>>> to configure the number of VRFs and per-VRF default nexthops.
> >>> Thanks Vladimir, allowing multiple VRFs per same LPM table will
> >>> definitely be a useful thing to have.
> >>> Though, I have the same concern as Maxime:
> >>> memory requirements are just overwhelming.
> >>> Stupid q - why not just store a pointer to a vector of next-hops
> >>> within the table entry?
> >> Do I understand correctly: a vector with max_number_of_vrfs entries,
> >> using the vrf id to address a nexthop?
> > Yes.
> 
> Here I can see 2 problems:
> 
> 1. tbl entries must be the size of a pointer, so no way to use smaller sizes

Yes, but as we are talking about storing nexthops for multiple VRFs anyway,
I don't think it is a big deal.

> 2. those vectors will be sparsely populated and, depending on the
> runtime configuration, may consume a lot of memory too (as Robin
> mentioned they may have 1024 VRFs)

Yes, each VRF vector can become really sparse, and we'd waste a lot of memory.
If that's an issue, we can probably think about something smarter
than a simple flat array indexed by vrf-id: something like a 2-level B-tree or so.
The main positives that I see in that approach:
- low extra overhead at lookup - one or two extra pointer dereferences;
- it allows the CP to allocate/free space for each such vector separately,
  so we don't need to pre-allocate memory for the max possible entries at startup.
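To make the idea concrete, here is a rough sketch of such a 2-level structure
(all names and sizes are made up for illustration, this is not actual rte_fib code):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of the "pointer to a per-VRF next-hop vector" idea as a
 * two-level directory indexed by a 16-bit vrf_id, so the control plane
 * can allocate second-level blocks on demand instead of one huge flat
 * array per table entry. */

#define VRF_L1_BITS 6                    /* 64 first-level slots */
#define VRF_L2_BITS 10                   /* 1024 VRFs per block (example) */
#define VRF_L2_SIZE (1u << VRF_L2_BITS)

struct vrf_nh_dir {
	uint32_t *l2[1u << VRF_L1_BITS];     /* lazily allocated blocks */
};

/* Data-plane lookup: one extra pointer dereference per level. */
static inline uint32_t
vrf_nh_get(const struct vrf_nh_dir *d, uint16_t vrf_id, uint32_t def_nh)
{
	const uint32_t *blk = d->l2[vrf_id >> VRF_L2_BITS];

	return blk != NULL ? blk[vrf_id & (VRF_L2_SIZE - 1)] : def_nh;
}

/* Control-plane update: allocate a block only when some VRF in its
 * range first gets a route; unused blocks cost nothing. */
static int
vrf_nh_set(struct vrf_nh_dir *d, uint16_t vrf_id, uint32_t nh,
	uint32_t def_nh)
{
	uint32_t **slot = &d->l2[vrf_id >> VRF_L2_BITS];

	if (*slot == NULL) {
		*slot = malloc(VRF_L2_SIZE * sizeof(uint32_t));
		if (*slot == NULL)
			return -1;
		for (uint32_t i = 0; i < VRF_L2_SIZE; i++)
			(*slot)[i] = def_nh;
	}
	(*slot)[vrf_id & (VRF_L2_SIZE - 1)] = nh;
	return 0;
}
```

With a user-supplied alloc/free for the second-level blocks this would keep
startup memory proportional to the VRFs actually in use.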

> >
> >> Yes, this may work.
> >> But, if we are going to do an extra memory access, I'd rather
> >> maintain an internal hash table with 5-byte keys {24_bits_from_LPM,
> >> 16_bits_vrf_id} to retrieve a nexthop.
> > Hmm... and what to do with entries in tbl8, I mean what will be the key for
> them?
> > Or you don't plan to put entries from tbl8 to that hash table?
> 
> The idea is to have a single LPM struct with a joined superset of all
> prefixes existing in all VRFs. Each prefix in this LPM struct has its
> own unique "nexthop", which is not the final next hop, but an
> intermediate metadata value identifying this unique prefix. Then a
> second search, with a key containing this intermediate metadata +
> vrf_id, is performed in some exact match database such as a hash table.
> This approach is the most memory friendly, since there is only one LPM
> data struct (which scales well with the number of prefixes it holds),
> with intermediate entries only 4b long.
> On the other hand, it requires an extra search, so lookup will be slower.
> Also, some current LPM optimizations, like tbl8 collapsing when all tbl8
> entries have the same value, will be gone.

Yes, and yes :)
Yes, it would help to save memory, and yes, lookup will most likely be slower.
The other thing I consider a possible drawback here: with the current rte_hash
implementation we still need to allocate space for all possible max entries at startup.
But that's not new in DPDK, and in most cases it is considered an acceptable trade-off.
Overall, it seems like a possible approach to me; I suppose the main question is
what the price of that extra hash lookup will be.
Again, there is a bulk version of hash lookup, and in theory it could be
improved further (an AVX-512 version on x86?).
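For reference, the two-stage scheme could look roughly like the sketch below.
A trivial open-addressing table stands in for rte_hash so the example is
self-contained; all names, sizes and the key layout are illustrative only:

```c
#include <stdint.h>

/* Stage 1 (not shown): LPM lookup returns an intermediate 32-bit
 * prefix id. Stage 2: exact-match lookup on {prefix_id, vrf_id}
 * yields the real next hop. Prefix ids are assumed nonzero here,
 * so key 0 can mark an empty slot. */

#define EM_SIZE 1024  /* power of two, toy fixed size */

struct em_entry {
	uint64_t key;     /* (prefix_id << 16) | vrf_id; 0 = empty */
	uint32_t nh;
};

struct em_table {
	struct em_entry e[EM_SIZE];
};

static inline uint64_t
em_key(uint32_t prefix_id, uint16_t vrf_id)
{
	return ((uint64_t)prefix_id << 16) | vrf_id;
}

static int
em_add(struct em_table *t, uint32_t prefix_id, uint16_t vrf_id, uint32_t nh)
{
	uint64_t k = em_key(prefix_id, vrf_id);

	for (uint32_t i = 0; i < EM_SIZE; i++) {
		struct em_entry *e = &t->e[(k + i) & (EM_SIZE - 1)];

		if (e->key == 0 || e->key == k) {
			e->key = k;
			e->nh = nh;
			return 0;
		}
	}
	return -1; /* table full */
}

static int
em_lookup(const struct em_table *t, uint32_t prefix_id, uint16_t vrf_id,
	uint32_t *nh)
{
	uint64_t k = em_key(prefix_id, vrf_id);

	for (uint32_t i = 0; i < EM_SIZE; i++) {
		const struct em_entry *e = &t->e[(k + i) & (EM_SIZE - 1)];

		if (e->key == k) {
			*nh = e->nh;
			return 0;
		}
		if (e->key == 0)
			break;
	}
	return -1; /* miss: caller falls back to the per-VRF default nh */
}
```

The same intermediate prefix id is shared by every VRF that carries that
prefix, which is where the memory saving comes from; the cost is the second
probe on the lookup path.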

> 
> >
> >>> And we can provide to the user with ability to specify custom
> >>> alloc/free function for these vectors.
> >>> That would help to avoid allocating huge chunks of memory at startup.
> >>> I understand that it will be one extra memory dereference,
> >>> but probably it will be not that critical in terms of performance .
> >>> Again for bulk function  we might be able to pipeline lookups and
> >>> de-references and hide that extra load latency.
> >>>
> >>>> Add four new experimental APIs:
> >>>> - rte_fib_vrf_add() and rte_fib_vrf_delete() to manage routes
> >>>>     per VRF
> >>>> - rte_fib_vrf_lookup_bulk() for multi-VRF bulk lookups
> >>>> - rte_fib_vrf_get_rib() to retrieve a per-VRF RIB handle
> >>>>
> >>>> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> >>>> ---
> >>>>    lib/fib/dir24_8.c        | 241 ++++++++++++++++------
> >>>>    lib/fib/dir24_8.h        | 255 ++++++++++++++++--------
> >>>>    lib/fib/dir24_8_avx512.c | 420 +++++++++++++++++++++++++++++++--------
> >>>>    lib/fib/dir24_8_avx512.h |  80 +++++++-
> >>>>    lib/fib/rte_fib.c        | 158 ++++++++++++---
> >>>>    lib/fib/rte_fib.h        |  94 ++++++++-
> >>>>    6 files changed, 988 insertions(+), 260 deletions(-)
> >>>>
> >> <snip>
> >>
> >> --
> >> Regards,
> >> Vladimir
> >>
> --
> Regards,
> Vladimir
> 



Thread overview: 33+ messages
2026-03-22 15:42 [RFC PATCH 0/4] VRF support in FIB library Vladimir Medvedkin
2026-03-22 15:42 ` [RFC PATCH 1/4] fib: add multi-VRF support Vladimir Medvedkin
2026-03-23 15:48   ` Konstantin Ananyev
2026-03-23 19:06     ` Medvedkin, Vladimir
2026-03-23 22:22       ` Konstantin Ananyev
2026-03-25 14:09         ` Medvedkin, Vladimir
2026-03-26 10:13           ` Konstantin Ananyev [this message]
2026-03-27 18:32             ` Medvedkin, Vladimir
2026-03-22 15:42 ` [RFC PATCH 2/4] fib: add VRF functional and unit tests Vladimir Medvedkin
2026-03-22 16:40   ` Stephen Hemminger
2026-03-22 16:41   ` Stephen Hemminger
2026-03-22 15:42 ` [RFC PATCH 3/4] fib6: add multi-VRF support Vladimir Medvedkin
2026-03-22 15:42 ` [RFC PATCH 4/4] fib6: add VRF functional and unit tests Vladimir Medvedkin
2026-03-22 16:45   ` Stephen Hemminger
2026-03-22 16:43 ` [RFC PATCH 0/4] VRF support in FIB library Stephen Hemminger
2026-03-23  9:01   ` Morten Brørup
2026-03-23 11:32     ` Medvedkin, Vladimir
2026-03-23 11:16   ` Medvedkin, Vladimir
2026-03-23  9:54 ` Robin Jarry
2026-03-23 11:34   ` Medvedkin, Vladimir
2026-03-23 11:27 ` Maxime Leroy
2026-03-23 12:49   ` Medvedkin, Vladimir
2026-03-23 14:53     ` Maxime Leroy
2026-03-23 15:08       ` Robin Jarry
2026-03-23 15:27         ` Morten Brørup
2026-03-23 18:52           ` Medvedkin, Vladimir
2026-03-23 18:42       ` Medvedkin, Vladimir
2026-03-24  9:19         ` Maxime Leroy
2026-03-25 15:56           ` Medvedkin, Vladimir
2026-03-25 21:43             ` Maxime Leroy
2026-03-27 18:27               ` Medvedkin, Vladimir
2026-04-02 16:51                 ` Maxime Leroy
2026-03-23 19:05 ` Stephen Hemminger
