Re: [RFC PATCH 0/4] VRF support in FIB library

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Stephen Hemminger <stephen@networkplumber.org>
To: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Cc: dev@dpdk.org, rjarry@redhat.com, nsaxena16@gmail.com,
	mb@smartsharesystems.com, adwivedi@marvell.com,
	jerinjacobk@gmail.com
Subject: Re: [RFC PATCH 0/4] VRF support in FIB library
Date: Mon, 23 Mar 2026 12:05:40 -0700	[thread overview]
Message-ID: <20260323120540.2d247ece@phoenix.local> (raw)
In-Reply-To: <20260322154215.3686528-1-vladimir.medvedkin@intel.com>

On Sun, 22 Mar 2026 15:42:11 +0000
Vladimir Medvedkin <vladimir.medvedkin@intel.com> wrote:

> This series adds multi-VRF support to both IPv4 and IPv6 FIB paths by
> allowing a single FIB instance to host multiple isolated routing domains.
> 
> Currently FIB instance represents one routing instance. For workloads that
> need multiple VRFs, the only option is to create multiple FIB objects. In a
> burst oriented datapath, packets in the same batch can belong to different VRFs, so
> the application either does per-packet lookup in different FIB instances or
> regroups packets by VRF before lookup. Both approaches are expensive.
> 
> To remove that cost, this series keeps all VRFs inside one FIB instance and
> extends lookup input with per-packet VRF IDs.
> 
> The design follows the existing fast-path structure for both families. IPv4 and
> IPv6 use multi-ary trees with a 2^24 associativity on a first level (tbl24). The
> first-level table scales per configured VRF. This increases memory usage, but
> keeps performance and lookup complexity on par with non-VRF implementation.
> 
> Vladimir Medvedkin (4):
>   fib: add multi-VRF support
>   fib: add VRF functional and unit tests
>   fib6: add multi-VRF support
>   fib6: add VRF functional and unit tests
> 
>  app/test-fib/main.c      | 257 ++++++++++++++++++++++--
>  app/test/test_fib.c      | 298 +++++++++++++++++++++++++++
>  app/test/test_fib6.c     | 319 ++++++++++++++++++++++++++++-
>  lib/fib/dir24_8.c        | 241 ++++++++++++++++------
>  lib/fib/dir24_8.h        | 255 ++++++++++++++++--------
>  lib/fib/dir24_8_avx512.c | 420 +++++++++++++++++++++++++++++++--------
>  lib/fib/dir24_8_avx512.h |  80 +++++++-
>  lib/fib/rte_fib.c        | 158 ++++++++++++---
>  lib/fib/rte_fib.h        |  94 ++++++++-
>  lib/fib/rte_fib6.c       | 166 +++++++++++++---
>  lib/fib/rte_fib6.h       |  88 +++++++-
>  lib/fib/trie.c           | 158 +++++++++++----
>  lib/fib/trie.h           |  51 +++--
>  lib/fib/trie_avx512.c    | 225 +++++++++++++++++++--
>  lib/fib/trie_avx512.h    |  39 +++-
>  15 files changed, 2453 insertions(+), 396 deletions(-)
> 

AI review found several things

Review: [RFC PATCH 1/4] fib: add multi-VRF support
       [RFC PATCH 2/4] fib: add VRF functional and unit tests
       [RFC PATCH 3/4] fib6: add multi-VRF support
       [RFC PATCH 4/4] fib6: add VRF functional and unit tests

Overall this is a well-structured RFC that adds multi-VRF support
to both the IPv4 and IPv6 FIB libraries with AVX512-optimized
lookup paths and comprehensive test coverage. There is one
significant correctness bug in the AVX512 gather paths, several
design points worth discussing, and some minor issues.

Patch 1/4 - fib: add multi-VRF support

Error: Signed overflow in AVX512 32-bit gather for VRF IDs >= 128

The VRF_SCALE_SMALL path (num_vrfs in [2, 255]) computes the
tbl24 index in 32-bit arithmetic as (vrf_id << 24) + (ip >> 8).
For vrf_id >= 128, vrf_id << 24 sets bit 31, making the result
negative when interpreted as int32. The _mm512_i32gather_epi32
and _mm512_i32gather_epi64 intrinsics sign-extend 32-bit indices
to compute byte offsets, so a negative index produces a read
before the start of tbl24 -- a buffer underflow.

Example: vrf_id=128, ip=0 gives index 0x08000000 << 24 =
0x80000000 = -2147483648 as signed int32.

This affects all nexthop sizes in both dir24_8_avx512.c and
trie_avx512.c.

Fix: Either lower the VRF_SCALE_SMALL ceiling from 256 to 128
(so VRFs 128-255 use the 64-bit path), or switch to unsigned
gather by pre-scaling the indices into byte offsets and using
scale=1 with unsigned arithmetic.

In dir24_8_avx512.c get_vector_fn():

  if (dp->num_vrfs >= 256) {

should be:

  if (dp->num_vrfs >= 128) {

Same change needed in trie.c get_vector_fn().

Warning: ABI break -- public function pointer typedefs changed

rte_fib_lookup_fn_t and rte_fib_modify_fn_t in rte_fib.h (and
the corresponding fib6 typedefs in rte_fib6.h) have new
parameters (vrf_ids/vrf_id). These are installed header typedefs
used by applications setting custom lookup functions via
rte_fib_select_lookup(). Changing them is an ABI break that needs
deprecation notice or ABI versioning.

Similarly, adding max_vrfs and vrf_default_nh to rte_fib_conf and
rte_fib6_conf changes the struct layout.

Since this is RFC, this is expected, but it will need to be
addressed before non-RFC submission.

Warning: No release notes for new experimental APIs

Eight new experimental APIs are added (rte_fib_vrf_add,
rte_fib_vrf_delete, rte_fib_vrf_lookup_bulk, rte_fib_vrf_get_rib
plus the fib6 equivalents). These need entries in
doc/guides/rel_notes/.

Warning: No testpmd hooks for new APIs

Per DPDK policy, new APIs should have hooks in app/testpmd.

Patch 2/4 - fib: add VRF functional and unit tests

Warning: Resource leak in run_v4() -- conf.vrf_default_nh not freed

In app/test-fib/main.c run_v4(), conf.vrf_default_nh is allocated
via rte_malloc() but never freed on any path (success or failure).
Same issue in run_v6() in patch 4/4.

Patch 3/4 - fib6: add multi-VRF support

Error: Same signed-overflow AVX512 gather bug as patch 1/4

The trie_avx512.c VRF_SCALE_SMALL path has the identical issue:
_mm512_slli_epi32(vrf32, 24) produces a negative signed index
for vrf_id >= 128, causing the 32-bit gather to read from a
negative offset.

In trie.c get_vector_fn():

  if (dp->num_vrfs >= 256) {

should be:

  if (dp->num_vrfs >= 128) {

Warning: Potential 32-bit truncation in trie helper functions

build_common_root() computes idx_tbl as uint64_t but passes it
to get_tbl_val_by_idx() and get_tbl_p_by_idx(). If those helpers
take uint32_t index parameters (the original code used 32-bit
indices), the upper bits will be silently truncated for large VRF
counts. The helpers should be widened to accept uint64_t, or
confirm they already do.

In practice, large VRF counts (hundreds+) with IPv6 trie tbl24
would require terabytes of memory, so this is unlikely to
manifest, but it is a latent correctness issue.

     prev parent reply	other threads:[~2026-03-23 19:05 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-22 15:42 [RFC PATCH 0/4] VRF support in FIB library Vladimir Medvedkin
2026-03-22 15:42 ` [RFC PATCH 1/4] fib: add multi-VRF support Vladimir Medvedkin
2026-03-23 15:48   ` Konstantin Ananyev
2026-03-23 19:06     ` Medvedkin, Vladimir
2026-03-23 22:22       ` Konstantin Ananyev
2026-03-25 14:09         ` Medvedkin, Vladimir
2026-03-26 10:13           ` Konstantin Ananyev
2026-03-27 18:32             ` Medvedkin, Vladimir
2026-03-22 15:42 ` [RFC PATCH 2/4] fib: add VRF functional and unit tests Vladimir Medvedkin
2026-03-22 16:40   ` Stephen Hemminger
2026-03-22 16:41   ` Stephen Hemminger
2026-03-22 15:42 ` [RFC PATCH 3/4] fib6: add multi-VRF support Vladimir Medvedkin
2026-03-22 15:42 ` [RFC PATCH 4/4] fib6: add VRF functional and unit tests Vladimir Medvedkin
2026-03-22 16:45   ` Stephen Hemminger
2026-03-22 16:43 ` [RFC PATCH 0/4] VRF support in FIB library Stephen Hemminger
2026-03-23  9:01   ` Morten Brørup
2026-03-23 11:32     ` Medvedkin, Vladimir
2026-03-23 11:16   ` Medvedkin, Vladimir
2026-03-23  9:54 ` Robin Jarry
2026-03-23 11:34   ` Medvedkin, Vladimir
2026-03-23 11:27 ` Maxime Leroy
2026-03-23 12:49   ` Medvedkin, Vladimir
2026-03-23 14:53     ` Maxime Leroy
2026-03-23 15:08       ` Robin Jarry
2026-03-23 15:27         ` Morten Brørup
2026-03-23 18:52           ` Medvedkin, Vladimir
2026-03-23 18:42       ` Medvedkin, Vladimir
2026-03-24  9:19         ` Maxime Leroy
2026-03-25 15:56           ` Medvedkin, Vladimir
2026-03-25 21:43             ` Maxime Leroy
2026-03-27 18:27               ` Medvedkin, Vladimir
2026-04-02 16:51                 ` Maxime Leroy
2026-03-23 19:05 ` Stephen Hemminger [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260323120540.2d247ece@phoenix.local \
    --to=stephen@networkplumber.org \
    --cc=adwivedi@marvell.com \
    --cc=dev@dpdk.org \
    --cc=jerinjacobk@gmail.com \
    --cc=mb@smartsharesystems.com \
    --cc=nsaxena16@gmail.com \
    --cc=rjarry@redhat.com \
    --cc=vladimir.medvedkin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.