DPDK-dev Archive on lore.kernel.org
* Re: [PATCH v6 00/23] net/sxe2: added Linkdata sxe2 ethernet driver
From: Stephen Hemminger @ 2026-06-24 16:38 UTC (permalink / raw)
  To: liujie5; +Cc: dev
In-Reply-To: <20260624020211.3687062-1-liujie5@linkdatatechnology.com>

On Wed, 24 Jun 2026 10:02:11 +0800
liujie5@linkdatatechnology.com wrote:

> From: Jie Liu <liujie5@linkdatatechnology.com>
> 
> This patch set implements core functionality for the SXE2 PMD,
> including basic driver framework, data path setup, and advanced
> offload features (VLAN, RSS,TM, PTP etc.).
> 
> V6:
> 	Refactored sxe2_ptype_tbl from adapter-indirection pattern (adapter->ptype_tbl[]) 
> 	to extern const direct-access pattern, matching txgbe PMD convention
> 
> 	All vector/SIMD Rx paths (SSE, AVX2, AVX512, NEON) index sxe2_ptype_tbl[] directly without local pointer indirection
> 
> 	remove flow_dup_pattern_mode devarg
> 
> Jie Liu (23):
>   net/sxe2: remove software statistics devargs
>   net/sxe2: add Rx framework and packet types callback
>   net/sxe2: support AVX512 vectorized path for Rx and Tx
>   net/sxe2: add AVX2 vector data path for Rx and Tx
>   net/sxe2: add link update callback
>   net/sxe2: support L2 filtering and MAC config
>   drivers: support RSS feature
>   net/sxe2: support TM hierarchy and shaping
>   net/sxe2: support IPsec inline protocol offload
>   net/sxe2: support statistics and multi-process
>   drivers: interrupt handling
>   net/sxe2: add NEON vec Rx/Tx burst functions
>   drivers: add support for VF representors
>   net/sxe2: add support for custom UDP tunnel ports
>   net/sxe2: support firmware version reading
>   net/sxe2: implement get monitor address
>   common/sxe2: add shared SFP module definitions
>   net/sxe2: support SFP module info and EEPROM access
>   net/sxe2: implement private dump info
>   net/sxe2: add mbuf validation in Tx debug mode
>   common/sxe2: add callback for memory event handling
>   net/sxe2: add private devargs parsing
>   net/sxe2: update sxe2 feature matrix docs
> 
>  doc/guides/nics/features/sxe2.ini          |   56 +
>  doc/guides/nics/sxe2.rst                   |  147 ++
>  drivers/common/sxe2/sxe2_common.c          |  156 ++
>  drivers/common/sxe2/sxe2_common.h          |    4 +
>  drivers/common/sxe2/sxe2_flow_public.h     |  633 +++++++
>  drivers/common/sxe2/sxe2_ioctl_chnl.c      |  178 +-
>  drivers/common/sxe2/sxe2_ioctl_chnl_func.h |   18 +
>  drivers/common/sxe2/sxe2_msg.h             |  118 ++
>  drivers/net/sxe2/meson.build               |   52 +
>  drivers/net/sxe2/sxe2_cmd_chnl.c           | 1587 +++++++++++++++-
>  drivers/net/sxe2/sxe2_cmd_chnl.h           |  139 ++
>  drivers/net/sxe2/sxe2_drv_cmd.h            |  523 +++++-
>  drivers/net/sxe2/sxe2_dump.c               |  300 +++
>  drivers/net/sxe2/sxe2_dump.h               |   12 +
>  drivers/net/sxe2/sxe2_ethdev.c             | 1468 ++++++++++++++-
>  drivers/net/sxe2/sxe2_ethdev.h             |  111 +-
>  drivers/net/sxe2/sxe2_ethdev_repr.c        |  609 ++++++
>  drivers/net/sxe2/sxe2_ethdev_repr.h        |   32 +
>  drivers/net/sxe2/sxe2_filter.c             |  895 +++++++++
>  drivers/net/sxe2/sxe2_filter.h             |  100 +
>  drivers/net/sxe2/sxe2_flow.c               | 1391 ++++++++++++++
>  drivers/net/sxe2/sxe2_flow.h               |   30 +
>  drivers/net/sxe2/sxe2_flow_define.h        |  144 ++
>  drivers/net/sxe2/sxe2_flow_parse_action.c  | 1182 ++++++++++++
>  drivers/net/sxe2/sxe2_flow_parse_action.h  |   23 +
>  drivers/net/sxe2/sxe2_flow_parse_engine.c  |  106 ++
>  drivers/net/sxe2/sxe2_flow_parse_engine.h  |   13 +
>  drivers/net/sxe2/sxe2_flow_parse_pattern.c | 1935 +++++++++++++++++++
>  drivers/net/sxe2/sxe2_flow_parse_pattern.h |   46 +
>  drivers/net/sxe2/sxe2_ipsec.c              | 1565 ++++++++++++++++
>  drivers/net/sxe2/sxe2_ipsec.h              |  254 +++
>  drivers/net/sxe2/sxe2_irq.c                | 1026 ++++++++++
>  drivers/net/sxe2/sxe2_irq.h                |   25 +
>  drivers/net/sxe2/sxe2_mac.c                |  530 ++++++
>  drivers/net/sxe2/sxe2_mac.h                |   84 +
>  drivers/net/sxe2/sxe2_mp.c                 |  414 +++++
>  drivers/net/sxe2/sxe2_mp.h                 |   67 +
>  drivers/net/sxe2/sxe2_queue.c              |   17 +-
>  drivers/net/sxe2/sxe2_queue.h              |   15 +-
>  drivers/net/sxe2/sxe2_rss.c                |  584 ++++++
>  drivers/net/sxe2/sxe2_rss.h                |   81 +
>  drivers/net/sxe2/sxe2_rx.c                 |   93 +-
>  drivers/net/sxe2/sxe2_rx.h                 |    2 +
>  drivers/net/sxe2/sxe2_security.c           |  335 ++++
>  drivers/net/sxe2/sxe2_security.h           |   77 +
>  drivers/net/sxe2/sxe2_stats.c              |  586 ++++++
>  drivers/net/sxe2/sxe2_stats.h              |   39 +
>  drivers/net/sxe2/sxe2_switchdev.c          |  332 ++++
>  drivers/net/sxe2/sxe2_switchdev.h          |   33 +
>  drivers/net/sxe2/sxe2_tm.c                 | 1151 ++++++++++++
>  drivers/net/sxe2/sxe2_tm.h                 |   76 +
>  drivers/net/sxe2/sxe2_tx.c                 |    7 +
>  drivers/net/sxe2/sxe2_txrx.c               | 1958 +++++++++++++++++++-
>  drivers/net/sxe2/sxe2_txrx.h               |    8 +
>  drivers/net/sxe2/sxe2_txrx_check_mbuf.c    |  595 ++++++
>  drivers/net/sxe2/sxe2_txrx_check_mbuf.h    |   38 +
>  drivers/net/sxe2/sxe2_txrx_poll.c          |  284 ++-
>  drivers/net/sxe2/sxe2_txrx_vec.c           |   46 +-
>  drivers/net/sxe2/sxe2_txrx_vec.h           |   38 +-
>  drivers/net/sxe2/sxe2_txrx_vec_avx2.c      |  747 ++++++++
>  drivers/net/sxe2/sxe2_txrx_vec_avx512.c    |  867 +++++++++
>  drivers/net/sxe2/sxe2_txrx_vec_common.h    |   54 +-
>  drivers/net/sxe2/sxe2_txrx_vec_neon.c      |  689 +++++++
>  drivers/net/sxe2/sxe2_txrx_vec_sse.c       |   38 +-
>  drivers/net/sxe2/sxe2_vsi.c                |  146 ++
>  drivers/net/sxe2/sxe2_vsi.h                |   12 +-
>  drivers/net/sxe2/sxe2vf_regs.h             |   85 +
>  67 files changed, 24733 insertions(+), 273 deletions(-)
>  create mode 100644 drivers/common/sxe2/sxe2_flow_public.h
>  create mode 100644 drivers/common/sxe2/sxe2_msg.h
>  create mode 100644 drivers/net/sxe2/sxe2_dump.c
>  create mode 100644 drivers/net/sxe2/sxe2_dump.h
>  create mode 100644 drivers/net/sxe2/sxe2_ethdev_repr.c
>  create mode 100644 drivers/net/sxe2/sxe2_ethdev_repr.h
>  create mode 100644 drivers/net/sxe2/sxe2_filter.c
>  create mode 100644 drivers/net/sxe2/sxe2_filter.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_define.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_action.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_action.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_engine.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_engine.h
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_pattern.c
>  create mode 100644 drivers/net/sxe2/sxe2_flow_parse_pattern.h
>  create mode 100644 drivers/net/sxe2/sxe2_ipsec.c
>  create mode 100644 drivers/net/sxe2/sxe2_ipsec.h
>  create mode 100644 drivers/net/sxe2/sxe2_irq.c
>  create mode 100644 drivers/net/sxe2/sxe2_mac.c
>  create mode 100644 drivers/net/sxe2/sxe2_mac.h
>  create mode 100644 drivers/net/sxe2/sxe2_mp.c
>  create mode 100644 drivers/net/sxe2/sxe2_mp.h
>  create mode 100644 drivers/net/sxe2/sxe2_rss.c
>  create mode 100644 drivers/net/sxe2/sxe2_rss.h
>  create mode 100644 drivers/net/sxe2/sxe2_security.c
>  create mode 100644 drivers/net/sxe2/sxe2_security.h
>  create mode 100644 drivers/net/sxe2/sxe2_stats.c
>  create mode 100644 drivers/net/sxe2/sxe2_stats.h
>  create mode 100644 drivers/net/sxe2/sxe2_switchdev.c
>  create mode 100644 drivers/net/sxe2/sxe2_switchdev.h
>  create mode 100644 drivers/net/sxe2/sxe2_tm.c
>  create mode 100644 drivers/net/sxe2/sxe2_tm.h
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_check_mbuf.c
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_check_mbuf.h
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_vec_avx2.c
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_vec_avx512.c
>  create mode 100644 drivers/net/sxe2/sxe2_txrx_vec_neon.c
>  create mode 100644 drivers/net/sxe2/sxe2vf_regs.h
> 

On the duplicate flow devarg: your observation was correct; mlx5 has a similar devarg option.
Sorry that I didn't nudge the AI more strongly in the right direction earlier.

To be clear:
  - sxe2 should use the same devarg name as mlx5 for this: "allow_duplicate_pattern"
  - the default should be 1
  - make sure the documentation matches the flag.

The patch set looks much better, but there is a bisection-breaking
patch ordering problem around patches 6 and 7.

After I pushed the AI review to follow that, the review was the following:
("I" below means Claude AI, not me, Stephen)

[PATCH v6 00/23] sxe2 driver feature additions

The flow-duplicate-pattern situation regressed in the wrong direction.

I reread mlx5's allow_duplicate_pattern in light of your "align to mlx5"
guidance and I was wrong to push for removal in v4 and v5. mlx5 has the same
devarg: name allow_duplicate_pattern, default 1, valid 0|1, and both values
expose a real hardware capability that rte_flow itself does not articulate
(value 0 -> EEXIST on duplicate; value 1 -> hardware queues duplicates as
shadow rules that activate when prior rules are deleted). That is not a
"standard API by boot flag" semantics violation as I had read it - it is a
hardware capability switch, and it is documented in doc/guides/nics/mlx5.rst
exactly the way sxe2's was. My v4-v5 push to drop it was based on a
misreading.
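
The two behaviours described above can be sketched with a toy rule table.
This is only an illustration of the semantics as stated in this review; every
name in it (toy_table, toy_rule_add, and so on) is hypothetical and none of it
is taken from the mlx5 or sxe2 sources. With the flag at 0 a second rule with
the same pattern is refused with EEXIST; with the flag at 1 it is queued as a
shadow rule and becomes the active one once the older rule is deleted.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_RULES 8

/* Hypothetical rule entry; an int stands in for a full match pattern. */
struct toy_rule {
	int pattern;
	unsigned int seq;   /* insertion order, lower means older */
	bool in_use;
};

struct toy_table {
	struct toy_rule rules[TOY_MAX_RULES];
	unsigned int next_seq;
	int allow_duplicate_pattern;   /* 0 or 1, mirroring the devarg */
};

/* Returns the slot index on success, or a negative errno value. */
static int toy_rule_add(struct toy_table *t, int pattern)
{
	int free_slot = -1;

	for (int i = 0; i < TOY_MAX_RULES; i++) {
		const struct toy_rule *r = &t->rules[i];

		if (!r->in_use) {
			if (free_slot < 0)
				free_slot = i;
		} else if (r->pattern == pattern && !t->allow_duplicate_pattern) {
			return -EEXIST;   /* value 0: duplicates are refused */
		}
	}
	if (free_slot < 0)
		return -ENOSPC;

	t->rules[free_slot] = (struct toy_rule){
		.pattern = pattern, .seq = t->next_seq++, .in_use = true,
	};
	return free_slot;
}

static void toy_rule_del(struct toy_table *t, int slot)
{
	t->rules[slot].in_use = false;
}

/* Slot of the rule that takes effect for a pattern: the oldest one, or -1. */
static int toy_rule_active(const struct toy_table *t, int pattern)
{
	int best = -1;

	for (int i = 0; i < TOY_MAX_RULES; i++) {
		const struct toy_rule *r = &t->rules[i];

		if (!r->in_use || r->pattern != pattern)
			continue;
		if (best < 0 || r->seq < t->rules[best].seq)
			best = i;
	}
	return best;
}
```

In this model, adding the same pattern twice with the flag at 1 leaves the
first slot active; deleting it makes toy_rule_active() report the second.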

Unfortunately the v6 response is to half-remove the devarg in a way that is
worse than either keeping it or removing it cleanly:

[PATCH v6 22/23] flow-duplicate-pattern parsing removed, field left behind

Verified in the assembled tree:

  drivers/net/sxe2/sxe2_ethdev.h:138:
      uint8_t flow_dup_pattern_mode;          /* field still in struct */

  drivers/net/sxe2/sxe2_flow.c:806:
      rte_flow_error_set(error, EEXIST, ..., NULL,
                         adapter->devargs.flow_dup_pattern_mode ?
                         "Duplicate flow pattern." :
                         "Duplicate flow pattern is not allowed.");

  drivers/common/sxe2/sxe2_flow_public.h:603:
      uint8_t switch_pattern_dup_allow;       /* still in flow metadata */

The devarg's parser, default-setter, register_param_string entry, and
documentation are all gone, but the storage and the read site remain.
The field is zero-initialized and never written, so the ternary always picks
the "is not allowed" string. The hardware-capability path that the value=1
branch used to drive (setting switch_pattern_dup_allow on per-rule metadata)
has no caller now either. None of this is a correctness bug - duplicate
rules are uniformly rejected with EEXIST, which is the conservative
behaviour - but it is dead code that misleads the next reader.
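
A minimal sketch of why the ternary is dead, assuming (as stated above) that
the field is zero-initialized and has no writer. The struct and function
names are hypothetical stand-ins, not the driver's own definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the devargs struct that still holds the field. */
struct toy_devargs {
	uint8_t flow_dup_pattern_mode;   /* never written after zero-init */
};

/* Same shape as the message selection quoted from sxe2_flow.c. */
static const char *toy_dup_message(const struct toy_devargs *d)
{
	return d->flow_dup_pattern_mode ?
		"Duplicate flow pattern." :
		"Duplicate flow pattern is not allowed.";
}
```

With no code path that sets the field, only the second string can ever be
returned, so the first branch is unreachable in practice.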

Suggested fix: revert the v5-to-v6 changes that removed the parser and
docs, and instead align with mlx5:

  - Rename the devarg to allow_duplicate_pattern (underscore, matching
    mlx5's spelling).
  - Keep default = 1, matching mlx5's non-HWS default.
  - Reuse mlx5's documentation wording in sxe2.rst, adapted for the
    switch engine. The "only the first rule takes effect, the next
    activates when the first is deleted" semantic from mlx5.rst describes
    what the value=1 path does in hardware, and that wording is what was
    missing from v5's documentation.
  - Keep switch_pattern_dup_allow in flow metadata since it is what
    propagates the policy to the hardware programming path.

Or, if the cleaner path is preferred: actually finish the removal. Drop
flow_dup_pattern_mode from struct sxe2_devargs, replace the ternary in
sxe2_flow.c with the single "is not allowed" string, and drop
switch_pattern_dup_allow from the flow metadata struct (or wire it to a
compile-time constant if hardware programming truly needs it). Either
direction is defensible; the v6 in-between state is not.

[PATCH v6 06/23 and 07/23] patches posted in wrong order in the bundle

The bundle has 07/23 (drivers: support RSS feature) before 06/23
(net/sxe2: support L2 filtering and MAC config). git am applies them in
mbox order, so the series fails to apply: 07/23 depends on symbols 06/23
introduces. The fix is just to repost the bundle in numeric order;
contents are byte-identical to v5 for both patches. Worth fixing before
the next post because anyone running git am < bundle.mbox hits this.

Once the mbox order is fixed and the flow-duplicate-pattern situation is
resolved one way or the other, this is ready.



* [PATCH] checkpatches: suppress warnings about msleep()
From: Stephen Hemminger @ 2026-06-24 16:25 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Thomas Monjalon

The DPDK checkpatch uses the underlying kernel tool which
does some checks that are only relevant in kernel code.
The warning about msleep() time only makes sense in the
kernel.

Suppress feedback like:

WARNING:MSLEEP: msleep < 20ms can sleep for up to 20ms; see function description of msleep().
+		msleep(10);

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 devtools/checkpatches.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index f5dd77443f..071bdb2c15 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -34,7 +34,7 @@ PREFER_KERNEL_TYPES,PREFER_FALLTHROUGH,BIT_MACRO,CONST_STRUCT,\
 SPLIT_STRING,LONG_LINE_STRING,C99_COMMENT_TOLERANCE,\
 LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,\
 NEW_TYPEDEFS,COMPARISON_TO_NULL,AVOID_BUG,EXPORT_SYMBOL,\
-BAD_REPORTED_BY_LINK"
+BAD_REPORTED_BY_LINK,MSLEEP"
 options="$options $DPDK_CHECKPATCH_OPTIONS"
 
 print_usage () {
-- 
2.53.0



* Re: [PATCH v10 13/21] net/txgbe: fix link stability for 40G NIC
From: Stephen Hemminger @ 2026-06-24 16:18 UTC (permalink / raw)
  To: Zaiyu Wang; +Cc: dev, stable, Jiawen Wu
In-Reply-To: <20260624115254.20348-14-zaiyuwang@trustnetic.com>

On Wed, 24 Jun 2026 19:52:45 +0800
Zaiyu Wang <zaiyuwang@trustnetic.com> wrote:

> +
> +void txgbe_e56_rx_rd_second_code_40g(struct txgbe_hw *hw, int *SECOND_CODE, int lane)
> +{
> +	int i, median;
> +	unsigned int rdata;
> +	u32 addr;
> +	int RXS_BBCDR_SECOND_ORDER_ST[RXS_READ_COUNT];
> +
> +	/* Set ovrd_en=0 to read ASIC value */
> +	addr = E56G__RXS0_ANA_OVRDEN_1_ADDR + (lane *  E56PHY_RXS_OFFSET);
> +	rdata = rd32_ephy(hw, addr);
> +	EPHY_XFLD(E56G__RXS0_ANA_OVRDEN_1, ovrd_en_ana_bbcdr_int_cstm_i) = 0;
> +	wr32_ephy(hw, addr, rdata);
> +
> +	/*
> +	 * As status update from RXS hardware is asynchronous to read status of SECOND_ORDER,
> +	 * follow sequence mentioned below.
> +	 */
> +	for (i = 0; i < RXS_READ_COUNT; i = i + 1) {
> +		addr = E56G__RXS0_ANA_OVRDVAL_5_ADDR + (lane *  E56PHY_RXS_OFFSET);
> +		rdata = rd32_ephy(hw, addr);
> +		RXS_BBCDR_SECOND_ORDER_ST[i] = EPHY_XFLD(E56G__RXS0_ANA_OVRDVAL_5,
> +							 ana_bbcdr_int_cstm_i);
> +		usec_delay(100);
> +	}
> +
> +	/* sort array RXS_BBCDR_SECOND_ORDER_ST[i] */
> +	qsort(RXS_BBCDR_SECOND_ORDER_ST, RXS_READ_COUNT, sizeof(int), txgbe_e56_int_cmp);
> +
> +	median = ((RXS_READ_COUNT + 1) / 2) - 1;
> +	*SECOND_CODE = RXS_BBCDR_SECOND_ORDER_ST[median];
> +
> +	return;
> +}

These extra returns are causing extra checkpatch warnings.
I know this is base code, but if possible, could you remove them?


WARNING:RETURN_VOID: void function return statements are not generally useful
#707: FILE: drivers/net/txgbe/base/txgbe_e56.c:1806:
+	return;
+}

WARNING:RETURN_VOID: void function return statements are not generally useful
#736: FILE: drivers/net/txgbe/base/txgbe_e56.c:1835:
+	return;
+}
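
For reference, a median-of-samples helper in the same style, written without
the trailing return. This is a sketch only: toy_int_cmp and toy_median are
hypothetical names, and the median index formula is the one used in the quoted
function.

```c
#include <stdlib.h>

/* Three-way integer comparison for qsort(), safe against overflow. */
static int toy_int_cmp(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/*
 * Sorts the samples in place and stores the median in *median_out.
 * A void function simply ends at the closing brace; no "return;" needed.
 */
static void toy_median(int *samples, size_t count, int *median_out)
{
	qsort(samples, count, sizeof(samples[0]), toy_int_cmp);
	*median_out = samples[((count + 1) / 2) - 1];
}
```

For an even sample count this index picks the lower of the two middle values.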


* Re: [PATCH v2] dts: update dts check format script and resolve errors
From: Koushik Bhargav Nimoji @ 2026-06-24 15:44 UTC (permalink / raw)
  To: Patrick Robb
  Cc: luca.vizzarro, dev, abailey, ahassick, lylavoie,
	NBU-Contact-Thomas Monjalon
In-Reply-To: <CAK6Duxs-o_OcgjM9_dm-oBFTU7zKjk1g1X0bvbS=gtgwFSd0=Q@mail.gmail.com>


On Tue, Jun 23, 2026 at 6:35 PM Patrick Robb <patrickrobb1997@gmail.com>
wrote:

> Looks like your patch is failing some of the checks on patchwork,
> including doc build:
> https://github.com/ovsrobot/dpdk/actions/runs/27789008643
>
> Remember to run a doc build locally before sending any patches:
>
> meson setup build
> ninja -C build doc
>
>
I will resolve this and send a patch out shortly.

> Otherwise, please provide a little more info regarding your info. So, you
> have updated some of the dependencies used in the dts check format script.
> I think what I see from a quick look that is relevant is:
>
> -mypy = "^1.13.0"
> +mypy = "^2.1.0"
>  toml = "^0.10.2"
> -ruff = "^0.8.1"
> -types-paramiko = "^3.5.0.20240928"
> +ruff = "^0.15.16"
> +types-paramiko = "^4.0.0.20260518"
>  types-invoke = "^2.0.0.10"
> -types-pyyaml = "^6.0.12.20240917"
> +types-pyyaml = "^6.0.12.20260518"
>
> What is being done broadly? All dependencies covered by poetry are being
> updated? or just the subset included in format checks? Are dependencies
> being brought to current latest or something different?
>

Overall, only the dependencies being used by the dts-check-format.sh script
were updated. They were brought to the latest versions that were compatible
with each other. This allows for more rigorous format checking, as new
errors surfaced once the tool versions were updated.

>
> I remember Thomas mentioning that DTS was not checking the
> dts-check-format.sh at DPDK Summit and that confused me. Perhaps he is
> running DTS and dts-check-format.sh outside of poetry (which we have said
> is okay to do) and he is on newer versions of the formatting dependencies
> than what we currently have committed to the poetry.lock.
>

The lab was notified by Thomas about the following:

"About DTS, I think you are running an old version of the tools used in
devtools/dts-check-format.sh. When I run it, I have a lot of warnings. We
should fix them in DTS and then update the tools in the lab."

This may be what he was referring to at the DPDK Summit. When run under
poetry, the script uses the older tool versions, with which none of the
errors show up. When run outside of poetry, it most likely uses the newer
tool versions, which cause the underlying errors to appear.

Thanks,
Koushik



* Re: [PATCH] net/virtio-user: fix eventfd sharing in secondary process
From: Stephen Hemminger @ 2026-06-24 15:10 UTC (permalink / raw)
  To: Samar Yadav; +Cc: dev, maxime.coquelin, chenbox, tiwei.bie, stable
In-Reply-To: <20260624085741.2195573-1-samaryadav5@gmail.com>

On Wed, 24 Jun 2026 08:57:41 +0000
Samar Yadav <samaryadav5@gmail.com> wrote:

> +	pp = rte_zmalloc("virtio_user_proc_priv", sizeof(*pp), 0);
> +	if (pp == NULL)
> +		return -ENOMEM;
> +
> +	pp->kickfds = rte_malloc("virtio_user_proc_priv",
> +				 total_queues * sizeof(int), 0);
> +	pp->callfds = rte_malloc("virtio_user_proc_priv",
> +				 total_queues * sizeof(int), 0);

Better to use rte_calloc.
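
A sketch of the calloc-style allocation, using standard calloc() as a stand-in
so that it builds without DPDK. The helper name toy_alloc_fds is hypothetical,
and filling the slots with -1 is an assumption about what an "unset" fd should
look like, not something stated in the patch.

```c
#include <stdlib.h>

/*
 * Allocates an array of total_queues file descriptors.
 * Count and element size are passed separately, as with calloc(),
 * and the result is checked before use. Returns NULL on failure.
 */
static int *toy_alloc_fds(size_t total_queues)
{
	int *fds = calloc(total_queues, sizeof(*fds));

	if (fds == NULL)
		return NULL;

	/* Assumption: 0 is a valid fd, so mark every slot as unset. */
	for (size_t i = 0; i < total_queues; i++)
		fds[i] = -1;

	return fds;
}
```

The caller owns the returned array and releases it with free().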


* Re: [PATCH] net/virtio-user: fix eventfd sharing in secondary process
From: Stephen Hemminger @ 2026-06-24 15:16 UTC (permalink / raw)
  To: Samar Yadav; +Cc: dev, maxime.coquelin, chenbox, tiwei.bie, stable
In-Reply-To: <20260624085741.2195573-1-samaryadav5@gmail.com>

On Wed, 24 Jun 2026 08:57:41 +0000
Samar Yadav <samaryadav5@gmail.com> wrote:

> @@ -865,9 +913,15 @@ virtio_user_dev_uninit(struct virtio_user_dev *dev)
>  
>  	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
>  
> +	/*
> +	 * Serialize closing/freeing the kick/call fd arrays against the MP
> +	 * handler, which reads them under the same lock to share them with
> +	 * secondary processes.
> +	 */
> +	pthread_mutex_lock(&dev->mutex);
>  	virtio_user_dev_uninit_notify(dev);
> -
>  	virtio_user_free_vrings(dev);
> +	pthread_mutex_unlock(&dev->mutex);
>  
>  	free(dev->ifname);

Related bug: virtio_user does not initialize the mutex as safe for use
between processes. See rte_thread_mutex_init_shared() vs pthread_mutex_init().
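
For context, this is how a process-shared mutex is set up with plain POSIX
calls. It is a sketch of the general technique, not a description of what
rte_thread_mutex_init_shared() does internally; toy_mutex_init_shared is a
hypothetical name.

```c
#include <pthread.h>

/*
 * Initializes *m with the PTHREAD_PROCESS_SHARED attribute.
 * Returns 0 on success or a positive error number.
 * The mutex must itself live in memory that all processes map.
 */
static int toy_mutex_init_shared(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret != 0)
		return ret;

	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (ret == 0)
		ret = pthread_mutex_init(m, &attr);

	pthread_mutexattr_destroy(&attr);
	return ret;
}
```

A mutex created with pthread_mutex_init(m, NULL) gets the default
PTHREAD_PROCESS_PRIVATE attribute, which is the case the comment above warns
about.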


* [PATCH] vhost/crypto: fix segfault
From: Radu Nicolau @ 2026-06-24 14:20 UTC (permalink / raw)
  To: dev; +Cc: Radu Nicolau, stable, Maxime Coquelin, Chenbo Xia, Jay Zhou,
	Fan Zhang

Fix a potential call with dev->mem uninitialized; one common case is
running the autotest with more than one device.

Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
---
 lib/vhost/vhost_crypto.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/vhost/vhost_crypto.c b/lib/vhost/vhost_crypto.c
index 648e2d731b..3679eaca1e 100644
--- a/lib/vhost/vhost_crypto.c
+++ b/lib/vhost/vhost_crypto.c
@@ -1512,6 +1512,10 @@ vhost_crypto_process_one_req(struct vhost_crypto *vcrypto,
 		VC_LOG_ERR("Invalid descriptor");
 		return -1;
 	}
+	if (unlikely((vc_req->dev->mem) == NULL)) {
+		VC_LOG_ERR("Uninitialized vhost device");
+		return -1;
+	}
 
 	dlen = head->len;
 	src_desc = IOVA_TO_VVA(struct vring_desc *, vc_req->dev, vq,
-- 
2.52.0



* RE: [PATCH v4 3/7] test/bpf: add test for large shift
From: Marat Khalili @ 2026-06-24 13:44 UTC (permalink / raw)
  To: Stephen Hemminger, dev@dpdk.org; +Cc: Konstantin Ananyev
In-Reply-To: <8c882f31aaec46d984fc4c689ad92602@huawei.com>

> > +/*
> > + * Shift by an immediate that doesn't fit in a signed byte: the C1 shift
> > + * group takes a fixed 1-byte immediate, but imm_size() returns 4 for
> > + * counts >= 128, so the x86 JIT emits 3 stray bytes and desyncs the
> > + * instruction stream. The shift results are discarded (a count >= 64 is
> > + * UB in the interpreter); the test returns a known constant, which the
> > + * corrupted stream fails to produce.
> > + */
> > +static const struct ebpf_insn test_shift_big_imm_prog[] = {
> > +	{
> > +		.code = (BPF_ALU | EBPF_MOV | BPF_K),
> > +		.dst_reg = EBPF_REG_2,
> > +		.imm = 0x1,
> > +	},
> > +	{
> > +		.code = (EBPF_ALU64 | BPF_LSH | BPF_K),
> > +		.dst_reg = EBPF_REG_2,
> > +		.imm = 137,
> > +	},
> > +	{
> > +		.code = (EBPF_ALU64 | BPF_RSH | BPF_K),
> > +		.dst_reg = EBPF_REG_2,
> > +		.imm = 200,
> > +	},
> > +	{
> > +		.code = (EBPF_ALU64 | EBPF_ARSH | BPF_K),
> > +		.dst_reg = EBPF_REG_2,
> > +		.imm = 255,
> > +	},
> > +	/* known result; a desynced stream won't reproduce it */
> > +	{
> > +		.code = (BPF_ALU | EBPF_MOV | BPF_K),
> > +		.dst_reg = EBPF_REG_0,
> > +		.imm = 0x55,
> > +	},
> > +	{
> > +		.code = (BPF_JMP | EBPF_EXIT),
> > +	},
> > +};
> 
> // snip the rest
> 
> Thanks a lot for adding this test. Can we use the shift results though, instead
> of discarding them, maybe as another test case? If the interpreter is unable to
> reproduce them or triggers sanitizer it needs to be fixed as well. (Apologies
> for this scope creep but I hope we get to the bottom of it eventually.)

Speaking of scope creep, this currently fails on ARM since emit_lsl (as well as 
emit_lsr, emit_asr) does not clear unused immediate bits and thus the value 
does not fit in the encoding.
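
The size mismatch from the quoted test comment can be reproduced with a toy
version of the size rule. Both helpers are hypothetical: toy_imm_size only
restates the rule "1 byte if the value fits in a signed byte, otherwise 4",
and masking the count to 6 bits is one plausible way to keep a 64-bit shift
count inside a 1-byte immediate, not necessarily what the series does.

```c
#include <stdint.h>

/* 1 if v fits in a signed byte, otherwise 4 (the rule described above). */
static uint32_t toy_imm_size(int32_t v)
{
	return (v == (int8_t)v) ? sizeof(int8_t) : sizeof(int32_t);
}

/* Assumption: keep only the bits a 64-bit shift count can actually use. */
static uint8_t toy_shift_count64(int32_t imm)
{
	return (uint8_t)(imm & 0x3f);
}
```

With these helpers the counts from the test (137, 200, 255) all report a
4-byte immediate before masking and a 1-byte immediate after it.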


* [PATCH v2 4/4] ethdev: fix promoted flow metadata symbols
From: Dariusz Sosnowski @ 2026-06-24 13:13 UTC (permalink / raw)
  To: David Marchand, Bruce Richardson, Thomas Monjalon,
	Andrew Rybchenko, Ori Kam
  Cc: dev, Yu Jiang
In-Reply-To: <20260624131337.1127323-1-dsosnowski@nvidia.com>

Offending commit stabilized the following symbols
related to flow metadata:

- 1 function symbol:
    - rte_flow_dynf_metadata_register
- 2 variable symbols:
    - rte_flow_dynf_metadata_offs
    - rte_flow_dynf_metadata_mask

Any application using the experimental flow metadata symbols
that was dynamically linked against the 25.11 version of the ethdev
library and is run with the current version of the ethdev library
would fail to start with a symbol lookup error:

/tmp/dpdk-25.11/usr/local/bin/dpdk-testpmd:
  symbol lookup error: /tmp/dpdk-25.11/usr/local/bin/dpdk-testpmd:
    undefined symbol: rte_flow_dynf_metadata_offs, version EXPERIMENTAL

This patch addresses that issue by restoring EXPERIMENTAL version
on the global variables to keep ABI compatibility [1].
Related inline helpers and variable declarations are kept as stable
(i.e., no __rte_experimental marker).
EXPERIMENTAL version will be removed from these global variables
in 26.11 release cycle on next ABI version bump.

Standard function symbol versioning is also applied on
rte_flow_dynf_metadata_register() function.

[1]: https://inbox.dpdk.org/dev/m7s3jl2566kibbapr2mfa2ic2opuc6b4ok2g67j3il5dgduzih@cz5wcdstb75n/

Bugzilla ID: 1957
Fixes: 4ee2f5c1cedf ("ethdev: promote flow metadata API to stable")

Reported-by: Yu Jiang <yux.jiang@intel.com>
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 lib/ethdev/meson.build |  2 ++
 lib/ethdev/rte_flow.c  | 13 ++++++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/ethdev/meson.build b/lib/ethdev/meson.build
index 8ba6c708a2..63fd866af9 100644
--- a/lib/ethdev/meson.build
+++ b/lib/ethdev/meson.build
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
+use_function_versioning = true
+
 sources = files(
         'ethdev_driver.c',
         'ethdev_private.c',
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index ec0fe08355..24eb5a95b0 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -23,11 +23,11 @@
 #define FLOW_LOG RTE_ETHDEV_LOG_LINE
 
 /* Mbuf dynamic field name for metadata. */
-RTE_EXPORT_SYMBOL(rte_flow_dynf_metadata_offs)
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_flow_dynf_metadata_offs, 19.11)
 int32_t rte_flow_dynf_metadata_offs = -1;
 
 /* Mbuf dynamic field flag bit number for metadata. */
-RTE_EXPORT_SYMBOL(rte_flow_dynf_metadata_mask)
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_flow_dynf_metadata_mask, 19.11)
 uint64_t rte_flow_dynf_metadata_mask;
 
 /**
@@ -281,9 +281,7 @@ static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(JUMP_TO_TABLE_INDEX, sizeof(struct rte_flow_action_jump_to_table_index)),
 };
 
-RTE_EXPORT_SYMBOL(rte_flow_dynf_metadata_register)
-int
-rte_flow_dynf_metadata_register(void)
+RTE_DEFAULT_SYMBOL(26, int, rte_flow_dynf_metadata_register, (void))
 {
 	int offset;
 	int flag;
@@ -316,6 +314,11 @@ rte_flow_dynf_metadata_register(void)
 	return -rte_errno;
 }
 
+RTE_VERSION_EXPERIMENTAL_SYMBOL(int, rte_flow_dynf_metadata_register, (void))
+{
+	return rte_flow_dynf_metadata_register();
+}
+
 static inline void
 fts_enter(struct rte_eth_dev *dev)
 {
-- 
2.47.3



* [PATCH v2 1/4] eal: fix macro for versioned experimental symbol
From: Dariusz Sosnowski @ 2026-06-24 13:13 UTC (permalink / raw)
  To: David Marchand, Bruce Richardson; +Cc: dev, Yu Jiang
In-Reply-To: <20260624131337.1127323-1-dsosnowski@nvidia.com>

Add a missing semicolon after the __asm__ block in the
RTE_VERSION_EXPERIMENTAL_SYMBOL macro.
Its absence triggers the following compilation error with clang:

    ../lib/ethdev/rte_flow.c:320:1: error: expected ';' after top-level asm block
      320 | RTE_VERSION_EXPERIMENTAL_SYMBOL(int, rte_flow_dynf_metadata_register, (void))
          | ^
    ../lib/eal/common/eal_export.h:75:74: note: expanded from macro 'RTE_VERSION_EXPERIMENTAL_SYMBOL'
       75 | __asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL") \
          |                                                                          ^
    ../lib/eal/include/rte_common.h:237:20: note: expanded from macro '\
    __rte_used'
      237 | #define __rte_used __attribute__((used))
          |                    ^

Fixes: e30e194c4d06 ("eal: rework function versioning macros")
Cc: david.marchand@redhat.com

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 lib/eal/common/eal_export.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eal/common/eal_export.h b/lib/eal/common/eal_export.h
index 888fd9f9ed..7971bf8d7a 100644
--- a/lib/eal/common/eal_export.h
+++ b/lib/eal/common/eal_export.h
@@ -72,7 +72,7 @@ __rte_used type name ## _v ## ver args; \
 type name ## _v ## ver args
 
 #define RTE_VERSION_EXPERIMENTAL_SYMBOL(type, name, args) VERSIONING_WARN \
-__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL") \
+__asm__(".symver " RTE_STR(name) "_exp, " RTE_STR(name) "@EXPERIMENTAL"); \
 __rte_used type name ## _exp args; \
 type name ## _exp args
 
-- 
2.47.3
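
The shape of the fixed macro can be sketched outside DPDK as follows. This is
a simplified, hypothetical macro (TOY_VERSIONED_SYMBOL): the real one emits a
.symver directive, while the stand-in below emits only an assembler comment so
that it builds without a version script. The point it shows is the semicolon
that ends the top-level asm statement before the declaration that follows.

```c
#define TOY_STR(x) #x

/*
 * Hypothetical macro with the same three parts as the fixed one:
 * a top-level asm statement (terminated by ';'), a forward declaration
 * marked "used", and the start of the function definition.
 */
#define TOY_VERSIONED_SYMBOL(type, name, args) \
__asm__("/* toy marker: " TOY_STR(name) " */"); \
__attribute__((used)) type name ## _exp args; \
type name ## _exp args

/* Expands to a definition of toy_register_exp(). */
TOY_VERSIONED_SYMBOL(int, toy_register, (void))
{
	return 0;
}
```

Without the semicolon after the asm statement, the declaration that follows
runs straight into it, which is what the clang error in the commit message
points at.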



* [PATCH v2 3/4] net/mlx5: fix stabilized function versions
From: Dariusz Sosnowski @ 2026-06-24 13:13 UTC (permalink / raw)
  To: David Marchand, Bruce Richardson, Viacheslav Ovsiienko, Bing Zhao,
	Ori Kam, Suanming Mou, Matan Azrad
  Cc: dev, Yu Jiang
In-Reply-To: <20260624131337.1127323-1-dsosnowski@nvidia.com>

Offending patch stabilized the following function symbols:

- rte_pmd_mlx5_driver_event_cb_register
- rte_pmd_mlx5_driver_event_cb_unregister
- rte_pmd_mlx5_enable_steering
- rte_pmd_mlx5_disable_steering

These function symbols were introduced in 25.11.
Any application using these functions, linked against 25.11 version,
would fail when used with 26.07 libraries, because only DPDK_26 versions
of these symbols were exported.

This patch fixes that by adding proper function symbol versioning
to these symbols.

Fixes: e8cab133645f ("net/mlx5: promote some private API to stable")

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/net/mlx5/meson.build         |  2 ++
 drivers/net/mlx5/mlx5_driver_event.c | 22 ++++++++++++++++------
 drivers/net/mlx5/mlx5_flow.c         | 18 ++++++++++++------
 3 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index 82a7dfe782..0fa6322779 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -2,6 +2,8 @@
 # Copyright 2018 6WIND S.A.
 # Copyright 2018 Mellanox Technologies, Ltd
 
+use_function_versioning = true
+
 if not (is_linux or is_windows)
     build = false
     reason = 'only supported on Linux and Windows'
diff --git a/drivers/net/mlx5/mlx5_driver_event.c b/drivers/net/mlx5/mlx5_driver_event.c
index 89e49331c8..d0e22d6151 100644
--- a/drivers/net/mlx5/mlx5_driver_event.c
+++ b/drivers/net/mlx5/mlx5_driver_event.c
@@ -236,9 +236,8 @@ notify_existing_devices(rte_pmd_mlx5_driver_event_callback_t cb, void *opaque)
 		notify_existing_queues(port_id, cb, opaque);
 }
 
-RTE_EXPORT_SYMBOL(rte_pmd_mlx5_driver_event_cb_register)
-int
-rte_pmd_mlx5_driver_event_cb_register(rte_pmd_mlx5_driver_event_callback_t cb, void *opaque)
+RTE_DEFAULT_SYMBOL(26, int, rte_pmd_mlx5_driver_event_cb_register,
+		   (rte_pmd_mlx5_driver_event_callback_t cb, void *opaque))
 {
 	struct registered_cb *r;
 
@@ -264,9 +263,14 @@ rte_pmd_mlx5_driver_event_cb_register(rte_pmd_mlx5_driver_event_callback_t cb, v
 	return 0;
 }
 
-RTE_EXPORT_SYMBOL(rte_pmd_mlx5_driver_event_cb_unregister)
-int
-rte_pmd_mlx5_driver_event_cb_unregister(rte_pmd_mlx5_driver_event_callback_t cb)
+RTE_VERSION_EXPERIMENTAL_SYMBOL(int, rte_pmd_mlx5_driver_event_cb_register,
+				(rte_pmd_mlx5_driver_event_callback_t cb, void *opaque))
+{
+	return rte_pmd_mlx5_driver_event_cb_register(cb, opaque);
+}
+
+RTE_DEFAULT_SYMBOL(26, int, rte_pmd_mlx5_driver_event_cb_unregister,
+		   (rte_pmd_mlx5_driver_event_callback_t cb))
 {
 	struct registered_cb *r;
 	bool found = false;
@@ -289,6 +293,12 @@ rte_pmd_mlx5_driver_event_cb_unregister(rte_pmd_mlx5_driver_event_callback_t cb)
 	return 0;
 }
 
+RTE_VERSION_EXPERIMENTAL_SYMBOL(int, rte_pmd_mlx5_driver_event_cb_unregister,
+				(rte_pmd_mlx5_driver_event_callback_t cb))
+{
+	return rte_pmd_mlx5_driver_event_cb_unregister(cb);
+}
+
 RTE_FINI(rte_pmd_mlx5_driver_event_cb_cleanup) {
 	struct registered_cb *r;
 
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index a95dd9dc94..4b984df892 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -12506,9 +12506,7 @@ flow_disable_steering_run_on_related(struct rte_eth_dev *dev,
 	}
 }
 
-RTE_EXPORT_SYMBOL(rte_pmd_mlx5_disable_steering)
-void
-rte_pmd_mlx5_disable_steering(void)
+RTE_DEFAULT_SYMBOL(26, void, rte_pmd_mlx5_disable_steering, (void))
 {
 	uint16_t port_id;
 
@@ -12532,9 +12530,12 @@ rte_pmd_mlx5_disable_steering(void)
 	mlx5_steering_disabled = true;
 }
 
-RTE_EXPORT_SYMBOL(rte_pmd_mlx5_enable_steering)
-int
-rte_pmd_mlx5_enable_steering(void)
+RTE_VERSION_EXPERIMENTAL_SYMBOL(void, rte_pmd_mlx5_disable_steering, (void))
+{
+	rte_pmd_mlx5_disable_steering();
+}
+
+RTE_DEFAULT_SYMBOL(26, int, rte_pmd_mlx5_enable_steering, (void))
 {
 	uint16_t port_id;
 
@@ -12551,6 +12552,11 @@ rte_pmd_mlx5_enable_steering(void)
 	return 0;
 }
 
+RTE_VERSION_EXPERIMENTAL_SYMBOL(int, rte_pmd_mlx5_enable_steering, (void))
+{
+	return rte_pmd_mlx5_enable_steering();
+}
+
 bool
 mlx5_vport_rx_metadata_passing_enabled(const struct mlx5_dev_ctx_shared *sh)
 {
-- 
2.47.3



* [PATCH v2 2/4] build: support function versioning for drivers
From: Dariusz Sosnowski @ 2026-06-24 13:13 UTC (permalink / raw)
  To: David Marchand, Bruce Richardson; +Cc: dev, Yu Jiang
In-Reply-To: <20260624131337.1127323-1-dsosnowski@nvidia.com>

Add support for enabling function versioning
(through use_function_versioning meson variable) for drivers,
similar to libraries.

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 drivers/meson.build | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/meson.build b/drivers/meson.build
index 4d95604ecd..a63d93372a 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -171,6 +171,7 @@ foreach subpath:subdirs
         pkgconfig_extra_libs = []
         testpmd_sources = []
         require_iova_in_mbuf = true
+        use_function_versioning = false
         # for handling base code files which may need extra cflags
         base_sources = []
         base_cflags = []
@@ -273,6 +274,13 @@ foreach subpath:subdirs
         endif
         dpdk_conf.set(lib_name.to_upper(), 1)
 
+        if developer_mode and is_windows and use_function_versioning
+            message('@0@: Function versioning is not supported by Windows.'.format(name))
+        endif
+        if use_function_versioning
+            cflags += '-DRTE_USE_FUNCTION_VERSIONING'
+        endif
+
         dpdk_extra_ldflags += pkgconfig_extra_libs
 
         dpdk_headers += headers
-- 
2.47.3



* [PATCH v2 0/4] add versioned symbols for recently stabilized APIs
From: Dariusz Sosnowski @ 2026-06-24 13:13 UTC (permalink / raw)
  To: David Marchand, Bruce Richardson, Thomas Monjalon,
	Andrew Rybchenko, Viacheslav Ovsiienko, Bing Zhao, Ori Kam,
	Suanming Mou, Matan Azrad
  Cc: dev, Yu Jiang
In-Reply-To: <20260623113752.1100072-1-dsosnowski@nvidia.com>

Main goal of this patchset is to address https://bugs.dpdk.org/show_bug.cgi?id=1957
but it also handles other recently stabilized symbols and has some minor fixes:

- Patch 1 - Fix RTE_VERSION_EXPERIMENTAL_SYMBOL macro on clang.
- Patch 2 - Allow function versioning inside drivers.
- Patch 3 - Version the function symbols stabilized in
  https://git.dpdk.org/dpdk/commit/?id=e8cab133645f5466ef75e511629add43b68a5027
- Patch 4 - Version the rte_flow_dynf_metadata_register() function stabilized in
  https://git.dpdk.org/dpdk/commit/?id=4ee2f5c1cedf9ee7f39afa667f71b07f4004ba5c
  Restore EXPERIMENTAL version on global variable symbols
  rte_flow_dynf_metadata_offs and rte_flow_dynf_metadata_mask.

v2:
- Drop patches introducing versioning macros for symbol aliases
  and their usage (patch 4 and 5 from v1)
- Restore EXPERIMENTAL version on global variable symbols
  rte_flow_dynf_metadata_offs and rte_flow_dynf_metadata_mask,
  as discussed under v1.
- Change commit title prefix in patch (2) from "drivers" to "build".

v1: https://inbox.dpdk.org/dev/20260623113752.1100072-1-dsosnowski@nvidia.com/

Dariusz Sosnowski (4):
  eal: fix macro for versioned experimental symbol
  build: support function versioning for drivers
  net/mlx5: fix stabilized function versions
  ethdev: fix promoted flow metadata symbols

 drivers/meson.build                  |  8 ++++++++
 drivers/net/mlx5/meson.build         |  2 ++
 drivers/net/mlx5/mlx5_driver_event.c | 22 ++++++++++++++++------
 drivers/net/mlx5/mlx5_flow.c         | 18 ++++++++++++------
 lib/eal/common/eal_export.h          |  2 +-
 lib/ethdev/meson.build               |  2 ++
 lib/ethdev/rte_flow.c                | 13 ++++++++-----
 7 files changed, 49 insertions(+), 18 deletions(-)

--
2.47.3



* Re: [PATCH v5] graph: add optional profiling stats
From: saeed bishara @ 2026-06-24 13:09 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Pavan Nikhilesh, Stephen Hemminger, Wathsala Vithanage,
	Bruce Richardson, thomas, Jerin Jacob, dev, Jerin Jacob,
	Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35F6593C@smartserver.smartshare.dk>

On Wed, Jun 24, 2026 at 10:59 AM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> +Pavan Nikhilesh, +Stephen Hemminger, +Wathsala Vithanage, +Bruce Richardson, +Thomas Monjalon
>
> > From: saeed bishara [mailto:saeed.bishara.os@gmail.com]
> > Sent: Tuesday, 23 June 2026 16.11
> >
> > > > also, instead of adding cacheline for this profiling data, can we
> > > > share with line 1 that used solely for xstats?
> > >
> > > This profiling data is 4 indexes * 2 values * 8-byte fields, so one
> > > cache line in itself.
> > Makes sense.
> > btw, the default value of RTE_GRAPH_BURST_SIZE is 256, I suspect that
> > real applications will enforce smaller burst when pulling from input
> > devices (e.g. 32). Do you expect such cases to change
> > RTE_GRAPH_BURST_SIZE?
>
> Excellent question! I don't know.
> They should. E.g. an application optimized for latency should certainly not process bursts of 256 objects.
>
> IMO, the root problem is the lack of a unified burst size across DPDK, which causes every library to be designed with its own optimal burst size.
> E.g. the Mbuf library uses 64 (for rte_pktmbuf_free_bulk()), and the Graph library uses 256.
>
> There has been an attempt at introducing a unified burst size [1] for DPDK, but it met a lot of resistance, so it still needs to be refined before we can reach a conclusion.
> The drivers supposedly can report an "optimal" burst size at run-time, which the application can then use. But the application is unable to configure its internal burst sizes if one driver reports 64 and another reports 32.
> I'm strongly in favor of a build time constant, used across DPDK. The default value should work reasonably well across drivers and libraries.
> And if an application wants to optimize for performance (either throughput or latency), the developer should experiment to find the optimal value.
> Furthermore, designing for a build time constant max burst size throughout DPDK might provide performance benefits in itself, as the compiler can optimize for this.
>
> [1]: https://inbox.dpdk.org/dev/KdOygM96Qb6d6ADK1-AcnA@monjalon.net/
>
> Now, back to your question...
> As a workaround, I can sample Graph node performance data for 32 objects, instead of sampling for RTE_GRAPH_BURST_SIZE / 2.
I see, so there is no simple static parameter here. What about
tracking the max burst and reporting the calls/cycles for that case?
The user would also see what that max burst was and how often it occurred.
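
A rough sketch of that idea, assuming a small per-node structure; the
names burst_profile and burst_profile_update are hypothetical and not
taken from the patch:

```c
#include <stdint.h>

/* Hypothetical per-node profile; field names are illustrative only. */
struct burst_profile {
	uint16_t max_burst; /* largest burst seen so far */
	uint64_t calls;     /* calls that processed max_burst objects */
	uint64_t cycles;    /* cycles spent in those calls */
};

void
burst_profile_update(struct burst_profile *p, uint16_t nb_objs, uint64_t cycles)
{
	if (nb_objs > p->max_burst) {
		/* New maximum: restart the sample for this burst size. */
		p->max_burst = nb_objs;
		p->calls = 0;
		p->cycles = 0;
	}
	if (nb_objs != 0 && nb_objs == p->max_burst) {
		p->calls++;
		p->cycles += cycles;
	}
}
```

The counters restart whenever a larger burst shows up, so calls/cycles
always describe the current max burst only.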

saeed


* Re: [PATCH v2] dts: update dts check format script and resolve errors
From: Thomas Monjalon @ 2026-06-24 13:04 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Koushik Bhargav Nimoji, luca.vizzarro, dev, abailey, ahassick,
	lylavoie
In-Reply-To: <CAK6Duxs-o_OcgjM9_dm-oBFTU7zKjk1g1X0bvbS=gtgwFSd0=Q@mail.gmail.com>

24/06/2026 00:34, Patrick Robb:
> I remember Thomas mentioning that DTS was not checking the
> dts-check-format.sh at DPDK Summit and that confused me. Perhaps he is
> running DTS and dts-check-format.sh outside of poetry (which we have said
> is okay to do) and he is on newer versions of the formatting dependencies
> than what we currently have committed to the poetry.lock.

Yes, newer versions are failing.
But with poetry, it runs fine.




* [PATCH] net/virtio-user: fix eventfd sharing in secondary process
From: Samar Yadav @ 2026-06-24  8:57 UTC (permalink / raw)
  To: dev; +Cc: maxime.coquelin, chenbox, Samar Yadav, tiwei.bie, stable

virtio_user secondary processes cannot communicate with the vhost
backend: the kick/call eventfds are opened by the primary and never
shared, so a secondary's queue notification writes to an invalid fd
and traffic stalls.

Share the fds over a dedicated virtio-user multiprocess channel. The
primary registers a process-wide MP action that returns a port's
kick/call fds (looked up by port name); a secondary requests them at
probe time, before the port is announced.

The received fds are stored in eth_dev->process_private, which is
per-process, instead of the primary-owned shared dev->kickfds and
dev->callfds arrays; the secondary data path notifies the backend using
its own kickfd. In the primary, the MP handler reads the fd arrays under
dev->mutex, and the teardown path takes the same lock while closing and
freeing them, so the two cannot race.
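
The interleaved fd layout used on the MP channel can be sketched in
isolation as below. This is a simplified illustration, not the driver
code: pack_eventfds, unpack_eventfds and MAX_FDS are hypothetical names,
and MAX_FDS only stands in for the EAL per-message fd limit.

```c
#include <errno.h>

#define MAX_FDS 8 /* assumed stand-in for the per-message fd limit */

/* Sender side: pack fds as kick0, call0, kick1, call1, ... */
int
pack_eventfds(const int *kickfds, const int *callfds, unsigned int nr_queues,
	      int fds[MAX_FDS])
{
	unsigned int i;
	int total = 0;

	/* each queue contributes two fds */
	if (nr_queues > MAX_FDS / 2)
		return -E2BIG;

	for (i = 0; i < nr_queues; i++) {
		if (kickfds[i] < 0 || callfds[i] < 0)
			return -EINVAL;
		fds[total++] = kickfds[i];
		fds[total++] = callfds[i];
	}
	return total;
}

/* Receiver side: accept only a complete set, then split it back. */
int
unpack_eventfds(const int *fds, int nr_fds, unsigned int nr_queues,
		int *kickfds, int *callfds)
{
	unsigned int i;

	if (nr_queues > MAX_FDS / 2 || nr_fds != (int)(nr_queues * 2))
		return -EPROTO;

	for (i = 0; i < nr_queues; i++) {
		kickfds[i] = fds[i * 2];
		callfds[i] = fds[i * 2 + 1];
	}
	return 0;
}
```

Rejecting any count other than two fds per queue mirrors the choice to
treat a partially synced device as fatal.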

Fixes: 1c8489da561b ("net/virtio-user: fix multi-process support")
Cc: tiwei.bie@intel.com
Cc: stable@dpdk.org

Signed-off-by: Samar Yadav <samaryadav5@gmail.com>
---
 .mailmap                                      |   1 +
 .../net/virtio/virtio_user/virtio_user_dev.c  |  56 +++-
 .../net/virtio/virtio_user/virtio_user_dev.h  |  20 ++
 drivers/net/virtio/virtio_user_ethdev.c       | 260 +++++++++++++++++-
 4 files changed, 333 insertions(+), 4 deletions(-)

diff --git a/.mailmap b/.mailmap
index 4001e5fb0e70..8f921d4b9f46 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1448,6 +1448,7 @@ Salem Sol <salems@nvidia.com>
 Sam Andrew <samandrew@microsoft.com>
 Sam Chen <sam.chen@nebula-matrix.com>
 Sam Grove <sam.grove@sifive.com>
+Samar Yadav <samaryadav5@gmail.com> <samar.yadav@broadcom.com>
 Sameer Vaze <svaze@qti.qualcomm.com>
 Sameh Gobriel <sameh.gobriel@intel.com>
 Samik Gupta <samik.gupta@broadcom.com>
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index f3df73c1f0ca..5e431ebc6511 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -34,6 +34,54 @@ const char * const virtio_user_backend_strings[] = {
 	[VIRTIO_USER_BACKEND_VHOST_VDPA] = "VHOST_VDPA",
 };
 
+/*
+ * Collect the primary device's kick/call fds (interleaved kick,call per queue)
+ * for sharing with a secondary process. Caller must serialize against the
+ * control path (dev->mutex) so the fd arrays are not freed concurrently.
+ */
+int
+virtio_user_get_eventfds_from_dev(struct virtio_user_dev *dev,
+				  int fds[VIRTIO_USER_MAX_EVENTFDS])
+{
+	uint32_t max_queues;
+	int i, total_fds = 0;
+	int kickfd, callfd;
+
+	if (dev == NULL || fds == NULL)
+		return -EINVAL;
+
+	if (dev->kickfds == NULL || dev->callfds == NULL) {
+		PMD_INIT_LOG(ERR, "Device eventfd arrays not initialized");
+		return -EINVAL;
+	}
+
+	max_queues = dev->max_queue_pairs * 2;
+	if (dev->hw_cvq)
+		max_queues += 1;
+
+	/* each queue contributes a kick and a call fd */
+	if (max_queues * 2 > VIRTIO_USER_MAX_EVENTFDS) {
+		PMD_INIT_LOG(ERR,
+			     "Device needs %u eventfds, exceeds MP limit %d",
+			     max_queues * 2, VIRTIO_USER_MAX_EVENTFDS);
+		return -E2BIG;
+	}
+
+	for (i = 0; i < (int)max_queues; i++) {
+		kickfd = dev->kickfds[i];
+		callfd = dev->callfds[i];
+		if (kickfd < 0 || callfd < 0) {
+			PMD_INIT_LOG(ERR, "Queue %d has invalid fds (kick=%d call=%d)",
+				     i, kickfd, callfd);
+			return -EINVAL;
+		}
+		fds[total_fds++] = kickfd;
+		fds[total_fds++] = callfd;
+	}
+
+	return total_fds;
+}
+
 static int
 virtio_user_uninit_notify_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
 {
@@ -865,9 +913,15 @@ virtio_user_dev_uninit(struct virtio_user_dev *dev)
 
 	rte_mem_event_callback_unregister(VIRTIO_USER_MEM_EVENT_CLB_NAME, dev);
 
+	/*
+	 * Serialize closing/freeing the kick/call fd arrays against the MP
+	 * handler, which reads them under the same lock to share them with
+	 * secondary processes.
+	 */
+	pthread_mutex_lock(&dev->mutex);
 	virtio_user_dev_uninit_notify(dev);
-
 	virtio_user_free_vrings(dev);
+	pthread_mutex_unlock(&dev->mutex);
 
 	free(dev->ifname);
 
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index 66400b3b6295..c00297c79ed8 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -11,6 +11,11 @@
 #include "../virtio.h"
 #include "../virtio_ring.h"
 
+#include <rte_eal.h>
+
+/* Max eventfds shareable over the MP channel (bounded by SCM_RIGHTS). */
+#define VIRTIO_USER_MAX_EVENTFDS RTE_MP_MAX_FD_NUM
+
 enum virtio_user_backend_type {
 	VIRTIO_USER_BACKEND_UNKNOWN,
 	VIRTIO_USER_BACKEND_VHOST_USER,
@@ -89,5 +94,20 @@ int virtio_user_dev_get_rss_config(struct virtio_user_dev *dev, void *dst, size_
 				   int length);
 void virtio_user_dev_delayed_disconnect_handler(void *param);
 int virtio_user_dev_server_reconnect(struct virtio_user_dev *dev);
+
+/**
+ * Collect a primary device's kick/call eventfds for sharing with a
+ * secondary process over the multiprocess channel.
+ *
+ * @param dev
+ *   Pointer to the virtio_user device (primary).
+ * @param fds
+ *   Output array, must hold at least VIRTIO_USER_MAX_EVENTFDS elements.
+ * @return
+ *   Number of fds written on success, negative errno on error.
+ */
+int virtio_user_get_eventfds_from_dev(struct virtio_user_dev *dev,
+				      int fds[VIRTIO_USER_MAX_EVENTFDS]);
+
 extern const char * const virtio_user_backend_strings[];
 #endif
diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c
index 747dddeb2eba..1c724ad59ea6 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -27,6 +27,35 @@
 #include "virtio_rxtx.h"
 #include "virtio_user/virtio_user_dev.h"
 #include "virtio_user/vhost.h"
+#include <errno.h>
+#include <rte_errno.h>
+#include <rte_string_fns.h>
+#include <rte_spinlock.h>
+
+/* Virtio-user multiprocess communication channel */
+#define VIRTIO_USER_MP_NAME "virtio_user_mp"
+
+struct virtio_user_mp_param {
+	char port_name[RTE_DEV_NAME_MAX_LEN];
+};
+
+/*
+ * Per-process private data, referenced by eth_dev->process_private which (unlike
+ * dev_private) is NOT shared between primary and secondary processes. A secondary
+ * stores the kick/call fds it receives from the primary here, so it never mutates
+ * the primary-owned shared dev->kickfds/dev->callfds arrays. callfds are kept for
+ * a complete per-process view of the backend fds; only kickfds are used by the
+ * secondary data path today.
+ */
+struct virtio_user_proc_priv {
+	uint32_t nr_queues;
+	int *kickfds;
+	int *callfds;
+};
+
+/* Guards one-time registration of the process-wide MP action. */
+static rte_spinlock_t virtio_user_mp_lock = RTE_SPINLOCK_INITIALIZER;
+static bool virtio_user_mp_registered;
 
 #define virtio_user_get_dev(hwp) container_of(hwp, struct virtio_user_dev, hw)
 
@@ -269,6 +298,26 @@ virtio_user_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
 		virtio_user_dev_destroy_shadow_cvq(dev);
 }
 
+/*
+ * Return the kick fd to notify the backend for a queue in the running process.
+ * The secondary uses its own fds (process_private); the primary owns dev->kickfds.
+ */
+static int
+virtio_user_get_kickfd(struct virtio_hw *hw, struct virtio_user_dev *dev,
+		       uint16_t queue_idx)
+{
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		struct rte_eth_dev *eth_dev = &rte_eth_devices[hw->port_id];
+		struct virtio_user_proc_priv *pp = eth_dev->process_private;
+
+		if (pp == NULL || queue_idx >= pp->nr_queues)
+			return -1;
+		return pp->kickfds[queue_idx];
+	}
+
+	return dev->kickfds[queue_idx];
+}
+
 static void
 virtio_user_notify_queue(struct virtio_hw *hw, struct virtqueue *vq)
 {
@@ -282,8 +331,10 @@ virtio_user_notify_queue(struct virtio_hw *hw, struct virtqueue *vq)
 	}
 
 	if (!dev->notify_area) {
-		if (write(dev->kickfds[vq->vq_queue_index], &notify_data,
-			  sizeof(notify_data)) < 0)
+		int kickfd = virtio_user_get_kickfd(hw, dev, vq->vq_queue_index);
+
+		if (kickfd < 0 || write(kickfd, &notify_data,
+				sizeof(notify_data)) < 0)
 			PMD_DRV_LOG(ERR, "failed to kick backend: %s",
 				    strerror(errno));
 		return;
@@ -495,6 +546,166 @@ virtio_user_eth_dev_free(struct rte_eth_dev *eth_dev)
 	rte_eth_dev_release_port(eth_dev);
 }
 
+/* Close and free a secondary's per-process eventfd storage. */
+static void
+virtio_user_free_proc_priv(struct rte_eth_dev *eth_dev)
+{
+	struct virtio_user_proc_priv *pp = eth_dev->process_private;
+	uint32_t i;
+
+	if (pp == NULL)
+		return;
+
+	for (i = 0; i < pp->nr_queues; i++) {
+		if (pp->kickfds != NULL && pp->kickfds[i] >= 0)
+			close(pp->kickfds[i]);
+		if (pp->callfds != NULL && pp->callfds[i] >= 0)
+			close(pp->callfds[i]);
+	}
+
+	rte_free(pp->kickfds);
+	rte_free(pp->callfds);
+	rte_free(pp);
+	eth_dev->process_private = NULL;
+}
+
+/*
+ * Primary-side MP handler: reply with this port's kick/call eventfds so the
+ * requesting secondary can talk to the vhost backend. Always sends a reply
+ * (num_fds == 0 on error) so the secondary fails fast instead of timing out.
+ */
+static int
+virtio_user_mp_primary_handler(const struct rte_mp_msg *msg, const void *peer)
+{
+	const struct virtio_user_mp_param *param =
+		(const struct virtio_user_mp_param *)msg->param;
+	int eventfds[VIRTIO_USER_MAX_EVENTFDS];
+	struct rte_eth_dev *eth_dev;
+	struct virtio_user_dev *dev;
+	struct rte_mp_msg reply;
+	int num_fds;
+	int i;
+
+	memset(&reply, 0, sizeof(reply));
+	strlcpy(reply.name, msg->name, sizeof(reply.name));
+	reply.len_param = 0;
+	reply.num_fds = 0;
+
+	eth_dev = rte_eth_dev_get_by_name(param->port_name);
+	if (eth_dev == NULL || eth_dev->data->dev_private == NULL) {
+		PMD_INIT_LOG(ERR, "Failed to find virtio_user port: %s",
+			     param->port_name);
+		return rte_mp_reply(&reply, peer);
+	}
+
+	dev = eth_dev->data->dev_private;
+
+	/* serialize against control-path changes to the fd arrays */
+	pthread_mutex_lock(&dev->mutex);
+	num_fds = virtio_user_get_eventfds_from_dev(dev, eventfds);
+	if (num_fds >= 0 && num_fds <= RTE_MP_MAX_FD_NUM) {
+		reply.num_fds = num_fds;
+		for (i = 0; i < num_fds; i++)
+			reply.fds[i] = eventfds[i];
+	} else {
+		PMD_INIT_LOG(ERR, "Cannot share eventfds for %s (ret=%d)",
+			     param->port_name, num_fds);
+	}
+	pthread_mutex_unlock(&dev->mutex);
+
+	return rte_mp_reply(&reply, peer);
+}
+
+/*
+ * Secondary-side: request the primary's kick/call eventfds and store them in
+ * this process's eth_dev->process_private. The shared dev->kickfds/dev->callfds
+ * arrays (owned by the primary) are never touched.
+ */
+static int
+virtio_user_sync_eventfds(struct rte_eth_dev *eth_dev, struct virtio_user_dev *dev)
+{
+	struct rte_mp_msg mp_req, *mp_rep;
+	struct rte_mp_reply mp_reply = {0};
+	struct virtio_user_mp_param *req_param;
+	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+	struct virtio_user_proc_priv *pp;
+	uint32_t total_queues, i;
+	int nr_fds, ret = 0;
+
+	if (dev == NULL)
+		return -EINVAL;
+
+	if (rte_eal_process_type() != RTE_PROC_SECONDARY)
+		return -EINVAL;
+
+	total_queues = dev->max_queue_pairs * 2 + (dev->hw_cvq ? 1 : 0);
+
+	pp = rte_zmalloc("virtio_user_proc_priv", sizeof(*pp), 0);
+	if (pp == NULL)
+		return -ENOMEM;
+
+	pp->kickfds = rte_malloc("virtio_user_proc_priv",
+				 total_queues * sizeof(int), 0);
+	pp->callfds = rte_malloc("virtio_user_proc_priv",
+				 total_queues * sizeof(int), 0);
+	if (pp->kickfds == NULL || pp->callfds == NULL) {
+		ret = -ENOMEM;
+		goto err_free;
+	}
+	for (i = 0; i < total_queues; i++) {
+		pp->kickfds[i] = -1;
+		pp->callfds[i] = -1;
+	}
+
+	memset(&mp_req, 0, sizeof(mp_req));
+	req_param = (struct virtio_user_mp_param *)mp_req.param;
+	strlcpy(req_param->port_name, eth_dev->data->name,
+		sizeof(req_param->port_name));
+	strlcpy(mp_req.name, VIRTIO_USER_MP_NAME, RTE_MP_MAX_NAME_LEN);
+	mp_req.len_param = sizeof(*req_param);
+	mp_req.num_fds = 0;
+
+	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) < 0 ||
+	    mp_reply.nb_received != 1) {
+		PMD_INIT_LOG(ERR, "Failed to request eventfds from primary");
+		free(mp_reply.msgs);
+		ret = -EIO;
+		goto err_free;
+	}
+
+	mp_rep = &mp_reply.msgs[0];
+	nr_fds = mp_rep->num_fds;
+
+	/* a partially-synced device cannot work: treat any mismatch as fatal */
+	if (nr_fds != (int)total_queues * 2) {
+		PMD_INIT_LOG(ERR, "Expected %u eventfds, received %d",
+			     total_queues * 2, nr_fds);
+		for (i = 0; i < (uint32_t)nr_fds; i++)
+			close(mp_rep->fds[i]);
+		free(mp_reply.msgs);
+		ret = -EPROTO;
+		goto err_free;
+	}
+
+	for (i = 0; i < total_queues; i++) {
+		pp->kickfds[i] = mp_rep->fds[i * 2];
+		pp->callfds[i] = mp_rep->fds[i * 2 + 1];
+	}
+	pp->nr_queues = total_queues;
+	free(mp_reply.msgs);
+
+	eth_dev->process_private = pp;
+	PMD_INIT_LOG(DEBUG, "Synced %u queue eventfds for secondary port %s",
+		     total_queues, eth_dev->data->name);
+	return 0;
+
+err_free:
+	rte_free(pp->kickfds);
+	rte_free(pp->callfds);
+	rte_free(pp);
+	return ret;
+}
+
 /* Dev initialization routine. Invoked once for each virtio vdev at
  * EAL init time, see rte_bus_probe().
  * Returns 0 on success.
@@ -542,6 +753,17 @@ virtio_user_pmd_probe(struct rte_vdev_device *vdev)
 
 		eth_dev->dev_ops = &virtio_user_secondary_eth_dev_ops;
 		eth_dev->device = &vdev->device;
+
+		/* populate this process's eventfds before announcing the port */
+		ret = virtio_user_sync_eventfds(eth_dev, dev);
+		if (ret < 0) {
+			PMD_INIT_LOG(ERR,
+				     "Failed to sync eventfds in secondary: %d",
+				     ret);
+			rte_eth_dev_release_port(eth_dev);
+			return ret;
+		}
+
 		rte_eth_dev_probing_finish(eth_dev);
 		return 0;
 	}
@@ -722,6 +944,36 @@ virtio_user_pmd_probe(struct rte_vdev_device *vdev)
 		}
 	}
 
+	/*
+	 * Register the process-wide MP action once so secondaries can fetch a
+	 * port's eventfds by name. It is intentionally left registered for the
+	 * lifetime of the process (cleaned up at exit): unregistering per device
+	 * cannot drain handler calls already dispatched on the EAL MP thread.
+	 */
+	rte_spinlock_lock(&virtio_user_mp_lock);
+	if (!virtio_user_mp_registered) {
+		ret = rte_mp_action_register(VIRTIO_USER_MP_NAME,
+					     virtio_user_mp_primary_handler);
+		if (ret < 0 && rte_errno != EEXIST) {
+			rte_spinlock_unlock(&virtio_user_mp_lock);
+			if (rte_errno == ENOTSUP) {
+				PMD_INIT_LOG(WARNING,
+					"MP unsupported, secondary eventfd sharing disabled");
+				rte_eth_dev_probing_finish(eth_dev);
+				ret = 0;
+				goto end;
+			}
+			PMD_INIT_LOG(ERR, "Failed to register MP handler: %s",
+				     strerror(rte_errno));
+			virtio_user_dev_uninit(dev);
+			virtio_user_eth_dev_free(eth_dev);
+			ret = -1;
+			goto end;
+		}
+		virtio_user_mp_registered = true;
+	}
+	rte_spinlock_unlock(&virtio_user_mp_lock);
+
 	rte_eth_dev_probing_finish(eth_dev);
 	ret = 0;
 
@@ -749,8 +1001,10 @@ virtio_user_pmd_remove(struct rte_vdev_device *vdev)
 	if (!eth_dev)
 		return 0;
 
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		virtio_user_free_proc_priv(eth_dev);
 		return rte_eth_dev_release_port(eth_dev);
+	}
 
 	/* make sure the device is stopped, queues freed */
 	return rte_eth_dev_close(eth_dev->data->port_id);
-- 
2.52.0



* Re: [PATCH] net/intel: fix use of non-recommended string functions
From: Bruce Richardson @ 2026-06-24 12:33 UTC (permalink / raw)
  To: Loftus, Ciara
  Cc: dev@dpdk.org, Shetty, Praveen, Burakov, Anatoly,
	Medvedkin, Vladimir, Wani, Shaiq, stable@dpdk.org
In-Reply-To: <IA4PR11MB92788919362D4E5CF700DACA8EED2@IA4PR11MB9278.namprd11.prod.outlook.com>

On Wed, Jun 24, 2026 at 10:51:11AM +0100, Loftus, Ciara wrote:
> > Subject: [PATCH] net/intel: fix use of non-recommended string functions
> > 
> > Replace use of the strncpy and strcpy functions with the safer strlcpy
> > alternative, which both bounds-checks and guarantees null termination.
> > In the process also replace instances of strcat with strlcat where
> > appropriate.
> > 
> > Fixes: 2d823ecd671c ("net/cpfl: support device initialization")
> > Fixes: c4c59ae62793 ("net/cpfl: refactor flow parser")
> > Fixes: c10881d3ee74 ("net/cpfl: support flow prog action")
> > Fixes: 9481b0902efe ("net/ice: send driver version to firmware")
> > Fixes: 7f7cbf80bdb7 ("net/ice: factorize firmware loading")
> > Fixes: 549343c25db8 ("net/idpf: support device initialization")
> > Fixes: 484f8e407a94 ("net/igb: support xstats by ID")
> > Fixes: fca82a8accf9 ("net/ixgbe: support xstats by ID")
> > Fixes: e163c18a15b0 ("net/i40e: update ptype and pctype info")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> Acked-by: Ciara Loftus <ciara.loftus@intel.com>
> 
Applied to dpdk-next-net-intel.

/Bruce


* [PATCH v5 24/24] doc: add release notes for BPF validation fixes
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  Cc: dev, Konstantin Ananyev
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

Document the hardening of the BPF validator.

Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 doc/guides/rel_notes/release_26_07.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 8471966a4992..9376e7acad24 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -164,7 +164,7 @@ New Features
     for installing already loaded BPF programs as port callbacks
     (as opposed to loading them directly from ELF files).
 
-* **Added BPF validation debugging API.**
+* **Added BPF validation debugging API and hardened BPF validator.**
 
   * Introduced a new set of APIs (prefixed with ``rte_bpf_validate_debug_``) to
     introspect the BPF validator. This provides a mechanism to set breakpoints
@@ -172,6 +172,10 @@ New Features
     (such as tracked register bounds). This API is crucial primarily for writing
     comprehensive tests for the validator, but also serves as a foundation for a
     future interactive eBPF validation debugger.
+  * Fixed numerous bugs in the BPF validator's abstract interpretation logic,
+    including incorrect bounds tracking for jumps and arithmetic operations, as
+    well as fixing several instances of undefined behavior (UB) when verifying
+    malicious or corrupt programs.
 
 * **Added AI review helpers.**
 
-- 
2.43.0



* [PATCH v5 23/24] bpf/validate: prevent overflow when building graph
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev, stable, Claudia Cauli
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

For a malicious or corrupt BPF program whose number of conditional jumps
exceeds a third of UINT32_MAX, function `evst_pool_init` could cause
arithmetic and buffer overflows when working with the program graph.

Fix the issue by limiting the maximum supported number of conditional
jumps to UINT32_MAX / 4, which is still more than 1 billion.
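
The guard pattern can be illustrated with a standalone sketch. The
function name pool_size_checked and the sizing formula (3 * n + 1) are
assumptions made for illustration only; the real computation in
`evst_pool_init` may differ.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the pool sizing step: stores the number of
 * states to allocate in *nb_states, or returns -E2BIG when the 32-bit
 * arithmetic that follows could wrap around.
 */
int
pool_size_checked(uint32_t nb_jcc_nodes, uint32_t *nb_states)
{
	/* Reject before multiplying, so the product below cannot overflow. */
	if (nb_jcc_nodes > UINT32_MAX / 4)
		return -E2BIG;

	/* Assumed formula; safe for any accepted nb_jcc_nodes. */
	*nb_states = nb_jcc_nodes * 3 + 1;
	return 0;
}
```

Checking against a quarter rather than a third of UINT32_MAX leaves
headroom for the additive terms.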

Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Reported-by: Claudia Cauli <claudiacauli@gmail.com>
Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 lib/bpf/bpf_validate.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index 03c590c75377..f9960088a285 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -2662,6 +2662,10 @@ evst_pool_init(struct bpf_verifier *bvf)
 {
 	uint32_t k, n;
 
+	if (bvf->nb_jcc_nodes > UINT32_MAX / 4)
+		/* Calculations that follow may overflow. */
+		return -E2BIG;
+
 	/*
 	 * We need nb_jcc_nodes + 1 for save_cur/restore_cur
 	 * remaining ones will be used for state tracking/pruning.
-- 
2.43.0



* [PATCH v5 17/24] bpf/validate: fix BPF_JMP empty range handling
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev, stable, Claudia Cauli
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

Function `eval_jcc` did not account for 'dynamically unreachable' code
paths. Some code paths may be _dynamically_ unreachable, which means
that according to the validator's calculations no valid values are left
to evaluate. This does not indicate dead code, since the same code might
be reachable through other code paths. Previous behaviour resulted in:
* undefined behaviour in corner cases;
* ranges breaking min <= max invariant relied upon in multiple places
  (e.g. signed overflow detection in `eval_mul` only checks `s.min` to
  make sure the range is non-negative and so on);
* unnecessary work for the validator, contributing to exponential growth
  of code paths in some cases.

E.g. consider the following program with the current validation code:

    Tested program:
        0:  mov r0, #0x0
        1:  mov r2, #0x2a
        2:  lddw r3, #0x8000000000000000
        4:  jslt r2, r3, L7  ; tested instruction
        5:  mov r0, #0x1
        6:  exit
        7:  mov r0, #0x2
        8:  exit
    Pre-state:
       r2:  42
       r3:  INT64_MIN
    Post-state:
       r2:  42
       r3:  INT64_MIN
    Jump-state:
       r2:  42
       r3:  43..INT64_MIN INTERSECT 0x8000000000000000 (!)

At step 7, after the jump from the tested instruction, the validator
considers r3 to equal 0x8000000000000000 if viewed as unsigned, or to
have the nonsensical range 43..INT64_MIN if viewed as signed. In reality
there is no valid range for this code path, since it will never occur.

With the sanitizer enabled, the following diagnostic is generated:

    lib/bpf/bpf_validate.c:1824:15: runtime error: signed integer
    overflow: -9223372036854775808 - 1 cannot be represented in type
    'long int'
        #0 0x000002761e41 in eval_jslt_jsge lib/bpf/bpf_validate.c:1824
        #1 0x000002762acb in eval_jcc lib/bpf/bpf_validate.c:1881
        #2 0x00000276b749 in evaluate lib/bpf/bpf_validate.c:3245
    ...

    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
    lib/bpf/bpf_validate.c:1824:15

Add pruning of dynamically unreachable code paths that arise from
ordering comparisons. Add tests for remaining ordering jump cases.
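
The idea of pruning an empty range can be sketched for the taken branch
of a signed "less than" jump. This is a simplified illustration, not the
code in the patch: struct srange and narrow_jslt_taken are hypothetical
names, and the validator's real data structures differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified signed range. */
struct srange {
	int64_t min;
	int64_t max;
	bool is_unreachable;
};

/*
 * Narrow dst for the taken branch of "jump if dst < src" (signed).
 * On that branch dst <= src.max - 1 must hold.
 */
struct srange
narrow_jslt_taken(struct srange dst, struct srange src)
{
	/* src.max - 1 would overflow, and no value is below INT64_MIN. */
	if (src.max == INT64_MIN) {
		dst.is_unreachable = true;
		return dst;
	}

	if (dst.max > src.max - 1)
		dst.max = src.max - 1;

	/* No value of dst satisfies the comparison: prune this path. */
	if (dst.min > dst.max)
		dst.is_unreachable = true;

	return dst;
}
```

Marking the state unreachable avoids both the signed overflow and a
range that violates min <= max.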

Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Reported-by: Claudia Cauli <claudiacauli@gmail.com>
Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test/test_bpf_validate.c     | 277 ++++++++++++++++++++++++++++++-
 lib/bpf/bpf_validate.c           |  96 ++++++++---
 lib/bpf/rte_bpf_validate_debug.h |   2 +
 3 files changed, 351 insertions(+), 24 deletions(-)

diff --git a/app/test/test_bpf_validate.c b/app/test/test_bpf_validate.c
index 63db2e252dd3..2755df1e65d9 100644
--- a/app/test/test_bpf_validate.c
+++ b/app/test/test_bpf_validate.c
@@ -135,6 +135,11 @@ static const struct domain unknown = {
 	.u = { .min = 0, .max = UINT64_MAX },
 };
 
+/* Unreachable state. */
+static const struct state unreachable = {
+	.is_unreachable = true,
+};
+
 
 /* BUILDING DOMAINS */
 
@@ -1710,6 +1715,55 @@ test_jmp64_jslt_x(void)
 REGISTER_FAST_TEST(bpf_validate_jmp64_jslt_x_autotest, NOHUGE_OK, ASAN_OK,
 	test_jmp64_jslt_x);
 
+/* Jump on ordering comparisons with potential bound overflow. */
+static int
+test_jmp64_ordering_overflow(void)
+{
+	/* In this test signed and unsigned cases are spelled out explicitly. */
+	const bool also_signed = false;
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JSLT | BPF_X),
+		},
+		.pre.dst = make_singleton_domain(42),
+		.pre.src = make_singleton_domain(INT64_MIN),
+		.jump = unreachable,
+	}, also_signed), "signed less than INT64_MIN");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JSGT | BPF_X),
+		},
+		.pre.dst = make_singleton_domain(42),
+		.pre.src = make_singleton_domain(INT64_MAX),
+		.jump = unreachable,
+	}, also_signed), "signed greater than INT64_MAX");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLT | BPF_X),
+		},
+		.pre.dst = make_singleton_domain(42),
+		.pre.src = make_singleton_domain(0),
+		.jump = unreachable,
+	}, also_signed), "unsigned less than zero");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | BPF_JGT | BPF_X),
+		},
+		.pre.dst = make_singleton_domain(42),
+		.pre.src = make_singleton_domain(UINT64_MAX),
+		.jump = unreachable,
+	}, also_signed), "unsigned greater than UINT64_MAX");
+
+	return TEST_SUCCESS;
+}
+
+REGISTER_FAST_TEST(bpf_validate_jmp64_ordering_overflow_autotest, NOHUGE_OK, ASAN_OK,
+	test_jmp64_ordering_overflow);
+
 /* Jump on ordering comparisons between two ranges. */
 static int
 test_jmp64_ordering_ranges(void)
@@ -1717,6 +1771,29 @@ test_jmp64_ordering_ranges(void)
 	/* All ranges used are valid for both signed and unsigned comparisons. */
 	const bool also_signed = true;
 
+	/*
+	 *               20 ---- dst ---- 60
+	 * 0 - src - 10
+	 */
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLT | BPF_X),
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.pre.src = make_signed_domain(0, 10),
+		.jump = unreachable,
+	}, also_signed), "strict, dst range strongly greater than src range");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLE | BPF_X),
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.pre.src = make_signed_domain(0, 10),
+		.jump = unreachable,
+	}, also_signed), "non-strict, dst range strongly greater than src range");
+
 	/*
 	 *     20 ---- dst ---- 60
 	 * 10 -- src -- 40
@@ -1817,15 +1894,38 @@ test_jmp64_ordering_ranges(void)
 		.post.src = make_signed_domain(40, 59),
 	}, also_signed), "non-strict, dst range weakly less than src range");
 
+	/*
+	 *     20 ---- dst ---- 60
+	 *                          70 - src - 80
+	 */
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLT | BPF_X),
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.pre.src = make_signed_domain(70, 80),
+		.post = unreachable,
+	}, also_signed), "strict, dst range strongly less than src range");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLE | BPF_X),
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.pre.src = make_signed_domain(70, 80),
+		.post = unreachable,
+	}, also_signed), "non-strict, dst range strongly less than src range");
+
 	return TEST_SUCCESS;
 }
 
 REGISTER_FAST_TEST(bpf_validate_jmp64_ordering_ranges_autotest, NOHUGE_OK, ASAN_OK,
 	test_jmp64_ordering_ranges);
 
-/* Jump on ordering comparisons with singleton. */
+/* Jump on ordering comparisons with singleton inside the range. */
 static int
-test_jmp64_ordering_singleton(void)
+test_jmp64_ordering_singleton_inside(void)
 {
 	/* All ranges used are valid for both signed and unsigned comparisons. */
 	const bool also_signed = true;
@@ -1878,8 +1978,177 @@ test_jmp64_ordering_singleton(void)
 	return TEST_SUCCESS;
 }
 
-REGISTER_FAST_TEST(bpf_validate_jmp64_ordering_singleton_autotest, NOHUGE_OK, ASAN_OK,
-	test_jmp64_ordering_singleton);
+REGISTER_FAST_TEST(bpf_validate_jmp64_ordering_singleton_inside_autotest, NOHUGE_OK, ASAN_OK,
+	test_jmp64_ordering_singleton_inside);
+
+/* Jump on ordering comparisons with singleton outside the range. */
+static int
+test_jmp64_ordering_singleton_outside(void)
+{
+	/* All ranges used are valid for both signed and unsigned comparisons. */
+	const bool also_signed = true;
+
+	/*
+	 *       20 ---- dst ---- 60
+	 *  imm
+	 */
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLT | BPF_K),
+			.imm = 10,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.jump = unreachable,
+	}, also_signed), "(BPF_JMP | EBPF_JLT | BPF_K) check, range greater than imm");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLE | BPF_K),
+			.imm = 10,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.jump = unreachable,
+	}, also_signed), "(BPF_JMP | EBPF_JLE | BPF_K) check, range greater than imm");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | BPF_JGT | BPF_K),
+			.imm = 10,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.post = unreachable,
+	}, also_signed), "(BPF_JMP | BPF_JGT | BPF_K) check, range greater than imm");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | BPF_JGE | BPF_K),
+			.imm = 10,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.post = unreachable,
+	}, also_signed), "(BPF_JMP | BPF_JGE | BPF_K) check, range greater than imm");
+
+	/*
+	 *       20 ---- dst ---- 60
+	 *                            imm
+	 */
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLT | BPF_K),
+			.imm = 70,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.post = unreachable,
+	}, also_signed), "(BPF_JMP | EBPF_JLT | BPF_K) check, range less than imm");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | EBPF_JLE | BPF_K),
+			.imm = 70,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.post = unreachable,
+	}, also_signed), "(BPF_JMP | EBPF_JLE | BPF_K) check, range less than imm");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | BPF_JGT | BPF_K),
+			.imm = 70,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.jump = unreachable,
+	}, also_signed), "(BPF_JMP | BPF_JGT | BPF_K) check, range less than imm");
+
+	TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (BPF_JMP | BPF_JGE | BPF_K),
+			.imm = 70,
+		},
+		.pre.dst = make_signed_domain(20, 60),
+		.jump = unreachable,
+	}, also_signed), "(BPF_JMP | BPF_JGE | BPF_K) check, range less than imm");
+
+	return TEST_SUCCESS;
+}
+
+REGISTER_FAST_TEST(bpf_validate_jmp64_ordering_singleton_outside_autotest, NOHUGE_OK, ASAN_OK,
+	test_jmp64_ordering_singleton_outside);
+
+/* Jump on ordering comparisons with ranges "touching" each other. */
+static int
+test_jmp64_ordering_touching(void)
+{
+	/* All ranges used are valid for both signed and unsigned comparisons. */
+	const bool also_signed = true;
+
+	for (int overlap = 0; overlap != 3; ++overlap) {
+
+		/*
+		 *                  20 - dst - 30
+		 * 10 - src - (19 + overlap)
+		 */
+
+		TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+			.tested_instruction = {
+				.code = (BPF_JMP | EBPF_JLT | BPF_X),
+			},
+			.pre.dst = make_signed_domain(20, 30),
+			.pre.src = make_signed_domain(10, 19 + overlap),
+			.jump = overlap <= 1 ? unreachable : (struct state){
+				.dst = make_singleton_domain(20),
+				.src = make_singleton_domain(21),
+			},
+		}, also_signed), "strict, dst left touching src right, overlap=%d", overlap);
+
+		TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+			.tested_instruction = {
+				.code = (BPF_JMP | EBPF_JLE | BPF_X),
+			},
+			.pre.dst = make_signed_domain(20, 30),
+			.pre.src = make_signed_domain(10, 19 + overlap),
+			.jump = overlap < 1 ? unreachable : (struct state){
+				.dst = make_signed_domain(20, 19 + overlap),
+				.src = make_signed_domain(20, 19 + overlap),
+			},
+		}, also_signed), "non-strict, dst left touching src right, overlap=%d", overlap);
+
+		/*
+		 * 10 - dst - (19 + overlap)
+		 *                  20 - src - 30
+		 */
+
+		TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+			.tested_instruction = {
+				.code = (BPF_JMP | EBPF_JLT | BPF_X),
+			},
+			.pre.dst = make_signed_domain(10, 19 + overlap),
+			.pre.src = make_signed_domain(20, 30),
+			.post = overlap < 1 ? unreachable : (struct state){
+				.dst = make_signed_domain(20, 19 + overlap),
+				.src = make_signed_domain(20, 19 + overlap),
+			},
+		}, also_signed), "strict, dst right touching src left, overlap=%d", overlap);
+
+		TEST_ASSERT_SUCCESS(verify_comparison((struct verify_instruction_param){
+			.tested_instruction = {
+				.code = (BPF_JMP | EBPF_JLE | BPF_X),
+			},
+			.pre.dst = make_signed_domain(10, 19 + overlap),
+			.pre.src = make_signed_domain(20, 30),
+			.post = overlap <= 1 ? unreachable : (struct state){
+				.dst = make_singleton_domain(21),
+				.src = make_singleton_domain(20),
+			},
+		}, also_signed), "non-strict, dst right touching src left, overlap=%d", overlap);
+	}
+
+	return TEST_SUCCESS;
+}
+
+REGISTER_FAST_TEST(bpf_validate_jmp64_ordering_touching_autotest, NOHUGE_OK, ASAN_OK,
+	test_jmp64_ordering_touching);
 
 /* 64-bit load from heap (should be set to unknown). */
 static int
diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index 2e535069fe4d..af084e36c8d0 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -19,6 +19,9 @@
 
 #define BPF_ARG_PTR_STACK RTE_BPF_ARG_RESERVED
 
+/* type containing no values (AKA "bottom", "never" etc.) */
+#define BPF_ARG_UNINHABITED ((enum rte_bpf_arg_type)(RTE_BPF_ARG_UNDEF - 1))
+
 struct bpf_reg_val {
 	struct rte_bpf_arg v;
 	uint64_t mask;
@@ -36,6 +39,8 @@ struct bpf_eval_state {
 	SLIST_ENTRY(bpf_eval_state) next; /* for @safe list traversal */
 	struct bpf_reg_val rv[EBPF_REG_NUM];
 	struct bpf_reg_val sv[MAX_BPF_STACK_SIZE / sizeof(uint64_t)];
+	/* flag set for branches determined to be dynamically unreachable */
+	bool unreachable;
 };
 
 SLIST_HEAD(bpf_evst_head, bpf_eval_state);
@@ -174,6 +179,9 @@ __rte_bpf_validate_can_access(const struct bpf_verifier *verifier,
 	struct value_set access_set;
 	uint32_t opsz;
 
+	if (st->unreachable)
+		return -ENOENT;
+
 	switch (BPF_CLASS(access->code)) {
 	case BPF_LDX:
 		rv = &st->rv[access->src_reg];
@@ -310,6 +318,10 @@ __rte_bpf_validate_may_jump(const struct bpf_verifier *verifier,
 	if (!may_jump_code_is_supported(jump->code))
 		return -ENOTSUP;
 
+	if (st->unreachable)
+		/* Set no bits since neither false nor true is possible. */
+		return 0;
+
 	rd = &st->rv[jump->dst_reg];
 	dst_set = (rd->v.type == RTE_BPF_ARG_UNDEF) ? value_set_full :
 		value_set_from_pair(rd->s.min, rd->s.max, rd->u.min, rd->u.max);
@@ -1521,40 +1533,68 @@ static void
 eval_jgt_jle(struct bpf_reg_val *trd, struct bpf_reg_val *trs,
 	struct bpf_reg_val *frd, struct bpf_reg_val *frs)
 {
-	frd->u.max = RTE_MIN(frd->u.max, frs->u.max);
-	frs->u.min = RTE_MAX(frs->u.min, frd->u.min);
-	trd->u.min = RTE_MAX(trd->u.min, trs->u.min + 1);
-	trs->u.max = RTE_MIN(trs->u.max, trd->u.max - 1);
+	if (frd->u.min <= frs->u.max) {
+		frd->u.max = RTE_MIN(frd->u.max, frs->u.max);
+		frs->u.min = RTE_MAX(frs->u.min, frd->u.min);
+	} else
+		frd->v.type = frs->v.type = BPF_ARG_UNINHABITED;
+
+	if (trs->u.min < trd->u.max) {
+		trd->u.min = RTE_MAX(trd->u.min, trs->u.min + 1);
+		trs->u.max = RTE_MIN(trs->u.max, trd->u.max - 1);
+	} else
+		trd->v.type = trs->v.type = BPF_ARG_UNINHABITED;
 }
 
 static void
 eval_jlt_jge(struct bpf_reg_val *trd, struct bpf_reg_val *trs,
 	struct bpf_reg_val *frd, struct bpf_reg_val *frs)
 {
-	frd->u.min = RTE_MAX(frd->u.min, frs->u.min);
-	frs->u.max = RTE_MIN(frs->u.max, frd->u.max);
-	trd->u.max = RTE_MIN(trd->u.max, trs->u.max - 1);
-	trs->u.min = RTE_MAX(trs->u.min, trd->u.min + 1);
+	if (frs->u.min <= frd->u.max) {
+		frd->u.min = RTE_MAX(frd->u.min, frs->u.min);
+		frs->u.max = RTE_MIN(frs->u.max, frd->u.max);
+	} else
+		frd->v.type = frs->v.type = BPF_ARG_UNINHABITED;
+
+	if (trd->u.min < trs->u.max) {
+		trd->u.max = RTE_MIN(trd->u.max, trs->u.max - 1);
+		trs->u.min = RTE_MAX(trs->u.min, trd->u.min + 1);
+	} else
+		trd->v.type = trs->v.type = BPF_ARG_UNINHABITED;
 }
 
 static void
 eval_jsgt_jsle(struct bpf_reg_val *trd, struct bpf_reg_val *trs,
 	struct bpf_reg_val *frd, struct bpf_reg_val *frs)
 {
-	frd->s.max = RTE_MIN(frd->s.max, frs->s.max);
-	frs->s.min = RTE_MAX(frs->s.min, frd->s.min);
-	trd->s.min = RTE_MAX(trd->s.min, trs->s.min + 1);
-	trs->s.max = RTE_MIN(trs->s.max, trd->s.max - 1);
+	if (frd->s.min <= frs->s.max) {
+		frd->s.max = RTE_MIN(frd->s.max, frs->s.max);
+		frs->s.min = RTE_MAX(frs->s.min, frd->s.min);
+	} else
+		frd->v.type = frs->v.type = BPF_ARG_UNINHABITED;
+
+	if (trs->s.min < trd->s.max) {
+		trd->s.min = RTE_MAX(trd->s.min, trs->s.min + 1);
+		trs->s.max = RTE_MIN(trs->s.max, trd->s.max - 1);
+	} else
+		trd->v.type = trs->v.type = BPF_ARG_UNINHABITED;
 }
 
 static void
 eval_jslt_jsge(struct bpf_reg_val *trd, struct bpf_reg_val *trs,
 	struct bpf_reg_val *frd, struct bpf_reg_val *frs)
 {
-	frd->s.min = RTE_MAX(frd->s.min, frs->s.min);
-	frs->s.max = RTE_MIN(frs->s.max, frd->s.max);
-	trd->s.max = RTE_MIN(trd->s.max, trs->s.max - 1);
-	trs->s.min = RTE_MAX(trs->s.min, trd->s.min + 1);
+	if (frs->s.min <= frd->s.max) {
+		frd->s.min = RTE_MAX(frd->s.min, frs->s.min);
+		frs->s.max = RTE_MIN(frs->s.max, frd->s.max);
+	} else
+		frd->v.type = frs->v.type = BPF_ARG_UNINHABITED;
+
+	if (trd->s.min < trs->s.max) {
+		trd->s.max = RTE_MIN(trd->s.max, trs->s.max - 1);
+		trs->s.min = RTE_MAX(trs->s.min, trd->s.min + 1);
+	} else
+		trd->v.type = trs->v.type = BPF_ARG_UNINHABITED;
 }
 
 static const char *
@@ -1609,6 +1649,14 @@ eval_jcc(struct bpf_verifier *bvf, const struct ebpf_insn *ins)
 	else if (op == EBPF_JSGE)
 		eval_jslt_jsge(frd, frs, trd, trs);
 
+	if (trd->v.type == BPF_ARG_UNINHABITED ||
+			trs->v.type == BPF_ARG_UNINHABITED)
+		tst->unreachable = true;
+
+	if (frd->v.type == BPF_ARG_UNINHABITED ||
+			frs->v.type == BPF_ARG_UNINHABITED)
+		fst->unreachable = true;
+
 	return NULL;
 }
 
@@ -2349,7 +2397,7 @@ set_edge_type(struct bpf_verifier *bvf, struct inst_node *node,
  * Depth-First Search (DFS) through previously constructed
  * Control Flow Graph (CFG).
  * Information collected at this path would be used later
- * to determine is there any loops, and/or unreachable instructions.
+ * to determine is there any loops, and/or statically unreachable instructions.
  * PREREQUISITE: there is at least one node.
  */
 static void
@@ -2397,7 +2445,7 @@ dfs(struct bpf_verifier *bvf)
 }
 
 /*
- * report unreachable instructions.
+ * report statically unreachable instructions.
  */
 static void
 log_unreachable(const struct bpf_verifier *bvf)
@@ -2970,13 +3018,21 @@ evaluate(struct bpf_verifier *bvf)
 				stats.nb_restore++;
 			}
 
+			if (bvf->evst->unreachable) {
+				rc = __rte_bpf_validate_debug_evaluate_step(
+					debug, get_node_idx(bvf, next),
+					RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_UNREACHABLE);
+				if (rc < 0)
+					break;
+
+				next = NULL;
 			/*
 			 * for jcc targets: check did we already evaluated
 			 * that path and can it's evaluation be skipped that
 			 * time.
 			 */
-			if (node->nb_edge > 1 && prune_eval_state(bvf, node,
-					next) == 0) {
+			} else if (node->nb_edge > 1 &&
+					prune_eval_state(bvf, node, next) == 0) {
 				rc = __rte_bpf_validate_debug_evaluate_step(
 					debug, get_node_idx(bvf, next),
 					RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_PRUNE);
diff --git a/lib/bpf/rte_bpf_validate_debug.h b/lib/bpf/rte_bpf_validate_debug.h
index 89bf587f0211..f30fa926f10a 100644
--- a/lib/bpf/rte_bpf_validate_debug.h
+++ b/lib/bpf/rte_bpf_validate_debug.h
@@ -49,6 +49,8 @@ enum rte_bpf_validate_debug_event {
 	RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_PRUNE,
 	/* End of branch verification, after the last verified instruction. */
 	RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_RETURN,
+	/* Pruning branch as dynamically unreachable. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_UNREACHABLE,
 	/* Number of valid event values. */
 	RTE_BPF_VALIDATE_DEBUG_EVENT_END,
 };
-- 
2.43.0



* [PATCH v5 22/24] bpf/validate: fix BPF_XOR signed min calculation
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev, stable, Claudia Cauli
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

Function `eval_xor` calculated the signed minimum using an essentially
unsigned algorithm as long as any of the operands had a non-negative
range, which is incorrect since it ignores negative values of the other
operand, which may have the sign bit or any other bits set.

E.g. consider the following program with the current validation code:

    Tested program:
        0:  mov r0, #0x0
        1:  ldxdw r2, [r1 + 0]
        2:  jsgt r2, #0x0, L5
        3:  xor r2, #0x0  ; tested instruction
        4:  mov r0, #0x1
        5:  exit
    Pre-state:
       r2:  INT64_MIN..0
    Post-state:
       r2:  0

After the tested instruction the validator considers r2 to equal 0;
however, if -1 was loaded at step 1 it is possible for it to be -1,
since -1 ^ 0 == -1.

Set signed range to full if any of the operands can be negative,
otherwise (if both operands are non-negative) use same algorithm as for
unsigned numbers. Add test.
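
For illustration only, the rule above can be sketched as a standalone
function. This is a minimal sketch, not the code in lib/bpf: `struct
srange`, `cover_mask` and `xor_signed_range` are hypothetical names, and
`cover_mask` is a simple stand-in for the upper-bound helper used by the
validator (it is assumed, not confirmed, that both give a valid upper
bound for non-negative inputs).

```c
#include <stdint.h>

/* Hypothetical closed interval [min, max] of signed values. */
struct srange {
	int64_t min;
	int64_t max;
};

/* Smallest all-ones mask covering a non-negative v, e.g. 5 -> 7. */
static int64_t
cover_mask(int64_t v)
{
	uint64_t m = (uint64_t)v;

	m |= m >> 1;
	m |= m >> 2;
	m |= m >> 4;
	m |= m >> 8;
	m |= m >> 16;
	m |= m >> 32;
	return (int64_t)m;
}

/* Signed bounds of a ^ b for a in ra and b in rb. */
static struct srange
xor_signed_range(struct srange ra, struct srange rb)
{
	struct srange r;

	if (ra.min == ra.max && rb.min == rb.max) {
		/* both operands are constants: the result is exact */
		r.min = r.max = ra.min ^ rb.min;
	} else if (ra.min >= 0 && rb.min >= 0) {
		/* both operands are non-negative: unsigned reasoning applies */
		r.min = 0;
		r.max = cover_mask(ra.max) | cover_mask(rb.max);
	} else {
		/* a negative value is possible: give up and use the full range */
		r.min = INT64_MIN;
		r.max = INT64_MAX;
	}
	return r;
}
```

With the pre-state from the example (INT64_MIN..0 XOR 0), the third
branch is taken and the result is the full signed range, which contains
the counterexample -1.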

Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Reported-by: Claudia Cauli <claudiacauli@gmail.com>
Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test/test_bpf_validate.c | 17 +++++++++++++++++
 lib/bpf/bpf_validate.c       |  2 +-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/app/test/test_bpf_validate.c b/app/test/test_bpf_validate.c
index 5cf8d99effb5..3f6747eaf6e9 100644
--- a/app/test/test_bpf_validate.c
+++ b/app/test/test_bpf_validate.c
@@ -1764,6 +1764,23 @@ test_alu64_sub_x_src_signed_max_zero(void)
 REGISTER_FAST_TEST(bpf_validate_alu64_sub_x_src_signed_max_zero_autotest, NOHUGE_OK, ASAN_OK,
 	test_alu64_sub_x_src_signed_max_zero);
 
+/* 64-bit bitwise XOR between a negative scalar range and zero immediate. */
+static int
+test_alu64_xor_k_negative(void)
+{
+	return verify_instruction((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (EBPF_ALU64 | BPF_XOR | BPF_K),
+			.imm = 0,
+		},
+		.pre.dst = make_signed_domain(INT64_MIN, 0),
+		.post.dst = unknown,
+	});
+}
+
+REGISTER_FAST_TEST(bpf_validate_alu64_xor_k_negative_autotest, NOHUGE_OK, ASAN_OK,
+	test_alu64_xor_k_negative);
+
 /* Jump if greater than immediate. */
 static int
 test_jmp64_jeq_k(void)
diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index 131a5468dbc4..03c590c75377 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -910,7 +910,7 @@ eval_xor(struct bpf_reg_val *rd, const struct bpf_reg_val *rs, size_t opsz,
 		rd->s.max ^= rs->s.max;
 
 	/* both operands are non-negative */
-	} else if (rd->s.min >= 0 || rs->s.min >= 0) {
+	} else if (rd->s.min >= 0 && rs->s.min >= 0) {
 		rd->s.max = eval_uor_max(rd->s.max, rs->s.max, opsz);
 		rd->s.min = 0;
 	} else
-- 
2.43.0



* [PATCH v5 21/24] bpf/validate: fix BPF_SUB signed max zero case
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev, stable, Claudia Cauli
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

Function `eval_sub` used the source register signed minimum to detect
overflow of the difference (operation result) signed minimum, and the
source register signed maximum to detect overflow of the difference
signed maximum. However, in the actual formula for the difference the
source register bounds are swapped (correctly, since the source is
subtracted), so they should have been swapped in the overflow detection
as well. This caused false negatives in certain cases.

E.g. consider the following program with the current validation code:

    Tested program:
        0:  mov r0, #0x0
        1:  ldxdw r2, [r1 + 0]
        2:  jsgt r2, #0x0, L7
        3:  ldxdw r3, [r1 + 8]
        4:  jsgt r3, #0x0, L7
        5:  sub r2, r3  ; tested instruction
        6:  mov r0, #0x1
        7:  exit
    Pre-state:
       r2:  INT64_MIN..0
       r3:  INT64_MIN..0
    Post-state:
       r2:  INT64_MIN

The validator ignores overflow of signed minimum and considers the result
to always equal INT64_MIN. However, if -1 was loaded at step 1 and -2 was
loaded at step 3, it is possible for the difference to equal 1.

Swap source register signed minimum and maximum in the overflow
condition to match the new range formula, add test.
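
For illustration only, the swapped bounds can be seen in a standalone
sketch. Note that it uses a different overflow check than this patch: the
GCC/Clang builtin `__builtin_sub_overflow` instead of comparing the
wrapped result with the operands. `struct srange` and `sub_signed_range`
are hypothetical names that do not exist in lib/bpf.

```c
#include <stdint.h>

/* Hypothetical closed interval [min, max] of signed values. */
struct srange {
	int64_t min;
	int64_t max;
};

/* Signed bounds of d - s for d in rd and s in rs. */
static struct srange
sub_signed_range(struct srange rd, struct srange rs)
{
	struct srange r;

	/*
	 * The smallest difference subtracts the largest source value and
	 * the largest difference subtracts the smallest one, so the source
	 * bounds are swapped relative to the destination bounds.
	 */
	if (__builtin_sub_overflow(rd.min, rs.max, &r.min) ||
	    __builtin_sub_overflow(rd.max, rs.min, &r.max)) {
		/* either bound overflowed: fall back to the full range */
		r.min = INT64_MIN;
		r.max = INT64_MAX;
	}
	return r;
}
```

With both operands in INT64_MIN..0, the second subtraction
(0 - INT64_MIN) overflows, so the sketch returns the full range, which
contains the counterexample -1 - (-2) == 1.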

Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Reported-by: Claudia Cauli <claudiacauli@gmail.com>
Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test/test_bpf_validate.c | 17 +++++++++++++++++
 lib/bpf/bpf_validate.c       |  4 ++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/app/test/test_bpf_validate.c b/app/test/test_bpf_validate.c
index dfcf49ccb936..5cf8d99effb5 100644
--- a/app/test/test_bpf_validate.c
+++ b/app/test/test_bpf_validate.c
@@ -1747,6 +1747,23 @@ test_alu64_or_k_positive(void)
 REGISTER_FAST_TEST(bpf_validate_alu64_or_k_positive_autotest, NOHUGE_OK, ASAN_OK,
 	test_alu64_or_k_positive);
 
+/* 64-bit difference between two negative ranges. */
+static int
+test_alu64_sub_x_src_signed_max_zero(void)
+{
+	return verify_instruction((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (EBPF_ALU64 | BPF_SUB | BPF_X),
+		},
+		.pre.dst = make_signed_domain(INT64_MIN, 0),
+		.pre.src = make_signed_domain(INT64_MIN, 0),
+		.post.dst = unknown,
+	});
+}
+
+REGISTER_FAST_TEST(bpf_validate_alu64_sub_x_src_signed_max_zero_autotest, NOHUGE_OK, ASAN_OK,
+	test_alu64_sub_x_src_signed_max_zero);
+
 /* Jump if greater than immediate. */
 static int
 test_jmp64_jeq_k(void)
diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index abb39cfd328d..131a5468dbc4 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -716,9 +716,9 @@ eval_sub(struct bpf_reg_val *rd, const struct bpf_reg_val *rs, uint64_t msk)
 		eval_umax_bound(&rv, msk);
 
 	if ((rd->s.min != rd->s.max || rs->s.min != rs->s.max) &&
-			(((rs->s.min < 0 && rv.s.min < rd->s.min) ||
+			(((rs->s.max < 0 && rv.s.min < rd->s.min) ||
 			rv.s.min > rd->s.min) ||
-			((rs->s.max < 0 && rv.s.max < rd->s.max) ||
+			((rs->s.min < 0 && rv.s.max < rd->s.max) ||
 			rv.s.max > rd->s.max)))
 		eval_smax_bound(&rv, msk);
 
-- 
2.43.0



* [PATCH v5 20/24] bpf/validate: fix BPF_OR min calculations
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev, stable, Claudia Cauli
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

This commit fixes two different problems in signed and unsigned minimum
calculations within `eval_or`. Passing tests requires both problems to
be fixed which is why the changes are squashed in one commit.

1) Function `eval_or` calculated result signed minimum as bitwise OR
between corresponding minimums as long as any of them is non-negative,
which is incorrect since values within the range can have zeroes where
the minimums don't, including the sign bit.

E.g. consider the following program with the current validation code:

    Tested program:
        0:  mov r0, #0x0
        1:  ldxdw r2, [r1 + 0]
        2:  jlt r2, #0x5, L8
        3:  jgt r2, #0x6, L8
        4:  jslt r2, #0x5, L8
        5:  jsgt r2, #0x6, L8
        6:  or r2, #0xfffffffe  ; tested instruction
        7:  mov r0, #0x1
        8:  exit
    Pre-state:
       r2:  5..6
    Post-state:
       r2:  -1

After the tested instruction the validator considers r2 to always equal
-1; however, if 6 was loaded at step 1 it is possible for it to be -2:

    0x6 | 0xfffffffffffffffe == 0xfffffffffffffffe == -2

Set signed range to full if any of the operands can be negative,
otherwise use the maximum of both minimums as a new signed minimum
following the idea that result of bitwise OR cannot be smaller than its
operands. Add test.

2) Function `eval_or` calculated result unsigned minimum as bitwise OR
between corresponding minimums, which is incorrect since values within
the range can have zeroes where the minimums don't.

E.g. consider the following program with the current validation code:

    Tested program:
        0:  mov r0, #0x0
        1:  ldxdw r2, [r1 + 0]
        2:  jlt r2, #0x5, L8
        3:  jgt r2, #0x6, L8
        4:  jslt r2, #0x5, L8
        5:  jsgt r2, #0x6, L8
        6:  or r2, #0x2  ; tested instruction
        7:  mov r0, #0x1
        8:  exit
    Pre-state:
       r2:  5..6
    Post-state:
       r2:  7

After the tested instruction the validator considers r2 to always equal
7; however, if 6 was loaded at step 1 it is possible for it to be 6:

    0x6 | 0x2 == 0x6

Use the maximum of both minimums as a new unsigned minimum following the
idea that result of bitwise OR cannot be smaller than its operands. Add
test.
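
For illustration only, the claim can be checked by brute force on small
ranges. This is a minimal sketch; `or_true_min` is a hypothetical helper
that is not part of lib/bpf, and it is only meant for small ranges whose
upper bounds are below UINT64_MAX.

```c
#include <stdint.h>

/*
 * Exact minimum of x | y over x in [amin, amax] and y in [bmin, bmax],
 * found by trying every pair. Assumes amax and bmax are below
 * UINT64_MAX so that the loop counters cannot wrap.
 */
static uint64_t
or_true_min(uint64_t amin, uint64_t amax, uint64_t bmin, uint64_t bmax)
{
	uint64_t best = UINT64_MAX;

	for (uint64_t x = amin; x <= amax; x++) {
		for (uint64_t y = bmin; y <= bmax; y++) {
			if ((x | y) < best)
				best = x | y;
		}
	}
	return best;
}
```

For the ranges from the second example (5..6 OR 2), the exact minimum is
6. The old bound, 5 | 2 == 7, is above it and therefore unsound, while
the new bound, max(5, 2) == 5, is below it and therefore safe, if not
tight.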

Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Reported-by: Claudia Cauli <claudiacauli@gmail.com>
Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test/test_bpf_validate.c | 34 ++++++++++++++++++++++++++++++++++
 lib/bpf/bpf_validate.c       |  6 +++---
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/app/test/test_bpf_validate.c b/app/test/test_bpf_validate.c
index 55a2f383bd23..dfcf49ccb936 100644
--- a/app/test/test_bpf_validate.c
+++ b/app/test/test_bpf_validate.c
@@ -1713,6 +1713,40 @@ test_alu64_neg_zero_last(void)
 REGISTER_FAST_TEST(bpf_validate_alu64_neg_zero_last_autotest, NOHUGE_OK, ASAN_OK,
 	test_alu64_neg_zero_last);
 
+/* 64-bit bitwise OR between a positive scalar range and negative immediate. */
+static int
+test_alu64_or_k_negative(void)
+{
+	return verify_instruction((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (EBPF_ALU64 | BPF_OR | BPF_K),
+			.imm = -2,
+		},
+		.pre.dst = make_signed_domain(5, 6),
+		.post.dst = make_signed_domain(-2, -1),
+	});
+}
+
+REGISTER_FAST_TEST(bpf_validate_alu64_or_k_negative_autotest, NOHUGE_OK, ASAN_OK,
+	test_alu64_or_k_negative);
+
+/* 64-bit bitwise OR between a positive scalar range and positive immediate. */
+static int
+test_alu64_or_k_positive(void)
+{
+	return verify_instruction((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (EBPF_ALU64 | BPF_OR | BPF_K),
+			.imm = 2,
+		},
+		.pre.dst = make_signed_domain(5, 6),
+		.post.dst = make_signed_domain(5, 7),
+	});
+}
+
+REGISTER_FAST_TEST(bpf_validate_alu64_or_k_positive_autotest, NOHUGE_OK, ASAN_OK,
+	test_alu64_or_k_positive);
+
 /* Jump if greater than immediate. */
 static int
 test_jmp64_jeq_k(void)
diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index 4e4c0ddeb2b8..abb39cfd328d 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -875,7 +875,7 @@ eval_or(struct bpf_reg_val *rd, const struct bpf_reg_val *rs, size_t opsz,
 		rd->u.max |= rs->u.max;
 	} else {
 		rd->u.max = eval_uor_max(rd->u.max, rs->u.max, opsz);
-		rd->u.min |= rs->u.min;
+		rd->u.min = RTE_MAX(rd->u.min, rs->u.min);
 	}
 
 	/* both operands are constants */
@@ -884,9 +884,9 @@ eval_or(struct bpf_reg_val *rd, const struct bpf_reg_val *rs, size_t opsz,
 		rd->s.max |= rs->s.max;
 
 	/* both operands are non-negative */
-	} else if (rd->s.min >= 0 || rs->s.min >= 0) {
+	} else if (rd->s.min >= 0 && rs->s.min >= 0) {
 		rd->s.max = eval_uor_max(rd->s.max, rs->s.max, opsz);
-		rd->s.min |= rs->s.min;
+		rd->s.min = RTE_MAX(rd->s.min, rs->s.min);
 	} else
 		eval_smax_bound(rd, msk);
 }
-- 
2.43.0



* [PATCH v5 19/24] bpf/validate: fix BPF_LSH shift-out-of-bounds UB
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev, stable, Claudia Cauli
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

Function `eval_lsh`, when validating a left shift by 63, invoked macro
`RTE_LEN2MASK(0, int64_t)`, which triggered shift-out-of-bounds
undefined behaviour.

E.g. consider the following program with the current validation code:

    Tested program:
        0:  mov r0, #0x0
        1:  ldxdw r2, [r1 + 0]
        2:  jlt r2, #0x3, L8
        3:  jgt r2, #0x5, L8
        4:  jslt r2, #0x3, L8
        5:  jsgt r2, #0x5, L8
        6:  lsh r2, #0x3f  ; tested instruction
        7:  mov r0, #0x1
        8:  exit
    Pre-state:
       r2:  3..5
    Post-state:
       r2:  0..UINT64_MAX

With sanitizer the following diagnostic is generated:

    lib/bpf/bpf_validate.c:785:4: runtime error: shift exponent 64 is
    too large for 64-bit type 'long unsigned int'
        #0 0x00000274d5e0 in eval_lsh lib/bpf/bpf_validate.c:785
        #1 0x00000275a2ea in eval_alu lib/bpf/bpf_validate.c:1310
        #2 0x00000276ce3d in evaluate lib/bpf/bpf_validate.c:3284

Add guard for this case, add test.
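
For illustration only, the same kind of guard can be sketched for a
standalone mask helper. The sanitizer message suggests that a mask of
length 0 ends up shifting a 64-bit value by 64; the helper below assumes
that construction and is not the RTE_LEN2MASK macro itself.
`low_bits_mask` is a hypothetical name.

```c
#include <stdint.h>

/* Mask with the low `len` bits set, for len in 0..64. */
static uint64_t
low_bits_mask(unsigned int len)
{
	/*
	 * Shifting a 64-bit value by 64 is undefined behaviour in C,
	 * so the empty mask has to be handled separately.
	 */
	if (len == 0)
		return 0;

	return UINT64_MAX >> (64 - len);
}
```

The patch applies the guard at the call site instead, comparing against
0 directly when the shift amount equals opsz - 1.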

Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Reported-by: Claudia Cauli <claudiacauli@gmail.com>
Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 app/test/test_bpf_validate.c | 17 +++++++++++++++++
 lib/bpf/bpf_validate.c       |  3 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/app/test/test_bpf_validate.c b/app/test/test_bpf_validate.c
index 40ed84ca67ff..55a2f383bd23 100644
--- a/app/test/test_bpf_validate.c
+++ b/app/test/test_bpf_validate.c
@@ -1536,6 +1536,23 @@ test_alu64_div_mod_overflow(void)
 REGISTER_FAST_TEST(bpf_validate_alu64_div_mod_overflow_autotest, NOHUGE_OK, ASAN_OK,
 	test_alu64_div_mod_overflow);
 
+/* 64-bit left shift by 63. */
+static int
+test_alu64_lsh_63(void)
+{
+	return verify_instruction((struct verify_instruction_param){
+		.tested_instruction = {
+			.code = (EBPF_ALU64 | BPF_LSH | BPF_K),
+			.imm = 63,
+		},
+		.pre.dst = make_signed_domain(3, 5),
+		.post.dst = unknown,
+	});
+}
+
+REGISTER_FAST_TEST(bpf_validate_alu64_lsh_63_autotest, NOHUGE_OK, ASAN_OK,
+	test_alu64_lsh_63);
+
 /* 64-bit multiplication of constant and immediate with overflow. */
 static int
 test_alu64_mul_k_overflow(void)
diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index d4d8ec4251f1..4e4c0ddeb2b8 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -746,7 +746,8 @@ eval_lsh(struct bpf_reg_val *rd, const struct bpf_reg_val *rs, size_t opsz,
 
 	/* check that dreg values are and would remain always positive */
 	if ((uint64_t)rd->s.min >> (opsz - 1) != 0 || rd->s.max >=
-			RTE_LEN2MASK(opsz - rs->u.max - 1, int64_t))
+			(rs->u.max == opsz - 1 ? 0 :
+				 RTE_LEN2MASK(opsz - rs->u.max - 1, int64_t)))
 		eval_smax_bound(rd, msk);
 	else {
 		rd->s.max <<= rs->u.max;
-- 
2.43.0



* [PATCH v5 05/24] bpf/validate: introduce debugging interface
From: Marat Khalili @ 2026-06-24 12:17 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev
In-Reply-To: <20260624121800.40635-1-marat.khalili@huawei.com>

Introduce a debugging interface for the BPF validator. The new API lets
one observe evaluation of the validated BPF program: stepping through
evaluation, setting break- and catchpoints, inspecting possible jumps
and memory accesses in the current state, and formatting current state
elements for the user. It can be used to build both automated tests and
interactive validation debuggers without tight coupling to a specific
validator implementation.

Signed-off-by: Marat Khalili <marat.khalili@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
---
 doc/guides/prog_guide/bpf_lib.rst      |  31 ++
 doc/guides/rel_notes/release_26_07.rst |  10 +-
 lib/bpf/bpf_validate.c                 | 448 ++++++++++++++++-
 lib/bpf/bpf_validate.h                 |  60 +++
 lib/bpf/bpf_validate_debug.c           | 659 +++++++++++++++++++++++++
 lib/bpf/bpf_validate_debug.h           |  86 ++++
 lib/bpf/bpf_value_set.c                | 403 +++++++++++++++
 lib/bpf/bpf_value_set.h                | 126 +++++
 lib/bpf/meson.build                    |   9 +-
 lib/bpf/rte_bpf.h                      |   4 +
 lib/bpf/rte_bpf_validate_debug.h       | 375 ++++++++++++++
 11 files changed, 2205 insertions(+), 6 deletions(-)
 create mode 100644 lib/bpf/bpf_validate.h
 create mode 100644 lib/bpf/bpf_validate_debug.c
 create mode 100644 lib/bpf/bpf_validate_debug.h
 create mode 100644 lib/bpf/bpf_value_set.c
 create mode 100644 lib/bpf/bpf_value_set.h
 create mode 100644 lib/bpf/rte_bpf_validate_debug.h
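
Note for readers: the "may jump" query in this patch reports which outcomes
of a conditional jump are still possible in the current state. The following
is a deliberately simplified sketch of that idea for a single unsigned
"dst > imm" comparison over one contiguous interval. It is not the patch's
value_set implementation, and the names sketch_interval, SKETCH_MAY_BE_TRUE,
SKETCH_MAY_BE_FALSE and sketch_may_jump_gt are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result bits; both set means the outcome is undetermined. */
enum {
	SKETCH_MAY_BE_TRUE = 1,
	SKETCH_MAY_BE_FALSE = 2,
};

/* Inclusive range of values a register may hold. */
struct sketch_interval {
	uint64_t min;
	uint64_t max;
};

/* Possible truth values of the unsigned comparison "dst > imm". */
static int
sketch_may_jump_gt(struct sketch_interval dst, uint64_t imm)
{
	int result = 0;

	assert(dst.min <= dst.max);

	/* Some value in the range exceeds imm: the jump may be taken. */
	if (dst.max > imm)
		result |= SKETCH_MAY_BE_TRUE;

	/* Some value in the range does not exceed imm: it may fall through. */
	if (dst.min <= imm)
		result |= SKETCH_MAY_BE_FALSE;

	return result;
}
```

For a register known to be in [3, 5], comparing against 4 yields both bits,
against 2 only the "true" bit, and against 5 only the "false" bit.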

diff --git a/doc/guides/prog_guide/bpf_lib.rst b/doc/guides/prog_guide/bpf_lib.rst
index ed07a9f9a2c0..b5e52c097cb1 100644
--- a/doc/guides/prog_guide/bpf_lib.rst
+++ b/doc/guides/prog_guide/bpf_lib.rst
@@ -116,6 +116,37 @@ For example, ``(BPF_IND | BPF_W | BPF_LD)`` means:
 and ``R1-R5`` were scratched.
 
 
+Validation Debugging
+--------------------
+
+The DPDK BPF library includes a validation debugging API designed primarily for
+writing comprehensive unit tests for the eBPF verifier. It allows developers to
+introspect the abstract interpretation process step-by-step to guarantee that
+the verifier correctly models the semantics of eBPF instructions.
+
+The validation debugging API operates using a gdb-like approach:
+
+1.  **Initialization:** Create a debug session using
+    ``rte_bpf_validate_debug_create()`` and pass it to the loader via the
+    ``debug`` field in ``struct rte_bpf_prm_ex``.
+2.  **Breakpoints and Catchpoints:** Before loading, use
+    ``rte_bpf_validate_debug_break()`` or ``rte_bpf_validate_debug_catch()``
+    to register callback functions that trigger at specific instruction indices
+    (program counters) or upon specific validation events.
+3.  **State Introspection:** Within the callbacks, the API provides functions
+    like ``rte_bpf_validate_debug_can_access()``,
+    ``rte_bpf_validate_debug_may_jump()``, and various formatting functions
+    to safely inspect the verifier's internal belief about register bounds
+    and memory states at that specific execution point.
+
+When adding a test for a new eBPF instruction or fixing a validator bug,
+developers should utilize the harness provided in
+``app/test/test_bpf_validate.c``. This harness encapsulates the debugging API,
+allowing you to define the expected abstract domains (signed and unsigned
+intervals) for registers before and after a tested instruction, generating
+the necessary eBPF bytecode and breakpoints automatically.
+
+
 Not currently supported eBPF features
 -------------------------------------
 
diff --git a/doc/guides/rel_notes/release_26_07.rst b/doc/guides/rel_notes/release_26_07.rst
index 0b1cac3e0d2f..8471966a4992 100644
--- a/doc/guides/rel_notes/release_26_07.rst
+++ b/doc/guides/rel_notes/release_26_07.rst
@@ -164,12 +164,20 @@ New Features
     for installing already loaded BPF programs as port callbacks
     (as opposed to loading them directly from ELF files).
 
+* **Added BPF validation debugging API.**
+
+  * Introduced a new set of APIs (prefixed with ``rte_bpf_validate_debug_``) to
+    introspect the BPF validator. This provides a mechanism to set breakpoints
+    or catchpoints during validation and inspect the verifier's internal state
+    (such as tracked register bounds). This API is intended primarily for writing
+    comprehensive tests for the validator, but also serves as a foundation for a
+    future interactive eBPF validation debugger.
+
 * **Added AI review helpers.**
 
   Added AGENTS.md file for AI review
   and supporting scripts to review patches and documentation.
 
-
 Removed Items
 -------------
 
diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
index 362d00c77095..f3f462920a3d 100644
--- a/lib/bpf/bpf_validate.c
+++ b/lib/bpf/bpf_validate.c
@@ -9,9 +9,13 @@
 #include <stdint.h>
 #include <inttypes.h>
 
+#include <rte_bpf_validate_debug.h>
 #include <rte_common.h>
 
 #include "bpf_impl.h"
+#include "bpf_validate.h"
+#include "bpf_validate_debug.h"
+#include "bpf_value_set.h"
 
 #define BPF_ARG_PTR_STACK RTE_BPF_ARG_RESERVED
 
@@ -92,6 +96,7 @@ struct bpf_verifier {
 	struct inst_node *evin;
 	struct evst_pool evst_sr_pool; /* for evst save/restore */
 	struct evst_pool evst_tp_pool; /* for evst track/prune */
+	struct rte_bpf_validate_debug *debug;
 };
 
 struct bpf_ins_check {
@@ -118,6 +123,409 @@ struct bpf_ins_check {
 /* For LD_IND R6 is an implicit CTX register. */
 #define	IND_SRC_REGS	(WRT_REGS ^ 1 << EBPF_REG_6)
 
+/*
+ * Debugging internal interface and helpers.
+ */
+
+static bool
+reg_val_range_is_valid(const struct bpf_reg_val *rv)
+{
+	if (rv->v.type == RTE_BPF_ARG_UNDEF)
+		return true;
+
+	if (rv->s.min > rv->s.max)
+		return false;
+
+	if (rv->u.min > rv->u.max)
+		return false;
+
+	/* If one of the ranges does not change sign, the other should match. */
+	if (rv->s.min >= 0 || rv->s.max < 0 ||
+			rv->u.min > INT64_MAX || rv->u.max <= INT64_MAX)
+		return rv->u.min == (uint64_t)rv->s.min &&
+			rv->u.max == (uint64_t)rv->s.max;
+
+	return true;
+}
+
+int
+__rte_bpf_validate_state_is_valid(const struct bpf_verifier *verifier)
+{
+	const struct bpf_eval_state *const st = verifier->evst;
+
+	for (int reg = 0; reg != RTE_DIM(st->rv); ++reg)
+		if (!reg_val_range_is_valid(st->rv + reg))
+			return false;
+
+	for (int var = 0; var != RTE_DIM(st->sv); ++var)
+		if (!reg_val_range_is_valid(st->sv + var))
+			return false;
+
+	return true;
+}
+
+int
+__rte_bpf_validate_can_access(const struct bpf_verifier *verifier,
+	const struct ebpf_insn *access, uint64_t off64)
+{
+	const struct bpf_eval_state *const st = verifier->evst;
+	const struct bpf_reg_val *rv;
+	/* Set of accessed byte offsets relative to memory area base. */
+	struct value_set access_set;
+	uint32_t opsz;
+
+	switch (BPF_CLASS(access->code)) {
+	case BPF_LDX:
+		rv = &st->rv[access->src_reg];
+		if (rv->v.type == BPF_ARG_PTR_STACK)
+			/* Not supporting stack access queries yet. */
+			return -ENOTSUP;
+		break;
+	case BPF_ST:
+		rv = &st->rv[access->dst_reg];
+		break;
+	case BPF_STX:
+		rv = &st->rv[access->dst_reg];
+		if (st->rv[access->src_reg].v.type == RTE_BPF_ARG_UNDEF)
+			return false;
+		break;
+	default:
+		return -ENOTSUP;
+	}
+
+	if (!RTE_BPF_ARG_PTR_TYPE(rv->v.type) || rv->v.size == 0)
+		return false;
+
+	access_set = value_set_from_pair(rv->s.min, rv->s.max, rv->u.min, rv->u.max);
+	value_set_translate(&access_set, off64);
+	opsz = bpf_size(BPF_SIZE(access->code));
+	value_set_add_contiguous(&access_set, 0, opsz - 1);
+
+	return value_set_is_covered_by_contiguous(&access_set, 0, rv->v.size - 1);
+}
+
+/* Return true if instruction `code` is supported by `may_jump`. */
+static bool
+may_jump_code_is_supported(uint8_t code)
+{
+	if (BPF_CLASS(code) != BPF_JMP)
+		return false;
+
+	switch (BPF_OP(code)) {
+	case BPF_JEQ:
+	case BPF_JGT:
+	case BPF_JGE:
+	case EBPF_JNE:
+	case EBPF_JSGT:
+	case EBPF_JSGE:
+	case EBPF_JLT:
+	case EBPF_JLE:
+	case EBPF_JSLT:
+	case EBPF_JSLE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/* Return true if instruction `code` corresponds to a signed comparison. */
+static bool
+may_jump_code_is_signed(uint8_t code)
+{
+	switch (BPF_OP(code)) {
+	case EBPF_JSGT:
+	case EBPF_JSGE:
+	case EBPF_JSLT:
+	case EBPF_JSLE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/* Return true if the specified jump condition _may_ be true. */
+static bool
+may_jump(uint8_t code, const struct value_set *origin,
+	const struct value_set *dst_set, const struct value_set *src_set)
+{
+	switch (BPF_OP(code)) {
+	case BPF_JEQ:
+		return value_sets_intersect(dst_set, src_set);
+	case EBPF_JNE:
+		return !(value_set_is_singleton(dst_set) &&
+			value_sets_equal(dst_set, src_set));
+	case BPF_JGT:
+	case EBPF_JSGT:
+		return !value_sets_based_less_or_equal(origin, dst_set, src_set);
+	case BPF_JGE:
+	case EBPF_JSGE:
+		return !value_sets_based_less(origin, dst_set, src_set);
+	case EBPF_JLT:
+	case EBPF_JSLT:
+		return !value_sets_based_less_or_equal(origin, src_set, dst_set);
+	case EBPF_JSLE:
+	case EBPF_JLE:
+		return !value_sets_based_less(origin, src_set, dst_set);
+	}
+	/* may_jump_code_is_supported should have caught this */
+	RTE_ASSERT(false);
+	return false;
+}
+
+/* Return instruction code for jump condition complement (negated result). */
+static uint8_t
+may_jump_code_complement(uint8_t code)
+{
+	switch (BPF_OP(code)) {
+	case BPF_JEQ:
+	case EBPF_JNE:
+		return code ^ BPF_JEQ ^ EBPF_JNE;
+	case BPF_JGT:
+	case EBPF_JLE:
+		return code ^ BPF_JGT ^ EBPF_JLE;
+	case BPF_JGE:
+	case EBPF_JLT:
+		return code ^ BPF_JGE ^ EBPF_JLT;
+	case EBPF_JSGT:
+	case EBPF_JSLE:
+		return code ^ EBPF_JSGT ^ EBPF_JSLE;
+	case EBPF_JSGE:
+	case EBPF_JSLT:
+		return code ^ EBPF_JSGE ^ EBPF_JSLT;
+	}
+	/* may_jump_code_is_supported should have caught this */
+	RTE_ASSERT(false);
+	return 0;
+}
+
+int
+__rte_bpf_validate_may_jump(const struct bpf_verifier *verifier,
+	const struct ebpf_insn *jump, uint64_t imm64)
+{
+	const struct bpf_eval_state *const st = verifier->evst;
+	const struct bpf_reg_val *rd, *rs;
+	struct value_set dst_set, src_set, origin;
+	int result;
+
+	if (!may_jump_code_is_supported(jump->code))
+		return -ENOTSUP;
+
+	rd = &st->rv[jump->dst_reg];
+	dst_set = (rd->v.type == RTE_BPF_ARG_UNDEF) ? value_set_full :
+		value_set_from_pair(rd->s.min, rd->s.max, rd->u.min, rd->u.max);
+
+	rs = BPF_SRC(jump->code) == BPF_X ? &st->rv[jump->src_reg] : NULL;
+	src_set = rs == NULL ? value_set_singleton((int64_t)jump->imm) :
+		rs->v.type == RTE_BPF_ARG_UNDEF ? value_set_full :
+		value_set_from_pair(rs->s.min, rs->s.max, rs->u.min, rs->u.max);
+
+	value_set_translate(&src_set, imm64);
+
+	if (RTE_BPF_ARG_PTR_TYPE(rd->v.type) &&
+			(rs != NULL && RTE_BPF_ARG_PTR_TYPE(rs->v.type)) &&
+			rd->v.size == rs->v.size) {
+		/*
+		 * Both sides are pointers with the same memory area size.
+		 * Until tracking of memory areas is implemented we will consider them
+		 * pointing to the same memory area just because of this.
+		 * In this case our value sets represent offsets from the memory area base,
+		 * which is some unknown distance from the scalar zero (NULL).
+		 * We know however that the memory area cannot cross zero address.
+		 * Thus range of origin relative to memory base starts with 1 byte gap
+		 * after the memory area and ends just before it.
+		 */
+		origin = value_set_contiguous(rd->v.size + 1, -1);
+	} else {
+		/* Scalar value of a pointer depends on the memory area base address. */
+		if (RTE_BPF_ARG_PTR_TYPE(rd->v.type))
+			value_set_add_contiguous(&dst_set, 1, UINT64_MAX - rd->v.size);
+		if (rs != NULL && RTE_BPF_ARG_PTR_TYPE(rs->v.type))
+			value_set_add_contiguous(&src_set, 1, UINT64_MAX - rs->v.size);
+		origin = value_set_singleton(0);
+	}
+
+	if (may_jump_code_is_signed(jump->code))
+		/* Shift origin to the minimal value for signed comparisons. */
+		value_set_translate(&origin, INT64_MIN);
+
+	result = 0;
+
+	if (may_jump(jump->code, &origin, &dst_set, &src_set))
+		result |= RTE_BPF_VALIDATE_DEBUG_MAY_BE_TRUE;
+
+	if (may_jump(may_jump_code_complement(jump->code), &origin, &dst_set, &src_set))
+		result |= RTE_BPF_VALIDATE_DEBUG_MAY_BE_FALSE;
+
+	return result;
+}
+
+/* Like snprintf, but advances (except for overflow) ptr and reduces szleft. */
+__rte_format_printf(3, 4)
+static int
+buf_printf(char **ptr, ssize_t *szleft, const char *format, ...)
+{
+	va_list args;
+	int rc;
+
+	va_start(args, format);
+	rc = vsnprintf(*ptr, RTE_MAX(0, *szleft), format, args);
+	va_end(args);
+
+	if (rc > 0) {
+		*szleft -= rc;
+		if (*szleft > 0)
+			*ptr += rc;
+	}
+
+	return rc;
+}
+
+static int
+format_memory_area(char **ptr, ssize_t *szleft, const struct bpf_reg_val *rv)
+{
+	switch (rv->v.type) {
+	case RTE_BPF_ARG_RAW:
+		return 0;
+	case RTE_BPF_ARG_PTR:
+		return buf_printf(ptr, szleft, "%%buffer<%zu> + ",
+			(size_t)rv->v.size);
+	case RTE_BPF_ARG_PTR_MBUF:
+		return buf_printf(ptr, szleft, "%%mbuf<%zu, %zu> + ",
+			(size_t)rv->v.size, (size_t)rv->v.buf_size);
+	case BPF_ARG_PTR_STACK:
+		return buf_printf(ptr, szleft, "%%stack + ");
+	default:
+		return -ENOTSUP;
+	}
+}
+
+/* Format min..max interval using validate-debug API and updating ptr and szleft. */
+static int
+buf_print_interval(char **ptr, ssize_t *szleft, char format, uint64_t min, uint64_t max)
+{
+	int rc;
+
+	rc = rte_bpf_validate_debug_format_interval(*ptr, RTE_MAX(0, *szleft),
+		format, min, max);
+
+	if (rc > 0) {
+		*szleft -= rc;
+		if (*szleft > 0)
+			*ptr += rc;
+	}
+
+	return rc;
+}
+
+/* Format rv roughly as "<signed-range> INTERSECT <unsigned-hex-range>" */
+static int
+format_register_range(char **ptr, ssize_t *szleft, const struct bpf_reg_val *rv)
+{
+	int rc;
+	uint64_t expected_unsigned_min, expected_unsigned_max;
+	const bool valid = reg_val_range_is_valid(rv);
+
+	/* Print signed unless trivial. */
+	if (!valid || rv->s.min != INT64_MIN || rv->s.max != INT64_MAX) {
+		rc = buf_print_interval(ptr, szleft, 'd', rv->s.min, rv->s.max);
+		if (rc < 0)
+			return rc;
+
+		if (valid) {
+			/* Skip printing unsigned if it has expected values. */
+			if (rv->s.min >= 0 || rv->s.max < 0) {
+				expected_unsigned_min = (uint64_t)rv->s.min;
+				expected_unsigned_max = (uint64_t)rv->s.max;
+			} else {
+				expected_unsigned_min = 0;
+				expected_unsigned_max = UINT64_MAX;
+			}
+
+			if (rv->u.min == expected_unsigned_min &&
+					rv->u.max == expected_unsigned_max)
+				return 0;
+		}
+
+		rc = buf_printf(ptr, szleft, " INTERSECT ");
+		if (rc < 0)
+			return rc;
+	}
+
+	rc = buf_print_interval(ptr, szleft, 'x', rv->u.min, rv->u.max);
+	if (rc < 0)
+		return rc;
+
+	if (!valid) {
+		rc = buf_printf(ptr, szleft, " (!)");
+		if (rc < 0)
+			return rc;
+	}
+
+	return 0;
+}
+
+/* Format rv roughly as "<memory-object> + <offsets-range>" */
+static int
+format_reg_val(char *buffer, size_t bufsz, const struct bpf_reg_val *rv)
+{
+	char *ptr = buffer;
+	ssize_t szleft = bufsz;
+	int rc;
+
+	if (rv->v.type == RTE_BPF_ARG_UNDEF)
+		return snprintf(buffer, bufsz, "%%undefined");
+
+	/* Print data area info, if any. */
+	rc = format_memory_area(&ptr, &szleft, rv);
+	if (rc < 0)
+		return rc;
+
+	rc = format_register_range(&ptr, &szleft, rv);
+	if (rc < 0)
+		return rc;
+
+	/* At least one snprintf was called and added terminating zero. */
+	RTE_ASSERT(szleft < (ssize_t)bufsz);
+	--szleft;
+
+	return bufsz - szleft;
+}
+
+int
+__rte_bpf_validate_format_register_info(const struct bpf_verifier *verifier,
+	char *buffer, size_t bufsz, uint8_t reg)
+{
+	if (reg >= EBPF_REG_NUM)
+		return -EINVAL;
+
+	return format_reg_val(buffer, bufsz, &verifier->evst->rv[reg]);
+}
+
+int
+__rte_bpf_validate_format_frame_info(const struct bpf_verifier *verifier,
+	char *buffer, size_t bufsz, int32_t offset)
+{
+	if (offset % sizeof(uint64_t) != 0)
+		return -EINVAL;
+
+	if (offset >= 0 || offset < -MAX_BPF_STACK_SIZE)
+		return -ERANGE;
+
+	offset = (MAX_BPF_STACK_SIZE + offset) / sizeof(uint64_t);
+
+	return format_reg_val(buffer, bufsz, &verifier->evst->sv[offset]);
+}
+
+int32_t
+__rte_bpf_validate_get_frame_size(const struct bpf_verifier *verifier)
+{
+	if (verifier->stack_sz > INT32_MAX)
+		return -ERANGE;
+
+	return verifier->stack_sz;
+}
+
+
 /*
  * check and evaluate functions for particular instruction types.
  */
@@ -2405,7 +2813,9 @@ evaluate(struct bpf_verifier *bvf)
 	const char *err;
 	const struct ebpf_insn *ins;
 	struct inst_node *next, *node;
-	int rc = 0;
+	int prev_nb_edge;  /* branching number of the previous instruction */
+	int rc, debug_rc;
+	struct rte_bpf_validate_debug *const debug = bvf->prm->debug;
 
 	struct {
 		uint32_t nb_eval;
@@ -2439,11 +2849,15 @@ evaluate(struct bpf_verifier *bvf)
 	ins = bvf->prm->raw.ins;
 	node = bvf->in;
 	next = node;
+	prev_nb_edge = 1;
 
 	memset(&stats, 0, sizeof(stats));
 
-	while (node != NULL) {
+	rc = __rte_bpf_validate_debug_evaluate_start(debug, bvf, bvf->prm);
+	if (rc < 0)
+		return rc;
 
+	while (node != NULL) {
 		/*
 		 * current node evaluation, make sure we evaluate
 		 * each node only once.
@@ -2464,6 +2878,13 @@ evaluate(struct bpf_verifier *bvf)
 			}
 
 			if (ins_chk[op].eval != NULL) {
+				rc = __rte_bpf_validate_debug_evaluate_step(
+					debug, idx, prev_nb_edge > 1 ?
+						RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_ENTER :
+						RTE_BPF_VALIDATE_DEBUG_EVENT_STEP);
+				if (rc < 0)
+					break;
+
 				err = ins_chk[op].eval(bvf, ins + idx);
 				stats.nb_eval++;
 				if (err != NULL) {
@@ -2499,10 +2920,17 @@ evaluate(struct bpf_verifier *bvf)
 			 */
 			if (node->nb_edge > 1 && prune_eval_state(bvf, node,
 					next) == 0) {
+				rc = __rte_bpf_validate_debug_evaluate_step(
+					debug, get_node_idx(bvf, next),
+					RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_PRUNE);
+				if (rc < 0)
+					break;
+
 				next = NULL;
 				stats.nb_prune++;
 			} else {
 				next->prev_node = node;
+				prev_nb_edge = node->nb_edge;
 				node = next;
 			}
 		} else {
@@ -2511,8 +2939,18 @@ evaluate(struct bpf_verifier *bvf)
 			 * mark it's @start state as safe for future references,
 			 * and proceed with parent.
 			 */
+
+			if (prev_nb_edge != 0) {
+				rc = __rte_bpf_validate_debug_evaluate_step(
+					debug, get_node_idx(bvf, node) + 1,
+					RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_RETURN);
+				if (rc < 0)
+					break;
+			}
+
 			node->cur_edge = 0;
 			save_safe_eval_state(bvf, node);
+			prev_nb_edge = 0;
 			node = node->prev_node;
 
 			/* first node will not have prev, signalling finish */
@@ -2532,7 +2970,11 @@ evaluate(struct bpf_verifier *bvf)
 		__func__, bvf, rc,
 		stats.nb_eval, stats.nb_prune, stats.nb_save, stats.nb_restore);
 
-	return rc;
+	debug_rc = __rte_bpf_validate_debug_evaluate_finish(debug, rc);
+	rc = debug_rc < 0 ? debug_rc : rc;
+
+	/* Caller does not expect positive values. */
+	return RTE_MIN(0, rc);
 }
 
 static bool
diff --git a/lib/bpf/bpf_validate.h b/lib/bpf/bpf_validate.h
new file mode 100644
index 000000000000..9912f4fd5c4f
--- /dev/null
+++ b/lib/bpf/bpf_validate.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Huawei Technologies Co., Ltd
+ */
+
+#ifndef _BPF_VALIDATE_H_
+#define _BPF_VALIDATE_H_
+
+/**
+ * @file bpf_validate.h
+ *
+ * Internal-use headers for eBPF validation observability.
+ */
+
+#include <bpf_def.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct bpf_verifier;
+
+/*
+ * Return 1 if the verifier passes internal self-check,
+ * 0 if it fails, or a negative error code.
+ */
+int
+__rte_bpf_validate_state_is_valid(const struct bpf_verifier *verifier);
+
+/*
+ * Return 1 if the specified access instruction is valid,
+ * 0 if it is invalid, or a negative error code.
+ */
+int
+__rte_bpf_validate_can_access(const struct bpf_verifier *verifier,
+	const struct ebpf_insn *access, uint64_t off64);
+
+/* Get possible truth values of the specified jump condition. */
+int
+__rte_bpf_validate_may_jump(const struct bpf_verifier *verifier,
+	const struct ebpf_insn *jump, uint64_t imm64);
+
+/* Format known information about the register for the user. */
+int
+__rte_bpf_validate_format_register_info(const struct bpf_verifier *verifier,
+	char *buffer, size_t bufsz, uint8_t reg);
+
+/* Format known information about the frame location for the user. */
+int
+__rte_bpf_validate_format_frame_info(const struct bpf_verifier *verifier,
+	char *buffer, size_t bufsz, int32_t offset);
+
+/* Return frame size. */
+int32_t
+__rte_bpf_validate_get_frame_size(const struct bpf_verifier *verifier);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _BPF_VALIDATE_H_ */
diff --git a/lib/bpf/bpf_validate_debug.c b/lib/bpf/bpf_validate_debug.c
new file mode 100644
index 000000000000..5d18804a74bc
--- /dev/null
+++ b/lib/bpf/bpf_validate_debug.c
@@ -0,0 +1,659 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Huawei Technologies Co., Ltd
+ */
+
+#include "bpf_impl.h"
+#include "bpf_validate.h"
+#include "bpf_validate_debug.h"
+
+#include <eal_export.h>
+#include <rte_bpf_validate_debug.h>
+#include <rte_errno.h>
+#include <rte_per_lcore.h>
+
+#include <errno.h>
+#include <stddef.h>
+#include <stdlib.h>
+
+#ifndef LIST_FOREACH_SAFE
+/* We need this macro, which neither Linux nor EAL for Linux provides yet. */
+#define	LIST_FOREACH_SAFE(var, head, field, tvar)			\
+	for ((var) = LIST_FIRST((head));				\
+	    (var) && ((tvar) = LIST_NEXT((var), field), 1);		\
+	    (var) = (tvar))
+#endif
+
+#define EVENT_ARRAY_LENGTH RTE_BPF_VALIDATE_DEBUG_EVENT_END
+
+struct rte_bpf_validate_debug_point {
+	LIST_ENTRY(rte_bpf_validate_debug_point) list;
+	struct rte_bpf_validate_debug_callback callback;
+	uint32_t pc;
+};
+
+LIST_HEAD(point_list, rte_bpf_validate_debug_point);
+
+struct rte_bpf_validate_debug {
+	/* Accessible immediately after object creation. */
+	struct point_list pending_breakpoints;
+	struct point_list *catchpoint_lists;
+	struct rte_bpf_validate_debug_callback step_callback;
+
+	/* Accessible only after evaluate start. */
+	const struct bpf_verifier *verifier;
+	const struct rte_bpf_prm_ex *bpf_prm;
+	struct point_list *breakpoint_lists;
+	struct rte_bpf_validate_debug_point *last_point;
+	uint32_t pc;
+	/* Evaluate stage (only tracking `evaluate` part at the moment). */
+	bool evaluate_started;
+	bool evaluate_finished;
+	int evaluate_result;  /* Only valid if `evaluate_finished` is true. */
+};
+
+/* Point lists functions. */
+
+/* Destroy all points in the list. */
+static void
+point_list_destroy(struct point_list *point_list)
+{
+	struct rte_bpf_validate_debug_point *point, *next;
+
+	LIST_FOREACH_SAFE(point, point_list, list, next)
+		rte_bpf_validate_debug_point_destroy(point);
+
+	RTE_ASSERT(LIST_EMPTY(point_list));
+}
+
+/* Destroy all points in all lists in the array and free the array. */
+static void
+point_lists_destroy(struct point_list *point_lists, uint32_t length)
+{
+	if (point_lists == NULL)
+		return;
+
+	for (uint32_t pli = 0; pli != length; ++pli)
+		point_list_destroy(&point_lists[pli]);
+
+	free(point_lists);
+}
+
+/* Dynamically allocate and initialize an array of point lists. */
+static struct point_list *
+point_lists_create(uint32_t length)
+{
+	/* Allocate at least one element to avoid calloc(0, ...) shenanigans. */
+	struct point_list *const array =
+		calloc(RTE_MAX(1u, length), sizeof(*array));
+	if (array == NULL)
+		return NULL;
+
+	for (uint32_t pli = 0; pli != length; ++pli)
+		LIST_INIT(&array[pli]);
+
+	return array;
+}
+
+/* Move point to a different list. */
+static inline void
+point_move(struct rte_bpf_validate_debug_point *point,
+	struct point_list *destination)
+{
+	LIST_REMOVE(point, list);
+	LIST_INSERT_HEAD(destination, point, list);
+}
+
+/* Move all points between lists (the order is inverted). */
+static void
+points_move(struct point_list *source, struct point_list *destination)
+{
+	struct rte_bpf_validate_debug_point *point, *next;
+
+	LIST_FOREACH_SAFE(point, source, list, next)
+		point_move(point, destination);
+	RTE_ASSERT(LIST_EMPTY(source));
+}
+
+/* Pending breakpoints. */
+
+/* Return true if all pending breakpoints have pc less than nb_ins. */
+static bool
+debug_pending_breakpoints_are_valid(const struct rte_bpf_validate_debug *debug,
+	uint32_t nb_ins)
+{
+	const struct rte_bpf_validate_debug_point *breakpoint;
+
+	LIST_FOREACH(breakpoint, &debug->pending_breakpoints, list)
+		if (breakpoint->pc >= nb_ins)
+			return false;
+
+	return true;
+}
+
+/* Move all pending breakpoints to correct per-pc lists. */
+static void
+debug_pending_breakpoints_restore(struct rte_bpf_validate_debug *debug)
+{
+	struct rte_bpf_validate_debug_point *breakpoint, *next;
+	struct point_list breakpoints;
+
+	/* Invert the list first to preserve point order when we move them. */
+	LIST_INIT(&breakpoints);
+	points_move(&debug->pending_breakpoints, &breakpoints);
+
+	LIST_FOREACH_SAFE(breakpoint, &breakpoints, list, next)
+		point_move(breakpoint, &debug->breakpoint_lists[breakpoint->pc]);
+	RTE_ASSERT(LIST_EMPTY(&breakpoints));
+}
+
+/* Move all breakpoints from per-pc lists to the pending one. */
+static void
+debug_pending_breakpoints_save(struct rte_bpf_validate_debug *debug)
+{
+	struct point_list breakpoints;
+
+	LIST_INIT(&breakpoints);
+	for (uint32_t pc = 0; pc != debug->bpf_prm->raw.nb_ins; ++pc)
+		points_move(&debug->breakpoint_lists[pc], &breakpoints);
+
+	/* Invert the list to restore point order after we moved them. */
+	RTE_ASSERT(LIST_EMPTY(&debug->pending_breakpoints));
+	points_move(&breakpoints, &debug->pending_breakpoints);
+}
+
+/* Debug instance creation and destruction. */
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_destroy, 26.07)
+void
+rte_bpf_validate_debug_destroy(struct rte_bpf_validate_debug *debug)
+{
+	if (debug == NULL)
+		return;
+
+	/* Cannot destroy the instance during validation. */
+	RTE_ASSERT(!debug->evaluate_started);
+
+	point_lists_destroy(debug->catchpoint_lists, EVENT_ARRAY_LENGTH);
+	point_list_destroy(&debug->pending_breakpoints);
+	free(debug);
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_create, 26.07)
+struct rte_bpf_validate_debug *
+rte_bpf_validate_debug_create(void)
+{
+	struct rte_bpf_validate_debug *const debug = calloc(1, sizeof(*debug));
+	if (debug == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	LIST_INIT(&debug->pending_breakpoints);
+
+	debug->catchpoint_lists = point_lists_create(EVENT_ARRAY_LENGTH);
+	if (debug->catchpoint_lists == NULL) {
+		free(debug);
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	return debug;
+}
+
+/* Managing callbacks. */
+
+/* Call back the user function with correct arguments for a point. */
+static inline int
+debug_point_call_back(struct rte_bpf_validate_debug *debug,
+	struct rte_bpf_validate_debug_point *point)
+{
+	debug->last_point = point;
+	return point->callback.fn(debug, point->callback.ctx);
+}
+
+/* Call back all points in point_list. */
+static int
+debug_points_call_back(struct rte_bpf_validate_debug *debug,
+	const struct point_list *point_list)
+{
+	struct rte_bpf_validate_debug_point *point, *next;
+	int rc = 0;
+
+	LIST_FOREACH_SAFE(point, point_list, list, next)
+		rc = rc < 0 ? rc : debug_point_call_back(debug, point);
+
+	return rc;
+}
+
+/* Call back all catchpoints for the specified event. */
+static int
+debug_send_event(struct rte_bpf_validate_debug *debug, debug_event_t event)
+{
+	return debug_points_call_back(debug, &debug->catchpoint_lists[event]);
+}
+
+/* Create new point and insert it into the specified list. */
+static struct rte_bpf_validate_debug_point *
+point_list_insert(struct point_list *point_list,
+	const struct rte_bpf_validate_debug_callback *callback, uint32_t pc)
+{
+	struct rte_bpf_validate_debug_point *const point =
+		malloc(sizeof(*point));
+	if (point == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	LIST_INSERT_HEAD(point_list, point, list);
+	point->callback = *callback;
+	point->pc = pc;
+	return point;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_break, 26.07)
+struct rte_bpf_validate_debug_point *
+rte_bpf_validate_debug_break(struct rte_bpf_validate_debug *debug, uint32_t pc,
+	const struct rte_bpf_validate_debug_callback *callback)
+{
+	if (debug == NULL || callback == NULL || callback->fn == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (!debug->evaluate_started)
+		return point_list_insert(&debug->pending_breakpoints,
+			callback, pc);
+
+	if (pc >= debug->bpf_prm->raw.nb_ins) {
+		rte_errno = ENOENT;
+		return NULL;
+	}
+
+	return point_list_insert(&debug->breakpoint_lists[pc], callback, pc);
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_catch, 26.07)
+struct rte_bpf_validate_debug_point *
+rte_bpf_validate_debug_catch(struct rte_bpf_validate_debug *debug,
+	debug_event_t event, const struct rte_bpf_validate_debug_callback *callback)
+{
+	if (debug == NULL || callback == NULL || callback->fn == NULL ||
+			event < 0 || event >= RTE_BPF_VALIDATE_DEBUG_EVENT_END) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	return point_list_insert(&debug->catchpoint_lists[event], callback, 0);
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_point_destroy, 26.07)
+void
+rte_bpf_validate_debug_point_destroy(struct rte_bpf_validate_debug_point *point)
+{
+	if (point == NULL)
+		return;
+
+	LIST_REMOVE(point, list);
+	free(point);
+}
+
+/* Querying execution state. */
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_get_bpf_param, 26.07)
+const struct rte_bpf_prm_ex *
+rte_bpf_validate_debug_get_bpf_param(const struct rte_bpf_validate_debug *debug)
+{
+	if (debug == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (!debug->evaluate_started) {
+		rte_errno = ECHILD;
+		return NULL;
+	}
+
+	return debug->bpf_prm;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_get_ins, 26.07)
+int
+rte_bpf_validate_debug_get_ins(const struct rte_bpf_validate_debug *debug,
+	const struct ebpf_insn **ins, uint32_t *nb_ins)
+{
+	if (debug == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_started)
+		return -ECHILD;
+
+	if (debug->bpf_prm->origin != RTE_BPF_ORIGIN_RAW)
+		return -ENOTSUP;
+
+	*ins = debug->bpf_prm->raw.ins;
+	*nb_ins = debug->bpf_prm->raw.nb_ins;
+	return 0;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_get_last_point, 26.07)
+struct rte_bpf_validate_debug_point *
+rte_bpf_validate_debug_get_last_point(const struct rte_bpf_validate_debug *debug)
+{
+	if (debug == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	return debug->last_point;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_get_pc, 26.07)
+uint32_t
+rte_bpf_validate_debug_get_pc(const struct rte_bpf_validate_debug *debug)
+{
+	if (debug == NULL || !debug->evaluate_started)
+		return UINT32_MAX;
+
+	return debug->pc;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_get_validation_result, 26.07)
+int
+rte_bpf_validate_debug_get_validation_result(const struct rte_bpf_validate_debug *debug,
+	int *result)
+{
+	if (debug == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_finished)
+		return -EAGAIN;
+
+	*result = debug->evaluate_result;
+
+	return 0;
+}
+
+/* Querying VM state. */
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_can_access, 26.07)
+int
+rte_bpf_validate_debug_can_access(const struct rte_bpf_validate_debug *debug,
+	const struct ebpf_insn *access, uint64_t off64)
+{
+	if (debug == NULL || access == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_started)
+		return -ECHILD;
+
+	return __rte_bpf_validate_can_access(debug->verifier, access, off64);
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_may_jump, 26.07)
+int
+rte_bpf_validate_debug_may_jump(const struct rte_bpf_validate_debug *debug,
+	const struct ebpf_insn *jump, uint64_t imm64)
+{
+	if (debug == NULL || jump == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_started)
+		return -ECHILD;
+
+	return __rte_bpf_validate_may_jump(debug->verifier, jump, imm64);
+}
+
+/* Formatting VM state for user. */
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_format_register_info, 26.07)
+int
+rte_bpf_validate_debug_format_register_info(const struct rte_bpf_validate_debug *debug,
+	char *buffer, size_t bufsz, uint8_t reg)
+{
+	if (debug == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_started)
+		return -ECHILD;
+
+	return __rte_bpf_validate_format_register_info(debug->verifier, buffer,
+		bufsz, reg);
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_format_frame_info, 26.07)
+int
+rte_bpf_validate_debug_format_frame_info(const struct rte_bpf_validate_debug *debug,
+	char *buffer, size_t bufsz, int32_t offset)
+{
+	if (debug == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_started)
+		return -ECHILD;
+
+	return __rte_bpf_validate_format_frame_info(debug->verifier, buffer,
+		bufsz, offset);
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_get_frame_size, 26.07)
+int32_t
+rte_bpf_validate_debug_get_frame_size(const struct rte_bpf_validate_debug *debug)
+{
+	if (debug == NULL)
+		return -EINVAL;
+
+	if (!debug->evaluate_started)
+		return -ECHILD;
+
+	return __rte_bpf_validate_get_frame_size(debug->verifier);
+}
+
+/* Courtesy formatting functions for user-supplied values. */
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_format_value, 26.07)
+int
+rte_bpf_validate_debug_format_value(char *buffer, size_t bufsz, char format,
+	uint64_t value)
+{
+	static const struct {
+		uint64_t value;
+		const char *name;
+	} constants[] = {
+		{ .value = INT64_MIN, .name = "INT64_MIN" },
+		{ .value = INT32_MIN, .name = "INT32_MIN" },
+		{ .value = INT16_MIN, .name = "INT16_MIN" },
+		{ .value = INT8_MIN, .name = "INT8_MIN" },
+		{ .value = INT8_MAX, .name = "INT8_MAX" },
+		{ .value = UINT8_MAX, .name = "UINT8_MAX" },
+		{ .value = INT16_MAX, .name = "INT16_MAX" },
+		{ .value = UINT16_MAX, .name = "UINT16_MAX" },
+		{ .value = INT32_MAX, .name = "INT32_MAX" },
+		{ .value = UINT32_MAX, .name = "UINT32_MAX" },
+		{ .value = INT64_MAX, .name = "INT64_MAX" },
+		/* UINT64_MAX omitted on purpose, it looks better as -1 */
+	};
+
+	switch (format) {
+	case 'd':
+		for (int ci = 0; ci != RTE_DIM(constants); ++ci)
+			if (constants[ci].value == value)
+				return snprintf(buffer, bufsz, "%s", constants[ci].name);
+		/*
+		 * Special case numbers close to int32_t or int64_t range ends,
+		 * since they are hard to recognize in decimal otherwise.
+		 */
+		if (value - INT64_MIN < 1000000)
+			return snprintf(buffer, bufsz, "INT64_MIN+%" PRId64,
+				value - INT64_MIN);
+		if (INT64_MAX - value < 1000000)
+			return snprintf(buffer, bufsz, "INT64_MAX-%" PRId64,
+				INT64_MAX - value);
+		if (value - INT32_MIN < 1000)
+			return snprintf(buffer, bufsz, "INT32_MIN+%" PRId64,
+				value - INT32_MIN);
+		if (INT32_MAX - value < 1000)
+			return snprintf(buffer, bufsz, "INT32_MAX-%" PRId64,
+				INT32_MAX - value);
+		return snprintf(buffer, bufsz, "%" PRId64, value);
+	case 'x':
+		/* Special case only the common case of UINT64_MAX. */
+		if (value == UINT64_MAX)
+			return snprintf(buffer, bufsz, "%s", "UINT64_MAX");
+		return snprintf(buffer, bufsz, "%#" PRIx64, value);
+	default:
+		return -EINVAL;
+	}
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_bpf_validate_debug_format_interval, 26.07)
+int
+rte_bpf_validate_debug_format_interval(char *buffer, size_t bufsz, char format,
+	uint64_t min, uint64_t max)
+{
+	char min_buffer[32], max_buffer[32];
+	int rc;
+
+	if (min == max)
+		return rte_bpf_validate_debug_format_value(buffer, bufsz, format, min);
+
+	rc = rte_bpf_validate_debug_format_value(min_buffer, sizeof(min_buffer), format, min);
+	if (rc < 0)
+		return rc;
+
+	rc = rte_bpf_validate_debug_format_value(max_buffer, sizeof(max_buffer), format, max);
+	if (rc < 0)
+		return rc;
+
+	return snprintf(buffer, bufsz, "%s..%s", min_buffer, max_buffer);
+}
+
+/* Evaluation start and finish. */
+
+/* Free all resources associated with current evaluation. */
+static void
+debug_evaluate_close(struct rte_bpf_validate_debug *debug)
+{
+	RTE_ASSERT(debug->evaluate_started);
+	debug_pending_breakpoints_save(debug);
+	free(debug->breakpoint_lists);
+	debug->breakpoint_lists = NULL;
+	debug->evaluate_started = false;
+}
+
+int
+__rte_bpf_validate_debug_evaluate_start(struct rte_bpf_validate_debug *debug,
+	const struct bpf_verifier *verifier, const struct rte_bpf_prm_ex *bpf_prm)
+{
+	if (debug == NULL)
+		return 0;
+
+	if (verifier == NULL || bpf_prm == NULL ||
+			bpf_prm->origin != RTE_BPF_ORIGIN_RAW)
+		return -EINVAL;
+
+	if (debug->evaluate_started) {
+		RTE_BPF_LOG_FUNC_LINE(ERR, "already started");
+		return -EEXIST;
+	}
+
+	if (!debug_pending_breakpoints_are_valid(debug, bpf_prm->raw.nb_ins))
+		return -ENOENT;
+
+	debug->verifier = verifier;
+	debug->bpf_prm = bpf_prm;
+	debug->breakpoint_lists = point_lists_create(bpf_prm->raw.nb_ins);
+	if (debug->breakpoint_lists == NULL)
+		return -ENOMEM;
+	debug_pending_breakpoints_restore(debug);
+	debug->last_point = NULL;
+	debug->pc = 0;
+	debug->evaluate_started = true;
+
+	const int rc = debug_send_event(debug,
+		RTE_BPF_VALIDATE_DEBUG_EVENT_VALIDATION_START);
+	if (rc < 0) {
+		debug_evaluate_close(debug);
+		return rc;
+	}
+
+	RTE_BPF_LOG_FUNC_LINE(DEBUG, "evaluate started");
+	return 0;
+}
+
+int
+__rte_bpf_validate_debug_evaluate_step(struct rte_bpf_validate_debug *debug,
+	uint32_t pc, debug_event_t event)
+{
+	int rc;
+
+	if (debug == NULL)
+		return 0;
+
+	if (!debug->evaluate_started) {
+		RTE_BPF_LOG_FUNC_LINE(ERR, "not started");
+		return -ECHILD;
+	}
+
+	if (pc > debug->bpf_prm->raw.nb_ins || event < 0 ||
+			event >= RTE_BPF_VALIDATE_DEBUG_EVENT_END)
+		return -EINVAL;
+
+	debug->pc = pc;
+
+	rc = __rte_bpf_validate_state_is_valid(debug->verifier);
+	if (rc == 0)
+		rc = debug_send_event(debug,
+			RTE_BPF_VALIDATE_DEBUG_EVENT_INVALID_STATE);
+
+	if (event != RTE_BPF_VALIDATE_DEBUG_EVENT_STEP)
+		rc = rc < 0 ? rc : debug_send_event(debug, event);
+
+	if (event == RTE_BPF_VALIDATE_DEBUG_EVENT_STEP ||
+			event == RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_ENTER)
+		/* Stepping into a real instruction to execute. */
+		rc = rc < 0 ? rc : debug_points_call_back(debug,
+			&debug->breakpoint_lists[pc]);
+
+	rc = rc < 0 ? rc : debug_send_event(debug,
+		RTE_BPF_VALIDATE_DEBUG_EVENT_STEP);
+
+	return rc;
+}
+
+int
+__rte_bpf_validate_debug_evaluate_finish(struct rte_bpf_validate_debug *debug,
+	int result)
+{
+	int rc = 0;
+	uint32_t pc;
+	debug_event_t event;
+
+	if (debug == NULL)
+		return 0;
+
+	if (!debug->evaluate_started) {
+		RTE_BPF_LOG_FUNC_LINE(ERR, "not started");
+		return -ECHILD;
+	}
+
+	debug->evaluate_finished = true;
+	debug->evaluate_result = result;
+
+	if (result != -ECANCELED) {
+		if (result < 0) {
+			/* Last known pc is the place we failed. */
+			pc = debug->pc;
+			event = RTE_BPF_VALIDATE_DEBUG_EVENT_VALIDATION_FAILURE;
+		} else {
+			/* Show program end, not particular instruction. */
+			pc = debug->bpf_prm->raw.nb_ins;
+			event = RTE_BPF_VALIDATE_DEBUG_EVENT_VALIDATION_SUCCESS;
+		}
+
+		rc = __rte_bpf_validate_debug_evaluate_step(debug, pc, event);
+	}
+
+	debug_evaluate_close(debug);
+
+	return rc;
+}
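
A note on the `rc = rc < 0 ? rc : ...` chain in
__rte_bpf_validate_debug_evaluate_step() above: as far as I can tell, the
intent is that the first negative return code suppresses every later
notification and is what the caller sees. Below is a minimal standalone
sketch of that idiom, not code from the patch. The names `stage_fn`,
`run_stages`, `stage_ok` and `stage_cancel` are hypothetical and only stand
in for debug_send_event() and debug_points_call_back().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical notification stage: >= 0 to continue, negative errno to stop. */
typedef int (*stage_fn)(void *ctx);

/*
 * Run stages in order. Once one of them returns a negative value, the
 * remaining stages are skipped and that value is returned, which is what
 * the `rc = rc < 0 ? rc : next()` chain achieves.
 */
static int
run_stages(const stage_fn *stages, size_t nb_stages, void *ctx)
{
	int rc = 0;

	assert(stages != NULL || nb_stages == 0);

	for (size_t i = 0; i != nb_stages; ++i)
		rc = rc < 0 ? rc : stages[i](ctx);

	return rc;
}

/* Example stage that succeeds and counts its invocations in *ctx. */
static int
stage_ok(void *ctx)
{
	++*(int *)ctx;
	return 0;
}

/* Example stage that cancels and counts its invocations in *ctx. */
static int
stage_cancel(void *ctx)
{
	++*(int *)ctx;
	return -ECANCELED;
}
```

With the stages { stage_ok, stage_cancel, stage_ok } only the first two run,
and the caller gets -ECANCELED back.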
diff --git a/lib/bpf/bpf_validate_debug.h b/lib/bpf/bpf_validate_debug.h
new file mode 100644
index 000000000000..a91f3e9c48b2
--- /dev/null
+++ b/lib/bpf/bpf_validate_debug.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Huawei Technologies Co., Ltd
+ */
+
+#ifndef _BPF_VALIDATE_DEBUG_H_
+#define _BPF_VALIDATE_DEBUG_H_
+
+/**
+ * @file bpf_validate_debug.h
+ *
+ * Internal-use header for eBPF validation debug notifications.
+ */
+
+#include "rte_bpf_validate_debug.h"
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_bpf_prm_ex;
+struct rte_bpf_validate_debug;
+struct bpf_verifier;
+
+/* Type alias for validation event enum. */
+typedef enum rte_bpf_validate_debug_event debug_event_t;
+
+/*
+ * Signal beginning of evaluation process.
+ *
+ * Immediately return 0 if debug is NULL.
+ *
+ * @param debug
+ *   Validate debug instance configured by user, can be NULL.
+ * @param verifier
+ *   Opaque pointer that can be used for calling bpf_validate.h API.
+ * @param bpf_prm
+ *   Parameters struct of the validated eBPF program, including code with all
+ *   patches and relocations applied.
+ * @return
+ *   Non-negative value on success, negative errno on failure.
+ */
+int
+__rte_bpf_validate_debug_evaluate_start(struct rte_bpf_validate_debug *debug,
+	const struct bpf_verifier *verifier, const struct rte_bpf_prm_ex *bpf_prm);
+
+/*
+ * Signal each instruction, branch end, or evaluation end.
+ *
+ * Immediately return 0 if debug is NULL.
+ *
+ * @param debug
+ *   Validate debug instance configured by user, can be NULL.
+ * @param pc
+ *   Current value of the program counter, or next after last instruction.
+ * @param event
+ *   Specific evaluation event if any, or RTE_BPF_VALIDATE_DEBUG_EVENT_STEP.
+ * @return
+ *   Non-negative value: evaluation should continue;
+ *   -ECANCELED: evaluation should fail without calling this API again;
+ *   Other negative value: evaluation should fail signalling failure;
+ */
+int
+__rte_bpf_validate_debug_evaluate_step(struct rte_bpf_validate_debug *debug,
+	uint32_t pc, debug_event_t event);
+
+/*
+ * Signal end of evaluation process. Immediately return 0 if debug is NULL.
+ *
+ * @param debug
+ *   Validate debug instance configured by user, can be NULL.
+ * @param result
+ *   Validation result; if it is -ECANCELED, no final event is sent.
+ * @return
+ *   Non-negative value on success, negative errno on failure.
+ */
+int
+__rte_bpf_validate_debug_evaluate_finish(struct rte_bpf_validate_debug *debug,
+	int result);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _BPF_VALIDATE_DEBUG_H_ */
diff --git a/lib/bpf/bpf_value_set.c b/lib/bpf/bpf_value_set.c
new file mode 100644
index 000000000000..86f46de66f2f
--- /dev/null
+++ b/lib/bpf/bpf_value_set.c
@@ -0,0 +1,403 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 Huawei Technologies Co., Ltd
+ */
+
+#include "bpf_value_set.h"
+
+#include <rte_debug.h>
+
+/* Helper interval operations and checks.  */
+
+/* One of many possible full intervals. */
+static const struct value_set_interval canonical_full_interval = {
+	.first = 0,
+	.last = UINT64_MAX,
+};
+
+/* Translate ("shift") interval by `offset`. */
+static void
+interval_translate(struct value_set_interval *interval, uint64_t offset)
+{
+	interval->first += offset;
+	interval->last += offset;
+}
+
+/* Return true if the interval includes all possible values. */
+static bool
+interval_is_full(struct value_set_interval interval)
+{
+	return interval.last + 1 == interval.first;
+}
+
+/* Return true if the interval includes `value`. */
+static bool
+interval_contains(struct value_set_interval interval, uint64_t value)
+{
+	return value - interval.first <= interval.last - interval.first;
+}
+
+/* Return true if the interval `lhs` includes all values from `rhs`. */
+static bool
+interval_covers(struct value_set_interval lhs, struct value_set_interval rhs)
+{
+	const uint64_t offset = -lhs.first;
+	interval_translate(&lhs, offset);
+	interval_translate(&rhs, offset);
+	RTE_ASSERT(lhs.first == 0);
+
+	return lhs.last == UINT64_MAX ||
+		(lhs.last >= rhs.last && rhs.last >= rhs.first);
+}
+
+/* Return true if the interval includes step from UINT64_MAX to 0. */
+static bool
+interval_crosses_zero(struct value_set_interval interval)
+{
+	return interval.last < interval.first;
+}
+
+/* Return number of elements in a non-full interval, 0 for full interval. */
+static uint64_t
+interval_size(struct value_set_interval interval)
+{
+	return interval.last - interval.first + 1;
+}
+
+/* Return true if two intervals represent same sets of values. */
+static bool
+intervals_equal(struct value_set_interval lhs, struct value_set_interval rhs)
+{
+	return (interval_is_full(lhs) && interval_is_full(rhs)) ||
+		(lhs.first == rhs.first && lhs.last == rhs.last);
+}
+
+/* Return true if two intervals have common elements. */
+static bool
+intervals_intersect(struct value_set_interval lhs, struct value_set_interval rhs)
+{
+	return interval_contains(lhs, rhs.first) || interval_contains(rhs, lhs.first);
+}
+
+/* Return true if `rhs.first` follows `lhs.last` with some gap. Does not check other ends! */
+static bool
+intervals_follow_with_gap(struct value_set_interval lhs, struct value_set_interval rhs)
+{
+	return lhs.last != UINT64_MAX && rhs.first > lhs.last + 1;
+}
+
+/* Return true if `(l - o) < (r - o)` for all `(o in origin, l in lhs, r in rhs)`. */
+static bool
+intervals_based_less(struct value_set_interval origin, struct value_set_interval lhs,
+	struct value_set_interval rhs)
+{
+	/* Translate all intervals for the origin to start at 0. */
+	const uint64_t offset = -origin.first;
+	interval_translate(&origin, offset);
+	interval_translate(&lhs, offset);
+	interval_translate(&rhs, offset);
+	RTE_ASSERT(origin.first == 0);
+
+	return origin.last <= lhs.first &&
+		lhs.first <= lhs.last &&
+		lhs.last < rhs.first &&
+		rhs.first <= rhs.last;
+}
+
+/* Return true if `(l - o) <= (r - o)` for all `(o in origin, l in lhs, r in rhs)`. */
+static bool
+intervals_based_less_or_equal(struct value_set_interval origin, struct value_set_interval lhs,
+	struct value_set_interval rhs)
+{
+	/* Translate all intervals for the origin to start at 0. */
+	const uint64_t offset = -origin.first;
+	interval_translate(&origin, offset);
+	interval_translate(&lhs, offset);
+	interval_translate(&rhs, offset);
+	RTE_ASSERT(origin.first == 0);
+
+	/* Special cases. */
+	if (origin.last == 0 && lhs.first == 0 && lhs.last == 0)
+		return true;
+	if (origin.last == 0 && rhs.first == UINT64_MAX && rhs.last == UINT64_MAX)
+		return true;
+	if (lhs.first == lhs.last && lhs.last == rhs.first && rhs.first == rhs.last)
+		return true;
+
+	return origin.last <= lhs.first &&
+		lhs.first <= lhs.last &&
+		lhs.last <= rhs.first &&
+		rhs.first <= rhs.last;
+}
+
+/* Append interval rhs to list of intervals in lhs. */
+static void
+value_set_append(struct value_set *lhs, struct value_set_interval rhs)
+{
+	RTE_VERIFY(lhs->nb_interval < VALUE_SET_NB_INTERVAL_MAX);
+	RTE_VERIFY(lhs->nb_interval == 0 ||
+		intervals_follow_with_gap(lhs->interval[lhs->nb_interval - 1], rhs));
+	lhs->interval[lhs->nb_interval++] = rhs;
+}
+
+/*
+ * Helper operations on noncyclic value set and intervals.
+ * Noncyclic means no interval crosses zero,
+ * but in return last value set interval may touch first.
+ */
+
+static struct value_set
+noncyclic_value_set_union_interval(const struct value_set *lhs, const struct value_set_interval rhs)
+{
+	struct value_set result = {};
+	uint32_t index = 0;
+
+	RTE_ASSERT(lhs->nb_interval == 0 ||
+		!interval_crosses_zero(lhs->interval[lhs->nb_interval - 1]));
+	RTE_ASSERT(!interval_crosses_zero(rhs));
+
+	/* Append to result all lhs intervals preceding rhs. */
+	for (; index != lhs->nb_interval; ++index) {
+		const struct value_set_interval lhs_interval = lhs->interval[index];
+		if (!intervals_follow_with_gap(lhs_interval, rhs))
+			break;
+
+		value_set_append(&result, lhs_interval);
+	}
+
+	/* Append interval joined from rhs and all lhs intervals intersecting or touching it. */
+	struct value_set_interval joint_interval = rhs;
+	for (; index != lhs->nb_interval; ++index) {
+		const struct value_set_interval lhs_interval = lhs->interval[index];
+		if (intervals_follow_with_gap(rhs, lhs_interval))
+			break;
+
+		joint_interval.first = RTE_MIN(joint_interval.first, lhs_interval.first);
+		joint_interval.last = RTE_MAX(joint_interval.last, lhs_interval.last);
+	}
+	value_set_append(&result, joint_interval);
+
+	/* Append to result all lhs intervals following rhs. */
+	for (; index != lhs->nb_interval; ++index)
+		value_set_append(&result, lhs->interval[index]);
+
+	return result;
+}
+
+/* Make "normal" maximal disjoint interval value set out of noncyclic one. */
+static struct value_set
+value_set_from_noncyclic(const struct value_set *set)
+{
+	struct value_set result = {};
+	uint32_t index = 0;
+
+	if (set->nb_interval <= 1)
+		return *set;
+
+	struct value_set_interval last_interval = set->interval[set->nb_interval - 1];
+	if (last_interval.last == UINT64_MAX && set->interval[0].first == 0) {
+		/* Join first interval with the last one instead of copying it. */
+		last_interval.last = set->interval[0].last;
+		++index;
+	}
+
+	for (; index != set->nb_interval - 1; ++index)
+		value_set_append(&result, set->interval[index]);
+
+	value_set_append(&result, last_interval);
+
+	return result;
+}
+
+/* Make lhs a union of lhs and rhs. */
+static void
+value_set_union_interval(struct value_set *lhs, const struct value_set_interval rhs)
+{
+	struct value_set temp;
+
+	if (value_set_is_empty(lhs)) {
+		value_set_append(lhs, rhs);
+		return;
+	}
+
+	struct value_set_interval *const last_interval = &lhs->interval[lhs->nb_interval - 1];
+	const bool last_interval_crossed_zero = interval_crosses_zero(*last_interval);
+	const uint64_t wrapping_last = last_interval->last;
+
+	if (last_interval_crossed_zero)
+		/* Make value set noncyclic by removing crossing part of last interval. */
+		last_interval->last = UINT64_MAX;
+
+	if (interval_crosses_zero(rhs)) {
+		/* Add parts before and after zero separately. */
+		temp = noncyclic_value_set_union_interval(lhs,
+			(struct value_set_interval){
+				.first = rhs.first,
+				.last = UINT64_MAX,
+			});
+		temp = noncyclic_value_set_union_interval(&temp,
+			(struct value_set_interval){
+				.first = 0,
+				.last = rhs.last,
+			});
+	} else
+		temp = noncyclic_value_set_union_interval(lhs, rhs);
+
+	if (last_interval_crossed_zero)
+		/* Restore previously removed part. */
+		temp = noncyclic_value_set_union_interval(&temp,
+			(struct value_set_interval){
+				.first = 0,
+				.last = wrapping_last,
+			});
+
+	*lhs = value_set_from_noncyclic(&temp);
+}
+
+/* Set `lhs` to the set of possible sums between values from `lhs` and `rhs`. */
+static void
+value_set_add_interval(struct value_set *lhs, struct value_set_interval rhs)
+{
+	const struct value_set temp = *lhs;
+	lhs->nb_interval = 0;
+
+	for (uint32_t index = 0; index != temp.nb_interval; ++index) {
+		const struct value_set_interval interval = temp.interval[index];
+		if (interval_is_full(rhs) || interval_is_full(interval) ||
+				interval_size(interval) > UINT64_MAX - interval_size(rhs)) {
+			value_set_append(lhs, canonical_full_interval);
+			return;
+		}
+	}
+
+	for (uint32_t index = 0; index != temp.nb_interval; ++index)
+		value_set_union_interval(lhs, (struct value_set_interval){
+			/* Checked sizes above, so these interval expansions won't overflow. */
+			.first = temp.interval[index].first + rhs.first,
+			.last = temp.interval[index].last + rhs.last,
+		});
+}
+
+struct value_set
+value_set_singleton(uint64_t value)
+{
+	return value_set_contiguous(value, value);
+}
+
+struct value_set
+value_set_contiguous(uint64_t first, uint64_t last)
+{
+	return (struct value_set){
+		.nb_interval = 1,
+		.interval = {
+			{ .first = first, .last = last },
+		},
+	};
+}
+
+struct value_set
+value_set_from_pair(uint64_t first1, uint64_t last1, uint64_t first2, uint64_t last2)
+{
+	struct value_set result = {};
+
+	if (first1 - first2 <= last2 - first2)
+		/* Interval 1 starts within interval 2. */
+		value_set_union_interval(&result, (struct value_set_interval){
+				.first = first1,
+				.last = first1 + RTE_MIN(last1 - first1, last2 - first1),
+			});
+
+	if (first2 - first1 <= last1 - first1)
+		/* Interval 2 starts within interval 1. */
+		value_set_union_interval(&result, (struct value_set_interval){
+				.first = first2,
+				.last = first2 + RTE_MIN(last2 - first2, last1 - first2),
+			});
+
+	return result;
+}
+
+bool
+value_set_is_empty(const struct value_set *set)
+{
+	return set->nb_interval == 0;
+}
+
+bool
+value_set_is_singleton(const struct value_set *set)
+{
+	return set->nb_interval == 1 && interval_size(set->interval[0]) == 1;
+}
+
+bool
+value_sets_equal(const struct value_set *lhs, const struct value_set *rhs)
+{
+	if (lhs->nb_interval != rhs->nb_interval)
+		return false;
+
+	for (uint32_t index = 0; index != lhs->nb_interval; ++index)
+		if (!intervals_equal(lhs->interval[index], rhs->interval[index]))
+			return false;
+
+	return true;
+}
+
+bool
+value_sets_intersect(const struct value_set *lhs, const struct value_set *rhs)
+{
+	for (uint32_t lhs_index = 0; lhs_index != lhs->nb_interval; ++lhs_index)
+		for (uint32_t rhs_index = 0; rhs_index != rhs->nb_interval; ++rhs_index)
+			if (intervals_intersect(lhs->interval[lhs_index], rhs->interval[rhs_index]))
+				return true;
+
+	return false;
+}
+
+bool
+value_set_is_covered_by_contiguous(const struct value_set *lhs, uint64_t first, uint64_t last)
+{
+	const struct value_set_interval rhs = { .first = first, .last = last };
+	for (uint32_t lhs_index = 0; lhs_index != lhs->nb_interval; ++lhs_index)
+		if (!interval_covers(rhs, lhs->interval[lhs_index]))
+			return false;
+
+	return true;
+}
+
+bool
+value_sets_based_less(const struct value_set *origin, const struct value_set *lhs,
+	const struct value_set *rhs)
+{
+	for (uint32_t origin_index = 0; origin_index != origin->nb_interval; ++origin_index)
+		for (uint32_t lhs_index = 0; lhs_index != lhs->nb_interval; ++lhs_index)
+			for (uint32_t rhs_index = 0; rhs_index != rhs->nb_interval; ++rhs_index)
+				if (!intervals_based_less(origin->interval[origin_index],
+						lhs->interval[lhs_index], rhs->interval[rhs_index]))
+					return false;
+	return true;
+}
+
+bool
+value_sets_based_less_or_equal(const struct value_set *origin, const struct value_set *lhs,
+	const struct value_set *rhs)
+{
+	for (uint32_t origin_index = 0; origin_index != origin->nb_interval; ++origin_index)
+		for (uint32_t lhs_index = 0; lhs_index != lhs->nb_interval; ++lhs_index)
+			for (uint32_t rhs_index = 0; rhs_index != rhs->nb_interval; ++rhs_index)
+				if (!intervals_based_less_or_equal(origin->interval[origin_index],
+						lhs->interval[lhs_index], rhs->interval[rhs_index]))
+					return false;
+	return true;
+}
+
+void
+value_set_translate(struct value_set *set, uint64_t offset)
+{
+	for (uint32_t index = 0; index != set->nb_interval; ++index)
+		interval_translate(&set->interval[index], offset);
+}
+
+void
+value_set_add_contiguous(struct value_set *lhs, uint64_t first, uint64_t last)
+{
+	value_set_add_interval(lhs, (struct value_set_interval){ .first = first, .last = last });
+}
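
For readers of the interval helpers in bpf_value_set.c above: the containment
and size checks rely on unsigned arithmetic wrapping modulo 2^64. The sketch
below is a standalone illustration under that assumption, not code from the
patch. `struct cyc_interval`, `cyc_contains`, `cyc_is_full`, `cyc_size` and
`cyc_demo` are hypothetical names that mirror struct value_set_interval,
interval_contains(), interval_is_full() and interval_size().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct value_set_interval. */
struct cyc_interval {
	uint64_t first;
	uint64_t last;
};

/* True if `value` lies in first..last, where the interval may wrap past UINT64_MAX. */
static bool
cyc_contains(struct cyc_interval iv, uint64_t value)
{
	/* Both sides are distances from `first`, computed modulo 2^64. */
	return value - iv.first <= iv.last - iv.first;
}

/* True if the interval covers every uint64_t value. */
static bool
cyc_is_full(struct cyc_interval iv)
{
	return iv.last + 1 == iv.first;
}

/* Number of elements modulo 2^64, hence 0 for a full interval. */
static uint64_t
cyc_size(struct cyc_interval iv)
{
	return iv.last - iv.first + 1;
}

/* Usage example: the interval UINT64_MAX-1 .. 2 wraps through zero. */
static void
cyc_demo(void)
{
	const struct cyc_interval wrap = { .first = UINT64_MAX - 1, .last = 2 };

	assert(cyc_contains(wrap, UINT64_MAX));
	assert(cyc_contains(wrap, 0));
	assert(!cyc_contains(wrap, 3));
	assert(cyc_size(wrap) == 5);
	assert(!cyc_is_full(wrap));
}
```

Note that any interval with last + 1 == first is full, which is why the patch
needs intervals_equal() instead of a plain member-wise comparison.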
diff --git a/lib/bpf/bpf_value_set.h b/lib/bpf/bpf_value_set.h
new file mode 100644
index 000000000000..5e7f8e521f55
--- /dev/null
+++ b/lib/bpf/bpf_value_set.h
@@ -0,0 +1,126 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 Huawei Technologies Co., Ltd
+ */
+
+#ifndef _BPF_VALUE_SET_H_
+#define _BPF_VALUE_SET_H_
+
+/**
+ * @file bpf_value_set.h
+ *
+ * Value set operations for BPF validate debug.
+ *
+ * This is not a general use library, only minimal set of operations is provided
+ * that are necessary for implementing validate debug interface.
+ */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define VALUE_SET_NB_INTERVAL_MAX 3
+
+/*
+ * Cyclic interval on uint64_t.
+ *
+ * Cyclic means value of `last` might be numerically smaller than `first`,
+ * that is the interval may cross from UINT64_MAX to 0.
+ *
+ * Contains element `first` and all elements that can be obtained from it by
+ * adding 1 until the result reaches `last`, which is included.
+ * There are thus multiple representations of the full set and no representation
+ * of the empty set.
+ *
+ * When `first` and `last` are accepted separately as function arguments, the
+ * term _contiguous_ is being used. It means that values of `first` and `last`
+ * are used to create a contiguous set composed of a single cyclic interval
+ * defined by these points.
+ */
+struct value_set_interval {
+	uint64_t first;
+	uint64_t last;
+};
+
+/*
+ * Set of values represented as an ordered sequence of maximal disjoint cyclic intervals.
+ *
+ * Condition `maximal disjoint` means intervals do not intersect or touch each other.
+ *
+ * The sequence is ordered by member `first`. Only last interval may thus cross zero.
+ */
+struct value_set {
+	uint32_t nb_interval;
+	struct value_set_interval interval[VALUE_SET_NB_INTERVAL_MAX];
+};
+
+/* Empty value set. */
+static const struct value_set value_set_empty = {
+	.nb_interval = 0,
+};
+
+/* Full (including every possible value) value set. */
+static const struct value_set value_set_full = {
+	.nb_interval = 1,
+	.interval = {
+		{ .first = 0, .last = UINT64_MAX },
+	},
+};
+
+/* Return set containing only `value`. */
+struct value_set
+value_set_singleton(uint64_t value);
+
+/* Return set of all values between and including `first` and `last` (AKA first..last). */
+struct value_set
+value_set_contiguous(uint64_t first, uint64_t last);
+
+/* Return set of all values belonging to _both_ first1..last1 and first2..last2. */
+struct value_set
+value_set_from_pair(uint64_t first1, uint64_t last1, uint64_t first2, uint64_t last2);
+
+/* Return true if the set is empty. */
+bool
+value_set_is_empty(const struct value_set *set);
+
+/* Return true if the set only contains one element. */
+bool
+value_set_is_singleton(const struct value_set *set);
+
+/* Return true if lhs and rhs represent the same set. */
+bool
+value_sets_equal(const struct value_set *lhs, const struct value_set *rhs);
+
+/* Return true if sets intersect (contain common elements). */
+bool
+value_sets_intersect(const struct value_set *lhs, const struct value_set *rhs);
+
+/* Return true if all elements in lhs belong to interval first..last */
+bool
+value_set_is_covered_by_contiguous(const struct value_set *lhs, uint64_t first, uint64_t last);
+
+/* Return true if `(l - o) < (r - o)` for all `(o in origin, l in lhs, r in rhs)`. */
+bool
+value_sets_based_less(const struct value_set *origin, const struct value_set *lhs,
+	const struct value_set *rhs);
+
+/* Return true if `(l - o) <= (r - o)` for all `(o in origin, l in lhs, r in rhs)`. */
+bool
+value_sets_based_less_or_equal(const struct value_set *origin, const struct value_set *lhs,
+	const struct value_set *rhs);
+
+/* Translate ("shift") all set elements by `offset`. */
+void
+value_set_translate(struct value_set *set, uint64_t offset);
+
+/* Set `lhs` to the set of possible sums between values from `lhs` and `rhs`. */
+void
+value_set_add_contiguous(struct value_set *lhs, uint64_t first, uint64_t last);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _BPF_VALUE_SET_H_ */
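
On value_set_add_contiguous() declared above: for a single pair of intervals,
the sum is another cyclic interval unless the two sizes together exceed what
uint64_t can hold, in which case the result is treated as the full set. The
following is a rough single-interval sketch of that rule, assuming the same
conservative size check as value_set_add_interval(). `struct cyc_interval`,
`cyc_size`, `cyc_add` and `cyc_add_demo` are hypothetical names, not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct value_set_interval. */
struct cyc_interval {
	uint64_t first;
	uint64_t last;
};

/* Number of elements modulo 2^64, hence 0 for a full interval. */
static uint64_t
cyc_size(struct cyc_interval iv)
{
	return iv.last - iv.first + 1;
}

/* Return the interval of all sums a + b, or the full interval 0..UINT64_MAX. */
static struct cyc_interval
cyc_add(struct cyc_interval lhs, struct cyc_interval rhs)
{
	const struct cyc_interval full = { .first = 0, .last = UINT64_MAX };

	/* Size 0 means a full operand; otherwise check the combined size. */
	if (cyc_size(lhs) == 0 || cyc_size(rhs) == 0 ||
			cyc_size(lhs) > UINT64_MAX - cyc_size(rhs))
		return full;

	/* End points are added modulo 2^64, so the result may cross zero. */
	return (struct cyc_interval){
		.first = lhs.first + rhs.first,
		.last = lhs.last + rhs.last,
	};
}

/* Usage example: (UINT64_MAX-1 .. UINT64_MAX) + (1 .. 3) == UINT64_MAX .. 2. */
static void
cyc_add_demo(void)
{
	const struct cyc_interval lhs = { .first = UINT64_MAX - 1, .last = UINT64_MAX };
	const struct cyc_interval rhs = { .first = 1, .last = 3 };
	const struct cyc_interval sum = cyc_add(lhs, rhs);

	assert(sum.first == UINT64_MAX);
	assert(sum.last == 2);
	assert(cyc_size(sum) == cyc_size(lhs) + cyc_size(rhs) - 1);
}
```

The check is slightly conservative: a sum that would hold exactly
UINT64_MAX elements is also reported as full, which is safe for a validator
since it only widens the set.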
diff --git a/lib/bpf/meson.build b/lib/bpf/meson.build
index 7e8a300e3f87..b74a5c232107 100644
--- a/lib/bpf/meson.build
+++ b/lib/bpf/meson.build
@@ -24,6 +24,8 @@ sources = files(
         'bpf_load_elf.c',
         'bpf_pkt.c',
         'bpf_validate.c',
+        'bpf_validate_debug.c',
+        'bpf_value_set.c',
 )
 
 if arch_subdir == 'x86' and dpdk_conf.get('RTE_ARCH_64')
@@ -32,9 +34,12 @@ elif dpdk_conf.has('RTE_ARCH_ARM64')
     sources += files('bpf_jit_arm64.c')
 endif
 
-headers = files('bpf_def.h',
+headers = files(
+        'bpf_def.h',
         'rte_bpf.h',
-        'rte_bpf_ethdev.h')
+        'rte_bpf_ethdev.h',
+        'rte_bpf_validate_debug.h',
+)
 
 deps += ['mbuf', 'net', 'ethdev']
 
diff --git a/lib/bpf/rte_bpf.h b/lib/bpf/rte_bpf.h
index b6c232704a56..052849945c45 100644
--- a/lib/bpf/rte_bpf.h
+++ b/lib/bpf/rte_bpf.h
@@ -118,6 +118,7 @@ enum rte_bpf_origin {
 };
 
 struct bpf_insn;
+struct rte_bpf_validate_debug;
 
 /**
  * Input parameters for loading eBPF code, extensible version.
@@ -158,6 +159,9 @@ struct rte_bpf_prm_ex {
 
 	struct rte_bpf_arg prog_arg[EBPF_FUNC_MAX_ARGS];  /**< program arguments */
 	uint32_t nb_prog_arg;  /**< program argument count */
+
+	/* Validate debug instance. */
+	struct rte_bpf_validate_debug *debug;
 };
 
 /**
diff --git a/lib/bpf/rte_bpf_validate_debug.h b/lib/bpf/rte_bpf_validate_debug.h
new file mode 100644
index 000000000000..89bf587f0211
--- /dev/null
+++ b/lib/bpf/rte_bpf_validate_debug.h
@@ -0,0 +1,375 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Huawei Technologies Co., Ltd
+ */
+
+#ifndef _RTE_BPF_VALIDATE_DEBUG_H_
+#define _RTE_BPF_VALIDATE_DEBUG_H_
+
+/**
+ * @file rte_bpf_validate_debug.h
+ *
+ * Debugging interface for BPF validation.
+ *
+ * Can be used for debugging BPF validation problems as well as in tests.
+ */
+
+#include <bpf_def.h>
+#include <rte_compat.h>
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_BPF_VALIDATE_DEBUG_MAY_BE_FALSE	RTE_BIT32(0)
+#define RTE_BPF_VALIDATE_DEBUG_MAY_BE_TRUE	RTE_BIT32(1)
+
+/**
+ * Supported validate events.
+ *
+ * Valid events begin from 0 and end before `RTE_BPF_VALIDATE_DEBUG_EVENT_END`.
+ */
+enum rte_bpf_validate_debug_event {
+	/* Just before every instruction, at branch or validation end. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_STEP,
+	/* Validator has failed its internal self-checks. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_INVALID_STATE,
+	/* Start of validation. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_VALIDATION_START,
+	/* Successful finish of validation. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_VALIDATION_SUCCESS,
+	/* Finish of validation with error. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_VALIDATION_FAILURE,
+	/* Beginning of a branch just after the jump. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_ENTER,
+	/* Pruning branch as verified earlier. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_PRUNE,
+	/* End of branch verification, after the last verified instruction. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_BRANCH_RETURN,
+	/* Number of valid event values. */
+	RTE_BPF_VALIDATE_DEBUG_EVENT_END,
+};
+
+struct rte_bpf_validate_debug;
+struct rte_bpf_validate_debug_point;
+
+/** User callback description. */
+struct rte_bpf_validate_debug_callback {
+	int (*fn)(struct rte_bpf_validate_debug *debug, void *ctx);
+	void *ctx;
+};
+
+/** Invoked by rte_bpf_validate_debug_for_each_point for each breakpoint and catchpoint. */
+typedef int (*rte_bpf_validate_debug_point_process_t)(struct rte_bpf_validate_debug_point *point,
+	void *ctx);
+
+/**
+ * Create new debug instance.
+ *
+ * @return
+ *   Debug instance in case of success.
+ *   NULL with rte_errno set in case of a failure.
+ */
+__rte_experimental
+struct rte_bpf_validate_debug *
+rte_bpf_validate_debug_create(void);
+
+/**
+ * Destroy debug instance.
+ *
+ * Behavior is undefined if validation with this debug instance is ongoing.
+ *
+ * @param debug
+ *   Debug instance, or NULL.
+ */
+__rte_experimental
+void
+rte_bpf_validate_debug_destroy(struct rte_bpf_validate_debug *debug);
+
+/**
+ * Create new breakpoint at specified location.
+ *
+ * Can be called before the validation has started. If the program lacks the
+ * specified instruction when validation starts, the start will fail.
+ *
+ * It is allowed to create breakpoints for the same location a callback is
+ * currently executing for, but it will not be invoked in the same cycle.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param pc
+ *   Program counter to create breakpoint at.
+ * @param callback
+ *   Callback to invoke.
+ * @return
+ *   New breakpoint on success, NULL with rte_errno set on failure.
+ */
+__rte_experimental
+struct rte_bpf_validate_debug_point *
+rte_bpf_validate_debug_break(struct rte_bpf_validate_debug *debug, uint32_t pc,
+	const struct rte_bpf_validate_debug_callback *callback);
+
+/**
+ * Create new catchpoint for specified event.
+ *
+ * Can be called before the validation has started.
+ *
+ * It is allowed to create catchpoints for the same event a callback is
+ * currently executing for, but it will not be invoked in the same cycle.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param event
+ *   Validation event to create catchpoint for.
+ * @param callback
+ *   Callback to invoke.
+ * @return
+ *   New catchpoint on success, NULL with rte_errno set on failure.
+ */
+__rte_experimental
+struct rte_bpf_validate_debug_point *
+rte_bpf_validate_debug_catch(struct rte_bpf_validate_debug *debug,
+	enum rte_bpf_validate_debug_event event,
+	const struct rte_bpf_validate_debug_callback *callback);
+
+/**
+ * Delete breakpoint or catchpoint and free all associated resources.
+ *
+ * If a callback is currently being executed, calling this API is allowed for:
+ * - breakpoint or catchpoint the callback is executed for;
+ * - breakpoints or catchpoints for other locations or events;
+ * and NOT allowed for:
+ * - other breakpoints or catchpoints for the same location or event.
+ *
+ * @param point
+ *   Breakpoint or catchpoint to destroy, or NULL.
+ */
+__rte_experimental
+void
+rte_bpf_validate_debug_point_destroy(struct rte_bpf_validate_debug_point *point);
+
+/**
+ * Get effective eBPF parameters struct.
+ *
+ * @param debug
+ *   Debug instance.
+ * @return
+ *   Parameters struct of the validated eBPF program, including code with all
+ *   patches and relocations applied.
+ */
+__rte_experimental
+const struct rte_bpf_prm_ex *
+rte_bpf_validate_debug_get_bpf_param(const struct rte_bpf_validate_debug *debug);
+
+/**
+ * Get pointer to effective eBPF program instructions.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param ins
+ *   Upon return, program instructions with all patches and relocations applied.
+ * @param nb_ins
+ *   Upon return, number of program instructions.
+ * @return
+ *   Non-negative value on success, negative errno on failure.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_get_ins(const struct rte_bpf_validate_debug *debug,
+	const struct ebpf_insn **ins, uint32_t *nb_ins);
+
+/**
+ * Get last triggered breakpoint or catchpoint.
+ *
+ * Can be used to destroy currently processed breakpoint or catchpoint.
+ *
+ * The pointer may be invalid if the breakpoint or catchpoint has already been
+ * destroyed earlier.
+ *
+ * @param debug
+ *   Debug instance.
+ * @return
+ *   Last triggered breakpoint or catchpoint, including one the callback is
+ *   currently executing for.
+ *   NULL if none were triggered in the current validation process.
+ */
+__rte_experimental
+struct rte_bpf_validate_debug_point *
+rte_bpf_validate_debug_get_last_point(const struct rte_bpf_validate_debug *debug);
+
+/**
+ * Get current instruction index, or one after last if finishing.
+ *
+ * @param debug
+ *   Debug instance.
+ * @return
+ *   Current program counter being validated, or one after last.
+ *   UINT32_MAX if no program is being validated.
+ */
+__rte_experimental
+uint32_t
+rte_bpf_validate_debug_get_pc(const struct rte_bpf_validate_debug *debug);
+
+/**
+ * Get the validation result, if it has finished.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param result
+ *   Upon successful return, the validation result (negative if validation failed).
+ * @return
+ *   Non-negative value if validation has finished and result variable was written;
+ *   -EAGAIN if validation is still ongoing;
+ *   other negative errno in case of failure.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_get_validation_result(const struct rte_bpf_validate_debug *debug,
+	int *result);
+
+/**
+ * Check if specified memory access instruction is currently valid.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param access
+ *   Memory load or store eBPF instruction.
+ * @param off64
+ *   Additional 64-bit offset added to access->off.
+ * @return
+ *   1 if specified memory access is currently valid;
+ *   0 if specified memory access is currently invalid;
+ *   negative errno in case of failure.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_can_access(const struct rte_bpf_validate_debug *debug,
+	const struct ebpf_insn *access, uint64_t off64);
+
+/**
+ * Get possible truth values of the specified jump condition.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param jump
+ *   Conditional jump instruction specifying the condition.
+ * @param imm64
+ *   Additional 64-bit immediate added to the source.
+ * @return
+ *   In case of success, bitwise combination of:
+ *     RTE_BPF_VALIDATE_DEBUG_MAY_BE_FALSE if the jump condition may be false;
+ *     RTE_BPF_VALIDATE_DEBUG_MAY_BE_TRUE if the jump condition may be true;
+ *   negative errno in case of failure.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_may_jump(const struct rte_bpf_validate_debug *debug,
+	const struct ebpf_insn *jump, uint64_t imm64);
+
+/**
+ * Format information about specified register for the user.
+ *
+ * Parameters buffer, bufsz and return value work the same way as for snprintf.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param buffer
+ *   Buffer to fill with register information.
+ * @param bufsz
+ *   Buffer size (including space for terminating zero).
+ * @param reg
+ *   Register to provide information about.
+ * @return
+ *   Number of characters needed _excluding_ terminating zero.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_format_register_info(const struct rte_bpf_validate_debug *debug,
+	char *buffer, size_t bufsz, uint8_t reg);
+
+/**
+ * Format information about specified stack frame location for the user.
+ *
+ * Parameters buffer, bufsz and return value work the same way as for snprintf.
+ *
+ * @param debug
+ *   Debug instance.
+ * @param buffer
+ *   Buffer to fill with stack frame information.
+ * @param bufsz
+ *   Buffer size (including space for terminating zero).
+ * @param offset
+ *   Stack frame offset to provide information about, in bytes.
+ *   Typically a negative multiple of 8.
+ * @return
+ *   Number of characters needed _excluding_ terminating zero.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_format_frame_info(const struct rte_bpf_validate_debug *debug,
+	char *buffer, size_t bufsz, int32_t offset);
+
+/**
+ * Get program stack frame size.
+ *
+ * @param debug
+ *   Debug instance.
+ * @return
+ *   Program stack frame size in bytes.
+ */
+__rte_experimental
+int32_t
+rte_bpf_validate_debug_get_frame_size(const struct rte_bpf_validate_debug *debug);
+
+/**
+ * Format value following the style of register format function.
+ *
+ * Parameters buffer, bufsz and return value work the same way as for snprintf.
+ *
+ * @param buffer
+ *   Buffer to fill with the formatted value.
+ * @param bufsz
+ *   Buffer size (including space for terminating zero).
+ * @param format
+ *   One of characters 'd' or 'x' for signed decimal or hexadecimal format.
+ * @param value
+ *   Value to format; may be a signed value cast to unsigned.
+ * @return
+ *   Number of characters needed _excluding_ terminating zero.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_format_value(char *buffer, size_t bufsz, char format,
+	uint64_t value);
+
+/**
+ * Format interval following the style of register format function.
+ *
+ * Parameters buffer, bufsz and return value work the same way as for snprintf.
+ *
+ * @param buffer
+ *   Buffer to fill with the formatted interval.
+ * @param bufsz
+ *   Buffer size (including space for terminating zero).
+ * @param format
+ *   One of characters 'd' or 'x' for signed decimal or hexadecimal format.
+ * @param min
+ *   Minimum value of the interval; may be a signed value cast to unsigned.
+ * @param max
+ *   Maximum value of the interval; may be a signed value cast to unsigned.
+ * @return
+ *   Number of characters needed _excluding_ terminating zero.
+ */
+__rte_experimental
+int
+rte_bpf_validate_debug_format_interval(char *buffer, size_t bufsz, char format,
+	uint64_t min, uint64_t max);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BPF_VALIDATE_DEBUG_H_ */
-- 
2.43.0
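
The format functions above are documented as taking buffer and bufsz and
returning a length "the same way as for snprintf". A minimal sketch of the
two-pass calling pattern that this convention allows is given below. Note that
demo_format_value() and demo_format_value_alloc() are hypothetical stand-ins
built on plain snprintf() so that the example is self-contained; they are not
the implementation from this patch, and the output layout of the real
rte_bpf_validate_debug_format_value() as well as its behavior for an unknown
format character are assumptions here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for an snprintf-style formatter. It only mimics the
 * documented calling convention (buffer, bufsz, returned length); it is NOT
 * rte_bpf_validate_debug_format_value() and its output layout is assumed.
 */
static int
demo_format_value(char *buffer, size_t bufsz, char format, uint64_t value)
{
	if (format == 'd')
		return snprintf(buffer, bufsz, "%" PRId64, (int64_t)value);
	if (format == 'x')
		return snprintf(buffer, bufsz, "0x%" PRIx64, value);
	return -1; /* unknown format character (assumed error convention) */
}

/*
 * Two-pass call: measure with a zero-sized buffer first, then format into
 * a buffer of exactly the required size. Caller frees the result.
 */
static char *
demo_format_value_alloc(char format, uint64_t value)
{
	int len, written;
	char *str;

	/* Pass 1: length needed, excluding the terminating zero. */
	len = demo_format_value(NULL, 0, format, value);
	if (len < 0)
		return NULL;

	str = malloc((size_t)len + 1); /* +1 for the terminating zero */
	if (str == NULL)
		return NULL;

	/* Pass 2: same inputs, so the reported length cannot change. */
	written = demo_format_value(str, (size_t)len + 1, format, value);
	assert(written == len);
	(void)written;

	return str;
}
```

With a buffer that is too small the output is truncated but still
zero-terminated, and the return value still reports the full length needed,
which is how a caller can detect truncation (return value >= bufsz).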

