* [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
@ 2024-11-05 10:28 Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 1/4] eal/lcore: add topology based functions Vipin Varghese
` (6 more replies)
0 siblings, 7 replies; 30+ messages in thread
From: Vipin Varghese @ 2024-11-05 10:28 UTC (permalink / raw)
To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk
Cc: pbhagavatula, jerinj, ruifeng.wang, mattias.ronnblom,
anatoly.burakov, stephen, ferruh.yigit, honnappa.nagarahalli,
wathsala.vithanage, konstantin.ananyev, mb
This patch series introduces improvements for NUMA topology awareness in
relation to DPDK logical cores. The goal is to expose APIs which allow
users to select optimal logical cores for any application. These logical
cores can be selected from various NUMA domains, such as CPU and I/O.
Change Summary:
- Introduces the concept of NUMA domain partitioning based on CPU and
I/O topology.
- Adds support for grouping DPDK logical cores within the same Cache
and I/O domain for improved locality.
- Implements topology detection and core grouping logic that
distinguishes between the following NUMA configurations:
* CPU topology & I/O topology (e.g., AMD EPYC SoC, Intel Xeon SPR)
* CPU+I/O topology (e.g., Ampere One with SLC, Intel Xeon SPR with SNC)
- Enhances performance by minimizing lcore dispersion across tiles|compute
package with different L2/L3 cache or IO domains.
Reason:
- Applications using DPDK libraries rely on consistent memory access.
- Lcores perform best when in the same NUMA domain as the I/O they use.
- Lcores benefit from sharing the same cache.
Latency is minimized by using lcores that share the same NUMA topology.
Memory access is optimized by utilizing cores within the same NUMA
domain or tile. Cache coherence is preserved within the same shared cache
domain, reducing remote accesses across tiles or compute packages via
snooping (a local hit in either L2 or L3 within the same NUMA domain).
Library dependency: hwloc
Topology Flags:
---------------
- RTE_LCORE_DOMAIN_L1: group cores sharing the same L1 cache
- RTE_LCORE_DOMAIN_SMT: alias for RTE_LCORE_DOMAIN_L1
- RTE_LCORE_DOMAIN_L2: group cores sharing the same L2 cache
- RTE_LCORE_DOMAIN_L3: group cores sharing the same L3 cache
- RTE_LCORE_DOMAIN_L4: group cores sharing the same L4 cache
- RTE_LCORE_DOMAIN_IO: group cores sharing the same I/O domain
< Function: Purpose >
---------------------
- rte_get_domain_count: get the domain count for a given topology flag
- rte_lcore_count_from_domain: get the count of valid lcores in a domain
- rte_get_lcore_in_domain: get a valid lcore id by index within a domain
- rte_lcore_cpuset_in_domain: return the cpuset of a domain by index
- rte_lcore_is_main_in_domain: return true if the main lcore is present
- rte_get_next_lcore_from_domain: next valid lcore within a domain
- rte_get_next_lcore_from_next_domain: next valid lcore from the next domain
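The intended call pattern can be sketched with stand-in stubs in place of
the real EAL implementation (the stub topology of two four-lcore domains
and all the stub_* names below are purely illustrative, not what a real
hwloc probe would report):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in topology: two domains of four lcores each. The real values
 * would come from the hwloc probe inside EAL. */
#define STUB_DOMAIN_COUNT 2
static const unsigned int stub_domains[STUB_DOMAIN_COUNT][4] = {
	{0, 1, 2, 3}, {4, 5, 6, 7}
};

/* Mirrors rte_get_domain_count(RTE_LCORE_DOMAIN_L3) in this sketch. */
static unsigned int stub_get_domain_count(void)
{
	return STUB_DOMAIN_COUNT;
}

/* Mirrors rte_get_lcore_in_domain(flag, idx, pos): the lcore id at a
 * given position within a domain, or a RTE_MAX_LCORE-style sentinel. */
static unsigned int stub_get_lcore_in_domain(unsigned int idx, unsigned int pos)
{
	if (idx >= STUB_DOMAIN_COUNT || pos >= 4)
		return UINT16_MAX;
	return stub_domains[idx][pos];
}

/* One worker per domain: select the first lcore of each domain. */
static unsigned int first_lcore_of_domain(unsigned int idx)
{
	return stub_get_lcore_in_domain(idx, 0);
}
```

With the real API, the same selection would loop over
rte_get_domain_count(RTE_LCORE_DOMAIN_L3) and launch one worker on the
first lcore of each L3 domain.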
Note:
1. Topology refers to the NUMA grouping.
2. Domain refers to the various sub-groups within a specific topology.
Topology example: L1, L2, L3, L4, IO
Domain example: IO-A, IO-B
< MACRO: Purpose >
------------------
- RTE_LCORE_FOREACH_DOMAIN: iterate over lcores from all domains
- RTE_LCORE_FOREACH_WORKER_DOMAIN: iterate over worker lcores from all domains
- RTE_LCORE_FORN_NEXT_DOMAIN: iterate over domains, selecting the n'th lcore
- RTE_LCORE_FORN_WORKER_NEXT_DOMAIN: iterate over domains, selecting the n'th worker lcore
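The iteration step underneath such macros can be sketched as a small
stand-alone model of the wrap and skip-main handling (this is an
illustrative simplification, not the EAL code itself):

```c
#include <assert.h>

/* Illustrative core list for one domain (lcore ids need not be dense). */
static const unsigned int demo_cores[4] = {0, 2, 4, 6};

/* Advance to the next position in a domain's core list, optionally
 * wrapping around and optionally skipping the main lcore.
 * Returns -1 when no further lcore is available. */
static int next_in_domain(const unsigned int *cores, int count, int pos,
		int wrap, int skip_main, unsigned int main_lcore)
{
	int next = pos + 1;

	for (int step = 0; step < count; step++, next++) {
		if (next >= count) {
			if (!wrap)
				return -1;	/* end of domain, no wrap */
			next = 0;
		}
		if (skip_main && cores[next] == main_lcore)
			continue;		/* skip the main lcore */
		return next;
	}
	return -1;				/* every candidate was skipped */
}
```

Bounding the scan by `count` steps guarantees termination even when wrap
is set and every remaining candidate is skipped.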
Future work (after merge):
--------------------------
- dma-perf per IO NUMA
- eventdev per L3 NUMA
- pipeline per SMT|L3 NUMA
- distributor per L3 for Port-Queue
- l2fwd-power per SMT
- testpmd option for IO NUMA per port
Platform tested on:
-------------------
- INTEL(R) XEON(R) PLATINUM 8562Y+ (supports IO NUMA 1 & 2)
- AMD EPYC 8534P (supports IO NUMA 1 & 2)
- AMD EPYC 9554 (supports IO NUMA 1, 2, 4)
Logs:
-----
1. INTEL(R) XEON(R) PLATINUM 8562Y+:
- SNC=1
Domain (IO): at index (0) there are 48 core, with (0) at index 0
- SNC=2
Domain (IO): at index (0) there are 24 core, with (0) at index 0
Domain (IO): at index (1) there are 24 core, with (12) at index 0
2. AMD EPYC 8534P:
- NPS=1:
Domain (IO): at index (0) there are 128 core, with (0) at index 0
- NPS=2:
Domain (IO): at index (0) there are 64 core, with (0) at index 0
Domain (IO): at index (1) there are 64 core, with (32) at index 0
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
Vipin Varghese (4):
eal/lcore: add topology based functions
test/lcore: enable tests for topology
doc: add topology grouping details
examples: update with lcore topology API
app/test/test_lcores.c | 528 +++++++++++++
config/meson.build | 18 +
.../prog_guide/env_abstraction_layer.rst | 22 +
examples/helloworld/main.c | 154 +++-
examples/l2fwd/main.c | 56 +-
examples/skeleton/basicfwd.c | 22 +
lib/eal/common/eal_common_lcore.c | 714 ++++++++++++++++++
lib/eal/common/eal_private.h | 58 ++
lib/eal/freebsd/eal.c | 10 +
lib/eal/include/rte_lcore.h | 209 +++++
lib/eal/linux/eal.c | 11 +
lib/eal/meson.build | 4 +
lib/eal/version.map | 11 +
lib/eal/windows/eal.c | 12 +
14 files changed, 1819 insertions(+), 10 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v4 1/4] eal/lcore: add topology based functions
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
@ 2024-11-05 10:28 ` Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 2/4] test/lcore: enable tests for topology Vipin Varghese
` (5 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: Vipin Varghese @ 2024-11-05 10:28 UTC (permalink / raw)
To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk
Cc: pbhagavatula, jerinj, ruifeng.wang, mattias.ronnblom,
anatoly.burakov, stephen, ferruh.yigit, honnappa.nagarahalli,
wathsala.vithanage, konstantin.ananyev, mb
Introduce topology-aware lcore mapping into the lcore API.
With higher core density, more and more cores are grouped
into various chiplets based on IO (memory and PCIe) and
Last Level Cache (mainly L3).
Using the hwloc library, the DPDK-available lcores can be grouped
into various domains, namely L1, L2, L3, L4 and IO. This patch
introduces functions and macros that help to identify such
groups.
Internal API:
- get_domain_lcore_count;
- get_domain_lcore_mapping
- rte_eal_topology_init;
- rte_eal_topology_release;
External Experimental API:
- rte_get_domain_count;
- rte_get_lcore_in_domain;
- rte_get_next_lcore_from_domain;
- rte_get_next_lcore_from_next_domain;
- rte_lcore_count_from_domain;
- rte_lcore_cpuset_in_domain;
- rte_lcore_is_main_in_domain;
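The grouping step these functions rely on can be sketched stand-alone,
with plain 64-bit masks standing in for hwloc cpusets (the mask values
and the collect_enabled name are illustrative assumptions, not EAL code):

```c
#include <assert.h>
#include <stdint.h>

static uint16_t demo_list[8];	/* receives the per-domain lcore list */

/* For one domain's cpuset, collect only the DPDK-enabled lcores into
 * the domain's core list, analogous to what the topology init does
 * per L1/L2/L3/L4/IO domain. */
static unsigned int collect_enabled(uint64_t domain_mask, uint64_t enabled_mask,
		uint16_t *cores, unsigned int max_cores)
{
	unsigned int count = 0;

	for (unsigned int cpu = 0; cpu < 64 && count < max_cores; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;

		/* keep the cpu only if it is both in this domain and
		 * enabled in the DPDK coremask */
		if ((domain_mask & bit) && (enabled_mask & bit))
			cores[count++] = (uint16_t)cpu;
	}
	return count;
}
```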
v4 changes:
- add internal API get_domain_lcore_count
- add external API rte_lcore_cpuset_in_domain
- add external API rte_lcore_is_main_in_domain
- remove malloc casting: Stephen Hemminger
- remove NULL check before free: Stephen Hemminger
- convert l3_count & io_count to uint16_t: Stephen Hemminger
- use rte_malloc for internal core_mapping: Stephen Hemminger
- extend to L4 cache: Morten Brørup
- add comment as placeholder for enabling cache-id: Morten Brørup
v2 changes:
- focuses on rte_lcore api for getting topology
- use hwloc instead of sysfs exploration - Mattias Rönnblom
- L1, L2 and IO domain mapping - Ferruh, Vipin
- new API marked experimental - Stephen Hemminger
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
config/meson.build | 18 +
lib/eal/common/eal_common_lcore.c | 714 ++++++++++++++++++++++++++++++
lib/eal/common/eal_private.h | 58 +++
lib/eal/freebsd/eal.c | 10 +
lib/eal/include/rte_lcore.h | 209 +++++++++
lib/eal/linux/eal.c | 11 +
lib/eal/meson.build | 4 +
lib/eal/version.map | 11 +
lib/eal/windows/eal.c | 12 +
9 files changed, 1047 insertions(+)
diff --git a/config/meson.build b/config/meson.build
index 5095d2fbcb..42e4d28f8d 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -240,6 +240,24 @@ if find_libnuma
endif
endif
+has_libhwloc = false
+find_libhwloc = true
+
+if meson.is_cross_build() and not meson.get_cross_property('hwloc', true)
+ # don't look for libhwloc if explicitly disabled in cross build
+ find_libhwloc = false
+endif
+
+if find_libhwloc
+ hwloc_dep = cc.find_library('hwloc', required: false)
+ if hwloc_dep.found() and cc.has_header('hwloc.h')
+ dpdk_conf.set10('RTE_HAS_LIBHWLOC', true)
+ has_libhwloc = true
+ add_project_link_arguments('-lhwloc', language: 'c')
+ dpdk_extra_ldflags += '-lhwloc'
+ endif
+endif
+
has_libfdt = false
fdt_dep = cc.find_library('fdt', required: false)
if fdt_dep.found() and cc.has_header('fdt.h')
diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index 2ff9252c52..756aaf9fbc 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -14,6 +14,7 @@
#ifndef RTE_EXEC_ENV_WINDOWS
#include <rte_telemetry.h>
#endif
+#include <rte_malloc.h>
#include "eal_private.h"
#include "eal_thread.h"
@@ -112,6 +113,371 @@ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap)
return i;
}
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+static struct core_domain_mapping *
+get_domain_lcore_mapping(unsigned int domain_sel, unsigned int domain_indx)
+{
+ struct core_domain_mapping *ptr =
+ (domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io[domain_indx] :
+ (domain_sel & RTE_LCORE_DOMAIN_L4) ? topo_cnfg.l4[domain_indx] :
+ (domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3[domain_indx] :
+ (domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2[domain_indx] :
+ (domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1[domain_indx] : NULL;
+
+ return ptr;
+}
+
+static unsigned int
+get_domain_lcore_count(unsigned int domain_sel)
+{
+ return ((domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L4) ? topo_cnfg.l4_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1_core_count : 0);
+}
+#endif
+
+unsigned int rte_get_domain_count(unsigned int domain_sel __rte_unused)
+{
+ unsigned int domain_cnt = 0;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ if (domain_sel & RTE_LCORE_DOMAIN_ALL) {
+ domain_cnt =
+ (domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L4) ? topo_cnfg.l4_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1_count : 0;
+ }
+#endif
+
+ return domain_cnt;
+}
+
+unsigned int
+rte_lcore_count_from_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+ unsigned int core_cnt = 0;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ unsigned int domain_cnt = 0;
+
+ if ((domain_sel & RTE_LCORE_DOMAIN_ALL) == 0)
+ return core_cnt;
+
+ domain_cnt = rte_get_domain_count(domain_sel);
+
+ if (domain_cnt == 0)
+ return core_cnt;
+
+ if ((domain_indx != RTE_LCORE_DOMAIN_LCORES_ALL) && (domain_indx >= domain_cnt))
+ return core_cnt;
+
+ core_cnt = (domain_sel & RTE_LCORE_DOMAIN_IO) ? topo_cnfg.io_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L4) ? topo_cnfg.l4_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L3) ? topo_cnfg.l3_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L2) ? topo_cnfg.l2_core_count :
+ (domain_sel & RTE_LCORE_DOMAIN_L1) ? topo_cnfg.l1_core_count : 0;
+
+ if ((domain_indx != RTE_LCORE_DOMAIN_LCORES_ALL) && (core_cnt)) {
+ struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+ core_cnt = (ptr != NULL) ? ptr->core_count : 0;
+ }
+#endif
+
+ return core_cnt;
+}
+
+unsigned int
+rte_get_lcore_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused, unsigned int lcore_pos __rte_unused)
+{
+ uint16_t sel_core = RTE_MAX_LCORE;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ unsigned int domain_cnt = 0;
+ unsigned int core_cnt = 0;
+
+ if (domain_sel & RTE_LCORE_DOMAIN_ALL) {
+ domain_cnt = rte_get_domain_count(domain_sel);
+ if (domain_cnt == 0)
+ return sel_core;
+
+ core_cnt = rte_lcore_count_from_domain(domain_sel, RTE_LCORE_DOMAIN_LCORES_ALL);
+ if (core_cnt == 0)
+ return sel_core;
+
+ struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+ if ((ptr) && (ptr->core_count)) {
+ if (lcore_pos < ptr->core_count)
+ sel_core = ptr->cores[lcore_pos];
+ }
+ }
+#endif
+
+ return sel_core;
+}
+
+rte_cpuset_t
+rte_lcore_cpuset_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+ rte_cpuset_t ret_cpu_set;
+ CPU_ZERO(&ret_cpu_set);
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ struct core_domain_mapping *ptr = NULL;
+ unsigned int domain_count = rte_get_domain_count(domain_sel);
+
+ if ((domain_count == 0) || (domain_indx >= domain_count))
+ return ret_cpu_set;
+
+ ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ return ret_cpu_set;
+
+ CPU_OR(&ret_cpu_set, &ret_cpu_set, &ptr->core_set);
+#endif
+
+ return ret_cpu_set;
+}
+
+bool
+rte_lcore_is_main_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+ bool is_main_in_domain = false;
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ struct core_domain_mapping *ptr = NULL;
+ unsigned int main_lcore = rte_get_main_lcore();
+ unsigned int domain_count = rte_get_domain_count(domain_sel);
+
+ if ((domain_count == 0) || (domain_indx >= domain_count))
+ return is_main_in_domain;
+
+ ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ return is_main_in_domain;
+
+ is_main_in_domain = CPU_ISSET(main_lcore, &ptr->core_set);
+#endif
+
+ return is_main_in_domain;
+}
+
+unsigned int
+rte_get_next_lcore_from_domain(unsigned int indx __rte_unused,
+int skip_main __rte_unused, int wrap __rte_unused, uint32_t flag __rte_unused)
+{
+ if (indx >= RTE_MAX_LCORE) {
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ if (get_domain_lcore_count(flag) == 0)
+ return RTE_MAX_LCORE;
+#endif
+ indx = rte_get_next_lcore(-1, skip_main, wrap);
+ return indx;
+ }
+ uint16_t usr_lcore = indx % RTE_MAX_LCORE;
+ uint16_t sel_domain_core = RTE_MAX_LCORE;
+
+ EAL_LOG(DEBUG, "lcore (%u), skip main lcore (%d), wrap (%d), flag (%u)",
+ usr_lcore, skip_main, wrap, flag);
+
+ /* check the input lcore indx */
+ if (!rte_lcore_is_enabled(indx)) {
+ EAL_LOG(ERR, "User input lcore (%u) is not enabled!!!", indx);
+ return sel_domain_core;
+ }
+
+ if ((rte_lcore_count() == 1)) {
+ EAL_LOG(DEBUG, "only 1 lcore in dpdk process!!!");
+ sel_domain_core = wrap ? indx : sel_domain_core;
+ return sel_domain_core;
+ }
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ uint16_t main_lcore = rte_get_main_lcore();
+ uint16_t sel_domain = 0xffff;
+ uint16_t sel_domain_core_index = 0xffff;
+ uint16_t sel_domain_core_count = 0;
+
+ struct core_domain_mapping *ptr = NULL;
+ uint16_t domain_count = 0;
+ uint16_t domain_core_count = 0;
+ uint16_t *domain_core_list = NULL;
+
+ domain_count = rte_get_domain_count(flag);
+ if (domain_count == 0) {
+ EAL_LOG(DEBUG, "No domain found for cores with flag (%u)!!!", flag);
+ return sel_domain_core;
+ }
+
+ /* identify the lcore to get the domain to start from */
+ for (int i = 0; (i < domain_count) && (sel_domain_core_index == 0xffff); i++) {
+ ptr = get_domain_lcore_mapping(flag, i);
+
+ domain_core_count = ptr->core_count;
+ domain_core_list = ptr->cores;
+
+ for (int j = 0; j < domain_core_count; j++) {
+ if (usr_lcore == domain_core_list[j]) {
+ sel_domain_core_index = j;
+ sel_domain_core_count = domain_core_count;
+ sel_domain = i;
+ break;
+ }
+ }
+ }
+
+ if (sel_domain_core_count == 1) {
+ EAL_LOG(DEBUG, "there is no more lcore in the domain!!!");
+ return sel_domain_core;
+ }
+
+ EAL_LOG(DEBUG, "selected: domain (%u), core: count %u, index %u, core: current %u",
+ sel_domain, sel_domain_core_count, sel_domain_core_index,
+ domain_core_list[sel_domain_core_index]);
+
+ /* get next lcore from the selected domain */
+ /* next lcore is always `sel_domain_core_index + 1`, but needs boundary check */
+ bool lcore_found = false;
+ uint16_t next_domain_lcore_index = sel_domain_core_index + 1;
+ while (false == lcore_found) {
+
+ if (next_domain_lcore_index >= sel_domain_core_count) {
+ if (wrap) {
+ next_domain_lcore_index = 0;
+ continue;
+ }
+ break;
+ }
+
+ /* check if main lcore skip */
+ if ((domain_core_list[next_domain_lcore_index] == main_lcore) && (skip_main)) {
+ next_domain_lcore_index += 1;
+ continue;
+ }
+
+ lcore_found = true;
+ }
+ if (true == lcore_found)
+ sel_domain_core = domain_core_list[next_domain_lcore_index];
+#endif
+
+ EAL_LOG(DEBUG, "Selected core (%u)", sel_domain_core);
+ return sel_domain_core;
+}
+
+unsigned int
+rte_get_next_lcore_from_next_domain(unsigned int indx __rte_unused,
+int skip_main __rte_unused, int wrap __rte_unused,
+uint32_t flag __rte_unused, int cores_to_skip __rte_unused)
+{
+ if (indx >= RTE_MAX_LCORE) {
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ if (get_domain_lcore_count(flag) == 0)
+ return RTE_MAX_LCORE;
+#endif
+ indx = rte_get_next_lcore(-1, skip_main, wrap);
+ return indx;
+ }
+
+ uint16_t sel_domain_core = RTE_MAX_LCORE;
+ uint16_t usr_lcore = indx % RTE_MAX_LCORE;
+
+ EAL_LOG(DEBUG, "lcore (%u), skip main lcore (%d), wrap (%d), flag (%u)",
+ usr_lcore, skip_main, wrap, flag);
+
+ /* check the input lcore indx */
+ if (!rte_lcore_is_enabled(indx)) {
+ EAL_LOG(DEBUG, "User input lcore (%u) is not enabled!!!", indx);
+ return sel_domain_core;
+ }
+
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ uint16_t main_lcore = rte_get_main_lcore();
+
+ uint16_t sel_domain = 0xffff;
+ uint16_t sel_domain_core_index = 0xffff;
+
+ uint16_t domain_count = 0;
+ uint16_t domain_core_count = 0;
+ uint16_t *domain_core_list = NULL;
+
+ domain_count = rte_get_domain_count(flag);
+ if (domain_count == 0) {
+ EAL_LOG(DEBUG, "No Domains found for the flag (%u)!!!", flag);
+ return sel_domain_core;
+ }
+
+ /* identify the lcore to get the domain to start from */
+ struct core_domain_mapping *ptr = NULL;
+ for (int i = 0; (i < domain_count) && (sel_domain_core_index == 0xffff); i++) {
+ ptr = get_domain_lcore_mapping(flag, i);
+ domain_core_count = ptr->core_count;
+ domain_core_list = ptr->cores;
+
+ for (int j = 0; j < domain_core_count; j++) {
+ if (usr_lcore == domain_core_list[j]) {
+ sel_domain_core_index = j;
+ sel_domain = i;
+ break;
+ }
+ }
+ }
+
+ if (sel_domain_core_index == 0xffff) {
+ EAL_LOG(DEBUG, "Invalid lcore %u for the flag (%u)!!!", indx, flag);
+ return sel_domain_core;
+ }
+
+ EAL_LOG(DEBUG, "Selected - core_index (%u); domain (%u), core_count (%u), cores (%p)",
+ sel_domain_core_index, sel_domain, domain_core_count, domain_core_list);
+
+ uint16_t skip_cores = (cores_to_skip >= 0) ? cores_to_skip : (0 - cores_to_skip);
+
+ /* get the next domain & valid lcore */
+ sel_domain = (((1 + sel_domain) == domain_count) && (wrap)) ? 0 : (1 + sel_domain);
+ sel_domain_core_index = 0xffff;
+
+ bool iter_loop = false;
+ for (int i = sel_domain; (i < domain_count) && (sel_domain_core == RTE_MAX_LCORE); i++) {
+ ptr = get_domain_lcore_mapping(flag, i);
+
+ domain_core_count = ptr->core_count;
+ domain_core_list = ptr->cores;
+
+ /* check if we have cores to iterate from this domain */
+ if (skip_cores >= domain_core_count)
+ continue;
+
+ if (((1 + sel_domain) == domain_count) && (wrap)) {
+ if (iter_loop == true)
+ break;
+
+ iter_loop = true;
+ }
+
+ sel_domain_core_index = (cores_to_skip >= 0) ? skip_cores :
+ (domain_core_count - skip_cores);
+ sel_domain_core = domain_core_list[sel_domain_core_index];
+
+ if ((skip_main) && (sel_domain_core == main_lcore)) {
+ sel_domain_core_index = 0xffff;
+ sel_domain_core = RTE_MAX_LCORE;
+ continue;
+ }
+ }
+#endif
+
+ EAL_LOG(DEBUG, "Selected core (%u)", sel_domain_core);
+ return sel_domain_core;
+}
+
unsigned int
rte_lcore_to_socket_id(unsigned int lcore_id)
{
@@ -131,6 +497,354 @@ socket_id_cmp(const void *a, const void *b)
return 0;
}
+
+
+/*
+ * Use HWLOC library to parse L1|L2|L3|NUMA-IO on the running target machine.
+ * Store the topology structure in memory.
+ */
+int
+rte_eal_topology_init(void)
+{
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ memset(&topo_cnfg, 0, sizeof(struct topology_config));
+
+ hwloc_topology_init(&topo_cnfg.topology);
+ hwloc_topology_load(topo_cnfg.topology);
+
+ int l1_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L1CACHE);
+ int l2_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L2CACHE);
+ int l3_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L3CACHE);
+ int l4_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L4CACHE);
+ int io_depth = hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_NUMANODE);
+
+ EAL_LOG(DEBUG, "TOPOLOGY - depth: l1 %d, l2 %d, l3 %d, l4 %d, io %d",
+ l1_depth, l2_depth, l3_depth, l4_depth, io_depth);
+
+ topo_cnfg.l1_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l1_depth);
+ topo_cnfg.l2_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l2_depth);
+ topo_cnfg.l3_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l3_depth);
+ topo_cnfg.l4_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, l4_depth);
+ topo_cnfg.io_count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, io_depth);
+
+ EAL_LOG(DEBUG, "TOPOLOGY - obj count: l1 %d, l2 %d, l3 %d, l4 %d, io %d",
+ topo_cnfg.l1_count, topo_cnfg.l2_count,
+ topo_cnfg.l3_count, topo_cnfg.l4_count,
+ topo_cnfg.io_count);
+
+ if ((l1_depth) && (topo_cnfg.l1_count)) {
+ topo_cnfg.l1 = rte_malloc(NULL,
+ sizeof(struct core_domain_mapping *) * topo_cnfg.l1_count, 0);
+ if (topo_cnfg.l1 == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ for (int j = 0; j < topo_cnfg.l1_count; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l1_depth, j);
+ unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+ unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+ topo_cnfg.l1[j] = rte_malloc(NULL, sizeof(struct core_domain_mapping), 0);
+ if (topo_cnfg.l1[j] == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ topo_cnfg.l1[j]->core_count = 0;
+ topo_cnfg.l1[j]->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
+ if (topo_cnfg.l1[j]->cores == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ signed int cpu_id = first_cpu;
+ unsigned int cpu_index = 0;
+ do {
+ if (rte_lcore_is_enabled(cpu_id)) {
+ EAL_LOG(DEBUG, " L1|SMT domain (%u) lcore %u", j, cpu_id);
+ topo_cnfg.l1[j]->cores[cpu_index] = cpu_id;
+ cpu_index++;
+
+ CPU_SET(cpu_id, &topo_cnfg.l1[j]->core_set);
+ topo_cnfg.l1[j]->core_count += 1;
+ topo_cnfg.l1_core_count += 1;
+ }
+ cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+ cpu_count -= 1;
+ } while ((cpu_id != -1) && (cpu_count));
+ }
+ }
+
+ if ((l2_depth) && (topo_cnfg.l2_count)) {
+ topo_cnfg.l2 = rte_malloc(NULL,
+ sizeof(struct core_domain_mapping *) * topo_cnfg.l2_count, 0);
+ if (topo_cnfg.l2 == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ for (int j = 0; j < topo_cnfg.l2_count; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l2_depth, j);
+ unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+ unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+ topo_cnfg.l2[j] = rte_malloc(NULL, sizeof(struct core_domain_mapping), 0);
+ if (topo_cnfg.l2[j] == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ topo_cnfg.l2[j]->core_count = 0;
+ topo_cnfg.l2[j]->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
+ if (topo_cnfg.l2[j]->cores == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ signed int cpu_id = first_cpu;
+ unsigned int cpu_index = 0;
+ do {
+ if (rte_lcore_is_enabled(cpu_id)) {
+ EAL_LOG(DEBUG, " L2 domain (%u) lcore %u", j, cpu_id);
+ topo_cnfg.l2[j]->cores[cpu_index] = cpu_id;
+ cpu_index++;
+
+ CPU_SET(cpu_id, &topo_cnfg.l2[j]->core_set);
+ topo_cnfg.l2[j]->core_count += 1;
+ topo_cnfg.l2_core_count += 1;
+ }
+ cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+ cpu_count -= 1;
+ } while ((cpu_id != -1) && (cpu_count));
+ }
+ }
+
+ if ((l3_depth) && (topo_cnfg.l3_count)) {
+ topo_cnfg.l3 = rte_malloc(NULL,
+ sizeof(struct core_domain_mapping *) * topo_cnfg.l3_count, 0);
+ if (topo_cnfg.l3 == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ for (int j = 0; j < topo_cnfg.l3_count; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l3_depth, j);
+ unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+ unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+ topo_cnfg.l3[j] = rte_malloc(NULL, sizeof(struct core_domain_mapping), 0);
+ if (topo_cnfg.l3[j] == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ topo_cnfg.l3[j]->core_count = 0;
+ topo_cnfg.l3[j]->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
+ if (topo_cnfg.l3[j]->cores == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ signed int cpu_id = first_cpu;
+ unsigned int cpu_index = 0;
+ do {
+ if (rte_lcore_is_enabled(cpu_id)) {
+ EAL_LOG(DEBUG, " L3 domain (%u) lcore %u", j, cpu_id);
+ topo_cnfg.l3[j]->cores[cpu_index] = cpu_id;
+ cpu_index++;
+
+ CPU_SET(cpu_id, &topo_cnfg.l3[j]->core_set);
+ topo_cnfg.l3[j]->core_count += 1;
+ topo_cnfg.l3_core_count += 1;
+ }
+ cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+ cpu_count -= 1;
+ } while ((cpu_id != -1) && (cpu_count));
+ }
+ }
+
+ if ((l4_depth) && (topo_cnfg.l4_count)) {
+ topo_cnfg.l4 = rte_malloc(NULL,
+ sizeof(struct core_domain_mapping *) * topo_cnfg.l4_count, 0);
+ if (topo_cnfg.l4 == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ for (int j = 0; j < topo_cnfg.l4_count; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, l4_depth, j);
+ unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+ unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+ topo_cnfg.l4[j] = rte_malloc(NULL, sizeof(struct core_domain_mapping), 0);
+ if (topo_cnfg.l4[j] == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ topo_cnfg.l4[j]->core_count = 0;
+ topo_cnfg.l4[j]->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
+ if (topo_cnfg.l4[j]->cores == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ signed int cpu_id = first_cpu;
+ unsigned int cpu_index = 0;
+ do {
+ if (rte_lcore_is_enabled(cpu_id)) {
+ EAL_LOG(DEBUG, " L4 domain (%u) lcore %u", j, cpu_id);
+ topo_cnfg.l4[j]->cores[cpu_index] = cpu_id;
+ cpu_index++;
+
+ CPU_SET(cpu_id, &topo_cnfg.l4[j]->core_set);
+ topo_cnfg.l4[j]->core_count += 1;
+ topo_cnfg.l4_core_count += 1;
+ }
+ cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+ cpu_count -= 1;
+ } while ((cpu_id != -1) && (cpu_count));
+ }
+ }
+
+ if ((io_depth) && (topo_cnfg.io_count)) {
+ topo_cnfg.io = rte_malloc(NULL,
+ sizeof(struct core_domain_mapping *) * topo_cnfg.io_count, 0);
+ if (topo_cnfg.io == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ for (int j = 0; j < topo_cnfg.io_count; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topo_cnfg.topology, io_depth, j);
+ unsigned int first_cpu = hwloc_bitmap_first(obj->cpuset);
+ unsigned int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+
+ topo_cnfg.io[j] = rte_malloc(NULL, sizeof(struct core_domain_mapping), 0);
+ if (topo_cnfg.io[j] == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ topo_cnfg.io[j]->core_count = 0;
+ topo_cnfg.io[j]->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
+ if (topo_cnfg.io[j]->cores == NULL) {
+ rte_eal_topology_release();
+ return -1;
+ }
+
+ signed int cpu_id = first_cpu;
+ unsigned int cpu_index = 0;
+ do {
+ if (rte_lcore_is_enabled(cpu_id)) {
+ EAL_LOG(DEBUG, " IO domain (%u) lcore %u", j, cpu_id);
+ topo_cnfg.io[j]->cores[cpu_index] = cpu_id;
+ cpu_index++;
+
+ CPU_SET(cpu_id, &topo_cnfg.io[j]->core_set);
+ topo_cnfg.io[j]->core_count += 1;
+ topo_cnfg.io_core_count += 1;
+ }
+ cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id);
+ cpu_count -= 1;
+ } while ((cpu_id != -1) && (cpu_count));
+ }
+ }
+
+ hwloc_topology_destroy(topo_cnfg.topology);
+ topo_cnfg.topology = NULL;
+
+ EAL_LOG(INFO, "TOPOLOGY - core count: l1 %u, l2 %u, l3 %u, l4 %u, io %u",
+ topo_cnfg.l1_core_count, topo_cnfg.l2_core_count,
+ topo_cnfg.l3_core_count, topo_cnfg.l4_core_count,
+ topo_cnfg.io_core_count);
+#endif
+
+ return 0;
+}
+
+/*
+ * release HWLOC topology structure memory
+ */
+int
+rte_eal_topology_release(void)
+{
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ EAL_LOG(DEBUG, "release l1 domain memory!");
+ for (int i = 0; i < topo_cnfg.l1_count; i++) {
+ if (topo_cnfg.l1[i]->cores) {
+ rte_free(topo_cnfg.l1[i]->cores);
+ topo_cnfg.l1[i]->core_count = 0;
+ }
+ }
+
+ if (topo_cnfg.l1_count) {
+ rte_free(topo_cnfg.l1);
+ topo_cnfg.l1 = NULL;
+ topo_cnfg.l1_count = 0;
+ }
+
+ EAL_LOG(DEBUG, "release l2 domain memory!");
+ for (int i = 0; i < topo_cnfg.l2_count; i++) {
+ if (topo_cnfg.l2[i]->cores) {
+ rte_free(topo_cnfg.l2[i]->cores);
+ topo_cnfg.l2[i]->core_count = 0;
+ }
+ }
+
+ if (topo_cnfg.l2_count) {
+ rte_free(topo_cnfg.l2);
+ topo_cnfg.l2 = NULL;
+ topo_cnfg.l2_count = 0;
+ }
+
+ EAL_LOG(DEBUG, "release l3 domain memory!");
+ for (int i = 0; i < topo_cnfg.l3_count; i++) {
+ if (topo_cnfg.l3[i]->cores) {
+ rte_free(topo_cnfg.l3[i]->cores);
+ topo_cnfg.l3[i]->core_count = 0;
+ }
+ }
+
+ if (topo_cnfg.l3_count) {
+ rte_free(topo_cnfg.l3);
+ topo_cnfg.l3 = NULL;
+ topo_cnfg.l3_count = 0;
+ }
+
+ EAL_LOG(DEBUG, "release l4 domain memory!");
+ for (int i = 0; i < topo_cnfg.l4_count; i++) {
+ if (topo_cnfg.l4[i]->cores) {
+ rte_free(topo_cnfg.l4[i]->cores);
+ topo_cnfg.l4[i]->core_count = 0;
+ }
+ }
+
+ if (topo_cnfg.l4_count) {
+ rte_free(topo_cnfg.l4);
+ topo_cnfg.l4 = NULL;
+ topo_cnfg.l4_count = 0;
+ }
+
+ EAL_LOG(DEBUG, "release IO domain memory!");
+ for (int i = 0; i < topo_cnfg.io_count; i++) {
+ if (topo_cnfg.io[i]->cores) {
+ rte_free(topo_cnfg.io[i]->cores);
+ topo_cnfg.io[i]->core_count = 0;
+ }
+ }
+
+ if (topo_cnfg.io_count) {
+ rte_free(topo_cnfg.io);
+ topo_cnfg.io = NULL;
+ topo_cnfg.io_count = 0;
+ }
+#endif
+
+ return 0;
+}
+
/*
* Parse /sys/devices/system/cpu to get the number of physical and logical
* processors on the machine. The function will fill the cpu_info
diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index bb315dab04..de9d5fc50f 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -14,9 +14,14 @@
#include <rte_lcore.h>
#include <rte_log.h>
#include <rte_memory.h>
+#include <rte_os.h>
#include "eal_internal_cfg.h"
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+#include <hwloc.h>
+#endif
+
/**
* Structure storing internal configuration (per-lcore)
*/
@@ -40,6 +45,45 @@ struct lcore_config {
extern struct lcore_config lcore_config[RTE_MAX_LCORE];
+struct core_domain_mapping {
+ rte_cpuset_t core_set; /**< cpu_set representing lcores within domain */
+ uint16_t core_count; /**< dpdk enabled lcores within domain */
+ uint16_t *cores; /**< list of cores */
+
+ /* uint16_t *l1_cache_id; */
+ /* uint16_t *l2_cache_id; */
+ /* uint16_t *l3_cache_id; */
+ /* uint16_t *l4_cache_id; */
+};
+
+struct topology_config {
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ hwloc_topology_t topology;
+#endif
+
+ /* domain count */
+ uint16_t l1_count;
+ uint16_t l2_count;
+ uint16_t l3_count;
+ uint16_t l4_count;
+ uint16_t io_count;
+
+ /* total cores under all domain */
+ uint16_t l1_core_count;
+ uint16_t l2_core_count;
+ uint16_t l3_core_count;
+ uint16_t l4_core_count;
+ uint16_t io_core_count;
+
+ /* two dimensional array for each domain */
+ struct core_domain_mapping **l1;
+ struct core_domain_mapping **l2;
+ struct core_domain_mapping **l3;
+ struct core_domain_mapping **l4;
+ struct core_domain_mapping **io;
+};
+extern struct topology_config topo_cnfg;
+
/**
* The global RTE configuration structure.
*/
@@ -81,6 +125,20 @@ struct rte_config *rte_eal_get_configuration(void);
*/
int rte_eal_memzone_init(void);
+
+/**
+ * Initialize the topology structure using HWLOC Library
+ */
+__rte_internal
+int rte_eal_topology_init(void);
+
+/**
+ * Release the memory held by Topology structure
+ */
+__rte_internal
+int rte_eal_topology_release(void);
+
+
/**
* Fill configuration with number of physical and logical processors
*
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 1229230063..301f993748 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -73,6 +73,8 @@ struct lcore_config lcore_config[RTE_MAX_LCORE];
/* used by rte_rdtsc() */
int rte_cycles_vmware_tsc_map;
+/* holds topology information */
+struct topology_config topo_cnfg;
int
eal_clean_runtime_dir(void)
@@ -912,6 +914,12 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_eal_topology_init()) {
+ rte_eal_init_alert("Cannot initialize topology");
+ rte_errno = ENOTSUP;
+ return -1;
+ }
+
eal_mcfg_complete();
return fctret;
@@ -932,6 +940,8 @@ rte_eal_cleanup(void)
struct internal_config *internal_conf =
eal_get_internal_configuration();
+
+ rte_eal_topology_release();
rte_service_finalize();
rte_mp_channel_cleanup();
eal_bus_cleanup();
diff --git a/lib/eal/include/rte_lcore.h b/lib/eal/include/rte_lcore.h
index 549b9e68c5..56c309d0e7 100644
--- a/lib/eal/include/rte_lcore.h
+++ b/lib/eal/include/rte_lcore.h
@@ -18,6 +18,7 @@
#include <rte_eal.h>
#include <rte_launch.h>
#include <rte_thread.h>
+#include <rte_bitset.h>
#ifdef __cplusplus
extern "C" {
@@ -37,6 +38,44 @@ enum rte_lcore_role_t {
ROLE_NON_EAL,
};
+/**
+ * The lcore grouping within the L1 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L1 RTE_BIT32(0)
+/**
+ * The lcore grouping within the L2 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L2 RTE_BIT32(1)
+/**
+ * The lcore grouping within the L3 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L3 RTE_BIT32(2)
+/**
+ * The lcore grouping within the L4 Domain.
+ */
+#define RTE_LCORE_DOMAIN_L4 RTE_BIT32(3)
+/**
+ * The lcore grouping within the IO Domain.
+ */
+#define RTE_LCORE_DOMAIN_IO RTE_BIT32(4)
+/**
+ * The lcore grouping within the SMT Domain (same as the L1 Domain).
+ */
+#define RTE_LCORE_DOMAIN_SMT RTE_LCORE_DOMAIN_L1
+/**
+ * The lcore grouping based on Domains (L1|L2|L3|L4|IO).
+ */
+#define RTE_LCORE_DOMAIN_ALL (RTE_LCORE_DOMAIN_L1 | \
+ RTE_LCORE_DOMAIN_L2 | \
+ RTE_LCORE_DOMAIN_L3 | \
+ RTE_LCORE_DOMAIN_L4 | \
+ RTE_LCORE_DOMAIN_IO)
+/**
+ * The mask for getting all cores under same topology.
+ */
+#define RTE_LCORE_DOMAIN_LCORES_ALL RTE_GENMASK32(31, 0)
+
/**
* Get a lcore's role.
*
@@ -211,6 +250,144 @@ int rte_lcore_is_enabled(unsigned int lcore_id);
*/
unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap);
+/**
+ * Get count for selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|L4|IO].
+ * @return
+ * the number of domains for the selected domain type.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int rte_get_domain_count(unsigned int domain_sel);
+
+/**
+ * Get count for lcores for a domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|L4|IO].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_get_domain_count - 1).
+ * @return
+ * total number of lcores in the selected domain index.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_lcore_count_from_domain(unsigned int domain_sel, unsigned int domain_indx);
+
+/**
+ * Get the n'th lcore from a selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|L4|IO].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_get_domain_count - 1).
+ * @param lcore_pos
+ * lcore position, valid range from 0 to (dpdk_enabled_lcores in the domain - 1).
+ * @return
+ * lcore from the list for the selected domain.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_get_lcore_in_domain(unsigned int domain_sel,
+		unsigned int domain_indx, unsigned int lcore_pos);
+
+#ifdef RTE_HAS_CPUSET
+/**
+ * Return cpuset for all lcores in selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|L4|IO].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_get_domain_count - 1).
+ * @return
+ * cpuset for all lcores from the selected domain.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+rte_cpuset_t
+rte_lcore_cpuset_in_domain(unsigned int domain_sel, unsigned int domain_indx);
+#endif
+
+
+/**
+ * Return true/false depending on whether the main lcore is present in the
+ * selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_LCORE_DOMAIN_[L1|L2|L3|L4|IO].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_get_domain_count - 1).
+ * @return
+ * true if the main lcore is available in the selected domain, false otherwise.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+bool
+rte_lcore_is_main_in_domain(unsigned int domain_sel, unsigned int domain_indx);
+
+/**
+ * Get the enabled lcores from next domain based on extended flag.
+ *
+ * @param i
+ * The current lcore (reference).
+ * @param skip_main
+ * If true, do not return the ID of the main lcore.
+ * @param wrap
+ * If true, go back to first core of flag based domain when last core is reached.
+ * If false, return RTE_MAX_LCORE when no more cores are available.
+ * @param flag
+ * Allows user to select various domain as specified under RTE_LCORE_DOMAIN_[L1|L2|L3|L4|IO]
+ *
+ * @return
+ * The next lcore_id or RTE_MAX_LCORE if not found.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_get_next_lcore_from_domain(unsigned int i, int skip_main, int wrap,
+		uint32_t flag);
+
+/**
+ * Get the Nth (first|last) lcores from next domain based on extended flag.
+ *
+ * @param i
+ * The current lcore (reference).
+ * @param skip_main
+ * If true, do not return the ID of the main lcore.
+ * @param wrap
+ * If true, go back to first core of flag based domain when last core is reached.
+ * If false, return RTE_MAX_LCORE when no more cores are available.
+ * @param flag
+ * Allows user to select various domain as specified under RTE_LCORE_DOMAIN_(L1|L2|L3|L4|IO)
+ * @param cores_to_skip
+ * If set to positive value, will skip to Nth lcore from start.
+ * If set to negative value, will skip to Nth lcore from last.
+ *
+ * @return
+ * The next lcore_id or RTE_MAX_LCORE if not found.
+ *
+ * @note valid for EAL args of lcore and coremask.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_get_next_lcore_from_next_domain(unsigned int i,
+		int skip_main, int wrap, uint32_t flag, int cores_to_skip);
+
/**
* Macro to browse all running lcores.
*/
@@ -227,6 +404,38 @@ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap);
i < RTE_MAX_LCORE; \
i = rte_get_next_lcore(i, 1, 0))
+/**
+ * Macro to browse all running lcores in a domain.
+ */
+#define RTE_LCORE_FOREACH_DOMAIN(i, flag) \
+ for (i = rte_get_next_lcore_from_domain(-1, 0, 0, flag); \
+ i < RTE_MAX_LCORE; \
+ i = rte_get_next_lcore_from_domain(i, 0, 0, flag))
+
+/**
+ * Macro to browse all running lcores except the main lcore in domain.
+ */
+#define RTE_LCORE_FOREACH_WORKER_DOMAIN(i, flag) \
+ for (i = rte_get_next_lcore_from_domain(-1, 1, 0, flag); \
+ i < RTE_MAX_LCORE; \
+ i = rte_get_next_lcore_from_domain(i, 1, 0, flag))
+
+/**
+ * Macro to browse Nth lcores on each domain.
+ */
+#define RTE_LCORE_FORN_NEXT_DOMAIN(i, flag, n) \
+ for (i = rte_get_next_lcore_from_next_domain(-1, 0, 0, flag, n);\
+ i < RTE_MAX_LCORE; \
+ i = rte_get_next_lcore_from_next_domain(i, 0, 0, flag, n))
+
+/**
+ * Macro to browse all Nth lcores except the main lcore on each domain.
+ */
+#define RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(i, flag, n) \
+ for (i = rte_get_next_lcore_from_next_domain(-1, 1, 0, flag, n);\
+ i < RTE_MAX_LCORE; \
+ i = rte_get_next_lcore_from_next_domain(i, 1, 0, flag, n))
+
/**
* Callback prototype for initializing lcores.
*
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 54577b7718..093f208319 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -65,6 +65,9 @@
* duration of the program, as we hold a write lock on it in the primary proc */
static int mem_cfg_fd = -1;
+/* holds topology information */
+struct topology_config topo_cnfg;
+
static struct flock wr_lock = {
.l_type = F_WRLCK,
.l_whence = SEEK_SET,
@@ -1311,6 +1314,12 @@ rte_eal_init(int argc, char **argv)
return -1;
}
+ if (rte_eal_topology_init()) {
+ rte_eal_init_alert("Cannot initialize topology");
+ rte_errno = ENOTSUP;
+ return -1;
+ }
+
eal_mcfg_complete();
return fctret;
@@ -1352,6 +1361,8 @@ rte_eal_cleanup(void)
struct internal_config *internal_conf =
eal_get_internal_configuration();
+ rte_eal_topology_release();
+
if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
internal_conf->hugepage_file.unlink_existing)
rte_memseg_walk(mark_freeable, NULL);
diff --git a/lib/eal/meson.build b/lib/eal/meson.build
index e1d6c4cf17..690b95d5df 100644
--- a/lib/eal/meson.build
+++ b/lib/eal/meson.build
@@ -31,3 +31,7 @@ endif
if is_freebsd
annotate_locks = false
endif
+
+if has_libhwloc
+ dpdk_conf.set10('RTE_EAL_HWLOC_TOPOLOGY_PROBE', true)
+endif
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 747331af60..a2f1fc1a6c 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -397,6 +397,15 @@ EXPERIMENTAL {
# added in 24.11
rte_bitset_to_str;
+
+ # added in 25.03
+ rte_get_domain_count;
+ rte_get_lcore_in_domain;
+ rte_get_next_lcore_from_domain;
+ rte_get_next_lcore_from_next_domain;
+ rte_lcore_count_from_domain;
+ rte_lcore_cpuset_in_domain;
+ rte_lcore_is_main_in_domain;
};
INTERNAL {
@@ -406,6 +415,8 @@ INTERNAL {
rte_bus_unregister;
rte_eal_get_baseaddr;
rte_eal_parse_coremask;
+ rte_eal_topology_init;
+ rte_eal_topology_release;
rte_firmware_read;
rte_intr_allow_others;
rte_intr_cap_multiple;
diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
index 28b78a95a6..2edfc4128c 100644
--- a/lib/eal/windows/eal.c
+++ b/lib/eal/windows/eal.c
@@ -40,6 +40,10 @@ static int mem_cfg_fd = -1;
/* internal configuration (per-core) */
struct lcore_config lcore_config[RTE_MAX_LCORE];
+/* holds topology information */
+struct topology_config topo_cnfg;
+
/* Detect if we are a primary or a secondary process */
enum rte_proc_type_t
eal_proc_type_detect(void)
@@ -262,6 +266,8 @@ rte_eal_cleanup(void)
struct internal_config *internal_conf =
eal_get_internal_configuration();
+ rte_eal_topology_release();
+
eal_intr_thread_cancel();
eal_mem_virt2iova_cleanup();
eal_bus_cleanup();
@@ -505,6 +511,12 @@ rte_eal_init(int argc, char **argv)
rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MAIN);
rte_eal_mp_wait_lcore();
+ if (rte_eal_topology_init()) {
+ rte_eal_init_alert("Cannot initialize topology");
+ rte_errno = ENOTSUP;
+ return -1;
+ }
+
eal_mcfg_complete();
return fctret;
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v4 2/4] test/lcore: enable tests for topology
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 1/4] eal/lcore: add topology based functions Vipin Varghese
@ 2024-11-05 10:28 ` Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 3/4] doc: add topology grouping details Vipin Varghese
` (4 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: Vipin Varghese @ 2024-11-05 10:28 UTC (permalink / raw)
To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk
Cc: pbhagavatula, jerinj, ruifeng.wang, mattias.ronnblom,
anatoly.burakov, stephen, ferruh.yigit, honnappa.nagarahalli,
wathsala.vithanage, konstantin.ananyev, mb
Add functional test cases to validate the topology-aware lcore
API.
v4 changes:
- add MACRO for triggering tests if topology is enabled.
- add test cases for
* rte_lcore_is_main_in_domain
* rte_lcore_cpuset_in_domain
* MACRO RTE_LCORE_FORN_NEXT_DOMAIN
* MACRO RTE_LCORE_FORN_WORKER_NEXT_DOMAIN
v3 changes:
- fix test check for RTE_LCORE_FOREACH_DOMAIN
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
app/test/test_lcores.c | 528 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 528 insertions(+)
diff --git a/app/test/test_lcores.c b/app/test/test_lcores.c
index bd5c0dd94b..15a9417e66 100644
--- a/app/test/test_lcores.c
+++ b/app/test/test_lcores.c
@@ -389,6 +389,513 @@ test_ctrl_thread(void)
return 0;
}
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+static int
+test_topology_macro(void)
+{
+ unsigned int total_lcores = 0;
+ unsigned int total_wrkr_lcores = 0;
+
+ unsigned int total_lcore_io = 0;
+ unsigned int total_lcore_l4 = 0;
+ unsigned int total_lcore_l3 = 0;
+ unsigned int total_lcore_l2 = 0;
+ unsigned int total_lcore_l1 = 0;
+
+ unsigned int total_wrkr_lcore_io = 0;
+ unsigned int total_wrkr_lcore_l4 = 0;
+ unsigned int total_wrkr_lcore_l3 = 0;
+ unsigned int total_wrkr_lcore_l2 = 0;
+ unsigned int total_wrkr_lcore_l1 = 0;
+
+ unsigned int lcore;
+
+ /* get topology core count */
+ lcore = -1;
+ RTE_LCORE_FOREACH(lcore)
+ total_lcores += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER(lcore)
+ total_wrkr_lcores += 1;
+
+ if ((total_wrkr_lcores + 1) != total_lcores) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FOREACH\n");
+ return -2;
+ }
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_IO)
+ total_lcore_io += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L4)
+ total_lcore_l4 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3)
+ total_lcore_l3 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L2)
+ total_lcore_l2 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_DOMAIN(lcore, RTE_LCORE_DOMAIN_L1)
+ total_lcore_l1 += 1;
+
+ printf("DBG: lcore count: default (%u), io (%u), l4 (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+ total_lcores, total_lcore_io,
+ total_lcore_l4, total_lcore_l3, total_lcore_l2, total_lcore_l1);
+
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_IO)
+ total_wrkr_lcore_io += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L4)
+ total_wrkr_lcore_l4 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3)
+ total_wrkr_lcore_l3 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L2)
+ total_wrkr_lcore_l2 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER_DOMAIN(lcore, RTE_LCORE_DOMAIN_L1)
+ total_wrkr_lcore_l1 += 1;
+
+ printf("DBG: worker lcore count: default (%u), io (%u), l4 (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+ total_wrkr_lcores, total_wrkr_lcore_io,
+ total_wrkr_lcore_l4, total_wrkr_lcore_l3,
+ total_wrkr_lcore_l2, total_wrkr_lcore_l1);
+
+
+ if ((total_wrkr_lcore_io) > total_lcore_io) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for IO\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l4) > total_lcore_l4) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L4\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l3) > total_lcore_l3) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L3\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l2) > total_lcore_l2) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L2\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l1) > total_lcore_l1) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FOREACH_DOMAIN for L1\n");
+ return -2;
+ }
+
+ total_lcore_io = 0;
+ total_lcore_l4 = 0;
+ total_lcore_l3 = 0;
+ total_lcore_l2 = 0;
+ total_lcore_l1 = 0;
+
+ lcore = -1;
+ RTE_LCORE_FORN_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_IO, 0)
+ total_lcore_io += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L4, 0)
+ total_lcore_l4 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3, 0)
+ total_lcore_l3 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L2, 0)
+ total_lcore_l2 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L1, 0)
+ total_lcore_l1 += 1;
+
+ printf("DBG: macro domain lcore: default (%u), io (%u), l4 (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+ total_lcores, total_lcore_io,
+ total_lcore_l4, total_lcore_l3, total_lcore_l2, total_lcore_l1);
+
+ total_wrkr_lcore_io = 0;
+ total_wrkr_lcore_l4 = 0;
+ total_wrkr_lcore_l3 = 0;
+ total_wrkr_lcore_l2 = 0;
+ total_wrkr_lcore_l1 = 0;
+
+ lcore = -1;
+ RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_IO, 0)
+ total_wrkr_lcore_io += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L4, 0)
+ total_wrkr_lcore_l4 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L3, 0)
+ total_wrkr_lcore_l3 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L2, 0)
+ total_wrkr_lcore_l2 += 1;
+
+ lcore = -1;
+ RTE_LCORE_FORN_WORKER_NEXT_DOMAIN(lcore, RTE_LCORE_DOMAIN_L1, 0)
+ total_wrkr_lcore_l1 += 1;
+
+ printf("DBG: macro next domain worker count: default (%u), io (%u), l4 (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+ total_wrkr_lcores, total_wrkr_lcore_io,
+ total_wrkr_lcore_l4, total_wrkr_lcore_l3,
+ total_wrkr_lcore_l2, total_wrkr_lcore_l1);
+
+ if ((total_wrkr_lcore_io) > total_lcore_io) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FORN_NEXT_DOMAIN for IO\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l4) > total_lcore_l4) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FORN_NEXT_DOMAIN for L4\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l3) > total_lcore_l3) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FORN_NEXT_DOMAIN for L3\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l2) > total_lcore_l2) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FORN_NEXT_DOMAIN for L2\n");
+ return -2;
+ }
+
+ if ((total_wrkr_lcore_l1) > total_lcore_l1) {
+ printf("ERR: failed in MACRO for RTE_LCORE_FORN_NEXT_DOMAIN for L1\n");
+ return -2;
+ }
+ printf("INFO: lcore DOMAIN macro: success!\n");
+ return 0;
+}
+
+static int
+test_lcore_count_from_domain(void)
+{
+ unsigned int total_lcores = 0;
+ unsigned int total_lcore_io = 0;
+ unsigned int total_lcore_l4 = 0;
+ unsigned int total_lcore_l3 = 0;
+ unsigned int total_lcore_l2 = 0;
+ unsigned int total_lcore_l1 = 0;
+
+ unsigned int domain_count;
+ unsigned int i;
+
+ /* get topology core count */
+ total_lcores = rte_lcore_count();
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+ for (i = 0; i < domain_count; i++)
+ total_lcore_io += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, i);
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L4);
+ for (i = 0; i < domain_count; i++)
+ total_lcore_l4 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L4, i);
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L3);
+ for (i = 0; i < domain_count; i++)
+ total_lcore_l3 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L3, i);
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L2);
+ for (i = 0; i < domain_count; i++)
+ total_lcore_l2 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L2, i);
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L1);
+ for (i = 0; i < domain_count; i++)
+ total_lcore_l1 += rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L1, i);
+
+ printf("DBG: lcore count: default (%u), io (%u), l4 (%u), l3 (%u), l2 (%u), l1 (%u).\n",
+ total_lcores, total_lcore_io,
+ total_lcore_l4, total_lcore_l3, total_lcore_l2, total_lcore_l1);
+
+ if ((total_lcore_l1 && (total_lcores != total_lcore_l1)) ||
+ (total_lcore_l2 && (total_lcores != total_lcore_l2)) ||
+ (total_lcore_l3 && (total_lcores != total_lcore_l3)) ||
+ (total_lcore_l4 && (total_lcores != total_lcore_l4)) ||
+ (total_lcore_io && (total_lcores != total_lcore_io))) {
+ printf("ERR: failed in domain API\n");
+ return -2;
+ }
+
+ printf("INFO: lcore count domain API: success\n");
+
+ return 0;
+}
+
+#ifdef RTE_HAS_CPUSET
+static int
+test_lcore_cpuset_from_domain(void)
+{
+ unsigned int domain_count;
+ uint16_t dmn_idx;
+ rte_cpuset_t cpu_set_list;
+
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_IO, dmn_idx);
+
+ for (uint16_t cpu_idx = 0; cpu_idx < RTE_MAX_LCORE; cpu_idx++) {
+ if (CPU_ISSET(cpu_idx, &cpu_set_list)) {
+ if (!rte_lcore_is_enabled(cpu_idx)) {
+ printf("ERR: lcore id: %u, shared from IO (%u) domain is not enabled!\n",
+ cpu_idx, dmn_idx);
+ return -1;
+ }
+ }
+ }
+ }
+
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L4);
+
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L4, dmn_idx);
+
+ for (uint16_t cpu_idx = 0; cpu_idx < RTE_MAX_LCORE; cpu_idx++) {
+ if (CPU_ISSET(cpu_idx, &cpu_set_list)) {
+ if (!rte_lcore_is_enabled(cpu_idx)) {
+ printf("ERR: lcore id: %u, shared from L4 (%u) domain is not enabled!\n",
+ cpu_idx, dmn_idx);
+ return -1;
+ }
+ }
+ }
+ }
+
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L3);
+
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L3, dmn_idx);
+
+ for (uint16_t cpu_idx = 0; cpu_idx < RTE_MAX_LCORE; cpu_idx++) {
+ if (CPU_ISSET(cpu_idx, &cpu_set_list)) {
+ if (!rte_lcore_is_enabled(cpu_idx)) {
+ printf("ERR: lcore id: %u, shared from L3 (%u) domain is not enabled!\n",
+ cpu_idx, dmn_idx);
+ return -1;
+ }
+ }
+ }
+ }
+
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L2);
+
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L2, dmn_idx);
+
+ for (uint16_t cpu_idx = 0; cpu_idx < RTE_MAX_LCORE; cpu_idx++) {
+ if (CPU_ISSET(cpu_idx, &cpu_set_list)) {
+ if (!rte_lcore_is_enabled(cpu_idx)) {
+ printf("ERR: lcore id: %u, shared from L2 (%u) domain is not enabled!\n",
+ cpu_idx, dmn_idx);
+ return -1;
+ }
+ }
+ }
+ }
+
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L1);
+
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L1, dmn_idx);
+
+ for (uint16_t cpu_idx = 0; cpu_idx < RTE_MAX_LCORE; cpu_idx++) {
+ if (CPU_ISSET(cpu_idx, &cpu_set_list)) {
+ if (!rte_lcore_is_enabled(cpu_idx)) {
+ printf("ERR: lcore id: %u, shared from L1 (%u) domain is not enabled!\n",
+ cpu_idx, dmn_idx);
+ return -1;
+ }
+ }
+ }
+ }
+
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L1, RTE_MAX_LCORE);
+ if (CPU_COUNT(&cpu_set_list)) {
+ printf("ERR: RTE_MAX_LCORE (%u) in L1 domain is enabled!\n", RTE_MAX_LCORE);
+ return -2;
+ }
+
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L2, RTE_MAX_LCORE);
+ if (CPU_COUNT(&cpu_set_list)) {
+ printf("ERR: RTE_MAX_LCORE (%u) in L2 domain is enabled!\n", RTE_MAX_LCORE);
+ return -2;
+ }
+
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_L3, RTE_MAX_LCORE);
+ if (CPU_COUNT(&cpu_set_list)) {
+ printf("ERR: RTE_MAX_LCORE (%u) in L3 domain is enabled!\n", RTE_MAX_LCORE);
+ return -2;
+ }
+
+ cpu_set_list = rte_lcore_cpuset_in_domain(RTE_LCORE_DOMAIN_IO, RTE_MAX_LCORE);
+ if (CPU_COUNT(&cpu_set_list)) {
+ printf("ERR: RTE_MAX_LCORE (%u) in IO domain is enabled!\n", RTE_MAX_LCORE);
+ return -2;
+ }
+
+ printf("INFO: cpuset_in_domain API: success!\n");
+ return 0;
+}
+#endif
+
+static int
+test_main_lcore_in_domain(void)
+{
+ bool main_lcore_found;
+ unsigned int domain_count;
+ uint16_t dmn_idx;
+
+ main_lcore_found = false;
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ main_lcore_found = rte_lcore_is_main_in_domain(RTE_LCORE_DOMAIN_IO, dmn_idx);
+ if (main_lcore_found) {
+ printf("DBG: main lcore found in IO domain: %u\n", dmn_idx);
+ break;
+ }
+ }
+
+ if ((domain_count) && (main_lcore_found == false)) {
+ printf("ERR: main lcore is not found in any of the IO domains!\n");
+ return -1;
+ }
+
+ main_lcore_found = false;
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L4);
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ main_lcore_found = rte_lcore_is_main_in_domain(RTE_LCORE_DOMAIN_L4, dmn_idx);
+ if (main_lcore_found) {
+ printf("DBG: main lcore found in L4 domain: %u\n", dmn_idx);
+ break;
+ }
+ }
+
+ if ((domain_count) && (main_lcore_found == false)) {
+ printf("ERR: main lcore is not found in any of the L4 domains!\n");
+ return -1;
+ }
+
+ main_lcore_found = false;
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L3);
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ main_lcore_found = rte_lcore_is_main_in_domain(RTE_LCORE_DOMAIN_L3, dmn_idx);
+ if (main_lcore_found) {
+ printf("DBG: main lcore found in L3 domain: %u\n", dmn_idx);
+ break;
+ }
+ }
+
+ if ((domain_count) && (main_lcore_found == false)) {
+ printf("ERR: main lcore is not found in any of the L3 domains!\n");
+ return -1;
+ }
+
+ main_lcore_found = false;
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L2);
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ main_lcore_found = rte_lcore_is_main_in_domain(RTE_LCORE_DOMAIN_L2, dmn_idx);
+ if (main_lcore_found) {
+ printf("DBG: main lcore found in L2 domain: %u\n", dmn_idx);
+ break;
+ }
+ }
+
+ if ((domain_count) && (main_lcore_found == false)) {
+ printf("ERR: main lcore is not found in any of the L2 domains!\n");
+ return -1;
+ }
+
+ main_lcore_found = false;
+ dmn_idx = 0;
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L1);
+ for (; dmn_idx < domain_count; dmn_idx++) {
+ main_lcore_found = rte_lcore_is_main_in_domain(RTE_LCORE_DOMAIN_L1, dmn_idx);
+ if (main_lcore_found) {
+ printf("DBG: main lcore found in L1 domain: %u\n", dmn_idx);
+ break;
+ }
+ }
+
+ if ((domain_count) && (main_lcore_found == false)) {
+ printf("ERR: main lcore is not found in any of the L1 domains!\n");
+ return -1;
+ }
+
+ printf("INFO: is_main_lcore_in_domain API: success!\n");
+ return 0;
+}
+
+static int
+test_lcore_from_domain_negative(void)
+{
+ unsigned int domain_count;
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+ if ((domain_count) && (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, domain_count))) {
+ printf("ERR: domain API inconsistent for IO\n");
+ return -1;
+ }
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L4);
+ if ((domain_count) && (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L4, domain_count))) {
+ printf("ERR: domain API inconsistent for L4\n");
+ return -1;
+ }
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L3);
+ if ((domain_count) && (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L3, domain_count))) {
+ printf("ERR: domain API inconsistent for L3\n");
+ return -1;
+ }
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L2);
+ if ((domain_count) && (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L2, domain_count))) {
+ printf("ERR: domain API inconsistent for L2\n");
+ return -1;
+ }
+
+ domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_L1);
+ if ((domain_count) && (rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_L1, domain_count))) {
+ printf("ERR: domain API inconsistent for L1\n");
+ return -1;
+ }
+
+ printf("INFO: lcore domain API: success!\n");
+ return 0;
+}
+#endif
+
static int
test_lcores(void)
{
@@ -419,6 +926,27 @@ test_lcores(void)
if (test_ctrl_thread() < 0)
return TEST_FAILED;
+#ifdef RTE_EAL_HWLOC_TOPOLOGY_PROBE
+ printf("\nTopology test\n");
+
+ if (test_topology_macro() < 0)
+ return TEST_FAILED;
+
+ if (test_lcore_count_from_domain() < 0)
+ return TEST_FAILED;
+
+ if (test_lcore_from_domain_negative() < 0)
+ return TEST_FAILED;
+
+#ifdef RTE_HAS_CPUSET
+ if (test_lcore_cpuset_from_domain() < 0)
+ return TEST_FAILED;
+#endif
+
+ if (test_main_lcore_in_domain() < 0)
+ return TEST_FAILED;
+#endif
+
return TEST_SUCCESS;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v4 3/4] doc: add topology grouping details
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 1/4] eal/lcore: add topology based functions Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 2/4] test/lcore: enable tests for topology Vipin Varghese
@ 2024-11-05 10:28 ` Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 4/4] examples: update with lcore topology API Vipin Varghese
` (3 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: Vipin Varghese @ 2024-11-05 10:28 UTC (permalink / raw)
To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk
Cc: pbhagavatula, jerinj, ruifeng.wang, mattias.ronnblom,
anatoly.burakov, stephen, ferruh.yigit, honnappa.nagarahalli,
wathsala.vithanage, konstantin.ananyev, mb
Add a `Topology` section to the EAL initialization documentation.
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
.../prog_guide/env_abstraction_layer.rst | 22 +++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index b9fac1839d..3ff6a17501 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -1188,3 +1188,25 @@ will not be deallocated.
Any successful deallocation event will trigger a callback, for which user
applications and other DPDK subsystems can register.
+
+Topology
+--------
+
+During `rte_eal_init`, an internal topology structure is created to group DPDK-enabled
+lcores into topology domains. Using the hwloc library, lcores are categorized into
+multiple domains based on topology groups such as:
+
+* L1 cache
+* L2 cache
+* L3 cache
+* L4 cache
+* IO
+
+Using the extended `rte_lcore_` API, users can retrieve lcores from these groups by
+topology flag and domain index. Refer to the API documentation for details.
+
+.. note::
+
+   The topology API is validated against hwloc library version `2.7.0`.
+   In the absence of the hwloc library, initialization of the topology objects is skipped.
+   For cross compilation, ensure the appropriate `pkg-config` path is configured.
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v4 4/4] examples: update with lcore topology API
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
` (2 preceding siblings ...)
2024-11-05 10:28 ` [PATCH v4 3/4] doc: add topology grouping details Vipin Varghese
@ 2024-11-05 10:28 ` Vipin Varghese
2025-02-13 3:09 ` [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Varghese, Vipin
` (2 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: Vipin Varghese @ 2024-11-05 10:28 UTC (permalink / raw)
To: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk
Cc: pbhagavatula, jerinj, ruifeng.wang, mattias.ronnblom,
anatoly.burakov, stephen, ferruh.yigit, honnappa.nagarahalli,
wathsala.vithanage, konstantin.ananyev, mb
Enhance example code to use the topology-based lcore API, while
retaining the default behaviour.
- helloworld: allow lcores to send hello to lcores in the selected topology domain.
- l2fwd: allow use of the IO lcore topology.
- skeleton: choose the lcore from the IO domain that has the most ports.
v4 changes:
- cross compilation failure on ARM: Pavan Nikhilesh Bhagavatula
- update helloworld for L4
v3 changes:
- fix typo from SE_NO_TOPOLOGY to USE_NO_TOPOLOGY
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
examples/helloworld/main.c | 154 ++++++++++++++++++++++++++++++++++-
examples/l2fwd/main.c | 56 +++++++++++--
examples/skeleton/basicfwd.c | 22 +++++
3 files changed, 222 insertions(+), 10 deletions(-)
diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
index af509138da..f39db532d9 100644
--- a/examples/helloworld/main.c
+++ b/examples/helloworld/main.c
@@ -5,8 +5,10 @@
#include <stdio.h>
#include <string.h>
#include <stdint.h>
+#include <stdlib.h>
#include <errno.h>
#include <sys/queue.h>
+#include <getopt.h>
#include <rte_memory.h>
#include <rte_launch.h>
@@ -14,6 +16,14 @@
#include <rte_per_lcore.h>
#include <rte_lcore.h>
#include <rte_debug.h>
+#include <rte_log.h>
+
+#define RTE_LOGTYPE_HELLOWORLD RTE_LOGTYPE_USER1
+#define USE_NO_TOPOLOGY 0xffff
+
+static uint16_t topo_sel = USE_NO_TOPOLOGY;
+/* lcore selector based on Topology */
+static const char short_options[] = "T:";
/* Launch a function on lcore. 8< */
static int
@@ -21,11 +31,119 @@ lcore_hello(__rte_unused void *arg)
{
unsigned lcore_id;
lcore_id = rte_lcore_id();
+
printf("hello from core %u\n", lcore_id);
return 0;
}
+
+static int
+send_lcore_hello(__rte_unused void *arg)
+{
+ unsigned int lcore_id;
+ uint16_t send_lcore_id;
+ uint16_t send_count = 0;
+
+ lcore_id = rte_lcore_id();
+
+ send_lcore_id = rte_get_next_lcore_from_domain(lcore_id, false, true, topo_sel);
+
+ while ((send_lcore_id != RTE_MAX_LCORE) && (lcore_id != send_lcore_id)) {
+ printf("hello from lcore %3u to lcore %3u\n",
+ lcore_id, send_lcore_id);
+
+ send_lcore_id = rte_get_next_lcore_from_domain(send_lcore_id,
+ false, true, topo_sel);
+ send_count += 1;
+ }
+
+ if (send_count == 0)
+ printf("lcore %3u has no peer lcores in the (%s) domain!\n",
+ lcore_id,
+ (topo_sel == RTE_LCORE_DOMAIN_L1) ? "L1" :
+ (topo_sel == RTE_LCORE_DOMAIN_L2) ? "L2" :
+ (topo_sel == RTE_LCORE_DOMAIN_L3) ? "L3" :
+ (topo_sel == RTE_LCORE_DOMAIN_L4) ? "L4" : "IO");
+
+ return 0;
+}
/* >8 End of launching function on lcore. */
+/* display usage. 8< */
+static void
+helloworld_usage(const char *prgname)
+{
+ printf("%s [EAL options] -- [-T TOPO]\n"
+ " -T TOPO: choose topology to send hello to cores\n"
+ " - 0: sharing IO\n"
+ " - 1: sharing L1\n"
+ " - 2: sharing L2\n"
+ " - 3: sharing L3\n"
+ " - 4: sharing L4\n\n",
+ prgname);
+}
+
+static unsigned int
+parse_topology(const char *q_arg)
+{
+ char *end = NULL;
+ unsigned long n;
+
+ /* parse the topology option */
+ n = strtoul(q_arg, &end, 10);
+
+ if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0'))
+ return USE_NO_TOPOLOGY;
+
+ if (n > 4)
+ return USE_NO_TOPOLOGY;
+
+ n = (n == 0) ? RTE_LCORE_DOMAIN_IO :
+ (n == 1) ? RTE_LCORE_DOMAIN_L1 :
+ (n == 2) ? RTE_LCORE_DOMAIN_L2 :
+ (n == 3) ? RTE_LCORE_DOMAIN_L3 : RTE_LCORE_DOMAIN_L4;
+
+ return n;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+helloworld_parse_args(int argc, char **argv)
+{
+ int opt, ret;
+ char **argvopt = argv;
+ int option_index;
+ char *prgname = argv[0];
+ while ((opt = getopt_long(argc, argvopt, short_options,
+ NULL, &option_index)) != EOF) {
+ switch (opt) {
+ /* Topology selection */
+ case 'T':
+ topo_sel = parse_topology(optarg);
+ if (topo_sel == USE_NO_TOPOLOGY) {
+ helloworld_usage(prgname);
+ rte_exit(EXIT_FAILURE, "Invalid Topology selection\n");
+ }
+
+ RTE_LOG(DEBUG, HELLOWORLD, "User selected (%s) domain cores!\n",
+ (topo_sel == RTE_LCORE_DOMAIN_L1) ? "L1" :
+ (topo_sel == RTE_LCORE_DOMAIN_L2) ? "L2" :
+ (topo_sel == RTE_LCORE_DOMAIN_L3) ? "L3" :
+ (topo_sel == RTE_LCORE_DOMAIN_L4) ? "L4" : "IO");
+
+ ret = 0;
+ break;
+ default:
+ helloworld_usage(prgname);
+ return -1;
+ }
+ }
+ if (optind >= 0)
+ argv[optind-1] = prgname;
+ ret = optind-1;
+ optind = 1; /* reset getopt lib */
+ return ret;
+}
+
/* Initialization of Environment Abstraction Layer (EAL). 8< */
int
main(int argc, char **argv)
@@ -38,15 +156,47 @@ main(int argc, char **argv)
rte_panic("Cannot init EAL\n");
/* >8 End of initialization of Environment Abstraction Layer */
+ argc -= ret;
+ argv += ret;
+
+ ret = helloworld_parse_args(argc, argv);
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid arguments\n");
+
+ if (topo_sel != USE_NO_TOPOLOGY) {
+ uint16_t domain_count = rte_get_domain_count(topo_sel);
+
+ RTE_LOG(DEBUG, HELLOWORLD, "selected Domain (%s)\n",
+ (topo_sel == RTE_LCORE_DOMAIN_L1) ? "L1" :
+ (topo_sel == RTE_LCORE_DOMAIN_L2) ? "L2" :
+ (topo_sel == RTE_LCORE_DOMAIN_L3) ? "L3" : (topo_sel == RTE_LCORE_DOMAIN_L4) ? "L4" : "IO");
+
+ for (int i = 0; i < domain_count; i++) {
+ uint16_t domain_lcore_count = rte_lcore_count_from_domain(topo_sel, i);
+ uint16_t domain_lcore = rte_get_lcore_in_domain(topo_sel, i, 0);
+
+ if (domain_lcore_count)
+ RTE_LOG(DEBUG, HELLOWORLD, "at index (%u), %u cores, lcore (%u) at index 0\n",
+ i,
+ domain_lcore_count,
+ domain_lcore);
+ }
+ }
+
/* Launches the function on each lcore. 8< */
RTE_LCORE_FOREACH_WORKER(lcore_id) {
/* Simpler equivalent. 8< */
- rte_eal_remote_launch(lcore_hello, NULL, lcore_id);
+ rte_eal_remote_launch((topo_sel == USE_NO_TOPOLOGY) ?
+ lcore_hello : send_lcore_hello, NULL, lcore_id);
/* >8 End of simpler equivalent. */
}
/* call it on main lcore too */
- lcore_hello(NULL);
+ if (topo_sel == USE_NO_TOPOLOGY)
+ lcore_hello(NULL);
+ else
+ send_lcore_hello(NULL);
+
/* >8 End of launching the function on each lcore. */
rte_eal_mp_wait_lcore();
diff --git a/examples/l2fwd/main.c b/examples/l2fwd/main.c
index c6fafdd019..398dd15502 100644
--- a/examples/l2fwd/main.c
+++ b/examples/l2fwd/main.c
@@ -46,6 +46,9 @@ static int mac_updating = 1;
/* Ports set in promiscuous mode off by default. */
static int promiscuous_on;
+/* select lcores based on ports numa (RTE_LCORE_DOMAIN_IO). */
+static bool select_port_from_io_domain;
+
#define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
#define MAX_PKT_BURST 32
@@ -314,6 +317,7 @@ l2fwd_usage(const char *prgname)
" -P : Enable promiscuous mode\n"
" -q NQ: number of queue (=ports) per lcore (default is 1)\n"
" -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)\n"
+ " -t : Enable IO domain lcores mapping to Ports\n"
" --no-mac-updating: Disable MAC addresses updating (enabled by default)\n"
" When enabled:\n"
" - The source MAC address is replaced by the TX port MAC address\n"
@@ -431,6 +435,7 @@ static const char short_options[] =
"P" /* promiscuous */
"q:" /* number of queues */
"T:" /* timer period */
+ "t" /* lcore from port io numa */
;
#define CMD_LINE_OPT_NO_MAC_UPDATING "no-mac-updating"
@@ -502,6 +507,11 @@ l2fwd_parse_args(int argc, char **argv)
timer_period = timer_secs;
break;
+ /* lcores from port io numa */
+ case 't':
+ select_port_from_io_domain = true;
+ break;
+
/* long options */
case CMD_LINE_OPT_PORTMAP_NUM:
ret = l2fwd_parse_port_pair_config(optarg);
@@ -654,7 +664,7 @@ main(int argc, char **argv)
uint16_t nb_ports;
uint16_t nb_ports_available = 0;
uint16_t portid, last_port;
- unsigned lcore_id, rx_lcore_id;
+ uint16_t lcore_id, rx_lcore_id;
unsigned nb_ports_in_mask = 0;
unsigned int nb_lcores = 0;
unsigned int nb_mbufs;
@@ -738,18 +748,48 @@ main(int argc, char **argv)
qconf = NULL;
/* Initialize the port/queue configuration of each logical core */
+ if (rte_get_domain_count(RTE_LCORE_DOMAIN_IO) == 0)
+ rte_exit(EXIT_FAILURE, "no IO NUMA domain with lcores found!\n");
+
+ uint16_t coreindx_io_domain[RTE_MAX_ETHPORTS] = {0};
+ uint16_t lcore_io_domain[RTE_MAX_ETHPORTS] = {RTE_MAX_LCORE};
+ uint16_t l3_domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
+
+ for (int i = 0; i < l3_domain_count; i++)
+ lcore_io_domain[i] = rte_get_lcore_in_domain(RTE_LCORE_DOMAIN_IO, i, 0);
+
RTE_ETH_FOREACH_DEV(portid) {
/* skip ports that are not enabled */
if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
continue;
- /* get the lcore_id for this port */
- while (rte_lcore_is_enabled(rx_lcore_id) == 0 ||
- lcore_queue_conf[rx_lcore_id].n_rx_port ==
- l2fwd_rx_queue_per_lcore) {
- rx_lcore_id++;
- if (rx_lcore_id >= RTE_MAX_LCORE)
- rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ /* get IO NUMA for the port */
+ int port_socket = rte_eth_dev_socket_id(portid);
+
+ if (select_port_from_io_domain == false) {
+ /* get the lcore_id for this port */
+ while ((rte_lcore_is_enabled(rx_lcore_id) == 0) ||
+ (lcore_queue_conf[rx_lcore_id].n_rx_port ==
+ l2fwd_rx_queue_per_lcore)) {
+ rx_lcore_id++;
+ if (rx_lcore_id >= RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "Not enough cores\n");
+ }
+ } else {
+ /* get lcore from IO numa for this port */
+ rx_lcore_id = lcore_io_domain[port_socket];
+
+ if (lcore_queue_conf[rx_lcore_id].n_rx_port == l2fwd_rx_queue_per_lcore) {
+ coreindx_io_domain[port_socket] += 1;
+ rx_lcore_id = rte_get_lcore_in_domain(RTE_LCORE_DOMAIN_IO,
+ port_socket, coreindx_io_domain[port_socket]);
+ }
+
+ if (rx_lcore_id == RTE_MAX_LCORE)
+ rte_exit(EXIT_FAILURE, "unable to find IO (%u) NUMA lcore for port (%u)\n",
+ port_socket, portid);
+
+ lcore_io_domain[port_socket] = rx_lcore_id;
}
if (qconf != &lcore_queue_conf[rx_lcore_id]) {
diff --git a/examples/skeleton/basicfwd.c b/examples/skeleton/basicfwd.c
index 133293cf15..65faf46e16 100644
--- a/examples/skeleton/basicfwd.c
+++ b/examples/skeleton/basicfwd.c
@@ -176,6 +176,11 @@ main(int argc, char *argv[])
unsigned nb_ports;
uint16_t portid;
+ uint16_t ports_socket_domain[RTE_MAX_ETHPORTS] = {0};
+ uint16_t sel_io_socket = 0;
+ uint16_t sel_io_indx = 0;
+ uint16_t core_count_from_io = 0;
+
/* Initializion the Environment Abstraction Layer (EAL). 8< */
int ret = rte_eal_init(argc, argv);
if (ret < 0)
@@ -190,6 +195,20 @@ main(int argc, char *argv[])
if (nb_ports < 2 || (nb_ports & 1))
rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");
+ /* get the socket of each port */
+ RTE_ETH_FOREACH_DEV(portid) {
+ ports_socket_domain[rte_eth_dev_socket_id(portid)] += 1;
+
+ if (ports_socket_domain[rte_eth_dev_socket_id(portid)] > sel_io_socket) {
+ sel_io_socket = ports_socket_domain[rte_eth_dev_socket_id(portid)];
+ sel_io_indx = rte_eth_dev_socket_id(portid);
+ }
+ }
+
+ core_count_from_io = rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, sel_io_indx);
+ if (core_count_from_io == 0)
+ printf("\nWARNING: no lcores available in IO domain (%u)\n", sel_io_indx);
+
/* Creates a new mempool in memory to hold the mbufs. */
/* Allocates mempool to hold the mbufs. 8< */
@@ -210,6 +229,9 @@ main(int argc, char *argv[])
if (rte_lcore_count() > 1)
printf("\nWARNING: Too many lcores enabled. Only 1 used.\n");
+ if (rte_lcore_is_main_in_domain(RTE_LCORE_DOMAIN_IO, sel_io_indx) == false)
+ printf("\nWARNING: please use lcore from IO domain %u.\n", sel_io_indx);
+
/* Call lcore_main on the main core only. Called on single lcore. 8< */
lcore_main();
/* >8 End of called on single lcore. */
--
2.34.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
` (3 preceding siblings ...)
2024-11-05 10:28 ` [PATCH v4 4/4] examples: update with lcore topology API Vipin Varghese
@ 2025-02-13 3:09 ` Varghese, Vipin
2025-02-13 8:34 ` Thomas Monjalon
2025-03-17 13:46 ` Jan Viktorin
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
6 siblings, 1 reply; 30+ messages in thread
From: Varghese, Vipin @ 2025-02-13 3:09 UTC (permalink / raw)
To: Varghese, Vipin, dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com, Thomas Monjalon,
Ajit Khaparde (ajit.khaparde@broadcom.com), Song, Keesang
Cc: pbhagavatula@marvell.com, jerinj@marvell.com,
ruifeng.wang@arm.com, mattias.ronnblom@ericsson.com,
anatoly.burakov@intel.com, stephen@networkplumber.org,
Yigit, Ferruh, honnappa.nagarahalli@arm.com,
wathsala.vithanage@arm.com, konstantin.ananyev@huawei.com,
mb@smartsharesystems.com
[AMD Official Use Only - AMD Internal Distribution Only]
Adding Thomas and Ajit to the loop.
Hi Ajit, we have been using this patch series to identify the topology and obtain the L3 cache id, which is used to populate the steering tag for the Device Specific Model and the MSI-X driven AF_XDP path with the experimental STAG firmware on Thor.
Hence the current use of the topology library helps in
1. workload placement within the same cache or IO domain
2. populating the id for MSI-X or Device Specific Model steering tags.
Thomas and Ajit, can we get some help to get this into mainline too?
> -----Original Message-----
> From: Vipin Varghese <vipin.varghese@amd.com>
> Sent: Tuesday, November 5, 2024 3:59 PM
> To: dev@dpdk.org; roretzla@linux.microsoft.com; bruce.richardson@intel.com;
> john.mcnamara@intel.com; dmitry.kozliuk@gmail.com
> Cc: pbhagavatula@marvell.com; jerinj@marvell.com; ruifeng.wang@arm.com;
> mattias.ronnblom@ericsson.com; anatoly.burakov@intel.com;
> stephen@networkplumber.org; Yigit, Ferruh <Ferruh.Yigit@amd.com>;
> honnappa.nagarahalli@arm.com; wathsala.vithanage@arm.com;
> konstantin.ananyev@huawei.com; mb@smartsharesystems.com
> Subject: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
>
>
>
> This patch introduces improvements for NUMA topology awareness in relation to
> DPDK logical cores. The goal is to expose API which allows users to select optimal
> logical cores for any application. These logical cores can be selected from various
> NUMA domains like CPU and I/O.
>
> Change Summary:
> - Introduces the concept of NUMA domain partitioning based on CPU and
> I/O topology.
> - Adds support for grouping DPDK logical cores within the same Cache
> and I/O domain for improved locality.
> - Implements topology detection and core grouping logic that
> distinguishes between the following NUMA configurations:
> * CPU topology & I/O topology (e.g., AMD SoC EPYC, Intel Xeon SPR)
> * CPU+I/O topology (e.g., Ampere One with SLC, Intel Xeon SPR with SNC)
> - Enhances performance by minimizing lcore dispersion across tiles|compute
> package with different L2/L3 cache or IO domains.
>
> Reason:
> - Applications using DPDK libraries rely on consistent memory access.
> - Lcores being closer to same NUMA domain as IO.
> - Lcores sharing same cache.
>
> Latency is minimized by using lcores that share the same NUMA topology.
> Memory access is optimized by utilizing cores within the same NUMA domain or
> tile. Cache coherence is preserved within the same shared cache domain, reducing
> the remote access from tile|compute package via snooping (local hit in either L2 or
> L3 within same NUMA domain).
>
> Library dependency: hwloc
>
> Topology Flags:
> ---------------
> - RTE_LCORE_DOMAIN_L1: to group cores sharing same L1 cache
> - RTE_LCORE_DOMAIN_SMT: same as RTE_LCORE_DOMAIN_L1
> - RTE_LCORE_DOMAIN_L2: group cores sharing same L2 cache
> - RTE_LCORE_DOMAIN_L3: group cores sharing same L3 cache
> - RTE_LCORE_DOMAIN_L4: group cores sharing same L4 cache
> - RTE_LCORE_DOMAIN_IO: group cores sharing same IO
>
> < Function: Purpose >
> ---------------------
> - rte_get_domain_count: get domain count based on Topology Flag
> - rte_lcore_count_from_domain: get valid lcores count under each domain
> - rte_get_lcore_in_domain: valid lcore id based on index
> - rte_lcore_cpuset_in_domain: return valid cpuset based on index
> - rte_lcore_is_main_in_domain: return true|false if main lcore is present
> - rte_get_next_lcore_from_domain: next valid lcore within domain
> - rte_get_next_lcore_from_next_domain: next valid lcore from next domain
>
> Note:
> 1. Topology is NUMA grouping.
> 2. Domain is various sub-groups within a specific Topology.
>
> Topology example: L1, L2, L3, L4, IO
> Domain example: IO-A, IO-B
>
> < MACRO: Purpose >
> ------------------
> - RTE_LCORE_FOREACH_DOMAIN: iterate lcores from all domains
> - RTE_LCORE_FOREACH_WORKER_DOMAIN: iterate worker lcores from all
> domains
> - RTE_LCORE_FORN_NEXT_DOMAIN: iterate domain select n'th lcore
> - RTE_LCORE_FORN_WORKER_NEXT_DOMAIN: iterate domain for worker n'th
> lcore.
>
> Future work (after merge):
> --------------------------
> - dma-perf per IO NUMA
> - eventdev per L3 NUMA
> - pipeline per SMT|L3 NUMA
> - distributor per L3 for Port-Queue
> - l2fwd-power per SMT
> - testpmd option for IO NUMA per port
>
> Platform tested on:
> -------------------
> - INTEL(R) XEON(R) PLATINUM 8562Y+ (support IO numa 1 & 2)
> - AMD EPYC 8534P (supports IO numa 1 & 2)
> - AMD EPYC 9554 (supports IO numa 1, 2, 4)
>
> Logs:
> -----
> 1. INTEL(R) XEON(R) PLATINUM 8562Y+:
> - SNC=1
> Domain (IO): at index (0) there are 48 core, with (0) at index 0
> - SNC=2
> Domain (IO): at index (0) there are 24 core, with (0) at index 0
> Domain (IO): at index (1) there are 24 core, with (12) at index 0
>
> 2. AMD EPYC 8534P:
> - NPS=1:
> Domain (IO): at index (0) there are 128 core, with (0) at index 0
> - NPS=2:
> Domain (IO): at index (0) there are 64 core, with (0) at index 0
> Domain (IO): at index (1) there are 64 core, with (32) at index 0
>
> Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
>
> Vipin Varghese (4):
> eal/lcore: add topology based functions
> test/lcore: enable tests for topology
> doc: add topology grouping details
> examples: update with lcore topology API
>
> app/test/test_lcores.c | 528 +++++++++++++
> config/meson.build | 18 +
> .../prog_guide/env_abstraction_layer.rst | 22 +
> examples/helloworld/main.c | 154 +++-
> examples/l2fwd/main.c | 56 +-
> examples/skeleton/basicfwd.c | 22 +
> lib/eal/common/eal_common_lcore.c | 714 ++++++++++++++++++
> lib/eal/common/eal_private.h | 58 ++
> lib/eal/freebsd/eal.c | 10 +
> lib/eal/include/rte_lcore.h | 209 +++++
> lib/eal/linux/eal.c | 11 +
> lib/eal/meson.build | 4 +
> lib/eal/version.map | 11 +
> lib/eal/windows/eal.c | 12 +
> 14 files changed, 1819 insertions(+), 10 deletions(-)
>
> --
> 2.34.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-02-13 3:09 ` [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Varghese, Vipin
@ 2025-02-13 8:34 ` Thomas Monjalon
2025-02-13 9:18 ` Morten Brørup
2025-03-03 8:59 ` Varghese, Vipin
0 siblings, 2 replies; 30+ messages in thread
From: Thomas Monjalon @ 2025-02-13 8:34 UTC (permalink / raw)
To: Varghese, Vipin
Cc: dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com,
Ajit Khaparde (ajit.khaparde@broadcom.com), Song, Keesang,
pbhagavatula@marvell.com, jerinj@marvell.com,
ruifeng.wang@arm.com, mattias.ronnblom@ericsson.com,
anatoly.burakov@intel.com, stephen@networkplumber.org,
Yigit, Ferruh, honnappa.nagarahalli@arm.com,
wathsala.vithanage@arm.com, konstantin.ananyev@huawei.com,
mb@smartsharesystems.com
13/02/2025 04:09, Varghese, Vipin:
>
> Adding Thomas and Ajit to the loop.
>
> Hi Ajit, we have been using the patch series for identifying the topology and getting l3 cache id for populating the steering tag for Device Specific Model & MSI-x driven af-xdp for the experimental STAG firmware on Thor.
>
> Hence current use of topology library helps in
> 1. workload placement in same Cache or IO domain
> 2. populating id for MSIx or Device specific model for steering tags.
>
> Thomas and Ajith can we get some help to get this mainline too?
Yes, sorry the review discussions did not start.
It has been forgotten.
You could rebase a v2 to make it more visible.
Minor note: the changelog should be after --- in the commit log.
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-02-13 8:34 ` Thomas Monjalon
@ 2025-02-13 9:18 ` Morten Brørup
2025-03-03 9:06 ` Varghese, Vipin
2025-03-03 8:59 ` Varghese, Vipin
1 sibling, 1 reply; 30+ messages in thread
From: Morten Brørup @ 2025-02-13 9:18 UTC (permalink / raw)
To: Varghese, Vipin, Thomas Monjalon
Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
ajit.khaparde, Song, Keesang, pbhagavatula, jerinj, ruifeng.wang,
mattias.ronnblom, anatoly.burakov, stephen, Yigit, Ferruh,
honnappa.nagarahalli, wathsala.vithanage, konstantin.ananyev
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, 13 February 2025 09.34
>
> 13/02/2025 04:09, Varghese, Vipin:
> >
> > Adding Thomas and Ajit to the loop.
> >
> > Hi Ajit, we have been using the patch series for identifying the
> topology and getting l3 cache id for populating the steering tag for
> Device Specific Model & MSI-x driven af-xdp for the experimental STAG
> firmware on Thor.
Excellent. A real life example use case helps the review process a lot!
> >
> > Hence current use of topology library helps in
> > 1. workload placement in same Cache or IO domain
> > 2. populating id for MSIx or Device specific model for steering tags.
> >
> > Thomas and Ajith can we get some help to get this mainline too?
>
> Yes, sorry the review discussions did not start.
> It has been forgotten.
I think the topology/domain API in the EAL should be co-designed with the steering tag API in the ethdev library, so the design can be reviewed/discussed in its entirety.
To help the review discussion, please consider describing the following:
Which APIs are for slow path, and which are for fast path?
Which APIs are "must have", i.e. core to making it work at all, and which APIs are "nice to have", i.e. support APIs to ease use of the new features?
I haven't looked at the hwloc library's API; but I guess these new EAL functions are closely related. Is it a thin wrapper around the hwloc library, or is it very different?
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-02-13 8:34 ` Thomas Monjalon
2025-02-13 9:18 ` Morten Brørup
@ 2025-03-03 8:59 ` Varghese, Vipin
1 sibling, 0 replies; 30+ messages in thread
From: Varghese, Vipin @ 2025-03-03 8:59 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com,
Ajit Khaparde (ajit.khaparde@broadcom.com), Song, Keesang,
pbhagavatula@marvell.com, jerinj@marvell.com,
ruifeng.wang@arm.com, mattias.ronnblom@ericsson.com,
anatoly.burakov@intel.com, stephen@networkplumber.org,
Yigit, Ferruh, honnappa.nagarahalli@arm.com,
wathsala.vithanage@arm.com, konstantin.ananyev@huawei.com,
mb@smartsharesystems.com
[Public]
Hi Thomas
snipped
> >
> > Thomas and Ajith can we get some help to get this mainline too?
>
> Yes, sorry the review discussions did not start.
> It has been forgotten.
>
> You could rebase a v2 to make it more visible.
Sure, will do this week.
>
> Minor note: the changelog should be after --- in the commit log.
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-02-13 9:18 ` Morten Brørup
@ 2025-03-03 9:06 ` Varghese, Vipin
2025-03-04 10:08 ` Morten Brørup
0 siblings, 1 reply; 30+ messages in thread
From: Varghese, Vipin @ 2025-03-03 9:06 UTC (permalink / raw)
To: Morten Brørup, Thomas Monjalon
Cc: dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com, ajit.khaparde@broadcom.com,
Song, Keesang, pbhagavatula@marvell.com, jerinj@marvell.com,
ruifeng.wang@arm.com, mattias.ronnblom@ericsson.com,
anatoly.burakov@intel.com, stephen@networkplumber.org,
Yigit, Ferruh, honnappa.nagarahalli@arm.com,
wathsala.vithanage@arm.com, konstantin.ananyev@huawei.com
[Public]
Hi Morten,
snipped
>
>
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Thursday, 13 February 2025 09.34
> >
> > 13/02/2025 04:09, Varghese, Vipin:
> > >
> > > Adding Thomas and Ajit to the loop.
> > >
> > > Hi Ajit, we have been using the patch series for identifying the
> > topology and getting l3 cache id for populating the steering tag for
> > Device Specific Model & MSI-x driven af-xdp for the experimental STAG
> > firmware on Thor.
>
> Excellent. A real life example use case helps the review process a lot!
The steering tag is one of the examples or uses; as shared in the current patch series, we make use of the topology API for other examples too.
Eventdev, pkt-distributor and graph nodes are also in the works to exploit L2|L3 cache locality.
>
> > >
> > > Hence current use of topology library helps in 1. workload placement
> > > in same Cache or IO domain 2. populating id for MSIx or Device
> > > specific model for steering tags.
> > >
> > > Thomas and Ajith can we get some help to get this mainline too?
> >
> > Yes, sorry the review discussions did not start.
> > It has been forgotten.
>
> I think the topology/domain API in the EAL should be co-designed with the steering
> tag API in the ethdev library, so the design can be reviewed/discussed in its entirety.
As shared in the discussion, we have been exploring a simplified approach for steering tags, namely
1. pci-dev args (crude way)
2. flow api for RX (experimental way)
Based on the platform (in case of AMD EPYC, these are translated to `L3 id + 1`)
We do agree the rte_ethdev library can use the topology API. The current topology APIs are designed to be independent of steering tags, as other examples make use of them as well.
>
> To help the review discussion, please consider describing the following:
> Which APIs are for slow path, and which are for fast path?
> Which APIs are "must have", i.e. core to making it work at all, and which APIs are
> "nice to have", i.e. support APIs to ease use of the new features?
Yes, will try to do the same in the updated version. For the slow-path and fast-path split I might need some help, as I was under the impression the current behavior matches rte_lcore (invoked at setup, before remote launch). But will check again.
>
> I haven't looked at the hwloc library's API; but I guess these new EAL functions are
> closely related. Is it a thin wrapper around the hwloc library, or is it very different?
This is a very thin wrapper on top of the hwloc library, but with DPDK RTE_MAX_LCORE & RTE_NUMA boundary checks and population.
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-03-03 9:06 ` Varghese, Vipin
@ 2025-03-04 10:08 ` Morten Brørup
2025-03-05 7:43 ` Mattias Rönnblom
0 siblings, 1 reply; 30+ messages in thread
From: Morten Brørup @ 2025-03-04 10:08 UTC (permalink / raw)
To: Varghese, Vipin, Thomas Monjalon
Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
ajit.khaparde, Song, Keesang, pbhagavatula, jerinj, ruifeng.wang,
mattias.ronnblom, anatoly.burakov, stephen, Yigit, Ferruh,
honnappa.nagarahalli, wathsala.vithanage, konstantin.ananyev
> From: Varghese, Vipin [mailto:Vipin.Varghese@amd.com]
> Sent: Monday, 3 March 2025 10.06
>
>
> Hi Morten,
>
> snipped
>
> >
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > Sent: Thursday, 13 February 2025 09.34
> > >
> > > 13/02/2025 04:09, Varghese, Vipin:
> > > >
> > > > Adding Thomas and Ajit to the loop.
> > > >
> > > > Hi Ajit, we have been using the patch series for identifying the
> > > topology and getting l3 cache id for populating the steering tag
> for
> > > Device Specific Model & MSI-x driven af-xdp for the experimental
> STAG
> > > firmware on Thor.
> >
> > Excellent. A real life example use case helps the review process a
> lot!
>
> Steering tag is one of the examples or uses, as shared in the current
> patch series we make use of these for other examples too.
> Eventdev, pkt-distributor and graph nodes are also in works to exploit
> L2|L3 cache local coherency too.
>
> >
> > > >
> > > > Hence current use of topology library helps in 1. workload
> placement
> > > > in same Cache or IO domain 2. populating id for MSIx or Device
> > > > specific model for steering tags.
> > > >
> > > > Thomas and Ajith can we get some help to get this mainline too?
> > >
> > > Yes, sorry the review discussions did not start.
> > > It has been forgotten.
> >
> > I think the topology/domain API in the EAL should be co-designed with
> the steering
> > tag API in the ethdev library, so the design can be
> reviewed/discussed in its entirety.
>
> As shared in the discussion, we have been exploring simplified approach
> for steering tags, namely
>
> 1. pci-dev args (crude way)
> 2. flow api for RX (experimental way)
>
> Based on the platform (in case of AMD EPYC, these are translated to `L3
> id + 1`)
>
> We do agree rte_ethdev library can use topology API. Current topology
> API are designed to be made independent from steering tags, as other
> examples do make use of the same.
>
> >
> > To help the review discussion, please consider describing the
> following:
> > Which APIs are for slow path, and which are for fast path?
> > Which APIs are "must have", i.e. core to making it work at all, and
> which APIs are
> > "nice to have", i.e. support APIs to ease use of the new features?
>
> Yes, will try to do the same in updated version. For Slow and Fast path
> API I might need some help, as I was under the impression current
> behavior is same rte_lcore (invoked at setup and before remote launch).
> But will check again.
Probably they are all used for configuration only, and thus all slow path; but if there are any fast path APIs, they should be highlighted as such.
>
> >
> > I haven't looked at the hwloc library's API; but I guess these new
> EAL functions are
> > closely related. Is it a thin wrapper around the hwloc library, or is
> it very different?
> This is very thin wrapper on top of hwloc library only. But with DPDK
> RTE_MAX_LCORE & RTE_NUMA boundary check and population.
OK. The hwloc library is portable across Linux, BSD and Windows, which is great!
Please also describe the benefits of using this DPDK library, compared to directly using the hwloc library.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-03-04 10:08 ` Morten Brørup
@ 2025-03-05 7:43 ` Mattias Rönnblom
0 siblings, 0 replies; 30+ messages in thread
From: Mattias Rönnblom @ 2025-03-05 7:43 UTC (permalink / raw)
To: Morten Brørup, Varghese, Vipin, Thomas Monjalon
Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
ajit.khaparde, Song, Keesang, pbhagavatula, jerinj, ruifeng.wang,
mattias.ronnblom, anatoly.burakov, stephen, Yigit, Ferruh,
honnappa.nagarahalli, wathsala.vithanage, konstantin.ananyev
On 2025-03-04 11:08, Morten Brørup wrote:
>> From: Varghese, Vipin [mailto:Vipin.Varghese@amd.com]
>> Sent: Monday, 3 March 2025 10.06
>>
>>
>> Hi Morten,
>>
>> snipped
>>
>>>
>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>>> Sent: Thursday, 13 February 2025 09.34
>>>>
>>>> 13/02/2025 04:09, Varghese, Vipin:
>>>>>
>>>>> Adding Thomas and Ajit to the loop.
>>>>>
>>>>> Hi Ajit, we have been using the patch series for identifying the
>>>> topology and getting l3 cache id for populating the steering tag
>> for
>>>> Device Specific Model & MSI-x driven af-xdp for the experimental
>> STAG
>>>> firmware on Thor.
>>>
>>> Excellent. A real life example use case helps the review process a
>> lot!
>>
>> Steering tag is one of the examples or uses, as shared in the current
>> patch series we make use of these for other examples too.
>> Eventdev, pkt-distributor and graph nodes are also in works to exploit
>> L2|L3 cache local coherency too.
>>
>>>
>>>>>
>>>>> Hence current use of topology library helps in 1. workload
>> placement
>>>>> in same Cache or IO domain 2. populating id for MSIx or Device
>>>>> specific model for steering tags.
>>>>>
>>>>> Thomas and Ajith can we get some help to get this mainline too?
>>>>
>>>> Yes, sorry the review discussions did not start.
>>>> It has been forgotten.
>>>
>>> I think the topology/domain API in the EAL should be co-designed with
>> the steering
>>> tag API in the ethdev library, so the design can be
>> reviewed/discussed in its entirety.
>>
>> As shared in the discussion, we have been exploring simplified approach
>> for steering tags, namely
>>
>> 1. pci-dev args (crude way)
>> 2. flow api for RX (experimental way)
>>
>> Based on the platform (in case of AMD EPYC, these are translated to `L3
>> id + 1`)
>>
>> We do agree rte_ethdev library can use topology API. Current topology
>> API are designed to be made independent from steering tags, as other
>> examples do make use of the same.
>>
>>>
>>> To help the review discussion, please consider describing the
>> following:
>>> Which APIs are for slow path, and which are for fast path?
>>> Which APIs are "must have", i.e. core to making it work at all, and
>> which APIs are
>>> "nice to have", i.e. support APIs to ease use of the new features?
>>
>> Yes, I will try to do the same in the updated version. For the slow
>> and fast path split I might need some help, as I was under the
>> impression the current behavior matches rte_lcore (invoked at setup
>> and before remote launch). But I will check again.
>
> Probably they are all used for configuration only, and thus all slow path; but if there are any fast path APIs, they should be highlighted as such.
>
Preferably, software work schedulers like DSW should be able to read
topology information during run-time/steady-state operation. If topology
APIs are slow or non-MT-safe, they will need to build up their own data
structures for such information (which is not a crazy idea, but leads to
duplication).
I didn't follow the hwloc discussions, so I may lack some context for
this discussion.
>>
>>>
>>> I haven't looked at the hwloc library's API; but I guess these new
>>> EAL functions are closely related. Is it a thin wrapper around the
>>> hwloc library, or is it very different?
>> This is a very thin wrapper on top of the hwloc library, but with
>> DPDK RTE_MAX_LCORE & RTE_NUMA boundary checks and population.
>
> OK. The hwloc library is portable across Linux, BSD and Windows, which is great!
>
> Please also describe the benefits of using this DPDK library, compared to directly using the hwloc library.
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
` (4 preceding siblings ...)
2025-02-13 3:09 ` [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Varghese, Vipin
@ 2025-03-17 13:46 ` Jan Viktorin
2025-04-09 10:08 ` Varghese, Vipin
2026-01-17 18:57 ` Stephen Hemminger
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
6 siblings, 2 replies; 30+ messages in thread
From: Jan Viktorin @ 2025-03-17 13:46 UTC (permalink / raw)
To: Vipin Varghese
Cc: dev, roretzla, bruce.richardson, john.mcnamara, dmitry.kozliuk,
pbhagavatula, jerinj, ruifeng.wang, mattias.ronnblom,
anatoly.burakov, stephen, ferruh.yigit, honnappa.nagarahalli,
wathsala.vithanage, konstantin.ananyev, mb
Hello Vipin and others,
please, will there be any progress or update on this series?
I successfully tested those changes on our Intel and AMD machines and
would like to use it in production soon.
The API is a little bit unintuitive, at least for me, but I
successfully integrated it into our software.
I am missing a clear relation to the NUMA socket approach used in DPDK.
E.g. I would like to be able to easily walk over a list of lcores from
a specific NUMA node grouped by L3 domain. Yes, there is the
RTE_LCORE_DOMAIN_IO, but would it always match the appropriate socket
IDs?
Also, I do not clearly understand what is the purpose of using domain
selector like:
RTE_LCORE_DOMAIN_L1 | RTE_LCORE_DOMAIN_L2
or even:
RTE_LCORE_DOMAIN_L3 | RTE_LCORE_DOMAIN_L2
the documentation does not explain this. I could not spot any kind of
grouping that would help me in any way. Some "best practices" examples
would be nice to have to understand the intentions better.
I found a little catch when running DPDK with more lcores than there
are physical or SMT CPU cores. This happens when using e.g. an option
like --lcores=(0-15)@(0-1). The results from the topology API would not
match the lcores because hwloc is not aware of the lcores concept. This
might be mentioned somewhere.
Anyway, I really appreciate this work and would like to see it upstream.
Especially for AMD machines, some framework like this is a must.
Kind regards,
Jan
On Tue, 5 Nov 2024 15:58:45 +0530
Vipin Varghese <vipin.varghese@amd.com> wrote:
> This patch introduces improvements for NUMA topology awareness in
> relation to DPDK logical cores. The goal is to expose API which allows
> users to select optimal logical cores for any application. These
> logical cores can be selected from various NUMA domains like CPU and
> I/O.
>
> Change Summary:
> - Introduces the concept of NUMA domain partitioning based on CPU and
> I/O topology.
> - Adds support for grouping DPDK logical cores within the same Cache
> and I/O domain for improved locality.
> - Implements topology detection and core grouping logic that
> distinguishes between the following NUMA configurations:
> * CPU topology & I/O topology (e.g., AMD SoC EPYC, Intel Xeon SPR)
> * CPU+I/O topology (e.g., Ampere One with SLC, Intel Xeon SPR
> with SNC)
> - Enhances performance by minimizing lcore dispersion across
> tiles|compute package with different L2/L3 cache or IO domains.
>
> Reason:
> - Applications using DPDK libraries relies on consistent memory
> access.
> - Lcores being closer to same NUMA domain as IO.
> - Lcores sharing same cache.
>
> Latency is minimized by using lcores that share the same NUMA
> topology. Memory access is optimized by utilizing cores within the
> same NUMA domain or tile. Cache coherence is preserved within the
> same shared cache domain, reducing the remote access from
> tile|compute package via snooping (local hit in either L2 or L3
> within same NUMA domain).
>
> Library dependency: hwloc
>
> Topology Flags:
> ---------------
> - RTE_LCORE_DOMAIN_L1: to group cores sharing same L1 cache
> - RTE_LCORE_DOMAIN_SMT: same as RTE_LCORE_DOMAIN_L1
> - RTE_LCORE_DOMAIN_L2: group cores sharing same L2 cache
> - RTE_LCORE_DOMAIN_L3: group cores sharing same L3 cache
> - RTE_LCORE_DOMAIN_L4: group cores sharing same L4 cache
> - RTE_LCORE_DOMAIN_IO: group cores sharing same IO
>
> < Function: Purpose >
> ---------------------
> - rte_get_domain_count: get domain count based on Topology Flag
> - rte_lcore_count_from_domain: get valid lcores count under each
> domain
> - rte_get_lcore_in_domain: valid lcore id based on index
> - rte_lcore_cpuset_in_domain: return valid cpuset based on index
> - rte_lcore_is_main_in_domain: return true|false if main lcore is
> present
> - rte_get_next_lcore_from_domain: next valid lcore within domain
> - rte_get_next_lcore_from_next_domain: next valid lcore from next
> domain
>
> Note:
> 1. Topology is NUMA grouping.
> 2. Domain is various sub-groups within a specific Topology.
>
> Topology example: L1, L2, L3, L4, IO
> Domain example: IO-A, IO-B
>
> < MACRO: Purpose >
> ------------------
> - RTE_LCORE_FOREACH_DOMAIN: iterate lcores from all domains
> - RTE_LCORE_FOREACH_WORKER_DOMAIN: iterate worker lcores from all
> domains
> - RTE_LCORE_FORN_NEXT_DOMAIN: iterate domain select n'th lcore
> - RTE_LCORE_FORN_WORKER_NEXT_DOMAIN: iterate domain for worker n'th
> lcore.
>
> Future work (after merge):
> --------------------------
> - dma-perf per IO NUMA
> - eventdev per L3 NUMA
> - pipeline per SMT|L3 NUMA
> - distributor per L3 for Port-Queue
> - l2fwd-power per SMT
> - testpmd option for IO NUMA per port
>
> Platform tested on:
> -------------------
> - INTEL(R) XEON(R) PLATINUM 8562Y+ (support IO numa 1 & 2)
> - AMD EPYC 8534P (supports IO numa 1 & 2)
> - AMD EPYC 9554 (supports IO numa 1, 2, 4)
>
> Logs:
> -----
> 1. INTEL(R) XEON(R) PLATINUM 8562Y+:
> - SNC=1
> Domain (IO): at index (0) there are 48 core, with (0) at
> index 0
> - SNC=2
> Domain (IO): at index (0) there are 24 core, with (0) at index 0
> Domain (IO): at index (1) there are 24 core, with (12) at index 0
>
> 2. AMD EPYC 8534P:
> - NPS=1:
> Domain (IO): at index (0) there are 128 core, with (0) at
> index 0
> - NPS=2:
> Domain (IO): at index (0) there are 64 core, with (0) at index 0
> Domain (IO): at index (1) there are 64 core, with (32) at index 0
>
> Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
>
> Vipin Varghese (4):
> eal/lcore: add topology based functions
> test/lcore: enable tests for topology
> doc: add topology grouping details
> examples: update with lcore topology API
>
> app/test/test_lcores.c | 528 +++++++++++++
> config/meson.build | 18 +
> .../prog_guide/env_abstraction_layer.rst | 22 +
> examples/helloworld/main.c | 154 +++-
> examples/l2fwd/main.c | 56 +-
> examples/skeleton/basicfwd.c | 22 +
> lib/eal/common/eal_common_lcore.c | 714 ++++++++++++++++++
> lib/eal/common/eal_private.h | 58 ++
> lib/eal/freebsd/eal.c | 10 +
> lib/eal/include/rte_lcore.h | 209 +++++
> lib/eal/linux/eal.c | 11 +
> lib/eal/meson.build | 4 +
> lib/eal/version.map | 11 +
> lib/eal/windows/eal.c | 12 +
> 14 files changed, 1819 insertions(+), 10 deletions(-)
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-03-17 13:46 ` Jan Viktorin
@ 2025-04-09 10:08 ` Varghese, Vipin
2025-06-03 6:03 ` Varghese, Vipin
2026-01-17 18:57 ` Stephen Hemminger
1 sibling, 1 reply; 30+ messages in thread
From: Varghese, Vipin @ 2025-04-09 10:08 UTC (permalink / raw)
To: Jan Viktorin
Cc: dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com, pbhagavatula@marvell.com,
jerinj@marvell.com, ruifeng.wang@arm.com,
mattias.ronnblom@ericsson.com, anatoly.burakov@intel.com,
stephen@networkplumber.org, Yigit, Ferruh,
honnappa.nagarahalli@arm.com, wathsala.vithanage@arm.com,
konstantin.ananyev@huawei.com, mb@smartsharesystems.com
[AMD Official Use Only - AMD Internal Distribution Only]
Snipped
>
> Hello Vipin and others,
>
> please, will there be any progress or update on this series?
Apologies, we shared a small update in Slack and missed it here. Let me try to address your questions below.
>
> I successfully tested those changes on our Intel and AMD machines and would like
> to use it in production soon.
>
> The API is a little bit unintuitive, at least for me, but I successfully integrated into our
> software.
>
> I am missing a clear relation to the NUMA socket approach used in DPDK.
> E.g. I would like to be able to easily walk over a list of lcores from a specific NUMA
> node grouped by L3 domain. Yes, there is the RTE_LCORE_DOMAIN_IO, but would
> it always match the appropriate socket IDs?
Yes, we at AMD were internally debating the same. But since the lcore API already has `rte_lcore_to_socket_id`, adding yet another variation or argument lacked luster.
Our internal reasoning was: when using the new API, why not simply check whether the lcore is in the desired physical socket or sub-socket NUMA domain?
Hence, we did not add the option.
>
> Also, I do not clearly understand what is the purpose of using domain selector like:
>
> RTE_LCORE_DOMAIN_L1 | RTE_LCORE_DOMAIN_L2
>
> or even:
>
> RTE_LCORE_DOMAIN_L3 | RTE_LCORE_DOMAIN_L2
I believe the documentation mentions choosing only one; if multiple flags are combined, the code flow picks up only one of them.
The real use of these flags is to select physical cores under the same cache or IO domain.
Example: a certain SoC has 4 cores sharing an L2, which makes pipeline processing more convenient (less data movement). In such cases, select lcores within the same L2 topology.
>
> the documentation does not explain this. I could not spot any kind of grouping that
> would help me in any way. Some "best practices" examples would be nice to have to
> understand the intentions better.
From https://patches.dpdk.org/project/dpdk/cover/20241105102849.1947-1-vipin.varghese@amd.com/
```
Reason:
- Applications using DPDK libraries relies on consistent memory access.
- Lcores being closer to same NUMA domain as IO.
- Lcores sharing same cache.
Latency is minimized by using lcores that share the same NUMA topology.
Memory access is optimized by utilizing cores within the same NUMA
domain or tile. Cache coherence is preserved within the same shared cache
domain, reducing the remote access from tile|compute package via snooping
(local hit in either L2 or L3 within same NUMA domain).
```
>
> I found a little catch when running DPDK with more lcores than there are physical or
> SMT CPU cores. This happens when using e.g. an option like --lcores=(0-15)@(0-1).
> The results from the topology API would not match the lcores because hwloc is not
> aware of the lcores concept. This might be mentioned somewhere.
Yes, this is expected, as one can map any CPU cores to DPDK lcores with `lcore-map`.
We did mention this in RFCv4, but when upgrading to RFCv5 we missed carrying the note forward.
>
> Anyway, I really appreciate this work and would like to see it upstream.
> Especially for AMD machines, some framework like this is a must.
>
> Kind regards,
> Jan
>
We are planning to remove the RFC tag and share the final version for the upcoming DPDK release shortly.
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-04-09 10:08 ` Varghese, Vipin
@ 2025-06-03 6:03 ` Varghese, Vipin
0 siblings, 0 replies; 30+ messages in thread
From: Varghese, Vipin @ 2025-06-03 6:03 UTC (permalink / raw)
To: Varghese, Vipin, Jan Viktorin
Cc: dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com, pbhagavatula@marvell.com,
jerinj@marvell.com, ruifeng.wang@arm.com,
mattias.ronnblom@ericsson.com, anatoly.burakov@intel.com,
stephen@networkplumber.org, Yigit, Ferruh,
honnappa.nagarahalli@arm.com, wathsala.vithanage@arm.com,
konstantin.ananyev@huawei.com, mb@smartsharesystems.com
[Public]
Hi All,
Sharing the next version of the `rte_topology_` API patch, targeted for the upcoming release.
As extras, it adds support for Cache-ID for L2 and L3, enabling cache line stashing and Code Data Prioritization too.
Snipped
>
> >
> > Hello Vipin and others,
> >
> > please, will there be any progress or update on this series?
>
> Apologies, we did a small update in slack, and missed this out here. Let me try to
> address your questions below
>
> >
> > I successfully tested those changes on our Intel and AMD machines and
> > would like to use it in production soon.
> >
> > The API is a little bit unintuitive, at least for me, but I
> > successfully integrated into our software.
> >
> > I am missing a clear relation to the NUMA socket approach used in DPDK.
> > E.g. I would like to be able to easily walk over a list of lcores from
> > a specific NUMA node grouped by L3 domain. Yes, there is the
> > RTE_LCORE_DOMAIN_IO, but would it always match the appropriate socket
> IDs?
>
> Yes, we at AMD were internally debating the same. But since the lcore
> API already has `rte_lcore_to_socket_id`, adding yet another variation
> or argument lacked luster.
> Our internal reasoning was: when using the new API, why not simply
> check whether the lcore is in the desired physical socket or
> sub-socket NUMA domain?
>
> Hence, we did not add the option.
>
> >
> > Also, I do not clearly understand what is the purpose of using domain selector
> like:
> >
> > RTE_LCORE_DOMAIN_L1 | RTE_LCORE_DOMAIN_L2
> >
> > or even:
> >
> > RTE_LCORE_DOMAIN_L3 | RTE_LCORE_DOMAIN_L2
>
> I believe the documentation mentions choosing only one; if multiple
> flags are combined, the code flow picks up only one of them.
>
> The real use of these flags is to select physical cores under the same
> cache or IO domain.
> Example: a certain SoC has 4 cores sharing an L2, which makes pipeline
> processing more convenient (less data movement). In such cases, select
> lcores within the same L2 topology.
>
> >
> > the documentation does not explain this. I could not spot any kind of
> > grouping that would help me in any way. Some "best practices" examples
> > would be nice to have to understand the intentions better.
>
> From https://patches.dpdk.org/project/dpdk/cover/20241105102849.1947-1-vipin.varghese@amd.com/
>
> ```
> Reason:
> - Applications using DPDK libraries relies on consistent memory access.
> - Lcores being closer to same NUMA domain as IO.
> - Lcores sharing same cache.
>
> Latency is minimized by using lcores that share the same NUMA topology.
> Memory access is optimized by utilizing cores within the same NUMA domain or
> tile. Cache coherence is preserved within the same shared cache domain, reducing
> the remote access from tile|compute package via snooping (local hit in either L2 or
> L3 within same NUMA domain).
> ```
>
> >
> > I found a little catch when running DPDK with more lcores than there
> > are physical or SMT CPU cores. This happens when using e.g. an option like --
> lcores=(0-15)@(0-1).
> > The results from the topology API would not match the lcores because
> > hwloc is not aware of the lcores concept. This might be mentioned somewhere.
>
> Yes, this is expected, as one can map any CPU cores to DPDK lcores
> with `lcore-map`.
> We did mention this in RFCv4, but when upgrading to RFCv5 we missed
> carrying the note forward.
>
> >
> > Anyway, I really appreciate this work and would like to see it upstream.
> > Especially for AMD machines, some framework like this is a must.
> >
> > Kind regards,
> > Jan
> >
>
> We are planning to remove RFC tag and share the final version for upcoming
> release for DPDK shortly.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores
2025-03-17 13:46 ` Jan Viktorin
2025-04-09 10:08 ` Varghese, Vipin
@ 2026-01-17 18:57 ` Stephen Hemminger
2026-01-19 14:55 ` [PATCH v4 0/4] Introduce Topology NUMA grouping for cores Varghese, Vipin
1 sibling, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2026-01-17 18:57 UTC (permalink / raw)
To: Jan Viktorin
Cc: Vipin Varghese, dev, roretzla, bruce.richardson, john.mcnamara,
dmitry.kozliuk, pbhagavatula, jerinj, ruifeng.wang,
mattias.ronnblom, anatoly.burakov, ferruh.yigit,
honnappa.nagarahalli, wathsala.vithanage, konstantin.ananyev, mb
On Mon, 17 Mar 2025 14:46:07 +0100
Jan Viktorin <viktorin@cesnet.cz> wrote:
> Hello Vipin and others,
>
> please, will there be any progress or update on this series?
>
> I successfully tested those changes on our Intel and AMD machines and
> would like to use it in production soon.
>
> The API is a little bit unintuitive, at least for me, but I
> successfully integrated into our software.
>
> I am missing a clear relation to the NUMA socket approach used in DPDK.
> E.g. I would like to be able to easily walk over a list of lcores from
> a specific NUMA node grouped by L3 domain. Yes, there is the
> RTE_LCORE_DOMAIN_IO, but would it always match the appropriate socket
> IDs?
>
> Also, I do not clearly understand what is the purpose of using domain
> selector like:
>
> RTE_LCORE_DOMAIN_L1 | RTE_LCORE_DOMAIN_L2
>
> or even:
>
> RTE_LCORE_DOMAIN_L3 | RTE_LCORE_DOMAIN_L2
>
> the documentation does not explain this. I could not spot any kind of
> grouping that would help me in any way. Some "best practices" examples
> would be nice to have to understand the intentions better.
>
> I found a little catch when running DPDK with more lcores than there
> are physical or SMT CPU cores. This happens when using e.g. an option
> like --lcores=(0-15)@(0-1). The results from the topology API would not
> match the lcores because hwloc is not aware of the lcores concept. This
> might be mentioned somewhere.
>
> Anyway, I really appreciate this work and would like to see it upstream.
> Especially for AMD machines, some framework like this is a must.
>
> Kind regards,
> Jan
>
> On Tue, 5 Nov 2024 15:58:45 +0530
> Vipin Varghese <vipin.varghese@amd.com> wrote:
>
> > This patch introduces improvements for NUMA topology awareness in
> > relation to DPDK logical cores. The goal is to expose API which allows
> > users to select optimal logical cores for any application. These
> > logical cores can be selected from various NUMA domains like CPU and
> > I/O.
> >
> > Change Summary:
> > - Introduces the concept of NUMA domain partitioning based on CPU and
> > I/O topology.
> > - Adds support for grouping DPDK logical cores within the same Cache
> > and I/O domain for improved locality.
> > - Implements topology detection and core grouping logic that
> > distinguishes between the following NUMA configurations:
> > * CPU topology & I/O topology (e.g., AMD SoC EPYC, Intel Xeon SPR)
> > * CPU+I/O topology (e.g., Ampere One with SLC, Intel Xeon SPR
> > with SNC)
> > - Enhances performance by minimizing lcore dispersion across
> > tiles|compute package with different L2/L3 cache or IO domains.
> >
> > Reason:
> > - Applications using DPDK libraries relies on consistent memory
> > access.
> > - Lcores being closer to same NUMA domain as IO.
> > - Lcores sharing same cache.
> >
> > Latency is minimized by using lcores that share the same NUMA
> > topology. Memory access is optimized by utilizing cores within the
> > same NUMA domain or tile. Cache coherence is preserved within the
> > same shared cache domain, reducing the remote access from
> > tile|compute package via snooping (local hit in either L2 or L3
> > within same NUMA domain).
> >
> > Library dependency: hwloc
> >
> > Topology Flags:
> > ---------------
> > - RTE_LCORE_DOMAIN_L1: to group cores sharing same L1 cache
> > - RTE_LCORE_DOMAIN_SMT: same as RTE_LCORE_DOMAIN_L1
> > - RTE_LCORE_DOMAIN_L2: group cores sharing same L2 cache
> > - RTE_LCORE_DOMAIN_L3: group cores sharing same L3 cache
> > - RTE_LCORE_DOMAIN_L4: group cores sharing same L4 cache
> > - RTE_LCORE_DOMAIN_IO: group cores sharing same IO
> >
> > < Function: Purpose >
> > ---------------------
> > - rte_get_domain_count: get domain count based on Topology Flag
> > - rte_lcore_count_from_domain: get valid lcores count under each
> > domain
> > - rte_get_lcore_in_domain: valid lcore id based on index
> > - rte_lcore_cpuset_in_domain: return valid cpuset based on index
> > - rte_lcore_is_main_in_domain: return true|false if main lcore is
> > present
> > - rte_get_next_lcore_from_domain: next valid lcore within domain
> > - rte_get_next_lcore_from_next_domain: next valid lcore from next
> > domain
> >
> > Note:
> > 1. Topology is NUMA grouping.
> > 2. Domain is various sub-groups within a specific Topology.
> >
> > Topology example: L1, L2, L3, L4, IO
> > Domain example: IO-A, IO-B
> >
> > < MACRO: Purpose >
> > ------------------
> > - RTE_LCORE_FOREACH_DOMAIN: iterate lcores from all domains
> > - RTE_LCORE_FOREACH_WORKER_DOMAIN: iterate worker lcores from all
> > domains
> > - RTE_LCORE_FORN_NEXT_DOMAIN: iterate domain select n'th lcore
> > - RTE_LCORE_FORN_WORKER_NEXT_DOMAIN: iterate domain for worker n'th
> > lcore.
> >
> > Future work (after merge):
> > --------------------------
> > - dma-perf per IO NUMA
> > - eventdev per L3 NUMA
> > - pipeline per SMT|L3 NUMA
> > - distributor per L3 for Port-Queue
> > - l2fwd-power per SMT
> > - testpmd option for IO NUMA per port
> >
> > Platform tested on:
> > -------------------
> > - INTEL(R) XEON(R) PLATINUM 8562Y+ (support IO numa 1 & 2)
> > - AMD EPYC 8534P (supports IO numa 1 & 2)
> > - AMD EPYC 9554 (supports IO numa 1, 2, 4)
> >
> > Logs:
> > -----
> > 1. INTEL(R) XEON(R) PLATINUM 8562Y+:
> > - SNC=1
> > Domain (IO): at index (0) there are 48 core, with (0) at
> > index 0
> > - SNC=2
> > Domain (IO): at index (0) there are 24 core, with (0) at index 0
> > Domain (IO): at index (1) there are 24 core, with (12) at index 0
> >
> > 2. AMD EPYC 8534P:
> > - NPS=1:
> > Domain (IO): at index (0) there are 128 core, with (0) at
> > index 0
> > - NPS=2:
> > Domain (IO): at index (0) there are 64 core, with (0) at index 0
> > Domain (IO): at index (1) there are 64 core, with (32) at index 0
> >
> > Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
> >
> > Vipin Varghese (4):
> > eal/lcore: add topology based functions
> > test/lcore: enable tests for topology
> > doc: add topology grouping details
> > examples: update with lcore topology API
> >
> > app/test/test_lcores.c | 528 +++++++++++++
> > config/meson.build | 18 +
> > .../prog_guide/env_abstraction_layer.rst | 22 +
> > examples/helloworld/main.c | 154 +++-
> > examples/l2fwd/main.c | 56 +-
> > examples/skeleton/basicfwd.c | 22 +
> > lib/eal/common/eal_common_lcore.c | 714 ++++++++++++++++++
> > lib/eal/common/eal_private.h | 58 ++
> > lib/eal/freebsd/eal.c | 10 +
> > lib/eal/include/rte_lcore.h | 209 +++++
> > lib/eal/linux/eal.c | 11 +
> > lib/eal/meson.build | 4 +
> > lib/eal/version.map | 11 +
> > lib/eal/windows/eal.c | 12 +
> > 14 files changed, 1819 insertions(+), 10 deletions(-)
> >
This patch series does not apply cleanly to current DPDK main branch.
Please rebase and resubmit.
AI patch review had the following insights:
# DPDK Patch Series Review: Topology-based Lcore Functions (v4)
**Series**: [PATCH v4 1-4/4] eal/lcore: add topology based functions
**Date**: November 5, 2024
**Author**: Vipin Varghese <vipin.varghese@amd.com>
---
## Executive Summary
This 4-patch series adds topology-aware lcore mapping to DPDK's lcore API, allowing cores to be grouped by chiplet locality (L1-L4 cache, IO). The series includes API additions, tests, documentation, and example updates.
**Overall Assessment**: The patches have several issues that need to be addressed before acceptance.
---
## Patch 1/4: eal/lcore: add topology based functions
### Commit Message Review
#### ✅ PASS: Subject Line Format
- Length: 45 characters (within 60 char limit)
- Format: `eal/lcore: add topology based functions`
- Proper prefix, lowercase, imperative mood, no trailing period
#### ⚠️ WARNING: Commit Body Issues
**Line Length Violations** (must be ≤75 characters):
- Line 147: "Using hwloc library, the dpdk available lcores can be grouped" - appears OK
- Line 148: "into various groups nameley L1, L2, L3, L4 and IO. This patch" - OK
- However, several lines approach the limit
**Typo**: "nameley" should be "namely" (line 147)
**Body Structure**: The commit message starts appropriately but could be more detailed about the problem being solved and why this change is needed.
#### ❌ ERROR: Missing Required Tags
The patch is missing:
- `Signed-off-by:` tag - This is **MANDATORY** per AGENTS.md line 116
The version history (v4 changes) is good and appropriately placed but should be separated from the main commit message by a `---` line before the diff.
### Code Review
#### License Headers
Need to check if new files have proper SPDX headers. From the visible diff, additions to existing files should maintain their existing headers.
#### Naming Conventions
**✅ PASS**: External API functions properly prefixed with `rte_`:
- `rte_get_domain_count`
- `rte_get_lcore_in_domain`
- `rte_get_next_lcore_from_domain`
- `rte_get_next_lcore_from_next_domain`
- `rte_lcore_count_from_domain`
- `rte_lcore_cpuset_in_domain`
- `rte_lcore_is_main_in_domain`
**⚠️ INFO**: Internal functions lack `rte_` prefix, which is appropriate for internal APIs:
- `get_domain_lcore_count`
- `get_domain_lcore_mapping`
- `rte_eal_topology_init`
- `rte_eal_topology_release`
However, the last two (`rte_eal_topology_*`) have `rte_` prefix despite being listed as internal - verify if these should be marked as `__rte_internal`.
#### Code Style Issues
From the visible diffs:
**Line Length**: Need to verify all code lines are ≤100 characters
**Type Usage**:
- Good: Conversion of `l3_count` & `io_count` to `uint16_t` (mentioned in v4 changes)
**Memory Management**:
- Good: Removed unnecessary NULL checks before free (mentioned in v4 changes)
- Good: Removed unnecessary malloc casting (mentioned in v4 changes)
**Comments**: Would need to see full code to verify comment formatting
#### API Design Issues
**⚠️ WARNING: Experimental APIs**
All new external APIs should be marked with `__rte_experimental` per AGENTS.md line 766. The commit message mentions "External Experimental API" but need to verify in the actual header file that each function has the `__rte_experimental` attribute on its own line.
**⚠️ WARNING: Missing Doxygen**
New public APIs must have Doxygen comments (AGENTS.md line 756). Need to verify full header file.
#### Testing Requirements
**⚠️ WARNING**: Per AGENTS.md line 698-699:
- New APIs must be used in `/app` test directory
- New device APIs require at least one driver implementation
The series includes patch 2/4 for tests, which is good. Need to verify the tests use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure (AGENTS.md lines 703-743).
### Documentation Requirements
**⚠️ WARNING**: Per AGENTS.md line 758-762:
- Release notes must be updated in `doc/guides/rel_notes/` for important changes
- Code and documentation must be updated atomically
The series includes patch 3/4 for documentation, which is good. However, need to verify:
- Documentation matches code behavior
- Only the **current release** notes file is updated
- Doxygen comments are present for all public APIs
---
## Patch 2/4: test/lcore: enable tests for topology
### Commit Message Review
#### Subject Line
- Format: `test/lcore: enable tests for topology`
- Length: 37 characters ✅
- Proper prefix, lowercase, imperative ✅
#### Missing Elements
- ❌ **ERROR**: Missing `Signed-off-by:` tag
### Test Code Requirements
**Need to verify**:
- Tests use `TEST_ASSERT` macros (AGENTS.md line 745-752)
- Tests use `unit_test_suite_runner` infrastructure (AGENTS.md line 703-743)
- Tests are properly registered with `REGISTER_FAST_TEST` or similar
---
## Patch 3/4: doc: add topology grouping details
### Commit Message Review
#### Subject Line
- Format: `doc: add topology grouping details`
- Length: 35 characters ✅
- Proper prefix, lowercase, imperative ✅
#### Missing Elements
- ❌ **ERROR**: Missing `Signed-off-by:` tag
### Documentation Requirements
**Need to verify**:
- Documentation matches actual code behavior
- Release notes updated for current release only
- Proper RST formatting
- No passive voice (per DPDK documentation standards)
---
## Patch 4/4: examples: update with lcore topology API
### Commit Message Review
#### Subject Line
- Format: `examples: update with lcore topology API`
- Length: 39 characters ✅
- Proper prefix ✅
**⚠️ WARNING: Component Prefix**
Per AGENTS.md line 89, `example:` should be `examples/foo:` with specific example name.
Should be something like: `examples/helloworld: add topology support`
#### Missing Elements
- ❌ **ERROR**: Missing `Signed-off-by:` tag
### Code Review for Examples
From the visible diff in helloworld:
**Line 2608-2609**: Ternary operator formatting
```c
rte_eal_remote_launch((topo_sel == USE_NO_TOPOLOGY) ?
lcore_hello : send_lcore_hello, NULL, lcore_id);
```
This is acceptable but could be more readable. Consider:
```c
lcore_func = (topo_sel == USE_NO_TOPOLOGY) ? lcore_hello : send_lcore_hello;
rte_eal_remote_launch(lcore_func, NULL, lcore_id);
```
**Line 2631-2632**: Comment formatting
```c
+/* select lcores based on ports numa (RTE_LCORE_DOMAIN_IO). */
+static bool select_port_from_io_domain;
```
✅ Good comment style
**L2fwd example changes** (lines 2624-2729):
**Line 2641**: Wording in help string
```c
+ " -t : Enable IO domain lcores mapping to Ports\n"
```
Should probably be: "IO-domain lcore mapping" or "IO domain lcore mapping"
**Line 2670**: Variable type change
```c
- unsigned lcore_id, rx_lcore_id;
+ uint16_t lcore_id, rx_lcore_id;
```
✅ Good: Using explicit types instead of `unsigned`
**Line 2681-2686**: Array initialization
```c
+ uint16_t coreindx_io_domain[RTE_MAX_ETHPORTS] = {0};
+ uint16_t lcore_io_domain[RTE_MAX_ETHPORTS] = {RTE_MAX_LCORE};
```
**⚠️ WARNING**: The second initialization is problematic. It only sets the first element to `RTE_MAX_LCORE`; the remaining elements are zero-initialized. Use a loop for initialization (`memset` cannot store a multi-byte value like `RTE_MAX_LCORE` into each element):
```c
for (int i = 0; i < RTE_MAX_ETHPORTS; i++)
lcore_io_domain[i] = RTE_MAX_LCORE;
```
**Line 2683**: Variable naming issue
```c
+ uint16_t l3_domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO);
```
**❌ ERROR**: Variable is named `l3_domain_count` but gets IO domain count. Should be `io_domain_count`.
**Line 2705-2710**: Parentheses in while condition
```c
while ((rte_lcore_is_enabled(rx_lcore_id) == 0) ||
(lcore_queue_conf[rx_lcore_id].n_rx_port ==
l2fwd_rx_queue_per_lcore)) {
```
Excessive parentheses - the outer ones around each condition aren't needed.
**Skeleton example changes** (lines 2731-2777):
**Line 2750**: Typo in comment
```c
+ /* get the socekt of each port */
```
Should be "socket", not "socekt"
**Line 2760**: Function usage looks good
```c
+ core_count_from_io = rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, sel_io_indx);
```
---
## Summary of Critical Issues
### Errors (Must Fix):
1. **All 4 patches missing `Signed-off-by:` tags** - This is mandatory per DPDK contribution guidelines
2. **Patch 4**: Variable naming bug - `l3_domain_count` should be `io_domain_count`
3. **Patch 4**: Array initialization bug - `lcore_io_domain` array not fully initialized
### Warnings (Should Fix):
4. **Patch 1**: Typo in commit message - "nameley" → "namely"
5. **Patch 4**: Wrong component prefix - should be `examples/helloworld:` and `examples/l2fwd:` and `examples/skeleton:`
6. **Patch 4**: Typo in comment - "socekt" → "socket"
7. **Patch 4**: Unnecessary parentheses in while condition
8. All patches should verify:
- API functions have `__rte_experimental` markers
- Doxygen comments present for all public APIs
- Tests use proper `TEST_ASSERT` infrastructure
- Release notes updated
### Info (Consider):
9. **Patch 1**: Version history (v4 changes) should be after `---` separator, not in commit body
10. **Patch 4**: Ternary operator could be more readable with intermediate variable
11. Verify all code lines are ≤100 characters
12. Verify commit body lines are ≤75 characters
---
## Recommended Actions
1. **Immediate**: Add `Signed-off-by:` tags to all patches with author's real name and email
2. **Immediate**: Fix variable naming bug (`l3_domain_count` → `io_domain_count`)
3. **Immediate**: Fix array initialization bug for `lcore_io_domain`
4. Fix typos: "nameley" → "namely", "socekt" → "socket"
5. Update patch 4 subject lines to use specific example names
6. Verify all new APIs are marked `__rte_experimental`
7. Verify all new APIs have Doxygen documentation
8. Move v4 changelog to post-`---` section
9. Run DPDK validation tools:
- `devtools/check-git-log.sh`
- `devtools/checkpatches.sh`
- `devtools/test-meson-builds.sh`
---
## Validation Commands
Before resubmitting, run:
```bash
# Check commit messages
devtools/check-git-log.sh
# Check patch formatting
devtools/checkpatches.sh *.patch
# Verify compilation
devtools/test-meson-builds.sh
# Check maintainers
devtools/get-maintainer.sh *.patch
```
---
## Positive Aspects
✅ Good responsiveness to v4 feedback (removed malloc casting, NULL checks, fixed types)
✅ Comprehensive series with tests, documentation, and examples
✅ Clear API naming with proper `rte_` prefixes
✅ Good use of explicit types (`uint16_t`) instead of `unsigned`
✅ Appropriate subject line formatting and component prefixes (mostly)
The core concept and implementation approach appear sound, but the patches need the above corrections before merging.
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v4 0/4] Introduce Topology NUMA grouping for cores
2026-01-17 18:57 ` Stephen Hemminger
@ 2026-01-19 14:55 ` Varghese, Vipin
0 siblings, 0 replies; 30+ messages in thread
From: Varghese, Vipin @ 2026-01-19 14:55 UTC (permalink / raw)
To: Stephen Hemminger, Jan Viktorin
Cc: dev@dpdk.org, roretzla@linux.microsoft.com,
bruce.richardson@intel.com, john.mcnamara@intel.com,
dmitry.kozliuk@gmail.com, pbhagavatula@marvell.com,
jerinj@marvell.com, ruifeng.wang@arm.com,
mattias.ronnblom@ericsson.com, anatoly.burakov@intel.com,
Yigit, Ferruh, honnappa.nagarahalli@arm.com,
wathsala.vithanage@arm.com, konstantin.ananyev@huawei.com,
mb@smartsharesystems.com
Hi Stephen Hemminger,
Thank you for the tool and report generation; we are working internally to release v5 soon.
Snipped
>
> ---
>
> ## Positive Aspects
>
> ✅ Good responsiveness to v4 feedback (removed malloc casting, NULL checks,
> fixed types) ✅ Comprehensive series with tests, documentation, and examples
> ✅ Clear API naming with proper `rte_` prefixes ✅ Good use of explicit types
> (`uint16_t`) instead of `unsigned` ✅ Appropriate subject line formatting and
> component prefixes (mostly)
>
> The core concept and implementation approach appear sound, but the patches
> need the above corrections before merging.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
` (5 preceding siblings ...)
2025-03-17 13:46 ` Jan Viktorin
@ 2026-04-14 19:38 ` Vipin Varghese
2026-04-14 19:38 ` [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores Vipin Varghese
` (3 more replies)
6 siblings, 4 replies; 30+ messages in thread
From: Vipin Varghese @ 2026-04-14 19:38 UTC (permalink / raw)
To: dev, sivaprasad.tummala
Cc: konstantin.ananyev, wathsala.vithanage, bruce.richardson,
viktorin, mb
This series introduces a topology library that groups DPDK lcores based on
CPU cache hierarchy and NUMA topology. The goal is to provide a stable and
explicit API that allows applications to select lcores with better locality
and cache sharing characteristics.
The series includes:
- EAL support for topology discovery using hwloc and domain-based lcore
grouping (L1/L2/L3/L4/NUMA)
- Topology-aware test cases validating API behavior and edge conditions
- Programmer’s guide describing the topology library and APIs
The API is marked experimental and does not change existing lcore behavior
unless explicitly used by the application.
Changes in v5:
- Addressed review comments from v4
- Fixed ARM cross-compilation issues
- Cleaned up domain iteration and error handling
- Updated tests to cover domain edge cases
- Documentation refinements and API usage clarification
Changes in v4:
- Corrected domain selection semantics
- Updated example usage
- Fixed naming and typo issues
Changes in v3:
- Fixed macro naming (USE_NO_TOPOLOGY)
- Minor cleanups based on early feedback
Tested on:
- AMD EPYC (Milan, Genoa, Siena, Turin, Turin-Dense, Sorano)
- Intel Xeon (SPR-SP, GNR-SP)
- ARM Ampere
- NVIDIA Grace Superchip
Dependencies:
- hwloc-dev (tested with 2.10.0)
Patch breakdown:
1/3 eal/topology: add topology grouping for lcores
2/3 app: add topology-aware test cases
3/3 doc: add topology library documentation
Future Work:
- integrate into examples
-- helloworld: ready
-- pkt-distributor: in-progress
-- l2fwd: ready
-- l3fwd: to start
-- eventdevpipeline: PoC ready
- integrate topology test
-- crypto: yet to start
-- compression: yet to start
-- dma: PoC ready
- add new features for
-- PQoS: yet to start
-- Data Injection: PoC with BRDCM Thor-2 ready
Tested OS: Linux only, need help with BSD and Windows
Tested with and without hwloc-dev library for
- Ampere, aarch64, Neoverse-N1, NUMA-2, 256 CPU threads
- Grace superchip, aarch64, Neoverse-V2, NUMA-2, 144 CPU threads
- Intel GNR-SP, 6767P, NUMA-2, 256 Threads
- AMD EPYC Siena, 8534P, NUMA-1, 128 Threads
- AMD EPYC Sorano, 8635P, NUMA-1, 168 Threads
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
Vipin Varghese (3):
eal/topology: add Topology grouping for lcores
app: add topology aware test case
doc: add new section topology
app/test/meson.build | 1 +
app/test/test_ring_perf.c | 416 +++++++++++++-
app/test/test_stack_perf.c | 409 ++++++++++++++
app/test/test_topology.c | 676 ++++++++++++++++++++++
config/meson.build | 18 +
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 3 +-
doc/guides/prog_guide/topology_lib.rst | 155 +++++
lib/eal/common/eal_private.h | 74 +++
lib/eal/common/eal_topology.c | 747 +++++++++++++++++++++++++
lib/eal/common/meson.build | 1 +
lib/eal/freebsd/eal.c | 10 +-
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_topology.h | 255 +++++++++
lib/eal/linux/eal.c | 7 +
lib/eal/meson.build | 4 +
16 files changed, 2773 insertions(+), 5 deletions(-)
create mode 100644 app/test/test_topology.c
create mode 100644 doc/guides/prog_guide/topology_lib.rst
create mode 100644 lib/eal/common/eal_topology.c
create mode 100644 lib/eal/include/rte_topology.h
--
2.43.0
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
@ 2026-04-14 19:38 ` Vipin Varghese
2026-04-15 14:06 ` Morten Brørup
2026-04-14 19:38 ` [PATCH v5 v5 2/3] app: add topology aware test case Vipin Varghese
` (2 subsequent siblings)
3 siblings, 1 reply; 30+ messages in thread
From: Vipin Varghese @ 2026-04-14 19:38 UTC (permalink / raw)
To: dev, sivaprasad.tummala
Cc: konstantin.ananyev, wathsala.vithanage, bruce.richardson,
viktorin, mb
This patch introduces NUMA topology awareness in relation
to DPDK logical cores. The goal is to expose an API that
allows users to select optimal logical cores for any
application. These logical cores can be selected from
various NUMA domains, such as CPU and I/O.
Change Summary:
- Add concept of domain partitioning based on CPU and I/O topology.
- Group DPDK logical cores into groups of L1|L2|L3|L4|IO.
- Add support for helper macros as iterators.
v4 changes:
- fix cross-compilation failure on ARM (reported by Pavan Nikhilesh Bhagavatula)
- update helloworld for L4
v3 changes:
- fix typo from SE_NO_TOPOLOGY to USE_NO_TOPOLOGY
Reason:
- Applications can perform better using lcores within the same domain.
- In pipeline and graph applications, sharing cache reduces memory access.
- Use L2|L3 cache-id to configure Data Injection & PQoS.
- Integrate hwloc-dev library, which allows
-- grouping into DPDK-favourable domains
-- reverse lookup from lcore to domain-id.
-- ensure no ABI breakage with versions of hwloc-dev
-- consistent mapping even with DPDK lcore option `R`.
Library dependency: hwloc-dev
RTE_TOPO API:
+++++++++++++
Domain Enumeration
- rte_topo_get_domain_count(domain_sel)
Lcore Enumeration
- rte_topo_get_lcore_count_from_domain(domain_sel, domain_idx)
- rte_topo_get_nth_lcore_in_domain(domain_sel, domain_idx, lcore_pos)
- rte_topo_get_next_lcore(lcore, skip_main, wrap, flag)
- rte_topo_get_nth_lcore_from_domain(domain_idx, lcore_pos, wrap, flag)
Domain Lookup
- rte_topo_get_domain_index_from_lcore(domain_sel, lcore)
- rte_topo_is_main_lcore_in_domain(domain_sel, domain_idx)
Cpuset
- rte_topo_get_lcore_cpuset_in_domain(domain_sel, domain_idx)
Debug
- rte_topo_dump(FILE *f)
Platform tested on:
-------------------
- AMD EPYC MILAN
- AMD EPYC GENOA
- AMD EPYC SIENA
- AMD EPYC TURIN
- AMD EPYC TURIN-DENSE
- AMD EPYC SORANO
- ARM AMPERE
- INTEL XEON GNR-SP
- INTEL XEON SPR-SP
- NVIDIA GRACE SUPERCHIP
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
config/meson.build | 18 +
lib/eal/common/eal_private.h | 74 ++++
lib/eal/common/eal_topology.c | 746 +++++++++++++++++++++++++++++++++
lib/eal/common/meson.build | 1 +
lib/eal/freebsd/eal.c | 10 +-
lib/eal/include/meson.build | 1 +
lib/eal/include/rte_topology.h | 255 +++++++++++
lib/eal/linux/eal.c | 7 +
lib/eal/meson.build | 4 +
9 files changed, 1115 insertions(+), 1 deletion(-)
create mode 100644 lib/eal/common/eal_topology.c
create mode 100644 lib/eal/include/rte_topology.h
diff --git a/config/meson.build b/config/meson.build
index 9ba7b9a338..db2faccdbc 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -245,6 +245,24 @@ if find_libnuma
endif
endif
+has_libhwloc = false
+find_libhwloc = true
+
+if meson.is_cross_build() and not meson.get_external_property('hwloc', true)
+ # don't look for libhwloc if explicitly disabled in cross build
+ find_libhwloc = false
+endif
+
+if find_libhwloc
+ hwloc_dep = cc.find_library('hwloc', required: false)
+ if hwloc_dep.found() and cc.has_header('hwloc.h')
+ dpdk_conf.set10('RTE_LIBHWLOC_PROBE', true)
+ has_libhwloc = true
+ #add_project_link_arguments('-lhwloc', language: 'c')
+ #dpdk_extra_ldflags += '-lhwloc'
+ endif
+endif
+
has_libfdt = false
fdt_dep = cc.find_library('fdt', required: false)
if fdt_dep.found() and cc.has_header('fdt.h') and cc.links(min_c_code, dependencies: fdt_dep)
diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index e032dd10c9..904df0d0b7 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -9,12 +9,17 @@
#include <stdint.h>
#include <stdio.h>
#include <sys/queue.h>
+#include <rte_os.h>
#include <dev_driver.h>
#include <rte_lcore.h>
#include <rte_log.h>
#include <rte_memory.h>
+#ifdef RTE_LIBHWLOC_PROBE
+#include <hwloc.h>
+#endif
+
#include "eal_internal_cfg.h"
/**
@@ -40,6 +45,63 @@ struct lcore_config {
extern struct lcore_config lcore_config[RTE_MAX_LCORE];
+struct core_domain_mapping {
+ rte_cpuset_t core_set; /**< cpu_set representing lcores within domain */
+ uint16_t core_count; /**< dpdk enabled lcores within domain */
+ uint16_t *cores; /**< list of cores */
+};
+
+struct lcore_mapping {
+ uint16_t cpu;
+ uint16_t numa_domain;
+ uint16_t l4_domain;
+ uint16_t l3_domain;
+ uint16_t l2_domain;
+ uint16_t l1_domain;
+ uint16_t numa_cacheid;
+ uint16_t l4_cacheid;
+ uint16_t l3_cacheid;
+ uint16_t l2_cacheid;
+ uint16_t l1_cacheid;
+};
+
+#define RTE_TOPO_MAX_CPU_CORES 2048
+
+struct topology_config {
+#ifdef RTE_LIBHWLOC_PROBE
+ hwloc_topology_t topology;
+#endif
+
+ /* domain count */
+ uint16_t l1_count;
+ uint16_t l2_count;
+ uint16_t l3_count;
+ uint16_t l4_count;
+ uint16_t numa_count;
+
+ /* total cores under all domain */
+ uint16_t l1_core_count;
+ uint16_t l2_core_count;
+ uint16_t l3_core_count;
+ uint16_t l4_core_count;
+ uint16_t numa_core_count;
+
+ /* dpdk lcore to cpu core map */
+ uint16_t lcore_to_cpu_map[RTE_TOPO_MAX_CPU_CORES];
+
+ /* two dimensional array for each domain */
+ struct core_domain_mapping **l1;
+ struct core_domain_mapping **l2;
+ struct core_domain_mapping **l3;
+ struct core_domain_mapping **l4;
+ struct core_domain_mapping **numa;
+
+ /* reverse map lcore to domain lookup */
+ struct lcore_mapping lcore_map[RTE_MAX_LCORE];
+};
+extern struct topology_config topo_cnfg;
+
+
/**
* The global RTE configuration structure.
*/
@@ -102,6 +164,18 @@ char *eal_cpuset_to_str(const rte_cpuset_t *cpuset);
*/
int rte_eal_memzone_init(void);
+/**
+ * Initialize the topology structure using HWLOC Library
+ */
+__rte_internal
+int rte_eal_topology_init(void);
+
+/**
+ * Release the memory held by Topology structure
+ */
+__rte_internal
+int rte_eal_topology_release(void);
+
/**
* Fill configuration with number of physical and logical processors
*
diff --git a/lib/eal/common/eal_topology.c b/lib/eal/common/eal_topology.c
new file mode 100644
index 0000000000..7362d8e723
--- /dev/null
+++ b/lib/eal/common/eal_topology.c
@@ -0,0 +1,746 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 AMD Corporation
+ */
+
+#include <inttypes.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_topology.h>
+#include <rte_malloc.h>
+
+#include <eal_export.h>
+#include "eal_private.h"
+
+struct topology_config topo_cnfg;
+
+#ifdef RTE_LIBHWLOC_PROBE
+static inline bool is_valid_single_domain(unsigned int domainbits)
+{
+ if ((domainbits == 0) || (domainbits & ~RTE_TOPO_DOMAIN_ALL))
+ return false;
+
+ return (__builtin_popcount(domainbits) == 1);
+}
+
+static unsigned int
+get_domain_count(unsigned int domain_sel)
+{
+ if (is_valid_single_domain(domain_sel) == false)
+ return 0;
+
+ unsigned int domain_cnt =
+ (domain_sel & RTE_TOPO_DOMAIN_NUMA) ? topo_cnfg.numa_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L4) ? topo_cnfg.l4_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L3) ? topo_cnfg.l3_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L2) ? topo_cnfg.l2_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L1) ? topo_cnfg.l1_count : 0;
+
+ return domain_cnt;
+}
+
+static struct core_domain_mapping *
+get_domain_lcore_mapping(unsigned int domain_sel, unsigned int domain_indx)
+{
+ if (is_valid_single_domain(domain_sel) == false)
+ return NULL;
+
+ if (domain_indx >= get_domain_count(domain_sel))
+ return NULL;
+
+ struct core_domain_mapping *ptr =
+ (domain_sel & RTE_TOPO_DOMAIN_NUMA) ? topo_cnfg.numa[domain_indx] :
+ (domain_sel & RTE_TOPO_DOMAIN_L4) ? topo_cnfg.l4[domain_indx] :
+ (domain_sel & RTE_TOPO_DOMAIN_L3) ? topo_cnfg.l3[domain_indx] :
+ (domain_sel & RTE_TOPO_DOMAIN_L2) ? topo_cnfg.l2[domain_indx] :
+ (domain_sel & RTE_TOPO_DOMAIN_L1) ? topo_cnfg.l1[domain_indx] : NULL;
+
+ return ptr;
+}
+
+static unsigned int
+get_domain_lcore_count(unsigned int domain_sel)
+{
+ if (is_valid_single_domain(domain_sel) == false)
+ return 0;
+
+ return ((domain_sel & RTE_TOPO_DOMAIN_NUMA) ? topo_cnfg.numa_core_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L4) ? topo_cnfg.l4_core_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L3) ? topo_cnfg.l3_core_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L2) ? topo_cnfg.l2_core_count :
+ (domain_sel & RTE_TOPO_DOMAIN_L1) ? topo_cnfg.l1_core_count : 0);
+}
+
+static unsigned int
+get_lcore_count_from_domain_index(unsigned int domain_sel, unsigned int domain_indx)
+{
+ if ((is_valid_single_domain(domain_sel) == false) ||
+ (domain_indx >= get_domain_count(domain_sel)))
+ return 0;
+
+ struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+ if (ptr == NULL)
+ return 0;
+
+ return ptr->core_count;
+}
+
+static uint16_t
+get_lcore_from_domain_position(unsigned int domain_sel, unsigned int domain_indx, unsigned int pos)
+{
+ if (pos >= RTE_MAX_LCORE)
+ return RTE_MAX_LCORE;
+
+ struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+ if (ptr == NULL)
+ return RTE_MAX_LCORE;
+
+ if (pos >= ptr->core_count)
+ return RTE_MAX_LCORE;
+
+ return ptr->cores[pos];
+}
+#endif
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_domain_index_from_lcore, 26.07)
+int
+rte_topo_get_domain_index_from_lcore(unsigned int domain_sel, uint16_t lcore)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ if (!rte_lcore_is_enabled(lcore))
+ return -1;
+
+ if (is_valid_single_domain(domain_sel) == false)
+ return -2;
+
+ return ((domain_sel & RTE_TOPO_DOMAIN_NUMA) ? topo_cnfg.lcore_map[lcore].numa_domain :
+ (domain_sel & RTE_TOPO_DOMAIN_L4) ? topo_cnfg.lcore_map[lcore].l4_domain :
+ (domain_sel & RTE_TOPO_DOMAIN_L3) ? topo_cnfg.lcore_map[lcore].l3_domain :
+ (domain_sel & RTE_TOPO_DOMAIN_L2) ? topo_cnfg.lcore_map[lcore].l2_domain :
+ (domain_sel & RTE_TOPO_DOMAIN_L1) ? topo_cnfg.lcore_map[lcore].l1_domain : -3);
+#else
+ RTE_SET_USED(domain_sel);
+ RTE_SET_USED(lcore);
+ return -3;
+#endif
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_domain_count, 26.07)
+unsigned int rte_topo_get_domain_count(unsigned int domain_sel)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ return get_domain_count(domain_sel);
+#else
+ RTE_SET_USED(domain_sel);
+#endif
+
+ return 0;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_lcore_count_from_domain, 26.07)
+unsigned int
+rte_topo_get_lcore_count_from_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ return get_lcore_count_from_domain_index(domain_sel, domain_indx);
+#else
+ RTE_SET_USED(domain_sel);
+ RTE_SET_USED(domain_indx);
+#endif
+ return 0;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_nth_lcore_in_domain, 26.07)
+unsigned int
+rte_topo_get_nth_lcore_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused, unsigned int lcore_pos __rte_unused)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ return get_lcore_from_domain_position(domain_sel, domain_indx, lcore_pos);
+#else
+ RTE_SET_USED(domain_sel);
+ RTE_SET_USED(domain_indx);
+ RTE_SET_USED(lcore_pos);
+#endif
+ return RTE_MAX_LCORE;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_lcore_cpuset_in_domain, 26.07)
+rte_cpuset_t
+rte_topo_get_lcore_cpuset_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+ rte_cpuset_t ret_cpu_set;
+ CPU_ZERO(&ret_cpu_set);
+
+#ifdef RTE_LIBHWLOC_PROBE
+ const struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ return ret_cpu_set;
+
+ CPU_OR(&ret_cpu_set, &ret_cpu_set, &ptr->core_set);
+#else
+ RTE_SET_USED(domain_sel);
+ RTE_SET_USED(domain_indx);
+#endif
+
+ return ret_cpu_set;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_is_main_lcore_in_domain, 26.07)
+bool
+rte_topo_is_main_lcore_in_domain(unsigned int domain_sel __rte_unused,
+unsigned int domain_indx __rte_unused)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ const unsigned int main_lcore = rte_get_main_lcore();
+ const struct core_domain_mapping *ptr = get_domain_lcore_mapping(domain_sel, domain_indx);
+
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ return false;
+
+ return CPU_ISSET(main_lcore, &ptr->core_set);
+#else
+ RTE_SET_USED(domain_sel);
+ RTE_SET_USED(domain_indx);
+#endif
+
+ return false;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_nth_lcore_from_domain, 26.07)
+unsigned int
+rte_topo_get_nth_lcore_from_domain(unsigned int domain_indx __rte_unused,
+unsigned int lcore_pos __rte_unused,
+int wrap __rte_unused, uint32_t flag __rte_unused)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ const unsigned int lcore_in_domain = get_domain_lcore_count(flag);
+ const unsigned int domain_count = get_domain_count(flag);
+
+ if ((domain_count == 0) || (lcore_in_domain <= 1))
+ return RTE_MAX_LCORE;
+
+ const bool find_first_lcore_in_first_domain =
+ ((domain_indx == RTE_TOPO_DOMAIN_MAX) &&
+ (lcore_pos == RTE_TOPO_DOMAIN_LCORE_POS_MAX)) ? true : false;
+ const bool find_domain_from_lcore_pos =
+ ((domain_indx == RTE_TOPO_DOMAIN_MAX) &&
+ (lcore_pos < RTE_TOPO_DOMAIN_LCORE_POS_MAX)) ? true : false;
+
+ struct core_domain_mapping *ptr = NULL;
+
+ /* if user has passed invalid lcore id, get the first valid lcore */
+ if (find_first_lcore_in_first_domain) {
+ for (unsigned int domain_index = 0; domain_index < domain_count; domain_index++) {
+ ptr = get_domain_lcore_mapping(flag, domain_index);
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ continue;
+
+ /* get first lcore from valid domain based on the flag */
+ for (unsigned int i = 0; i < ptr->core_count; i++) {
+ uint16_t lcore = ptr->cores[i];
+
+ EAL_LOG(DEBUG, "Found lcore (%u) in domain (%d) at pos %u",
+ lcore, domain_index, i);
+ return lcore;
+ }
+ }
+
+ return RTE_MAX_LCORE;
+ }
+
+	/* if user has passed lcore pos, get lcore from matching domain */
+ if (find_domain_from_lcore_pos) {
+ for (unsigned int domain_index = 0; domain_index < domain_count; domain_index++) {
+ unsigned int pos_lcore = lcore_pos;
+ ptr = get_domain_lcore_mapping(flag, domain_index);
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ continue;
+
+ if (wrap)
+ pos_lcore = (ptr->core_count > lcore_pos) ?
+ lcore_pos : lcore_pos % ptr->core_count;
+
+ /* get first lcore from valid domain based on the flag */
+ for (unsigned int i = pos_lcore; i < ptr->core_count; i++) {
+ uint16_t lcore = ptr->cores[i];
+
+ EAL_LOG(DEBUG, "Found lcore (%u) in domain (%d) at pos %u",
+ lcore, domain_index, i);
+ return lcore;
+ }
+ }
+
+ return RTE_MAX_LCORE;
+ }
+
+ if (wrap)
+ domain_indx = domain_indx % domain_count;
+
+ /* get cores set in domain_indx */
+ ptr = get_domain_lcore_mapping(flag, domain_indx);
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ return RTE_MAX_LCORE;
+
+ if (wrap)
+ lcore_pos = lcore_pos % ptr->core_count;
+
+ if (lcore_pos >= ptr->core_count)
+ return RTE_MAX_LCORE;
+
+ EAL_LOG(DEBUG, "lcore pos (%u) from domain (%u)", lcore_pos, domain_indx);
+
+ bool wrap_once = false;
+ unsigned int new_lcore_pos = lcore_pos;
+
+ while (1) {
+ if (new_lcore_pos >= ptr->core_count) {
+ if (!wrap)
+ return RTE_MAX_LCORE;
+
+ if ((wrap == true) && (wrap_once == true))
+ return RTE_MAX_LCORE;
+
+ new_lcore_pos = 0;
+ wrap_once = true;
+ }
+
+ /* check if the domain has cores_to_skip */
+ uint16_t new_lcore = ptr->cores[new_lcore_pos];
+
+ EAL_LOG(DEBUG, "Selected core (%u) at position %u", new_lcore, new_lcore_pos);
+ return new_lcore;
+ }
+
+#else
+ RTE_SET_USED(domain_indx);
+ RTE_SET_USED(lcore_pos);
+ RTE_SET_USED(wrap);
+ RTE_SET_USED(flag);
+#endif
+
+ return RTE_MAX_LCORE;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_get_next_lcore, 26.07)
+unsigned int
+rte_topo_get_next_lcore(uint16_t lcore __rte_unused,
+bool skip_main __rte_unused, bool wrap __rte_unused, uint32_t flag __rte_unused)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ const uint16_t main_lcore = rte_get_main_lcore();
+ const unsigned int lcore_in_domain = get_domain_lcore_count(flag);
+ const unsigned int domain_count = get_domain_count(flag);
+
+ if ((domain_count == 0) || (lcore_in_domain <= 1))
+ return RTE_MAX_LCORE;
+
+ if (wrap)
+ lcore = lcore % RTE_MAX_LCORE;
+
+ if ((lcore >= RTE_MAX_LCORE) && (wrap == false))
+ return RTE_MAX_LCORE;
+
+ int lcore_domain = rte_topo_get_domain_index_from_lcore(flag, lcore);
+ if (lcore_domain < 0)
+ return RTE_MAX_LCORE;
+
+ struct core_domain_mapping *ptr = get_domain_lcore_mapping(flag, lcore_domain);
+ if ((ptr == NULL) || (ptr->core_count == 0))
+ return RTE_MAX_LCORE;
+
+ unsigned int lcore_pos = RTE_TOPO_DOMAIN_LCORE_POS_MAX;
+ for (unsigned int i = 0; i < ptr->core_count; i++) {
+ uint16_t find_lcore = ptr->cores[i];
+
+ if (lcore == find_lcore) {
+ lcore_pos = i;
+ break;
+ }
+ }
+
+ if (lcore_pos == RTE_TOPO_DOMAIN_LCORE_POS_MAX)
+ return RTE_MAX_LCORE;
+
+ EAL_LOG(DEBUG, "lcore pos (%u) from domain (%u)", lcore_pos, lcore_domain);
+
+ bool wrap_once = false;
+ unsigned int new_lcore_pos = lcore_pos + 1;
+
+ while (1) {
+ if (new_lcore_pos >= ptr->core_count) {
+ if (!wrap)
+ return RTE_MAX_LCORE;
+
+ if ((wrap == true) && (wrap_once == true))
+ return RTE_MAX_LCORE;
+
+ new_lcore_pos = 0;
+ wrap_once = true;
+ }
+
+ /* check if the domain has cores_to_skip */
+ uint16_t new_lcore = ptr->cores[new_lcore_pos];
+ bool main_in_domain = rte_topo_is_main_lcore_in_domain(flag, lcore_domain);
+
+ if (main_in_domain) {
+ if ((skip_main) && (new_lcore == main_lcore)) {
+ new_lcore_pos++;
+ continue;
+ }
+ }
+
+ EAL_LOG(DEBUG, "Selected core (%u) at position %u", new_lcore, new_lcore_pos);
+ return new_lcore;
+ }
+
+#else
+ RTE_SET_USED(skip_main);
+ RTE_SET_USED(wrap);
+ RTE_SET_USED(flag);
+#endif
+
+ return RTE_MAX_LCORE;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_topo_dump, 26.07)
+void
+rte_topo_dump(FILE *f)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ static const unsigned int domain_types[] = {
+ RTE_TOPO_DOMAIN_NUMA,
+ RTE_TOPO_DOMAIN_L4,
+ RTE_TOPO_DOMAIN_L3,
+ RTE_TOPO_DOMAIN_L2,
+ RTE_TOPO_DOMAIN_L1
+ };
+
+ fprintf(f, "| %15s | %15s | %15s | %15s |\n",
+ "Domain-Name", "Domains", "Domains-with-lcore", "Domain-total-lcore");
+ fprintf(f, "----------------------------------------------------------------------------------------------\n");
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ unsigned int domain = RTE_TOPO_DOMAIN_MAX;
+ unsigned int domain_valid_count = 0;
+ unsigned int domain_valid_lcore_count = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, domain_types[d]) {
+ if (rte_topo_get_lcore_count_from_domain(domain_types[d], domain))
+ domain_valid_count += 1;
+ domain_valid_lcore_count +=
+ rte_topo_get_lcore_count_from_domain(domain_types[d], domain);
+ }
+
+ fprintf(f, "| %15s | %15u | %15u | %15u |\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ rte_topo_get_domain_count(domain_types[d]),
+ domain_valid_count,
+ domain_valid_lcore_count);
+ }
+ fprintf(f, "----------------------------------------------------------------------------------------------\n\n");
+
+ fprintf(f, "| %15s | %15s | %15s |\n",
+ "Domain-Name", "Domain-Index", "lcores");
+ fprintf(f, "----------------------------------------------------------------------------------------------");
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ unsigned int domain = RTE_TOPO_DOMAIN_MAX;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, domain_types[d]) {
+ if (rte_topo_get_lcore_count_from_domain(domain_types[d], domain) == 0)
+ continue;
+
+ fprintf(f, "\n| %15s | %15u | ",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ domain);
+
+ uint16_t lcore = RTE_MAX_LCORE;
+ unsigned int pos = 0;
+ RTE_TOPO_FOREACH_LCORE_IN_DOMAIN(lcore, domain, pos, domain_types[d])
+ fprintf(f, " %u ", lcore);
+ }
+ }
+ fprintf(f, "\n----------------------------------------------------------------------------------------------\n\n");
+
+ fprintf(f, "| %10s | %10s | %10s | %10s | %10s | %10s | %10s |\n",
+ "lcore", "cpu", "NUMA-Index", "L4-Index", "L3-Index", "L2-Index", "L1-Index");
+ fprintf(f, "------------------------------------------------------------------------------\n");
+ for (unsigned int i = 0; i < RTE_MAX_LCORE; i++) {
+ if (rte_lcore_is_enabled(i) == false)
+ continue;
+
+ fprintf(f, "| %10u | %10u | %10u | %10u | %10u | %10u | %10u |\n",
+ i,
+ topo_cnfg.lcore_map[i].cpu,
+ topo_cnfg.lcore_map[i].numa_domain,
+ topo_cnfg.lcore_map[i].l4_domain,
+ topo_cnfg.lcore_map[i].l3_domain,
+ topo_cnfg.lcore_map[i].l2_domain,
+ topo_cnfg.lcore_map[i].l1_domain);
+ }
+ fprintf(f, "------------------------------------------------------------------------------\n\n");
+
+ fprintf(f, "| %10s | %10s | %10s | %10s | %10s | %10s | %10s |\n",
+ "lcore", "cpu", "NUMA-cacheid", "L4-cacheid", "L3-cacheid", "L2-cacheid", "L1-cacheid");
+ fprintf(f, "------------------------------------------------------------------------------\n");
+ for (unsigned int i = 0; i < RTE_MAX_LCORE; i++) {
+ if (rte_lcore_is_enabled(i) == false)
+ continue;
+
+ fprintf(f, "| %10u | %10u | %10u | %10u | %10u | %10u | %10u |\n",
+ i,
+ topo_cnfg.lcore_map[i].cpu,
+ topo_cnfg.lcore_map[i].numa_cacheid,
+ topo_cnfg.lcore_map[i].l4_cacheid,
+ topo_cnfg.lcore_map[i].l3_cacheid,
+ topo_cnfg.lcore_map[i].l2_cacheid,
+ topo_cnfg.lcore_map[i].l1_cacheid);
+ }
+ fprintf(f, "------------------------------------------------------------------------------\n\n");
+
+#else
+ RTE_SET_USED(f);
+#endif
+}
+
+#ifdef RTE_LIBHWLOC_PROBE
+static int
+lcore_to_core(unsigned int lcore)
+{
+ rte_cpuset_t cpu;
+ CPU_ZERO(&cpu);
+
+ cpu = rte_lcore_cpuset(lcore);
+
+ for (int i = 0; i < RTE_TOPO_MAX_CPU_CORES; i++) {
+ if (CPU_ISSET(i, &cpu))
+ return i;
+ }
+
+ return -1;
+}
+
+static int
+eal_topology_map_layer(hwloc_topology_t topology, int depth,
+uint16_t *layer_cnt, struct core_domain_mapping ***layer_ptr,
+uint16_t *total_core_cnt, const char *layer_name)
+{
+ if (depth == HWLOC_TYPE_DEPTH_UNKNOWN || *layer_cnt == 0)
+ return 0;
+
+ *layer_ptr = rte_malloc(NULL, sizeof(struct core_domain_mapping *) * (*layer_cnt), 0);
+ if (*layer_ptr == NULL)
+ return -1;
+
+ /* create lcore-domain-mapping */
+ for (uint16_t j = 0; j < *layer_cnt; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topology, depth, j);
+ int cpu_count = hwloc_bitmap_weight(obj->cpuset);
+ if (cpu_count == -1)
+ continue;
+
+ struct core_domain_mapping *dm =
+ rte_zmalloc(NULL, sizeof(struct core_domain_mapping), 0);
+ if (!dm)
+ return -1;
+
+ (*layer_ptr)[j] = dm;
+ CPU_ZERO(&dm->core_set);
+ dm->core_count = 0;
+
+ dm->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
+ if (!dm->cores)
+ return -1;
+ }
+
+ /* populate lcore-mapping */
+ for (uint16_t j = 0; j < *layer_cnt; j++) {
+ hwloc_obj_t obj = hwloc_get_obj_by_depth(topology, depth, j);
+ if (!obj || hwloc_bitmap_iszero(obj->cpuset))
+ continue;
+
+ int cpu_id = -1;
+ while ((cpu_id = hwloc_bitmap_next(obj->cpuset, cpu_id)) != -1) {
+ if (!rte_lcore_is_enabled(cpu_id))
+ continue;
+
+ EAL_LOG(DEBUG, " %s domain (%u) lcore %u, logical %u, os %u",
+ layer_name, j, cpu_id, obj->logical_index, obj->os_index);
+
+ int cpu_core = lcore_to_core(cpu_id);
+ if (cpu_core == -1)
+ return -1;
+
+ topo_cnfg.lcore_map[cpu_id].cpu = (uint16_t) cpu_core;
+
+ for (uint16_t k = 0; k < *layer_cnt; k++) {
+ hwloc_obj_t obj_core =
+ hwloc_get_obj_by_depth(topology, depth, k);
+ int cpu_count_core =
+ hwloc_bitmap_weight(obj_core->cpuset);
+ if (cpu_count_core == -1)
+ continue;
+
+ if (hwloc_bitmap_isset(obj_core->cpuset,
+				topo_cnfg.lcore_map[cpu_id].cpu)) {
+ if (strncmp(layer_name, "NUMA", 4) == 0) {
+ topo_cnfg.lcore_map[cpu_id].numa_domain = k;
+ topo_cnfg.lcore_map[cpu_id].numa_cacheid =
+ obj_core->logical_index;
+ } else if (strncmp(layer_name, "L4", 2) == 0) {
+ topo_cnfg.lcore_map[cpu_id].l4_domain = k;
+ topo_cnfg.lcore_map[cpu_id].l4_cacheid =
+ obj_core->logical_index;
+ } else if (strncmp(layer_name, "L3", 2) == 0) {
+ topo_cnfg.lcore_map[cpu_id].l3_domain = k;
+ topo_cnfg.lcore_map[cpu_id].l3_cacheid =
+ obj_core->logical_index;
+ } else if (strncmp(layer_name, "L2", 2) == 0) {
+ topo_cnfg.lcore_map[cpu_id].l2_domain = k;
+ topo_cnfg.lcore_map[cpu_id].l2_cacheid =
+ obj_core->logical_index;
+ } else if (strncmp(layer_name, "L1", 2) == 0) {
+ topo_cnfg.lcore_map[cpu_id].l1_domain = k;
+ topo_cnfg.lcore_map[cpu_id].l1_cacheid =
+ obj_core->logical_index;
+ }
+
+ /* populate lcore-domain-mapping */
+ struct core_domain_mapping *dm = (*layer_ptr)[k];
+ if (dm == NULL)
+ return -2;
+
+ dm->cores[dm->core_count++] = (uint16_t)cpu_id;
+ CPU_SET(cpu_id, &dm->core_set);
+
+ (*total_core_cnt)++;
+ break;
+ }
+ }
+ }
+ }
+
+ return 0;
+}
+#endif
+
+/*
+ * Use the hwloc library to parse the L1|L2|L3|L4|NUMA topology of the running machine.
+ * Store the topology structure in memory.
+ */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_eal_topology_init)
+int rte_eal_topology_init(void)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ memset(&topo_cnfg, 0, sizeof(struct topology_config));
+
+ if (hwloc_topology_init(&topo_cnfg.topology) < 0)
+ return -1;
+
+ if (hwloc_topology_load(topo_cnfg.topology) < 0) {
+ hwloc_topology_destroy(topo_cnfg.topology);
+ return -2;
+ }
+
+ struct {
+ int depth;
+ uint16_t *count;
+ struct core_domain_mapping ***ptr;
+ uint16_t *total_cores;
+ const char *name;
+ } layers[] = {
+ { hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L1CACHE),
+ &topo_cnfg.l1_count, &topo_cnfg.l1, &topo_cnfg.l1_core_count, "L1" },
+ { hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L2CACHE),
+ &topo_cnfg.l2_count, &topo_cnfg.l2, &topo_cnfg.l2_core_count, "L2" },
+ { hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L3CACHE),
+ &topo_cnfg.l3_count, &topo_cnfg.l3, &topo_cnfg.l3_core_count, "L3" },
+ { hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_L4CACHE),
+ &topo_cnfg.l4_count, &topo_cnfg.l4, &topo_cnfg.l4_core_count, "L4" },
+ { hwloc_get_type_depth(topo_cnfg.topology, HWLOC_OBJ_NUMANODE),
+ &topo_cnfg.numa_count, &topo_cnfg.numa, &topo_cnfg.numa_core_count, "NUMA" }
+ };
+
+ for (unsigned int i = 0; i < RTE_DIM(layers); i++) {
+ *layers[i].count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, layers[i].depth);
+ if (eal_topology_map_layer(topo_cnfg.topology, layers[i].depth, layers[i].count,
+ layers[i].ptr, layers[i].total_cores, layers[i].name) < 0) {
+ rte_eal_topology_release();
+ return -1;
+ }
+ }
+
+ hwloc_topology_destroy(topo_cnfg.topology);
+ topo_cnfg.topology = NULL;
+#endif
+
+ return 0;
+}
+
+#ifdef RTE_LIBHWLOC_PROBE
+struct domain_store {
+ struct core_domain_mapping **map;
+ uint16_t count;
+ uint16_t core_count;
+ const char *name;
+};
+
+static void
+release_domain(struct domain_store *d)
+{
+ if (!d->map) {
+ d->count = 0;
+ d->core_count = 0;
+ return;
+ }
+
+ for (int i = 0; i < d->count; i++) {
+ if (!d->map[i])
+ continue;
+ rte_free(d->map[i]->cores);
+ d->map[i]->cores = NULL;
+ rte_free(d->map[i]);
+ d->map[i] = NULL;
+ }
+
+ rte_free(d->map);
+ d->map = NULL;
+}
+#endif
+
+/*
+ * release HWLOC topology structure memory
+ */
+RTE_EXPORT_INTERNAL_SYMBOL(rte_eal_topology_release)
+int
+rte_eal_topology_release(void)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+
+ struct domain_store domains[] = {
+ { topo_cnfg.l1, topo_cnfg.l1_count, topo_cnfg.l1_core_count, "L1" },
+ { topo_cnfg.l2, topo_cnfg.l2_count, topo_cnfg.l2_core_count, "L2" },
+ { topo_cnfg.l3, topo_cnfg.l3_count, topo_cnfg.l3_core_count, "L3" },
+ { topo_cnfg.l4, topo_cnfg.l4_count, topo_cnfg.l4_core_count, "L4" },
+ { topo_cnfg.numa, topo_cnfg.numa_count, topo_cnfg.numa_core_count, "NUMA" },
+ };
+
+ for (unsigned int d = 0; d < RTE_DIM(domains); d++) {
+ EAL_LOG(DEBUG, "release %s domain memory", domains[d].name);
+ release_domain(&domains[d]);
+ }
+#endif
+
+ return 0;
+}
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index e273745e93..834ed2130b 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -50,6 +50,7 @@ if not is_windows
'eal_common_trace.c',
'eal_common_trace_ctf.c',
'eal_common_trace_utils.c',
+ 'eal_topology.c',
'hotplug_mp.c',
'malloc_mp.c',
'rte_keepalive.c',
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 60f5e676a8..0d016a379f 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -42,6 +42,8 @@
#include <rte_devargs.h>
#include <rte_version.h>
#include <rte_vfio.h>
+#include <rte_topology.h>
+
#include <malloc_heap.h>
#include <telemetry_internal.h>
@@ -77,7 +79,6 @@ struct lcore_config lcore_config[RTE_MAX_LCORE];
RTE_EXPORT_SYMBOL(rte_cycles_vmware_tsc_map)
int rte_cycles_vmware_tsc_map;
-
int
eal_clean_runtime_dir(void)
{
@@ -754,6 +755,12 @@ rte_eal_init(int argc, char **argv)
goto err_out;
}
+ ret = rte_eal_topology_init();
+ if (ret) {
+ rte_eal_init_alert("Cannot initialize topology, continuing without topology support");
+ rte_errno = ENOTSUP;
+ }
+
eal_mcfg_complete();
return fctret;
@@ -781,6 +788,7 @@ rte_eal_cleanup(void)
eal_get_internal_configuration();
rte_service_finalize();
eal_bus_cleanup();
+ rte_eal_topology_release();
rte_mp_channel_cleanup();
rte_eal_alarm_cleanup();
rte_trace_save();
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index aef5824e5f..16857f76bf 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -50,6 +50,7 @@ headers += files(
'rte_thread.h',
'rte_ticketlock.h',
'rte_time.h',
+ 'rte_topology.h',
'rte_trace.h',
'rte_trace_point.h',
'rte_trace_point_register.h',
diff --git a/lib/eal/include/rte_topology.h b/lib/eal/include/rte_topology.h
new file mode 100644
index 0000000000..1ecee6b031
--- /dev/null
+++ b/lib/eal/include/rte_topology.h
@@ -0,0 +1,255 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#ifndef _RTE_TOPO_TOPO_H_
+#define _RTE_TOPO_TOPO_H_
+
+/**
+ * @file
+ *
+ * API for topology-aware lcore grouping.
+ */
+#include <rte_lcore.h>
+#include <rte_bitops.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * The lcore grouping within the L1 Domain.
+ */
+#define RTE_TOPO_DOMAIN_L1 RTE_BIT32(0)
+/**
+ * The lcore grouping within the L2 Domain.
+ */
+#define RTE_TOPO_DOMAIN_L2 RTE_BIT32(1)
+/**
+ * The lcore grouping within the L3 Domain.
+ */
+#define RTE_TOPO_DOMAIN_L3 RTE_BIT32(2)
+/**
+ * The lcore grouping within the L4 Domain.
+ */
+#define RTE_TOPO_DOMAIN_L4 RTE_BIT32(3)
+/**
+ * The lcore grouping within the NUMA (I/O) Domain.
+ */
+#define RTE_TOPO_DOMAIN_NUMA RTE_BIT32(4)
+/**
+ * The lcore grouping within the SMT Domain (same as the L1 Domain).
+ */
+#define RTE_TOPO_DOMAIN_SMT RTE_TOPO_DOMAIN_L1
+/**
+ * The lcore grouping based on Domains (L1|L2|L3|L4|NUMA).
+ */
+#define RTE_TOPO_DOMAIN_ALL (RTE_TOPO_DOMAIN_L1 | \
+ RTE_TOPO_DOMAIN_L2 | \
+ RTE_TOPO_DOMAIN_L3 | \
+ RTE_TOPO_DOMAIN_L4 | \
+ RTE_TOPO_DOMAIN_NUMA)
+/**
+ * Mask with all domain bits set.
+ */
+#define RTE_TOPO_DOMAIN_MAX RTE_GENMASK32(31, 0)
+#define RTE_TOPO_DOMAIN_LCORE_POS_MAX RTE_GENMASK32(31, 0)
+
+
+/**
+ * Get count for selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ * @return
+ * Number of domains, or 0 if:
+ * - hwloc not available
+ * - Invalid domain selector
+ * - Domain type doesn't exist on system
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+unsigned int rte_topo_get_domain_count(unsigned int domain_sel);
+
+/**
+ * Get count for lcores in a domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_topo_get_domain_count - 1).
+ * @return
+ * Number of enabled lcores in the selected domain.
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_topo_get_lcore_count_from_domain(unsigned int domain_sel, unsigned int domain_indx);
+
+/**
+ * Get domain index using lcore & domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ * @param lcore
+ * valid lcore within valid selected domain.
+ * @return
+ * < 0, invalid domain index
+ * >= 0, valid domain index
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+int
+rte_topo_get_domain_index_from_lcore(unsigned int domain_sel, uint16_t lcore);
+
+/**
+ * Get n'th lcore from a selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_topo_get_domain_count - 1).
+ * @param lcore_pos
+ * lcore position, valid range from 0 to (dpdk_enabled_lcores in the domain -1)
+ * @return
+ * lcore from the list for the selected domain.
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_topo_get_nth_lcore_in_domain(unsigned int domain_sel,
+unsigned int domain_indx, unsigned int lcore_pos);
+
+#ifdef RTE_HAS_CPUSET
+/**
+ * Return cpuset for all lcores in selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_topo_get_domain_count - 1).
+ * @return
+ * cpuset for all lcores from the selected domain.
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+rte_cpuset_t
+rte_topo_get_lcore_cpuset_in_domain(unsigned int domain_sel, unsigned int domain_indx);
+#endif
+
+/**
+ * Check whether the main lcore is available in the selected domain.
+ *
+ * @param domain_sel
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_topo_get_domain_count - 1).
+ * @return
+ * true if the main lcore is in the selected domain, false otherwise.
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+bool
+rte_topo_is_main_lcore_in_domain(unsigned int domain_sel, unsigned int domain_indx);
+
+/**
+ * Get the next enabled lcore in the domain selected by flag, relative to the given lcore.
+ *
+ * @param lcore
+ * The current lcore (reference).
+ * @param skip_main
+ * If true, do not return the ID of the main lcore.
+ * @param wrap
+ * If true, wrap around to the first lcore of the selected domain when the last lcore is reached.
+ * If false, return RTE_MAX_LCORE when no more cores are available.
+ * @param flag
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ *
+ * @return
+ * The next lcore_id or RTE_MAX_LCORE if not found.
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_topo_get_next_lcore(uint16_t lcore,
+bool skip_main, bool wrap, uint32_t flag);
+
+/**
+ * Get the Nth lcore from the selected domain.
+ *
+ * @param domain_indx
+ * Domain Index, valid range from 0 to (rte_topo_get_domain_count - 1).
+ * @param lcore_pos
+ * lcore position, valid range from 0 to (dpdk_enabled_lcores in the domain -1)
+ * @param wrap
+ * If true, wrap around to the first lcore of the selected domain when the last lcore is reached.
+ * If false, return RTE_MAX_LCORE when no more cores are available.
+ * @param flag
+ * Domain selection, RTE_TOPO_DOMAIN_[L1|L2|L3|L4|NUMA].
+ *
+ * @return
+ * The next lcore_id or RTE_MAX_LCORE if not found.
+ *
+ * @note Only lcores enabled via the EAL lcore/coremask arguments are considered.
+ *
+ */
+__rte_experimental
+unsigned int
+rte_topo_get_nth_lcore_from_domain(unsigned int domain_indx, unsigned int lcore_pos,
+bool wrap, uint32_t flag);
+
+/**
+ * Dump an internal topo_config to a file.
+ *
+ * Prints all fields of struct topology_config.
+ *
+ * @param f
+ * A pointer to a file for output
+ */
+__rte_experimental
+void
+rte_topo_dump(FILE *f);
+
+#define RTE_TOPO_FOREACH_DOMAIN(domain_index, flag) \
+ const unsigned int domain_count = rte_topo_get_domain_count(flag); \
+ for (domain_index = 0; domain_index < domain_count; domain_index++)
+
+#define RTE_TOPO_FOREACH_WORKER_DOMAIN(domain_index, flag) \
+ const unsigned int domain_count = rte_topo_get_domain_count(flag); \
+ for (domain_index = (rte_topo_is_main_lcore_in_domain(flag, 0)) ? 1 : 0; \
+ domain_index < domain_count; \
+ domain_index += (rte_topo_is_main_lcore_in_domain(flag, domain_index + 1)) ? 2 : 1)
+
+#define RTE_TOPO_FOREACH_LCORE_IN_DOMAIN(lcore, domain_indx, lcore_pos, flag) \
+ for (lcore_pos = 0, lcore = rte_topo_get_nth_lcore_from_domain(domain_indx, 0, 0, flag); \
+ lcore < RTE_MAX_LCORE; \
+ lcore = rte_topo_get_nth_lcore_from_domain(domain_indx, ++lcore_pos, 0, flag))
+
+#define RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN(lcore, domain_indx, flag) \
+ lcore = rte_topo_get_nth_lcore_from_domain(domain_indx, 0, 0, flag); \
+ uint16_t main_lcore = rte_get_main_lcore(); \
+ for (lcore = (lcore != main_lcore) ? \
+ lcore : rte_topo_get_next_lcore(lcore, 1, 0, flag); \
+ lcore < RTE_MAX_LCORE; \
+ lcore = rte_topo_get_next_lcore(lcore, 1, 0, flag))
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_TOPO_TOPO_H_ */
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index d848de03d8..f6a49badf2 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -42,6 +42,7 @@
#include <rte_version.h>
#include <malloc_heap.h>
#include <rte_vfio.h>
+#include <rte_topology.h>
#include <telemetry_internal.h>
#include <eal_export.h>
@@ -927,6 +928,11 @@ rte_eal_init(int argc, char **argv)
goto err_out;
}
+ if (rte_eal_topology_init()) {
+ rte_eal_init_alert("Cannot initialize topology, continuing without topology support");
+ rte_errno = ENOTSUP;
+ }
+
eal_mcfg_complete();
return fctret;
@@ -981,6 +987,7 @@ rte_eal_cleanup(void)
rte_service_finalize();
eal_bus_cleanup();
vfio_mp_sync_cleanup();
+ rte_eal_topology_release();
rte_mp_channel_cleanup();
rte_eal_alarm_cleanup();
rte_trace_save();
diff --git a/lib/eal/meson.build b/lib/eal/meson.build
index f9fcee24ee..f6cd81ed8e 100644
--- a/lib/eal/meson.build
+++ b/lib/eal/meson.build
@@ -31,3 +31,7 @@ endif
if is_freebsd
annotate_locks = false
endif
+
+if dpdk_conf.has('RTE_LIBHWLOC_PROBE')
+ ext_deps += hwloc_dep
+endif
--
2.43.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
2026-04-14 19:38 ` [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores Vipin Varghese
@ 2026-04-14 19:38 ` Vipin Varghese
2026-04-15 5:21 ` Sudheendra Sampath
2026-04-16 7:22 ` Varghese, Vipin
2026-04-14 19:38 ` [PATCH v5 v5 3/3] doc: add new section topology Vipin Varghese
2026-04-14 20:22 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Stephen Hemminger
3 siblings, 2 replies; 30+ messages in thread
From: Vipin Varghese @ 2026-04-14 19:38 UTC (permalink / raw)
To: dev, sivaprasad.tummala
Cc: konstantin.ananyev, wathsala.vithanage, bruce.richardson,
viktorin, mb
changes:
- rework stack and ring with domain lcores
- add new test cases for topology API
ring_test inspired from 20251024141635.3939617-1-sivaprasad.tummala@amd.com
Suggested-by: Sivaprasad Tummala <sivaprasad.tummala@amd.com>
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
app/test/meson.build | 1 +
app/test/test_ring_perf.c | 416 ++++++++++++++++++++++-
app/test/test_stack_perf.c | 409 ++++++++++++++++++++++
app/test/test_topology.c | 676 +++++++++++++++++++++++++++++++++++++
4 files changed, 1499 insertions(+), 3 deletions(-)
create mode 100644 app/test/test_topology.c
diff --git a/app/test/meson.build b/app/test/meson.build
index 7d458f9c07..f584ea66c1 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -207,6 +207,7 @@ source_file_deps = {
'test_timer_perf.c': ['timer'],
'test_timer_racecond.c': ['timer'],
'test_timer_secondary.c': ['timer'],
+ 'test_topology.c': [],
'test_trace.c': [],
'test_trace_perf.c': [],
'test_trace_register.c': [],
diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index 9a2a481458..1f0fd8a0fa 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -10,6 +10,9 @@
#include <rte_cycles.h>
#include <rte_launch.h>
#include <rte_pause.h>
+#ifdef RTE_LIBHWLOC_PROBE
+#include <rte_topology.h>
+#endif
#include <string.h>
#include "test.h"
@@ -74,7 +77,7 @@ test_ring_print_test_string(unsigned int api_type, int esize,
static int
get_two_hyperthreads(struct lcore_pair *lcp)
{
- unsigned id1, id2;
+ unsigned int id1, id2;
unsigned c1, c2, s1, s2;
RTE_LCORE_FOREACH(id1) {
/* inner loop just re-reads all id's. We could skip the first few
@@ -101,7 +104,7 @@ get_two_hyperthreads(struct lcore_pair *lcp)
static int
get_two_cores(struct lcore_pair *lcp)
{
- unsigned id1, id2;
+ unsigned int id1, id2;
unsigned c1, c2, s1, s2;
RTE_LCORE_FOREACH(id1) {
RTE_LCORE_FOREACH(id2) {
@@ -125,7 +128,7 @@ get_two_cores(struct lcore_pair *lcp)
static int
get_two_sockets(struct lcore_pair *lcp)
{
- unsigned id1, id2;
+ unsigned int id1, id2;
unsigned s1, s2;
RTE_LCORE_FOREACH(id1) {
RTE_LCORE_FOREACH(id2) {
@@ -143,6 +146,359 @@ get_two_sockets(struct lcore_pair *lcp)
return 1;
}
+#ifdef RTE_LIBHWLOC_PROBE
+static int
+get_same_numa_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_NUMA);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_NUMA);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l4_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L4);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+ return 0;
+}
+
+static int
+get_same_l3_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L3);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l2_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L2);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l1_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L1);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_numa_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain,
+ 0, 0, RTE_TOPO_DOMAIN_NUMA);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain,
+ 0, 0, RTE_TOPO_DOMAIN_NUMA);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l4_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+ return 0;
+}
+
+static int
+get_two_l3_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l2_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l1_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+#endif
+
/* Get cycle counts for dequeuing from an empty ring. Should be 2 or 3 cycles */
static void
test_empty_dequeue(struct rte_ring *r, const int esize,
@@ -488,6 +844,60 @@ test_ring_perf_esize_run_on_two_cores(
if (run_on_core_pair(&cores, param1, param2) < 0)
return -1;
}
+#ifdef RTE_LIBHWLOC_PROBE
+ if (rte_lcore_count() > 2) {
+ if (get_same_numa_domains(&cores) == 0) {
+ printf("\n### Testing using same numa domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_same_l4_domains(&cores) == 0) {
+ printf("\n### Testing using same l4 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_same_l3_domains(&cores) == 0) {
+ printf("\n### Testing using same l3 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_same_l2_domains(&cores) == 0) {
+ printf("\n### Testing using same l2 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_same_l1_domains(&cores) == 0) {
+ printf("\n### Testing using same l1 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_two_numa_domains(&cores) == 0) {
+ printf("\n### Testing using two numa domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_two_l4_domains(&cores) == 0) {
+ printf("\n### Testing using two l4 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_two_l3_domains(&cores) == 0) {
+ printf("\n### Testing using two l3 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_two_l2_domains(&cores) == 0) {
+ printf("\n### Testing using two l2 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ if (get_two_l1_domains(&cores) == 0) {
+ printf("\n### Testing using two l1 domain nodes ###\n");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ }
+#endif
return 0;
}
diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c
index 3f17a2606c..e5b038a3e8 100644
--- a/app/test/test_stack_perf.c
+++ b/app/test/test_stack_perf.c
@@ -10,6 +10,9 @@
#include <rte_launch.h>
#include <rte_pause.h>
#include <rte_stack.h>
+#ifdef RTE_LIBHWLOC_PROBE
+#include <rte_topology.h>
+#endif
#include "test.h"
@@ -105,6 +108,367 @@ get_two_sockets(struct lcore_pair *lcp)
return 1;
}
+#ifdef RTE_LIBHWLOC_PROBE
+static int
+get_same_numa_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_NUMA);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_NUMA);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l4_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L4);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l3_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L3);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l2_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L2);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_same_l1_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
+ if (rte_topo_is_main_lcore_in_domain(RTE_TOPO_DOMAIN_L1, domain))
+ continue;
+
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L1);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_numa_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain,
+ 0, 0, RTE_TOPO_DOMAIN_NUMA);
+ continue;
+ }
+
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain,
+ 0, 0, RTE_TOPO_DOMAIN_NUMA);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l4_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l3_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
+ continue;
+ }
+
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l2_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
+ continue;
+ }
+
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_l1_domains(struct lcore_pair *lcp)
+{
+ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
+ continue;
+ }
+
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+#endif
+
+
/* Measure the cycle cost of popping an empty stack. */
static void
test_empty_pop(struct rte_stack *s)
@@ -331,6 +695,51 @@ __test_stack_perf(uint32_t flags)
run_on_core_pair(&cores, s, bulk_push_pop);
}
+#ifdef RTE_LIBHWLOC_PROBE
+ if (rte_lcore_count() > 2) {
+ if (get_same_numa_domains(&cores) == 0) {
+ printf("\n### Testing using same numa domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_same_l4_domains(&cores) == 0) {
+ printf("\n### Testing using same l4 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_same_l3_domains(&cores) == 0) {
+ printf("\n### Testing using same l3 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_same_l2_domains(&cores) == 0) {
+ printf("\n### Testing using same l2 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_same_l1_domains(&cores) == 0) {
+ printf("\n### Testing using same l1 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_two_numa_domains(&cores) == 0) {
+ printf("\n### Testing using two numa domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_two_l4_domains(&cores) == 0) {
+ printf("\n### Testing using two l4 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_two_l3_domains(&cores) == 0) {
+ printf("\n### Testing using two l3 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_two_l2_domains(&cores) == 0) {
+ printf("\n### Testing using two l2 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ if (get_two_l1_domains(&cores) == 0) {
+ printf("\n### Testing using two l1 domain nodes ###\n");
+ run_on_core_pair(&cores, s, bulk_push_pop);
+ }
+ }
+#endif
+
printf("\n### Testing on all %u lcores ###\n", rte_lcore_count());
run_on_n_cores(s, bulk_push_pop, rte_lcore_count());
diff --git a/app/test/test_topology.c b/app/test/test_topology.c
new file mode 100644
index 0000000000..f2244ad807
--- /dev/null
+++ b/app/test/test_topology.c
@@ -0,0 +1,676 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 AMD Corporation
+ */
+
+#include <sched.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_errno.h>
+#include <rte_lcore.h>
+#include <rte_thread.h>
+#include <rte_topology.h>
+
+#include "test.h"
+
+#ifndef _POSIX_PRIORITY_SCHEDULING
+/* sched_yield(2):
+ * POSIX systems on which sched_yield() is available define
+ * _POSIX_PRIORITY_SCHEDULING in <unistd.h>.
+ */
+#define sched_yield()
+#endif
+
+#ifdef RTE_LIBHWLOC_PROBE
+
+static const unsigned int domain_types[] = {
+ RTE_TOPO_DOMAIN_NUMA,
+ RTE_TOPO_DOMAIN_L4,
+ RTE_TOPO_DOMAIN_L3,
+ RTE_TOPO_DOMAIN_L2,
+ RTE_TOPO_DOMAIN_L1
+};
+
+static int
+test_topology_macro(void)
+{
+ unsigned int total_lcores = 0;
+ unsigned int total_wrkr_lcores = 0;
+
+ unsigned int count_lcore = 0;
+ unsigned int total_lcore = 0;
+ unsigned int total_wrkr_lcore = 0;
+
+ unsigned int lcore = 0, pos = 0, domain = 0;
+
+ /* get topology core count */
+ lcore = -1;
+ RTE_LCORE_FOREACH(lcore)
+ total_lcores += 1;
+
+ lcore = -1;
+ RTE_LCORE_FOREACH_WORKER(lcore)
+ total_wrkr_lcores += 1;
+
+ RTE_TEST_ASSERT(((total_wrkr_lcores + 1) == total_lcores),
+ "fail in MACRO for RTE_LCORE_FOREACH\n");
+
+ RTE_LOG(DEBUG, USER1, "Lcore: %u, Lcore Worker: %u\n", total_lcores, total_wrkr_lcores);
+ RTE_LOG(DEBUG, USER1, "| %10s | %10s | %10s | %10s |\n",
+ "domain name", "count", "LCORE", "WORKER");
+ RTE_LOG(DEBUG, USER1, "------------------------------------------------------\n");
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ count_lcore = 0;
+ total_lcore = 0;
+ total_wrkr_lcore = 0;
+ domain = RTE_TOPO_DOMAIN_MAX;
+ RTE_TOPO_FOREACH_DOMAIN(domain, domain_types[d]) {
+ count_lcore +=
+ rte_topo_get_lcore_count_from_domain(domain_types[d], domain);
+
+ lcore = RTE_MAX_LCORE;
+ pos = 0;
+ RTE_TOPO_FOREACH_LCORE_IN_DOMAIN(lcore, domain, pos, domain_types[d])
+ total_lcore += 1;
+
+ /* skip domain */
+ if (rte_topo_is_main_lcore_in_domain(domain, domain_types[d]))
+ continue;
+
+ lcore = RTE_MAX_LCORE;
+ RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN(lcore, domain, domain_types[d]) {
+ total_wrkr_lcore += 1;
+ }
+ }
+
+ if (count_lcore) {
+ RTE_TEST_ASSERT((total_wrkr_lcore < total_lcore),
+ "unexpected workers in %s domain!\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+
+ RTE_LOG(DEBUG, USER1, "| %10s | %10u | %10u | %10u |\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ rte_topo_get_domain_count(domain_types[d]),
+ total_lcore, total_wrkr_lcore);
+ }
+ }
+ RTE_LOG(DEBUG, USER1, "---------------------------------------------------------\n");
+
+ printf("INFO: lcore DOMAIN macro: success!\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_lcore_count_from_domain(void)
+{
+ unsigned int total_lcores = 0;
+ unsigned int total_domain_lcores = 0;
+ unsigned int domain_count;
+ unsigned int i;
+
+ /* get topology core count */
+ total_lcores = rte_lcore_count();
+
+ RTE_LOG(DEBUG, USER1, "| %10s | %10s |\n", "domain", " LCORE");
+ RTE_LOG(DEBUG, USER1, "---------------------------------------\n");
+ RTE_LOG(DEBUG, USER1, "| %10s | %10u |\n", "rte_lcore", total_lcores);
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ total_domain_lcores = 0;
+ domain_count = rte_topo_get_domain_count(domain_types[d]);
+ for (i = 0; i < domain_count; i++)
+ total_domain_lcores +=
+ rte_topo_get_lcore_count_from_domain(domain_types[d], i);
+
+ if (domain_count) {
+ RTE_TEST_ASSERT((total_domain_lcores == total_lcores),
+ "domain %s lcores does not match!\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+
+ RTE_LOG(DEBUG, USER1, "| %10s | %10u |\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ total_domain_lcores);
+ }
+ }
+ RTE_LOG(DEBUG, USER1, "---------------------------------------\n");
+
+ printf("INFO: lcore count domain API: success\n");
+ return TEST_SUCCESS;
+}
+
+#ifdef RTE_HAS_CPUSET
+static int
+test_lcore_cpuset_from_domain(void)
+{
+ unsigned int domain_count;
+ uint16_t dmn_idx;
+ rte_cpuset_t cpu_set_list;
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ domain_count = rte_topo_get_domain_count(domain_types[d]);
+ for (dmn_idx = 0; dmn_idx < domain_count; dmn_idx++) {
+ cpu_set_list = rte_topo_get_lcore_cpuset_in_domain(domain_types[d],
+ dmn_idx);
+
+ for (uint16_t cpu_idx = 0; cpu_idx < RTE_MAX_LCORE; cpu_idx++) {
+ if (CPU_ISSET(cpu_idx, &cpu_set_list))
+ RTE_TEST_ASSERT(rte_lcore_is_enabled(cpu_idx), "%s domain at %u lcore %u not enabled!\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ dmn_idx, cpu_idx);
+
+ }
+ }
+ }
+ printf("INFO: topology cpuset: success!\n");
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ cpu_set_list = rte_topo_get_lcore_cpuset_in_domain(domain_types[d], UINT32_MAX);
+ RTE_TEST_ASSERT((CPU_COUNT(&cpu_set_list) == 0),
+ "lcore not expected for %s domain invalid index!\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+ }
+
+ printf("INFO: cpuset_in_domain API: success!\n");
+ return TEST_SUCCESS;
+}
+#endif
+
+static int
+test_main_lcore_in_domain(void)
+{
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ bool main_lcore_found = false;
+ unsigned int domain_count = rte_topo_get_domain_count(domain_types[d]);
+ for (unsigned int dmn_idx = 0; dmn_idx < domain_count; dmn_idx++) {
+ main_lcore_found = rte_topo_is_main_lcore_in_domain(domain_types[d],
+ dmn_idx);
+ if (main_lcore_found)
+ break;
+ }
+
+ if (domain_count)
+ RTE_TEST_ASSERT((main_lcore_found == true),
+ "main lcore is not found in %s domain!\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+ }
+
+ printf("INFO: is_main_lcore_in_domain API: success!\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_lcore_from_domain_negative(void)
+{
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ const unsigned int domain_count = rte_topo_get_domain_count(domain_types[d]);
+ if (domain_count)
+ RTE_TEST_ASSERT(
+ (rte_topo_get_lcore_count_from_domain(domain_types[d],
+ domain_count) == 0),
+ "domain %s API inconsistent\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+ }
+
+ printf("INFO: lcore domain API: success!\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_wrap_with_skip_main_edge_case(void)
+{
+ const unsigned int main_lcore = rte_get_main_lcore();
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ const unsigned int domain_count = rte_topo_get_domain_count(domain_types[d]);
+ for (unsigned int domain_index = 0; domain_index < domain_count; domain_index++) {
+ unsigned int lcores_in_domain_index =
+ rte_topo_get_lcore_count_from_domain(domain_types[d],
+ domain_index);
+
+ if (lcores_in_domain_index &&
+ (rte_topo_is_main_lcore_in_domain(domain_types[d],
+ domain_index))) {
+
+ if (lcores_in_domain_index == 1)
+ continue;
+
+ for (unsigned int i = 0; i < lcores_in_domain_index; i++) {
+ const uint16_t next_lcore =
+ rte_topo_get_nth_lcore_from_domain(domain_index,
+ i, 0, domain_types[d]);
+
+ RTE_TEST_ASSERT(next_lcore != main_lcore,
+ "expected domain %s, main lcore %u, to be skipped!",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" :
+ NULL,
+ main_lcore);
+ }
+ }
+ }
+ }
+
+ printf("INFO: skip main lcore API: success!\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_invalid_domain_selector(void)
+{
+ unsigned int count;
+ unsigned int lcore;
+
+ /* Test with completely invalid domain selector */
+ count = rte_topo_get_domain_count(0xDEADBEEF);
+ RTE_TEST_ASSERT((count == 0), "Invalid domain selector should return 0 count\n");
+
+ /* Test with 0 (no bits set) */
+ count = rte_topo_get_domain_count(0);
+ RTE_TEST_ASSERT((count == 0), "Zero domain selector should return 0 count\n");
+
+ /* Test count_from_domain with invalid selector */
+ count = rte_topo_get_lcore_count_from_domain(0xBADC0DE, 0);
+ RTE_TEST_ASSERT((count == 0), "Invalid domain should return 0 cores\n");
+
+ /* Test get_lcore_in_domain with invalid selector */
+ lcore = rte_topo_get_nth_lcore_in_domain(0xBADC0DE, 0, 0);
+ RTE_TEST_ASSERT((lcore == RTE_MAX_LCORE), "Invalid domain should return RTE_MAX_LCORE\n");
+
+#ifdef RTE_HAS_CPUSET
+ /* Test cpuset with invalid selector */
+ rte_cpuset_t cpuset = rte_topo_get_lcore_cpuset_in_domain(0xBADC0DE, 0);
+ RTE_TEST_ASSERT((CPU_COUNT(&cpuset) == 0), "Invalid domain should return empty cpuset\n");
+#endif
+
+ printf("INFO: Invalid domain selector test: success\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_multiple_invalid_inputs(void)
+{
+ if (rte_lcore_count() == 1) {
+ printf("INFO: topology MACRO test requires more than 1 core, skipping!\n");
+ return TEST_SKIPPED;
+ }
+
+ /* Test all APIs with multiple types of invalid inputs */
+ unsigned int invalid_domains[] = {
+ 0, /* No bits set */
+ 0xFFFFFFFF, /* All bits set (not a single domain) */
+ 0x80000000, /* Bit outside valid range */
+ 0x12345678, /* Random invalid value */
+ };
+
+ for (unsigned int i = 0; i < RTE_DIM(invalid_domains); i++) {
+ unsigned int domain = invalid_domains[i];
+
+ /* All should return safe defaults */
+ RTE_TEST_ASSERT((rte_topo_get_domain_count(domain) == 0),
+ "Invalid domain 0x%x should have NO count\n", domain);
+ RTE_TEST_ASSERT((rte_topo_get_lcore_count_from_domain(domain, 0) == 0),
+ "Invalid domain 0x%x should have NO cores\n", domain);
+ RTE_TEST_ASSERT((rte_topo_get_nth_lcore_in_domain(domain, 0, 0) == RTE_MAX_LCORE),
+ "Invalid domain 0x%x should return MAX_LCORE\n", domain);
+ }
+
+ printf("INFO: Multiple invalid inputs test: success\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_large_index_values(void)
+{
+ if (rte_lcore_count() == 1) {
+ printf("INFO: topology MACRO test requires more than 1 core, skipping!\n");
+ return TEST_SKIPPED;
+ }
+
+ uint16_t test_lcore = 0;
+ unsigned int large_indices[] = {
+ RTE_MAX_LCORE,
+ RTE_MAX_LCORE + 1,
+ UINT32_MAX,
+ 0x7FFFFFFF,
+ };
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ for (unsigned int i = 0; i < RTE_DIM(large_indices); i++) {
+ unsigned int idx = large_indices[i];
+
+ /* Should all handle gracefully and return safe defaults */
+ test_lcore = rte_topo_get_lcore_count_from_domain(domain_types[d], idx);
+ RTE_TEST_ASSERT(test_lcore == 0,
+ "Large index %u in domain %s should return 0 cores\n",
+ idx,
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+
+ test_lcore = rte_topo_get_nth_lcore_in_domain(domain_types[d], idx, 0);
+ RTE_TEST_ASSERT(test_lcore == RTE_MAX_LCORE,
+ "Large index %u in domain %s should return MAX_LCORE\n",
+ idx,
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+
+#ifdef RTE_HAS_CPUSET
+ rte_cpuset_t cpuset = rte_topo_get_lcore_cpuset_in_domain(domain_types[d],
+ idx);
+ RTE_TEST_ASSERT(CPU_COUNT(&cpuset) == 0,
+ "Large index %u in domain %s should return empty cpuset",
+ idx,
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL);
+#endif
+ }
+ }
+
+ printf("INFO: Large index values test: success\n");
+ return TEST_SUCCESS;
+}
+
+
+static int
+test_domain_next_lcore_no_wrap(void)
+{
+ if (rte_lcore_count() == 1) {
+ printf("INFO: topology MACRO test requires more than 1 core, skipping!\n");
+ return TEST_SKIPPED;
+ }
+
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ const unsigned int domain_count = rte_topo_get_domain_count(domain_types[d]);
+
+ for (unsigned int domain_index = 0; domain_index < domain_count;
+ domain_index++) {
+ unsigned int lcores_in_domain_index =
+ rte_topo_get_lcore_count_from_domain(domain_types[d],
+ domain_index);
+
+ for (unsigned int i = 0; i < lcores_in_domain_index; i++) {
+ const uint16_t curr_lcore =
+ rte_topo_get_nth_lcore_from_domain(domain_index,
+ i, 0, domain_types[d]);
+
+ const uint16_t wrap_lcore =
+ rte_topo_get_nth_lcore_from_domain(domain_index,
+ lcores_in_domain_index + i, 0, domain_types[d]);
+
+ RTE_TEST_ASSERT(wrap_lcore == RTE_MAX_LCORE,
+ "expected domain %s, lcore %u, wrapped lcore %u should be RTE_MAX_LCORE!",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ curr_lcore, wrap_lcore);
+ }
+ }
+ }
+
+ printf("INFO: next lcore in domain test: success\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_domain_next_lcore_wrap(void)
+{
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ const unsigned int domain_count = rte_topo_get_domain_count(domain_types[d]);
+ for (unsigned int domain_index = 0; domain_index < domain_count;
+ domain_index++) {
+ unsigned int lcores_in_domain_index =
+ rte_topo_get_lcore_count_from_domain(domain_types[d],
+ domain_index);
+
+ for (unsigned int i = 0; i < lcores_in_domain_index; i++) {
+ const uint16_t curr_lcore =
+ rte_topo_get_nth_lcore_from_domain(domain_index, i, 0,
+ domain_types[d]);
+ const uint16_t wrap_lcore =
+ rte_topo_get_nth_lcore_from_domain(domain_index,
+ lcores_in_domain_index + i, 1, domain_types[d]);
+
+ RTE_TEST_ASSERT(curr_lcore == wrap_lcore,
+ "expected domain %s, lcore %u, wrapped lcore %u not same!",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : NULL,
+ curr_lcore, wrap_lcore);
+ }
+ }
+ }
+
+ printf("INFO: wrap next lcore in domain test: success\n");
+ return TEST_SUCCESS;
+}
+
+
+static int
+test_multibit_domain_selector(void)
+{
+ const unsigned int bad_sel = RTE_TOPO_DOMAIN_L1 | RTE_TOPO_DOMAIN_L2;
+
+ unsigned int count;
+ unsigned int lcore;
+
+ count = rte_topo_get_domain_count(bad_sel);
+ RTE_TEST_ASSERT(count == 0,
+ "Multi-bit selector should return 0 domains");
+
+ count = rte_topo_get_lcore_count_from_domain(bad_sel, 0);
+ RTE_TEST_ASSERT(count == 0,
+ "Multi-bit selector should return 0 lcores");
+
+ lcore = rte_topo_get_nth_lcore_in_domain(bad_sel, 0, 0);
+ RTE_TEST_ASSERT(lcore == RTE_MAX_LCORE,
+ "Multi-bit selector should return RTE_MAX_LCORE");
+
+#ifdef RTE_HAS_CPUSET
+ rte_cpuset_t cpuset = rte_topo_get_lcore_cpuset_in_domain(bad_sel, 0);
+ RTE_TEST_ASSERT(CPU_COUNT(&cpuset) == 0,
+ "Multi-bit selector should return empty cpuset");
+#endif
+
+ printf("INFO: invalid domain select test: success\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_domain_lcore_round_trip(void)
+{
+ for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
+ unsigned int dom_cnt = rte_topo_get_domain_count(domain_types[d]);
+
+ for (unsigned int i = 0; i < dom_cnt; i++) {
+ unsigned int lcnt =
+ rte_topo_get_lcore_count_from_domain(domain_types[d], i);
+
+ for (unsigned int p = 0; p < lcnt; p++) {
+ uint16_t lcore =
+ rte_topo_get_nth_lcore_in_domain(domain_types[d], i, p);
+
+ int idx =
+ rte_topo_get_domain_index_from_lcore(domain_types[d],
+ lcore);
+
+ RTE_TEST_ASSERT(idx == (int)i,
+ "Round-trip mismatch: domain %u lcore %u -> idx %d",
+ i, lcore, idx);
+ }
+ }
+ }
+
+ printf("INFO: lcore domain cross test: success\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_domain_lcore_ordering(void)
+{
+ unsigned int domain = RTE_TOPO_DOMAIN_L1;
+ if (rte_topo_get_domain_count(domain) == 0)
+ return TEST_SKIPPED;
+
+ unsigned int lcnt = rte_topo_get_lcore_count_from_domain(domain, 0);
+
+ uint16_t prev = 0;
+ bool first = true;
+
+ for (unsigned int i = 0; i < lcnt; i++) {
+ uint16_t cur = rte_topo_get_nth_lcore_in_domain(domain, 0, i);
+
+ if (!first)
+ RTE_TEST_ASSERT(cur > prev, "Lcore ordering not strictly increasing");
+ first = false;
+ prev = cur;
+ }
+
+ printf("INFO: lcores ascending domain test: success\n");
+ return TEST_SUCCESS;
+}
+
+static int
+test_cpuset_matches_lcore_list(void)
+{
+#ifdef RTE_HAS_CPUSET
+ unsigned int domain = RTE_TOPO_DOMAIN_L1;
+ if (rte_topo_get_domain_count(domain) == 0)
+ return TEST_SKIPPED;
+
+ rte_cpuset_t cpuset = rte_topo_get_lcore_cpuset_in_domain(domain, 0);
+
+ unsigned int lcnt = rte_topo_get_lcore_count_from_domain(domain, 0);
+
+ for (unsigned int i = 0; i < lcnt; i++) {
+ uint16_t lc = rte_topo_get_nth_lcore_in_domain(domain, 0, i);
+
+ RTE_TEST_ASSERT(CPU_ISSET(lc, &cpuset),
+ "Cpuset missing lcore %u", lc);
+ }
+
+ RTE_TEST_ASSERT(((unsigned int)CPU_COUNT(&cpuset) == lcnt), "Cpuset contains extra CPUs");
+
+ printf("INFO: cpuset lcore cross test: success\n");
+ return TEST_SUCCESS;
+#else
+ return TEST_SKIPPED;
+#endif
+}
+#endif
+
+static int
+test_topology_lcores(void)
+{
+#ifdef RTE_LIBHWLOC_PROBE
+ printf("\nTopology test\n");
+
+ printf("\nLcore dump mapped to topology\n");
+ rte_topo_dump(stdout);
+ printf("\n\n");
+
+ if (rte_lcore_count() == 1) {
+ RTE_LOG(INFO, USER1, "topology MACRO test requires more than 1 core, skipping!\n");
+ return TEST_SKIPPED;
+ }
+
+ if (test_topology_macro() < 0)
+ return TEST_FAILED;
+
+ if (test_lcore_count_from_domain() < 0)
+ return TEST_FAILED;
+
+ if (test_lcore_from_domain_negative() < 0)
+ return TEST_FAILED;
+
+#ifdef RTE_HAS_CPUSET
+ if (test_lcore_cpuset_from_domain() < 0)
+ return TEST_FAILED;
+#endif
+
+ if (test_main_lcore_in_domain() < 0)
+ return TEST_FAILED;
+
+ if (test_wrap_with_skip_main_edge_case() < 0)
+ return TEST_FAILED;
+
+ if (test_invalid_domain_selector() < 0)
+ return TEST_FAILED;
+
+ if (test_multiple_invalid_inputs() < 0)
+ return TEST_FAILED;
+
+ if (test_large_index_values() < 0)
+ return TEST_FAILED;
+
+ if (test_domain_next_lcore_no_wrap() < 0)
+ return TEST_FAILED;
+
+ if (test_domain_next_lcore_wrap() < 0)
+ return TEST_FAILED;
+
+ if (test_multibit_domain_selector() < 0)
+ return TEST_FAILED;
+
+ if (test_domain_lcore_round_trip() < 0)
+ return TEST_FAILED;
+
+ if (test_domain_lcore_ordering() < 0)
+ return TEST_FAILED;
+
+ if (test_cpuset_matches_lcore_list() < 0)
+ return TEST_FAILED;
+#endif
+
+ return TEST_SUCCESS;
+}
+
+REGISTER_FAST_TEST(topology_autotest, NOHUGE_OK, ASAN_OK, test_topology_lcores);
--
2.43.0
* [PATCH v5 v5 3/3] doc: add new section topology
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
2026-04-14 19:38 ` [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores Vipin Varghese
2026-04-14 19:38 ` [PATCH v5 v5 2/3] app: add topology aware test case Vipin Varghese
@ 2026-04-14 19:38 ` Vipin Varghese
2026-04-14 20:22 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Stephen Hemminger
3 siblings, 0 replies; 30+ messages in thread
From: Vipin Varghese @ 2026-04-14 19:38 UTC (permalink / raw)
To: dev, sivaprasad.tummala
Cc: konstantin.ananyev, wathsala.vithanage, bruce.richardson,
viktorin, mb
changes:
- add new section under utility
- include rte_topo API and usage
Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
---
doc/api/doxy-api-index.md | 1 +
doc/guides/prog_guide/index.rst | 3 +-
doc/guides/prog_guide/topology_lib.rst | 155 +++++++++++++++++++++++++
3 files changed, 158 insertions(+), 1 deletion(-)
create mode 100644 doc/guides/prog_guide/topology_lib.rst
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 9296042119..e8655fb956 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -102,6 +102,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+ [topology](@ref rte_topology.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
[power/freq](@ref rte_power_cpufreq.h),
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index e6f24945b0..8e1153acef 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -133,9 +133,10 @@ Utility Libraries
pcapng_lib
bpf_lib
trace_lib
+ topology_lib
-Howto Guides
+How to Guides
-------------
.. toctree::
diff --git a/doc/guides/prog_guide/topology_lib.rst b/doc/guides/prog_guide/topology_lib.rst
new file mode 100644
index 0000000000..42af7e5793
--- /dev/null
+++ b/doc/guides/prog_guide/topology_lib.rst
@@ -0,0 +1,155 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2026 AMD Inc.
+
+Topology Library
+================
+
+Overview
+--------
+
+The Topology library provides NUMA-aware grouping of DPDK logical cores
+based on CPU cache and I/O topology.
+
+It exposes APIs that allow applications to query topology domains and
+enumerate logical cores within those domains. This enables topology-aware
+core selection for improved locality and performance.
+
+The library integrates with the ``hwloc`` library to obtain hardware
+topology information while maintaining ABI stability.
+
+Motivation
+----------
+
+Application performance can be improved when:
+
+- DPDK libraries and PMDs operate within the same topology domain
+- Cache sharing is maximized in pipeline and graph applications
+- Cache identifiers (L2/L3) are used for:
+ - Data placement
+ - Platform QoS (PQoS) configuration
+
+This library provides a consistent topology view, including support for
+EAL lcore reordering via the ``-R`` option.
+
+Functionality
+-------------
+
+The Topology library provides the following functionality:
+
+- Partitioning of logical cores into topology domains
+- Support for CPU and I/O based domain selection
+- Grouping of lcores by hierarchy levels: L1, L2, L3, L4, IO
+- Reverse lookup from lcore to domain index
+- Helper APIs for lcore and domain iteration
+
+Dependencies
+------------
+
+- ``hwloc-dev``, tested with version 2.10.0
+
+The dependency is used to:
+
+- Discover system topology
+- Group logical cores into DPDK-specific domains
+- Provide stable mappings across EAL configurations
+
+API Overview
+------------
+
+All functions use the ``rte_topo_`` prefix; related constants and flags
+use the ``RTE_TOPO_`` prefix.
+
+Domain Enumeration
+------------------
+
+Get the number of domains for a selected topology type.
+
+.. code-block:: c
+
+ uint32_t
+ rte_topo_get_domain_count(enum rte_topo_domain_sel domain_sel);
+
+Lcore Enumeration
+-----------------
+
+Enumerate logical cores within a topology domain.
+
+.. code-block:: c
+
+ uint32_t
+ rte_topo_get_lcore_count_from_domain(
+ enum rte_topo_domain_sel domain_sel,
+ uint32_t domain_idx);
+
+ unsigned int
+ rte_topo_get_nth_lcore_in_domain(
+ enum rte_topo_domain_sel domain_sel,
+ uint32_t domain_idx,
+ uint32_t lcore_pos);
+
+Iterate over logical cores with optional filtering.
+
+.. code-block:: c
+
+ unsigned int
+ rte_topo_get_next_lcore(
+ unsigned int lcore,
+ bool skip_main,
+ bool wrap,
+ uint32_t flag);
+
+ unsigned int
+ rte_topo_get_nth_lcore_from_domain(
+ uint32_t domain_idx,
+ uint32_t lcore_pos,
+ bool wrap,
+ uint32_t flag);
+
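+As an illustrative sketch (not part of the API itself), the enumeration
+calls above can be combined to list every lcore grouped by L3 domain:
+
+.. code-block:: c
+
+ uint32_t dom_cnt = rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3);
+
+ for (uint32_t d = 0; d < dom_cnt; d++) {
+ uint32_t n = rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, d);
+
+ for (uint32_t p = 0; p < n; p++)
+ printf("L3 domain %u: lcore %u\n", d,
+ rte_topo_get_nth_lcore_in_domain(RTE_TOPO_DOMAIN_L3, d, p));
+ }
+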
+Domain Lookup
+-------------
+
+Query the domain associated with a logical core.
+
+.. code-block:: c
+
+ int
+ rte_topo_get_domain_index_from_lcore(
+ enum rte_topo_domain_sel domain_sel,
+ unsigned int lcore);
+
+Check whether the main lcore belongs to a domain.
+
+.. code-block:: c
+
+ bool
+ rte_topo_is_main_lcore_in_domain(
+ enum rte_topo_domain_sel domain_sel,
+ uint32_t domain_idx);
+
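+For example (an illustrative sketch), an application can locate the L3
+domain of the main lcore and confirm membership:
+
+.. code-block:: c
+
+ unsigned int main_lcore = rte_get_main_lcore();
+ int idx = rte_topo_get_domain_index_from_lcore(RTE_TOPO_DOMAIN_L3, main_lcore);
+
+ if (idx >= 0 && rte_topo_is_main_lcore_in_domain(RTE_TOPO_DOMAIN_L3, idx))
+ printf("main lcore %u is in L3 domain %d\n", main_lcore, idx);
+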
+CPU Set Access
+--------------
+
+Retrieve the CPU set associated with a topology domain.
+
+.. code-block:: c
+
+ rte_cpuset_t
+ rte_topo_get_lcore_cpuset_in_domain(
+ enum rte_topo_domain_sel domain_sel,
+ uint32_t domain_idx);
+
+Debug Support
+-------------
+
+Dump topology information for debugging purposes.
+
+.. code-block:: c
+
+ void
+ rte_topo_dump(FILE *f);
+
+Usage Notes
+-----------
+
+- Domain‑aware lcore selection can reduce remote memory access.
+- Cache‑level domains are suitable for cache‑sensitive workloads.
+- Topology mappings remain stable across EAL lcore configurations.
--
2.43.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
` (2 preceding siblings ...)
2026-04-14 19:38 ` [PATCH v5 v5 3/3] doc: add new section topology Vipin Varghese
@ 2026-04-14 20:22 ` Stephen Hemminger
3 siblings, 0 replies; 30+ messages in thread
From: Stephen Hemminger @ 2026-04-14 20:22 UTC (permalink / raw)
To: Vipin Varghese
Cc: dev, sivaprasad.tummala, konstantin.ananyev, wathsala.vithanage,
bruce.richardson, viktorin, mb
On Wed, 15 Apr 2026 01:08:18 +0530
Vipin Varghese <vipin.varghese@amd.com> wrote:
> This series introduces a topology library that groups DPDK lcores based on
> CPU cache hierarchy and NUMA topology. The goal is to provide a stable and
> explicit API that allows applications to select lcores with better locality
> and cache sharing characteristics.
>
> The series includes:
> - EAL support for topology discovery using hwloc and domain-based lcore
> grouping (L1/L2/L3/L4/NUMA)
> - Topology-aware test cases validating API behavior and edge conditions
> - Programmer’s guide describing the topology library and APIs
>
> The API is marked experimental and does not change existing lcore behavior
> unless explicitly used by the application.
>
> Changes in v5:
> - Addressed review comments from v4
> - Fixed ARM cross-compilation issues
> - Cleaned up domain iteration and error handling
> - Updated tests to cover domain edge cases
> - Documentation refinements and API usage clarification
>
> Changes in v4:
> - Corrected domain selection semantics
> - Updated example usage
> - Fixed naming and typo issues
>
> Changes in v3:
> - Fixed macro naming (USE_NO_TOPOLOGY)
> - Minor cleanups based on early feedback
>
> Tested on:
> - AMD EPYC (Milan, Genoa, Siena, Turin, Turin-Dense, Sorano)
> - Intel Xeon (SPR-SP, GNR-SP)
> - ARM Ampere
> - NVIDIA Grace Superchip
>
> Dependencies:
> - hwloc-dev (tested with 2.10.0)
>
> Patch breakdown:
> 1/3 eal/topology: add topology grouping for lcores
> 2/3 app: add topology-aware test cases
> 3/3 doc: add topology library documentation
>
> Future Work:
> - integrate into examples
> -- helloworld: ready
> -- pkt-distributor: in-progress
> -- l2fwd: ready
> -- l3fwd: to start
> -- eventdevpipeline: PoC ready
> - integrate topology test
> -- crypto: yet to start
> -- compression: yet to start
> -- dma: PoC ready
> - add new features for
> -- PQoS: yet to start
> -- Data Injection: PoC with BRDCM Thor-2 ready
>
> Tested OS: Linux only, need help with BSD and Windows
>
> Tested with and without hwloc-dev library for
> - Ampere, aarch64, Neoverse-N1, NUMA-2, 256 CPU threads
> - Grace superchip, aarch64, Neoverse-V2, NUMA-2, 144 CPU threads
> - Intel GNR-SP, 6767P, NUMA-2, 256 Threads
> - AMD EPYC Siena, 8534P, NUMA-1, 128 Threads
> - AMD EPYC Sorano, 8635P, NUMA-1, 168 Threads
>
> Signed-off-by: Vipin Varghese <vipin.varghese@amd.com>
>
> Vipin Varghese (3):
> eal/topology: add Topology grouping for lcores
> app: add topology aware test case
> doc: add new section topology
>
> app/test/meson.build | 1 +
> app/test/test_ring_perf.c | 416 +++++++++++++-
> app/test/test_stack_perf.c | 409 ++++++++++++++
> app/test/test_topology.c | 676 ++++++++++++++++++++++
> config/meson.build | 18 +
> doc/api/doxy-api-index.md | 1 +
> doc/guides/prog_guide/index.rst | 3 +-
> doc/guides/prog_guide/topology_lib.rst | 155 +++++
> lib/eal/common/eal_private.h | 74 +++
> lib/eal/common/eal_topology.c | 747 +++++++++++++++++++++++++
> lib/eal/common/meson.build | 1 +
> lib/eal/freebsd/eal.c | 10 +-
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_topology.h | 255 +++++++++
> lib/eal/linux/eal.c | 7 +
> lib/eal/meson.build | 4 +
> 16 files changed, 2773 insertions(+), 5 deletions(-)
> create mode 100644 app/test/test_topology.c
> create mode 100644 doc/guides/prog_guide/topology_lib.rst
> create mode 100644 lib/eal/common/eal_topology.c
> create mode 100644 lib/eal/include/rte_topology.h
>
AI review reported lots of possible correctness problems:
# DPDK Patch Review - bundle-1843-topology-v5.mbox
## Overview
This patch series introduces topology awareness to DPDK, allowing applications to select logical cores based on CPU cache and I/O topology. The series adds a new `rte_topology` library with APIs to query and enumerate logical cores within topology domains (L1/L2/L3/L4/NUMA).
---
## Correctness Bugs (HIGHEST PRIORITY - report at >=50% confidence)
### Patch 1/3: lib/eal/common/eal_topology.c
#### **Error: Resource leak on `eal_topology_map_layer()` failure**
In `rte_eal_topology_init()`, if any call to `eal_topology_map_layer()` fails (returns < 0), the code calls `rte_eal_topology_release()` which frees previously allocated layers. However, the `hwloc_topology_t` handle `topo_cnfg.topology` is **NOT** destroyed before returning. This leaks the hwloc topology handle.
**Location:** `lib/eal/common/eal_topology.c:685-712`
```c
for (int i = 0; i < 5; i++) {
*layers[i].count = hwloc_get_nbobjs_by_depth(topo_cnfg.topology, layers[i].depth);
if (eal_topology_map_layer(topo_cnfg.topology, layers[i].depth, layers[i].count,
layers[i].ptr, layers[i].total_cores, layers[i].name) < 0) {
rte_eal_topology_release(); /* frees layer memory */
return -1; /* BUG: topo_cnfg.topology NOT destroyed here */
}
}
hwloc_topology_destroy(topo_cnfg.topology); /* only reached on success */
topo_cnfg.topology = NULL;
```
**Fix:** Destroy the topology before returning on error:
```c
if (eal_topology_map_layer(...) < 0) {
hwloc_topology_destroy(topo_cnfg.topology);
topo_cnfg.topology = NULL;
rte_eal_topology_release();
return -1;
}
```
---
#### **Error: Potential use-after-free in `eal_topology_map_layer()` on partial allocation failure**
In `eal_topology_map_layer()`, the code allocates `dm->cores` for each domain. If a later allocation in the same loop iteration fails (e.g., for domain `j+1`), the function returns -1 immediately **without freeing `dm->cores` already allocated for earlier domains**. The caller (`rte_eal_topology_init()`) then calls `rte_eal_topology_release()`, which expects `layer_ptr[j]` to be non-NULL but `layer_ptr[j]->cores` may be uninitialized or garbage if the allocation failed before that point. This can cause a use-after-free or double-free when `rte_eal_topology_release()` calls `rte_free(d->map[i]->cores)`.
**Location:** `lib/eal/common/eal_topology.c:537-557`
```c
for (uint16_t j = 0; j < *layer_cnt; j++) {
hwloc_obj_t obj = hwloc_get_obj_by_depth(topology, depth, j);
int cpu_count = hwloc_bitmap_weight(obj->cpuset);
if (cpu_count == -1)
continue;
struct core_domain_mapping *dm =
rte_zmalloc(NULL, sizeof(struct core_domain_mapping), 0);
if (!dm)
return -1; /* BUG: leaks layer_ptr array allocated by caller */
(*layer_ptr)[j] = dm;
CPU_ZERO(&dm->core_set);
dm->core_count = 0;
dm->cores = rte_malloc(NULL, sizeof(uint16_t) * cpu_count, 0);
if (!dm->cores)
return -1; /* BUG: leaks dm (just allocated) and previous entries */
}
```
**Fix:** On allocation failure, free all previously allocated entries before returning:
```c
if (!dm) {
/* Free all previously allocated entries */
for (uint16_t k = 0; k < j; k++) {
if ((*layer_ptr)[k]) {
rte_free((*layer_ptr)[k]->cores);
rte_free((*layer_ptr)[k]);
}
}
rte_free(*layer_ptr);
return -1;
}
/* Same cleanup for dm->cores allocation failure */
if (!dm->cores) {
rte_free(dm);
for (uint16_t k = 0; k < j; k++) {
if ((*layer_ptr)[k]) {
rte_free((*layer_ptr)[k]->cores);
rte_free((*layer_ptr)[k]);
}
}
rte_free(*layer_ptr);
return -1;
}
```
---
#### **Error: NULL pointer dereference in `eal_topology_map_layer()` when `hwloc_get_obj_by_depth()` returns NULL**
In the second loop of `eal_topology_map_layer()` (lines 560-620), the code calls `hwloc_get_obj_by_depth(topology, depth, j)` and checks `if (!obj || hwloc_bitmap_iszero(obj->cpuset))` to skip NULL objects. However, in the **inner loop** (lines 579-618), the code calls `hwloc_get_obj_by_depth(topology, depth, k)` and assigns it to `obj_core` but does **NOT** check if `obj_core` is NULL before calling `hwloc_bitmap_weight(obj_core->cpuset)` on line 581. If `hwloc_get_obj_by_depth()` returns NULL for domain `k`, this will dereference a NULL pointer.
**Location:** `lib/eal/common/eal_topology.c:579-582`
```c
for (uint16_t k = 0; k < *layer_cnt; k++) {
hwloc_obj_t obj_core = hwloc_get_obj_by_depth(topology, depth, k);
int cpu_count_core = hwloc_bitmap_weight(obj_core->cpuset); /* NULL deref if obj_core == NULL */
if (cpu_count_core == -1)
continue;
```
**Fix:** Check `obj_core` before dereferencing:
```c
hwloc_obj_t obj_core = hwloc_get_obj_by_depth(topology, depth, k);
if (!obj_core)
continue;
int cpu_count_core = hwloc_bitmap_weight(obj_core->cpuset);
if (cpu_count_core == -1)
continue;
```
---
#### **Error: Incorrect second argument to `rte_topo_get_nth_lcore_from_domain()` in `get_same_l1_domains()` (test_ring_perf.c and test_stack_perf.c)**
In both `app/test/test_ring_perf.c:290` and `app/test/test_stack_perf.c:258`, the function `get_same_l1_domains()` calls:
```c
id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
```
The second argument (`lcore_pos`) is `0`, which is the same as for `id1`. This will assign **the same lcore** to both `id1` and `id2`, causing the subsequent check `if (id1 == id2) return 3;` to always trigger. This is a logic error: the intent is clearly to get two **different** lcores from the same domain.
**Location:** `app/test/test_ring_perf.c:287-290` and `app/test/test_stack_perf.c:255-258`
**Fix:** Use position `1` for the second lcore:
```c
id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L1);
```
---
#### **Error: Iteration condition in `test_main_lcore_in_domain()` uses wrong domain type for lookup**
In `app/test/test_topology.c:211`, the loop iterates over `domain_count` for `domain_types[d]`, but the call to `rte_topo_is_main_lcore_in_domain()` uses `RTE_TOPO_DOMAIN_NUMA` instead of `domain_types[d]`. This means the test only checks the NUMA domain regardless of which domain type `d` selects (L1/L2/L3/L4).
**Location:** `app/test/test_topology.c:206-216`
```c
for (unsigned int d = 0; d < RTE_DIM(domain_types); d++) {
bool main_lcore_found = false;
unsigned int domain_count = rte_topo_get_domain_count(domain_types[d]);
for (unsigned int dmn_idx = 0; dmn_idx < domain_count; dmn_idx++) {
main_lcore_found = rte_topo_is_main_lcore_in_domain(RTE_TOPO_DOMAIN_NUMA, /* BUG: should be domain_types[d] */
dmn_idx);
if (main_lcore_found)
break;
}
```
**Fix:**
```c
main_lcore_found = rte_topo_is_main_lcore_in_domain(domain_types[d], dmn_idx);
```
---
#### **Error: Infinite loop risk in `rte_topo_get_nth_lcore_from_domain()` when `ptr->core_count` is 0**
In `lib/eal/common/eal_topology.c:296-318`, the function enters a `while (1)` loop that increments `new_lcore_pos`. If `ptr->core_count` is 0 (which the code checks earlier but does not return immediately), the loop will wrap `new_lcore_pos` back to 0 indefinitely, never breaking. While the function returns `RTE_MAX_LCORE` if `ptr->core_count == 0` before the loop, the logic flow is unclear and the loop body does not have a clear termination condition if the core count is 0.
**Location:** `lib/eal/common/eal_topology.c:283-318`
**Fix:** Add a sanity check inside the loop to prevent infinite iteration:
```c
unsigned int iterations = 0;
while (1) {
if (iterations++ > ptr->core_count * 2) /* safety limit */
return RTE_MAX_LCORE;
/* ... rest of loop ... */
}
```
However, the real issue is that the code already returns `RTE_MAX_LCORE` if `ptr->core_count == 0` on line 287, so this is more of a defensive-programming note. The function should be refactored for clarity.
---
#### **Error: Missing NULL check after `get_domain_lcore_mapping()` in `rte_topo_get_next_lcore()`**
In `rte_topo_get_next_lcore()`, the code calls `get_domain_lcore_mapping(flag, lcore_domain)` and checks if `ptr` is NULL on line 350. However, if `ptr` is NULL, the function returns `RTE_MAX_LCORE`. This is correct, but the subsequent logic on line 381 calls `rte_topo_is_main_lcore_in_domain(flag, lcore_domain)`, which internally may call `get_domain_lcore_mapping()` again. If that call also returns NULL (which it will if the domain is invalid), the function `rte_topo_is_main_lcore_in_domain()` will return `false`, which is safe. However, the logic is fragile and should explicitly handle the NULL case to avoid relying on transitive safety.
**Location:** `lib/eal/common/eal_topology.c:381`
**Recommendation:** The code is technically safe but could be clearer. No change required, but consider restructuring for maintainability.
---
### Patch 2/3: app/test Topology Tests
#### **Error: Macro `RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN` declares variable in macro expansion (shadowing risk)**
In `lib/eal/include/rte_topology.h:243-248`, the macro `RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN` declares a local variable `main_lcore` inside the macro expansion:
```c
#define RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN(lcore, domain_indx, flag) \
lcore = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, flag); \
uint16_t main_lcore = rte_get_main_lcore(); \
for (lcore = (lcore != main_lcore) ? \
lcore : rte_topo_get_next_lcore(lcore, 1, 0, flag); \
lcore < RTE_MAX_LCORE; \
lcore = rte_topo_get_next_lcore(lcore, 1, 0, flag))
```
This can cause a compiler error or shadowing if the caller already has a variable named `main_lcore` in scope. Additionally, the macro uses `domain` (line 244) but the parameter is `domain_indx`, which is a typo and will cause a compilation error.
**Location:** `lib/eal/include/rte_topology.h:243-248`
**Fix:** Wrap in a `do { } while (0)` and use a uniquely-named variable, or document that the macro must not be used if `main_lcore` is already declared. Also fix the typo:
```c
#define RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN(lcore, domain_indx, flag) \
lcore = rte_topo_get_nth_lcore_from_domain(domain_indx, 0, 0, flag); \
uint16_t __topo_main_lcore = rte_get_main_lcore(); \
for (lcore = (lcore != __topo_main_lcore) ? \
lcore : rte_topo_get_next_lcore(lcore, 1, 0, flag); \
lcore < RTE_MAX_LCORE; \
lcore = rte_topo_get_next_lcore(lcore, 1, 0, flag))
```
---
### Summary of Correctness Bugs
1. **hwloc topology leak** on `eal_topology_map_layer()` failure
2. **Resource leak** in `eal_topology_map_layer()` on partial allocation failure
3. **NULL pointer dereference** in `eal_topology_map_layer()` inner loop
4. **Logic error** in `get_same_l1_domains()` (same lcore assigned to `id1` and `id2`)
5. **Wrong domain type** in `test_main_lcore_in_domain()` (uses `RTE_TOPO_DOMAIN_NUMA` instead of `domain_types[d]`)
6. **Macro typo** in `RTE_TOPO_FOREACH_WORKER_LCORE_IN_DOMAIN` (uses `domain` instead of `domain_indx`)
7. **Potential infinite loop** in `rte_topo_get_nth_lcore_from_domain()` if `ptr->core_count == 0` (mitigated by an early return)
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-14 19:38 ` [PATCH v5 v5 2/3] app: add topology aware test case Vipin Varghese
@ 2026-04-15 5:21 ` Sudheendra Sampath
2026-04-16 7:22 ` Varghese, Vipin
1 sibling, 0 replies; 30+ messages in thread
From: Sudheendra Sampath @ 2026-04-15 5:21 UTC (permalink / raw)
To: vipin.varghese
Cc: bruce.richardson, dev, konstantin.ananyev, mb, sivaprasad.tummala,
viktorin, wathsala.vithanage
Hi Vipin,
This is my first ever patch review with DPDK. Apologies if I am not
following the right procedure. I welcome any feedback or help in
correcting the procedure.
However, I had the following review related to the above patch.
I see that changes to app/test/test_ring_perf.c introduced the
following functions:
get_same_numa_domains(struct lcore_pair *lcp)
get_same_l4_domains(struct lcore_pair *lcp)
get_same_l3_domains(struct lcore_pair *lcp)
get_same_l2_domains(struct lcore_pair *lcp)
get_same_l1_domains(struct lcore_pair *lcp)
get_two_numa_domains(struct lcore_pair *lcp)
get_two_l4_domains(struct lcore_pair *lcp)
get_two_l3_domains(struct lcore_pair *lcp)
get_two_l2_domains(struct lcore_pair *lcp)
get_two_l1_domains(struct lcore_pair *lcp)
In the implementation of these, most of the code is pretty much
identical, except for the topology domain type.
Curious to know if it is better to implement a 'common function' -
get_topo_domains(struct lcore_pair *, topology_domain) - as arguments
and call get_topo_domains() from all of the above functions.
For example:
get_same_numa_domains() calls get_topo_domains(lp,
RTE_TOPO_DOMAIN_NUMA)
This will make it cleaner and will help keep changes isolated to one
place if there are 'common' issues.
Appreciate if you can let me know if my thinking aligns with the
correct implementation.
Thanks and best regards,
Sudheendra G Sampath
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores
2026-04-14 19:38 ` [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores Vipin Varghese
@ 2026-04-15 14:06 ` Morten Brørup
2026-04-15 17:52 ` Varghese, Vipin
0 siblings, 1 reply; 30+ messages in thread
From: Morten Brørup @ 2026-04-15 14:06 UTC (permalink / raw)
To: Vipin Varghese, dev, sivaprasad.tummala
Cc: konstantin.ananyev, wathsala.vithanage, bruce.richardson,
viktorin
> @@ -40,6 +45,63 @@ struct lcore_config {
>
> extern struct lcore_config lcore_config[RTE_MAX_LCORE];
>
> +struct core_domain_mapping {
> + rte_cpuset_t core_set; /**< cpu_set representing lcores within
> domain */
> + uint16_t core_count; /**< dpdk enabled lcores within domain */
> + uint16_t *cores; /**< list of cores */
> +};
In Linux, rte_cpu_set_t is the same as cpu_set_t, which is limited to CPU_SETSIZE (1024) cores.
We should stop using that (fixed size) type, and use dynamically sized CPU sets instead.
Ref: https://man7.org/linux/man-pages/man3/CPU_SET.3.html
IMHO, both the fd_set type, limited to FDSET_SIZE (1024) file descriptors, and the cpu_set_t type, limited to CPU_SETSIZE (1024) cores, should be considered obsolete, and not used in new DPDK code.
It's hard to change old DPDK APIs relying on the fixed size rte_cpu_set_t type, but let's avoid adding new APIs using that (obsolete) type.
Also refer to: https://bugs.dpdk.org/show_bug.cgi?id=1704
PS: I guess a larger CPU_SETSIZE is available for systems built entirely from scratch, including libc and all libraries and applications. Not a generic solution.
-Morten
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores
2026-04-15 14:06 ` Morten Brørup
@ 2026-04-15 17:52 ` Varghese, Vipin
0 siblings, 0 replies; 30+ messages in thread
From: Varghese, Vipin @ 2026-04-15 17:52 UTC (permalink / raw)
To: Morten Brørup, dev@dpdk.org, Tummala, Sivaprasad
Cc: konstantin.ananyev@huawei.com, wathsala.vithanage@arm.com,
bruce.richardson@intel.com, viktorin@cesnet.cz
[AMD Official Use Only - AMD Internal Distribution Only]
Thank you @Morten Brørup
<snipped>
> >
> > +struct core_domain_mapping {
> > + rte_cpuset_t core_set; /**< cpu_set representing lcores within
> > domain */
> > + uint16_t core_count; /**< dpdk enabled lcores within domain */
> > + uint16_t *cores; /**< list of cores */
> > +};
>
> In Linux, rte_cpu_set_t is the same as cpu_set_t, which is limited to CPU_SETSIZE
> (1024) cores.
>
> We should stop using that (fixed size) type, and use dynamically sized CPU sets
> instead.
> Ref: https://man7.org/linux/man-pages/man3/CPU_SET.3.html
>
> IMHO, both the fd_set type, limited to FDSET_SIZE (1024) file descriptors, and the
> cpu_set_t type, limited to CPU_SETSIZE (1024) cores, should be considered
> obsolete, and not used in new DPDK code.
>
> It's hard to change old DPDK APIs relying on the fixed size rte_cpu_set_t type, but
> let's avoid adding new APIs using that (obsolete) type.
>
> Also refer to: https://bugs.dpdk.org/show_bug.cgi?id=1704
>
> PS: I guess a larger CPU_SETSIZE is available for systems built entirely from
> scratch, including libc and all libraries and applications. Not a generic solution.
>
> -Morten
Thanks for the pointer. If I understand this correctly, for DPDK processing `rte_lcore_count` is the valid representation, while hwloc init can show 1024 CPUs or more. Hence step 1 is to identify the number of cores (or execution engines) per domain, then allocate the cpuset and set the bits for the valid lcores.
I will make this change in v6 and omit rte_cpuset_t
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-14 19:38 ` [PATCH v5 v5 2/3] app: add topology aware test case Vipin Varghese
2026-04-15 5:21 ` Sudheendra Sampath
@ 2026-04-16 7:22 ` Varghese, Vipin
2026-04-16 13:19 ` Varghese, Vipin
1 sibling, 1 reply; 30+ messages in thread
From: Varghese, Vipin @ 2026-04-16 7:22 UTC (permalink / raw)
To: Varghese, Vipin, dev@dpdk.org, Tummala, Sivaprasad
Cc: konstantin.ananyev@huawei.com, wathsala.vithanage@arm.com,
bruce.richardson@intel.com, viktorin@cesnet.cz,
mb@smartsharesystems.com
[AMD Official Use Only - AMD Internal Distribution Only]
Snipped
>
> #include "test.h"
> @@ -74,7 +77,7 @@ test_ring_print_test_string(unsigned int api_type, int esize,
> static int
> get_two_hyperthreads(struct lcore_pair *lcp)
> {
> - unsigned id1, id2;
> + unsigned int id1, id2;
> unsigned c1, c2, s1, s2;
> RTE_LCORE_FOREACH(id1) {
> /* inner loop just re-reads all id's. We could skip the first few
> @@ -101,7 +104,7 @@ get_two_hyperthreads(struct lcore_pair *lcp)
> static int
> get_two_cores(struct lcore_pair *lcp)
> {
> - unsigned id1, id2;
> + unsigned int id1, id2;
> unsigned c1, c2, s1, s2;
> RTE_LCORE_FOREACH(id1) {
> RTE_LCORE_FOREACH(id2) {
> @@ -125,7 +128,7 @@ get_two_cores(struct lcore_pair *lcp)
> static int
> get_two_sockets(struct lcore_pair *lcp)
> {
> - unsigned id1, id2;
> + unsigned int id1, id2;
> unsigned s1, s2;
> RTE_LCORE_FOREACH(id1) {
> RTE_LCORE_FOREACH(id2) {
> @@ -143,6 +146,359 @@ get_two_sockets(struct lcore_pair *lcp)
> return 1;
> }
>
> +#ifdef RTE_LIBHWLOC_PROBE
> +static int
> +get_same_numa_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if
> (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) <
> 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_NUMA);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0,
> RTE_TOPO_DOMAIN_NUMA);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_same_l4_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4,
> domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L4);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0,
> RTE_TOPO_DOMAIN_L4);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> + return 0;
> +}
> +
> +static int
> +get_same_l3_domains(struct lcore_pair *lcp)
> +{ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3,
> domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L3);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0,
> RTE_TOPO_DOMAIN_L3);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +
> +}
> +
> +static int
> +get_same_l2_domains(struct lcore_pair *lcp)
> +{ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2,
> domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L2);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0,
> RTE_TOPO_DOMAIN_L2);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +
> +}
> +
> +static int
> +get_same_l1_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1,
> domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L1);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L1);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +
> +}
> +static int
> +get_two_numa_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if
> (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain)
> == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain,
> + 0, 0, RTE_TOPO_DOMAIN_NUMA);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain,
> + 0, 0, RTE_TOPO_DOMAIN_NUMA);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_l4_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4,
> domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L4);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L4);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> + return 0;
> +}
> +
> +static int
> +get_two_l3_domains(struct lcore_pair *lcp)
> +{ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3,
> domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L3);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L3);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +
> +}
> +
> +static int
> +get_two_l2_domains(struct lcore_pair *lcp)
> +{ if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2,
> domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L2);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L2);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +
> +}
> +
> +static int
> +get_two_l1_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1,
> domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L1);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0,
> RTE_TOPO_DOMAIN_L1);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +
> +}
> +#endif
> +
> /* Get cycle counts for dequeuing from an empty ring. Should be 2 or 3 cycles */
> static void
> test_empty_dequeue(struct rte_ring *r, const int esize,
> @@ -488,6 +844,60 @@ test_ring_perf_esize_run_on_two_cores(
> if (run_on_core_pair(&cores, param1, param2) < 0)
> return -1;
> }
> +#ifdef RTE_LIBHWLOC_PROBE
> + if (rte_lcore_count() > 2) {
> + if (get_same_numa_domains(&cores) == 0) {
> + printf("\n### Testing using same numa domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_same_l4_domains(&cores) == 0) {
> + printf("\n### Testing using same l4 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_same_l3_domains(&cores) == 0) {
> + printf("\n### Testing using same l3 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_same_l2_domains(&cores) == 0) {
> + printf("\n### Testing using same l2 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_same_l1_domains(&cores) == 0) {
> + printf("\n### Testing using same l1 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_two_numa_domains(&cores) == 0) {
> + printf("\n### Testing using two numa domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_two_l4_domains(&cores) == 0) {
> + printf("\n### Testing using two l4 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_two_l3_domains(&cores) == 0) {
> + printf("\n### Testing using two l3 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_two_l2_domains(&cores) == 0) {
> + printf("\n### Testing using two l2 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + if (get_two_l1_domains(&cores) == 0) {
> + printf("\n### Testing using two l1 domain nodes ###\n");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + }
> +#endif
> return 0;
> }
>
> diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c
> index 3f17a2606c..e5b038a3e8 100644
> --- a/app/test/test_stack_perf.c
> +++ b/app/test/test_stack_perf.c
> @@ -10,6 +10,9 @@
> #include <rte_launch.h>
> #include <rte_pause.h>
> #include <rte_stack.h>
> +#ifdef RTE_LIBHWLOC_PROBE
> +#include <rte_topology.h>
> +#endif
>
> #include "test.h"
>
> @@ -105,6 +108,367 @@ get_two_sockets(struct lcore_pair *lcp)
> return 1;
> }
>
> +#ifdef RTE_LIBHWLOC_PROBE
> +static int
> +get_same_numa_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_NUMA);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_NUMA);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_same_l4_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4, domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L4);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_same_l3_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L3);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_same_l2_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2, domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L2);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_same_l1_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
> + if (rte_topo_is_main_lcore_in_domain(domain, RTE_TOPO_DOMAIN_L1))
> + continue;
> +
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1, domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, RTE_TOPO_DOMAIN_L1);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_numa_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_NUMA) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_NUMA) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_NUMA, domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain,
> + 0, 0, RTE_TOPO_DOMAIN_NUMA);
> + continue;
> + }
> +
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain,
> + 0, 0, RTE_TOPO_DOMAIN_NUMA);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_l4_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L4) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L4) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L4, domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L4);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_l3_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L3) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L3) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L3, domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
> + continue;
> + }
> +
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L3);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_l2_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L2) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L2) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L2, domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
> + continue;
> + }
> +
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L2);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_l1_domains(struct lcore_pair *lcp)
> +{
> + if (rte_topo_get_domain_count(RTE_TOPO_DOMAIN_L1) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, RTE_TOPO_DOMAIN_L1) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(RTE_TOPO_DOMAIN_L1, domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
> + continue;
> + }
> +
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, RTE_TOPO_DOMAIN_L1);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +#endif
> +
> /* Measure the cycle cost of popping an empty stack. */
> static void
> test_empty_pop(struct rte_stack *s)
> @@ -331,6 +695,51 @@ __test_stack_perf(uint32_t flags)
> run_on_core_pair(&cores, s, bulk_push_pop);
> }
>
> +#ifdef RTE_LIBHWLOC_PROBE
> + if (rte_lcore_count() > 2) {
> + if (get_same_numa_domains(&cores) == 0) {
> + printf("\n### Testing using same numa domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_same_l4_domains(&cores) == 0) {
> + printf("\n### Testing using same l4 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_same_l3_domains(&cores) == 0) {
> + printf("\n### Testing using same l3 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_same_l2_domains(&cores) == 0) {
> + printf("\n### Testing using same l2 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_same_l1_domains(&cores) == 0) {
> + printf("\n### Testing using same l1 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_two_numa_domains(&cores) == 0) {
> + printf("\n### Testing using two numa domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_two_l4_domains(&cores) == 0) {
> + printf("\n### Testing using two l4 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_two_l3_domains(&cores) == 0) {
> + printf("\n### Testing using two l3 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_two_l2_domains(&cores) == 0) {
> + printf("\n### Testing using two l2 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + if (get_two_l1_domains(&cores) == 0) {
> + printf("\n### Testing using two l1 domain nodes ###\n");
> + run_on_core_pair(&cores, s, bulk_push_pop);
> + }
> + }
> +#endif
> +
> printf("\n### Testing on all %u lcores ###\n", rte_lcore_count());
> run_on_n_cores(s, bulk_push_pop, rte_lcore_count());
>
<snipped>
Unable to find `Sudheendra Sampath` in the dpdk mailing list or on Slack. Can you please connect with me via email or Slack?
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-16 7:22 ` Varghese, Vipin
@ 2026-04-16 13:19 ` Varghese, Vipin
2026-04-17 1:21 ` Varghese, Vipin
0 siblings, 1 reply; 30+ messages in thread
From: Varghese, Vipin @ 2026-04-16 13:19 UTC (permalink / raw)
To: dev@dpdk.org, Tummala, Sivaprasad
Cc: konstantin.ananyev@huawei.com, wathsala.vithanage@arm.com,
bruce.richardson@intel.com, viktorin@cesnet.cz,
mb@smartsharesystems.com
[Public]
<snipped>
>
> Unable to find `Sudheendra Sampath` in dpdk mailing list and Slack. Can you please
> connect with me email address or slack.
Hi Sudheendra,
Thank you for the suggestion, and welcome to dpdk review as well.
My initial code design was
```
#ifdef RTE_LIBHWLOC_PROBE
+static int
+get_same_domains(struct lcore_pair *lcp, uint32_t domain_sel)
+{
+ if (rte_topo_get_domain_count(domain_sel) == 0)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, domain_sel) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(domain_sel, domain) < 2)
+ continue;
+
+ id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, domain_sel);
+ id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, domain_sel);
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+
+static int
+get_two_domains(struct lcore_pair *lcp, uint32_t domain_sel)
+{
+ if (rte_topo_get_domain_count(domain_sel) < 2)
+ return 1;
+
+ unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
+ unsigned int domain = 0;
+
+ RTE_TOPO_FOREACH_DOMAIN(domain, domain_sel) {
+ if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
+ break;
+
+ if (rte_topo_get_lcore_count_from_domain(domain_sel, domain) == 0)
+ continue;
+
+ if (id1 == RTE_MAX_LCORE) {
+ id1 = rte_topo_get_nth_lcore_from_domain(domain,
+ 0, 0, domain_sel);
+ continue;
+ }
+ if (id2 == RTE_MAX_LCORE) {
+ id2 = rte_topo_get_nth_lcore_from_domain(domain,
+ 0, 0, domain_sel);
+ continue;
+ }
+ }
+
+ if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
+ return 2;
+
+ if (id1 == id2)
+ return 3;
+
+ lcp->c1 = id1;
+ lcp->c2 = id2;
+
+ return 0;
+}
+#endif
+
/* Get cycle counts for dequeuing from an empty ring. Should be 2 or 3 cycles */
static void
test_empty_dequeue(struct rte_ring *r, const int esize,
@@ -488,6 +574,33 @@ test_ring_perf_esize_run_on_two_cores(
if (run_on_core_pair(&cores, param1, param2) < 0)
return -1;
}
+#ifdef RTE_LIBHWLOC_PROBE
+ if (rte_lcore_count() > 2) {
+ for (uint32_t d = 0; d < RTE_DIM(domain_types); d++) {
+ if (get_same_domains(&cores, domain_types[d]) == 0) {
+ printf("\n### Testing using same %s domain nodes ###\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : "unknown");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+
+ if (get_two_domains(&cores, domain_types[d]) == 0) {
+ printf("\n### Testing using two %s domain nodes ###\n",
+ (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
+ (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : "unknown");
+ if (run_on_core_pair(&cores, param1, param2) < 0)
+ return -1;
+ }
+ }
+ }
+#endif
return 0;
}
```
Since that breaks the current per-function test format, I went ahead with the current approach.
I will re-introduce the domain-iterator design in v6 for you. Please try it out and let me know.
Regards
Vipin Varghese
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-16 13:19 ` Varghese, Vipin
@ 2026-04-17 1:21 ` Varghese, Vipin
2026-04-17 5:19 ` Sudheendra Sampath
0 siblings, 1 reply; 30+ messages in thread
From: Varghese, Vipin @ 2026-04-17 1:21 UTC (permalink / raw)
To: Varghese, Vipin, dev@dpdk.org, Tummala, Sivaprasad,
giveback4fun@gmail.com
Cc: konstantin.ananyev@huawei.com, wathsala.vithanage@arm.com,
bruce.richardson@intel.com, viktorin@cesnet.cz,
mb@smartsharesystems.com
[Public]
Sudheendra and I were able to connect over Slack. Adding his email address to the reply loop.
Sudheendra, please check the reply posted below.
> -----Original Message-----
> From: Varghese, Vipin <Vipin.Varghese@amd.com>
> Sent: Thursday, April 16, 2026 6:50 PM
> To: dev@dpdk.org; Tummala, Sivaprasad <Sivaprasad.Tummala@amd.com>
> Cc: konstantin.ananyev@huawei.com; wathsala.vithanage@arm.com;
> bruce.richardson@intel.com; viktorin@cesnet.cz; mb@smartsharesystems.com
> Subject: RE: [PATCH v5 v5 2/3] app: add topology aware test case
>
> [Public]
>
> <snipped>
> >
> > Unable to find `Sudheendra Sampath` in dpdk mailing list and Slack.
> > Can you please connect with me email address or slack.
>
> Hi Sudheendra,
>
> Thank you for the suggestion and welcome to dpdk review too.
>
> My initial code design were
>
> ```
> #ifdef RTE_LIBHWLOC_PROBE
> +static int
> +get_same_domains(struct lcore_pair *lcp, uint32_t domain_sel)
> +{
> + if (rte_topo_get_domain_count(domain_sel) == 0)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, domain_sel) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(domain_sel, domain) < 2)
> + continue;
> +
> + id1 = rte_topo_get_nth_lcore_from_domain(domain, 0, 0, domain_sel);
> + id2 = rte_topo_get_nth_lcore_from_domain(domain, 1, 0, domain_sel);
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +
> +static int
> +get_two_domains(struct lcore_pair *lcp, uint32_t domain_sel)
> +{
> + if (rte_topo_get_domain_count(domain_sel) < 2)
> + return 1;
> +
> + unsigned int id1 = RTE_MAX_LCORE, id2 = RTE_MAX_LCORE;
> + unsigned int domain = 0;
> +
> + RTE_TOPO_FOREACH_DOMAIN(domain, domain_sel) {
> + if ((id1 != RTE_MAX_LCORE) && (id2 != RTE_MAX_LCORE))
> + break;
> +
> + if (rte_topo_get_lcore_count_from_domain(domain_sel, domain) == 0)
> + continue;
> +
> + if (id1 == RTE_MAX_LCORE) {
> + id1 = rte_topo_get_nth_lcore_from_domain(domain,
> + 0, 0, domain_sel);
> + continue;
> + }
> + if (id2 == RTE_MAX_LCORE) {
> + id2 = rte_topo_get_nth_lcore_from_domain(domain,
> + 0, 0, domain_sel);
> + continue;
> + }
> + }
> +
> + if ((id1 == RTE_MAX_LCORE) || (id2 == RTE_MAX_LCORE))
> + return 2;
> +
> + if (id1 == id2)
> + return 3;
> +
> + lcp->c1 = id1;
> + lcp->c2 = id2;
> +
> + return 0;
> +}
> +#endif
> +
> /* Get cycle counts for dequeuing from an empty ring. Should be 2 or 3 cycles */
> static void
> test_empty_dequeue(struct rte_ring *r, const int esize,
> @@ -488,6 +574,33 @@ test_ring_perf_esize_run_on_two_cores(
> if (run_on_core_pair(&cores, param1, param2) < 0)
> return -1;
> }
> +#ifdef RTE_LIBHWLOC_PROBE
> + if (rte_lcore_count() > 2) {
> + for (uint32_t d = 0; d < RTE_DIM(domain_types); d++) {
> + if (get_same_domains(&cores, domain_types[d]) == 0) {
> + printf("\n### Testing using same %s domain nodes ###\n",
> + (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : "unknown");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> +
> + if (get_two_domains(&cores, domain_types[d]) == 0) {
> + printf("\n### Testing using two %s domain nodes ###\n",
> + (domain_types[d] == RTE_TOPO_DOMAIN_NUMA) ? "NUMA" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L4) ? "L4" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L3) ? "L3" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L2) ? "L2" :
> + (domain_types[d] == RTE_TOPO_DOMAIN_L1) ? "L1" : "unknown");
> + if (run_on_core_pair(&cores, param1, param2) < 0)
> + return -1;
> + }
> + }
> + }
> +#endif
> return 0;
> }
>
> ```
>
> Since it breaks the current function test format, I went ahead with current approach.
> I will re-introduce the domain iterator design in v6 for you. Please try it out and let me
> know.
>
> Regards
> Vipin Varghese
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-17 1:21 ` Varghese, Vipin
@ 2026-04-17 5:19 ` Sudheendra Sampath
2026-04-17 9:55 ` Varghese, Vipin
0 siblings, 1 reply; 30+ messages in thread
From: Sudheendra Sampath @ 2026-04-17 5:19 UTC (permalink / raw)
To: Varghese, Vipin, dev@dpdk.org, Tummala, Sivaprasad
Cc: konstantin.ananyev@huawei.com, wathsala.vithanage@arm.com,
bruce.richardson@intel.com, viktorin@cesnet.cz,
mb@smartsharesystems.com
Hi Vipin,
Pardon my ignorance. I am not sure I understand what you mean by
"current function test format".
Can you let me know where I can read about the current function test
format?
Thanks
Sudheendra Sampath
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: [PATCH v5 v5 2/3] app: add topology aware test case
2026-04-17 5:19 ` Sudheendra Sampath
@ 2026-04-17 9:55 ` Varghese, Vipin
0 siblings, 0 replies; 30+ messages in thread
From: Varghese, Vipin @ 2026-04-17 9:55 UTC (permalink / raw)
To: giveback4fun@gmail.com, dev@dpdk.org, Tummala, Sivaprasad
Cc: konstantin.ananyev@huawei.com, wathsala.vithanage@arm.com,
bruce.richardson@intel.com, viktorin@cesnet.cz,
mb@smartsharesystems.com
[AMD Official Use Only - AMD Internal Distribution Only]
<snipped>
>
>
> Hi Vipin,
>
> Pardon my ignorance. I am not sure I understand what you mean by "current
> function test format".
>
> Can you let me know where I can read about the current function test format ?
I read the test code and understand that each test call is an individual test function invoked from the main function.
There is no iterator or helper function supporting it.
But I already shared my initial design, and the reason why I did not submit it.
As mentioned in my email, I can submit the same design for you in v6.
Let me know if you have any questions.
>
> Thanks
>
> Sudheendra Sampath
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2026-04-17 9:55 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-05 10:28 [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 1/4] eal/lcore: add topology based functions Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 2/4] test/lcore: enable tests for topology Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 3/4] doc: add topology grouping details Vipin Varghese
2024-11-05 10:28 ` [PATCH v4 4/4] examples: update with lcore topology API Vipin Varghese
2025-02-13 3:09 ` [PATCH v4 0/4] Introduce Topology NUMA grouping for lcores Varghese, Vipin
2025-02-13 8:34 ` Thomas Monjalon
2025-02-13 9:18 ` Morten Brørup
2025-03-03 9:06 ` Varghese, Vipin
2025-03-04 10:08 ` Morten Brørup
2025-03-05 7:43 ` Mattias Rönnblom
2025-03-03 8:59 ` Varghese, Vipin
2025-03-17 13:46 ` Jan Viktorin
2025-04-09 10:08 ` Varghese, Vipin
2025-06-03 6:03 ` Varghese, Vipin
2026-01-17 18:57 ` Stephen Hemminger
2026-01-19 14:55 ` [PATCH v4 0/4] Introduce Topology NUMA grouping for cores Varghese, Vipin
2026-04-14 19:38 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Vipin Varghese
2026-04-14 19:38 ` [PATCH v5 v5 1/3] eal/topology: add Topology grouping for lcores Vipin Varghese
2026-04-15 14:06 ` Morten Brørup
2026-04-15 17:52 ` Varghese, Vipin
2026-04-14 19:38 ` [PATCH v5 v5 2/3] app: add topology aware test case Vipin Varghese
2026-04-15 5:21 ` Sudheendra Sampath
2026-04-16 7:22 ` Varghese, Vipin
2026-04-16 13:19 ` Varghese, Vipin
2026-04-17 1:21 ` Varghese, Vipin
2026-04-17 5:19 ` Sudheendra Sampath
2026-04-17 9:55 ` Varghese, Vipin
2026-04-14 19:38 ` [PATCH v5 v5 3/3] doc: add new section topology Vipin Varghese
2026-04-14 20:22 ` [PATCH v5 0/3] eal/topology: introduce topology-aware lcore grouping Stephen Hemminger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox