linux-kernel.vger.kernel.org archive mirror
* [RESEND PATCH v4 0/4] dma-mapping: benchmark: Add support for dma_map_sg
@ 2025-06-14 14:34 Qinxin Xia
  2025-06-14 14:34 ` [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent Qinxin Xia
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Qinxin Xia @ 2025-06-14 14:34 UTC (permalink / raw)
  To: 21cnbao, m.szyprowski, robin.murphy
  Cc: yangyicong, hch, iommu, jonathan.cameron, prime.zeng, fanghao11,
	linux-kernel, linuxarm, xiaqinxin

Modify the framework to adapt to more map modes, add benchmark
support for dma_map_sg, and add support for the sg map mode in the ioctl.

The result:
[root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
dma mapping mode: DMA_MAP_SG_MODE
dma mapping benchmark: threads:8 seconds:30 node:-1 dir:FROM_DEVICE granule/sg_nents: 8
average map latency(us):1.4 standard deviation:0.3
average unmap latency(us):1.3 standard deviation:0.3
[root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
dma mapping mode: DMA_MAP_SINGLE_MODE
dma mapping benchmark: threads:8 seconds:30 node:-1 dir:FROM_DEVICE granule/sg_nents: 8
average map latency(us):1.0 standard deviation:0.3
average unmap latency(us):1.3 standard deviation:0.5

---
Changes since V3:
- Address the comments from Barry: change the mode to a more specific namespace.
- Link: https://lore.kernel.org/all/20250509020238.3378396-1-xiaqinxin@huawei.com/

Changes since V2:
- Address the comments from Barry and Alok: some commit messages and function
  parameter names are modified to make them more accurate.
- Link: https://lore.kernel.org/all/20250506030100.394376-1-xiaqinxin@huawei.com/

Changes since V1:
- Address the comments from Barry: added some comments and changed the unmap return type to void.
- Link: https://lore.kernel.org/lkml/20250212022718.1995504-1-xiaqinxin@huawei.com/


Qinxin Xia (4):
  dma-mapping: benchmark: Add padding to ensure uABI remained consistent
  dma-mapping: benchmark: modify the framework to adapt to more map
    modes
  dma-mapping: benchmark: add support for dma_map_sg
  selftests/dma: Add dma_map_sg support for dma_map_benchmark

 include/linux/map_benchmark.h                 |  46 +++-
 kernel/dma/map_benchmark.c                    | 225 ++++++++++++++++--
 .../testing/selftests/dma/dma_map_benchmark.c |  16 +-
 3 files changed, 252 insertions(+), 35 deletions(-)

-- 
2.33.0



* [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent
  2025-06-14 14:34 [RESEND PATCH v4 0/4] dma-mapping: benchmark: Add support for dma_map_sg Qinxin Xia
@ 2025-06-14 14:34 ` Qinxin Xia
  2025-06-16  9:53   ` Jonathan Cameron
  2025-06-14 14:34 ` [RESEND PATCH v4 2/4] dma-mapping: benchmark: modify the framework to adapt to more map modes Qinxin Xia
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Qinxin Xia @ 2025-06-14 14:34 UTC (permalink / raw)
  To: 21cnbao, m.szyprowski, robin.murphy
  Cc: yangyicong, hch, iommu, jonathan.cameron, prime.zeng, fanghao11,
	linux-kernel, linuxarm, xiaqinxin

The padding field in the structure was previously reserved to
maintain a stable interface for potential new fields, ensuring
compatibility with user-space shared data structures.
However, it was accidentally removed by Tian Tao in a prior commit,
which may lead to incompatibility between user space and the kernel.

This patch reinstates the padding to restore the original structure
layout and preserve compatibility.

Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition")
Cc: stable@vger.kernel.org
Acked-by: Barry Song <baohua@kernel.org>
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
---
 include/linux/map_benchmark.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/map_benchmark.h b/include/linux/map_benchmark.h
index 62674c83bde4..2ac2fe52f248 100644
--- a/include/linux/map_benchmark.h
+++ b/include/linux/map_benchmark.h
@@ -27,5 +27,6 @@ struct map_benchmark {
 	__u32 dma_dir; /* DMA data direction */
 	__u32 dma_trans_ns; /* time for DMA transmission in ns */
 	__u32 granule;  /* how many PAGE_SIZE will do map/unmap once a time */
+	__u8 expansion[76];     /* For future use */
 };
 #endif /* _KERNEL_DMA_BENCHMARK_H */
-- 
2.33.0



* [RESEND PATCH v4 2/4] dma-mapping: benchmark: modify the framework to adapt to more map modes
  2025-06-14 14:34 [RESEND PATCH v4 0/4] dma-mapping: benchmark: Add support for dma_map_sg Qinxin Xia
  2025-06-14 14:34 ` [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent Qinxin Xia
@ 2025-06-14 14:34 ` Qinxin Xia
  2025-06-16 10:05   ` Jonathan Cameron
  2025-06-14 14:34 ` [RESEND PATCH v4 3/4] dma-mapping: benchmark: add support for dma_map_sg Qinxin Xia
  2025-06-14 14:34 ` [RESEND PATCH v4 4/4] selftests/dma: Add dma_map_sg support for dma_map_benchmark Qinxin Xia
  3 siblings, 1 reply; 11+ messages in thread
From: Qinxin Xia @ 2025-06-14 14:34 UTC (permalink / raw)
  To: 21cnbao, m.szyprowski, robin.murphy
  Cc: yangyicong, hch, iommu, jonathan.cameron, prime.zeng, fanghao11,
	linux-kernel, linuxarm, xiaqinxin

In some service scenarios, the performance of dma_map_sg needs to be
tested, so the benchmark should support different map modes. This patch
adjusts the DMA map benchmark framework to make it more flexible and
adaptable to other mapping modes in the future. By abstracting the
framework into four interfaces, prepare, unprepare, do_map, and
do_unmap, a new map mode can be introduced without major modifications
to the existing code structure.

Reviewed-by: Barry Song <baohua@kernel.org>
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
---
 include/linux/map_benchmark.h |   8 ++-
 kernel/dma/map_benchmark.c    | 122 +++++++++++++++++++++++++++-------
 2 files changed, 106 insertions(+), 24 deletions(-)

diff --git a/include/linux/map_benchmark.h b/include/linux/map_benchmark.h
index 2ac2fe52f248..020b3155c262 100644
--- a/include/linux/map_benchmark.h
+++ b/include/linux/map_benchmark.h
@@ -15,6 +15,11 @@
 #define DMA_MAP_TO_DEVICE       1
 #define DMA_MAP_FROM_DEVICE     2
 
+enum {
+	DMA_MAP_BENCH_SINGLE_MODE,
+	DMA_MAP_BENCH_MODE_MAX
+};
+
 struct map_benchmark {
 	__u64 avg_map_100ns; /* average map latency in 100ns */
 	__u64 map_stddev; /* standard deviation of map latency */
@@ -27,6 +32,7 @@ struct map_benchmark {
 	__u32 dma_dir; /* DMA data direction */
 	__u32 dma_trans_ns; /* time for DMA transmission in ns */
 	__u32 granule;  /* how many PAGE_SIZE will do map/unmap once a time */
-	__u8 expansion[76];     /* For future use */
+	__u8  map_mode; /* the mode of dma map */
+	__u8 expansion[75];     /* For future use */
 };
 #endif /* _KERNEL_DMA_BENCHMARK_H */
diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
index cc19a3efea89..05f85cf00c35 100644
--- a/kernel/dma/map_benchmark.c
+++ b/kernel/dma/map_benchmark.c
@@ -5,6 +5,7 @@
 
 #define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt
 
+#include <linux/cleanup.h>
 #include <linux/debugfs.h>
 #include <linux/delay.h>
 #include <linux/device.h>
@@ -31,17 +32,99 @@ struct map_benchmark_data {
 	atomic64_t loops;
 };
 
+struct map_benchmark_ops {
+	void *(*prepare)(struct map_benchmark_data *map);
+	void (*unprepare)(void *mparam);
+	int (*do_map)(void *mparam);
+	void (*do_unmap)(void *mparam);
+};
+
+struct dma_single_map_param {
+	struct device *dev;
+	dma_addr_t addr;
+	void *xbuf;
+	u32 npages;
+	u32 dma_dir;
+};
+
+static void *dma_single_map_benchmark_prepare(struct map_benchmark_data *map)
+{
+	struct dma_single_map_param *params __free(kfree) = kzalloc(sizeof(*params),
+								    GFP_KERNEL);
+	if (!params)
+		return NULL;
+
+	params->npages = map->bparam.granule;
+	params->dma_dir = map->bparam.dma_dir;
+	params->dev = map->dev;
+	params->xbuf = alloc_pages_exact(params->npages * PAGE_SIZE, GFP_KERNEL);
+	if (!params->xbuf)
+		return NULL;
+
+	/*
+	 * for a non-coherent device, if we don't stain them in the
+	 * cache, this will give an underestimate of the real-world
+	 * overhead of BIDIRECTIONAL or TO_DEVICE mappings;
+	 * 66 means everything goes well! 66 is lucky.
+	 */
+	if (params->dma_dir != DMA_FROM_DEVICE)
+		memset(params->xbuf, 0x66, params->npages * PAGE_SIZE);
+
+	return_ptr(params);
+}
+
+static void dma_single_map_benchmark_unprepare(void *mparam)
+{
+	struct dma_single_map_param *params = mparam;
+
+	free_pages_exact(params->xbuf, params->npages * PAGE_SIZE);
+	kfree(params);
+}
+
+static int dma_single_map_benchmark_do_map(void *mparam)
+{
+	struct dma_single_map_param *params = mparam;
+	int ret = 0;
+
+	params->addr = dma_map_single(params->dev, params->xbuf,
+				      params->npages * PAGE_SIZE, params->dma_dir);
+	if (unlikely(dma_mapping_error(params->dev, params->addr))) {
+		pr_err("dma_map_single failed on %s\n", dev_name(params->dev));
+		ret = -ENOMEM;
+	}
+
+	return ret;
+}
+
+static void dma_single_map_benchmark_do_unmap(void *mparam)
+{
+	struct dma_single_map_param *params = mparam;
+
+	dma_unmap_single(params->dev, params->addr,
+			 params->npages * PAGE_SIZE, params->dma_dir);
+}
+
+static struct map_benchmark_ops dma_single_map_benchmark_ops = {
+	.prepare = dma_single_map_benchmark_prepare,
+	.unprepare = dma_single_map_benchmark_unprepare,
+	.do_map = dma_single_map_benchmark_do_map,
+	.do_unmap = dma_single_map_benchmark_do_unmap,
+};
+
+static struct map_benchmark_ops *dma_map_benchmark_ops[DMA_MAP_BENCH_MODE_MAX] = {
+	[DMA_MAP_BENCH_SINGLE_MODE] = &dma_single_map_benchmark_ops,
+};
+
 static int map_benchmark_thread(void *data)
 {
-	void *buf;
-	dma_addr_t dma_addr;
 	struct map_benchmark_data *map = data;
-	int npages = map->bparam.granule;
-	u64 size = npages * PAGE_SIZE;
+	__u8 map_mode = map->bparam.map_mode;
 	int ret = 0;
 
-	buf = alloc_pages_exact(size, GFP_KERNEL);
-	if (!buf)
+	struct map_benchmark_ops *mb_ops = dma_map_benchmark_ops[map_mode];
+	void *mparam = mb_ops->prepare(map);
+
+	if (!mparam)
 		return -ENOMEM;
 
 	while (!kthread_should_stop())  {
@@ -49,23 +132,10 @@ static int map_benchmark_thread(void *data)
 		ktime_t map_stime, map_etime, unmap_stime, unmap_etime;
 		ktime_t map_delta, unmap_delta;
 
-		/*
-		 * for a non-coherent device, if we don't stain them in the
-		 * cache, this will give an underestimate of the real-world
-		 * overhead of BIDIRECTIONAL or TO_DEVICE mappings;
-		 * 66 means evertything goes well! 66 is lucky.
-		 */
-		if (map->dir != DMA_FROM_DEVICE)
-			memset(buf, 0x66, size);
-
 		map_stime = ktime_get();
-		dma_addr = dma_map_single(map->dev, buf, size, map->dir);
-		if (unlikely(dma_mapping_error(map->dev, dma_addr))) {
-			pr_err("dma_map_single failed on %s\n",
-				dev_name(map->dev));
-			ret = -ENOMEM;
+		ret = mb_ops->do_map(mparam);
+		if (ret)
 			goto out;
-		}
 		map_etime = ktime_get();
 		map_delta = ktime_sub(map_etime, map_stime);
 
@@ -73,7 +143,8 @@ static int map_benchmark_thread(void *data)
 		ndelay(map->bparam.dma_trans_ns);
 
 		unmap_stime = ktime_get();
-		dma_unmap_single(map->dev, dma_addr, size, map->dir);
+		mb_ops->do_unmap(mparam);
+
 		unmap_etime = ktime_get();
 		unmap_delta = ktime_sub(unmap_etime, unmap_stime);
 
@@ -108,7 +179,7 @@ static int map_benchmark_thread(void *data)
 	}
 
 out:
-	free_pages_exact(buf, size);
+	mb_ops->unprepare(mparam);
 	return ret;
 }
 
@@ -209,6 +280,11 @@ static long map_benchmark_ioctl(struct file *file, unsigned int cmd,
 
 	switch (cmd) {
 	case DMA_MAP_BENCHMARK:
+		if (map->bparam.map_mode >= DMA_MAP_BENCH_MODE_MAX) {
+			pr_err("invalid map mode\n");
+			return -EINVAL;
+		}
+
 		if (map->bparam.threads == 0 ||
 		    map->bparam.threads > DMA_MAP_MAX_THREADS) {
 			pr_err("invalid thread number\n");
-- 
2.33.0



* [RESEND PATCH v4 3/4] dma-mapping: benchmark: add support for dma_map_sg
  2025-06-14 14:34 [RESEND PATCH v4 0/4] dma-mapping: benchmark: Add support for dma_map_sg Qinxin Xia
  2025-06-14 14:34 ` [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent Qinxin Xia
  2025-06-14 14:34 ` [RESEND PATCH v4 2/4] dma-mapping: benchmark: modify the framework to adapt to more map modes Qinxin Xia
@ 2025-06-14 14:34 ` Qinxin Xia
  2025-06-16 10:15   ` Jonathan Cameron
  2025-06-14 14:34 ` [RESEND PATCH v4 4/4] selftests/dma: Add dma_map_sg support for dma_map_benchmark Qinxin Xia
  3 siblings, 1 reply; 11+ messages in thread
From: Qinxin Xia @ 2025-06-14 14:34 UTC (permalink / raw)
  To: 21cnbao, m.szyprowski, robin.murphy
  Cc: yangyicong, hch, iommu, jonathan.cameron, prime.zeng, fanghao11,
	linux-kernel, linuxarm, xiaqinxin

Add support for DMA scatter-gather mapping, intended for testing
mapping performance. This is achieved by introducing the
dma_sg_map_param structure and related functions, which implement the
scatter-gather mapping preparation, mapping, and unmapping operations.
Additionally, the dma_map_benchmark_ops array is updated to include the
scatter-gather operations. This commit aims to provide a wider range of
mapping performance tests to cater to different scenarios.

Reviewed-by: Barry Song <baohua@kernel.org>
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
---
 include/linux/map_benchmark.h |  43 ++++++++++----
 kernel/dma/map_benchmark.c    | 103 ++++++++++++++++++++++++++++++++++
 2 files changed, 134 insertions(+), 12 deletions(-)

diff --git a/include/linux/map_benchmark.h b/include/linux/map_benchmark.h
index 020b3155c262..9e8ec59e027a 100644
--- a/include/linux/map_benchmark.h
+++ b/include/linux/map_benchmark.h
@@ -17,22 +17,41 @@
 
 enum {
 	DMA_MAP_BENCH_SINGLE_MODE,
+	DMA_MAP_BENCH_SG_MODE,
 	DMA_MAP_BENCH_MODE_MAX
 };
 
+/**
+ * struct map_benchmark - Benchmarking data for DMA mapping operations.
+ * @avg_map_100ns: Average map latency in 100ns.
+ * @map_stddev: Standard deviation of map latency.
+ * @avg_unmap_100ns: Average unmap latency in 100ns.
+ * @unmap_stddev: Standard deviation of unmap latency.
+ * @threads: Number of threads performing map/unmap operations in parallel.
+ * @seconds: Duration of the test in seconds.
+ * @node: NUMA node on which this benchmark will run.
+ * @dma_bits: DMA addressing capability.
+ * @dma_dir: DMA data direction.
+ * @dma_trans_ns: Time for DMA transmission in ns.
+ * @granule: Number of PAGE_SIZE units to map/unmap at once.
+ *	     In SG mode, this represents the number of scatterlist entries.
+ *	     In single mode, this represents the total size of the mapping.
+ * @map_mode: Mode of DMA mapping.
+ * @expansion: Reserved for future use.
+ */
 struct map_benchmark {
-	__u64 avg_map_100ns; /* average map latency in 100ns */
-	__u64 map_stddev; /* standard deviation of map latency */
-	__u64 avg_unmap_100ns; /* as above */
+	__u64 avg_map_100ns;
+	__u64 map_stddev;
+	__u64 avg_unmap_100ns;
 	__u64 unmap_stddev;
-	__u32 threads; /* how many threads will do map/unmap in parallel */
-	__u32 seconds; /* how long the test will last */
-	__s32 node; /* which numa node this benchmark will run on */
-	__u32 dma_bits; /* DMA addressing capability */
-	__u32 dma_dir; /* DMA data direction */
-	__u32 dma_trans_ns; /* time for DMA transmission in ns */
-	__u32 granule;  /* how many PAGE_SIZE will do map/unmap once a time */
-	__u8  map_mode; /* the mode of dma map */
-	__u8 expansion[75];     /* For future use */
+	__u32 threads;
+	__u32 seconds;
+	__s32 node;
+	__u32 dma_bits;
+	__u32 dma_dir;
+	__u32 dma_trans_ns;
+	__u32 granule;
+	__u8  map_mode;
+	__u8 expansion[75];
 };
 #endif /* _KERNEL_DMA_BENCHMARK_H */
diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
index 05f85cf00c35..843be4dc0993 100644
--- a/kernel/dma/map_benchmark.c
+++ b/kernel/dma/map_benchmark.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
+#include <linux/scatterlist.h>
 #include <linux/slab.h>
 #include <linux/timekeeping.h>
 
@@ -111,8 +112,110 @@ static struct map_benchmark_ops dma_single_map_benchmark_ops = {
 	.do_unmap = dma_single_map_benchmark_do_unmap,
 };
 
+struct dma_sg_map_param {
+	struct sg_table sgt;
+	struct device *dev;
+	void **buf;
+	u32 npages;
+	u32 dma_dir;
+};
+
+static void *dma_sg_map_benchmark_prepare(struct map_benchmark_data *map)
+{
+	struct scatterlist *sg;
+	int i;
+
+	struct dma_sg_map_param *params __free(kfree) = kzalloc(sizeof(*params), GFP_KERNEL);
+	if (!params)
+		return NULL;
+
+	/*
+	 * Set the number of scatterlist entries based on the granule.
+	 * In SG mode, 'granule' represents the number of scatterlist entries.
+	 * Each scatterlist entry corresponds to a single page.
+	 */
+	params->npages = map->bparam.granule;
+	params->dma_dir = map->bparam.dma_dir;
+	params->dev = map->dev;
+	params->buf = kmalloc_array(params->npages, sizeof(*params->buf),
+				    GFP_KERNEL);
+	if (!params->buf)
+		goto out;
+
+	if (sg_alloc_table(&params->sgt, params->npages, GFP_KERNEL))
+		goto free_buf;
+
+	for_each_sgtable_sg(&params->sgt, sg, i) {
+		params->buf[i] = (void *)__get_free_page(GFP_KERNEL);
+		if (!params->buf[i])
+			goto free_page;
+
+		if (params->dma_dir != DMA_FROM_DEVICE)
+			memset(params->buf[i], 0x66, PAGE_SIZE);
+
+		sg_set_buf(sg, params->buf[i], PAGE_SIZE);
+	}
+
+	return_ptr(params);
+
+free_page:
+	while (i-- > 0)
+		free_page((unsigned long)params->buf[i]);
+
+	sg_free_table(&params->sgt);
+free_buf:
+	kfree(params->buf);
+out:
+	return NULL;
+}
+
+static void dma_sg_map_benchmark_unprepare(void *mparam)
+{
+	struct dma_sg_map_param *params = mparam;
+	int i;
+
+	for (i = 0; i < params->npages; i++)
+		free_page((unsigned long)params->buf[i]);
+
+	sg_free_table(&params->sgt);
+
+	kfree(params->buf);
+	kfree(params);
+}
+
+static int dma_sg_map_benchmark_do_map(void *mparam)
+{
+	struct dma_sg_map_param *params = mparam;
+	int ret = 0;
+
+	int sg_mapped = dma_map_sg(params->dev, params->sgt.sgl,
+				   params->npages, params->dma_dir);
+	if (!sg_mapped) {
+		pr_err("dma_map_sg failed on %s\n", dev_name(params->dev));
+		ret = -ENOMEM;
+	}
+
+	return ret;
+}
+
+static void dma_sg_map_benchmark_do_unmap(void *mparam)
+{
+	struct dma_sg_map_param *params = mparam;
+
+	dma_unmap_sg(params->dev, params->sgt.sgl, params->npages,
+		     params->dma_dir);
+}
+
+static struct map_benchmark_ops dma_sg_map_benchmark_ops = {
+	.prepare = dma_sg_map_benchmark_prepare,
+	.unprepare = dma_sg_map_benchmark_unprepare,
+	.do_map = dma_sg_map_benchmark_do_map,
+	.do_unmap = dma_sg_map_benchmark_do_unmap,
+};
+
 static struct map_benchmark_ops *dma_map_benchmark_ops[DMA_MAP_BENCH_MODE_MAX] = {
 	[DMA_MAP_BENCH_SINGLE_MODE] = &dma_single_map_benchmark_ops,
+	[DMA_MAP_BENCH_SG_MODE] = &dma_sg_map_benchmark_ops,
 };
 
 static int map_benchmark_thread(void *data)
-- 
2.33.0



* [RESEND PATCH v4 4/4] selftests/dma: Add dma_map_sg support for dma_map_benchmark
  2025-06-14 14:34 [RESEND PATCH v4 0/4] dma-mapping: benchmark: Add support for dma_map_sg Qinxin Xia
                   ` (2 preceding siblings ...)
  2025-06-14 14:34 ` [RESEND PATCH v4 3/4] dma-mapping: benchmark: add support for dma_map_sg Qinxin Xia
@ 2025-06-14 14:34 ` Qinxin Xia
  2025-06-16 10:16   ` Jonathan Cameron
  3 siblings, 1 reply; 11+ messages in thread
From: Qinxin Xia @ 2025-06-14 14:34 UTC (permalink / raw)
  To: 21cnbao, m.szyprowski, robin.murphy
  Cc: yangyicong, hch, iommu, jonathan.cameron, prime.zeng, fanghao11,
	linux-kernel, linuxarm, xiaqinxin

Add support for dma_map_sg and an option '-m' to select the map mode.

i) Users can set option '-m' to select the mode:
   DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE=1
   (the mode is also shown in the test result).
ii) Users can set option '-g' to set sg_nents
    (the total count of entries in the scatterlist);
    the maximum number is 1024. Each sg buffer is PAGE_SIZE in size.
    e.g.
    [root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
    dma mapping mode: DMA_MAP_BENCH_SG_MODE
    dma mapping benchmark: threads:8 seconds:30 node:-1
    dir:FROM_DEVICE granule/sg_nents: 8
    average map latency(us):1.4 standard deviation:0.3
    average unmap latency(us):1.3 standard deviation:0.3
    [root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
    dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
    dma mapping benchmark: threads:8 seconds:30 node:-1
    dir:FROM_DEVICE granule/sg_nents: 8
    average map latency(us):1.0 standard deviation:0.3
    average unmap latency(us):1.3 standard deviation:0.5

Reviewed-by: Barry Song <baohua@kernel.org>
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
---
 tools/testing/selftests/dma/dma_map_benchmark.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/dma/dma_map_benchmark.c b/tools/testing/selftests/dma/dma_map_benchmark.c
index b12f1f9babf8..c37cbc7e8766 100644
--- a/tools/testing/selftests/dma/dma_map_benchmark.c
+++ b/tools/testing/selftests/dma/dma_map_benchmark.c
@@ -27,6 +27,7 @@ int main(int argc, char **argv)
 	int fd, opt;
 	/* default single thread, run 20 seconds on NUMA_NO_NODE */
 	int threads = 1, seconds = 20, node = -1;
+	int map_mode = DMA_MAP_BENCH_SINGLE_MODE;
 	/* default dma mask 32bit, bidirectional DMA */
 	int bits = 32, xdelay = 0, dir = DMA_MAP_BIDIRECTIONAL;
 	/* default granule 1 PAGESIZE */
@@ -34,7 +35,7 @@ int main(int argc, char **argv)
 
 	int cmd = DMA_MAP_BENCHMARK;
 
-	while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:")) != -1) {
+	while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:m:")) != -1) {
 		switch (opt) {
 		case 't':
 			threads = atoi(optarg);
@@ -57,11 +58,20 @@ int main(int argc, char **argv)
 		case 'g':
 			granule = atoi(optarg);
 			break;
+		case 'm':
+			map_mode = atoi(optarg);
+			break;
 		default:
 			return -1;
 		}
 	}
 
+	if (map_mode >= DMA_MAP_BENCH_MODE_MAX) {
+		fprintf(stderr, "invalid map mode, DMA_MAP_BENCH_SINGLE_MODE:%d, DMA_MAP_BENCH_SG_MODE:%d\n",
+			DMA_MAP_BENCH_SINGLE_MODE, DMA_MAP_BENCH_SG_MODE);
+		exit(1);
+	}
+
 	if (threads <= 0 || threads > DMA_MAP_MAX_THREADS) {
 		fprintf(stderr, "invalid number of threads, must be in 1-%d\n",
 			DMA_MAP_MAX_THREADS);
@@ -111,13 +121,15 @@ int main(int argc, char **argv)
 	map.dma_dir = dir;
 	map.dma_trans_ns = xdelay;
 	map.granule = granule;
+	map.map_mode = map_mode;
 
 	if (ioctl(fd, cmd, &map)) {
 		perror("ioctl");
 		exit(1);
 	}
 
-	printf("dma mapping benchmark: threads:%d seconds:%d node:%d dir:%s granule: %d\n",
+	printf("dma mapping mode: %d\n", map_mode);
+	printf("dma mapping benchmark: threads:%d seconds:%d node:%d dir:%s granule/sg_nents: %d\n",
 			threads, seconds, node, dir[directions], granule);
 	printf("average map latency(us):%.1f standard deviation:%.1f\n",
 			map.avg_map_100ns/10.0, map.map_stddev/10.0);
-- 
2.33.0



* Re: [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent
  2025-06-14 14:34 ` [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent Qinxin Xia
@ 2025-06-16  9:53   ` Jonathan Cameron
  2025-06-16 10:40     ` Barry Song
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Cameron @ 2025-06-16  9:53 UTC (permalink / raw)
  To: Qinxin Xia
  Cc: 21cnbao, m.szyprowski, robin.murphy, yangyicong, hch, iommu,
	prime.zeng, fanghao11, linux-kernel, linuxarm

On Sat, 14 Jun 2025 22:34:51 +0800
Qinxin Xia <xiaqinxin@huawei.com> wrote:

> The padding field in the structure was previously reserved to
> maintain a stable interface for potential new fields, ensuring
> compatibility with user-space shared data structures.
> However, it was accidentally removed by Tian Tao in a prior commit,
> which may lead to incompatibility between user space and the kernel.
> 
> This patch reinstates the padding to restore the original structure
> layout and preserve compatibility.
> 
> Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition")
> Cc: stable@vger.kernel.org
> Acked-by: Barry Song <baohua@kernel.org>
> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>

FWIW I checked that the patch above indeed accidentally dropped the padding, and the
structure is copied to userspace, so this fix is correct. Given it's not in a uapi
header, this only really affects the selftest; I think this is mostly a case of there
possibly being out-of-tree tools with a local copy of this structure definition.

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>


* Re: [RESEND PATCH v4 2/4] dma-mapping: benchmark: modify the framework to adapt to more map modes
  2025-06-14 14:34 ` [RESEND PATCH v4 2/4] dma-mapping: benchmark: modify the framework to adapt to more map modes Qinxin Xia
@ 2025-06-16 10:05   ` Jonathan Cameron
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron @ 2025-06-16 10:05 UTC (permalink / raw)
  To: Qinxin Xia
  Cc: 21cnbao, m.szyprowski, robin.murphy, yangyicong, hch, iommu,
	prime.zeng, fanghao11, linux-kernel, linuxarm

On Sat, 14 Jun 2025 22:34:52 +0800
Qinxin Xia <xiaqinxin@huawei.com> wrote:

> In some service scenarios, the performance of dma_map_sg needs to be
> tested, so the benchmark should support different map modes. This patch
> adjusts the DMA map benchmark framework to make it more flexible and
> adaptable to other mapping modes in the future. By abstracting the
> framework into four interfaces, prepare, unprepare, do_map, and
> do_unmap, a new map mode can be introduced without major modifications
> to the existing code structure.
> 
> Reviewed-by: Barry Song <baohua@kernel.org>
> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>

There is what looks like an accidental change in behavior for loops
after the first one.  I think the cache lines will end up clean so
any flush will be just dropping them.  Prior to this patch they
were probably dirty.

Jonathan

>  #endif /* _KERNEL_DMA_BENCHMARK_H */
> diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
> index cc19a3efea89..05f85cf00c35 100644
> --- a/kernel/dma/map_benchmark.c
> +++ b/kernel/dma/map_benchmark.c
> @@ -5,6 +5,7 @@

> +static void *dma_single_map_benchmark_prepare(struct map_benchmark_data *map)
> +{
> +	struct dma_single_map_param *params __free(kfree) = kzalloc(sizeof(*params),
> +								    GFP_KERNEL);
Trivial: I'd split this slightly differently.

	struct dma_single_map_param *params __free(kfree) =
		kzalloc(sizeof(*params), GFP_KERNEL);


> +}

> +
> +static int dma_single_map_benchmark_do_map(void *mparam)
> +{
> +	struct dma_single_map_param *params = mparam;
> +	int ret = 0;
> +
> +	params->addr = dma_map_single(params->dev, params->xbuf,
> +				      params->npages * PAGE_SIZE, params->dma_dir);
> +	if (unlikely(dma_mapping_error(params->dev, params->addr))) {
> +		pr_err("dma_map_single failed on %s\n", dev_name(params->dev));

dev_err() seems more appropriate than passing in the dev to a pr_err.

> +		ret = -ENOMEM;
		return -ENOMEM;
Or better still don't assume the error return of dma_mapping_error()
(even though it is currently only -ENOMEM)

> +	}
> +
	return 0;


would be neater and avoid need for the local variable.
If you add stuff here later in the series then fine to ignore this comment.


> +	return ret;
> +}

>  static int map_benchmark_thread(void *data)
>  {
> -	void *buf;
> -	dma_addr_t dma_addr;
>  	struct map_benchmark_data *map = data;
> -	int npages = map->bparam.granule;
> -	u64 size = npages * PAGE_SIZE;
> +	__u8 map_mode = map->bparam.map_mode;
>  	int ret = 0;
>  
> -	buf = alloc_pages_exact(size, GFP_KERNEL);
> -	if (!buf)
> +	struct map_benchmark_ops *mb_ops = dma_map_benchmark_ops[map_mode];
> +	void *mparam = mb_ops->prepare(map);
> +
> +	if (!mparam)
>  		return -ENOMEM;
>  
>  	while (!kthread_should_stop())  {
> @@ -49,23 +132,10 @@ static int map_benchmark_thread(void *data)
>  		ktime_t map_stime, map_etime, unmap_stime, unmap_etime;
>  		ktime_t map_delta, unmap_delta;
>  
> -		/*
> -		 * for a non-coherent device, if we don't stain them in the
> -		 * cache, this will give an underestimate of the real-world
> -		 * overhead of BIDIRECTIONAL or TO_DEVICE mappings;
> -		 * 66 means evertything goes well! 66 is lucky.
> -		 */
> -		if (map->dir != DMA_FROM_DEVICE)
> -			memset(buf, 0x66, size);

This seems to change the behavior from a memset on every iteration to only once
in the prepare call above.  If that has no effect on what is being benchmarked,
then add a comment about it to the patch description.


> -
>  		map_stime = ktime_get();
> -		dma_addr = dma_map_single(map->dev, buf, size, map->dir);
> -		if (unlikely(dma_mapping_error(map->dev, dma_addr))) {
> -			pr_err("dma_map_single failed on %s\n",
> -				dev_name(map->dev));
> -			ret = -ENOMEM;
> +		ret = mb_ops->do_map(mparam);
> +		if (ret)
>  			goto out;
> -		}
>  		map_etime = ktime_get();
>  		map_delta = ktime_sub(map_etime, map_stime);
>  
> @@ -73,7 +143,8 @@ static int map_benchmark_thread(void *data)
>  		ndelay(map->bparam.dma_trans_ns);
>  
>  		unmap_stime = ktime_get();
> -		dma_unmap_single(map->dev, dma_addr, size, map->dir);
> +		mb_ops->do_unmap(mparam);
> +
>  		unmap_etime = ktime_get();
>  		unmap_delta = ktime_sub(unmap_etime, unmap_stime);
>  
> @@ -108,7 +179,7 @@ static int map_benchmark_thread(void *data)
>  	}
>  
>  out:
> -	free_pages_exact(buf, size);
> +	mb_ops->unprepare(mparam);
>  	return ret;
>  }



* Re: [RESEND PATCH v4 3/4] dma-mapping: benchmark: add support for dma_map_sg
  2025-06-14 14:34 ` [RESEND PATCH v4 3/4] dma-mapping: benchmark: add support for dma_map_sg Qinxin Xia
@ 2025-06-16 10:15   ` Jonathan Cameron
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron @ 2025-06-16 10:15 UTC (permalink / raw)
  To: Qinxin Xia
  Cc: 21cnbao, m.szyprowski, robin.murphy, yangyicong, hch, iommu,
	prime.zeng, fanghao11, linux-kernel, linuxarm

On Sat, 14 Jun 2025 22:34:53 +0800
Qinxin Xia <xiaqinxin@huawei.com> wrote:

> Add support for DMA scatter-gather mapping, intended for testing
> mapping performance. This is achieved by introducing the
> dma_sg_map_param structure and related functions, which implement the
> scatter-gather mapping preparation, mapping, and unmapping operations.
> Additionally, the dma_map_benchmark_ops array is updated to include the
> scatter-gather operations. This commit aims to provide a wider range of
> mapping performance tests to cater to different scenarios.
> 
> Reviewed-by: Barry Song <baohua@kernel.org>
> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
Main thing here is: don't do the kernel-doc format change in a patch that makes
real changes. That hides what is really going on and makes review
harder!

Jonathan

> ---
>  include/linux/map_benchmark.h |  43 ++++++++++----
>  kernel/dma/map_benchmark.c    | 103 ++++++++++++++++++++++++++++++++++
>  2 files changed, 134 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/map_benchmark.h b/include/linux/map_benchmark.h
> index 020b3155c262..9e8ec59e027a 100644
> --- a/include/linux/map_benchmark.h
> +++ b/include/linux/map_benchmark.h
> @@ -17,22 +17,41 @@
>  
>  enum {
>  	DMA_MAP_BENCH_SINGLE_MODE,
> +	DMA_MAP_BENCH_SG_MODE,
>  	DMA_MAP_BENCH_MODE_MAX
>  };
>  
> +/**
> + * struct map_benchmark - Benchmarking data for DMA mapping operations.
> + * @avg_map_100ns: Average map latency in 100ns.
> + * @map_stddev: Standard deviation of map latency.
> + * @avg_unmap_100ns: Average unmap latency in 100ns.
> + * @unmap_stddev: Standard deviation of unmap latency.
> + * @threads: Number of threads performing map/unmap operations in parallel.
> + * @seconds: Duration of the test in seconds.
> + * @node: NUMA node on which this benchmark will run.
> + * @dma_bits: DMA addressing capability.
> + * @dma_dir: DMA data direction.
> + * @dma_trans_ns: Time for DMA transmission in ns.
> + * @granule: Number of PAGE_SIZE units to map/unmap at once.
> +	     In SG mode, this represents the number of scatterlist entries.
> +	     In single mode, this represents the total size of the mapping.
> + * @map_mode: Mode of DMA mapping.
> + * @expansion: Reserved for future use.
> + */
>  struct map_benchmark {
> -	__u64 avg_map_100ns; /* average map latency in 100ns */
> -	__u64 map_stddev; /* standard deviation of map latency */
> -	__u64 avg_unmap_100ns; /* as above */
> +	__u64 avg_map_100ns;

Do this docs format change in a precursor patch.  As done here
we can't easily see what changed when adding the dma_map_sg support.


> +	__u64 map_stddev;
> +	__u64 avg_unmap_100ns;
>  	__u64 unmap_stddev;
> -	__u32 threads; /* how many threads will do map/unmap in parallel */
> -	__u32 seconds; /* how long the test will last */
> -	__s32 node; /* which numa node this benchmark will run on */
> -	__u32 dma_bits; /* DMA addressing capability */
> -	__u32 dma_dir; /* DMA data direction */
> -	__u32 dma_trans_ns; /* time for DMA transmission in ns */
> -	__u32 granule;  /* how many PAGE_SIZE will do map/unmap once a time */
> -	__u8  map_mode; /* the mode of dma map */
> -	__u8 expansion[75];     /* For future use */
> +	__u32 threads;
> +	__u32 seconds;
> +	__s32 node;
> +	__u32 dma_bits;
> +	__u32 dma_dir;
> +	__u32 dma_trans_ns;
> +	__u32 granule;
> +	__u8  map_mode;
> +	__u8 expansion[75];
>  };
>  #endif /* _KERNEL_DMA_BENCHMARK_H */
> diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
> index 05f85cf00c35..843be4dc0993 100644
> --- a/kernel/dma/map_benchmark.c
> +++ b/kernel/dma/map_benchmark.c
> @@ -17,6 +17,7 @@
>  #include <linux/module.h>
>  #include <linux/pci.h>
>  #include <linux/platform_device.h>
> +#include <linux/scatterlist.h>
>  #include <linux/slab.h>
>  #include <linux/timekeeping.h>
>  
> @@ -111,8 +112,110 @@ static struct map_benchmark_ops dma_single_map_benchmark_ops = {
>  	.do_unmap = dma_single_map_benchmark_do_unmap,
>  };
>  
> +struct dma_sg_map_param {
> +	struct sg_table sgt;
> +	struct device *dev;
> +	void **buf;
> +	u32 npages;
> +	u32 dma_dir;
> +};
> +
> +static void *dma_sg_map_benchmark_prepare(struct map_benchmark_data *map)
> +{
> +	struct scatterlist *sg;
> +	int i;
> +
> +	struct dma_sg_map_param *params __free(kfree) = kzalloc(sizeof(*params), GFP_KERNEL);

I'd keep the style of the previous patch and break this long line.

The mix of cleanup.h and gotos is sometimes problematic. I'm not sure I'd
bother with the __free() in this case as the benefit is really small.


> +	if (!params)
> +		return NULL;
> +
> +	/*
> +	 * Set the number of scatterlist entries based on the granule.
> +	 * In SG mode, 'granule' represents the number of scatterlist entries.
> +	 * Each scatterlist entry corresponds to a single page.
> +	 */
> +	params->npages = map->bparam.granule;
> +	params->dma_dir = map->bparam.dma_dir;
> +	params->dev = map->dev;
> +	params->buf = kmalloc_array(params->npages, sizeof(*params->buf),
> +				    GFP_KERNEL);
> +	if (!params->buf)
> +		goto out;
> +
> +	if (sg_alloc_table(&params->sgt, params->npages, GFP_KERNEL))
> +		goto free_buf;
> +
> +	for_each_sgtable_sg(&params->sgt, sg, i) {
> +		params->buf[i] = (void *)__get_free_page(GFP_KERNEL);
> +		if (!params->buf[i])
> +			goto free_page;
> +
> +		if (params->dma_dir != DMA_FROM_DEVICE)
> +			memset(params->buf[i], 0x66, PAGE_SIZE);

As in the previous patch - not sure this has the effect we want if
we are doing multiple loops in the thread.

> +
> +		sg_set_buf(sg, params->buf[i], PAGE_SIZE);
> +	}
> +
> +	return_ptr(params);
> +
> +free_page:
> +	while (i-- > 0)
> +		free_page((unsigned long)params->buf[i]);
> +
> +	sg_free_table(&params->sgt);
> +free_buf:
> +	kfree(params->buf);
> +out:

A label where there is nothing to do is usually better handled
with early returns in the error paths.  Those are easier to
review as the reader doesn't need to go looking for where anything
is done.


> +	return NULL;
> +}

> +static int dma_sg_map_benchmark_do_map(void *mparam)
> +{
> +	struct dma_sg_map_param *params = mparam;
> +	int ret = 0;
> +
> +	int sg_mapped = dma_map_sg(params->dev, params->sgt.sgl,
> +				   params->npages, params->dma_dir);
> +	if (!sg_mapped) {
> +		pr_err("dma_map_sg failed on %s\n", dev_name(params->dev));
> +		ret = -ENOMEM;
Similar comments to those on the previous patch apply here as well.

> +	}
> +
> +	return ret;
> +}




* Re: [RESEND PATCH v4 4/4] selftests/dma: Add dma_map_sg support for dma_map_benchmark
  2025-06-14 14:34 ` [RESEND PATCH v4 4/4] selftests/dma: Add dma_map_sg support for dma_map_benchmark Qinxin Xia
@ 2025-06-16 10:16   ` Jonathan Cameron
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron @ 2025-06-16 10:16 UTC (permalink / raw)
  To: Qinxin Xia
  Cc: 21cnbao, m.szyprowski, robin.murphy, yangyicong, hch, iommu,
	prime.zeng, fanghao11, linux-kernel, linuxarm

On Sat, 14 Jun 2025 22:34:54 +0800
Qinxin Xia <xiaqinxin@huawei.com> wrote:

> Support for dma_map_sg, add option '-m' to distinguish mode.
> 
> i) Users can set option '-m' to select mode:
>    DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE=1
>    (The mode is also show in the test result).

shown

> ii) Users can set option '-g' to set sg_nents
>     (total count of entries in scatterlist)
>     the maximum number is 1024. Each sg buffer is PAGE_SIZE in size.
>     e.g
>     [root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
>     dma mapping mode: DMA_MAP_BENCH_SG_MODE
>     dma mapping benchmark: threads:8 seconds:30 node:-1
>     dir:FROM_DEVICE granule/sg_nents: 8
>     average map latency(us):1.4 standard deviation:0.3
>     average unmap latency(us):1.3 standard deviation:0.3
>     [root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
>     dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
>     dma mapping benchmark: threads:8 seconds:30 node:-1
>     dir:FROM_DEVICE granule/sg_nents: 8
>     average map latency(us):1.0 standard deviation:0.3
>     average unmap latency(us):1.3 standard deviation:0.5
> 
> Reviewed-by: Barry Song <baohua@kernel.org>
> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
Looks fine to me.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

> ---
>  tools/testing/selftests/dma/dma_map_benchmark.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/dma/dma_map_benchmark.c b/tools/testing/selftests/dma/dma_map_benchmark.c
> index b12f1f9babf8..c37cbc7e8766 100644
> --- a/tools/testing/selftests/dma/dma_map_benchmark.c
> +++ b/tools/testing/selftests/dma/dma_map_benchmark.c
> @@ -27,6 +27,7 @@ int main(int argc, char **argv)
>  	int fd, opt;
>  	/* default single thread, run 20 seconds on NUMA_NO_NODE */
>  	int threads = 1, seconds = 20, node = -1;
> +	int map_mode = DMA_MAP_BENCH_SINGLE_MODE;
>  	/* default dma mask 32bit, bidirectional DMA */
>  	int bits = 32, xdelay = 0, dir = DMA_MAP_BIDIRECTIONAL;
>  	/* default granule 1 PAGESIZE */
> @@ -34,7 +35,7 @@ int main(int argc, char **argv)
>  
>  	int cmd = DMA_MAP_BENCHMARK;
>  
> -	while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:")) != -1) {
> +	while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:m:")) != -1) {
>  		switch (opt) {
>  		case 't':
>  			threads = atoi(optarg);
> @@ -57,11 +58,20 @@ int main(int argc, char **argv)
>  		case 'g':
>  			granule = atoi(optarg);
>  			break;
> +		case 'm':
> +			map_mode = atoi(optarg);
> +			break;
>  		default:
>  			return -1;
>  		}
>  	}
>  
> +	if (map_mode >= DMA_MAP_BENCH_MODE_MAX) {
> +		fprintf(stderr, "invalid map mode, DMA_MAP_BENCH_SINGLE_MODE:%d, DMA_MAP_BENCH_SG_MODE:%d\n",
> +			DMA_MAP_BENCH_SINGLE_MODE, DMA_MAP_BENCH_SG_MODE);
> +		exit(1);
> +	}
> +
>  	if (threads <= 0 || threads > DMA_MAP_MAX_THREADS) {
>  		fprintf(stderr, "invalid number of threads, must be in 1-%d\n",
>  			DMA_MAP_MAX_THREADS);
> @@ -111,13 +121,15 @@ int main(int argc, char **argv)
>  	map.dma_dir = dir;
>  	map.dma_trans_ns = xdelay;
>  	map.granule = granule;
> +	map.map_mode = map_mode;
>  
>  	if (ioctl(fd, cmd, &map)) {
>  		perror("ioctl");
>  		exit(1);
>  	}
>  
> -	printf("dma mapping benchmark: threads:%d seconds:%d node:%d dir:%s granule: %d\n",
> +	printf("dma mapping mode: %d\n", map_mode);
> +	printf("dma mapping benchmark: threads:%d seconds:%d node:%d dir:%s granule/sg_nents: %d\n",
>  			threads, seconds, node, dir[directions], granule);
>  	printf("average map latency(us):%.1f standard deviation:%.1f\n",
>  			map.avg_map_100ns/10.0, map.map_stddev/10.0);



* Re: [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent
  2025-06-16  9:53   ` Jonathan Cameron
@ 2025-06-16 10:40     ` Barry Song
  2025-06-23 12:01       ` Marek Szyprowski
  0 siblings, 1 reply; 11+ messages in thread
From: Barry Song @ 2025-06-16 10:40 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Qinxin Xia, m.szyprowski, robin.murphy, yangyicong, hch, iommu,
	prime.zeng, fanghao11, linux-kernel, linuxarm

On Mon, Jun 16, 2025 at 9:53 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Sat, 14 Jun 2025 22:34:51 +0800
> Qinxin Xia <xiaqinxin@huawei.com> wrote:
>
> > The padding field in the structure was previously reserved to
> > maintain a stable interface for potential new fields, ensuring
> > compatibility with user-space shared data structures.
> > However, it was accidentally removed by tiantao in a prior commit,
> > which may lead to incompatibility between user space and the kernel.
> >
> > This patch reinstates the padding to restore the original structure
> > layout and preserve compatibility.
> >
> > Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition")
> > Cc: stable@vger.kernel.org
> > Acked-by: Barry Song <baohua@kernel.org>
> > Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
>
> FWIW I checked: the patch above indeed accidentally dropped the padding, and the structure
> is copied to userspace, so this fix is correct.  Given it's not in a uapi header this
> only really affects the selftest; I think this is mostly a case of there possibly being
> out-of-tree tools with a local copy of this structure definition.

Somehow, I feel we have placed
tools/testing/selftests/dma/dma_map_benchmark.c in the wrong location.
As a selftest, it should have a mechanism to check kernel dependencies,
start properly and automatically, and report pass or fail.

dma_map_benchmark.c seems more like a tool that belongs in tools/dma,
rather than a test.

>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

Thanks
Barry


* Re: [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent
  2025-06-16 10:40     ` Barry Song
@ 2025-06-23 12:01       ` Marek Szyprowski
  0 siblings, 0 replies; 11+ messages in thread
From: Marek Szyprowski @ 2025-06-23 12:01 UTC (permalink / raw)
  To: Barry Song, Jonathan Cameron
  Cc: Qinxin Xia, robin.murphy, yangyicong, hch, iommu, prime.zeng,
	fanghao11, linux-kernel, linuxarm

On 16.06.2025 12:40, Barry Song wrote:
> On Mon, Jun 16, 2025 at 9:53 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
>> On Sat, 14 Jun 2025 22:34:51 +0800
>> Qinxin Xia <xiaqinxin@huawei.com> wrote:
>>> The padding field in the structure was previously reserved to
>>> maintain a stable interface for potential new fields, ensuring
>>> compatibility with user-space shared data structures.
>>> However, it was accidentally removed by tiantao in a prior commit,
>>> which may lead to incompatibility between user space and the kernel.
>>>
>>> This patch reinstates the padding to restore the original structure
>>> layout and preserve compatibility.
>>>
>>> Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition")
>>> Cc: stable@vger.kernel.org
>>> Acked-by: Barry Song <baohua@kernel.org>
>>> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
>> FWIW I checked: the patch above indeed accidentally dropped the padding, and the structure
>> is copied to userspace, so this fix is correct.  Given it's not in a uapi header this
>> only really affects the selftest; I think this is mostly a case of there possibly being
>> out-of-tree tools with a local copy of this structure definition.
> Somehow, I feel we have placed
> tools/testing/selftests/dma/dma_map_benchmark.c in the wrong location.
> As a selftest, it should have a mechanism to check kernel dependencies,
> start properly and automatically, and report pass or fail.
>
> dma_map_benchmark.c seems more like a tool that belongs in tools/dma,
> rather than a test.

Indeed imho it would be better to move it out of the selftests directory.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



end of thread, other threads:[~2025-06-23 12:01 UTC | newest]

Thread overview: 11+ messages
2025-06-14 14:34 [RESEND PATCH v4 0/4] dma-mapping: benchmark: Add support for dma_map_sg Qinxin Xia
2025-06-14 14:34 ` [RESEND PATCH v4 1/4] dma-mapping: benchmark: Add padding to ensure uABI remained consistent Qinxin Xia
2025-06-16  9:53   ` Jonathan Cameron
2025-06-16 10:40     ` Barry Song
2025-06-23 12:01       ` Marek Szyprowski
2025-06-14 14:34 ` [RESEND PATCH v4 2/4] dma-mapping: benchmark: modify the framework to adapt to more map modes Qinxin Xia
2025-06-16 10:05   ` Jonathan Cameron
2025-06-14 14:34 ` [RESEND PATCH v4 3/4] dma-mapping: benchmark: add support for dma_map_sg Qinxin Xia
2025-06-16 10:15   ` Jonathan Cameron
2025-06-14 14:34 ` [RESEND PATCH v4 4/4] selftests/dma: Add dma_map_sg support for dma_map_benchmark Qinxin Xia
2025-06-16 10:16   ` Jonathan Cameron
