linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] dma-mapping: Trace more error paths
@ 2024-10-17 18:13 Sean Anderson
  2024-10-17 18:13 ` [PATCH 1/3] dma-mapping: Trace dma_alloc/free direction Sean Anderson
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Sean Anderson @ 2024-10-17 18:13 UTC (permalink / raw)
  To: Christoph Hellwig, iommu
  Cc: linux-trace-kernel, Masami Hiramatsu, linux-kernel, Robin Murphy,
	Mathieu Desnoyers, Steven Rostedt, Marek Szyprowski,
	Sean Anderson

Some DMA functions are not traced when they fail. I found this pretty
confusing, since it seems like the device skips calling the DMA function
and fails anyway. This series adds some additional tracepoints to
address this.


Sean Anderson (3):
  dma-mapping: Trace dma_alloc/free direction
  dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using
    trace_dma_map
  dma-mapping: Trace more error paths

 include/trace/events/dma.h | 161 +++++++++++++++++++++++++++++++++++--
 kernel/dma/mapping.c       |  37 ++++++---
 2 files changed, 178 insertions(+), 20 deletions(-)

-- 
2.35.1.1320.gc452695387.dirty


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] dma-mapping: Trace dma_alloc/free direction
  2024-10-17 18:13 [PATCH 0/3] dma-mapping: Trace more error paths Sean Anderson
@ 2024-10-17 18:13 ` Sean Anderson
  2024-10-17 18:13 ` [PATCH 2/3] dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using trace_dma_map Sean Anderson
  2024-10-17 18:13 ` [PATCH 3/3] dma-mapping: Trace more error paths Sean Anderson
  2 siblings, 0 replies; 7+ messages in thread
From: Sean Anderson @ 2024-10-17 18:13 UTC (permalink / raw)
  To: Christoph Hellwig, iommu
  Cc: linux-trace-kernel, Masami Hiramatsu, linux-kernel, Robin Murphy,
	Mathieu Desnoyers, Steven Rostedt, Marek Szyprowski,
	Sean Anderson

In preparation for using these tracepoints in a few more places, trace
the DMA direction as well. For coherent allocations this is always
bidirectional.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

 include/trace/events/dma.h | 18 ++++++++++++------
 kernel/dma/mapping.c       |  6 ++++--
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
index b0f41265191c..012729cc178f 100644
--- a/include/trace/events/dma.h
+++ b/include/trace/events/dma.h
@@ -116,8 +116,9 @@ DEFINE_EVENT(dma_unmap, dma_unmap_resource,
 
 TRACE_EVENT(dma_alloc,
 	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
-		 size_t size, gfp_t flags, unsigned long attrs),
-	TP_ARGS(dev, virt_addr, dma_addr, size, flags, attrs),
+		 size_t size, enum dma_data_direction dir, gfp_t flags,
+		 unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs),
 
 	TP_STRUCT__entry(
 		__string(device, dev_name(dev))
@@ -125,6 +126,7 @@ TRACE_EVENT(dma_alloc,
 		__field(u64, dma_addr)
 		__field(size_t, size)
 		__field(gfp_t, flags)
+		__field(enum dma_data_direction, dir)
 		__field(unsigned long, attrs)
 	),
 
@@ -137,8 +139,9 @@ TRACE_EVENT(dma_alloc,
 		__entry->attrs = attrs;
 	),
 
-	TP_printk("%s dma_addr=%llx size=%zu virt_addr=%p flags=%s attrs=%s",
+	TP_printk("%s dir=%s dma_addr=%llx size=%zu virt_addr=%p flags=%s attrs=%s",
 		__get_str(device),
+		decode_dma_data_direction(__entry->dir),
 		__entry->dma_addr,
 		__entry->size,
 		__entry->virt_addr,
@@ -148,14 +151,15 @@ TRACE_EVENT(dma_alloc,
 
 TRACE_EVENT(dma_free,
 	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
-		 size_t size, unsigned long attrs),
-	TP_ARGS(dev, virt_addr, dma_addr, size, attrs),
+		 size_t size, enum dma_data_direction dir, unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, attrs),
 
 	TP_STRUCT__entry(
 		__string(device, dev_name(dev))
 		__field(void *, virt_addr)
 		__field(u64, dma_addr)
 		__field(size_t, size)
+		__field(enum dma_data_direction, dir)
 		__field(unsigned long, attrs)
 	),
 
@@ -164,11 +168,13 @@ TRACE_EVENT(dma_free,
 		__entry->virt_addr = virt_addr;
 		__entry->dma_addr = dma_addr;
 		__entry->size = size;
+		__entry->dir = dir;
 		__entry->attrs = attrs;
 	),
 
-	TP_printk("%s dma_addr=%llx size=%zu virt_addr=%p attrs=%s",
+	TP_printk("%s dir=%s dma_addr=%llx size=%zu virt_addr=%p attrs=%s",
 		__get_str(device),
+		decode_dma_data_direction(__entry->dir),
 		__entry->dma_addr,
 		__entry->size,
 		__entry->virt_addr,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 864a1121bf08..944ac835030a 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -619,7 +619,8 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	else
 		return NULL;
 
-	trace_dma_alloc(dev, cpu_addr, *dma_handle, size, flag, attrs);
+	trace_dma_alloc(dev, cpu_addr, *dma_handle, size, DMA_BIDIRECTIONAL,
+			flag, attrs);
 	debug_dma_alloc_coherent(dev, size, *dma_handle, cpu_addr, attrs);
 	return cpu_addr;
 }
@@ -644,7 +645,8 @@ void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 	if (!cpu_addr)
 		return;
 
-	trace_dma_free(dev, cpu_addr, dma_handle, size, attrs);
+	trace_dma_free(dev, cpu_addr, dma_handle, size, DMA_BIDIRECTIONAL,
+		       attrs);
 	debug_dma_free_coherent(dev, size, cpu_addr, dma_handle);
 	if (dma_alloc_direct(dev, ops))
 		dma_direct_free(dev, size, cpu_addr, dma_handle, attrs);
-- 
2.35.1.1320.gc452695387.dirty


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using trace_dma_map
  2024-10-17 18:13 [PATCH 0/3] dma-mapping: Trace more error paths Sean Anderson
  2024-10-17 18:13 ` [PATCH 1/3] dma-mapping: Trace dma_alloc/free direction Sean Anderson
@ 2024-10-17 18:13 ` Sean Anderson
  2024-10-18  5:27   ` Christoph Hellwig
  2024-10-17 18:13 ` [PATCH 3/3] dma-mapping: Trace more error paths Sean Anderson
  2 siblings, 1 reply; 7+ messages in thread
From: Sean Anderson @ 2024-10-17 18:13 UTC (permalink / raw)
  To: Christoph Hellwig, iommu
  Cc: linux-trace-kernel, Masami Hiramatsu, linux-kernel, Robin Murphy,
	Mathieu Desnoyers, Steven Rostedt, Marek Szyprowski,
	Sean Anderson

In some cases, we use trace_dma_map to trace dma_alloc* functions. This
generally follows dma_debug. However, this does not record all of the
relevant information for allocations, such as GFP flags. Create new
dma_alloc tracepoints for these functions. Note that while
dma_alloc_noncontiguous may allocate discontiguous pages (from the CPU's
point of view), the device will only see one contiguous mapping.
Therefore, we just need to trace dma_addr and size.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

 include/trace/events/dma.h | 102 ++++++++++++++++++++++++++++++++++++-
 kernel/dma/mapping.c       |  10 ++--
 2 files changed, 105 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
index 012729cc178f..9bc647f9ad4d 100644
--- a/include/trace/events/dma.h
+++ b/include/trace/events/dma.h
@@ -114,7 +114,7 @@ DEFINE_EVENT(dma_unmap, dma_unmap_resource,
 		 enum dma_data_direction dir, unsigned long attrs),
 	TP_ARGS(dev, addr, size, dir, attrs));
 
-TRACE_EVENT(dma_alloc,
+DECLARE_EVENT_CLASS(_dma_alloc,
 	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
 		 size_t size, enum dma_data_direction dir, gfp_t flags,
 		 unsigned long attrs),
@@ -149,7 +149,60 @@ TRACE_EVENT(dma_alloc,
 		decode_dma_attrs(__entry->attrs))
 );
 
-TRACE_EVENT(dma_free,
+DEFINE_EVENT(_dma_alloc, dma_alloc,
+	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
+		 size_t size, enum dma_data_direction dir, gfp_t flags,
+		 unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
+
+DEFINE_EVENT(_dma_alloc, dma_alloc_pages,
+	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
+		 size_t size, enum dma_data_direction dir, gfp_t flags,
+		 unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
+
+TRACE_EVENT(dma_alloc_sgt,
+	TP_PROTO(struct device *dev, struct sg_table *sgt, size_t size,
+		 enum dma_data_direction dir, gfp_t flags, unsigned long attrs),
+	TP_ARGS(dev, sgt, size, dir, flags, attrs),
+
+	TP_STRUCT__entry(
+		__string(device, dev_name(dev))
+		__dynamic_array(u64, phys_addrs, sgt->orig_nents)
+		__field(u64, dma_addr)
+		__field(size_t, size)
+		__field(enum dma_data_direction, dir)
+		__field(gfp_t, flags)
+		__field(unsigned long, attrs)
+	),
+
+	TP_fast_assign(
+		struct scatterlist *sg;
+		int i;
+
+		__assign_str(device);
+		for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
+			((u64 *)__get_dynamic_array(phys_addrs))[i] = sg_phys(sg);
+		__entry->dma_addr = sg_dma_address(sgt->sgl);
+		__entry->size = size;
+		__entry->dir = dir;
+		__entry->flags = flags;
+		__entry->attrs = attrs;
+	),
+
+	TP_printk("%s dir=%s dma_addr=%llx size=%zu phys_addrs=%s flags=%s attrs=%s",
+		__get_str(device),
+		decode_dma_data_direction(__entry->dir),
+		__entry->dma_addr,
+		__entry->size,
+		__print_array(__get_dynamic_array(phys_addrs),
+			      __get_dynamic_array_len(phys_addrs) /
+				sizeof(u64), sizeof(u64)),
+		show_gfp_flags(__entry->flags),
+		decode_dma_attrs(__entry->attrs))
+);
+
+DECLARE_EVENT_CLASS(_dma_free,
 	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
 		 size_t size, enum dma_data_direction dir, unsigned long attrs),
 	TP_ARGS(dev, virt_addr, dma_addr, size, dir, attrs),
@@ -181,6 +234,51 @@ TRACE_EVENT(dma_free,
 		decode_dma_attrs(__entry->attrs))
 );
 
+DEFINE_EVENT(_dma_free, dma_free,
+	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
+		 size_t size, enum dma_data_direction dir, unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, attrs));
+
+DEFINE_EVENT(_dma_free, dma_free_pages,
+	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
+		 size_t size, enum dma_data_direction dir, unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, attrs));
+
+TRACE_EVENT(dma_free_sgt,
+	TP_PROTO(struct device *dev, struct sg_table *sgt, size_t size,
+		 enum dma_data_direction dir),
+	TP_ARGS(dev, sgt, size, dir),
+
+	TP_STRUCT__entry(
+		__string(device, dev_name(dev))
+		__dynamic_array(u64, phys_addrs, sgt->orig_nents)
+		__field(u64, dma_addr)
+		__field(size_t, size)
+		__field(enum dma_data_direction, dir)
+	),
+
+	TP_fast_assign(
+		struct scatterlist *sg;
+		int i;
+
+		__assign_str(device);
+		for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
+			((u64 *)__get_dynamic_array(phys_addrs))[i] = sg_phys(sg);
+		__entry->dma_addr = sg_dma_address(sgt->sgl);
+		__entry->size = size;
+		__entry->dir = dir;
+	),
+
+	TP_printk("%s dir=%s dma_addr=%llx size=%zu phys_addrs=%s",
+		__get_str(device),
+		decode_dma_data_direction(__entry->dir),
+		__entry->dma_addr,
+		__entry->size,
+		__print_array(__get_dynamic_array(phys_addrs),
+			      __get_dynamic_array_len(phys_addrs) /
+				sizeof(u64), sizeof(u64)))
+);
+
 TRACE_EVENT(dma_map_sg,
 	TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
 		 int ents, enum dma_data_direction dir, unsigned long attrs),
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 944ac835030a..b8a6bc492fae 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -685,8 +685,8 @@ struct page *dma_alloc_pages(struct device *dev, size_t size,
 	struct page *page = __dma_alloc_pages(dev, size, dma_handle, dir, gfp);
 
 	if (page) {
-		trace_dma_map_page(dev, page_to_phys(page), *dma_handle, size,
-				   dir, 0);
+		trace_dma_alloc_pages(dev, page_to_virt(page), *dma_handle,
+				      size, dir, gfp, 0);
 		debug_dma_map_page(dev, page, 0, size, dir, *dma_handle, 0);
 	}
 	return page;
@@ -710,7 +710,7 @@ static void __dma_free_pages(struct device *dev, size_t size, struct page *page,
 void dma_free_pages(struct device *dev, size_t size, struct page *page,
 		dma_addr_t dma_handle, enum dma_data_direction dir)
 {
-	trace_dma_unmap_page(dev, dma_handle, size, dir, 0);
+	trace_dma_free_pages(dev, page_to_virt(page), dma_handle, size, dir, 0);
 	debug_dma_unmap_page(dev, dma_handle, size, dir);
 	__dma_free_pages(dev, size, page, dma_handle, dir);
 }
@@ -770,7 +770,7 @@ struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
 
 	if (sgt) {
 		sgt->nents = 1;
-		trace_dma_map_sg(dev, sgt->sgl, sgt->orig_nents, 1, dir, attrs);
+		trace_dma_alloc_sgt(dev, sgt, size, dir, gfp, attrs);
 		debug_dma_map_sg(dev, sgt->sgl, sgt->orig_nents, 1, dir, attrs);
 	}
 	return sgt;
@@ -789,7 +789,7 @@ static void free_single_sgt(struct device *dev, size_t size,
 void dma_free_noncontiguous(struct device *dev, size_t size,
 		struct sg_table *sgt, enum dma_data_direction dir)
 {
-	trace_dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, dir, 0);
+	trace_dma_free_sgt(dev, sgt, size, dir);
 	debug_dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, dir);
 
 	if (use_dma_iommu(dev))
-- 
2.35.1.1320.gc452695387.dirty


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] dma-mapping: Trace more error paths
  2024-10-17 18:13 [PATCH 0/3] dma-mapping: Trace more error paths Sean Anderson
  2024-10-17 18:13 ` [PATCH 1/3] dma-mapping: Trace dma_alloc/free direction Sean Anderson
  2024-10-17 18:13 ` [PATCH 2/3] dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using trace_dma_map Sean Anderson
@ 2024-10-17 18:13 ` Sean Anderson
  2024-10-21 10:38   ` Robin Murphy
  2 siblings, 1 reply; 7+ messages in thread
From: Sean Anderson @ 2024-10-17 18:13 UTC (permalink / raw)
  To: Christoph Hellwig, iommu
  Cc: linux-trace-kernel, Masami Hiramatsu, linux-kernel, Robin Murphy,
	Mathieu Desnoyers, Steven Rostedt, Marek Szyprowski,
	Sean Anderson

It can be surprising to the user if DMA functions are only traced on
success. On failure, it can be unclear what the source of the problem
is. Fix this by tracing all functions even when they fail. Cases where
we BUG/WARN are skipped, since those should be sufficiently noisy
already.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

 include/trace/events/dma.h | 41 ++++++++++++++++++++++++++++++++++++++
 kernel/dma/mapping.c       | 27 +++++++++++++++++--------
 2 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
index 9bc647f9ad4d..321cce327404 100644
--- a/include/trace/events/dma.h
+++ b/include/trace/events/dma.h
@@ -161,6 +161,12 @@ DEFINE_EVENT(_dma_alloc, dma_alloc_pages,
 		 unsigned long attrs),
 	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
 
+DEFINE_EVENT(_dma_alloc, dma_alloc_sgt_err,
+	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
+		 size_t size, enum dma_data_direction dir, gfp_t flags,
+		 unsigned long attrs),
+	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
+
 TRACE_EVENT(dma_alloc_sgt,
 	TP_PROTO(struct device *dev, struct sg_table *sgt, size_t size,
 		 enum dma_data_direction dir, gfp_t flags, unsigned long attrs),
@@ -325,6 +331,41 @@ TRACE_EVENT(dma_map_sg,
 		decode_dma_attrs(__entry->attrs))
 );
 
+TRACE_EVENT(dma_map_sg_err,
+	TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
+		 int err, enum dma_data_direction dir, unsigned long attrs),
+	TP_ARGS(dev, sgl, nents, err, dir, attrs),
+
+	TP_STRUCT__entry(
+		__string(device, dev_name(dev))
+		__dynamic_array(u64, phys_addrs, nents)
+		__field(int, err)
+		__field(enum dma_data_direction, dir)
+		__field(unsigned long, attrs)
+	),
+
+	TP_fast_assign(
+		struct scatterlist *sg;
+		int i;
+
+		__assign_str(device);
+		for_each_sg(sgl, sg, nents, i)
+			((u64 *)__get_dynamic_array(phys_addrs))[i] = sg_phys(sg);
+		__entry->err = err;
+		__entry->dir = dir;
+		__entry->attrs = attrs;
+	),
+
+	TP_printk("%s dir=%s dma_addrs=%s err=%d attrs=%s",
+		__get_str(device),
+		decode_dma_data_direction(__entry->dir),
+		__print_array(__get_dynamic_array(phys_addrs),
+			      __get_dynamic_array_len(phys_addrs) /
+				sizeof(u64), sizeof(u64)),
+		__entry->err,
+		decode_dma_attrs(__entry->attrs))
+);
+
 TRACE_EVENT(dma_unmap_sg,
 	TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
 		 enum dma_data_direction dir, unsigned long attrs),
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index b8a6bc492fae..636dbb0629a4 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -223,6 +223,7 @@ static int __dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
 		debug_dma_map_sg(dev, sg, nents, ents, dir, attrs);
 	} else if (WARN_ON_ONCE(ents != -EINVAL && ents != -ENOMEM &&
 				ents != -EIO && ents != -EREMOTEIO)) {
+		trace_dma_map_sg_err(dev, sg, nents, ents, dir, attrs);
 		return -EIO;
 	}
 
@@ -604,20 +605,26 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	if (WARN_ON_ONCE(flag & __GFP_COMP))
 		return NULL;
 
-	if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
+	if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr)) {
+		trace_dma_alloc(dev, cpu_addr, *dma_handle, size,
+				DMA_BIDIRECTIONAL, flag, attrs);
 		return cpu_addr;
+	}
 
 	/* let the implementation decide on the zone to allocate from: */
 	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
-	if (dma_alloc_direct(dev, ops))
+	if (dma_alloc_direct(dev, ops)) {
 		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
-	else if (use_dma_iommu(dev))
+	} else if (use_dma_iommu(dev)) {
 		cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);
-	else if (ops->alloc)
+	} else if (ops->alloc) {
 		cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs);
-	else
+	} else {
+		trace_dma_alloc(dev, NULL, 0, size, DMA_BIDIRECTIONAL, flag,
+				attrs);
 		return NULL;
+	}
 
 	trace_dma_alloc(dev, cpu_addr, *dma_handle, size, DMA_BIDIRECTIONAL,
 			flag, attrs);
@@ -642,11 +649,11 @@ void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 	 */
 	WARN_ON(irqs_disabled());
 
-	if (!cpu_addr)
-		return;
-
 	trace_dma_free(dev, cpu_addr, dma_handle, size, DMA_BIDIRECTIONAL,
 		       attrs);
+	if (!cpu_addr)
+		return;
+
 	debug_dma_free_coherent(dev, size, cpu_addr, dma_handle);
 	if (dma_alloc_direct(dev, ops))
 		dma_direct_free(dev, size, cpu_addr, dma_handle, attrs);
@@ -688,6 +695,8 @@ struct page *dma_alloc_pages(struct device *dev, size_t size,
 		trace_dma_alloc_pages(dev, page_to_virt(page), *dma_handle,
 				      size, dir, gfp, 0);
 		debug_dma_map_page(dev, page, 0, size, dir, *dma_handle, 0);
+	} else {
+		trace_dma_alloc_pages(dev, NULL, 0, size, dir, gfp, 0);
 	}
 	return page;
 }
@@ -772,6 +781,8 @@ struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
 		sgt->nents = 1;
 		trace_dma_alloc_sgt(dev, sgt, size, dir, gfp, attrs);
 		debug_dma_map_sg(dev, sgt->sgl, sgt->orig_nents, 1, dir, attrs);
+	} else {
+		trace_dma_alloc_sgt_err(dev, NULL, 0, size, gfp, dir, attrs);
 	}
 	return sgt;
 }
-- 
2.35.1.1320.gc452695387.dirty


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/3] dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using trace_dma_map
  2024-10-17 18:13 ` [PATCH 2/3] dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using trace_dma_map Sean Anderson
@ 2024-10-18  5:27   ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2024-10-18  5:27 UTC (permalink / raw)
  To: Sean Anderson
  Cc: Christoph Hellwig, iommu, linux-trace-kernel, Masami Hiramatsu,
	linux-kernel, Robin Murphy, Mathieu Desnoyers, Steven Rostedt,
	Marek Szyprowski

On Thu, Oct 17, 2024 at 02:13:53PM -0400, Sean Anderson wrote:
> +DECLARE_EVENT_CLASS(_dma_alloc,
>  	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
>  		 size_t size, enum dma_data_direction dir, gfp_t flags,
>  		 unsigned long attrs),
> @@ -149,7 +149,60 @@ TRACE_EVENT(dma_alloc,
>  		decode_dma_attrs(__entry->attrs))
>  );
>  
> -TRACE_EVENT(dma_free,
> +DEFINE_EVENT(_dma_alloc, dma_alloc,
> +	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
> +		 size_t size, enum dma_data_direction dir, gfp_t flags,
> +		 unsigned long attrs),
> +	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
> +
> +DEFINE_EVENT(_dma_alloc, dma_alloc_pages,

The scheme we used in XFS (fs/xfs/xfs_trace.h) for the event classes is
to give the class a _class postdix, and use macros to avoid the repeated
DEFINE_EVENT boilerplate.  Any chance you could rewrite this to use
a similar scheme?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] dma-mapping: Trace more error paths
  2024-10-17 18:13 ` [PATCH 3/3] dma-mapping: Trace more error paths Sean Anderson
@ 2024-10-21 10:38   ` Robin Murphy
  2024-10-21 14:37     ` Sean Anderson
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2024-10-21 10:38 UTC (permalink / raw)
  To: Sean Anderson, Christoph Hellwig, iommu
  Cc: linux-trace-kernel, Masami Hiramatsu, linux-kernel,
	Mathieu Desnoyers, Steven Rostedt, Marek Szyprowski

On 2024-10-17 7:13 pm, Sean Anderson wrote:
> It can be surprising to the user if DMA functions are only traced on
> success. On failure, it can be unclear what the source of the problem
> is. Fix this by tracing all functions even when they fail. Cases where
> we BUG/WARN are skipped, since those should be sufficiently noisy
> already.
> 
> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
> ---
> 
>   include/trace/events/dma.h | 41 ++++++++++++++++++++++++++++++++++++++
>   kernel/dma/mapping.c       | 27 +++++++++++++++++--------
>   2 files changed, 60 insertions(+), 8 deletions(-)
> 
> diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
> index 9bc647f9ad4d..321cce327404 100644
> --- a/include/trace/events/dma.h
> +++ b/include/trace/events/dma.h
> @@ -161,6 +161,12 @@ DEFINE_EVENT(_dma_alloc, dma_alloc_pages,
>   		 unsigned long attrs),
>   	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
>   
> +DEFINE_EVENT(_dma_alloc, dma_alloc_sgt_err,
> +	TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
> +		 size_t size, enum dma_data_direction dir, gfp_t flags,
> +		 unsigned long attrs),
> +	TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
> +
>   TRACE_EVENT(dma_alloc_sgt,
>   	TP_PROTO(struct device *dev, struct sg_table *sgt, size_t size,
>   		 enum dma_data_direction dir, gfp_t flags, unsigned long attrs),
> @@ -325,6 +331,41 @@ TRACE_EVENT(dma_map_sg,
>   		decode_dma_attrs(__entry->attrs))
>   );
>   
> +TRACE_EVENT(dma_map_sg_err,
> +	TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
> +		 int err, enum dma_data_direction dir, unsigned long attrs),
> +	TP_ARGS(dev, sgl, nents, err, dir, attrs),
> +
> +	TP_STRUCT__entry(
> +		__string(device, dev_name(dev))
> +		__dynamic_array(u64, phys_addrs, nents)
> +		__field(int, err)
> +		__field(enum dma_data_direction, dir)
> +		__field(unsigned long, attrs)
> +	),
> +
> +	TP_fast_assign(
> +		struct scatterlist *sg;
> +		int i;
> +
> +		__assign_str(device);
> +		for_each_sg(sgl, sg, nents, i)
> +			((u64 *)__get_dynamic_array(phys_addrs))[i] = sg_phys(sg);
> +		__entry->err = err;
> +		__entry->dir = dir;
> +		__entry->attrs = attrs;
> +	),
> +
> +	TP_printk("%s dir=%s dma_addrs=%s err=%d attrs=%s",
> +		__get_str(device),
> +		decode_dma_data_direction(__entry->dir),
> +		__print_array(__get_dynamic_array(phys_addrs),
> +			      __get_dynamic_array_len(phys_addrs) /
> +				sizeof(u64), sizeof(u64)),
> +		__entry->err,
> +		decode_dma_attrs(__entry->attrs))
> +);
> +
>   TRACE_EVENT(dma_unmap_sg,
>   	TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
>   		 enum dma_data_direction dir, unsigned long attrs),
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index b8a6bc492fae..636dbb0629a4 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -223,6 +223,7 @@ static int __dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
>   		debug_dma_map_sg(dev, sg, nents, ents, dir, attrs);
>   	} else if (WARN_ON_ONCE(ents != -EINVAL && ents != -ENOMEM &&
>   				ents != -EIO && ents != -EREMOTEIO)) {
> +		trace_dma_map_sg_err(dev, sg, nents, ents, dir, attrs);

Isn't this just a case of moving the existing tracepoint up outside the 
"if (ents > 0)" condition?

>   		return -EIO;
>   	}
>   
> @@ -604,20 +605,26 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
>   	if (WARN_ON_ONCE(flag & __GFP_COMP))
>   		return NULL;
>   
> -	if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
> +	if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr)) {
> +		trace_dma_alloc(dev, cpu_addr, *dma_handle, size,
> +				DMA_BIDIRECTIONAL, flag, attrs);
>   		return cpu_addr;
> +	}
>   
>   	/* let the implementation decide on the zone to allocate from: */
>   	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
>   
> -	if (dma_alloc_direct(dev, ops))
> +	if (dma_alloc_direct(dev, ops)) {
>   		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
> -	else if (use_dma_iommu(dev))
> +	} else if (use_dma_iommu(dev)) {
>   		cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);
> -	else if (ops->alloc)
> +	} else if (ops->alloc) {
>   		cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs);
> -	else
> +	} else {
> +		trace_dma_alloc(dev, NULL, 0, size, DMA_BIDIRECTIONAL, flag,
> +				attrs);
>   		return NULL;

Similarly just move this return down past the tracepoint, same as the 
hunk below?

> +	}
>   
>   	trace_dma_alloc(dev, cpu_addr, *dma_handle, size, DMA_BIDIRECTIONAL,
>   			flag, attrs);
> @@ -642,11 +649,11 @@ void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
>   	 */
>   	WARN_ON(irqs_disabled());
>   
> -	if (!cpu_addr)
> -		return;
> -
>   	trace_dma_free(dev, cpu_addr, dma_handle, size, DMA_BIDIRECTIONAL,
>   		       attrs);
> +	if (!cpu_addr)
> +		return;
> +
>   	debug_dma_free_coherent(dev, size, cpu_addr, dma_handle);
>   	if (dma_alloc_direct(dev, ops))
>   		dma_direct_free(dev, size, cpu_addr, dma_handle, attrs);
> @@ -688,6 +695,8 @@ struct page *dma_alloc_pages(struct device *dev, size_t size,
>   		trace_dma_alloc_pages(dev, page_to_virt(page), *dma_handle,
>   				      size, dir, gfp, 0);
>   		debug_dma_map_page(dev, page, 0, size, dir, *dma_handle, 0);
> +	} else {
> +		trace_dma_alloc_pages(dev, NULL, 0, size, dir, gfp, 0);

Could we move the page_to_virt() into the event definiton and let that 
handle NULL, then similarly hoist the tracepoint out of the condition?

>   	}
>   	return page;
>   }
> @@ -772,6 +781,8 @@ struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
>   		sgt->nents = 1;
>   		trace_dma_alloc_sgt(dev, sgt, size, dir, gfp, attrs);
>   		debug_dma_map_sg(dev, sgt->sgl, sgt->orig_nents, 1, dir, attrs);
> +	} else {
> +		trace_dma_alloc_sgt_err(dev, NULL, 0, size, gfp, dir, attrs);

And again similarly here - if I'm interested in calls to 
dma_alloc_contiguous(), I'd rather have a "dma_alloc_contiguous" 
tracepoint which can tell me both the arguments and the result at a 
glance, than have to remember to trace two distinct other things based 
on internal details.

Thanks,
Robin.

>   	}
>   	return sgt;
>   }


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] dma-mapping: Trace more error paths
  2024-10-21 10:38   ` Robin Murphy
@ 2024-10-21 14:37     ` Sean Anderson
  0 siblings, 0 replies; 7+ messages in thread
From: Sean Anderson @ 2024-10-21 14:37 UTC (permalink / raw)
  To: Robin Murphy, Christoph Hellwig, iommu
  Cc: linux-trace-kernel, Masami Hiramatsu, linux-kernel,
	Mathieu Desnoyers, Steven Rostedt, Marek Szyprowski

On 10/21/24 06:38, Robin Murphy wrote:
> On 2024-10-17 7:13 pm, Sean Anderson wrote:
>> It can be surprising to the user if DMA functions are only traced on
>> success. On failure, it can be unclear what the source of the problem
>> is. Fix this by tracing all functions even when they fail. Cases where
>> we BUG/WARN are skipped, since those should be sufficiently noisy
>> already.
>>
>> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
>> ---
>>
>>   include/trace/events/dma.h | 41 ++++++++++++++++++++++++++++++++++++++
>>   kernel/dma/mapping.c       | 27 +++++++++++++++++--------
>>   2 files changed, 60 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
>> index 9bc647f9ad4d..321cce327404 100644
>> --- a/include/trace/events/dma.h
>> +++ b/include/trace/events/dma.h
>> @@ -161,6 +161,12 @@ DEFINE_EVENT(_dma_alloc, dma_alloc_pages,
>>            unsigned long attrs),
>>       TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
>>   +DEFINE_EVENT(_dma_alloc, dma_alloc_sgt_err,
>> +    TP_PROTO(struct device *dev, void *virt_addr, dma_addr_t dma_addr,
>> +         size_t size, enum dma_data_direction dir, gfp_t flags,
>> +         unsigned long attrs),
>> +    TP_ARGS(dev, virt_addr, dma_addr, size, dir, flags, attrs));
>> +
>>   TRACE_EVENT(dma_alloc_sgt,
>>       TP_PROTO(struct device *dev, struct sg_table *sgt, size_t size,
>>            enum dma_data_direction dir, gfp_t flags, unsigned long attrs),
>> @@ -325,6 +331,41 @@ TRACE_EVENT(dma_map_sg,
>>           decode_dma_attrs(__entry->attrs))
>>   );
>>   +TRACE_EVENT(dma_map_sg_err,
>> +    TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
>> +         int err, enum dma_data_direction dir, unsigned long attrs),
>> +    TP_ARGS(dev, sgl, nents, err, dir, attrs),
>> +
>> +    TP_STRUCT__entry(
>> +        __string(device, dev_name(dev))
>> +        __dynamic_array(u64, phys_addrs, nents)
>> +        __field(int, err)
>> +        __field(enum dma_data_direction, dir)
>> +        __field(unsigned long, attrs)
>> +    ),
>> +
>> +    TP_fast_assign(
>> +        struct scatterlist *sg;
>> +        int i;
>> +
>> +        __assign_str(device);
>> +        for_each_sg(sgl, sg, nents, i)
>> +            ((u64 *)__get_dynamic_array(phys_addrs))[i] = sg_phys(sg);
>> +        __entry->err = err;
>> +        __entry->dir = dir;
>> +        __entry->attrs = attrs;
>> +    ),
>> +
>> +    TP_printk("%s dir=%s dma_addrs=%s err=%d attrs=%s",
>> +        __get_str(device),
>> +        decode_dma_data_direction(__entry->dir),
>> +        __print_array(__get_dynamic_array(phys_addrs),
>> +                  __get_dynamic_array_len(phys_addrs) /
>> +                sizeof(u64), sizeof(u64)),
>> +        __entry->err,
>> +        decode_dma_attrs(__entry->attrs))
>> +);
>> +
>>   TRACE_EVENT(dma_unmap_sg,
>>       TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents,
>>            enum dma_data_direction dir, unsigned long attrs),
>> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
>> index b8a6bc492fae..636dbb0629a4 100644
>> --- a/kernel/dma/mapping.c
>> +++ b/kernel/dma/mapping.c
>> @@ -223,6 +223,7 @@ static int __dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
>>           debug_dma_map_sg(dev, sg, nents, ents, dir, attrs);
>>       } else if (WARN_ON_ONCE(ents != -EINVAL && ents != -ENOMEM &&
>>                   ents != -EIO && ents != -EREMOTEIO)) {
>> +        trace_dma_map_sg_err(dev, sg, nents, ents, dir, attrs);
> 
> Isn't this just a case of moving the existing tracepoint up outside the "if (ents > 0)" condition?

This is a separate tracepoint (_err) to simplify collecting/printing the
addresses.

> 
>>           return -EIO;
>>       }
>>   @@ -604,20 +605,26 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
>>       if (WARN_ON_ONCE(flag & __GFP_COMP))
>>           return NULL;
>>   -    if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
>> +    if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr)) {
>> +        trace_dma_alloc(dev, cpu_addr, *dma_handle, size,
>> +                DMA_BIDIRECTIONAL, flag, attrs);
>>           return cpu_addr;
>> +    }
>>         /* let the implementation decide on the zone to allocate from: */
>>       flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
>>   -    if (dma_alloc_direct(dev, ops))
>> +    if (dma_alloc_direct(dev, ops)) {
>>           cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
>> -    else if (use_dma_iommu(dev))
>> +    } else if (use_dma_iommu(dev)) {
>>           cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);
>> -    else if (ops->alloc)
>> +    } else if (ops->alloc) {
>>           cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs);
>> -    else
>> +    } else {
>> +        trace_dma_alloc(dev, NULL, 0, size, DMA_BIDIRECTIONAL, flag,
>> +                attrs);
>>           return NULL;
> 
> Similarly just move this return down past the tracepoint, same as the hunk below?

OK

>> +    }
>>         trace_dma_alloc(dev, cpu_addr, *dma_handle, size, DMA_BIDIRECTIONAL,
>>               flag, attrs);
>> @@ -642,11 +649,11 @@ void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
>>        */
>>       WARN_ON(irqs_disabled());
>>   -    if (!cpu_addr)
>> -        return;
>> -
>>       trace_dma_free(dev, cpu_addr, dma_handle, size, DMA_BIDIRECTIONAL,
>>                  attrs);
>> +    if (!cpu_addr)
>> +        return;
>> +
>>       debug_dma_free_coherent(dev, size, cpu_addr, dma_handle);
>>       if (dma_alloc_direct(dev, ops))
>>           dma_direct_free(dev, size, cpu_addr, dma_handle, attrs);
>> @@ -688,6 +695,8 @@ struct page *dma_alloc_pages(struct device *dev, size_t size,
>>           trace_dma_alloc_pages(dev, page_to_virt(page), *dma_handle,
>>                         size, dir, gfp, 0);
>>           debug_dma_map_page(dev, page, 0, size, dir, *dma_handle, 0);
>> +    } else {
>> +        trace_dma_alloc_pages(dev, NULL, 0, size, dir, gfp, 0);
> 
> Could we move the page_to_virt() into the event definiton and let that handle NULL, then similarly hoist the tracepoint out of the condition?

Then we cannot reuse the event class for trace_dma_alloc, which does not
have a struct page available.

>>       }
>>       return page;
>>   }
>> @@ -772,6 +781,8 @@ struct sg_table *dma_alloc_noncontiguous(struct device *dev, size_t size,
>>           sgt->nents = 1;
>>           trace_dma_alloc_sgt(dev, sgt, size, dir, gfp, attrs);
>>           debug_dma_map_sg(dev, sgt->sgl, sgt->orig_nents, 1, dir, attrs);
>> +    } else {
>> +        trace_dma_alloc_sgt_err(dev, NULL, 0, size, gfp, dir, attrs);
> 
> And again similarly here - if I'm interested in calls to dma_alloc_contiguous(), I'd rather have a "dma_alloc_contiguous" tracepoint which can tell me both the arguments and the result at a glance, than have to remember to trace two distinct other things based on internal details.

Isn't it a case of just doing `perf record -e 'dma:dma_alloc_sgt*'`?

Fundamentally, these events have different information available, and
dealing with this in a generic manner is inconvenient.

--Sean

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-10-21 14:37 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-17 18:13 [PATCH 0/3] dma-mapping: Trace more error paths Sean Anderson
2024-10-17 18:13 ` [PATCH 1/3] dma-mapping: Trace dma_alloc/free direction Sean Anderson
2024-10-17 18:13 ` [PATCH 2/3] dma-mapping: Use trace_dma_alloc for dma_alloc* instead of using trace_dma_map Sean Anderson
2024-10-18  5:27   ` Christoph Hellwig
2024-10-17 18:13 ` [PATCH 3/3] dma-mapping: Trace more error paths Sean Anderson
2024-10-21 10:38   ` Robin Murphy
2024-10-21 14:37     ` Sean Anderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).